Commit Graph

33 Commits

Author SHA1 Message Date
Tom
b56aec763d refactor: speed up conversion from dataframe to list of records (#12806) 2021-02-07 12:01:28 -08:00
Ville Brofeldt
980dd2fd41 pylint: accept specific 2 character names by default (#9460)
* lint: accept 2 letter names by default

* Address review comments

* Remove e and d from good-names
2020-04-08 20:32:26 +03:00
Rob DiCiuccio
6537d5ed8c Replace pandas.DataFrame with PyArrow.Table for nullable int typing (#8733)
* Use PyArrow Table for query result serialization

* Cleanup dev comments

* Additional cleanup

* WIP: tests

* Remove explicit dtype logic from db_engine_specs

* Remove obsolete  column property

* SupersetTable column types

* Port SupersetDataFrame methods to SupersetTable

* Add test for nullable boolean columns

* Support datetime values with timezone offsets

* Black formatting

* Pylint

* More linting/formatting

* Resolve issue with timezones not appearing in results

* Types

* Enable running of tests in tests/db_engine_specs

* Resolve application context errors

* Refactor and add tests for pyodbc.Row conversion

* Appease isort, regardless of isort:skip

* Re-enable RESULTS_BACKEND_USE_MSGPACK default based on benchmarks

* Dataframe typing and nits

* Renames to reduce ambiguity
2020-01-03 11:55:39 -05:00
Ville Brofeldt
5b690f9411 chore: refactor, add typing and fix uncovered errors (#8900)
* Add type annotations and fix inconsistencies

* Address review comments

* Remove incorrect typing of jsonable obj
2019-12-31 09:26:23 +02:00
John Bodley
9fc37ea9f1 [ci] Deprecate flake8 (#8409)
* [ci] Deprecate flake8

* Addressing @villebro's comments
2019-10-18 14:44:27 -07:00
Beto Dealmeida
7090725de9 Fix no data in Presto (#8268)
* Fix no data in Presto

* Fix test
2019-09-20 14:31:13 +02:00
Beto Dealmeida
8e1fc2b0ba Fix array casting (#8253) 2019-09-18 13:32:58 -07:00
Beto Dealmeida
b9be01fcd8 Handle int64 columns with missing data in SQL Lab (#8226)
* Handle int64 columns with missing data in SQL Lab

* Fix docstring

* Add unit test

* Small fix

* Small fixes

* Fix cursor description update

* Better fix

* Fix unit test, black

* Fix nan comparison in unit test
2019-09-17 08:16:09 -07:00
Rob DiCiuccio
7595d9e5fd [SQL Lab] Async query results serialization with MessagePack and PyArrow (#8069)
* Add support for msgpack results_backend serialization

* Serialize DataFrame with PyArrow rather than JSON

* Adjust dependencies, de-lint

* Add tests for (de)serialization methods

* Add MessagePack config info to Installation docs

* Enable msgpack/arrow serialization by default

* [Fix] Prevent msgpack serialization on synchronous queries

* Add type annotations
2019-08-27 14:23:40 -07:00
Ville Brofeldt
f53acd84a7 Bump pandas to 0.24 (#7852) 2019-07-14 08:38:11 -07:00
John Bodley
5c58fd1802 [format] Using Black (#7769) 2019-06-25 13:34:48 -07:00
Maxime Beauchemin
f742b9876b Making thrift, pyhive and tableschema as extra_requires (#6696)
* Making thrift, pyhive and tableschema as extra_requires

Looking at the dependency tree for license related questions, I noticed
that tableschema had a huge tree, and only people running Hive really
need it. Making this as well as pyhive and thrift optional.

Also bumping some python dependencies

* Run pip-compile

* Removing refs to past.builtins (from future lib)

* Add thrift
2019-01-19 14:27:18 -08:00
Maxime Beauchemin
1dd4d7a587 Apply ASF licenses throughout the code base (#5800)
* Add license headers

* reabased

* lint

* Removing licenses from vendors folder
2019-01-15 15:53:27 -08:00
Maxime Beauchemin
bbfd69a138 [utils.py] gathering/refactoring into a "utils/" folder (#6095)
* [utils] gathering/refactoring into a "utils/" folder

Moving current utils.py into utils/core.py and moving other *util*
modules under this new "utils/" as well.

Following steps include eroding at "utils/core.py" and breaking it down
into smaller modules.

* Improve tests

* Make loading examples in scope for tests

* Remove test class attrs examples_loaded and requires_examples
2018-10-16 17:59:34 -07:00
timifasubaa
46c86672c8 remove utf8 declaration (#6096) 2018-10-15 11:53:24 -07:00
timifasubaa
dd9eeda03e remove future (#6065) 2018-10-13 09:39:04 -07:00
Ville Brofeldt
77fe9ef130 Force quoted column aliases for Oracle-like databases (#5686)
* Replace dataframe label override logic with table column override

* Add mutation to any_date_col

* Linting

* Add mutation to oracle and redshift

* Fine tune how and which labels are mutated

* Implement alias quoting logic for oracle-like databases

* Fix and align column and metric sqla_col methods

* Clean up typos and redundant logic

* Move new attribute to old location

* Linting

* Replace old sqla_col property references with function calls

* Remove redundant calls to mutate_column_label

* Move duplicated logic to common function

* Add db_engine_specs to all sqla_col calls

* Add missing mydb

* Add note about snowflake-sqlalchemy regression

* Make db_engine_spec mandatory in sqla_col

* Small refactoring and cleanup

* Remove db_engine_spec from get_from_clause call

* Make db_engine_spec mandatory in adhoc_metric_to_sa

* Remove redundant mutate_expression_label call

* Add missing db_engine_specs to adhoc_metric_to_sa

* Rename arg label_name to label in get_column_label()

* Rename label function and add docstring

* Remove redundant db_engine_spec args

* Rename col_label to label

* Remove get_column_name wrapper and make direct calls to db_engine_spec

* Remove unneeded db_engine_specs

* Rename sa_ vars to sqla_
2018-09-03 22:49:58 -07:00
Maxime Beauchemin
6e8c7f7b20 [viz flow] detect TIMESTAMP, transition to line chart (#5634)
* [viz flow] detect TIMESTAMP, transition to line chart

* Refactor is_date
2018-08-21 21:33:34 -07:00
Maxime Beauchemin
4c2be71e83 [bugfix] TIMESTAMP not detected as date (#5629)
* [bugfix] TIMESTAMP not detected as date

* minor tweak
2018-08-14 13:01:28 -07:00
Ville Brofeldt
e1f4db8e24 Match viz dataframe column case to form_data fields for Snowflake, Oracle and Redshift (#5487)
* Add function to fix dataframe column case

* Fix broken handle_nulls method

* Add case sensitivity option to dedup

* Refactor function definition and call location

* Remove added blank line

* Move df column rename logit to db_engine_spec

* Remove redundant variable

* Update comments in db_engine_specs

* Tie df adjustment to db_engine_spec class attribute

* Fix dedup error

* Linting

* Check for db_engine_spec attribute prior to adjustment

* Rename case sensitivity flag

* Linting

* Remove function that was moved to db_engine_specs

* Get metrics names from utils

* Remove double import and rename dedup variable
2018-08-03 09:53:56 -07:00
Ville Brofeldt
a165aec822 Fix broken dedup and remove redundant db_spec logic (#5467)
* Fix broken dedup and remove redundant db_spec logic

* Add test case
2018-07-23 10:41:38 -07:00
Maxime Beauchemin
777d876a52 Improve database type inference (#4724)
* Improve database type inference

Python's DBAPI isn't super clear and homogeneous on the
cursor.description specification, and this PR attempts to improve
inferring the datatypes returned in the cursor.

This work started around Presto's TIMESTAMP type being mishandled as
string as the database driver (pyhive) returns it as a string. The work
here fixes this bug and does a better job at inferring MySQL and Presto types.
It also creates a new method in db_engine_specs allowing for other
databases engines to implement and become more precise on type-inference
as needed.

* Fixing tests

* Adressing comments

* Using infer_objects

* Removing faulty line

* Addressing PrestoSpec redundant method comment

* Fix rebase issue

* Fix tests
2018-06-27 21:35:12 -07:00
John Bodley
d533ce0967 [pylint] prepping for enabling pylint for non-errors (#4884) 2018-04-28 20:08:09 -07:00
fabianmenges
9a79d33e0d [BUGFIX]: JavaScripts max int is 2^53 - 1, longs are bigger (#4005)
* [BUGFIX]: Java scripts max int is 2^53 - 1 longs are bigger and frequently used as IDs this is a hacky fix.

* Keep tuple as tuple
2018-04-04 00:36:23 -07:00
John Bodley
d57a37e341 [flake8] Adding flake8-coding (#4477) 2018-02-25 15:06:11 -08:00
bolkedebruin
4ae77ba8af Workaround pandas bug in datetimes with time zones (#3910)
A bug in to_dict(orient="records") in pandas/core/frame.py prevents
datetimes with time zones to be worked with. This works around the
issue in superset by re-implementing the logic of pandas in the
correct way. Until pandas fixes the issue this code should stay.

https://github.com/pandas-dev/pandas/issues/18372

This closes #1929
2017-11-20 08:33:18 -08:00
John Bodley
17623f71d4 [flake8] Resolving C??? errors (#3787) 2017-11-07 21:32:45 -08:00
John Bodley
e2bca47421 [flake8] Resolve I??? errors (#3797) 2017-11-07 20:23:40 -08:00
fabianmenges
490c707eb6 Handling pandas ExtensionDtypes (#2937) 2017-09-12 09:36:28 -07:00
Maxime Beauchemin
1f8e48b374 [sqllab] assign types for visualize flow (#2458)
* [sqllab] assign types for visualize flow

Somehow when using the visualize flow, the types were not
assigned at all, creating some bugs downstream. This PR attempts to get
the information required based on what pandas is knows and the types in
the data itself.

* Fixing tests

* Fixing tests

* Fixing more tests

* Fixing the last py3 tests
2017-03-24 09:23:51 -07:00
Bogdan
0779da6d24 Keep column order in .csv (#2377) 2017-03-10 14:49:11 -08:00
Benedict Jin
1f58e18b6f Some code refactoring (#2139) 2017-02-08 11:52:58 -08:00
Maxime Beauchemin
15b67b2c6c [WiP] rename project from Caravel to Superset (#1576)
* Change in files

* Renamin files and folders

* cleaning up a single piece of lint

* Removing boat picture from docs

* add superset word mark

* Update rename note in docs

* Fixing images

* Pinning datatables

* Fixing issues with mapbox-gl

* Forgot to rename one file

* Linting

* v0.13.0

* adding pyyaml to dev-reqs
2016-11-09 23:08:22 -08:00