152 Commits

Author SHA1 Message Date
John Bodley
c54b067c6a [db-engine-spec] Aligning Hive/Presto partition logic (#7007)
(cherry picked from commit 05be866117)
2019-03-17 23:24:55 -07:00
Maxime Beauchemin
f742b9876b Making thrift, pyhive and tableschema as extra_requires (#6696)
* Making thrift, pyhive and tableschema as extra_requires

Looking at the dependency tree for license related questions, I noticed
that tableschema had a huge tree, and only people running Hive really
need it. Making this as well as pyhive and thrift optional.

Also bumping some python dependencies

* Run pip-compile

* Removing refs to past.builtins (from future lib)

* Add thrift
2019-01-19 14:27:18 -08:00
Beto Dealmeida
00388811b6 Allow empty results in Hive (from SET, eg) (#6695)
* Allow empty results in Hive (from SET, eg)

* Remove patch

* Merge heads

* Delete merge heads
2019-01-18 10:11:59 -08:00
Ville Brofeldt
7ee8afb608 Improve support for BigQuery, Redshift, Oracle, Db2, Snowflake (#5827)
* Conditionally mutate and quote sqla labels decouple sqla logic from viz.py

* Prefix hashed label with underscore if bigquery label exceeds 128 chars

* Add comments for label cache

* Rename to mutated_labels and simplify

* Rename mutated_label to get_label and simplify make_label_compatible in db_engine_specs

* Add note about deterministic and unique mutated labels

* add hash to label that has been prefixed with underscore

* Fix PEP8 escape warning

* Fix DeckPathViz get_metric_label call
2019-01-18 08:24:11 -08:00
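The bullets in the commit above mention prefixing a hashed label with an underscore when a BigQuery label exceeds 128 characters, and keeping mutated labels deterministic and unique. A minimal sketch of that idea follows. The function name `shorten_label`, the constant name, and the choice of MD5 are assumptions for illustration, not the code from the commit.

```python
import hashlib

# Limit quoted in the commit message; the constant name is hypothetical.
MAX_LABEL_LENGTH = 128


def shorten_label(label: str) -> str:
    """Return the label unchanged if it fits, else an underscore-prefixed hash.

    Hashing is deterministic, so the same input always yields the same
    shortened label, and distinct inputs are very unlikely to collide.
    """
    if len(label) <= MAX_LABEL_LENGTH:
        return label
    digest = hashlib.md5(label.encode("utf-8")).hexdigest()
    # The leading underscore keeps the label from starting with a digit.
    return "_" + digest


short = shorten_label("revenue")
long_label = "x" * 200
hashed = shorten_label(long_label)
```

Because the output depends only on the input string, a cache from original label to mutated label (as hinted by the "label cache" bullet) can be rebuilt at any time.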
Maxime Beauchemin
e03e276571 Bump some of the requirements-dev.txt (#6700)
* Bump some of the requirements-dev.txt

* addressing comments
2019-01-16 20:40:16 -08:00
Maxime Beauchemin
1dd4d7a587 Apply ASF licenses throughout the code base (#5800)
* Add license headers

* rebased

* lint

* Removing licenses from vendors folder
2019-01-15 15:53:27 -08:00
Chinh Nguyen
284a0cccd3 Add fix for pyodbc+mssql (#6621)
* add fix for odbc+mssql

* fix for pylint/pep8
2019-01-13 09:30:05 -08:00
ghsalem
f761237260 fixing issue #6572 with Oracle date handling (#6580)
* fix Oracle engine specs for dates issue #6572

* fix Oracle engine specs for dates issue #6572

* fix Oracle engine specs for dates issue #6572, removing comment

* removing a trailing space
2018-12-28 08:51:48 -08:00
Maxime Beauchemin
d427db0a8b [SQL Lab] Allow running multiple statements (#6112)
* Allow running multiple statements from SQL Lab

* fix tests

* More tests

* merge heads

* fix heads
2018-12-22 10:28:22 -08:00
Maxime Beauchemin
6e942c9fb3 Make boto3/botocore installation optional (#6540)
* Make boto3 installation optional

* pylinting
2018-12-21 12:27:57 -08:00
Ville Brofeldt
5bac723df4 Refactor teradata to new time_grain_functions spec (#6539)
* Refactor teradata to new time_grain_functions spec

* Add test for time_grain_functions
2018-12-16 08:53:29 -08:00
Beto Dealmeida
f366bbe735 Google spreadsheets (#5915)
* Google spreadsheets

* Fetch table metadata in SQL Lab

* Show full URL for spreadsheet

* Fix version

* Remove sqllab changes
2018-12-10 13:11:54 -08:00
Maxime Beauchemin
cc3a625a4b Use py3's f-strings instead of s.format(**locals()) (#6448)
* Use py3's f-strings instead of s.format(**locals())

In light of the bug reported here
https://github.com/apache/incubator-superset/issues/6347, which seems
like an odd `.format()` issue in py3, I grepped and replaced all
instances of `.format(**locals())` using py3's f-strings

* lint

* fix tests
2018-12-02 13:50:49 -08:00
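The replacement described in the commit above can be illustrated with a small before/after pair. The function names and the query template are hypothetical and only show the two styles side by side.

```python
def build_query_with_format(table: str, limit: int) -> str:
    # Old pattern: every local variable is implicitly exposed to the template.
    return "SELECT * FROM {table} LIMIT {limit}".format(**locals())


def build_query_with_fstring(table: str, limit: int) -> str:
    # New pattern: the f-string names exactly the variables it interpolates.
    return f"SELECT * FROM {table} LIMIT {limit}"


old_style = build_query_with_format("logs", 10)
new_style = build_query_with_fstring("logs", 10)
```

Both calls produce the same string here. The f-string version makes the dependency on `table` and `limit` visible to linters and readers, which `**locals()` hides.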
Junda Yang
f1cae2ecdd override get_view_names in PrestoEngineSpec (#6459)
* override get_view_names in PrestoEngineSpec

* add test

* flake 8

* flake 8
2018-11-28 15:13:38 -08:00
John Bodley
74f0817bf0 [hive] Fixing where latest partition logic (#6357) 2018-11-12 10:07:38 -08:00
Junda Yang
c552c125d7 Move metadata cache one layer up (#6153)
* Update wording

* nit update for api endpoint url

* move metadata cache one layer up

* refactor cache

* fix flake8 and DatabaseTablesAsync

* nit

* remove logging for cache

* only fetch for all tables that allows cross schema fetch

* default allow_multi_schema_metadata_fetch to False

* address comments

* remove unused defaultdict

* flake 8
2018-10-31 13:23:26 -07:00
Sumedh Sakdeo
71d6ff40d0 partition and clustering bigquery keys (#6212)
* partition and clustering bigquery keys

* flake8
2018-10-29 11:23:21 -07:00
Maxime Beauchemin
bbfd69a138 [utils.py] gathering/refactoring into a "utils/" folder (#6095)
* [utils] gathering/refactoring into a "utils/" folder

Moving current utils.py into utils/core.py and moving other *util*
modules under this new "utils/" as well.

Following steps include eroding at "utils/core.py" and breaking it down
into smaller modules.

* Improve tests

* Make loading examples in scope for tests

* Remove test class attrs examples_loaded and requires_examples
2018-10-16 17:59:34 -07:00
Junda Yang
177bed3bb6 allow cache and force refresh on table list (#6078)
* allow cache and force refresh on table list

* wording

* flake8

* javascript test

* address comments

* nit
2018-10-16 13:14:45 -07:00
timifasubaa
46c86672c8 remove utf8 declaration (#6096) 2018-10-15 11:53:24 -07:00
timifasubaa
dd9eeda03e remove future (#6065) 2018-10-13 09:39:04 -07:00
Junda Yang
712c1aa767 Allow user to force refresh metadata (#5933)
* Allow user to force refresh metadata

* fix javascript test error

* nit

* fix styling

* allow custom cache timeout configuration on any database

* minor improvement

* nit

* fix test

* nit

* preserve the old endpoint
2018-10-08 20:25:40 -07:00
John Bodley
1ee08fc216 [select-star] Adding optional schema to view (#6051) 2018-10-08 10:32:40 -07:00
timifasubaa
00c4c7ec4b fix csv upload bugs (#5940) 2018-09-20 10:34:15 -05:00
livinm
83fa7af42a Enable Teradata (#5870)
* Enable Teradata 

New DB engine spec for Teradata:
- LimitMethod should be WRAP_SQL since Teradata does not support the "LIMIT" clause (it uses TOP)
- Time grains for Teradata are added

* Update formatting to pass flake8 tests
2018-09-13 08:01:25 -07:00
Ville Brofeldt
77fe9ef130 Force quoted column aliases for Oracle-like databases (#5686)
* Replace dataframe label override logic with table column override

* Add mutation to any_date_col

* Linting

* Add mutation to oracle and redshift

* Fine tune how and which labels are mutated

* Implement alias quoting logic for oracle-like databases

* Fix and align column and metric sqla_col methods

* Clean up typos and redundant logic

* Move new attribute to old location

* Linting

* Replace old sqla_col property references with function calls

* Remove redundant calls to mutate_column_label

* Move duplicated logic to common function

* Add db_engine_specs to all sqla_col calls

* Add missing mydb

* Add note about snowflake-sqlalchemy regression

* Make db_engine_spec mandatory in sqla_col

* Small refactoring and cleanup

* Remove db_engine_spec from get_from_clause call

* Make db_engine_spec mandatory in adhoc_metric_to_sa

* Remove redundant mutate_expression_label call

* Add missing db_engine_specs to adhoc_metric_to_sa

* Rename arg label_name to label in get_column_label()

* Rename label function and add docstring

* Remove redundant db_engine_spec args

* Rename col_label to label

* Remove get_column_name wrapper and make direct calls to db_engine_spec

* Remove unneeded db_engine_specs

* Rename sa_ vars to sqla_
2018-09-03 22:49:58 -07:00
Christine Chambers
ae3fb04036 Bug: fixing async syntax for python 3.7 (#5759)
* Bug: fixing async syntax for python 3.7

Rename async to async_ so superset installs for python 3.7.

* Addressing PR comments. Use kwargs instead of explicitly specifying async_ so downstream engines (e.g. PyHive) that support async can choose to use async_ in python 3.7 and async in <=python3.6

* addressing additional pr comments
2018-08-28 17:40:45 -07:00
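Since `async` became a reserved keyword in Python 3.7, it can no longer appear as a parameter name or as a literal keyword argument. The kwargs approach mentioned in the commit above can be sketched as follows. `FakeCursor` and `execute` are hypothetical stand-ins, and picking the option name from the interpreter version is an assumption made only for this illustration.

```python
import sys


class FakeCursor:
    """Hypothetical stand-in for a driver cursor that accepts extra options."""

    def execute(self, query, **kwargs):
        return query, kwargs


def execute(cursor, query, **kwargs):
    # The wrapper never names the option itself, so it parses on every
    # Python version and simply forwards whatever the caller supplies.
    return cursor.execute(query, **kwargs)


# A string key is legal even when the word is a reserved keyword.
option_name = "async_" if sys.version_info >= (3, 7) else "async"
result = execute(FakeCursor(), "SELECT 1", **{option_name: True})
```

Passing the option through a dictionary keeps the calling code valid syntax regardless of which spelling the downstream driver expects.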
Sumedh Sakdeo
80e777823b Field names in big query can contain only alphanumeric and underscore (#5641)
* Field names in big query can contain only alphanumeric and underscore

* bad quote

* better place for mutating labels

* lint

* bug fix thanks to mistercrunch

* lint

* lint again
2018-08-21 13:45:42 -07:00
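The restriction named in the commit title above (only alphanumeric characters and underscores in field names) can be sketched with a simple substitution. The function name `sanitize_label` is hypothetical, and this sketch does not reproduce the commit's actual label-mutation logic.

```python
import re


def sanitize_label(label: str) -> str:
    """Replace every character outside [A-Za-z0-9_] with an underscore."""
    return re.sub(r"[^A-Za-z0-9_]", "_", label)


clean = sanitize_label("SUM(sales)")
untouched = sanitize_label("total_sales_2018")
```

Note that plain substitution can map different inputs to the same output (for example `a-b` and `a.b`), so a real implementation would need some way to keep mutated labels unique.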
Sumedh Sakdeo
0fbda33c68 Handling bigquery dialect when previewing data (#5655)
* Handling bigquery dialect when previewing data

* review comments

* lint
2018-08-20 22:04:22 -07:00
Sumedh Sakdeo
5966a674e5 Explore View Perf Fix (#5637) 2018-08-15 12:27:08 -07:00
Sumedh Sakdeo
c9bd5a6167 Fetch a batch of rows from bigquery (#5632)
* Fetch a batch of rows from bigquery

* unused const

* review comments
2018-08-14 21:44:04 -07:00
Ville Brofeldt
e1f4db8e24 Match viz dataframe column case to form_data fields for Snowflake, Oracle and Redshift (#5487)
* Add function to fix dataframe column case

* Fix broken handle_nulls method

* Add case sensitivity option to dedup

* Refactor function definition and call location

* Remove added blank line

* Move df column rename logic to db_engine_spec

* Remove redundant variable

* Update comments in db_engine_specs

* Tie df adjustment to db_engine_spec class attribute

* Fix dedup error

* Linting

* Check for db_engine_spec attribute prior to adjustment

* Rename case sensitivity flag

* Linting

* Remove function that was moved to db_engine_specs

* Get metrics names from utils

* Remove double import and rename dedup variable
2018-08-03 09:53:56 -07:00
Maxime Beauchemin
fe6846b8db [sql lab] simplify the visualize flow (#5523)
* [sql lab] simplify the visualize flow

The "visualize flow" linking SQL Lab to the "explore view" has never
worked so great for people, here's a list of issues:

* it's not really clear to users that their query is wrapped as a
subquery, and the explore view runs queries on top of it

* lint + fix tests

* Addressing comments
2018-08-02 10:52:38 -07:00
Ville Brofeldt
c1e6c68a3e Add time grain blacklist and addons to config.py (#5380)
* Add interim grains

* Refactor and add blacklist

* Change PT30M to PT0.5H

* Linting

* Linting

* Add time grain addons to config.py and refactor engine spec logic

* Remove redundant import and clean up config.py

* Fix bad rebase

* Implement changes proposed by @betodealmeida

* Revert removal of name from Grain

* Linting
2018-07-30 23:44:30 -07:00
Maxime Beauchemin
cd55998d63 Improve hive/pyhive error message regex (#5502) 2018-07-27 08:31:37 -07:00
Maxime Beauchemin
41286b7545 [sql lab] extract Hive error messages (#5495)
* [sql lab] extract Hive error messages

So pyhive returns an exception object with a stringified thrift error
object. This PR uses a regex to extract the errorMessage portion of that
string.

* Unit test
2018-07-26 15:17:55 -07:00
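The approach described in the commit above, pulling the errorMessage portion out of a stringified thrift error with a regex, might look roughly like this. The sample string is an assumed shape of such an error, and the function name is hypothetical.

```python
import re


def extract_error_message(exc_text: str) -> str:
    """Return the errorMessage value if present, else the original text."""
    # Lazily match up to the first double quote that is not backslash-escaped.
    match = re.search(r'errorMessage="(.*?)(?<!\\)"', exc_text)
    return match.group(1) if match else exc_text


# Assumed example of a stringified error; real driver output may differ.
sample = (
    'TExecuteStatementResp(status=TStatus(statusCode=3, '
    'errorMessage="Error while compiling statement: FAILED: ParseException", '
    'sqlState="42000"))'
)
message = extract_error_message(sample)
fallback = extract_error_message("connection refused")
```

Falling back to the full text when the pattern is absent means the user still sees something useful for errors that do not follow this shape.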
Ville Brofeldt
a165aec822 Fix broken dedup and remove redundant db_spec logic (#5467)
* Fix broken dedup and remove redundant db_spec logic

* Add test case
2018-07-23 10:41:38 -07:00
John Bodley
7fcc2af68f [sql] Correct SQL parameter formatting (#5178) 2018-07-21 12:01:26 -07:00
George
0d5443e392 Add week granularity for Clickhouse (#5455) 2018-07-21 09:53:21 -07:00
timifasubaa
f8a6e09220 [sqllab] Fix sqllab limit regex issue with sqlparse (#5295)
* include items after limit to the modified query

* use sqlparse
2018-07-16 15:27:30 -07:00
timifasubaa
22b7c2db62 quote hive column names (#5368) 2018-07-13 15:51:16 -07:00
timifasubaa
28ba5a9ddb use schema form field in upload csv (#5303) 2018-07-06 09:46:53 -07:00
aaronbannin
252cba20de impala support for epoch timestamps (#5349) 2018-07-04 19:25:58 -04:00
EvelynTurner
ad9103f5ba [Bug fix] Divide by 1000.000 in epoch_ms_to_dttm() to not lose precision in Presto (#5211)
* Fix how the annotation layer interprets the timestamp string without timezone info; use it as UTC

* [Bug fix] Fixed/Refactored annotation layer code so that non-timeseries annotations are applied based on the updated chart object after adding all data

* [Bug fix] Fixed/Refactored annotation layer code so that non-timeseries annotations are applied based on the updated chart object after adding all data

* Fixed indentation

* Fix the key string value in case series.key is a string

* Fix the key string value in case series.key is a string

* [Bug fix] Divide by 1000.000 in epoch_ms_to_dttm() to not lose precision in Presto

* [Bug fix] Divide by 1000.000 in epoch_ms_to_dttm() to not lose precision in Presto
2018-07-04 19:19:57 -04:00
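The precision issue named in the commit title above presumably comes from dividing an integer millisecond value by an integer, which drops the sub-second part. The commit changes a SQL expression for Presto; the sketch below is only a Python analogue of the same arithmetic, with hypothetical function names.

```python
from datetime import datetime, timezone


def epoch_ms_to_dttm_truncating(ms: int) -> datetime:
    # Integer division discards the milliseconds before conversion.
    return datetime.fromtimestamp(ms // 1000, tz=timezone.utc)


def epoch_ms_to_dttm_precise(ms: int) -> datetime:
    # Dividing by a float keeps the fractional seconds.
    return datetime.fromtimestamp(ms / 1000.0, tz=timezone.utc)


ms_value = 1530746397500  # half a second past a whole second
truncated = epoch_ms_to_dttm_truncating(ms_value)
precise = epoch_ms_to_dttm_precise(ms_value)
```

The two results agree to the second, but only the float division preserves the half-second offset.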
Minh Mai
059b64dad7 normalize column names for Redshift (#5337) 2018-07-04 17:30:37 -04:00
Maxime Beauchemin
777d876a52 Improve database type inference (#4724)
* Improve database type inference

Python's DBAPI isn't super clear and homogeneous on the
cursor.description specification, and this PR attempts to improve
inferring the datatypes returned in the cursor.

This work started around Presto's TIMESTAMP type being mishandled as
string as the database driver (pyhive) returns it as a string. The work
here fixes this bug and does a better job at inferring MySQL and Presto types.
It also creates a new method in db_engine_specs allowing for other
database engines to implement and become more precise on type-inference
as needed.

* Fixing tests

* Addressing comments

* Using infer_objects

* Removing faulty line

* Addressing PrestoSpec redundant method comment

* Fix rebase issue

* Fix tests
2018-06-27 21:35:12 -07:00
timifasubaa
b0eee129e9 add more precise types to hive table from csv (#5267) 2018-06-25 16:12:01 -07:00
timifasubaa
bd24f854c9 specify hive namespace for tables (#5268) 2018-06-25 12:04:27 -07:00
timifasubaa
0e5293b9be Update db_engine_specs.py (#5264) 2018-06-21 16:01:34 -07:00
Maxime Beauchemin
c89933d870 [sql lab] quote schema and table name (#5195)
fixes https://github.com/apache/incubator-superset/issues/4595
2018-06-18 08:42:08 -07:00