This problem especially happens with pinot when you select two metrics
with different aliases but same function. For example, effectively the
sql like 'select type, count(*) as one, count(*) as two from bar group
by type'. In such a case, pinot will return two columns, both named
count_star.
So when we try to do a df['count_star'], the result is a Dataframe and
not a Series. This causes a KeyError in the get_df method.
So we push the DB specific label mutation inside get_df before we do any
other mutation.
This PR introduces the backend changes for a tagging system for Superset, allowing dashboards, charts and queries to be tagged. It also allows searching for a given tag, and will be the basis for a new landing page (see #5327).
# Implicit tags
Dashboard, chart and (saved) queries have implicit tags related to their owners, types and favorites. For example, all objects owned by the admin have the tag `owner:1`. All charts have the tag `type:chart`. Objects favorited by the admin have the tag `favorited_by:1`.
These tags are automatically added by a migration script, and kept in sync through SQLAlchemy event listeners. They are currently not surfaced to the user, but can be searched for. For example, it's possible to search for `owner:1` in the welcome page to see all objects owned by the admin, or even search for `owner:{{ current_user_id() }}`.
* Revert "creating new circular-json safe stringify and replacing one call (#6772)"
This reverts commit 11a7ad00b7.
* Revert "Improve Unicode support for MSSQL (#6690)"
This reverts commit c44ae612df.
* Revert "Fix uniqueness constraints on tables table (#6718)"
This reverts commit c4fb7a0a87.
Summary: Superset code enforces (in Table crud view pre_add) that the
table is unique within <database, schema, table_name). Indeed in commit
15b67b2c6c (in 2016), the model was
updated to reflect that. However, it was never ported over to a
migration.
I am fixing that in this diff. I am choosing to make this be a new
migration instead of fixing an existing one since I want to fix existing
installations also cleanly.
I also considered removing the uniqueness constraint, but that won't
work: First because anyway there are other places where the <database,
schema, table> uniqueness is enforced in code. But also, the .sql field
isn't a first citizen yet: The schema of the table is picked up from the
table-name and the sql part is only used when creating the explore
query. So indeed we want this uniqueness constraint. (Also it breaks the
unit tests in dict_import_export_tests.py)
[Perhaps it can be removed when we have true .sql support, but for now
the user would have to create a database view and he can use that as the
'table name'. That way he gets schema inference also]
Also added INFO logging to the alembic migration.
* [scheduled reports] Add support for scheduled reports
* Scheduled email reports for slice and dashboard visualization
(attachment or inline)
* Scheduled email reports for slice data (CSV attachment on inline table)
* Each schedule has a list of recipients (all of them can receive a single mail,
or separate mails)
* All outgoing mails can have a mandatory bcc - for audit purposes.
* Each dashboard/slice can have multiple schedules.
In addition, this PR also makes a few minor improvements to the celery
infrastructure.
* Create a common celery app
* Added more celery annotations for the tasks
* Introduced celery beat
* Update docs about concurrency / pools
* [scheduled reports] - Debug mode for scheduled emails
* [scheduled reports] - Ability to send test mails
* [scheduled reports] - Test email functionality - minor improvements
* [scheduled reports] - Rebase with master. Minor fixes
* [scheduled reports] - Add warning messages
* [scheduled reports] - flake8
* [scheduled reports] - fix rebase
* [scheduled reports] - fix rebase
* [scheduled reports] - fix flake8
* [scheduled reports] Rebase in prep for merge
* Fixed alembic tree after rebase
* Updated requirements to latest version of packages (and tested)
* Removed py2 stuff
* [scheduled reports] - fix flake8
* [scheduled reports] - address review comments
* [scheduled reports] - rebase with master
* Use py3's f-strings instead of s.format(**locals())
In light of the bug reported here
https://github.com/apache/incubator-superset/issues/6347, which seems
like an odd `.format()` issue in py3, I greped and replaced all
instances of `.format(**locals())` using py3's f-strings
* lint
* fix tests
* Deprecate database attribute allow_run_sync
There's really 2 modes of operations in SQL Lab, sync or async
though currently there are 2 boolean flags: allow_run_sync and
allow_run_async, leading to 4 potential combinations, only 2 of which
actually makes sense.
The original vision is that we'd expose the choice to users and they
would decide which `Run` or `Run Async` button to hit.
Later on we decided to have a
single button and for each database to be either sync or async.
This PR cleans up allow_run_sync by removing references to it.
* Fix build
* Add db migration
* using batch_op
* [SIP-5] Open a new /api/v1/query endpoint that takes query_obj
- Introduce a new handle_superset_exception decorator to avoid repeating the logic for catching SupersetExceptions
- Create a query_obj_backfill method that takes form_data and constructs a query_obj that will be constructed in the client in the future. Use the backfill in explore_json.
- Create a new /api/v1/query endpoint that takes query_obj only and returns the payload data. Note the query_obj is constructed in the client. The endpoint currently only handles query_obj for table view viz (we'll be adding support to new viz types as we go).
- Unit test to verify the new endpoint for table view
* fix tests and lint errors
* - Move the new query endpoint into its own api.py view.
- Create QueryObject and QueryContext class to encapsulate query_object to be built from the client and additional info (e.g. datasource) needed to get the data payload for a given query
- Remove the query_obj_backfill as we'll start building the first query_object on the client so it no longer makes sense to have a short-lived backfill for the matter of days.
* Fixing lint and test errors
* Fixing additional lint error from the previous rebase.
* fixing additional lint error
* addressing additional pr comments
* Make /query accept a list of queries in the query_context object.
* fixing a lint error
* - Move time_shift based calculation and since, until check into util
- Add typing info for get_since_until
- Add new unit tests to verify time_shift calculation and the since until check
* [utils] gathering/refactoring into a "utils/" folder
Moving current utils.py into utils/core.py and moving other *util*
modules under this new "utils/" as well.
Following steps include eroding at "utils/core.py" and breaking it down
into smaller modules.
* Improve tests
* Make loading examples in scope for tests
* Remove test class attrs examples_loaded and requires_examples
* Allow user to force refresh metadata
* fix javascript test error
* nit
* fix styling
* allow custom cache timeout configuration on any database
* minor improvement
* nit
* fix test
* nit
* preserve the old endpoint
* Add schema level access control on csv upload
* add db migrate merge point
* fix flake 8
* fix test
* remove unnecessary db migration
* fix flake
* nit
* fix test for test_schemas_access_for_csv_upload_endpoint
* fix test_csv_import test
* use security_manager to check whether schema is allowed to be accessed
* bring security manager to the party
* flake8 & repush to retrigger test
* address comments
* remove trailing comma
* add column to annotation model
* update migration file
* change to Text
* remove old migration file
* add migration file
* remove JSON
* add new column to view
* add comma back
* linting
* lint some more
* missing comma
* rename columns
* add degrade and new migration file
* update version
* fixe changed name
* remove json from list columns
* Template dashboard
* Fix MySQL test
* Model for user attributes
* Redirect to welcome dash
* Fix lint
* Add missing file
* Add migration script
* Fix lint
* Fix more lint
* Improve URLs for Chart and Dashboard ModelViews
Prior to this, the ModelView for Chart and Dashboard would be
at `/slicemodelview/list/` and `/dashboardmodelview/list/`.
Now we have cleaner URLs at `/chart/list/` and `/dashboard/list/`
* Fix unrelated js lint
* addressing comments
* [sql lab] simplify the visualize flow
The "visualize flow" linking SQL Lab to the "explore view" has never
worked so great for people, here's a list of issues:
* it's not really clear to users that their query is wrapped as a
subquery, and the explore view runs queries on top of it
* lint + fix tests
* Addressing comments
* Add interim grains
* Refactor and add blacklist
* Change PT30M to PT0.5H
* Linting
* Linting
* Add time grain addons to config.py and refactor engine spec logic
* Remove redundant import and clean up config.py
* Fix bad rebase
* Implement changes proposed by @betodealmeida
* Revert removal of name from Grain
* Linting
The dashboard page bootstrap data currently attempts to generate the
`SELECT` statement with column name details for each table represented
in the dash. This means it calls the database(s) and waits for column
details prior to returning any HTML.
This makes the default select_star property generate a simple
`SELECT *` with no column details instead, which doesn't require
interogating the DBs.
* WIP
* Working version
* Clean code
* Fix lint
* Fix unit test; show only for sqla
* Working on UX
* Dropdown working 66%
* Working but needs CSS
* Fixed table
* Fix lint
* Fix unit test
* Fix languages path
* Fixes
* Fix Javascript lint