Compare commits

7 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Beto Dealmeida | 8bda849d9c | Fix lint | 2026-05-19 20:18:50 -04:00 |
| Beto Dealmeida | cd8d1c1c92 | Custom form | 2026-05-19 20:14:26 -04:00 |
| Beto Dealmeida | 00d75f962b | OAuth2 | 2026-05-19 19:43:37 -04:00 |
| Beto Dealmeida | 3fb7c26b66 | Pass configuration | 2026-05-19 16:34:57 -04:00 |
| Beto Dealmeida | 41bf352f95 | feat: semantic layer DB engine spec | 2026-05-19 16:08:24 -04:00 |
| Evan Rusackas | 1230b9091b | docs: hide Component Playground top-level nav item (#40247). Co-authored-by: Claude Code <noreply@anthropic.com> | 2026-05-19 09:32:53 -07:00 |
| madhushreeag | 852d0182b5 | fix(roles): prevent 404 and silent user removal on large role edits (#40178). Co-authored-by: madhushree agarwal <madhushree_agarwal@apple.com> | 2026-05-19 09:13:43 -07:00 |
7 changed files with 785 additions and 172 deletions

View File

@@ -1,168 +0,0 @@
---
title: Dashboard Performance
hide_title: true
sidebar_position: 5
version: 1
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# Dashboard Performance

A dashboard's perceived speed is determined by three independent things: how
many charts have to render, how many queries the backend can execute
concurrently, and how quickly the underlying data warehouse can return
results. Superset gives you levers for the first two; the third belongs to
your warehouse. This page covers the dashboard-side levers and the practical
guidance around them.

## Is there a maximum chart count per dashboard?

**No hard limit is enforced** — Superset has no configuration key that
caps the number of charts on a dashboard. In practice, dashboards behave
well up to a few dozen charts. Beyond that, you'll typically feel friction
on the initial load and during cross-filter / time-range updates, even with
the lazy-loading optimizations described below.

Rough thresholds to keep in mind:

- **Under ~25 charts**: usually no perceptible problem.
- **25–50 charts**: still fine, but you start to want tabs to break the
  page into chunks the user actually looks at.
- **Over ~50 charts**: split into multiple dashboards or use tabs
  aggressively. The bottleneck is rarely Superset itself — it's the
  warehouse executing dozens of queries in parallel and the browser
  rendering dozens of chart frames.

These are guidelines, not guarantees. A dashboard of 100 sparkline-style
charts hitting a fast cache behaves very differently from a dashboard of
20 heavy aggregations against a cold warehouse.

## Lazy rendering — `DASHBOARD_VIRTUALIZATION`

Superset's dashboard layout is virtualized at the row level. Charts that
are far below the user's current scroll position are not rendered (and
therefore don't fetch data) until the user scrolls them into view, and they
are unmounted again if scrolled well past. This is on by default.

**Feature flag**: `DASHBOARD_VIRTUALIZATION` (default: `True`)

The flag is `stable` and marked for path-to-deprecation — meaning the
behavior will eventually be non-optional, but the flag still exists so
operators can disable it if a specific layout misbehaves.

**Behavior** (from `superset-frontend/src/dashboard/components/gridComponents/Row/Row.tsx`):

- A chart is rendered when its row scrolls within **1 viewport height** of
  the visible area.
- A chart is unmounted when its row scrolls more than **4 viewport
  heights** away from the visible area.
- Tabs that aren't currently selected don't render their content at all
  (see below).
- The unmounting half is skipped in **embedded** mode (so an embedded
  dashboard keeps its charts mounted once they've been seen, which avoids
  re-fetching on scroll-up). Both halves are skipped for **headless /
  bot** rendering (so screenshot / report jobs load every chart).
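
The mount/unmount thresholds listed above can be read as a small state machine. The following is only a sketch of that reading, written in Python for illustration; it is not the code in `Row.tsx`. The function names, the use of `<=` at the 1-viewport boundary, and the `embedded` / `headless` parameters are all assumptions.

```python
RENDER_MARGIN = 1.0   # viewport heights: mount when at least this close
UNMOUNT_MARGIN = 4.0  # viewport heights: unmount when farther than this


def distance_in_viewports(
    row_top: float, row_bottom: float, view_top: float, view_height: float
) -> float:
    """Gap between a row and the visible area, measured in viewport heights."""
    view_bottom = view_top + view_height
    if row_bottom < view_top:
        gap = view_top - row_bottom  # row is above the viewport
    elif row_top > view_bottom:
        gap = row_top - view_bottom  # row is below the viewport
    else:
        gap = 0.0  # row overlaps the viewport
    return gap / view_height


def next_mounted(
    mounted: bool,
    distance: float,
    *,
    embedded: bool = False,
    headless: bool = False,
) -> bool:
    """Decide whether a row's charts should be mounted after a scroll event."""
    if headless:
        return True  # both halves skipped: every chart loads
    if distance <= RENDER_MARGIN:
        return True  # close enough to render
    if mounted and distance > UNMOUNT_MARGIN and not embedded:
        return False  # scrolled well past: unmount (never in embedded mode)
    return mounted  # between the two thresholds: keep the current state
```

The gap between the two thresholds acts as hysteresis: a chart that has been mounted stays mounted while it is between 1 and 4 viewport heights away, so small scroll movements do not cause repeated mounting and re-fetching.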

## Deferred data fetch — `DASHBOARD_VIRTUALIZATION_DEFER_DATA`

By default, `DASHBOARD_VIRTUALIZATION` controls *rendering* — but charts
that don't render also don't fetch data, because Superset's chart
components issue their data request on mount. `DASHBOARD_VIRTUALIZATION_DEFER_DATA`
is a supplementary flag that further defers the data request itself, useful
for backends where opening a connection or compiling a query is expensive
even if the result is later thrown away.

**Feature flag**: `DASHBOARD_VIRTUALIZATION_DEFER_DATA` (default: `False`)

Enable this if you see warehouse load spike on dashboard *open* even
though most charts are off-screen.

## Per-tab lazy loading

**This is on by default and has no flag.** A tab's content is not rendered
until the user activates that tab, so charts inside an unselected tab do
not fetch data on dashboard open. When the user clicks the tab, that
tab's charts mount and fetch in the normal way.

Practically: tabs are the single most effective tool for a large
dashboard. Splitting 60 charts across 4 tabs effectively turns dashboard
open into "load ~15 charts," and the remaining ones lazy-load only if the
user goes looking.
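
The arithmetic behind the "60 charts across 4 tabs" example can be sketched as follows. The helper names are hypothetical, and the sketch assumes that every chart lives inside a tab and that only the active tab fetches on open.

```python
def split_evenly(total_charts: int, tab_count: int) -> list[int]:
    """Distribute charts across tabs as evenly as possible."""
    base, extra = divmod(total_charts, tab_count)
    return [base + (1 if i < extra else 0) for i in range(tab_count)]


def charts_fetched_on_open(charts_per_tab: list[int], active_tab: int = 0) -> int:
    """Only the active tab's charts mount, so only they fetch data."""
    return charts_per_tab[active_tab]


tabs = split_evenly(60, 4)
print(tabs, charts_fetched_on_open(tabs))
```

Row virtualization still applies inside the active tab, so the number of charts that fetch immediately can be lower than this figure.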

## Is there a switch to cap concurrent chart queries?

**No.** Superset does not implement a frontend-side concurrent-request
limiter. Each chart issues its own data request when it mounts, and the
browser handles parallelism (typically ~6 in-flight HTTP requests per
origin, then the rest queue). Backend throughput is bounded by your
Gunicorn worker count for synchronous query execution, or by your Celery
worker pool when [async queries](./async-queries-celery.mdx) are enabled.

If you need to throttle warehouse load, the right place is:

1. The warehouse itself (connection pool / concurrency limits).
2. Superset's Celery configuration (smaller worker pool when async
   queries are on).
3. Splitting heavy charts across tabs or separate dashboards (each
   dashboard load only fetches what's visible).
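
To get a feel for the browser-side queueing mentioned above, here is a back-of-the-envelope estimate. It assumes a fixed per-origin limit of 6 and that every request takes the same time, neither of which holds exactly in practice; the function names are illustrative.

```python
import math


def request_waves(chart_count: int, per_origin_limit: int = 6) -> int:
    """Number of sequential batches if the browser runs `per_origin_limit` at once."""
    if chart_count < 0 or per_origin_limit < 1:
        raise ValueError("chart_count must be >= 0 and per_origin_limit >= 1")
    return math.ceil(chart_count / per_origin_limit)


def rough_load_seconds(
    chart_count: int, seconds_per_request: float, per_origin_limit: int = 6
) -> float:
    """Crude estimate: each wave takes one request duration."""
    return request_waves(chart_count, per_origin_limit) * seconds_per_request
```

Under these assumptions, 15 visible charts need 3 waves while 60 need 10, which is one way to see why reducing the number of charts that mount at once helps.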

## Splitting strategies

When a dashboard outgrows comfortable performance, the options in order
of effort:

**1. Move sections into tabs.** Same dashboard, but only the active tab's
charts fetch. This is the cheapest change and often the only one needed.

**2. Cache aggressively.** A Redis cache backend (see
[Caching](./cache.mdx)) means repeat dashboard loads serve from cache
rather than re-hitting the warehouse. This is especially impactful for
dashboards opened by many users in close succession.

**3. Enable async queries.** [Async query execution](./async-queries-celery.mdx)
via Celery decouples query duration from request lifetime, so a slow
chart doesn't block the page. The user sees other charts come in as
their queries complete.

**4. Split into multiple dashboards.** Group related charts into purpose-specific
dashboards rather than one mega-dashboard. Link them from a
landing dashboard or a navigation menu.

**5. Pre-aggregate at the warehouse level.** If the same expensive
aggregation appears across many charts, materialize it as a view or
scheduled table in the warehouse so each chart query is a cheap lookup.
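
For option 2, a Redis-backed chart data cache in `superset_config.py` might look roughly like the sketch below. The Redis URL, key prefix, and timeout are placeholder assumptions to adapt to your deployment; the Caching page remains the reference for the supported keys.

```python
# superset_config.py (sketch; values are placeholders, not recommendations)
DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",                      # Flask-Caching backend name
    "CACHE_DEFAULT_TIMEOUT": 60 * 60,                # seconds; assumed 1 hour
    "CACHE_KEY_PREFIX": "superset_data_",            # assumed prefix
    "CACHE_REDIS_URL": "redis://localhost:6379/1",   # assumed local Redis
}
```

A longer timeout reduces warehouse load further but serves staler results, so the right value depends on how fresh each dashboard needs to be.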

## Operational notes

- The feature flags above are set in `superset_config.py`, e.g.:

  ```python
  FEATURE_FLAGS = {
      "DASHBOARD_VIRTUALIZATION": True,
      "DASHBOARD_VIRTUALIZATION_DEFER_DATA": True,
  }
  ```

- See [Feature Flags](./feature-flags.mdx) for the full list of supported
  flags and their lifecycle stages.
- Report and screenshot jobs (alerts, scheduled reports, dashboard
  exports) intentionally bypass row virtualization so the rendered
  artifact includes every chart, not just the ones above the fold.

View File

@@ -70,7 +70,7 @@
       }
     },
     "components": {
-      "disabled": false,
+      "disabled": true,
       "lastVersion": "current",
       "includeCurrentVersion": true,
       "onlyIncludeVersions": [

View File

@@ -169,6 +169,7 @@ describe('RoleListEditModal', () => {
true,
);
// updateRoleUsers is called with the hydrated user IDs
const userArg = mockUpdateRoleUsers.mock.calls[0][1];
expect(userArg).toEqual([5, 7]);
expect(userArg.every((id: unknown) => typeof id === 'number')).toBe(true);
@@ -225,6 +226,8 @@ describe('RoleListEditModal', () => {
     expect(decodedQuery).toEqual({
       page_size: 100,
       page: 0,
+      order_column: 'id',
+      order_direction: 'asc',
       filters: [
         {
           col: 'roles',

View File

@@ -129,8 +129,18 @@ function RoleListEditModal({
     fetchPaginatedData({
       endpoint: `/api/v1/security/users/`,
       pageSize: 100,
-      setData: setRoleUsers,
+      setData: (users: UserObject[]) => {
+        const seen = new Set<number>();
+        setRoleUsers(
+          users.filter(u => {
+            if (seen.has(u.id)) return false;
+            seen.add(u.id);
+            return true;
+          }),
+        );
+      },
       filters,
+      orderBy: { column: 'id', direction: 'asc' },
       setLoadingState: (loading: boolean) => setLoadingRoleUsers(loading),
       loadingKey: 'roleUsers',
       addDangerToast,
@@ -218,7 +228,6 @@ function RoleListEditModal({
value: user.id,
label: user.username,
}));
formRef.current.setFieldsValue({
roleUsers: userOptions,
});
@@ -279,8 +288,8 @@ function RoleListEditModal({
   const handleFormSubmit = async (values: RoleForm) => {
     try {
-      const userIds = values.roleUsers?.map(user => user.value) || [];
       const permissionIds = mapSelectedIds(values.rolePermissions);
+      const userIds = mapSelectedIds(values.roleUsers);
       const groupIds = mapSelectedIds(values.roleGroups);
       await Promise.all([
         updateRoleName(id, values.roleName),

View File

@@ -28,6 +28,7 @@ interface FetchPaginatedOptions {
   setData: (data: any[]) => void;
   setLoadingState: Dispatch<SetStateAction<any>>;
   filters?: SupersetFilter[];
+  orderBy?: { column: string; direction: 'asc' | 'desc' };
   loadingKey: string;
   addDangerToast: (message: string) => void;
   errorMessage?: string;
@@ -38,6 +39,8 @@ interface QueryObj {
   page_size: number;
   page: number;
   filters?: SupersetFilter[];
+  order_column?: string;
+  order_direction?: 'asc' | 'desc';
 }

 interface SupersetFilter {
@@ -51,6 +54,7 @@ export const fetchPaginatedData = async ({
   pageSize = 100,
   setData,
   filters,
+  orderBy,
   setLoadingState,
   loadingKey,
   addDangerToast,
@@ -66,6 +70,10 @@ export const fetchPaginatedData = async ({
       if (filters) {
         queryObj.filters = filters;
       }
+      if (orderBy) {
+        queryObj.order_column = orderBy.column;
+        queryObj.order_direction = orderBy.direction;
+      }
       const encodedQuery = rison.encode(queryObj);
       const response = await SupersetClient.get({

View File

@@ -0,0 +1,348 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""
An interface to any server implementing the Semantic Layer REST API.
"""

from __future__ import annotations

from typing import Any, TYPE_CHECKING, TypedDict

from apispec import APISpec
from apispec.ext.marshmallow import MarshmallowPlugin
from flask import g
from marshmallow import fields, Schema
from shillelagh.exceptions import UnauthenticatedError
from sqlalchemy.engine.url import URL

from superset.databases.types import EncryptedDict
from superset.databases.utils import make_url_safe
from superset.db_engine_specs.base import MetricType
from superset.db_engine_specs.shillelagh import ShillelaghEngineSpec
from superset.errors import ErrorLevel, SupersetError, SupersetErrorType
from superset.exceptions import OAuth2TokenRefreshError
from superset.utils import json

if TYPE_CHECKING:
    from sqlalchemy.engine.reflection import Inspector

    from superset.models.core import Database
    from superset.sql.parse import Table
    from superset.superset_typing import ResultSetColumnType

SELECT_STAR_MESSAGE = (
    "The Semantic Layer API does not support data preview, since the view is a "
    "virtual table that is not materialized. An administrator should configure "
    'the database in Apache Superset so that the "Disable SQL Lab data preview '
    'queries" option under "Advanced" → "SQL Lab" is enabled.'
)

_TRUTHY = {"1", "true", "yes", "on"}

ma_plugin = MarshmallowPlugin()


class SemanticAPIParametersSchema(Schema):
    """Form schema for the Semantic Layer API connection wizard."""

    host = fields.Str(
        required=True,
        metadata={"description": "Hostname of the Semantic Layer API server."},
    )
    port = fields.Int(
        required=False,
        allow_none=True,
        metadata={"description": "Optional port (omit to use the URL default)."},
    )
    secure = fields.Bool(
        load_default=False,
        metadata={"description": "Use HTTPS to reach the server."},
    )
    additional_configuration = fields.Dict(
        required=False,
        allow_none=True,
        metadata={
            "description": (
                "Per-tenant JSON object forwarded to the server on every call. "
                "Sent as ``runtime_configuration`` when listing views and as "
                "``additional_configuration`` for each view."
            ),
        },
    )
    oauth2_client_info = EncryptedDict(
        required=False,
        allow_none=True,
        metadata={
            "description": (
                "OAuth2 client credentials. Provide ``id`` and ``secret``; the "
                "authorisation and token URIs are auto-filled from the server "
                "host unless explicitly set."
            ),
            "default": {"id": "", "secret": "", "scope": ""},
        },
    )


class SemanticAPIParametersType(TypedDict, total=False):
    host: str
    port: int | None
    secure: bool
    additional_configuration: dict[str, Any] | None
    oauth2_client_info: dict[str, Any] | None


class SemanticAPIPropertiesType(TypedDict, total=False):
    parameters: SemanticAPIParametersType
    masked_encrypted_extra: str


class SemanticAPIEngineSpec(ShillelaghEngineSpec):
    """
    Engine for the Semantic Layer REST API.

    Connection URL::

        semanticapi://<host>[:port]/[?secure=true]

    OAuth2 is supported. The database's ``encrypted_extra`` should look like::

        {
            "oauth2_client_info": {
                "id": "demo-client",
                "secret": "demo-secret",
                "scope": "",
                "authorization_request_uri": "http://<host>:<port>/authorize",
                "token_request_uri": "http://<host>:<port>/token"
            }
        }

    Per-tenant ``additional_configuration`` is forwarded through
    ``extra.engine_params.connect_args.additional_configuration``.
    """

    engine = "semanticapi"
    engine_name = "Semantic Layer API"
    default_driver = "apsw"
    sqlalchemy_uri_placeholder = "semanticapi://<host>[:port]/?secure=<true|false>"

    parameters_schema = SemanticAPIParametersSchema()

    # OAuth 2.0 — the authorisation and token URIs live on each database's
    # ``encrypted_extra.oauth2_client_info`` since they're derived from that
    # database's server.
    supports_oauth2 = True
    oauth2_exception = (UnauthenticatedError, OAuth2TokenRefreshError)
    oauth2_token_request_type = "data"  # noqa: S105

    # pylint: disable=invalid-name
    encrypted_extra_sensitive_fields = {
        "$.oauth2_client_info.secret": "OAuth2 Client Secret",
    }

    @classmethod
    def build_sqlalchemy_uri(
        cls,
        parameters: SemanticAPIParametersType,
        encrypted_extra: dict[str, Any] | None = None,
    ) -> str:
        """
        Turn form parameters into a ``semanticapi://`` URL.

        ``additional_configuration`` is JSON-encoded onto the query string,
        and the OAuth2 ``authorization_request_uri`` / ``token_request_uri``
        are filled in (if missing) from the server's host:port.
        """
        host = parameters.get("host") or ""
        port = parameters.get("port")
        secure = bool(parameters.get("secure"))

        query: dict[str, str] = {}
        if secure:
            query["secure"] = "true"
        if config := parameters.get("additional_configuration"):
            query["additional_configuration"] = (
                config if isinstance(config, str) else json.dumps(config)
            )

        if encrypted_extra and (oauth2 := encrypted_extra.get("oauth2_client_info")):
            scheme = "https" if secure else "http"
            netloc = f"{host}:{port}" if port else host
            base = f"{scheme}://{netloc}"
            oauth2.setdefault("authorization_request_uri", f"{base}/authorize")
            oauth2.setdefault("token_request_uri", f"{base}/token")
            oauth2.setdefault("scope", "")

        return str(URL.create(cls.engine, host=host, port=port, query=query))

    @classmethod
    def get_parameters_from_uri(
        cls,
        uri: str,
        encrypted_extra: dict[str, Any] | None = None,
    ) -> SemanticAPIParametersType:
        """
        Inverse of :meth:`build_sqlalchemy_uri` for repopulating the form.
        """
        url = make_url_safe(uri)
        parameters: SemanticAPIParametersType = {
            "host": url.host or "",
            "port": url.port,
            "secure": str(url.query.get("secure", "")).lower() in _TRUTHY,
        }
        if raw := url.query.get("additional_configuration"):
            try:
                parameters["additional_configuration"] = json.loads(raw)
            except json.JSONDecodeError:
                parameters["additional_configuration"] = None
        if encrypted_extra and "oauth2_client_info" in encrypted_extra:
            parameters["oauth2_client_info"] = encrypted_extra["oauth2_client_info"]
        return parameters

    @classmethod
    def parameters_json_schema(cls) -> Any:
        """Return the form's OpenAPI schema, used by the frontend to render it."""
        # imported lazily because ``superset.databases.schemas`` touches the
        # Flask app context at import time.
        from superset.databases.schemas import encrypted_field_properties

        if not cls.parameters_schema:
            return None

        spec = APISpec(
            title="Database Parameters",
            version="1.0.0",
            openapi_version="3.0.0",
            plugins=[ma_plugin],
        )
        ma_plugin.init_spec(spec)
        ma_plugin.converter.add_attribute_function(encrypted_field_properties)
        spec.components.schema(cls.__name__, schema=cls.parameters_schema)
        return spec.to_dict()["components"]["schemas"][cls.__name__]

    @classmethod
    def validate_parameters(
        cls,
        properties: SemanticAPIPropertiesType,
    ) -> list[SupersetError]:
        """Surface missing ``host`` (the only field we strictly require)."""
        errors: list[SupersetError] = []
        parameters = properties.get("parameters", {})
        if not parameters.get("host"):
            errors.append(
                SupersetError(
                    message="Host is required.",
                    error_type=SupersetErrorType.CONNECTION_MISSING_PARAMETERS_ERROR,
                    level=ErrorLevel.WARNING,
                    extra={"missing": ["host"]},
                ),
            )
        return errors

    @classmethod
    def needs_oauth2(cls, ex: Exception) -> bool:
        """
        ``UnauthenticatedError`` from the adapter signals an expired/missing
        bearer token — i.e. we need to run the OAuth2 dance.
        """
        return bool(g) and hasattr(g, "user") and isinstance(ex, cls.oauth2_exception)

    @classmethod
    def impersonate_user(
        cls,
        database: Database,  # noqa: ARG003
        username: str | None,  # noqa: ARG003
        user_token: str | None,
        url: URL,
        engine_kwargs: dict[str, Any],
    ) -> tuple[URL, dict[str, Any]]:
        """
        Inject the user's OAuth2 access token into the URL query string so
        that the dialect (and therefore the adapter) sends it as a Bearer.
        """
        if user_token:
            url = url.update_query_dict({"access_token": user_token})
        return url, engine_kwargs

    @classmethod
    def adjust_engine_params(
        cls,
        uri: URL,
        connect_args: dict[str, Any],
        catalog: str | None = None,
        schema: str | None = None,
    ) -> tuple[URL, dict[str, Any]]:
        """
        Fold ``additional_configuration`` from the database's ``extra`` field
        (placed by the user under ``engine_params.connect_args``) into the URL
        query string, so the dialect can pick it up.
        """
        uri, connect_args = super().adjust_engine_params(
            uri, connect_args, catalog, schema
        )
        if (config := connect_args.pop("additional_configuration", None)) is not None:
            query = dict(uri.query)
            query["additional_configuration"] = (
                config if isinstance(config, str) else json.dumps(config)
            )
            uri = uri.set(query=query)
        return uri, connect_args

    @classmethod
    def select_star(cls, *args: Any, **kwargs: Any) -> str:
        """
        Return a stand-in ``SELECT *`` that explains why preview is disabled.
        """
        message = SELECT_STAR_MESSAGE.replace("'", "''")
        return f"SELECT '{message}' AS warning"

    @classmethod
    def get_columns(
        cls,
        inspector: Inspector,
        table: Table,
        options: dict[str, Any] | None = None,
    ) -> list[ResultSetColumnType]:
        """
        Return only the view's dimensions; metrics are surfaced by ``get_metrics``.
        """
        columns: list[ResultSetColumnType] = []
        for column in inspector.get_columns(table.table, table.schema):
            if "computed" in column:
                continue
            column["column_name"] = column["name"]
            columns.append(column)
        return columns

    @classmethod
    def get_metrics(
        cls,
        database: Database,  # noqa: ARG003
        inspector: Inspector,
        table: Table,
    ) -> list[MetricType]:
        """
        Translate the view's metric columns into Superset metric definitions.
        """
        return [
            {
                "metric_name": column["name"],
                "expression": column["computed"]["sqltext"],
                "description": column["comment"],
            }
            for column in inspector.get_columns(table.table, table.schema)
            if "computed" in column
        ]

View File

@@ -0,0 +1,413 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# pylint: disable=import-outside-toplevel

from typing import Any
from unittest.mock import MagicMock, patch

from sqlalchemy.engine.url import make_url

from superset.utils import json


def _columns() -> list[dict[str, Any]]:
    return [
        {
            "name": "region",
            "type": "TEXT",
            "nullable": True,
            "default": None,
            "comment": "dimension",
        },
        {
            "name": "total_revenue",
            "type": "FLOAT",
            "nullable": True,
            "default": None,
            "comment": "metric",
            "computed": {"sqltext": "total_revenue", "persisted": True},
        },
    ]


def test_engine_spec_identity() -> None:
    """
    The engine and dialect names line up with the shillelagh dialect.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    assert SemanticAPIEngineSpec.engine == "semanticapi"
    assert SemanticAPIEngineSpec.engine_name == "Semantic Layer API"
    assert "semanticapi://" in SemanticAPIEngineSpec.sqlalchemy_uri_placeholder


def test_select_star_returns_warning() -> None:
    """
    Data preview is replaced with a single-row warning message.
    """
    from superset.db_engine_specs.semantic_api import (
        SELECT_STAR_MESSAGE,
        SemanticAPIEngineSpec,
    )

    sql = SemanticAPIEngineSpec.select_star()
    assert sql.startswith("SELECT '")
    assert sql.endswith("' AS warning")
    # The single-quote escape is preserved verbatim.
    assert SELECT_STAR_MESSAGE.replace("'", "''") in sql


def test_get_columns_filters_metrics() -> None:
    """
    Only non-computed columns (dimensions) flow through ``get_columns``.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    inspector = MagicMock()
    inspector.get_columns.return_value = _columns()
    table = MagicMock()
    table.table = "sales"
    table.schema = None

    columns = SemanticAPIEngineSpec.get_columns(inspector, table)
    assert [c["name"] for c in columns] == ["region"]
    assert columns[0]["column_name"] == "region"


def test_adjust_engine_params_folds_extra_into_url() -> None:
    """
    ``additional_configuration`` from ``connect_args`` ends up on the URL query.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    uri, connect_args = SemanticAPIEngineSpec.adjust_engine_params(
        make_url("sqlite://"),
        {"additional_configuration": {"workspace": "acme"}, "other": 1},
    )
    assert "additional_configuration" not in connect_args
    assert connect_args["other"] == 1
    assert json.loads(uri.query["additional_configuration"]) == {"workspace": "acme"}


def test_adjust_engine_params_string_passthrough() -> None:
    """
    A pre-serialised string config is forwarded verbatim.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    uri, _ = SemanticAPIEngineSpec.adjust_engine_params(
        make_url("sqlite://"),
        {"additional_configuration": '{"workspace":"acme"}'},
    )
    assert uri.query["additional_configuration"] == '{"workspace":"acme"}'


def test_adjust_engine_params_no_extra() -> None:
    """
    Without ``additional_configuration`` the URL is untouched.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    uri, connect_args = SemanticAPIEngineSpec.adjust_engine_params(
        make_url("sqlite://"),
        {"other": 1},
    )
    assert uri.query == {}
    assert connect_args == {"other": 1}


def test_supports_oauth2_flag() -> None:
    """
    The engine spec advertises OAuth2 support and treats ``UnauthenticatedError``
    as the trigger to start the dance.
    """
    from shillelagh.exceptions import UnauthenticatedError

    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec
    from superset.exceptions import OAuth2TokenRefreshError

    assert SemanticAPIEngineSpec.supports_oauth2 is True
    assert UnauthenticatedError in SemanticAPIEngineSpec.oauth2_exception
    assert OAuth2TokenRefreshError in SemanticAPIEngineSpec.oauth2_exception
    assert SemanticAPIEngineSpec.encrypted_extra_sensitive_fields == {
        "$.oauth2_client_info.secret": "OAuth2 Client Secret",
    }


def test_needs_oauth2_for_unauthenticated() -> None:
    """
    ``UnauthenticatedError`` with a logged-in user starts the dance.
    """
    from shillelagh.exceptions import UnauthenticatedError

    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    fake_g = MagicMock()
    fake_g.user = MagicMock()
    with patch("superset.db_engine_specs.semantic_api.g", fake_g):
        assert (
            SemanticAPIEngineSpec.needs_oauth2(UnauthenticatedError("expired")) is True
        )


def test_needs_oauth2_ignores_unrelated_exceptions() -> None:
    """
    Unrelated exception types don't trigger the dance.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    fake_g = MagicMock()
    fake_g.user = MagicMock()
    with patch("superset.db_engine_specs.semantic_api.g", fake_g):
        assert SemanticAPIEngineSpec.needs_oauth2(RuntimeError("boom")) is False


def test_needs_oauth2_requires_user_context() -> None:
    """
    Outside a request (no ``g.user``) the dance is not started.
    """
    from shillelagh.exceptions import UnauthenticatedError

    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    class _NoUserG:
        def __bool__(self) -> bool:
            return True

    with patch("superset.db_engine_specs.semantic_api.g", _NoUserG()):
        assert (
            SemanticAPIEngineSpec.needs_oauth2(UnauthenticatedError("expired")) is False
        )


def test_impersonate_user_injects_access_token() -> None:
    """
    With a cached OAuth2 token the URL gains ``?access_token=…``.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    url, engine_kwargs = SemanticAPIEngineSpec.impersonate_user(
        database=MagicMock(),
        username=None,
        user_token="demo-access-token",  # noqa: S106
        url=make_url("sqlite://"),
        engine_kwargs={"connect_args": {}},
    )
    assert url.query["access_token"] == "demo-access-token"  # noqa: S105
    assert engine_kwargs == {"connect_args": {}}


def test_impersonate_user_without_token_is_a_noop() -> None:
    """
    Without a token, neither the URL nor engine_kwargs are changed.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    original_url = make_url("sqlite://")
    url, engine_kwargs = SemanticAPIEngineSpec.impersonate_user(
        database=MagicMock(),
        username=None,
        user_token=None,
        url=original_url,
        engine_kwargs={},
    )
    assert url is original_url
    assert engine_kwargs == {}


def test_build_sqlalchemy_uri_minimum() -> None:
    """
    Just host → ``semanticapi://host``.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    assert (
        SemanticAPIEngineSpec.build_sqlalchemy_uri(
            {"host": "localhost"},
        )
        == "semanticapi://localhost"
    )


def test_build_sqlalchemy_uri_full() -> None:
    """
    Host, port, secure, ``additional_configuration`` and OAuth client info all
    round-trip cleanly. The OAuth2 URIs get auto-filled from host:port.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    encrypted_extra = {
        "oauth2_client_info": {"id": "demo-client", "secret": "demo-secret"},
    }
    uri = SemanticAPIEngineSpec.build_sqlalchemy_uri(
        {
            "host": "h",
            "port": 8000,
            "secure": True,
            "additional_configuration": {"workspace": "acme"},
        },
        encrypted_extra,
    )
    parsed = make_url(uri)
    assert parsed.host == "h"
    assert parsed.port == 8000
    assert parsed.query["secure"] == "true"
    assert json.loads(parsed.query["additional_configuration"]) == {"workspace": "acme"}

    oauth2 = encrypted_extra["oauth2_client_info"]
    assert oauth2["authorization_request_uri"] == "https://h:8000/authorize"
    assert oauth2["token_request_uri"] == "https://h:8000/token"  # noqa: S105
    assert oauth2["scope"] == ""


def test_build_sqlalchemy_uri_oauth_uri_overrides_preserved() -> None:
    """
    Existing OAuth URIs are left alone (no clobbering of explicit overrides).
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    encrypted_extra = {
        "oauth2_client_info": {
            "id": "x",
            "secret": "y",
            "authorization_request_uri": "https://idp/authorize",
            "token_request_uri": "https://idp/token",
        },
    }
    SemanticAPIEngineSpec.build_sqlalchemy_uri({"host": "h"}, encrypted_extra)
    oauth2 = encrypted_extra["oauth2_client_info"]
    assert oauth2["authorization_request_uri"] == "https://idp/authorize"
    assert oauth2["token_request_uri"] == "https://idp/token"  # noqa: S105


def test_build_sqlalchemy_uri_additional_configuration_string() -> None:
    """
    A pre-serialised ``additional_configuration`` string is forwarded as-is.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    uri = SemanticAPIEngineSpec.build_sqlalchemy_uri(
        {"host": "h", "additional_configuration": '{"x":1}'},  # type: ignore[typeddict-item]
    )
    assert make_url(uri).query["additional_configuration"] == '{"x":1}'


def test_get_parameters_from_uri_roundtrip() -> None:
    """
    A URL produced by ``build_sqlalchemy_uri`` is parsed back to the same parameters.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    encrypted_extra = {"oauth2_client_info": {"id": "x", "secret": "y"}}
    uri = SemanticAPIEngineSpec.build_sqlalchemy_uri(
        {
            "host": "h",
            "port": 8000,
            "secure": True,
            "additional_configuration": {"workspace": "acme"},
        },
        encrypted_extra,
    )
    params = SemanticAPIEngineSpec.get_parameters_from_uri(uri, encrypted_extra)
    assert params["host"] == "h"
    assert params["port"] == 8000
    assert params["secure"] is True
    assert params["additional_configuration"] == {"workspace": "acme"}
    oauth2 = params["oauth2_client_info"]
    assert oauth2 is not None
    assert oauth2["id"] == "x"


def test_get_parameters_from_uri_invalid_additional_configuration() -> None:
    """
    Garbage in ``additional_configuration`` doesn't blow up — set to None.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    params = SemanticAPIEngineSpec.get_parameters_from_uri(
        "semanticapi://h/?additional_configuration=not-json",
    )
    assert params["additional_configuration"] is None


def test_validate_parameters_missing_host() -> None:
    """
    A missing host produces a clear validation error.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    errors = SemanticAPIEngineSpec.validate_parameters({"parameters": {}})
    assert len(errors) == 1
    extra = errors[0].extra
    assert extra is not None
    assert extra["missing"] == ["host"]


def test_validate_parameters_happy() -> None:
    """
    With a host, no errors are reported.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    assert (
        SemanticAPIEngineSpec.validate_parameters(
            {"parameters": {"host": "localhost"}},
        )
        == []
    )


def test_parameters_json_schema_exposes_fields() -> None:
    """
    ``parameters_json_schema`` advertises every form field so the frontend
    can render it.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    schema = SemanticAPIEngineSpec.parameters_json_schema()
    properties = schema["properties"]
    assert set(properties).issuperset(
        {"host", "port", "secure", "additional_configuration", "oauth2_client_info"},
    )
    # oauth2_client_info is marked as encrypted so the frontend stores it in
    # ``encrypted_extra``.
    assert properties["oauth2_client_info"].get("x-encrypted-extra") is True


def test_get_metrics_extracts_computed() -> None:
    """
    Computed columns become Superset metric definitions.
    """
    from superset.db_engine_specs.semantic_api import SemanticAPIEngineSpec

    inspector = MagicMock()
    inspector.get_columns.return_value = _columns()
    table = MagicMock()
    table.table = "sales"
    table.schema = None

    metrics = SemanticAPIEngineSpec.get_metrics(MagicMock(), inspector, table)
    assert metrics == [
        {
            "metric_name": "total_revenue",
            "expression": "total_revenue",
            "description": "metric",
        },
    ]