Compare commits

2 Commits

Author SHA1 Message Date
Evan
de30eed14f chore(helm): bump chart version to 0.17.0
The chart lint requires a version bump when chart files change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 17:37:33 -07:00
Claude Code
3da2a210c7 fix(helm)!: replace dockerize initContainer with bash TCP wait
Drops `apache/superset:dockerize` from the chart entirely. The five
initContainers that gate startup on Postgres / Redis now run from the
same `apache/superset` image we're already pulling, using bash's
built-in `/dev/tcp/host/port` redirect for the readiness probe — no
external `dockerize`, `nc`, or busybox needed.

A trivy scan of the current published `apache/superset:dockerize`
(image created 2024-05-09, alpine 3.19.1 EOSL) found 3 CRITICAL,
25 HIGH, 71 MEDIUM, and 24 LOW CVEs — 64 of them in the bundled
`dockerize` Go binary itself (stale Go stdlib + golang.org/x/{net,
crypto}); the rest in the alpine base. Rebuilding the image on a
fresher base would just defer the same problem; removing the
dependency eliminates it.

Verified `/bin/bash` 5.2.15 is present in `apache/superset:latest`
and supports the `/dev/tcp` redirect (the image's `/bin/sh` is dash,
which does not — hence the explicit `/bin/bash` invocation).
Rendered the chart with `helm template` and confirmed all five
initContainers (supersetNode, init, supersetWorker,
supersetCeleryBeat, supersetCeleryFlower) emit the expected
bash-based probe and pull the main superset image.

The 120s timeout from `dockerize -timeout 120s` is preserved via a
SECONDS-based deadline in the bash loop. Two-port waits (postgres
+ redis) factor out a small `wait_for` helper to keep the script
readable.

BREAKING CHANGE: chart `values.yaml` no longer defines `initImage`.
Operators who customised `.Values.initImage.repository/tag/pullPolicy`
must remove those overrides — they are silently ignored. Operators
who fully overrode `.Values.supersetNode.initContainers` (etc.) are
unaffected; their override still wins. Chart bumped 0.15.5 → 0.16.0.

Closes #40424

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-10 16:09:17 -07:00
6 changed files with 96 additions and 367 deletions
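
The `helm template` verification described in the commit message can be repeated with a small filter over the rendered manifests. This is a minimal sketch, not part of the change: `check_rendered` is a hypothetical helper name, and the release name and chart path in the usage line are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads rendered manifests on stdin and reports how many
# lines use the bash /dev/tcp probe versus the removed dockerize command.
check_rendered() {
  local manifests tcp_probes dockerize_calls
  manifests=$(cat)
  # grep -c prints the count but exits 1 when it is zero, so ignore the status.
  tcp_probes=$(printf '%s\n' "$manifests" | grep -c '/dev/tcp/' || true)
  dockerize_calls=$(printf '%s\n' "$manifests" | grep -c 'dockerize -wait' || true)
  echo "tcp_probes=$tcp_probes dockerize_calls=$dockerize_calls"
  # Succeed only when no dockerize invocation is left in the output.
  [ "$dockerize_calls" -eq 0 ]
}
```

Assuming the chart lives at `./helm/superset`, usage would look like `helm template superset ./helm/superset | check_rendered`. The patterns deliberately match the probe and command lines rather than the explanatory comment in `values.yaml`, which mentions `dockerize` by name.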

View File

@@ -29,7 +29,7 @@ maintainers:
- name: craig-rueda
email: craig@craigrueda.com
url: https://github.com/craig-rueda
version: 0.16.0 # See [README](https://github.com/apache/superset/blob/master/helm/superset/README.md#versioning) for version details.
version: 0.17.0 # See [README](https://github.com/apache/superset/blob/master/helm/superset/README.md#versioning) for version details.
dependencies:
- name: postgresql
version: 16.7.27

View File

@@ -23,7 +23,7 @@ NOTE: This file is generated by helm-docs: https://github.com/norwoodj/helm-docs
# superset
![Version: 0.16.0](https://img.shields.io/badge/Version-0.16.0-informational?style=flat-square)
![Version: 0.17.0](https://img.shields.io/badge/Version-0.17.0-informational?style=flat-square)
Apache Superset is a modern, enterprise-ready business intelligence web application
@@ -111,9 +111,6 @@ On helm this can be set on `extraSecretEnv.SUPERSET_SECRET_KEY` or `configOverri
| init.resources | object | `{}` | |
| init.tolerations | list | `[]` | |
| init.topologySpreadConstraints | list | `[]` | TopologySpreadConstrains to be added to init job |
| initImage.pullPolicy | string | `"IfNotPresent"` | |
| initImage.repository | string | `"apache/superset"` | |
| initImage.tag | string | `"dockerize"` | |
| nameOverride | string | `nil` | Provide a name to override the name of the chart |
| nodeSelector | object | `{}` | |
| postgresql | object | see `values.yaml` | Configuration values for the postgresql dependency. ref: https://github.com/bitnami/charts/tree/main/bitnami/postgresql |
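
Because leftover `initImage` overrides are silently ignored after this change, operators may want to check their own values files before upgrading. The following is a rough sketch under stated assumptions: `has_init_image_override` is a hypothetical helper, and it only detects a top-level `initImage:` key in a YAML file, not overrides passed with `--set`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: exits 0 when the given values file still has a
# top-level `initImage:` key, which the chart no longer reads.
has_init_image_override() {
  grep -q '^initImage:' "$1"
}
```

For example, with a hypothetical `my-values.yaml`: `if has_init_image_override my-values.yaml; then echo "remove the initImage block"; fi`.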

View File

@@ -194,11 +194,6 @@ image:
imagePullSecrets: []
initImage:
repository: apache/superset
tag: dockerize
pullPolicy: IfNotPresent
service:
type: ClusterIP
port: 8088
@@ -303,15 +298,28 @@ supersetNode:
# @default -- a container waiting for postgres
initContainers:
- name: wait-for-postgres
image: "{{ .Values.initImage.repository }}:{{ .Values.initImage.tag }}"
imagePullPolicy: "{{ .Values.initImage.pullPolicy }}"
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: "{{ .Values.image.pullPolicy }}"
envFrom:
- secretRef:
name: "{{ tpl .Values.envFromSecret . }}"
command:
- /bin/sh
- /bin/bash
- -c
- dockerize -wait "tcp://$DB_HOST:$DB_PORT" -timeout 120s
- |
# bash's /dev/tcp redirect performs a TCP connect; no external
# `dockerize`, `nc`, or busybox needed. SECONDS-based deadline
# mirrors the prior `dockerize -timeout 120s` behaviour.
SECONDS=0
until (echo > /dev/tcp/"$DB_HOST"/"$DB_PORT") 2>/dev/null; do
if [ "$SECONDS" -ge 120 ]; then
echo "timeout waiting for postgres at $DB_HOST:$DB_PORT after 120s" >&2
exit 1
fi
echo "waiting for postgres at $DB_HOST:$DB_PORT (elapsed ${SECONDS}s)"
sleep 2
done
echo "postgres at $DB_HOST:$DB_PORT is up"
resources:
limits:
memory: "256Mi"
@@ -407,15 +415,31 @@ supersetWorker:
# @default -- a container waiting for postgres and redis
initContainers:
- name: wait-for-postgres-redis
image: "{{ .Values.initImage.repository }}:{{ .Values.initImage.tag }}"
imagePullPolicy: "{{ .Values.initImage.pullPolicy }}"
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: "{{ .Values.image.pullPolicy }}"
envFrom:
- secretRef:
name: "{{ tpl .Values.envFromSecret . }}"
command:
- /bin/sh
- /bin/bash
- -c
- dockerize -wait "tcp://$DB_HOST:$DB_PORT" -wait "tcp://$REDIS_HOST:$REDIS_PORT" -timeout 120s
- |
# See supersetNode.initContainers for the rationale.
SECONDS=0
wait_for() {
local host=$1 port=$2 name=$3
until (echo > /dev/tcp/"$host"/"$port") 2>/dev/null; do
if [ "$SECONDS" -ge 120 ]; then
echo "timeout waiting for $name at $host:$port after 120s" >&2
exit 1
fi
echo "waiting for $name at $host:$port (elapsed ${SECONDS}s)"
sleep 2
done
echo "$name at $host:$port is up"
}
wait_for "$DB_HOST" "$DB_PORT" postgres
wait_for "$REDIS_HOST" "$REDIS_PORT" redis
resources:
limits:
memory: "256Mi"
@@ -495,15 +519,31 @@ supersetCeleryBeat:
# @default -- a container waiting for postgres
initContainers:
- name: wait-for-postgres-redis
image: "{{ .Values.initImage.repository }}:{{ .Values.initImage.tag }}"
imagePullPolicy: "{{ .Values.initImage.pullPolicy }}"
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: "{{ .Values.image.pullPolicy }}"
envFrom:
- secretRef:
name: "{{ tpl .Values.envFromSecret . }}"
command:
- /bin/sh
- /bin/bash
- -c
- dockerize -wait "tcp://$DB_HOST:$DB_PORT" -wait "tcp://$REDIS_HOST:$REDIS_PORT" -timeout 120s
- |
# See supersetNode.initContainers for the rationale.
SECONDS=0
wait_for() {
local host=$1 port=$2 name=$3
until (echo > /dev/tcp/"$host"/"$port") 2>/dev/null; do
if [ "$SECONDS" -ge 120 ]; then
echo "timeout waiting for $name at $host:$port after 120s" >&2
exit 1
fi
echo "waiting for $name at $host:$port (elapsed ${SECONDS}s)"
sleep 2
done
echo "$name at $host:$port is up"
}
wait_for "$DB_HOST" "$DB_PORT" postgres
wait_for "$REDIS_HOST" "$REDIS_PORT" redis
resources:
limits:
memory: "256Mi"
@@ -594,15 +634,31 @@ supersetCeleryFlower:
# @default -- a container waiting for postgres and redis
initContainers:
- name: wait-for-postgres-redis
image: "{{ .Values.initImage.repository }}:{{ .Values.initImage.tag }}"
imagePullPolicy: "{{ .Values.initImage.pullPolicy }}"
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: "{{ .Values.image.pullPolicy }}"
envFrom:
- secretRef:
name: "{{ tpl .Values.envFromSecret . }}"
command:
- /bin/sh
- /bin/bash
- -c
- dockerize -wait "tcp://$DB_HOST:$DB_PORT" -wait "tcp://$REDIS_HOST:$REDIS_PORT" -timeout 120s
- |
# See supersetNode.initContainers for the rationale.
SECONDS=0
wait_for() {
local host=$1 port=$2 name=$3
until (echo > /dev/tcp/"$host"/"$port") 2>/dev/null; do
if [ "$SECONDS" -ge 120 ]; then
echo "timeout waiting for $name at $host:$port after 120s" >&2
exit 1
fi
echo "waiting for $name at $host:$port (elapsed ${SECONDS}s)"
sleep 2
done
echo "$name at $host:$port is up"
}
wait_for "$DB_HOST" "$DB_PORT" postgres
wait_for "$REDIS_HOST" "$REDIS_PORT" redis
resources:
limits:
memory: "256Mi"
@@ -764,15 +820,26 @@ init:
# @default -- a container waiting for postgres
initContainers:
- name: wait-for-postgres
image: "{{ .Values.initImage.repository }}:{{ .Values.initImage.tag }}"
imagePullPolicy: "{{ .Values.initImage.pullPolicy }}"
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: "{{ .Values.image.pullPolicy }}"
envFrom:
- secretRef:
name: "{{ tpl .Values.envFromSecret . }}"
command:
- /bin/sh
- /bin/bash
- -c
- dockerize -wait "tcp://$DB_HOST:$DB_PORT" -timeout 120s
- |
# See supersetNode.initContainers for the rationale.
SECONDS=0
until (echo > /dev/tcp/"$DB_HOST"/"$DB_PORT") 2>/dev/null; do
if [ "$SECONDS" -ge 120 ]; then
echo "timeout waiting for postgres at $DB_HOST:$DB_PORT after 120s" >&2
exit 1
fi
echo "waiting for postgres at $DB_HOST:$DB_PORT (elapsed ${SECONDS}s)"
sleep 2
done
echo "postgres at $DB_HOST:$DB_PORT is up"
resources:
limits:
memory: "256Mi"
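
The probe logic in the initContainers above can be tried outside the cluster. This is a standalone sketch and differs from the chart's scripts in three ways that are assumptions of this example, not part of the change: the timeout and poll interval are parameters, the deadline is computed relative to the current `SECONDS` value instead of resetting it, and the function uses `return 1` rather than `exit 1` so it can be called from an interactive shell.

```shell
#!/usr/bin/env bash
# Hypothetical standalone variant of the chart's wait loop.
# Usage: wait_for HOST PORT NAME [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
wait_for() {
  local host=$1 port=$2 name=$3 timeout=${4:-120} interval=${5:-2}
  # SECONDS is bash's built-in counter of seconds since the shell started.
  local deadline=$((SECONDS + timeout))
  # The redirect to /dev/tcp/HOST/PORT makes bash attempt a TCP connect;
  # the subshell closes the socket again as soon as it exits.
  until (echo > "/dev/tcp/$host/$port") 2>/dev/null; do
    if [ "$SECONDS" -ge "$deadline" ]; then
      echo "timeout waiting for $name at $host:$port after ${timeout}s" >&2
      return 1
    fi
    sleep "$interval"
  done
  echo "$name at $host:$port is up"
}
```

A call such as `wait_for "$DB_HOST" "$DB_PORT" postgres` mirrors the chart's defaults of a 120 second timeout and a 2 second poll interval. Like the chart's scripts, this needs bash; the diff switches the command from `/bin/sh` to `/bin/bash` because dash does not support the `/dev/tcp` redirect.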

View File

@@ -22,22 +22,13 @@ from typing import Any, TypedDict
from flask import current_app as app
from flask_babel import gettext as __
from superset import db, is_feature_enabled, security_manager
from superset import db, security_manager
from superset.commands.base import BaseCommand
from superset.errors import ErrorLevel, SupersetError, SupersetErrorType
from superset.exceptions import (
SupersetDisallowedSQLFunctionException,
SupersetDisallowedSQLTableException,
SupersetDMLNotAllowedException,
SupersetErrorException,
SupersetTimeoutException,
)
from superset.exceptions import SupersetErrorException, SupersetTimeoutException
from superset.jinja_context import get_template_processor
from superset.models.core import Database
from superset.models.sql_lab import Query
from superset.sql.parse import SQLScript
from superset.utils import core as utils
from superset.utils.rls import apply_rls
logger = logging.getLogger(__name__)
@@ -78,85 +69,6 @@ class QueryEstimationCommand(BaseCommand):
)
security_manager.raise_for_access(database=self._database)
def _apply_sql_security(self, sql: str) -> str:
"""Run the disallowed-function/table, DML and RLS controls against the
SQL to be estimated, mirroring ``sql_lab.execute_sql_statements``.
Returns the SQL with RLS predicates injected (when ``RLS_IN_SQLLAB`` is
enabled), so the cost estimate reflects the same constrained query the
user would actually be allowed to run.
"""
db_engine_spec = self._database.db_engine_spec
parsed_script = SQLScript(sql, engine=db_engine_spec.engine)
disallowed_functions = app.config["DISALLOWED_SQL_FUNCTIONS"].get(
db_engine_spec.engine,
set(),
)
if disallowed_functions and parsed_script.check_functions_present(
disallowed_functions
):
raise SupersetDisallowedSQLFunctionException(disallowed_functions)
disallowed_tables = app.config["DISALLOWED_SQL_TABLES"].get(
db_engine_spec.engine,
set(),
)
if disallowed_tables and parsed_script.check_tables_present(disallowed_tables):
found_tables = set()
for statement in parsed_script.statements:
present = {table.table.lower() for table in statement.tables}
for table in disallowed_tables:
if table.lower() in present:
found_tables.add(table)
raise SupersetDisallowedSQLTableException(found_tables or disallowed_tables)
if parsed_script.has_mutation() and not self._database.allow_dml:
raise SupersetDMLNotAllowedException()
if is_feature_enabled("RLS_IN_SQLLAB"):
# Resolve the default catalog/schema the same way the execution path
# does (``sql_lab.execute_sql_statements``) before injecting RLS.
# Crucially this goes through ``get_default_schema_for_query`` rather
# than the plain ``get_default_schema``, so engine-specific per-query
# security gates run too — e.g. ``PostgresEngineSpec`` rejects a query
# that sets ``search_path``. Resolving against the static default
# schema instead would both skip that gate and let unqualified tables
# dodge the RLS predicates the real query enforces, defeating the
# security parity this command exists to provide.
catalog = self._catalog or self._database.get_default_catalog()
# Build a transient (unsaved) Query so the engine spec can resolve the
# effective per-query schema exactly as the executor does. Mirror the
# probe built in ``SupersetSecurityManager.raise_for_access``: set a
# ``client_id`` (the column is ``nullable=False``) and expunge it, so
# the ``database`` backref's ``cascade="all, delete-orphan"`` cannot
# autoflush this incomplete row into the session when ``apply_rls``
# issues its own ``db.session`` query below.
probe_query = Query(
database=self._database,
sql=self._sql,
schema=self._schema or None,
catalog=catalog,
client_id=utils.shortid()[:10],
user_id=utils.get_user_id(),
)
db.session.expunge(probe_query)
# Always resolve through ``get_default_schema_for_query`` — even when
# the caller pinned a schema — so the engine's per-query security gate
# runs (e.g. ``PostgresEngineSpec`` rejects a query that sets
# ``search_path``), exactly as the executor does unconditionally. Only
# the resulting value falls back to the resolved default; an explicit
# schema still wins for the RLS predicate target.
resolved_schema = self._database.get_default_schema_for_query(
probe_query, self._template_params
)
schema = self._schema or resolved_schema or ""
for statement in parsed_script.statements:
apply_rls(self._database, catalog, schema, statement)
return parsed_script.format()
return sql
def run(
self,
) -> list[dict[str, Any]]:
@@ -167,12 +79,6 @@ class QueryEstimationCommand(BaseCommand):
template_processor = get_template_processor(self._database)
sql = template_processor.process_template(sql, **self._template_params)
# Apply the same SQL security controls used by the execution path
# (sql_lab.execute_sql_statements) so cost estimation cannot be used to
# probe disallowed functions/tables, bypass the DML guard, or confirm
# the existence of rows hidden by row-level security.
sql = self._apply_sql_security(sql)
timeout = app.config["SQLLAB_QUERY_COST_ESTIMATE_TIMEOUT"]
timeout_msg = f"The estimation exceeded the {timeout} seconds timeout."
try:

View File

@@ -112,35 +112,6 @@ class TestQueryEstimationCommand(SupersetTestCase):
result = command.run()
assert result == payload
@patch("superset.commands.sql_lab.estimate.is_feature_enabled", return_value=True)
def test_apply_sql_security_rls_does_not_pollute_session(
self, mock_is_feature_enabled: Mock
) -> None:
"""Regression test for the RLS schema-resolution probe Query.
``_apply_sql_security`` builds a transient ``Query`` so the engine spec
can resolve the effective per-query schema. Because the ``database``
backref cascades ``all, delete-orphan``, that transient joins the
session; if it isn't expunged, the very next ``apply_rls`` call issues
its own ``db.session`` query, autoflush fires, and the probe — whose
``client_id`` column is ``nullable=False`` — raises ``IntegrityError``.
A mocked session (as in the unit tests) hides this entirely, so exercise
the real session and real ``apply_rls`` here with ``RLS_IN_SQLLAB`` on.
"""
database = get_example_database()
params = {"database_id": database.id, "sql": "SELECT * FROM some_table"}
schema = EstimateQueryCostSchema()
data: EstimateQueryCostSchema = schema.dump(params)
command = estimate.QueryEstimationCommand(data)
command._database = database
with override_user(self.get_user("admin")):
# Must not raise IntegrityError from an autoflushed probe Query.
command._apply_sql_security("SELECT * FROM some_table")
# And no transient probe Query may be left pending in the session.
assert not any(isinstance(obj, Query) for obj in db.session.new)
class TestSqlResultExportCommand(SupersetTestCase):
@pytest.fixture

View File

@@ -16,7 +16,6 @@
# under the License.
"""Unit tests for resource-level authorization in QueryEstimationCommand."""
from typing import cast
from unittest.mock import MagicMock, patch
import pytest
@@ -144,214 +143,3 @@ def test_raise_for_access_called_with_correct_database(
call_kwargs = mock_security_manager.raise_for_access.call_args.kwargs
assert call_kwargs["database"] is mock_database
# ---------------------------------------------------------------------------
# SQL security controls applied on the estimate path (parity with executor)
# ---------------------------------------------------------------------------
def _make_command_with_db(
sql: str, *, allow_dml: bool = False, engine: str = "postgresql"
) -> QueryEstimationCommand:
command = QueryEstimationCommand(_make_params(sql=sql))
command._database = MagicMock()
command._database.db_engine_spec.engine = engine
command._database.allow_dml = allow_dml
command._catalog = None
command._schema = ""
return command
@patch("superset.commands.sql_lab.estimate.app")
def test_apply_sql_security_blocks_dml_when_not_allowed(mock_app: MagicMock) -> None:
mock_app.config = {"DISALLOWED_SQL_FUNCTIONS": {}, "DISALLOWED_SQL_TABLES": {}}
from superset.exceptions import SupersetDMLNotAllowedException
command = _make_command_with_db("INSERT INTO t VALUES (1)", allow_dml=False)
with pytest.raises(SupersetDMLNotAllowedException):
command._apply_sql_security("INSERT INTO t VALUES (1)")
@patch("superset.commands.sql_lab.estimate.app")
def test_apply_sql_security_allows_dml_when_enabled(mock_app: MagicMock) -> None:
mock_app.config = {"DISALLOWED_SQL_FUNCTIONS": {}, "DISALLOWED_SQL_TABLES": {}}
command = _make_command_with_db("INSERT INTO t VALUES (1)", allow_dml=True)
# No exception; SQL returned unchanged (RLS disabled by default).
assert command._apply_sql_security("INSERT INTO t VALUES (1)")
@patch("superset.commands.sql_lab.estimate.app")
def test_apply_sql_security_blocks_disallowed_table(mock_app: MagicMock) -> None:
mock_app.config = {
"DISALLOWED_SQL_FUNCTIONS": {},
"DISALLOWED_SQL_TABLES": {"postgresql": {"secrets"}},
}
from superset.exceptions import SupersetDisallowedSQLTableException
command = _make_command_with_db("SELECT * FROM secrets", allow_dml=True)
with pytest.raises(SupersetDisallowedSQLTableException):
command._apply_sql_security("SELECT * FROM secrets")
@patch("superset.commands.sql_lab.estimate.app")
def test_apply_sql_security_blocks_disallowed_function(mock_app: MagicMock) -> None:
"""A disallowed function cannot be probed via cost estimation either."""
mock_app.config = {
"DISALLOWED_SQL_FUNCTIONS": {"postgresql": {"PG_SLEEP"}},
"DISALLOWED_SQL_TABLES": {},
}
from superset.exceptions import SupersetDisallowedSQLFunctionException
command = _make_command_with_db("SELECT pg_sleep(1)", allow_dml=True)
with pytest.raises(SupersetDisallowedSQLFunctionException):
command._apply_sql_security("SELECT pg_sleep(1)")
@patch("superset.commands.sql_lab.estimate.app")
def test_apply_sql_security_allows_benign_select(mock_app: MagicMock) -> None:
"""A benign statement passes through unchanged (no false positives)."""
mock_app.config = {"DISALLOWED_SQL_FUNCTIONS": {}, "DISALLOWED_SQL_TABLES": {}}
command = _make_command_with_db("SELECT 1", allow_dml=False)
# No disallowed content, no mutation, RLS disabled -> returned unchanged.
assert command._apply_sql_security("SELECT 1") == "SELECT 1"
@patch("superset.commands.sql_lab.estimate.apply_rls")
@patch("superset.commands.sql_lab.estimate.Query")
@patch("superset.commands.sql_lab.estimate.db")
@patch("superset.commands.sql_lab.estimate.is_feature_enabled", return_value=True)
@patch("superset.commands.sql_lab.estimate.app")
def test_apply_sql_security_injects_rls_when_enabled(
mock_app: MagicMock,
mock_is_feature_enabled: MagicMock,
mock_db: MagicMock,
mock_query: MagicMock,
mock_apply_rls: MagicMock,
) -> None:
"""With RLS_IN_SQLLAB enabled, RLS predicates are applied per statement so
the estimate reflects the constrained query the user could actually run."""
mock_app.config = {"DISALLOWED_SQL_FUNCTIONS": {}, "DISALLOWED_SQL_TABLES": {}}
command = _make_command_with_db("SELECT * FROM t", allow_dml=False)
result = command._apply_sql_security("SELECT * FROM t")
mock_is_feature_enabled.assert_called_with("RLS_IN_SQLLAB")
mock_apply_rls.assert_called_once()
# The transient probe Query is expunged so its (deliberately incomplete)
# row can't autoflush into the session when apply_rls queries below.
mock_db.session.expunge.assert_called_once_with(mock_query.return_value)
assert isinstance(result, str)
@patch("superset.commands.sql_lab.estimate.Query")
@patch("superset.commands.sql_lab.estimate.db")
@patch("superset.commands.sql_lab.estimate.apply_rls")
@patch("superset.commands.sql_lab.estimate.is_feature_enabled", return_value=True)
@patch("superset.commands.sql_lab.estimate.app")
def test_apply_sql_security_resolves_default_schema_for_rls(
mock_app: MagicMock,
mock_is_feature_enabled: MagicMock,
mock_apply_rls: MagicMock,
mock_db: MagicMock,
mock_query: MagicMock,
) -> None:
"""When no catalog/schema is supplied, RLS must be applied against the
database's *resolved* default catalog/schema — mirroring the execution path
(``SQLExecutor`` / ``sql_lab.execute_sql_statements``). Passing the raw
``""``/``None`` would let unqualified tables dodge RLS predicates that the
real query enforces, defeating the security parity goal of this command.
"""
mock_app.config = {"DISALLOWED_SQL_FUNCTIONS": {}, "DISALLOWED_SQL_TABLES": {}}
command = _make_command_with_db("SELECT * FROM t", allow_dml=False)
database = cast(MagicMock, command._database)
# Caller passed nothing: schema is "" and catalog is None.
command._schema = ""
command._catalog = None
database.get_default_catalog.return_value = "default_catalog"
database.get_default_schema_for_query.return_value = "public"
command._apply_sql_security("SELECT * FROM t")
# Default catalog/schema are resolved before injection, in the same order
# as the executor (catalog first, then schema derived per-query). The schema
# goes through ``get_default_schema_for_query`` so engine-specific per-query
# security gates (e.g. the Postgres ``search_path`` check) run as well.
database.get_default_catalog.assert_called_once_with()
database.get_default_schema_for_query.assert_called_once()
# RLS is applied with the *resolved* values, never the raw ""/None.
# apply_rls(database, catalog, schema, statement)
call_args = mock_apply_rls.call_args.args
assert call_args[1] == "default_catalog"
assert call_args[2] == "public"
@patch("superset.commands.sql_lab.estimate.Query")
@patch("superset.commands.sql_lab.estimate.db")
@patch("superset.commands.sql_lab.estimate.apply_rls")
@patch("superset.commands.sql_lab.estimate.is_feature_enabled", return_value=True)
@patch("superset.commands.sql_lab.estimate.app")
def test_apply_sql_security_respects_explicit_catalog_schema(
mock_app: MagicMock,
mock_is_feature_enabled: MagicMock,
mock_apply_rls: MagicMock,
mock_db: MagicMock,
mock_query: MagicMock,
) -> None:
"""An explicitly supplied catalog short-circuits default-catalog resolution,
and the explicit schema wins as the RLS target — but the schema resolver
``get_default_schema_for_query`` is still invoked so the engine's per-query
security gate runs even when a schema is pinned (parity with the executor,
which calls it unconditionally)."""
mock_app.config = {"DISALLOWED_SQL_FUNCTIONS": {}, "DISALLOWED_SQL_TABLES": {}}
command = _make_command_with_db("SELECT * FROM t", allow_dml=False)
database = cast(MagicMock, command._database)
command._catalog = "my_catalog"
command._schema = "my_schema"
command._apply_sql_security("SELECT * FROM t")
# Explicit catalog wins, so the default-catalog lookup is skipped...
database.get_default_catalog.assert_not_called()
# ...but the schema gate must run even when a schema is pinned, otherwise an
# explicit-schema estimate could smuggle a ``SET search_path`` past the gate
# the executor enforces.
database.get_default_schema_for_query.assert_called_once()
call_args = mock_apply_rls.call_args.args
assert call_args[1] == "my_catalog"
assert call_args[2] == "my_schema"
@patch("superset.commands.sql_lab.estimate.Query")
@patch("superset.commands.sql_lab.estimate.db")
@patch("superset.commands.sql_lab.estimate.apply_rls")
@patch("superset.commands.sql_lab.estimate.is_feature_enabled", return_value=True)
@patch("superset.commands.sql_lab.estimate.app")
def test_apply_sql_security_propagates_engine_schema_gate(
mock_app: MagicMock,
mock_is_feature_enabled: MagicMock,
mock_apply_rls: MagicMock,
mock_db: MagicMock,
mock_query: MagicMock,
) -> None:
"""Default-schema resolution goes through ``get_default_schema_for_query``,
so an engine-specific per-query security gate (e.g. the Postgres
``search_path`` check that rejects ``SET search_path = ...``) is enforced on
the estimate path too, rather than being silently bypassed.
"""
mock_app.config = {"DISALLOWED_SQL_FUNCTIONS": {}, "DISALLOWED_SQL_TABLES": {}}
command = _make_command_with_db(
"SET search_path = secret; SELECT * FROM t", allow_dml=True
)
database = cast(MagicMock, command._database)
command._schema = ""
command._catalog = None
database.get_default_catalog.return_value = "default_catalog"
database.get_default_schema_for_query.side_effect = _security_exception()
with pytest.raises(SupersetSecurityException):
command._apply_sql_security("SET search_path = secret; SELECT * FROM t")
# RLS injection must not happen once the schema gate has rejected the query.
mock_apply_rls.assert_not_called()