mirror of
https://github.com/apache/superset.git
synced 2026-06-11 18:49:15 +00:00
Compare commits
7 Commits
tanstack-r
...
feat/aes-g
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
10c49b61f9 | ||
|
|
6671c90103 | ||
|
|
0fe86ea0c3 | ||
|
|
731a476acc | ||
|
|
05d87edf27 | ||
|
|
782c1db8d8 | ||
|
|
f748de3423 |
30
UPDATING.md
30
UPDATING.md
@@ -59,6 +59,36 @@ Both default to empty (no behavior change). They apply to both the `LOCAL_EXTENS
|
||||
|
||||
The Dynamic Group By chart customization now orders its display values according to the "Sort display control values" toggle: ascending (A–Z), descending (Z–A), or the dataset's source order when the toggle is unset. Previously the dropdown always sorted alphabetically. Existing dashboards where the toggle was never set will show options in source order instead of A–Z; open the customization and enable the toggle to restore alphabetical ordering.
|
||||
|
||||
### Selectable encryption engine for app-encrypted fields (AES-GCM)
|
||||
|
||||
App-encrypted fields (database passwords, SSH tunnel credentials, OAuth tokens, etc.) can now use authenticated **AES-GCM** encryption instead of the historical unauthenticated **AES-CBC**. A new config selects the engine for the default adapter:
|
||||
|
||||
```python
|
||||
# "aes" (AES-CBC, historical default) | "aes-gcm" (authenticated, recommended for new installs)
|
||||
SQLALCHEMY_ENCRYPTED_FIELD_ENGINE = "aes"
|
||||
```
|
||||
|
||||
**No action required / no behavior change:** the default remains `"aes"`, so existing installs are unaffected.
|
||||
|
||||
**Opting in on an existing install:** flipping the engine on a populated database without re-encrypting first will make stored secrets undecryptable, because the two ciphertext formats are not compatible. A migrator is provided. Recommended runbook:
|
||||
|
||||
1. Take a metadata-DB backup.
|
||||
2. Re-encrypt existing secrets into the new engine (the `SECRET_KEY` is unchanged):
|
||||
```bash
|
||||
superset re-encrypt-secrets --engine aes-gcm
|
||||
```
|
||||
3. Set `SQLALCHEMY_ENCRYPTED_FIELD_ENGINE = "aes-gcm"` in your config.
|
||||
4. Restart Superset.
|
||||
5. Re-run the migrator once more after the restart:
|
||||
```bash
|
||||
superset re-encrypt-secrets --engine aes-gcm
|
||||
```
|
||||
A live instance keeps writing *new* secrets as AES-CBC during the window between step 2 and the restart in step 4; this second pass sweeps those up (it is idempotent, so already-migrated values are skipped).
|
||||
|
||||
Schedule the cutover in a quiet window. Runtime reads use only the single configured engine, so in a multi-worker deployment there is an unavoidable brief decrypt-outage between the migration commit and the last worker restarting with the new config — each migrator run is transactional, but the fleet-wide cutover is not zero-downtime.
|
||||
|
||||
The migration is transactional (all-or-nothing) and idempotent — it can be safely re-run or resumed. Note that AES-GCM, unlike AES-CBC, does not support querying directly over encrypted columns; audit any code that filters on an encrypted column before switching. See the SIP at `docs/sip/authenticated-encryption-at-rest.md` for details.
|
||||
|
||||
### Granular Export Controls
|
||||
|
||||
A new feature flag `GRANULAR_EXPORT_CONTROLS` introduces three fine-grained permissions that replace the legacy `can_csv` permission:
|
||||
|
||||
133
docs/sip/authenticated-encryption-at-rest.md
Normal file
133
docs/sip/authenticated-encryption-at-rest.md
Normal file
@@ -0,0 +1,133 @@
|
||||
<!--
|
||||
Licensed to the Apache Software Foundation (ASF) under one
|
||||
or more contributor license agreements. See the NOTICE file
|
||||
distributed with this work for additional information
|
||||
regarding copyright ownership. The ASF licenses this file
|
||||
to you under the Apache License, Version 2.0 (the
|
||||
"License"); you may not use this file except in compliance
|
||||
with the License. You may obtain a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing,
|
||||
software distributed under the License is distributed on an
|
||||
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||
KIND, either express or implied. See the License for the
|
||||
specific language governing permissions and limitations
|
||||
under the License.
|
||||
-->
|
||||
|
||||
# SIP: Authenticated encryption (AES-GCM) for app-encrypted fields
|
||||
|
||||
## [DRAFT — proposal for discussion]
|
||||
|
||||
This document is a draft proposal accompanying the code in this PR. It is
|
||||
intended to seed the formal SIP discussion. The code here ships the
|
||||
backward-compatible engine selection **and** the re-encryption migrator
|
||||
(Phases 1–2 below); both are opt-in and change nothing for existing installs by
|
||||
default. Flipping the default for fresh installs (Phase 3) remains future work.
|
||||
|
||||
## Motivation
|
||||
|
||||
Superset app-encrypts a number of sensitive fields before persisting them to
|
||||
the metadata database, including:
|
||||
|
||||
- database connection passwords and `encrypted_extra` (`superset/models/core.py`),
|
||||
- SSH tunnel credentials — password, private key, private-key password
|
||||
(`superset/databases/ssh_tunnel/models.py`),
|
||||
- OAuth2 tokens and other secrets stored via `EncryptedType`.
|
||||
|
||||
These fields are encrypted with `sqlalchemy_utils.EncryptedType`, which
|
||||
**defaults to `AesEngine` (AES-CBC)**. AES-CBC provides confidentiality but is
|
||||
**unauthenticated**: it has no integrity tag. An attacker with write access to
|
||||
the ciphertext (e.g. direct metadata-DB access, a backup, or a compromised
|
||||
replica) can perform **bit-flipping / chosen-ciphertext manipulation** to
|
||||
silently alter the decrypted plaintext of a secret without detection.
|
||||
|
||||
`AesGcmEngine` (AES-GCM) is authenticated encryption: tampering causes
|
||||
decryption to fail loudly rather than yielding attacker-influenced plaintext.
|
||||
Using authenticated encryption for secrets at rest is an ASVS L1 expectation
|
||||
(11.3.2 / cryptography best practice).
|
||||
|
||||
`config.py` already documents that operators *can* switch to GCM by writing a
|
||||
custom `AbstractEncryptedFieldAdapter`, but:
|
||||
|
||||
1. it is opt-in, undocumented as a security recommendation, and easy to miss;
|
||||
2. there is **no migration path** — flipping the engine on a populated database
|
||||
makes every existing secret undecryptable, because GCM ciphertext is not
|
||||
format-compatible with CBC.
|
||||
|
||||
## Proposed change
|
||||
|
||||
A three-part change, delivered incrementally so existing deployments are never
|
||||
broken:
|
||||
|
||||
### Phase 1 — engine selection (this PR)
|
||||
|
||||
- Add a `SQLALCHEMY_ENCRYPTED_FIELD_ENGINE` config (`"aes"` | `"aes-gcm"`),
|
||||
**defaulting to `"aes"`** (no behavior change for existing installs).
|
||||
- Teach the default `SQLAlchemyUtilsAdapter` to honor it (an explicit `engine`
|
||||
kwarg still wins, so the migrator can pin an engine).
|
||||
- This lets **new** deployments choose AES-GCM from day one with a one-line
|
||||
config, instead of writing a custom adapter.
|
||||
|
||||
### Phase 2 — CBC→GCM re-encryption migrator (this PR)
|
||||
|
||||
The existing `SecretsMigrator` (previously only used for `SECRET_KEY` rotation)
|
||||
gains an **engine migration** mode that:
|
||||
|
||||
1. discovers every `EncryptedType` column (via `discover_encrypted_fields()`),
|
||||
2. decrypts each value with the **source** engine (AES-CBC) under the current
|
||||
`SECRET_KEY`,
|
||||
3. re-encrypts with the **target** engine (AES-GCM),
|
||||
4. runs transactionally per the existing all-or-nothing semantics, and is
|
||||
idempotent per column (already-migrated values are skipped), so a run can be
|
||||
safely repeated or resumed.
|
||||
|
||||
Exposed via a new `--engine` option on the existing CLI command:
|
||||
`superset re-encrypt-secrets --engine aes-gcm`, runnable by operators with a DB
|
||||
backup in hand. The `SECRET_KEY` is unchanged; an engine change and a key
|
||||
rotation can also be combined (pass `--previous_secret_key` as well).
|
||||
|
||||
### Phase 3 — flip the default for new installs
|
||||
|
||||
Once the migrator and docs are in place, change the default to `"aes-gcm"` for
|
||||
**fresh** installs only (e.g. keyed off an empty metadata DB / documented in
|
||||
`UPDATING.md`), keeping existing installs on `"aes"` until they run Phase 2.
|
||||
|
||||
## New or changed public interfaces
|
||||
|
||||
- New config: `SQLALCHEMY_ENCRYPTED_FIELD_ENGINE: Literal["aes", "aes-gcm"]`.
|
||||
- New (Phase 2) CLI: `superset re-encrypt-secrets --engine <name>`.
|
||||
- No schema changes; ciphertext format changes per migrated column.
|
||||
|
||||
## Migration plan and compatibility
|
||||
|
||||
- **Backward compatible by default.** Phase 1 changes nothing unless the
|
||||
operator opts in.
|
||||
- Switching an existing deployment to `"aes-gcm"` **without** running the Phase
|
||||
2 migrator will make existing secrets undecryptable — this is called out in
|
||||
the config comment and must be in `UPDATING.md`.
|
||||
- Recommended operator runbook: take a metadata-DB backup → run
|
||||
`re-encrypt-secrets --engine aes-gcm` → set
|
||||
`SQLALCHEMY_ENCRYPTED_FIELD_ENGINE = "aes-gcm"` → restart.
|
||||
- `AesEngine` allows queryability over encrypted fields; AES-GCM does not.
|
||||
Any code that filters/queries on an encrypted column directly must be audited
|
||||
before Phase 3 (none is expected, but it must be verified).
|
||||
|
||||
## Rejected alternatives
|
||||
|
||||
- **Flip the default immediately.** Rejected: bricks every existing
|
||||
deployment's secrets with no migration path.
|
||||
- **Document-only (custom adapter).** Status quo; high friction and no
|
||||
migration tooling — most operators will never do it.
|
||||
|
||||
## Open questions
|
||||
|
||||
- GCM→CBC rollback (for operators who need queryability) already works via the
|
||||
same command (`re-encrypt-secrets --engine aes`), since the migrator is
|
||||
engine-symmetric. Should rollback be documented as a supported path or
|
||||
discouraged?
|
||||
- The migrator already supports a concurrent `SECRET_KEY` rotation + engine
|
||||
change in a single pass (pass `--previous_secret_key` alongside `--engine`).
|
||||
Is that combination worth calling out in the operator docs, or kept advanced?
|
||||
@@ -31,7 +31,7 @@ from flask_appbuilder.api.manager import resolver
|
||||
|
||||
import superset.utils.database as database_utils
|
||||
from superset.utils.decorators import transaction
|
||||
from superset.utils.encrypt import SecretsMigrator
|
||||
from superset.utils.encrypt import ENCRYPTION_ENGINES, SecretsMigrator
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -110,17 +110,37 @@ def update_api_docs() -> None:
|
||||
help="An optional previous secret key, if PREVIOUS_SECRET_KEY "
|
||||
"is not set on the config",
|
||||
)
|
||||
def re_encrypt_secrets(previous_secret_key: Optional[str] = None) -> None:
|
||||
@click.option(
|
||||
"--engine",
|
||||
"-e",
|
||||
"target_engine_name",
|
||||
required=False,
|
||||
type=click.Choice(sorted(ENCRYPTION_ENGINES), case_sensitive=False),
|
||||
help="Re-encrypt all app-encrypted fields with this encryption engine "
|
||||
"(e.g. 'aes-gcm' for authenticated encryption). The SECRET_KEY is "
|
||||
"unchanged. Take a metadata-DB backup first, then set "
|
||||
"SQLALCHEMY_ENCRYPTED_FIELD_ENGINE to the same value and restart.",
|
||||
)
|
||||
def re_encrypt_secrets(
|
||||
previous_secret_key: Optional[str] = None,
|
||||
target_engine_name: Optional[str] = None,
|
||||
) -> None:
|
||||
previous_secret_key = previous_secret_key or current_app.config.get(
|
||||
"PREVIOUS_SECRET_KEY"
|
||||
)
|
||||
if previous_secret_key is None:
|
||||
target_engine = (
|
||||
ENCRYPTION_ENGINES[target_engine_name] if target_engine_name else None
|
||||
)
|
||||
if previous_secret_key is None and target_engine is None:
|
||||
click.secho(
|
||||
"No previous secret key provided; nothing to re-encrypt.",
|
||||
"No previous secret key or target engine provided; nothing to re-encrypt.",
|
||||
fg="yellow",
|
||||
)
|
||||
return
|
||||
secrets_migrator = SecretsMigrator(previous_secret_key=previous_secret_key)
|
||||
secrets_migrator = SecretsMigrator(
|
||||
previous_secret_key=previous_secret_key,
|
||||
target_engine=target_engine,
|
||||
)
|
||||
try:
|
||||
stats = secrets_migrator.run()
|
||||
except Exception as exc: # pylint: disable=broad-except
|
||||
|
||||
@@ -270,7 +270,10 @@ SQLALCHEMY_CUSTOM_PASSWORD_STORE = None
|
||||
# as key material. Do note that AesEngine allows for queryability over the
|
||||
# encrypted fields.
|
||||
#
|
||||
# To change the default engine you need to define your own adapter:
|
||||
# To switch the engine used by the default adapter, prefer the
|
||||
# ``SQLALCHEMY_ENCRYPTED_FIELD_ENGINE`` knob below (e.g. "aes-gcm"). Defining a
|
||||
# custom adapter, as shown next, is only needed for behaviour the built-in
|
||||
# engines do not cover:
|
||||
#
|
||||
# e.g.:
|
||||
#
|
||||
@@ -295,6 +298,16 @@ SQLALCHEMY_ENCRYPTED_FIELD_TYPE_ADAPTER = ( # pylint: disable=invalid-name
|
||||
SQLAlchemyUtilsAdapter
|
||||
)
|
||||
|
||||
# Encryption engine used by the default SQLAlchemyUtilsAdapter for app-encrypted
|
||||
# fields. Options:
|
||||
# "aes" - AES-CBC (historical default; unauthenticated, queryable)
|
||||
# "aes-gcm" - AES-GCM (authenticated encryption; recommended for NEW installs)
|
||||
# WARNING: changing this on a database that already holds encrypted secrets
|
||||
# (database passwords, SSH tunnel credentials, OAuth tokens, ...) will make
|
||||
# those values undecryptable unless they are re-encrypted first. See the
|
||||
# authenticated-encryption SIP/migration before switching an existing install.
|
||||
SQLALCHEMY_ENCRYPTED_FIELD_ENGINE: Literal["aes", "aes-gcm"] = "aes"
|
||||
|
||||
# Extends the default SQLGlot dialects with additional dialects
|
||||
SQLGLOT_DIALECTS_EXTENSIONS: DialectExtensions | Callable[[], DialectExtensions] = {}
|
||||
|
||||
|
||||
@@ -19,17 +19,76 @@ from abc import ABC, abstractmethod
|
||||
from dataclasses import dataclass
|
||||
from typing import Any, Optional
|
||||
|
||||
from flask import Flask
|
||||
from flask import current_app, Flask
|
||||
from flask_babel import lazy_gettext as _
|
||||
from sqlalchemy import Table, text, TypeDecorator
|
||||
from sqlalchemy.engine import Connection, Dialect, Row
|
||||
from sqlalchemy_utils import EncryptedType as SqlaEncryptedType
|
||||
from sqlalchemy_utils.types.encrypted.encrypted_type import (
|
||||
AesEngine,
|
||||
AesGcmEngine,
|
||||
EncryptionDecryptionBaseEngine,
|
||||
)
|
||||
|
||||
|
||||
class EncryptedType(SqlaEncryptedType):
|
||||
cache_ok = True
|
||||
|
||||
|
||||
# Named encryption engines selectable via the ``SQLALCHEMY_ENCRYPTED_FIELD_ENGINE``
|
||||
# config. "aes" (AES-CBC) is the historical default; "aes-gcm" is authenticated
|
||||
# encryption (recommended for new deployments). NOTE: switching an existing
|
||||
# deployment from "aes" to "aes-gcm" requires re-encrypting all stored secrets
|
||||
# first — see the SIP referenced in the docs. Changing this on a populated
|
||||
# database without that migration will make existing secrets undecryptable.
|
||||
ENCRYPTION_ENGINES: dict[str, type[EncryptionDecryptionBaseEngine]] = {
|
||||
"aes": AesEngine,
|
||||
"aes-gcm": AesGcmEngine,
|
||||
}
|
||||
|
||||
# The historical fallback engine when the config does not name one.
|
||||
DEFAULT_ENCRYPTION_ENGINE_NAME = "aes"
|
||||
|
||||
# Engines whose ciphertext is authenticated: a successful decrypt is
|
||||
# cryptographic proof the value is genuinely in that form. AES-GCM carries an
|
||||
# authentication tag; AES-CBC does not, so a CBC "success" can be coincidental.
|
||||
# Classification logic (the migrator's idempotency fast path) must let an
|
||||
# authenticated decrypt win over an unauthenticated one, never the reverse.
|
||||
AUTHENTICATED_ENGINES: frozenset[type[EncryptionDecryptionBaseEngine]] = frozenset(
|
||||
{AesGcmEngine}
|
||||
)
|
||||
|
||||
|
||||
def _is_authenticated_engine(engine: type[EncryptionDecryptionBaseEngine]) -> bool:
|
||||
return engine in AUTHENTICATED_ENGINES
|
||||
|
||||
|
||||
def resolve_encryption_engine(engine_name: str) -> type[EncryptionDecryptionBaseEngine]:
|
||||
"""Resolve a configured engine name to its engine class, fail-closed.
|
||||
|
||||
The value is normalized (trimmed, lower-cased, underscores → hyphens) so it
|
||||
matches the case-insensitive CLI ``click.Choice``. An unrecognized name
|
||||
raises so a misconfiguration fails at field-construction (startup) rather
|
||||
than silently degrading to unauthenticated AES-CBC — which would let an
|
||||
operator who typo'd ``"aes_gcm"`` believe they had authenticated encryption,
|
||||
and, after a GCM migration, write new secrets as CBC into a GCM database.
|
||||
|
||||
The offending value is deliberately kept out of the error message: it comes
|
||||
from the app config (which also holds ``SECRET_KEY``), and static analysis
|
||||
flags interpolating config-sourced values into logs/errors as potential
|
||||
clear-text secret exposure. The set of valid engine names is enough to
|
||||
diagnose a typo.
|
||||
"""
|
||||
normalized = engine_name.strip().lower().replace("_", "-")
|
||||
try:
|
||||
return ENCRYPTION_ENGINES[normalized]
|
||||
except KeyError:
|
||||
raise ValueError(
|
||||
"Unrecognized SQLALCHEMY_ENCRYPTED_FIELD_ENGINE. Valid engines: "
|
||||
+ ", ".join(sorted(ENCRYPTION_ENGINES))
|
||||
) from None
|
||||
|
||||
|
||||
ENC_ADAPTER_TAG_ATTR_NAME = "__created_by_enc_field_adapter__"
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -65,6 +124,21 @@ class SQLAlchemyUtilsAdapter( # pylint: disable=too-few-public-methods
|
||||
**kwargs: Optional[dict[str, Any]],
|
||||
) -> TypeDecorator:
|
||||
if app_config:
|
||||
# Select the encryption engine from config, defaulting to the
|
||||
# historical AES-CBC engine for backward compatibility when the key
|
||||
# is absent. A *present but unrecognized* value fails closed (see
|
||||
# ``resolve_encryption_engine``) rather than silently degrading to
|
||||
# AES-CBC. An explicit ``engine`` kwarg (e.g. from the migrator)
|
||||
# always takes precedence.
|
||||
if "engine" not in kwargs:
|
||||
engine_name = (
|
||||
app_config.get("SQLALCHEMY_ENCRYPTED_FIELD_ENGINE")
|
||||
or DEFAULT_ENCRYPTION_ENGINE_NAME
|
||||
)
|
||||
# ``**kwargs`` is loosely annotated as ``Optional[dict]`` here, so
|
||||
# route the resolved engine class through an ``Any`` local.
|
||||
engine_cls: Any = resolve_encryption_engine(engine_name)
|
||||
kwargs["engine"] = engine_cls
|
||||
return EncryptedType(*args, lambda: app_config["SECRET_KEY"], **kwargs)
|
||||
|
||||
raise Exception( # pylint: disable=broad-exception-raised
|
||||
@@ -101,10 +175,40 @@ class EncryptedFieldFactory:
|
||||
|
||||
|
||||
class SecretsMigrator:
|
||||
def __init__(self, previous_secret_key: str) -> None:
|
||||
"""Re-encrypts every app-encrypted column in the ORM.
|
||||
|
||||
Two modes, which can also be combined:
|
||||
|
||||
- **Key rotation** — pass ``previous_secret_key``. Values are decrypted
|
||||
under the previous key and re-encrypted under the current ``SECRET_KEY``
|
||||
using the currently-configured engine.
|
||||
- **Engine migration** — pass ``target_engine`` (e.g. ``AesGcmEngine``).
|
||||
Values are decrypted under the current ``SECRET_KEY`` with the source
|
||||
engine and re-encrypted with the target engine. This is how an existing
|
||||
install moves from AES-CBC to authenticated AES-GCM without bricking
|
||||
stored secrets; the ``SECRET_KEY`` itself is unchanged.
|
||||
|
||||
Both modes share the same all-or-nothing transaction and per-column
|
||||
idempotency: a value already readable in the target form is left untouched,
|
||||
so a run can be safely repeated or resumed.
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
previous_secret_key: Optional[str] = None,
|
||||
target_engine: Optional[type[Any]] = None,
|
||||
) -> None:
|
||||
from superset import db # pylint: disable=import-outside-toplevel
|
||||
|
||||
self._db = db
|
||||
self._secret_key = current_app.config["SECRET_KEY"]
|
||||
self._target_engine = target_engine
|
||||
# In engine-migration mode the SECRET_KEY does not change: the source
|
||||
# ciphertext is decrypted under the current key (with the source
|
||||
# engine), so default the "previous" key to the current one when the
|
||||
# caller only asked for an engine change.
|
||||
if target_engine is not None and not previous_secret_key:
|
||||
previous_secret_key = self._secret_key
|
||||
self._previous_secret_key = previous_secret_key
|
||||
self._dialect: Dialect = db.engine.url.get_dialect()
|
||||
|
||||
@@ -177,6 +281,89 @@ class SecretsMigrator:
|
||||
cols = ",".join(pk_columns + column_names)
|
||||
return conn.execute(f"SELECT {cols} FROM {table_name}") # noqa: S608
|
||||
|
||||
def _target_type(self, encrypted_type: EncryptedType) -> EncryptedType:
|
||||
"""The EncryptedType to re-encrypt a value *into*.
|
||||
|
||||
For a key rotation this is the column's own configured type (current
|
||||
engine + lazily-resolved current key). For an engine migration it is a
|
||||
type pinned to the requested target engine under the current key.
|
||||
"""
|
||||
if self._target_engine is None:
|
||||
return encrypted_type
|
||||
return EncryptedType(
|
||||
type_in=encrypted_type.underlying_type,
|
||||
key=self._secret_key,
|
||||
engine=self._target_engine,
|
||||
)
|
||||
|
||||
def _source_decryptors(self, encrypted_type: EncryptedType) -> list[EncryptedType]:
|
||||
"""Candidate decryptors, tried in order, to recover a value's plaintext.
|
||||
|
||||
1. The column's configured type — current key + currently-configured
|
||||
engine. During an engine migration this reads the source ciphertext
|
||||
while the config still points at the source engine.
|
||||
2. The previous key under each supported engine. Trying the column's
|
||||
configured engine covers ``SECRET_KEY`` rotation for any engine
|
||||
(including AES-GCM); also trying the historical AES-CBC engine
|
||||
covers an engine migration whose config was flipped to the target
|
||||
engine *before* the migrator ran (the source data is still CBC under
|
||||
the current key, which the previous key defaults to in
|
||||
engine-migration mode).
|
||||
"""
|
||||
decryptors = [encrypted_type]
|
||||
if self._previous_secret_key:
|
||||
# Try the column's own engine first, then any remaining supported
|
||||
# engines (notably the historical AES-CBC), de-duplicated so we
|
||||
# never build the same decryptor twice.
|
||||
engines: list[type[EncryptionDecryptionBaseEngine]] = [
|
||||
type(encrypted_type.engine)
|
||||
]
|
||||
for engine in ENCRYPTION_ENGINES.values():
|
||||
if engine not in engines:
|
||||
engines.append(engine)
|
||||
# When the previous key equals the current key (engine-migration
|
||||
# mode, or a no-op rotation) the column's own engine under that key
|
||||
# is already ``decryptors[0]``; don't append it again as a fallback.
|
||||
if self._previous_secret_key == self._secret_key:
|
||||
engines = engines[1:]
|
||||
decryptors.extend(
|
||||
EncryptedType(
|
||||
type_in=encrypted_type.underlying_type,
|
||||
key=self._previous_secret_key,
|
||||
engine=engine,
|
||||
)
|
||||
for engine in engines
|
||||
)
|
||||
return decryptors
|
||||
|
||||
def _decrypts_under_authenticated_engine(
|
||||
self, encrypted_type: EncryptedType, raw_value: bytes
|
||||
) -> bool:
|
||||
"""Whether ``raw_value`` decrypts under any authenticated engine.
|
||||
|
||||
A successful authenticated (AES-GCM) decrypt is cryptographic proof the
|
||||
value is genuinely in that engine's form — the authentication tag makes
|
||||
a coincidental success negligibly unlikely. Tried under both the current
|
||||
and previous keys so it holds during a combined key-rotation + engine
|
||||
migration. Used to stop an *unauthenticated* target decrypt from
|
||||
coincidentally classifying an authenticated value as "already migrated".
|
||||
"""
|
||||
keys = {self._secret_key}
|
||||
if self._previous_secret_key:
|
||||
keys.add(self._previous_secret_key)
|
||||
for engine in AUTHENTICATED_ENGINES:
|
||||
for key in keys:
|
||||
try:
|
||||
EncryptedType(
|
||||
type_in=encrypted_type.underlying_type,
|
||||
key=key,
|
||||
engine=engine,
|
||||
).process_result_value(raw_value, self._dialect)
|
||||
except Exception: # noqa: BLE001, S112 # pylint: disable=broad-except
|
||||
continue
|
||||
return True
|
||||
return False
|
||||
|
||||
def _re_encrypt_row(
|
||||
self,
|
||||
conn: Connection,
|
||||
@@ -187,24 +374,28 @@ class SecretsMigrator:
|
||||
stats: ReEncryptStats,
|
||||
) -> None:
|
||||
"""
|
||||
Re encrypts all columns in a Row
|
||||
Re-encrypts all columns in a Row into the target form.
|
||||
|
||||
Re-encryption is idempotent per column: we first ask whether the
|
||||
current key can already decrypt the value, and skip if so. Only if
|
||||
the current key fails do we fall back to decrypting with the
|
||||
previous key and re-encrypting. Checking the current key first
|
||||
keeps ``run()`` idempotent regardless of what ``previous_secret_key``
|
||||
the caller supplies — even re-running with the same (unchanged)
|
||||
``SECRET_KEY`` will not rewrite rows.
|
||||
The "target form" is current-key + currently-configured engine for a
|
||||
``SECRET_KEY`` rotation, or current-key + the requested engine for an
|
||||
engine migration (see ``_target_type``).
|
||||
|
||||
Re-encryption is idempotent per column: a value that can already be
|
||||
read in the target form is left untouched. This keeps ``run()``
|
||||
idempotent regardless of what ``previous_secret_key`` the caller
|
||||
supplies, and lets an interrupted engine migration be resumed.
|
||||
Otherwise the plaintext is recovered from the first candidate source
|
||||
decryptor that can read it (see ``_source_decryptors``) and re-encrypted
|
||||
into the target form.
|
||||
|
||||
NULL values are never encrypted, so they are reported separately
|
||||
(neither re-encrypted nor "skipped because already current").
|
||||
|
||||
Per-column outcomes are accumulated onto ``stats`` so the caller can
|
||||
report a summary. Columns whose ciphertext is unreadable under both
|
||||
keys are counted as failures and logged; the exception is not
|
||||
propagated, so processing continues. The caller is responsible for
|
||||
raising once all rows have been scanned.
|
||||
report a summary. Columns whose ciphertext is unreadable under any
|
||||
candidate key/engine are counted as failures and logged; the exception
|
||||
is not propagated, so processing continues. The caller is responsible
|
||||
for raising once all rows have been scanned.
|
||||
|
||||
If no columns need re-encryption, no UPDATE is issued.
|
||||
|
||||
@@ -223,38 +414,67 @@ class SecretsMigrator:
|
||||
stats.null += 1
|
||||
continue
|
||||
|
||||
# Fast path: if the current key can already read the value,
|
||||
# leave it untouched. A failure here simply means we need to try
|
||||
# the previous key below — not a condition worth logging.
|
||||
try:
|
||||
encrypted_type.process_result_value(raw_value, self._dialect)
|
||||
except Exception: # noqa: BLE001, S110 # pylint: disable=broad-except
|
||||
pass
|
||||
else:
|
||||
stats.skipped += 1
|
||||
continue
|
||||
target_type = self._target_type(encrypted_type)
|
||||
|
||||
# Current key cannot decrypt — try the previous key.
|
||||
previous_encrypted_type = EncryptedType(
|
||||
type_in=encrypted_type.underlying_type, key=self._previous_secret_key
|
||||
# Fast path: if the value can already be read in the target form,
|
||||
# leave it untouched. A failure here simply means we need to try the
|
||||
# source decryptors below — not a condition worth logging.
|
||||
#
|
||||
# Caveat for an *unauthenticated* target (AES-CBC, e.g. an
|
||||
# ``--engine aes`` rollback): CBC decryption has no authentication
|
||||
# tag, so an authenticated (AES-GCM) ciphertext can coincidentally
|
||||
# "decrypt" under CBC and be wrongly classified as already migrated —
|
||||
# leaving it as GCM, which then bricks once the config is flipped to
|
||||
# CBC, with the run still reporting success. So when the target is
|
||||
# unauthenticated, first rule out that the value is provably in an
|
||||
# authenticated form; an authenticated decrypt must win over an
|
||||
# unauthenticated one. (A forward migration to AES-GCM is unaffected:
|
||||
# an authenticated target read is itself proof.)
|
||||
target_is_authenticated = _is_authenticated_engine(type(target_type.engine))
|
||||
provably_authenticated = (
|
||||
not target_is_authenticated
|
||||
and self._decrypts_under_authenticated_engine(encrypted_type, raw_value)
|
||||
)
|
||||
try:
|
||||
unencrypted_value = previous_encrypted_type.process_result_value(
|
||||
raw_value, self._dialect
|
||||
)
|
||||
except Exception as prev_ex: # noqa: BLE001 # pylint: disable=broad-except
|
||||
if not provably_authenticated:
|
||||
try:
|
||||
target_type.process_result_value(raw_value, self._dialect)
|
||||
except ( # noqa: S110 # pylint: disable=broad-except
|
||||
Exception # noqa: BLE001
|
||||
):
|
||||
pass
|
||||
else:
|
||||
stats.skipped += 1
|
||||
continue
|
||||
|
||||
# Recover the plaintext from the first source decryptor that can
|
||||
# read the value (current engine/key, then previous key).
|
||||
unencrypted_value = None
|
||||
decrypted = False
|
||||
last_error: Optional[Exception] = None
|
||||
for decryptor in self._source_decryptors(encrypted_type):
|
||||
try:
|
||||
unencrypted_value = decryptor.process_result_value(
|
||||
raw_value, self._dialect
|
||||
)
|
||||
except Exception as ex: # noqa: BLE001 # pylint: disable=broad-except
|
||||
last_error = ex
|
||||
continue
|
||||
decrypted = True
|
||||
break
|
||||
|
||||
if not decrypted:
|
||||
logger.error(
|
||||
"Column [%s.%s] cannot be decrypted under the previous"
|
||||
" or current secret key (%s: %s)",
|
||||
"Column [%s.%s] cannot be decrypted under any known"
|
||||
" key/engine (%s: %s)",
|
||||
table_name,
|
||||
column_name,
|
||||
type(prev_ex).__name__,
|
||||
prev_ex,
|
||||
type(last_error).__name__ if last_error else "None",
|
||||
last_error,
|
||||
)
|
||||
stats.failed += 1
|
||||
continue
|
||||
|
||||
re_encrypted_columns[column_name] = encrypted_type.process_bind_param(
|
||||
re_encrypted_columns[column_name] = target_type.process_bind_param(
|
||||
unencrypted_value,
|
||||
self._dialect,
|
||||
)
|
||||
@@ -276,12 +496,13 @@ class SecretsMigrator:
|
||||
|
||||
def run(self) -> ReEncryptStats:
|
||||
"""
|
||||
Re-encrypt every encrypted column in the ORM under the current
|
||||
``SECRET_KEY``.
|
||||
Re-encrypt every encrypted column in the ORM into the target form
|
||||
(current ``SECRET_KEY`` and, for an engine migration, the target
|
||||
engine).
|
||||
|
||||
Returns per-value counts of re-encrypted, skipped (already under the
|
||||
current key), and failed (undecryptable) outcomes. If any failures
|
||||
occurred the transaction is rolled back by raising after the
|
||||
Returns per-value counts of re-encrypted, skipped (already in the
|
||||
target form), null, and failed (undecryptable) outcomes. If any
|
||||
failures occurred the transaction is rolled back by raising after the
|
||||
summary is logged, so partial re-encryption never commits.
|
||||
"""
|
||||
encrypted_meta_info = self.discover_encrypted_fields()
|
||||
|
||||
@@ -25,10 +25,12 @@ import pytest
|
||||
import yaml # noqa: F401
|
||||
from flask import current_app
|
||||
from freezegun import freeze_time
|
||||
from sqlalchemy_utils.types.encrypted.encrypted_type import AesGcmEngine
|
||||
|
||||
import superset.cli.importexport
|
||||
import superset.cli.thumbnails
|
||||
import superset.cli.update
|
||||
import superset.utils.encrypt
|
||||
from superset import db
|
||||
from superset.models.dashboard import Dashboard
|
||||
from tests.integration_tests.fixtures.birth_names_dashboard import (
|
||||
@@ -364,3 +366,94 @@ def test_re_encrypt_secrets_failure_exits_nonzero(app_context):
|
||||
# The failure path must be handled by the CLI, not leaked as an
|
||||
# uncaught exception.
|
||||
assert response.exception is None or isinstance(response.exception, SystemExit)
|
||||
|
||||
|
||||
def test_re_encrypt_secrets_engine_option_invokes_migrator(app_context):
|
||||
"""
|
||||
When --engine is provided, the CLI must resolve the engine name to the
|
||||
correct engine class and pass it to SecretsMigrator as target_engine.
|
||||
"""
|
||||
current_app.config.pop("PREVIOUS_SECRET_KEY", None)
|
||||
runner = current_app.test_cli_runner()
|
||||
with mock.patch.object(
|
||||
superset.cli.update,
|
||||
"SecretsMigrator",
|
||||
) as migrator_mock:
|
||||
migrator_mock.return_value.run.return_value = (
|
||||
superset.utils.encrypt.ReEncryptStats()
|
||||
)
|
||||
response = runner.invoke(
|
||||
superset.cli.update.re_encrypt_secrets,
|
||||
["--engine", "aes-gcm"],
|
||||
)
|
||||
|
||||
assert response.exit_code == 0
|
||||
call_kwargs = migrator_mock.call_args.kwargs
|
||||
assert call_kwargs.get("target_engine") is AesGcmEngine
|
||||
assert call_kwargs.get("previous_secret_key") is None
|
||||
|
||||
|
||||
def test_re_encrypt_secrets_engine_option_case_insensitive(app_context):
|
||||
"""
|
||||
The --engine option must be case-insensitive per
|
||||
click.Choice(..., case_sensitive=False).
|
||||
"""
|
||||
current_app.config.pop("PREVIOUS_SECRET_KEY", None)
|
||||
runner = current_app.test_cli_runner()
|
||||
with mock.patch.object(
|
||||
superset.cli.update,
|
||||
"SecretsMigrator",
|
||||
) as migrator_mock:
|
||||
migrator_mock.return_value.run.return_value = (
|
||||
superset.utils.encrypt.ReEncryptStats()
|
||||
)
|
||||
response = runner.invoke(
|
||||
superset.cli.update.re_encrypt_secrets,
|
||||
["--engine", "AES-GCM"],
|
||||
)
|
||||
|
||||
assert response.exit_code == 0
|
||||
assert migrator_mock.call_args.kwargs.get("target_engine") is AesGcmEngine
|
||||
|
||||
|
||||
def test_re_encrypt_secrets_combined_key_rotation_and_engine(app_context):
|
||||
"""
|
||||
--previous_secret_key and --engine combine in a single run: the migrator
|
||||
must receive both the previous key (for decryption) and the target engine
|
||||
(for re-encryption). This is the mode most likely to regress, since the
|
||||
single-option tests each pin only the other's variable.
|
||||
"""
|
||||
current_app.config.pop("PREVIOUS_SECRET_KEY", None)
|
||||
runner = current_app.test_cli_runner()
|
||||
with mock.patch.object(
|
||||
superset.cli.update,
|
||||
"SecretsMigrator",
|
||||
) as migrator_mock:
|
||||
migrator_mock.return_value.run.return_value = (
|
||||
superset.utils.encrypt.ReEncryptStats()
|
||||
)
|
||||
response = runner.invoke(
|
||||
superset.cli.update.re_encrypt_secrets,
|
||||
["--previous_secret_key", "old-key", "--engine", "aes-gcm"],
|
||||
)
|
||||
|
||||
assert response.exit_code == 0
|
||||
call_kwargs = migrator_mock.call_args.kwargs
|
||||
assert call_kwargs.get("target_engine") is AesGcmEngine
|
||||
assert call_kwargs.get("previous_secret_key") == "old-key"
|
||||
|
||||
|
||||
def test_re_encrypt_secrets_engine_option_invalid_raises_usage(app_context):
|
||||
"""
|
||||
An unrecognized engine name must produce a click usage error, not a
|
||||
traceback or silent failure.
|
||||
"""
|
||||
runner = current_app.test_cli_runner()
|
||||
response = runner.invoke(
|
||||
superset.cli.update.re_encrypt_secrets,
|
||||
["--engine", "nonexistent-engine"],
|
||||
)
|
||||
|
||||
assert response.exit_code != 0
|
||||
assert "Invalid value" in response.output or "Usage:" in response.output
|
||||
assert "aes" in response.output or "aes-gcm" in response.output
|
||||
|
||||
341
tests/unit_tests/utils/encrypt_test.py
Normal file
341
tests/unit_tests/utils/encrypt_test.py
Normal file
@@ -0,0 +1,341 @@
|
||||
# Licensed to the Apache Software Foundation (ASF) under one
|
||||
# or more contributor license agreements. See the NOTICE file
|
||||
# distributed with this work for additional information
|
||||
# regarding copyright ownership. The ASF licenses this file
|
||||
# to you under the Apache License, Version 2.0 (the
|
||||
# "License"); you may not use this file except in compliance
|
||||
# with the License. You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing,
|
||||
# software distributed under the License is distributed on an
|
||||
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||
# KIND, either express or implied. See the License for the
|
||||
# specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
from unittest import mock
|
||||
from unittest.mock import MagicMock
|
||||
|
||||
import pytest
|
||||
from sqlalchemy import String
|
||||
from sqlalchemy.engine import make_url
|
||||
from sqlalchemy_utils.types.encrypted.encrypted_type import AesEngine, AesGcmEngine
|
||||
|
||||
from superset.utils.encrypt import (
|
||||
EncryptedType,
|
||||
ReEncryptStats,
|
||||
SecretsMigrator,
|
||||
SQLAlchemyUtilsAdapter,
|
||||
)
|
||||
|
||||
SECRET = {"SECRET_KEY": "x" * 32}
|
||||
SECRET_KEY = "k" * 32
|
||||
DIALECT = make_url("sqlite://").get_dialect()
|
||||
|
||||
|
||||
def _encrypted_type(engine: type) -> EncryptedType:
|
||||
"""A standalone EncryptedType for a given engine under SECRET_KEY."""
|
||||
return EncryptedType(String(1024), key=lambda: SECRET_KEY, engine=engine)
|
||||
|
||||
|
||||
def _engine_migrator(target_engine: type) -> SecretsMigrator:
|
||||
"""Build a SecretsMigrator in engine-migration mode without an app context.
|
||||
|
||||
``__init__`` reads ``current_app`` and the DB dialect, so — like the
|
||||
existing row-level tests that override ``_dialect`` — we set the few
|
||||
attributes ``_re_encrypt_row`` actually uses directly.
|
||||
"""
|
||||
migrator = SecretsMigrator.__new__(SecretsMigrator)
|
||||
migrator._secret_key = SECRET_KEY # noqa: SLF001
|
||||
migrator._target_engine = target_engine # noqa: SLF001
|
||||
# Engine migration keeps the SECRET_KEY; previous key defaults to current.
|
||||
migrator._previous_secret_key = SECRET_KEY # noqa: SLF001
|
||||
migrator._dialect = DIALECT # noqa: SLF001
|
||||
return migrator
|
||||
|
||||
|
||||
def test_default_engine_is_aes_cbc() -> None:
|
||||
"""Without config, the adapter keeps the historical AES-CBC engine."""
|
||||
field = SQLAlchemyUtilsAdapter().create(SECRET, String(128))
|
||||
assert isinstance(field.engine, AesEngine)
|
||||
|
||||
|
||||
def test_aes_gcm_engine_selected_by_config() -> None:
|
||||
"""SQLALCHEMY_ENCRYPTED_FIELD_ENGINE='aes-gcm' selects authenticated AES-GCM."""
|
||||
field = SQLAlchemyUtilsAdapter().create(
|
||||
{**SECRET, "SQLALCHEMY_ENCRYPTED_FIELD_ENGINE": "aes-gcm"},
|
||||
String(128),
|
||||
)
|
||||
assert isinstance(field.engine, AesGcmEngine)
|
||||
|
||||
|
||||
def test_unknown_engine_raises_fail_closed() -> None:
|
||||
"""An unrecognized engine name fails closed at field construction.
|
||||
|
||||
Silently falling back to unauthenticated AES-CBC would let an operator who
|
||||
typo'd the engine believe they had authenticated encryption — and, after a
|
||||
GCM migration, write new secrets as CBC into a GCM database. The error must
|
||||
not leak the configured value (it shares the config namespace as SECRET_KEY)
|
||||
but must list the valid engines so a typo is diagnosable.
|
||||
"""
|
||||
with pytest.raises(
|
||||
ValueError, match="Unrecognized SQLALCHEMY_ENCRYPTED_FIELD_ENGINE"
|
||||
) as exc_info:
|
||||
SQLAlchemyUtilsAdapter().create(
|
||||
{**SECRET, "SQLALCHEMY_ENCRYPTED_FIELD_ENGINE": "bogus"},
|
||||
String(128),
|
||||
)
|
||||
message = str(exc_info.value)
|
||||
assert "bogus" not in message
|
||||
assert "aes" in message
|
||||
assert "aes-gcm" in message
|
||||
|
||||
|
||||
def test_engine_name_is_normalized() -> None:
|
||||
"""Engine names are case/separator-normalized to match the CLI's Choice."""
|
||||
for name in ("AES-GCM", "aes_gcm", " Aes-Gcm "):
|
||||
field = SQLAlchemyUtilsAdapter().create(
|
||||
{**SECRET, "SQLALCHEMY_ENCRYPTED_FIELD_ENGINE": name},
|
||||
String(128),
|
||||
)
|
||||
assert isinstance(field.engine, AesGcmEngine)
|
||||
|
||||
|
||||
def test_explicit_engine_kwarg_takes_precedence() -> None:
|
||||
"""An explicit engine kwarg overrides the config (used by the migrator)."""
|
||||
field = SQLAlchemyUtilsAdapter().create(
|
||||
{**SECRET, "SQLALCHEMY_ENCRYPTED_FIELD_ENGINE": "aes-gcm"},
|
||||
String(128),
|
||||
engine=AesEngine,
|
||||
)
|
||||
assert isinstance(field.engine, AesEngine)
|
||||
|
||||
|
||||
def test_engine_migration_cbc_to_gcm_re_encrypts() -> None:
|
||||
"""CBC source value is re-encrypted into GCM under the same SECRET_KEY.
|
||||
|
||||
Mirrors the recommended runbook: the migrator runs while the config still
|
||||
points at AES-CBC (the column type), re-encrypting into AES-GCM.
|
||||
"""
|
||||
cbc = _encrypted_type(AesEngine)
|
||||
ciphertext = cbc.process_bind_param("hunter2", DIALECT)
|
||||
|
||||
migrator = _engine_migrator(AesGcmEngine)
|
||||
conn = MagicMock()
|
||||
row = {"id": 1, "password": ciphertext}
|
||||
stats = ReEncryptStats()
|
||||
|
||||
migrator._re_encrypt_row( # noqa: SLF001
|
||||
conn, row, "dbs", {"password": cbc}, ["id"], stats
|
||||
)
|
||||
|
||||
assert stats == ReEncryptStats(re_encrypted=1)
|
||||
assert conn.execute.call_count == 1
|
||||
new_value = conn.execute.call_args.kwargs["password"]
|
||||
# The stored value changed and now decrypts as GCM back to the plaintext.
|
||||
assert new_value != ciphertext
|
||||
gcm = _encrypted_type(AesGcmEngine)
|
||||
assert gcm.process_result_value(new_value, DIALECT) == "hunter2"
|
||||
|
||||
|
||||
def test_engine_migration_idempotent_for_already_target() -> None:
|
||||
"""A value already in the target (GCM) form is skipped — runs are resumable.
|
||||
|
||||
The column is still configured as CBC (config not yet flipped), but the
|
||||
value has already been migrated to GCM, so it must be left untouched.
|
||||
"""
|
||||
gcm_value = _encrypted_type(AesGcmEngine).process_bind_param("hunter2", DIALECT)
|
||||
cbc_column = _encrypted_type(AesEngine)
|
||||
|
||||
migrator = _engine_migrator(AesGcmEngine)
|
||||
conn = MagicMock()
|
||||
row = {"id": 1, "password": gcm_value}
|
||||
stats = ReEncryptStats()
|
||||
|
||||
migrator._re_encrypt_row( # noqa: SLF001
|
||||
conn, row, "dbs", {"password": cbc_column}, ["id"], stats
|
||||
)
|
||||
|
||||
assert stats == ReEncryptStats(skipped=1)
|
||||
assert conn.execute.call_count == 0
|
||||
|
||||
|
||||
def test_engine_migration_reads_cbc_after_config_already_flipped() -> None:
|
||||
"""CBC source is still migrated when the config was flipped to GCM first.
|
||||
|
||||
If an operator sets the config engine to GCM before running the migrator,
|
||||
the column type can no longer read the CBC value; the previous-key (== the
|
||||
current key) AES-CBC decryptor recovers it and it is re-encrypted as GCM.
|
||||
"""
|
||||
cbc_value = _encrypted_type(AesEngine).process_bind_param("hunter2", DIALECT)
|
||||
gcm_column = _encrypted_type(AesGcmEngine)
|
||||
|
||||
migrator = _engine_migrator(AesGcmEngine)
|
||||
conn = MagicMock()
|
||||
row = {"id": 1, "password": cbc_value}
|
||||
stats = ReEncryptStats()
|
||||
|
||||
migrator._re_encrypt_row( # noqa: SLF001
|
||||
conn, row, "dbs", {"password": gcm_column}, ["id"], stats
|
||||
)
|
||||
|
||||
assert stats == ReEncryptStats(re_encrypted=1)
|
||||
new_value = conn.execute.call_args.kwargs["password"]
|
||||
assert gcm_column.process_result_value(new_value, DIALECT) == "hunter2"
|
||||
|
||||
|
||||
def test_engine_migration_gcm_to_cbc_rolls_back() -> None:
|
||||
"""GCM source value is rolled back to CBC under the same SECRET_KEY.
|
||||
|
||||
The reverse of the forward migration (``--engine aes``). The idempotency
|
||||
fast-path decrypts in the *target* form first; since the target here is
|
||||
unauthenticated AES-CBC, this guards against it mis-reading the AES-GCM
|
||||
ciphertext and wrongly skipping the value instead of re-encrypting it.
|
||||
"""
|
||||
gcm_value = _encrypted_type(AesGcmEngine).process_bind_param("hunter2", DIALECT)
|
||||
gcm_column = _encrypted_type(AesGcmEngine)
|
||||
|
||||
migrator = _engine_migrator(AesEngine)
|
||||
conn = MagicMock()
|
||||
row = {"id": 1, "password": gcm_value}
|
||||
stats = ReEncryptStats()
|
||||
|
||||
migrator._re_encrypt_row( # noqa: SLF001
|
||||
conn, row, "dbs", {"password": gcm_column}, ["id"], stats
|
||||
)
|
||||
|
||||
assert stats == ReEncryptStats(re_encrypted=1)
|
||||
new_value = conn.execute.call_args.kwargs["password"]
|
||||
assert new_value != gcm_value
|
||||
# The rolled-back value now decrypts as AES-CBC back to the plaintext.
|
||||
assert _encrypted_type(AesEngine).process_result_value(new_value, DIALECT) == (
|
||||
"hunter2"
|
||||
)
|
||||
|
||||
|
||||
def test_rollback_authenticated_probe_wins_over_spurious_cbc_skip() -> None:
|
||||
"""Rolling back to unauthenticated CBC must re-encrypt a provably-GCM value,
|
||||
never skip it — even if the unauthenticated target decrypt coincidentally
|
||||
succeeds. The authenticated (GCM) interpretation must win.
|
||||
|
||||
The coincidental CBC-decrypt-of-a-GCM-blob can't be crafted deterministically
|
||||
(it's a ~2^-128 event), so this pins the *ordering invariant* instead: force
|
||||
the target (CBC) read to "succeed", and assert the value is still re-encrypted
|
||||
because the authenticated probe is consulted first and wins. Without the
|
||||
guard this row would be wrongly counted as ``skipped``.
|
||||
"""
|
||||
gcm_value = _encrypted_type(AesGcmEngine).process_bind_param("hunter2", DIALECT)
|
||||
cbc_column = _encrypted_type(AesEngine)
|
||||
|
||||
migrator = _engine_migrator(AesEngine) # target = unauthenticated CBC (rollback)
|
||||
|
||||
# Simulate the spurious case: the unauthenticated CBC target read "succeeds"
|
||||
# even though the value is really GCM.
|
||||
spurious_target = MagicMock()
|
||||
spurious_target.engine = AesEngine()
|
||||
spurious_target.underlying_type = cbc_column.underlying_type
|
||||
spurious_target.process_result_value.return_value = "garbage"
|
||||
spurious_target.process_bind_param.return_value = b"new-cbc-ciphertext"
|
||||
|
||||
conn = MagicMock()
|
||||
row = {"id": 1, "password": gcm_value}
|
||||
stats = ReEncryptStats()
|
||||
|
||||
with mock.patch.object(migrator, "_target_type", return_value=spurious_target):
|
||||
migrator._re_encrypt_row( # noqa: SLF001
|
||||
conn, row, "dbs", {"password": cbc_column}, ["id"], stats
|
||||
)
|
||||
|
||||
# Re-encrypted, NOT skipped: the GCM authenticator beat the spurious CBC read.
|
||||
assert stats == ReEncryptStats(re_encrypted=1)
|
||||
assert spurious_target.process_bind_param.call_count == 1
|
||||
|
||||
|
||||
def test_combined_key_rotation_and_engine_migration() -> None:
|
||||
"""Old-key AES-CBC value → current-key AES-GCM in a single run.
|
||||
|
||||
Exercises the combined mode (``--previous_secret_key`` + ``--engine``): the
|
||||
source ciphertext is CBC under the *previous* key, and must be recovered and
|
||||
re-encrypted as GCM under the *current* key. This is the mode most likely to
|
||||
regress, since each single-mode test pins only the other's variable.
|
||||
"""
|
||||
old_key = "o" * 32
|
||||
cbc_old = EncryptedType(String(1024), key=lambda: old_key, engine=AesEngine)
|
||||
old_value = cbc_old.process_bind_param("hunter2", DIALECT)
|
||||
cbc_column = _encrypted_type(AesEngine)
|
||||
|
||||
migrator = _engine_migrator(AesGcmEngine)
|
||||
migrator._previous_secret_key = old_key # noqa: SLF001 # rotate key too
|
||||
|
||||
conn = MagicMock()
|
||||
row = {"id": 1, "password": old_value}
|
||||
stats = ReEncryptStats()
|
||||
|
||||
migrator._re_encrypt_row( # noqa: SLF001
|
||||
conn, row, "dbs", {"password": cbc_column}, ["id"], stats
|
||||
)
|
||||
|
||||
assert stats == ReEncryptStats(re_encrypted=1)
|
||||
new_value = conn.execute.call_args.kwargs["password"]
|
||||
# The migrated value decrypts as GCM under the *current* key.
|
||||
assert _encrypted_type(AesGcmEngine).process_result_value(new_value, DIALECT) == (
|
||||
"hunter2"
|
||||
)
|
||||
|
||||
|
||||
def _key_rotation_migrator(previous_secret_key: str) -> SecretsMigrator:
|
||||
"""Build a SecretsMigrator in key-rotation mode without an app context.
|
||||
|
||||
Like ``_engine_migrator`` but with no target engine: values are decrypted
|
||||
under ``previous_secret_key`` and re-encrypted under the current key using
|
||||
the column's own engine.
|
||||
"""
|
||||
migrator = SecretsMigrator.__new__(SecretsMigrator)
|
||||
migrator._secret_key = SECRET_KEY # noqa: SLF001
|
||||
migrator._target_engine = None # noqa: SLF001
|
||||
migrator._previous_secret_key = previous_secret_key # noqa: SLF001
|
||||
migrator._dialect = DIALECT # noqa: SLF001
|
||||
return migrator
|
||||
|
||||
|
||||
def test_key_rotation_for_aes_gcm_column() -> None:
|
||||
"""SECRET_KEY rotation works for an AES-GCM column.
|
||||
|
||||
The previous-key fallback must use the column's AES-GCM engine, otherwise
|
||||
GCM ciphertext written under the old key cannot be decrypted and the
|
||||
rotation rolls back.
|
||||
"""
|
||||
old_key = "o" * 32
|
||||
gcm_old = EncryptedType(String(1024), key=lambda: old_key, engine=AesGcmEngine)
|
||||
old_value = gcm_old.process_bind_param("hunter2", DIALECT)
|
||||
gcm_column = _encrypted_type(AesGcmEngine)
|
||||
|
||||
migrator = _key_rotation_migrator(previous_secret_key=old_key)
|
||||
conn = MagicMock()
|
||||
row = {"id": 1, "password": old_value}
|
||||
stats = ReEncryptStats()
|
||||
|
||||
migrator._re_encrypt_row( # noqa: SLF001
|
||||
conn, row, "dbs", {"password": gcm_column}, ["id"], stats
|
||||
)
|
||||
|
||||
assert stats == ReEncryptStats(re_encrypted=1)
|
||||
new_value = conn.execute.call_args.kwargs["password"]
|
||||
assert gcm_column.process_result_value(new_value, DIALECT) == "hunter2"
|
||||
|
||||
|
||||
def test_engine_migration_unreadable_value_counts_as_failure() -> None:
|
||||
"""A value no engine/key can read is a failure, not a silent pass-through."""
|
||||
migrator = _engine_migrator(AesGcmEngine)
|
||||
conn = MagicMock()
|
||||
row = {"id": 1, "password": b"not-valid-ciphertext"}
|
||||
stats = ReEncryptStats()
|
||||
|
||||
migrator._re_encrypt_row( # noqa: SLF001
|
||||
conn, row, "dbs", {"password": _encrypted_type(AesEngine)}, ["id"], stats
|
||||
)
|
||||
|
||||
assert stats == ReEncryptStats(failed=1)
|
||||
assert conn.execute.call_count == 0
|
||||
Reference in New Issue
Block a user