mirror of
https://github.com/apache/superset.git
synced 2026-04-07 18:35:15 +00:00
---
title: Caching
hide_title: true
sidebar_position: 3
version: 1
---

# Caching

Superset uses [Flask-Caching](https://flask-caching.readthedocs.io/) for caching.
Flask-Caching supports various caching backends, including Redis (recommended), Memcached,
SimpleCache (in-memory), and the local filesystem.
[Custom cache backends](https://flask-caching.readthedocs.io/en/latest/#custom-cache-backends)
are also supported.

Caching is configured by providing dictionaries in
`superset_config.py` that comply with [the Flask-Caching config specifications](https://flask-caching.readthedocs.io/en/latest/#configuring-flask-caching).

The following cache configurations can be customized in this way:

- Dashboard filter state (required): `FILTER_STATE_CACHE_CONFIG`
- Explore chart form data (required): `EXPLORE_FORM_DATA_CACHE_CONFIG`
- Metadata cache (optional): `CACHE_CONFIG`
- Charting data queried from datasets (optional): `DATA_CACHE_CONFIG`

For example, to configure the filter state cache using Redis:

```python
FILTER_STATE_CACHE_CONFIG = {
    'CACHE_TYPE': 'RedisCache',
    'CACHE_DEFAULT_TIMEOUT': 86400,
    'CACHE_KEY_PREFIX': 'superset_filter_cache',
    'CACHE_REDIS_URL': 'redis://localhost:6379/0'
}
```

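Because all four settings take the same dictionary shape, one way to keep them consistent is a small helper in `superset_config.py`. The following is only a sketch: the helper name `redis_cache_config`, the key prefixes other than `superset_filter_cache`, and the 300-second metadata timeout are illustrative assumptions, not Superset defaults.

```python
from typing import Any, Dict


def redis_cache_config(prefix: str, timeout: int = 86400) -> Dict[str, Any]:
    """Build a Flask-Caching config dict for a Redis-backed cache.

    Hypothetical helper: the name, prefixes, and timeouts are illustrative.
    """
    return {
        "CACHE_TYPE": "RedisCache",
        "CACHE_DEFAULT_TIMEOUT": timeout,
        "CACHE_KEY_PREFIX": prefix,  # keep prefixes distinct per cache
        "CACHE_REDIS_URL": "redis://localhost:6379/0",
    }


# Required caches
FILTER_STATE_CACHE_CONFIG = redis_cache_config("superset_filter_cache")
EXPLORE_FORM_DATA_CACHE_CONFIG = redis_cache_config("superset_explore_cache")

# Optional caches
CACHE_CONFIG = redis_cache_config("superset_metadata_cache", timeout=300)
DATA_CACHE_CONFIG = redis_cache_config("superset_data_cache")
```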
## Dependencies

To use a dedicated cache store, additional Python libraries must be installed:

- Redis: we recommend the [redis](https://pypi.python.org/pypi/redis) Python package.
- Memcached: we recommend the [pylibmc](https://pypi.org/project/pylibmc/) client library, as
  `python-memcached` does not handle storing binary data correctly.

These libraries can be installed using pip.

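To confirm that the client library for a chosen backend is available in the environment Superset runs in, a quick check along these lines may help. This is a sketch rather than part of Superset: the function name `missing_cache_clients` and the `CLIENT_MODULES` mapping are hypothetical, and the module names are the import names of the packages recommended above.

```python
import importlib.util

# Import names of the client libraries recommended above.
CLIENT_MODULES = {"Redis": "redis", "Memcached": "pylibmc"}


def missing_cache_clients(modules=CLIENT_MODULES):
    """Return the backends whose Python client module cannot be found."""
    return [
        backend
        for backend, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    for backend in missing_cache_clients():
        print(f"{backend} client library is not installed")
```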
## Fallback Metastore Cache

Note that some form of Filter State and Explore caching is required. If either of these caches
is undefined, Superset falls back to a built-in cache that stores data in the metadata
database. While a dedicated cache is recommended, the built-in cache can also be used
to cache other data.

For example, to use the built-in cache to store chart data, use the following config:

```python
DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "SupersetMetastoreCache",
    "CACHE_KEY_PREFIX": "superset_results",  # make sure this string is unique to avoid collisions
    "CACHE_DEFAULT_TIMEOUT": 86400,  # 60 seconds * 60 minutes * 24 hours
}
```

## Chart Cache Timeout

The cache timeout for charts may be overridden by the settings for an individual chart, dataset, or
database. Each of these configurations is checked in that order before falling back to the default
value defined in `DATA_CACHE_CONFIG`.

Note that setting the cache timeout to `-1` disables caching for charting data, either
per chart, dataset, or database, or by default if set in `DATA_CACHE_CONFIG`.

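The lookup order described above can be pictured with a short sketch. This is not Superset's implementation: the function names are hypothetical, and it assumes an unset timeout is represented as `None`.

```python
from typing import Optional


def resolve_cache_timeout(
    chart: Optional[int],
    dataset: Optional[int],
    database: Optional[int],
    default: int,
) -> int:
    """Return the first explicitly set timeout, checked in the documented order."""
    for timeout in (chart, dataset, database):
        if timeout is not None:  # assumption: None means "not set"
            return timeout
    return default  # e.g. CACHE_DEFAULT_TIMEOUT from DATA_CACHE_CONFIG


def caching_disabled(timeout: int) -> bool:
    """A timeout of -1 disables caching of chart data."""
    return timeout == -1


# The dataset value wins because the chart has no override.
print(resolve_cache_timeout(None, 600, 3600, default=86400))  # 600
# A chart-level -1 disables caching regardless of the other levels.
print(caching_disabled(resolve_cache_timeout(-1, 600, None, default=86400)))  # True
```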
## SQL Lab Query Results

Caching for SQL Lab query results is used when async queries are enabled and is configured using
`RESULTS_BACKEND`.

Note that this setting does not take a Flask-Caching dictionary; it requires a cachelib object
instead.

See [Async Queries via Celery](/docs/configuration/async-queries-celery) for details.

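As a rough illustration of the cachelib object mentioned above, a Redis-backed results backend could look like the following. The host, port, and key prefix are placeholder values, and the sketch assumes the `cachelib` and `redis` packages are installed.

```python
from cachelib.redis import RedisCache

# Placeholder connection details; adjust to your Redis deployment.
RESULTS_BACKEND = RedisCache(
    host="localhost",
    port=6379,
    key_prefix="superset_results",
)
```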
## Caching Thumbnails

This is an optional feature that can be turned on by activating its [feature flag](/docs/configuration/configuring-superset#feature-flags) in the configuration:

```python
FEATURE_FLAGS = {
    "THUMBNAILS": True,
    "THUMBNAILS_SQLA_LISTENERS": True,
}
```

By default, thumbnails are rendered per user and fall back to the Selenium user for anonymous users.
To always render thumbnails as a fixed user (`admin` in this example), use the following configuration:

```python
from superset.tasks.types import FixedExecutor

THUMBNAIL_EXECUTORS = [FixedExecutor("admin")]
```

This feature requires a cache system and Celery workers. All thumbnails are stored in the cache
and are processed asynchronously by the workers.

An example config where images are stored on S3 could be:

```python
from flask import Flask
from s3cache.s3cache import S3Cache

...

class CeleryConfig(object):
    broker_url = "redis://localhost:6379/0"
    imports = (
        "superset.sql_lab",
        "superset.tasks.thumbnails",
    )
    result_backend = "redis://localhost:6379/0"
    worker_prefetch_multiplier = 10
    task_acks_late = True


CELERY_CONFIG = CeleryConfig

def init_thumbnail_cache(app: Flask) -> S3Cache:
    return S3Cache("bucket_name", "thumbs_cache/")


THUMBNAIL_CACHE_CONFIG = init_thumbnail_cache
```

Using the above example, cache keys for dashboards will be `superset_thumb__dashboard__{ID}`. You can
override the base URL for Selenium using:

```python
WEBDRIVER_BASEURL = "https://superset.company.com"
```

Additional Selenium WebDriver configuration can be set using `WEBDRIVER_CONFIGURATION`. You can
implement a custom function to authenticate Selenium. The default function uses the `flask-login`
session cookie. Here's an example of a custom function signature:

```python
from selenium.webdriver.remote.webdriver import WebDriver

def auth_driver(driver: WebDriver, user: "User") -> WebDriver:
    pass
```

Then, in the configuration:

```python
WEBDRIVER_AUTH_FUNC = auth_driver
```