feat(docs): Docusaurus multi-versioning, Developer Portal starter kit (#34271)

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-18 23:55:00 +00:00 · 2025-08-22 09:53:01 -07:00
parent 2b2cc96f11
commit 0a45a89786
71 changed files with 16791 additions and 1809 deletions
--- a/docs/versioned_docs/version-6.0.0/configuration/async-queries-celery.mdx
+++ b/docs/versioned_docs/version-6.0.0/configuration/async-queries-celery.mdx
@@ -0,0 +1,104 @@
+---
+title: Async Queries via Celery
+hide_title: true
+sidebar_position: 4
+version: 1
+---
+
+# Async Queries via Celery
+
+## Celery
+
+On large analytic databases, it’s common to run queries that execute for minutes or hours. To enable
+support for long running queries that execute beyond the typical web request’s timeout (30-60
+seconds), it is necessary to configure an asynchronous backend for Superset which consists of:
+
+- one or many Superset workers (which is implemented as a Celery worker), and can be started with
+  the `celery worker` command, run `celery worker --help` to view the related options.
+- a celery broker (message queue) for which we recommend using Redis or RabbitMQ
+- a results backend that defines where the worker will persist the query results
+
+Configuring Celery requires defining a `CELERY_CONFIG` in your `superset_config.py`. Both the worker
+and web server processes should have the same configuration.
+
+```python
+class CeleryConfig(object):
+    broker_url = "redis://localhost:6379/0"
+    imports = (
+        "superset.sql_lab",
+        "superset.tasks.scheduler",
+    )
+    result_backend = "redis://localhost:6379/0"
+    worker_prefetch_multiplier = 10
+    task_acks_late = True
+    task_annotations = {
+        "sql_lab.get_sql_results": {
+            "rate_limit": "100/s",
+        },
+    }
+
+CELERY_CONFIG = CeleryConfig
+```
+
+To start a Celery worker to leverage the configuration, run the following command:
+
+```bash
+celery --app=superset.tasks.celery_app:app worker --pool=prefork -O fair -c 4
+```
+
+To start a job which schedules periodic background jobs, run the following command:
+
+```bash
+celery --app=superset.tasks.celery_app:app beat
+```
+
+To setup a result backend, you need to pass an instance of a derivative of from
+from flask_caching.backends.base import BaseCache to the RESULTS_BACKEND configuration key in your superset_config.py. You can
+use Memcached, Redis, S3 (https://pypi.python.org/pypi/s3werkzeugcache), memory or the file system
+(in a single server-type setup or for testing), or to write your own caching interface. Your
+`superset_config.py` may look something like:
+
+```python
+# On S3
+from s3cache.s3cache import S3Cache
+S3_CACHE_BUCKET = 'foobar-superset'
+S3_CACHE_KEY_PREFIX = 'sql_lab_result'
+RESULTS_BACKEND = S3Cache(S3_CACHE_BUCKET, S3_CACHE_KEY_PREFIX)
+
+# On Redis
+from flask_caching.backends.rediscache import RedisCache
+RESULTS_BACKEND = RedisCache(
+    host='localhost', port=6379, key_prefix='superset_results')
+```
+
+For performance gains, [MessagePack](https://github.com/msgpack/msgpack-python) and
+[PyArrow](https://arrow.apache.org/docs/python/) are now used for results serialization. This can be
+disabled by setting `RESULTS_BACKEND_USE_MSGPACK = False` in your `superset_config.py`, should any
+issues arise. Please clear your existing results cache store when upgrading an existing environment.
+
+**Important Notes**
+
+- It is important that all the worker nodes and web servers in the Superset cluster _share a common
+  metadata database_. This means that SQLite will not work in this context since it has limited
+  support for concurrency and typically lives on the local file system.
+
+- There should _only be one instance of celery beat running_ in your entire setup. If not,
+  background jobs can get scheduled multiple times resulting in weird behaviors like duplicate
+  delivery of reports, higher than expected load / traffic etc.
+
+- SQL Lab will _only run your queries asynchronously if_ you enable **Asynchronous Query Execution**
+  in your database settings (Sources > Databases > Edit record).
+
+## Celery Flower
+
+Flower is a web based tool for monitoring the Celery cluster which you can install from pip:
+
+```bash
+pip install flower
+```
+
+You can run flower using:
+
+```bash
+celery --app=superset.tasks.celery_app:app flower
+```