mirror of
https://github.com/apache/superset.git
synced 2026-04-14 13:44:46 +00:00
docs: bifurcate documentation into user and admin sections (#38196)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
docs/admin_docs/installation/docker-compose.mdx
@@ -0,0 +1,286 @@
---
title: Docker Compose
hide_title: true
sidebar_position: 5
version: 1
---

import useBaseUrl from "@docusaurus/useBaseUrl";

# Using Docker Compose

<img src={useBaseUrl("/img/docker-compose.webp")} width="150" />
<br /><br />

:::caution
Since `docker compose` is primarily designed to run a set of containers on **a single host**
and can't support requirements for **high availability**, we neither support nor recommend
using our `docker compose` constructs for production-type use cases. For single-host
environments, we recommend using [minikube](https://minikube.sigs.k8s.io/docs/start/) along
with our [installing on k8s](https://superset.apache.org/admin-docs/installation/running-on-kubernetes)
documentation.
:::

As mentioned in our [quickstart guide](/user-docs/quickstart), the fastest way to try
Superset locally is using Docker Compose on a Linux or macOS
computer. Superset does not have official support for Windows. It's also the easiest
way to launch a fully functioning **development environment** quickly.

We support four main ways to run `docker compose`:

1. **docker-compose.yml:** for interactive development, where we mount your local folder with the
   frontend/backend files so that you can edit them and see your changes
   in the app in real time.
1. **docker-compose-light.yml:** a lightweight configuration with minimal services (database,
   Superset app, and frontend dev server) for development. It uses in-memory caching instead of Redis
   and is designed for running multiple instances simultaneously.
1. **docker-compose-non-dev.yml:** where we build a more immutable image based on the
   local branch and get all the required images running. The state of the local branch
   at the time you fire this up will be reflected, but changes made to the code
   while it is `up` won't be reflected in the app.
1. **docker-compose-image-tag.yml:** where we fetch an image from Docker Hub, say for the
   `5.0.0` release, and fire it up so you can try it. Here, what's in
   the local branch has no effect on what's running; we just fetch and run
   pre-built images from Docker Hub. For `docker compose` to work along with the
   Postgres image it boots up, you'll want to point to a `-dev`-suffixed TAG, as in
   `export TAG=5.0.0-dev` or `export TAG=4.1.2-dev`, with `latest-dev` being the default.
   The `dev` builds include the `psycopg2-binary` package required to connect
   to the Postgres database launched as part of the `docker compose` builds.
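
The `-dev` suffix rule described for `docker-compose-image-tag.yml` can be captured in a small shell helper. This is only a minimal sketch: `dev_tag` and `GIT_TAG` are hypothetical names introduced here for illustration and are not part of Superset's tooling.

```bash
# Hypothetical helper: append "-dev" to a release tag unless it is already there.
dev_tag() {
  case "$1" in
    *-dev) printf '%s\n' "$1" ;;
    *)     printf '%s\n' "$1-dev" ;;
  esac
}

GIT_TAG=5.0.0                  # release you want to try (illustrative value)
TAG="$(dev_tag "$GIT_TAG")"    # image tag that docker compose will read
export TAG
echo "$TAG"
```

Keeping the release number and the image tag in separate variables makes it harder to forget the suffix when switching versions.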

More on each of these approaches below, once the requirements are in place.

## Requirements

This documentation assumes that you have [Docker](https://www.docker.com) and
[git](https://git-scm.com/) installed. We used to use the standalone `docker-compose` command, but it
is on the path to deprecation, so we now use `docker compose` instead.
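
A quick way to confirm these requirements is a short check script. The sketch below assumes a POSIX shell; `have` is a hypothetical helper name, and the script only reports what it finds without installing anything.

```bash
# Report whether a command is available on PATH.
have() { command -v "$1" >/dev/null 2>&1; }

for tool in docker git; do
  if have "$tool"; then
    echo "$tool: found"
  else
    echo "$tool: MISSING"
  fi
done

# `docker compose version` only works when the Compose plugin is installed.
if have docker && docker compose version >/dev/null 2>&1; then
  echo "docker compose: found"
else
  echo "docker compose: MISSING (the legacy docker-compose binary is not enough)"
fi
```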

## 1. Clone Superset's GitHub repository

[Clone Superset's repo](https://github.com/apache/superset) in your terminal with the
following command:

```bash
git clone --depth=1 https://github.com/apache/superset.git
```

Once that command completes successfully, you should see a new `superset` folder in your
current directory.

## 2. Launch Superset Through Docker Compose

We assume here that you're familiar with `docker compose` mechanics. We'll refer generally
to `docker compose up`, even though in some cases you may want to force a check for newer remote
images using `docker compose pull`, force a build with `docker compose build`, or force a build
on the latest base images using `docker compose build --pull`. In most cases, though, the simple
`up` command should do just fine. Refer to the Docker Compose docs for more information on the topic.

### Option #1 - for an interactive development environment

```bash
# The --build argument ensures all the layers are up-to-date
docker compose up --build
```

:::tip
When running in development mode, the `superset-node`
container needs to finish building assets in order for the UI to render properly. If you would just
like to try out Superset without making any code changes, follow the steps documented for
`production` or a specific version below.
:::

:::tip
By default, we mount the local `superset-frontend` folder here and run `npm install` as well
as `npm run dev`, which triggers webpack to compile and bundle the frontend code. Depending
on your local setup, especially if you have less than 16GB of memory, these operations may be
very slow. In this case, we recommend you set the env var
`BUILD_SUPERSET_FRONTEND_IN_DOCKER` to `false` and run the frontend build locally in a terminal instead.
Simply run `npm i && npm run dev`; this should be MUCH faster.
:::

:::tip
Sometimes your npm-related state can get out of whack. Running `npm run prune` from
the `superset-frontend/` folder will nuke the various packages' `node_modules/` folders
and help you start fresh. In the context of `docker compose`, setting
`export NPM_RUN_PRUNE=true` prior to running `docker compose up` will trigger that
from within Docker. This will slow down the startup but will fix various npm-related issues.
:::

### Option #2 - lightweight development with multiple instances

For a lighter development setup that uses fewer resources and supports running multiple instances:

```bash
# Single lightweight instance (default port 9001)
docker compose -f docker-compose-light.yml up

# Multiple instances with different ports
NODE_PORT=9001 docker compose -p superset-1 -f docker-compose-light.yml up
NODE_PORT=9002 docker compose -p superset-2 -f docker-compose-light.yml up
NODE_PORT=9003 docker compose -p superset-3 -f docker-compose-light.yml up
```

This configuration includes:

- PostgreSQL database (internal network only)
- Superset application server
- Frontend development server with webpack hot reloading
- In-memory caching (no Redis)
- Isolated volumes and networks per instance

Access each instance at `http://localhost:{NODE_PORT}` (e.g., `http://localhost:9001`).
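
Because each foreground `up` occupies a terminal, it can be convenient to start several light instances detached. The following is a sketch under stated assumptions: `instance_url` and `start_light_instance` are hypothetical helpers, the port numbering simply follows the 9001, 9002, ... pattern shown above, and `-d` is the standard detached flag of `docker compose up`.

```bash
# Hypothetical helpers; run them from the root of the cloned superset repo.
instance_url() {
  echo "http://localhost:$1"
}

start_light_instance() {
  # $1 = instance number; instance N listens on port 9000 + N.
  n="$1"
  port=$((9000 + n))
  NODE_PORT="$port" docker compose -p "superset-$n" -f docker-compose-light.yml up -d
  echo "superset-$n: $(instance_url "$port")"
}

# Usage: start three detached instances.
# for n in 1 2 3; do start_light_instance "$n"; done
```

The `-p` project name is what keeps the volumes and networks of each instance separate, so every instance needs its own name as well as its own port.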

### Option #3 - build a set of immutable images from the local branch

```bash
docker compose -f docker-compose-non-dev.yml up
```

### Option #4 - boot up an official release

```bash
# Set the version you want to run
export TAG=5.0.0
# Fetch the tag you're about to check out (assuming you shallow-cloned the repo)
git fetch --depth=1 origin tag $TAG
# Could also fetch all tags too if you've got bandwidth to spare
# git fetch --tags
# Checkout the corresponding git ref
git checkout $TAG
# Fire up docker compose
docker compose -f docker-compose-image-tag.yml up
```

Here, the `TAG` env var can reference various release tags, GitHub SHAs, and the latest `master`.
Refer to the Docker-related documentation to learn more about the existing tags you can point to
on Docker Hub.

:::note
For options #3 and #4, we recommend checking out the release tag from the git repository
(e.g., `git checkout 5.0.0`) for more predictable results. This ensures that the `docker-compose*.yml`
configurations and the mounted `docker/` scripts are in sync with the image you are
looking to fire up.
:::

## `docker compose` tips & configuration

:::caution
All of the content belonging to a Superset instance - charts, dashboards, users, etc. - is stored in
its metadata database. In production, this database should be backed up. The default installation
with docker compose will store that data in a PostgreSQL database contained in a Docker
[volume](https://docs.docker.com/storage/volumes/), which is not backed up.

Again, **THE DOCKER-COMPOSE INSTALLATION IS NOT PRODUCTION-READY OUT OF THE BOX.**

:::

You should see a stream of logging output from the containers being launched on your machine. Once
this output slows, you should have a running instance of Superset on your local machine! To avoid
the wall of text on future runs, add the `-d` option to the end of the `docker compose up` command.

### Configuring Further

The following is for users who want to configure how Superset runs in Docker Compose; otherwise, you
can skip to the next section.

You can install additional Python packages and apply config overrides by following the steps
described in [docker/README.md](https://github.com/apache/superset/tree/master/docker#configuration).

Note that `docker/.env` sets the default environment variables for all the Docker images
used by `docker compose`, and that `docker/.env-local` can be used to override those defaults.
Also note that `docker/.env-local` is listed in our `.gitignore`,
which keeps developers from accidentally committing potentially sensitive configuration to the repository.

One important variable is `SUPERSET_LOAD_EXAMPLES`, which determines whether the `superset_init`
container will populate example data and visualizations into the metadata database. These examples
are helpful for learning and testing out Superset but unnecessary for experienced users and
production deployments. The loading process can sometimes take a few minutes and a good amount of
CPU, so you may want to disable it on a resource-constrained device.
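
As an illustration, example loading could be switched off through a local override file. This sketch assumes that `no` is the value that disables loading; check the default in your checkout's `docker/.env` before relying on it.

```bash
# Run from the root of the cloned superset repo.
env_file="docker/.env-local"
mkdir -p "$(dirname "$env_file")"

# Assumption: "no" is the value that turns example loading off.
echo "SUPERSET_LOAD_EXAMPLES=no" >> "$env_file"

# Show the local overrides now in place.
cat "$env_file"
```

Since `docker/.env-local` is ignored by git, this override stays on your machine and does not show up in `git status`.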

More advanced or dynamic configuration is typically managed in a `superset_config.py` file
located in your `PYTHONPATH`. In this setup, you can achieve the same by providing a
`docker/pythonpath_dev/superset_config_docker.py` file, which is ignored by git
(preventing you from committing or pushing your local configuration back to the repository).
The mechanics of this are in `docker/pythonpath_dev/superset_config.py`, where you can see
that the logic runs `from superset_config_docker import *`.
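
A minimal sketch of creating such an override file from the shell follows. The `ROW_LIMIT` setting is used purely as an illustrative value; substitute the settings you actually need. Note that the command overwrites any existing `superset_config_docker.py`.

```bash
# Run from the root of the cloned superset repo.
config_file="docker/pythonpath_dev/superset_config_docker.py"
mkdir -p "$(dirname "$config_file")"

# ROW_LIMIT is only an illustrative override; put your own settings here.
cat > "$config_file" <<'EOF'
# Local overrides imported by docker/pythonpath_dev/superset_config.py
ROW_LIMIT = 5000
EOF

cat "$config_file"
```

The containers need to be restarted for the new configuration to be picked up.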

:::note
Users often want to connect to other databases from Superset. Currently, the easiest way to
do this is to modify the `docker-compose-non-dev.yml` file and add your database as a service that
the other services depend on (via `x-superset-depends-on`). Others have attempted to set
`network_mode: host` on the Superset services, but this generally breaks the installation,
because the configuration requires use of the Docker Compose DNS resolver for the service names.
If you have a good solution for this, let us know!
:::

:::note
Superset uses [Scarf Gateway](https://about.scarf.sh/scarf-gateway) to collect telemetry
data. Knowing the installation counts for different Superset versions informs the project's
decisions about patching and long-term support. Scarf purges personally identifiable information
(PII) and provides only aggregated statistics.

To opt out of this data collection for packages downloaded through the Scarf Gateway by your Docker
Compose based installation, edit the `x-superset-image:` line in your `docker-compose.yml` and
`docker-compose-non-dev.yml` files, replacing `apachesuperset.docker.scarf.sh/apache/superset` with
`apache/superset` to pull the image directly from Docker Hub.

To disable the Scarf telemetry pixel, set the `SCARF_ANALYTICS` environment variable to `False` in
your terminal and/or in your `docker/.env` file.
:::
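
The two opt-out steps above could be scripted roughly as follows. `use_docker_hub_image` is a hypothetical helper; it performs a plain text substitution with `sed` and leaves a `.bak` copy next to each file it edits, so review the result before committing to it.

```bash
# Hypothetical helper: rewrite the Scarf Gateway image reference in place,
# keeping a .bak copy of each file it touches.
use_docker_hub_image() {
  for f in "$@"; do
    sed -i.bak 's#apachesuperset\.docker\.scarf\.sh/apache/superset#apache/superset#g' "$f"
  done
}

# Disable the telemetry pixel for this shell session.
export SCARF_ANALYTICS=False

# Usage, from the root of the cloned repo:
# use_docker_hub_image docker-compose.yml docker-compose-non-dev.yml
```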

## 3. Log in to Superset

Your local Superset instance also includes a Postgres server to store your data and comes
pre-loaded with some example datasets that ship with Superset. You can now access Superset in your
web browser by visiting `http://localhost:8088`. Note that many browsers now default to `https`; if
yours is one of them, please make sure it uses `http`.

Log in with the default username and password:

```bash
username: admin
```

```bash
password: admin
```
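
If the login page does not load right away, the containers may still be starting. The sketch below polls the instance until it answers; `wait_for_superset` is a hypothetical helper, and the `/health` path is an assumption about the running Superset version, so adjust it if your instance exposes a different endpoint.

```bash
# Hypothetical helper: poll Superset until it answers or attempts run out.
# $1 = base URL (default http://localhost:8088), $2 = max attempts (default 30)
wait_for_superset() {
  url="${1:-http://localhost:8088}"
  attempts="${2:-30}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Assumption: the instance serves a health check at /health.
    if curl -fsS "$url/health" >/dev/null 2>&1; then
      echo "Superset is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "Superset did not respond at $url" >&2
  return 1
}

# Usage:
# wait_for_superset && echo "ready to log in"
```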

## 4. Connecting Superset to your local database instance

When you run Superset using `docker` or `docker compose`, it runs in its own Docker container, as if
Superset were running on a separate machine entirely. Therefore, attempts to connect to your local
database with the hostname `localhost` won't work, as `localhost` refers to the Docker container
Superset is running in, not your actual host machine. Fortunately, Docker provides an easy way
to access network resources on the host machine from inside a container, and we will leverage this
capability to connect to our local database instance.

The instructions here are for connecting to PostgreSQL (running on your host machine) from
Superset (running in its Docker container). Other databases may have slightly different
configurations, but the gist is the same and boils down to two steps:

1. **(Mac users may skip this step)** Configure the local PostgreSQL/database instance to accept
   public incoming connections. By default, PostgreSQL only allows incoming connections from
   `localhost`, and under Docker, unless you use `--network=host`, `localhost` refers to different
   endpoints on the host machine and in a Docker container. Allowing PostgreSQL to accept
   connections from Docker involves making one-line changes to the files `postgresql.conf` and
   `pg_hba.conf`; you can easily find helpful links tailored to your OS / PG version on the web for
   this task. For Docker, it suffices to whitelist only the IPs `172.0.0.0/8` instead of `*`, but in any
   case you are _warned_ that doing this in a production database _may_ have disastrous consequences, as
   you are opening your database to the public internet.
1. Instead of `localhost`, try using `host.docker.internal` (Mac users, Ubuntu) or `172.18.0.1`
   (Linux users) as the hostname when attempting to connect to the database. This is a Docker internal
   detail: on Mac systems, Docker Desktop creates a DNS entry for the
   hostname `host.docker.internal` that resolves to the correct address for the host machine, whereas
   on Linux this is not the case (at least by default). If neither of these two hostnames works, you
   need to find the address to use yourself: run `ifconfig` or
   `ip addr show` and look at the IP address of the `docker0` interface that Docker should have
   created for you. Alternatively, if you don't see a `docker0` interface, try (with sudo if needed)
   `docker network inspect bridge`, look for a `"Gateway"` entry, and note its IP
   address.
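
The second step can be sketched in shell as follows. `docker_gateway` and `pg_uri` are hypothetical helpers, and the user, password, and database names are placeholders; the `--format` template simply extracts the `"Gateway"` value from the `docker network inspect bridge` output mentioned above.

```bash
# Print the gateway IP of Docker's default bridge network
# (requires a running Docker daemon).
docker_gateway() {
  docker network inspect bridge --format '{{range .IPAM.Config}}{{.Gateway}}{{end}}'
}

# Hypothetical helper: build a SQLAlchemy URI for a Postgres server on the host.
# $1 = host, $2 = user, $3 = password, $4 = database, $5 = port (default 5432)
pg_uri() {
  echo "postgresql://$2:$3@$1:${5:-5432}/$4"
}

# Docker Desktop (placeholder credentials):
pg_uri host.docker.internal myuser mypassword mydb

# Linux, using the detected gateway:
# pg_uri "$(docker_gateway)" myuser mypassword mydb
```

The resulting URI is what you would paste into Superset's database connection form. Passwords containing special characters need to be URL-encoded first.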

## 5. To build or not to build

When running `docker compose up`, Docker will build what is required behind the scenes, but
may use the Docker cache if assets already exist. Running `docker compose build` prior to
`docker compose up`, or the equivalent shortcut `docker compose up --build`, ensures that your
Docker images match the definition in the repository. This should only apply to the main
`docker-compose.yml` file (default) and not to the alternative methods defined above.