mirror of
https://github.com/apache/superset.git
synced 2026-04-26 19:44:58 +00:00
docs: Refactor Documentation Structure (#28161)
Co-authored-by: Evan Rusackas <evan@preset.io> Co-authored-by: Sam Firke <sfirke@users.noreply.github.com>
145
docs/docs/using-superset/chart-params.mdx
Normal file
@@ -0,0 +1,145 @@
|
||||
---
|
||||
title: Chart Parameters Reference
|
||||
hide_title: true
|
||||
sidebar_position: 4
|
||||
version: 1
|
||||
---
|
||||
|
||||
## Chart Parameters
|
||||
|
||||
Chart parameters are stored as a JSON-encoded string in the `slices.params` column and are often referred to throughout the code as form-data. Currently the form-data is neither versioned nor typed, and is thus somewhat free-form. In the future there may be merit in using something like [JSON Schema](https://json-schema.org/) to both annotate and validate the JSON object, in addition to using a Mypy `TypedDict` (introduced in Python 3.8) for typing the form-data in the backend. This section serves as a potential primer for that work.
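As a rough illustration of the typing idea, the sketch below decodes a form-data string and annotates it with a `TypedDict`. The `ChartFormData` class, the `load_form_data` helper, and the sample value are hypothetical and are not part of Superset. Only the field names are taken from this document.

```python
import json
from typing import List, TypedDict


class ChartFormData(TypedDict, total=False):
    """Hypothetical partial typing of the form-data; not a Superset class."""

    datasource: str  # "<datasource_id>__<datasource_type>"
    viz_type: str
    time_range: str
    groupby: List[str]
    row_limit: int


def load_form_data(params: str) -> ChartFormData:
    """Decode a JSON string such as the one stored in `slices.params`."""
    data = json.loads(params)
    if not isinstance(data, dict):
        raise ValueError("form-data must be a JSON object")
    # No validation of individual fields is attempted here.
    return data  # type: ignore[return-value]


# Made-up example value for the `slices.params` column.
sample_params = '{"datasource": "3__table", "viz_type": "table", "row_limit": 100}'
form_data = load_form_data(sample_params)
print(form_data["viz_type"])
```

`total=False` is used because, as noted above, the form-data is free-form and any given field may be absent.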
|
||||
|
||||
The following tables provide a non-exhaustive list of the various fields which can be present in the JSON object grouped by the Explorer pane sections. These values were obtained by extracting the distinct fields from a legacy deployment consisting of tens of thousands of charts and thus some fields may be missing whilst others may be deprecated.
|
||||
|
||||
Note not all fields are correctly categorized. The fields vary based on visualization type and may appear in different sections depending on the type. Verified deprecated columns may indicate a missing migration and/or prior migrations which were unsuccessful and thus future work may be required to clean up the form-data.
|
||||
|
||||
### Datasource & Chart Type
|
||||
|
||||
| Field             | Type     | Notes                                |
| ----------------- | -------- | ------------------------------------ |
| `database_name`   | _string_ | _Deprecated?_                        |
| `datasource`      | _string_ | `<datasource_id>__<datasource_type>` |
| `datasource_id`   | _string_ | _Deprecated?_ See `datasource`       |
| `datasource_name` | _string_ | _Deprecated?_                        |
| `datasource_type` | _string_ | _Deprecated?_ See `datasource`       |
| `viz_type`        | _string_ | The **Visualization Type** widget    |
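The `datasource` field packs two values into one string. A minimal sketch of splitting it, assuming the separator is a double underscore as the `<datasource_id>__<datasource_type>` pattern suggests (the `split_datasource` helper is hypothetical, not a Superset function):

```python
from typing import Tuple


def split_datasource(datasource: str) -> Tuple[str, str]:
    """Split "<datasource_id>__<datasource_type>" into its two parts."""
    datasource_id, sep, datasource_type = datasource.partition("__")
    if not sep or not datasource_id or not datasource_type:
        raise ValueError(f"unexpected datasource value: {datasource!r}")
    # Both parts are kept as strings; no assumption is made about the ID format.
    return datasource_id, datasource_type


print(split_datasource("3__table"))
```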
|
||||
|
||||
### Time
|
||||
|
||||
| Field              | Type     | Notes                           |
| ------------------ | -------- | ------------------------------- |
| `granularity_sqla` | _string_ | The SQLA **Time Column** widget |
| `time_grain_sqla`  | _string_ | The SQLA **Time Grain** widget  |
| `time_range`       | _string_ | The **Time range** widget       |
|
||||
|
||||
### GROUP BY
|
||||
|
||||
| Field                     | Type            | Notes             |
| ------------------------- | --------------- | ----------------- |
| `metrics`                 | _array(string)_ | See Query section |
| `order_asc`               | -               | See Query section |
| `row_limit`               | -               | See Query section |
| `timeseries_limit_metric` | -               | See Query section |
|
||||
|
||||
### NOT GROUPED BY
|
||||
|
||||
| Field           | Type            | Notes                   |
| --------------- | --------------- | ----------------------- |
| `order_by_cols` | _array(string)_ | The **Ordering** widget |
| `row_limit`     | -               | See Query section       |
|
||||
|
||||
### Y Axis 1
|
||||
|
||||
| Field           | Type | Notes                                              |
| --------------- | ---- | -------------------------------------------------- |
| `metric`        | -    | The **Left Axis Metric** widget. See Query section |
| `y_axis_format` | -    | See Y Axis section                                 |
|
||||
|
||||
### Y Axis 2
|
||||
|
||||
| Field      | Type | Notes                                               |
| ---------- | ---- | --------------------------------------------------- |
| `metric_2` | -    | The **Right Axis Metric** widget. See Query section |
|
||||
|
||||
### Query
|
||||
|
||||
| Field | Type | Notes |
| ----- | ---- | ----- |
| `adhoc_filters` | _array(object)_ | The **Filters** widget |
| `extra_filters` | _array(object)_ | Another pathway to the **Filters** widget.<br/>It is generally used to pass dashboard filter parameters to a chart.<br/>It can also be used to append additional filters, on an ad-hoc basis, to a chart that has been saved with its own filters, if the chart is being used as a standalone widget.<br/><br/>For implementation examples see: [utils_tests.py](https://github.com/apache/superset/blob/66a4c94a1ed542e69fe6399bab4c01d4540486cf/tests/utils_tests.py#L181)<br/>For insight into how Superset processes the contents of this parameter see: [exploreUtils/index.js](https://github.com/apache/superset/blob/93c7f5bb446ec6895d7702835f3157426955d5a9/superset-frontend/src/explore/exploreUtils/index.js#L159) |
| `columns` | _array(string)_ | The **Breakdowns** widget |
| `groupby` | _array(string)_ | The **Group by** or **Series** widget |
| `limit` | _number_ | The **Series Limit** widget |
| `metric`<br/>`metric_2`<br/>`metrics`<br/>`percent_metrics`<br/>`secondary_metric`<br/>`size`<br/>`x`<br/>`y` | _string_,_object_,_array(string)_,_array(object)_ | The metric(s) depending on the visualization type |
| `order_asc` | _boolean_ | The **Sort Descending** widget |
| `row_limit` | _number_ | The **Row limit** widget |
| `timeseries_limit_metric` | _object_ | The **Sort By** widget |
|
||||
|
||||
The `metric` (or equivalent) and `timeseries_limit_metric` fields are all composed of either metric names or the JSON representation of the `AdhocMetric` TypeScript type. The `adhoc_filters` field is composed of the JSON representation of the `AdhocFilter` TypeScript type (which can comprise columns or metrics depending on whether it is a WHERE or HAVING clause). The `all_columns`, `all_columns_x`, `columns`, `groupby`, and `order_by_cols` fields all represent column names.
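Because a metric can be either a plain name or an object, code that reads the form-data has to handle both shapes. The sketch below shows one way to do that. The helper names are hypothetical, and the `label`, `expressionType`, and `sqlExpression` keys are assumptions about the serialized `AdhocMetric` shape rather than something this document specifies.

```python
from typing import Any, Dict, List, Union

# A metric is either a saved metric name or an ad-hoc metric object.
Metric = Union[str, Dict[str, Any]]


def metric_label(metric: Metric) -> str:
    """Return a display label for a metric name or an ad-hoc metric object."""
    if isinstance(metric, str):
        return metric  # a saved metric, referenced by name
    # Assumption: ad-hoc metric objects carry a string "label" key.
    label = metric.get("label")
    if not isinstance(label, str):
        raise ValueError("ad-hoc metric has no usable label")
    return label


def metric_labels(form_data: Dict[str, Any]) -> List[str]:
    """Collect labels from the `metric` and `metrics` fields, when present."""
    found: List[Metric] = []
    if "metric" in form_data:
        found.append(form_data["metric"])
    found.extend(form_data.get("metrics") or [])
    return [metric_label(m) for m in found]


# Made-up form-data mixing a saved metric name and an ad-hoc metric object.
example = {
    "metrics": [
        "count",
        {"expressionType": "SQL", "sqlExpression": "SUM(cost)", "label": "Total cost"},
    ],
}
print(metric_labels(example))
```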
|
||||
|
||||
### Chart Options
|
||||
|
||||
| Field          | Type      | Notes                       |
| -------------- | --------- | --------------------------- |
| `color_picker` | _object_  | The **Fixed Color** widget  |
| `label_colors` | _object_  | The **Color Scheme** widget |
| `normalized`   | _boolean_ | The **Normalized** widget   |
|
||||
|
||||
### Y Axis
|
||||
|
||||
| Field            | Type     | Notes                        |
| ---------------- | -------- | ---------------------------- |
| `y_axis_2_label` | _N/A_    | _Deprecated?_                |
| `y_axis_format`  | _string_ | The **Y Axis Format** widget |
| `y_axis_zero`    | _N/A_    | _Deprecated?_                |
|
||||
|
||||
Note that `y_axis_format` is defined under various sections for some charts.
|
||||
|
||||
### Other
|
||||
|
||||
| Field          | Type     | Notes |
| -------------- | -------- | ----- |
| `color_scheme` | _string_ |       |
|
||||
|
||||
### Unclassified
|
||||
|
||||
| Field                         | Type  | Notes |
| ----------------------------- | ----- | ----- |
| `add_to_dash`                 | _N/A_ |       |
| `code`                        | _N/A_ |       |
| `collapsed_fieldsets`         | _N/A_ |       |
| `comparison type`             | _N/A_ |       |
| `country_fieldtype`           | _N/A_ |       |
| `default_filters`             | _N/A_ |       |
| `entity`                      | _N/A_ |       |
| `expanded_slices`             | _N/A_ |       |
| `filter_immune_slice_fields`  | _N/A_ |       |
| `filter_immune_slices`        | _N/A_ |       |
| `flt_col_0`                   | _N/A_ |       |
| `flt_col_1`                   | _N/A_ |       |
| `flt_eq_0`                    | _N/A_ |       |
| `flt_eq_1`                    | _N/A_ |       |
| `flt_op_0`                    | _N/A_ |       |
| `flt_op_1`                    | _N/A_ |       |
| `goto_dash`                   | _N/A_ |       |
| `import_time`                 | _N/A_ |       |
| `label`                       | _N/A_ |       |
| `linear_color_scheme`         | _N/A_ |       |
| `new_dashboard_name`          | _N/A_ |       |
| `new_slice_name`              | _N/A_ |       |
| `num_period_compare`          | _N/A_ |       |
| `period_ratio_type`           | _N/A_ |       |
| `perm`                        | _N/A_ |       |
| `rdo_save`                    | _N/A_ |       |
| `refresh_frequency`           | _N/A_ |       |
| `remote_id`                   | _N/A_ |       |
| `resample_fillmethod`         | _N/A_ |       |
| `resample_how`                | _N/A_ |       |
| `rose_area_proportion`        | _N/A_ |       |
| `save_to_dashboard_id`        | _N/A_ |       |
| `schema`                      | _N/A_ |       |
| `series`                      | _N/A_ |       |
| `show_bubbles`                | _N/A_ |       |
| `slice_name`                  | _N/A_ |       |
| `timed_refresh_immune_slices` | _N/A_ |       |
| `userid`                      | _N/A_ |       |
|
||||
212
docs/docs/using-superset/creating-your-first-dashboard.mdx
Normal file
@@ -0,0 +1,212 @@
|
||||
---
|
||||
title: Creating Your First Dashboard
|
||||
hide_title: true
|
||||
sidebar_position: 1
|
||||
version: 1
|
||||
---
|
||||
|
||||
import useBaseUrl from "@docusaurus/useBaseUrl";
|
||||
|
||||
## Creating Your First Dashboard
|
||||
|
||||
This section is focused on documentation for end-users who will be using Superset
|
||||
for the data analysis and exploration workflow
|
||||
(data analysts, business analysts, data
|
||||
scientists, etc). In addition to this site, [Preset.io](http://preset.io/) maintains an updated set of end-user
|
||||
documentation at [docs.preset.io](https://docs.preset.io/).
|
||||
|
||||
This tutorial targets someone who wants to create charts and dashboards in Superset. We’ll show you
|
||||
how to connect Superset to a new database and configure a table in that database for analysis.
|
||||
You’ll also explore the data you’ve exposed and add a visualization to a dashboard so that you get a
|
||||
feel for the end-to-end user experience.
|
||||
|
||||
### Connecting to a new database
|
||||
|
||||
Superset itself doesn't have a storage layer to store your data but instead pairs with
|
||||
your existing SQL-speaking database or data store.
|
||||
|
||||
First things first, we need to add the connection credentials to your database to be able
|
||||
to query and visualize data from it. If you're using Superset locally via
|
||||
[Docker compose](/docs/installation/docker-compose), you can
|
||||
skip this step because a Postgres database, named **examples**, is included and
|
||||
pre-configured in Superset for you.
|
||||
|
||||
Under the **+** menu in the top right, select Data, and then the _Connect Database_ option:
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_01_add_database_connection.png")} width="600" />{" "} <br/><br/>
|
||||
|
||||
Then select your database type in the resulting modal:
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_02_select_database.png" )} width="600" />{" "} <br/><br/>
|
||||
|
||||
Once you've selected a database, you can configure a number of advanced options in this window,
|
||||
or for the purposes of this walkthrough, you can click the link below all these fields:
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_03a_database_connection_string_link.png" )} width="600" />{" "} <br/><br/>
|
||||
|
||||
Once you've clicked that link you only need to specify two things (the database name and SQLAlchemy URI):
|
||||
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_03b_connection_string_details.png" )} width="600" />{" "} <br/><br/>
|
||||
|
||||
As noted in the text below the form, you should refer to the SQLAlchemy documentation on
|
||||
[creating new connection URIs](https://docs.sqlalchemy.org/en/12/core/engines.html#database-urls)
|
||||
for your target database.
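SQLAlchemy URIs generally follow the pattern `dialect+driver://username:password@host:port/database`. The sketch below assembles one with the standard library only. The `build_sqlalchemy_uri` helper and all connection values are made up for illustration, and the exact dialect and driver names depend on your target database.

```python
from typing import Optional
from urllib.parse import quote_plus


def build_sqlalchemy_uri(
    dialect: str,
    username: str,
    password: str,
    host: str,
    port: int,
    database: str,
    driver: Optional[str] = None,
) -> str:
    """Assemble a database URL of the form dialect+driver://user:pass@host:port/db."""
    scheme = f"{dialect}+{driver}" if driver else dialect
    # Special characters such as "@" or "/" in credentials must be URL-encoded,
    # otherwise the URL is parsed incorrectly.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    return f"{scheme}://{credentials}@{host}:{port}/{database}"


# Hypothetical connection details.
uri = build_sqlalchemy_uri("postgresql", "analyst", "p@ss/word", "db.example.com", 5432, "sales")
print(uri)
```

The resulting string is what you would paste into the SQLAlchemy URI field of the form.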
|
||||
|
||||
Click the **Test Connection** button to confirm things work end to end. If the connection looks good, save the configuration
by clicking the **Connect** button in the bottom right corner of the modal window.
|
||||
|
||||
Congratulations, you've just added a new data source in Superset!
|
||||
|
||||
### Registering a new table
|
||||
|
||||
Now that you’ve configured a data source, you can select specific tables (called **Datasets** in Superset)
|
||||
that you want exposed in Superset for querying.
|
||||
|
||||
Navigate to **Data ‣ Datasets** and select the **+ Dataset** button in the top right corner.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_08_sources_tables.png" )} />
|
||||
|
||||
A modal window should pop up in front of you. Select your **Database**,
|
||||
**Schema**, and **Table** using the drop downs that appear. In the following example,
|
||||
we register the **cleaned_sales_data** table from the **examples** database.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_09_add_new_table.png" )} />
|
||||
|
||||
To finish, click the **Add** button in the bottom right corner. You should now see your dataset in the list of datasets.
|
||||
|
||||
### Customizing column properties
|
||||
|
||||
Now that you've registered your dataset, you can configure column properties
|
||||
for how the column should be treated in the Explore workflow:
|
||||
|
||||
- Is the column temporal? (should it be used for slicing & dicing in time series charts?)
|
||||
- Should the column be filterable?
|
||||
- Is the column dimensional?
|
||||
- If it's a datetime column, how should Superset parse
|
||||
the datetime format? (using the [ISO-8601 string pattern](https://en.wikipedia.org/wiki/ISO_8601))
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_column_properties.png" )} />
|
||||
|
||||
### Superset semantic layer
|
||||
|
||||
Superset has a thin semantic layer that adds many quality of life improvements for analysts.
|
||||
The Superset semantic layer can store 2 types of computed data:
|
||||
|
||||
1. Virtual metrics: you can write SQL queries that aggregate values
from multiple columns (e.g. `SUM(recovered) / SUM(confirmed)`) and make them
available as columns (e.g. `recovery_rate`) for visualization in Explore.
Aggregate functions are allowed and encouraged for metrics.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_sql_metric.png" )} />
|
||||
|
||||
You can also certify metrics if you'd like for your team in this view.
|
||||
|
||||
2. Virtual calculated columns: you can write SQL queries that
customize the appearance and behavior
of a specific column (e.g. `CAST(recovery_rate AS float)`).
Aggregate functions aren't allowed in calculated columns.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_calculated_column.png" )} />
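To make the difference between the two concrete, the sketch below runs both kinds of expression against an in-memory SQLite table. The `cases` table and its rows are made up, and SQLite stands in for whatever database you have connected. The point is only that a metric yields one value per group, while a calculated column yields one value per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; recovery_rate is stored as text to motivate the CAST.
conn.execute(
    "CREATE TABLE cases (region TEXT, confirmed INTEGER, recovered INTEGER, recovery_rate TEXT)"
)
conn.executemany(
    "INSERT INTO cases VALUES (?, ?, ?, ?)",
    [
        ("north", 100, 80, "0.8"),
        ("north", 300, 120, "0.4"),
        ("south", 200, 50, "0.25"),
    ],
)

# Metric: an aggregate expression, evaluated once per group.
# The "* 1.0" avoids integer division, which is how SQLite divides two integers.
metric_rows = conn.execute(
    """
    SELECT region, SUM(recovered) * 1.0 / SUM(confirmed) AS recovery_rate
    FROM cases
    GROUP BY region
    ORDER BY region
    """
).fetchall()

# Calculated column: a non-aggregate expression, evaluated for every row.
column_rows = conn.execute(
    "SELECT region, CAST(recovery_rate AS REAL) AS recovery_rate_num FROM cases ORDER BY rowid"
).fetchall()

print(metric_rows)
print(column_rows)
```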
|
||||
|
||||
### Creating charts in Explore view
|
||||
|
||||
Superset has 2 main interfaces for exploring data:
|
||||
|
||||
- **Explore**: no-code viz builder. Select your dataset, select the chart,
|
||||
customize the appearance, and publish.
|
||||
- **SQL Lab**: SQL IDE for cleaning, joining, and preparing data for Explore workflow
|
||||
|
||||
We'll focus on the Explore view for creating charts right now.
|
||||
To start the Explore workflow from the **Datasets** tab, start by clicking the name
|
||||
of the dataset that will be powering your chart.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_launch_explore.png" )} /><br/><br/>
|
||||
|
||||
You're now presented with a powerful workflow for exploring data and iterating on charts.
|
||||
|
||||
- The **Dataset** view on the left-hand side has a list of columns and metrics,
|
||||
scoped to the current dataset you selected.
|
||||
- The **Data** preview below the chart area also gives you helpful data context.
|
||||
- Using the **Data** and **Customize** tabs, you can change the visualization type,
  select the temporal column, select the metric to group by, and customize
  the aesthetics of the chart.
|
||||
|
||||
As you customize your chart using drop-down menus, make sure to click the **Run** button
|
||||
to get visual feedback.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_explore_run.jpg" )} />
|
||||
|
||||
In the following screenshot, we craft a grouped Time-series Bar Chart to visualize
|
||||
our quarterly sales data by product line just by clicking options in drop-down menus.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_explore_settings.jpg" )} />
|
||||
|
||||
### Creating a slice and dashboard
|
||||
|
||||
To save your chart, first click the **Save** button. You can either:
|
||||
|
||||
- Save your chart and add it to an existing dashboard
|
||||
- Save your chart and add it to a new dashboard
|
||||
|
||||
In the following screenshot, we save the chart to a new "Superset Duper Sales Dashboard":
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_save_slice.png" )} />
|
||||
|
||||
To publish, click **Save and goto Dashboard**.
|
||||
|
||||
Behind the scenes, Superset will create a slice and store all the information needed
|
||||
to create your chart in its thin data layer
|
||||
(the query, chart type, options selected, name, etc).
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_first_dashboard.png" )} style={{width: "100%", maxWidth: "500px"}} />
|
||||
|
||||
To resize the chart, start by clicking the Edit Dashboard button in the top right corner.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_edit_button.png" )} width="300" />
|
||||
|
||||
Then, click and drag the bottom right corner of the chart until the chart layout snaps
|
||||
into a position you like onto the underlying grid.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_chart_resize.png" )} style={{width: "100%", maxWidth: "500px"}} />
|
||||
|
||||
Click **Save** to persist the changes.
|
||||
|
||||
Congrats! You’ve successfully linked, analyzed, and visualized data in Superset. There are a wealth
of other table configuration and visualization options, so please start exploring and creating
slices and dashboards of your own.
|
||||
### Manage access to Dashboards
|
||||
|
||||
|
||||
Access to dashboards is managed via owners (users that have edit permissions to the dashboard).

Non-owner access can be managed in two different ways:

1. Dataset permissions - if you grant a role permissions to datasets, that role automatically gains implicit access to all dashboards that use those datasets.
2. Dashboard roles - if you enable the **DASHBOARD_RBAC** [feature flag](/docs/configuration/configuring-superset#feature-flags), you will be able to manage which roles can access the dashboard.
   - Granting a role access to a dashboard will bypass dataset-level checks. Having dashboard access implicitly grants read access to all the charts featured in the dashboard, and thereby also to all the associated datasets.
   - If no roles are specified for a dashboard, regular **Dataset permissions** will apply.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_dashboard_access.png" )} />
|
||||
|
||||
### Customizing dashboard
|
||||
|
||||
The following URL parameters can be used to modify how the dashboard is rendered:
|
||||
- `standalone`:
|
||||
- `0` (default): dashboard is displayed normally
|
||||
- `1`: Top Navigation is hidden
|
||||
- `2`: Top Navigation + title is hidden
|
||||
- `3`: Top Navigation + title + top level tabs are hidden
|
||||
- `show_filters`:
|
||||
- `0`: render dashboard without Filter Bar
|
||||
- `1` (default): render dashboard with Filter Bar if native filters are enabled
|
||||
- `expand_filters`:
|
||||
- (default): render dashboard with Filter Bar expanded if there are native filters
|
||||
- `0`: render dashboard with Filter Bar collapsed
|
||||
- `1`: render dashboard with Filter Bar expanded
|
||||
|
||||
For example, when running the local development build, the following will disable the
|
||||
Top Nav and remove the Filter Bar:
|
||||
`http://localhost:8088/superset/dashboard/my-dashboard/?standalone=1&show_filters=0`
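If you generate such links programmatically (for example when embedding dashboards), the parameters can be assembled with the standard library. This is a minimal sketch. The `dashboard_url` helper is hypothetical, and the URL layout simply mirrors the example above.

```python
from typing import Optional
from urllib.parse import urlencode


def dashboard_url(
    base_url: str,
    slug: str,
    standalone: Optional[int] = None,
    show_filters: Optional[int] = None,
    expand_filters: Optional[int] = None,
) -> str:
    """Build a dashboard URL, adding only the parameters that were provided."""
    candidates = (
        ("standalone", standalone),
        ("show_filters", show_filters),
        ("expand_filters", expand_filters),
    )
    # Omitted parameters fall back to the defaults listed above.
    params = {name: value for name, value in candidates if value is not None}
    url = f"{base_url.rstrip('/')}/superset/dashboard/{slug}/"
    return f"{url}?{urlencode(params)}" if params else url


print(dashboard_url("http://localhost:8088", "my-dashboard", standalone=1, show_filters=0))
```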
|
||||
327
docs/docs/using-superset/exploring-data.mdx
Normal file
@@ -0,0 +1,327 @@
|
||||
---
|
||||
title: Exploring Data in Superset
|
||||
hide_title: true
|
||||
sidebar_position: 2
|
||||
version: 1
|
||||
---
|
||||
|
||||
import useBaseUrl from "@docusaurus/useBaseUrl";
|
||||
|
||||
## Exploring Data in Superset
|
||||
|
||||
In this tutorial, we will introduce key concepts in Apache Superset through the exploration of a
|
||||
real dataset which contains the flights made by employees of a UK-based organization in 2011. The
|
||||
following information about each flight is given:
|
||||
|
||||
- The traveller’s department. For the purposes of this tutorial the departments have been renamed
|
||||
Orange, Yellow and Purple.
|
||||
- The cost of the ticket.
|
||||
- The travel class (Economy, Premium Economy, Business and First Class).
|
||||
- Whether the ticket was a single or return.
|
||||
- The date of travel.
|
||||
- Information about the origin and destination.
|
||||
- The distance between the origin and destination, in kilometers (km).
|
||||
|
||||
### Enabling Data Upload Functionality
|
||||
|
||||
You may need to enable the functionality to upload a CSV or Excel file to your database. The following section
|
||||
explains how to enable this functionality for the examples database.
|
||||
|
||||
In the top menu, select **Data ‣ Databases**. Find the **examples** database in the list and
|
||||
select the **Edit** button.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/edit-record.png" )} />
|
||||
|
||||
In the resulting modal window, switch to the **Extra** tab and
|
||||
tick the checkbox for **Allow Data Upload**. End by clicking the **Save** button.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/add-data-upload.png" )} />
|
||||
|
||||
### Loading CSV Data
|
||||
|
||||
Download the CSV dataset to your computer from
|
||||
[GitHub](https://raw.githubusercontent.com/apache-superset/examples-data/master/tutorial_flights.csv).
|
||||
In the Superset menu, select **Data ‣ Upload a CSV**.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/upload_a_csv.png" )} />
|
||||
|
||||
Then, enter the **Table Name** as _tutorial_flights_ and select the CSV file from your computer.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/csv_to_database_configuration.png" )} />
|
||||
|
||||
Next enter the text _Travel Date_ into the **Parse Dates** field.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/parse_dates_column.png" )} />
|
||||
|
||||
Leaving all the other options in their default settings, select **Save** at the bottom of the page.
|
||||
|
||||
### Table Visualization
|
||||
|
||||
You should now see _tutorial_flights_ as a dataset in the **Datasets** tab. Click on the entry to
|
||||
launch an Explore workflow using this dataset.
|
||||
|
||||
In this section, we'll create a table visualization
|
||||
to show the number of flights and cost per travel class.
|
||||
|
||||
By default, Apache Superset only shows the last week of data. In our example, we want to visualize all
|
||||
of the data in the dataset. Click the **Time ‣ Time Range** section and change
|
||||
the **Range Type** to **No Filter**.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/no_filter_on_time_filter.png" )} />
|
||||
|
||||
Click **Apply** to save.
|
||||
|
||||
Now, we want to specify the rows in our table by using the **Group by** option. Since in this
|
||||
example, we want to understand different Travel Classes, we select **Travel Class** in this menu.
|
||||
|
||||
Next, we can specify the metrics we would like to see in our table with the **Metrics** option.
|
||||
|
||||
- `COUNT(*)`, which represents the number of rows in the table
|
||||
(in this case, quantity of flights in each Travel Class)
|
||||
- `SUM(Cost)`, which represents the total cost spent by each Travel Class
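Conceptually, the **Group by** and **Metrics** selections above correspond to a grouped aggregate query. The sketch below reproduces that shape against an in-memory SQLite table with a few made-up rows. The SQL that Superset actually generates will differ in its details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE tutorial_flights ("Travel Class" TEXT, "Cost" REAL)')
# Made-up sample rows, not taken from the real dataset.
conn.executemany(
    "INSERT INTO tutorial_flights VALUES (?, ?)",
    [("Economy", 100.0), ("Economy", 150.0), ("Business", 900.0)],
)

# One output row per Travel Class, with the two metrics as columns.
query = """
    SELECT "Travel Class", COUNT(*) AS flights, SUM("Cost") AS total_cost
    FROM tutorial_flights
    GROUP BY "Travel Class"
    ORDER BY "Travel Class"
"""
result = conn.execute(query).fetchall()
for travel_class, flights, total_cost in result:
    print(travel_class, flights, total_cost)
```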
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/sum_cost_column.png" )} />
|
||||
|
||||
Finally, select **Run Query** to see the results of the table.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_table.png" )} />
|
||||
|
||||
To save the visualization, click on **Save** in the top left of the screen. In the following modal,
|
||||
|
||||
- Select the **Save as**
|
||||
option and enter the chart name as Tutorial Table (you will be able to find it again through the
|
||||
**Charts** screen, accessible in the top menu).
|
||||
- Select **Add To Dashboard** and enter
|
||||
Tutorial Dashboard. Finally, select **Save & Go To Dashboard**.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/save_tutorial_table.png" )} />
|
||||
|
||||
### Dashboard Basics
|
||||
|
||||
Next, we are going to explore the dashboard interface. If you’ve followed the previous section, you
|
||||
should already have the dashboard open. Otherwise, you can navigate to the dashboard by selecting
|
||||
Dashboards on the top menu, then Tutorial dashboard from the list of dashboards.
|
||||
|
||||
On this dashboard you should see the table you created in the previous section. Select **Edit
|
||||
dashboard** and then hover over the table. By selecting the bottom right hand corner of the table
|
||||
(the cursor will change too), you can resize it by dragging and dropping.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/resize_tutorial_table_on_dashboard.png" )} />
|
||||
|
||||
Finally, save your changes by selecting Save changes in the top right.
|
||||
|
||||
### Pivot Table
|
||||
|
||||
In this section, we will extend our analysis using a more complex visualization, Pivot Table. By the
|
||||
end of this section, you will have created a table that shows the monthly spend on flights for the
|
||||
first six months, by department, by travel class.
|
||||
|
||||
Create a new chart by selecting **+ ‣ Chart** from the top right corner. Choose
|
||||
tutorial_flights again as a datasource, then click on the visualization type to get to the
|
||||
visualization menu. Select the **Pivot Table** visualization (you can filter by entering text in the
|
||||
search box) and then **Create New Chart**.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/create_pivot.png" )} />
|
||||
|
||||
In the **Time** section, keep the Time Column as Travel Date (this is selected automatically as we
only have one time column in our dataset). Then set the Time Grain to month, as daily data
would be too granular to reveal patterns. Next, set the time range to the first six months of
2011: click Last week in the Time Range section, then under Custom select a start and end of 1st
January 2011 and 30th June 2011 respectively, either by entering the dates directly or by using the
calendar widget (selecting the month name and then the year lets you move more quickly to far
away dates).
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/select_dates_pivot_table.png" )} />
|
||||
|
||||
Next, within the **Query** section, remove the default COUNT(\*) and add Cost, keeping the default
|
||||
SUM aggregate. Note that Apache Superset will indicate the type of the metric by the symbol on the
|
||||
left hand column of the list (ABC for string, # for number, a clock face for time, etc.).
|
||||
|
||||
In **Group by** select **Time**: this will automatically use the Time Column and Time Grain
|
||||
selections we defined in the Time section.
|
||||
|
||||
Within **Columns**, select first Department and then Travel Class. All set – let’s **Run Query** to
|
||||
see some data!
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_pivot_table.png" )} />
|
||||
|
||||
You should see months in the rows and Department and Travel Class in the columns. Publish this chart
|
||||
to your existing Tutorial Dashboard you created earlier.
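The pivot layout can be pictured as a nested mapping: one entry per month (the rows), each holding a total per Department and Travel Class pair (the columns). The following sketch builds that structure in plain Python from a few made-up flights. It illustrates the shape of the result only, not how Superset computes it.

```python
from collections import defaultdict
from datetime import date

# Made-up rows: (travel date, department, travel class, cost).
flights = [
    (date(2011, 1, 5), "Orange", "Economy", 100.0),
    (date(2011, 1, 20), "Orange", "Economy", 50.0),
    (date(2011, 1, 9), "Purple", "Business", 700.0),
    (date(2011, 2, 3), "Orange", "Business", 400.0),
]

# pivot[month][(department, travel_class)] -> summed cost
pivot = defaultdict(lambda: defaultdict(float))
for travel_date, department, travel_class, cost in flights:
    month = travel_date.strftime("%Y-%m")  # the "month" time grain
    pivot[month][(department, travel_class)] += cost

for month in sorted(pivot):
    print(month, dict(pivot[month]))
```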
|
||||
|
||||
### Line Chart
|
||||
|
||||
In this section, we are going to create a line chart to understand the average price of a ticket by
|
||||
month across the entire dataset.
|
||||
|
||||
In the Time section, as before, keep the Time Column as Travel Date and Time Grain as month, but this
time for the Time range select No filter as we want to look at the entire dataset.
|
||||
|
||||
Within Metrics, remove the default `COUNT(*)` metric and instead add `AVG(Cost)`, to show the mean value.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/average_aggregate_for_cost.png" )} />
|
||||
|
||||
Next, select **Run Query** to show the data on the chart.
|
||||
|
||||
How does this look? Well, we can see that the average cost goes up in December. However, perhaps it
|
||||
doesn’t make sense to combine both single and return tickets, but rather show two separate lines for
|
||||
each ticket type.
|
||||
|
||||
Let’s do this by selecting Ticket Single or Return in the Group by box, and then selecting **Run
Query** again. Nice! We can see that on average single tickets are cheaper than returns and that the
big spike in December is caused by return tickets.
|
||||
|
||||
Our chart is looking pretty good already, but let’s customize some more by going to the Customize
|
||||
tab on the left hand pane. Within this pane, try changing the Color Scheme, removing the range
|
||||
filter by selecting No in the Show Range Filter drop down and adding some labels using X Axis Label
|
||||
and Y Axis Label.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/tutorial_line_chart.png" )} />
|
||||
|
||||
Once you’re done, publish the chart in your Tutorial Dashboard.
|
||||
|
||||
### Markup
|
||||
|
||||
In this section, we will add some text to our dashboard. If you’re not there already, you can navigate
to the dashboard by selecting Dashboards on the top menu, then Tutorial dashboard from the list of
dashboards. Go into edit mode by selecting **Edit dashboard**.
|
||||
|
||||
Within the Insert components pane, drag and drop a Markdown box on the dashboard. Look for the blue
|
||||
lines which indicate the anchor where the box will go.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/blue_bar_insert_component.png" )} />
|
||||
|
||||
Now, to edit the text, select the box. You can enter text, in markdown format (see
|
||||
[this Markdown Cheatsheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) for
|
||||
more information about this format). You can toggle between Edit and Preview using the menu on the
|
||||
top of the box.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/markdown.png" )} />
|
||||
|
||||
To exit, select any other part of the dashboard. Finally, don’t forget to keep your changes using
|
||||
**Save changes**.
|
||||
|
||||
### Publishing Your Dashboard
|
||||
|
||||
If you have followed all of the steps outlined in the previous section, you should have a dashboard
|
||||
that looks like the below. If you would like, you can rearrange the elements of the dashboard by
|
||||
selecting **Edit dashboard** and dragging and dropping.
|
||||
|
||||
If you would like to make your dashboard available to other users, simply select Draft next to the
|
||||
title of your dashboard on the top left to change your dashboard to be in Published state. You can
|
||||
also favorite this dashboard by selecting the star.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/publish_dashboard.png" )} />
|
||||
|
||||
### Annotations
|
||||
|
||||
Annotations allow you to add additional context to your chart. In this section, we will add an
|
||||
annotation to the Tutorial Line Chart we made in a previous section. Specifically, we will add the
|
||||
dates when some flights were cancelled by the UK’s Civil Aviation Authority in response to the
|
||||
eruption of the Grímsvötn volcano in Iceland (23-25 May 2011).
|
||||
|
||||
First, add an annotation layer by navigating to Manage ‣ Annotation Layers. Add a new annotation
|
||||
layer by selecting the green plus sign to add a new record. Enter the name Volcanic Eruptions and
|
||||
save. We can use this layer to refer to a number of different annotations.
|
||||
|
||||
Next, add an annotation by navigating to Manage ‣ Annotations and then create a new annotation by
|
||||
selecting the green plus sign. Then, select the Volcanic Eruptions layer, add a short description
|
||||
Grímsvötn and the eruption dates (23-25 May 2011) before finally saving.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/edit_annotation.png" )} />
|
||||
|
||||
Then, navigate to the line chart by going to Charts then selecting Tutorial Line Chart from the
|
||||
list. Next, go to the Annotations and Layers section and select Add Annotation Layer. Within this
|
||||
dialogue:
|
||||
|
||||
- Name the layer as Volcanic Eruptions
|
||||
- Change the Annotation Layer Type to Event
|
||||
- Set the Annotation Source as Superset annotation
|
||||
- Specify the Annotation Layer as Volcanic Eruptions
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/annotation_settings.png" )} />
|
||||
|
||||
Select **Apply** to see your annotation shown on the chart.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/annotation.png" )} />
|
||||
|
||||
If you wish, you can change how your annotation looks by changing the settings in the Display
|
||||
configuration section. Otherwise, select **OK** and finally **Save** to save your chart. If you keep
|
||||
the default selection to overwrite the chart, your annotation will be saved to the chart and also
|
||||
appear automatically in the Tutorial Dashboard.
|
||||
|
||||
### Advanced Analytics
|
||||
|
||||
In this section, we are going to explore the Advanced Analytics feature of Apache Superset that
|
||||
allows you to apply additional transformations to your data. The three types of transformation covered below are a rolling mean, a time comparison, and resampling.
|
||||
|
||||
**Setting up the base chart**
|
||||
|
||||
In this section, we’re going to set up a base chart which we can then apply the different **Advanced
|
||||
Analytics** features to. Start off by creating a new chart using the same _tutorial_flights_
|
||||
datasource and the **Line Chart** visualization type. Within the Time section, set the Time Range as
|
||||
1st October 2011 to 31st October 2011.
|
||||
|
||||
Next, in the query section, change the Metrics to the sum of Cost. Select **Run Query** to show the
|
||||
chart. You should see the total cost per day for October 2011.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/advanced_analytics_base.png" )} />
|
||||
|
||||
Finally, save the visualization as Tutorial Advanced Analytics Base, adding it to the Tutorial
|
||||
Dashboard.
|
||||
|
||||
### Rolling Mean
|
||||
|
||||
There is quite a lot of variation in the data, which makes it difficult to identify any trend. One
|
||||
approach we can take is to show instead a rolling average of the time series. To do this, in the
|
||||
**Moving Average** subsection of **Advanced Analytics**, select mean in the **Rolling** box and
|
||||
enter 7 into both Periods and Min Periods. The period is the length of the rolling period expressed
|
||||
as a multiple of the Time Grain. In our example, the Time Grain is day, so the rolling period is 7
|
||||
days, such that on the 7th October 2011 the value shown would correspond to the first seven days of
|
||||
October 2011. Lastly, by specifying Min Periods as 7, we ensure that our mean is always calculated
|
||||
on 7 days and we avoid any ramp up period.
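
The effect of Periods and Min Periods can be sketched in plain Python. This is only an illustration of the arithmetic, not Superset's implementation; the `rolling_mean` helper and the daily cost figures are made up for the example.

```python
from datetime import date, timedelta


def rolling_mean(values, periods, min_periods):
    """Trailing rolling mean; None until min_periods points are available."""
    result = []
    for i in range(len(values)):
        # Window covers the current point and up to (periods - 1) earlier points.
        window = values[max(0, i - periods + 1): i + 1]
        if len(window) < min_periods:
            result.append(None)  # ramp-up period: not enough data yet
        else:
            result.append(sum(window) / len(window))
    return result


# Hypothetical daily totals for 1-10 October 2011 (made-up numbers).
days = [date(2011, 10, 1) + timedelta(days=i) for i in range(10)]
costs = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

smoothed = rolling_mean(costs, periods=7, min_periods=7)
for day, value in zip(days, smoothed):
    print(day.isoformat(), value)
```

With both settings at 7, the first six days have no value and the first mean appears on 7th October. Lowering `min_periods` to 1 in this sketch would instead produce a value from the first day, averaged over fewer points.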
|
||||
|
||||
After displaying the chart by selecting **Run Query** you will see that the data is less variable
|
||||
and that the series starts later as the ramp up period is excluded.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/rolling_mean.png" )} />
|
||||
|
||||
Save the chart as Tutorial Rolling Mean and add it to the Tutorial Dashboard.
|
||||
|
||||
### Time Comparison
|
||||
|
||||
In this section, we will compare values in our time series to the value a week before. Start off by
|
||||
opening the Tutorial Advanced Analytics Base chart, by going to **Charts** in the top menu and then
|
||||
selecting the visualization name in the list (alternatively, find the chart in the Tutorial
|
||||
Dashboard and select Explore chart from the menu for that visualization).
|
||||
|
||||
Next, in the Time Comparison subsection of **Advanced Analytics**, enter the Time Shift by typing in
|
||||
“minus 1 week” (note this box accepts input in natural language). Select **Run Query** to see the new chart,
|
||||
which has an additional series with the same values, shifted a week back in time.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/time_comparison_two_series.png" )} />
|
||||
|
||||
Then, change the **Calculation type** to Absolute difference and select **Run Query**. We can now
|
||||
see only one series again, this time showing the difference between the two series we saw
|
||||
previously.
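
As a rough sketch of what the absolute difference represents, the snippet below subtracts from each day's value the value observed one week earlier. The `time_shift_difference` helper and the data are hypothetical, and how Superset itself treats dates that have no earlier value to compare against is not covered here; this sketch simply skips them.

```python
from datetime import date, timedelta


def time_shift_difference(series, shift):
    """Map each date to (value - value observed `shift` earlier).

    Dates with no earlier observation are skipped (an assumption of this sketch).
    """
    return {
        day: value - series[day - shift]
        for day, value in series.items()
        if (day - shift) in series
    }


# Hypothetical daily totals for 1-14 October 2011 (made-up numbers).
series = {date(2011, 10, 1) + timedelta(days=i): 100 + 5 * i for i in range(14)}

diff = time_shift_difference(series, timedelta(weeks=1))
for day, value in diff.items():
    print(day.isoformat(), value)
```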
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/time_comparison_absolute_difference.png" )} />
|
||||
|
||||
Save the chart as Tutorial Time Comparison and add it to the Tutorial Dashboard.
|
||||
|
||||
### Resampling the data
|
||||
|
||||
In this section, we’ll resample the data so that rather than having daily data we have weekly data.
|
||||
As in the previous section, reopen the Tutorial Advanced Analytics Base chart.
|
||||
|
||||
Next, in the Python Functions subsection of **Advanced Analytics**, enter 7D, corresponding to seven
|
||||
days, as the Rule and median as the Method, then show the chart by selecting **Run Query**.
|
||||
|
||||
<img src={useBaseUrl("/img/tutorial/resample.png" )} />
|
||||
|
||||
Note that now we have a single data point every 7 days. In our case, the value shown corresponds to
|
||||
the median value within the seven daily data points. For more information on the meaning of the
|
||||
various options in this section, refer to the
|
||||
[Pandas documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.resample.html).
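
To make the 7D and median combination concrete, here is a small standard-library sketch that groups daily values into seven-day bins and takes the median of each. The `resample_median` helper and the data are hypothetical, and the sketch assumes bins start at the first date in the series; the exact binning rules Pandas applies are described in the documentation linked above.

```python
from datetime import date, timedelta
from statistics import median


def resample_median(series, days_per_bin):
    """Group daily values into fixed-length bins and return each bin's median."""
    start = min(series)  # assumption: bins are anchored at the first date
    bins = {}
    for day, value in series.items():
        offset = (day - start).days // days_per_bin
        bin_start = start + timedelta(days=offset * days_per_bin)
        bins.setdefault(bin_start, []).append(value)
    return {bin_start: median(values) for bin_start, values in sorted(bins.items())}


# Hypothetical daily totals for 1-14 October 2011 (made-up numbers).
daily = [3, 9, 4, 8, 5, 7, 6, 13, 19, 14, 18, 15, 17, 16]
series = {date(2011, 10, 1) + timedelta(days=i): v for i, v in enumerate(daily)}

weekly = resample_median(series, days_per_bin=7)
for bin_start, value in weekly.items():
    print(bin_start.isoformat(), value)
```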
|
||||
|
||||
Lastly, save your chart as Tutorial Resample and add it to the Tutorial Dashboard. Go to the
|
||||
tutorial dashboard to see the four charts side by side and compare the different outputs.
|
||||