docs: Refactor Documentation Structure (#28161)

Co-authored-by: Evan Rusackas <evan@preset.io>
Co-authored-by: Sam Firke <sfirke@users.noreply.github.com>
This commit is contained in:
Braum
2024-04-25 05:17:50 +08:00
committed by GitHub
parent bc65c245fe
commit e8a678b75a
37 changed files with 139 additions and 100 deletions

View File

@@ -0,0 +1,145 @@
---
title: Chart Parameters Reference
hide_title: true
sidebar_position: 4
version: 1
---
## Chart Parameters
Chart parameters are stored as a JSON encoded string in the `slices.params` column and are often referenced throughout the code as form-data. Currently the form-data is neither versioned nor typed as thus is somewhat free-formed. Note in the future there may be merit in using something like [JSON Schema](https://json-schema.org/) to both annotate and validate the JSON object in addition to using a Mypy `TypedDict` (introduced in Python 3.8) for typing the form-data in the backend. This section serves as a potential primer for that work.
The following tables provide a non-exhaustive list of the various fields which can be present in the JSON object grouped by the Explorer pane sections. These values were obtained by extracting the distinct fields from a legacy deployment consisting of tens of thousands of charts and thus some fields may be missing whilst others may be deprecated.
Note not all fields are correctly categorized. The fields vary based on visualization type and may appear in different sections depending on the type. Verified deprecated columns may indicate a missing migration and/or prior migrations which were unsuccessful and thus future work may be required to clean up the form-data.
### Datasource & Chart Type
| Field | Type | Notes |
| ----------------- | -------- | ------------------------------------ |
| `database_name` | _string_ | _Deprecated?_ |
| `datasource` | _string_ | `<datasource_id>__<datasource_type>` |
| `datasource_id` | _string_ | _Deprecated?_ See `datasource` |
| `datasource_name` | _string_ | _Deprecated?_ |
| `datasource_type` | _string_ | _Deprecated?_ See `datasource` |
| `viz_type` | _string_ | The **Visualization Type** widget |
### Time
| Field | Type | Notes |
| ------------------ | -------- | ------------------------------------- |
| `granularity_sqla` | _string_ | The SQLA **Time Column** widget |
| `time_grain_sqla` | _string_ | The SQLA **Time Grain** widget |
| `time_range` | _string_ | The **Time range** widget |
### GROUP BY
| Field | Type | Notes |
| ------------------------- | --------------- | ----------------- |
| `metrics` | _array(string)_ | See Query section |
| `order_asc` | - | See Query section |
| `row_limit` | - | See Query section |
| `timeseries_limit_metric` | - | See Query section |
### NOT GROUPED BY
| Field | Type | Notes |
| --------------- | --------------- | ----------------------- |
| `order_by_cols` | _array(string)_ | The **Ordering** widget |
| `row_limit` | - | See Query section |
### Y Axis 1
| Field | Type | Notes |
| --------------- | ---- | -------------------------------------------------- |
| `metric` | - | The **Left Axis Metric** widget. See Query section |
| `y_axis_format` | - | See Y Axis section |
### Y Axis 2
| Field | Type | Notes |
| ---------- | ---- | --------------------------------------------------- |
| `metric_2` | - | The **Right Axis Metric** widget. See Query section |
### Query
| Field | Type | Notes |
| ------------------------------------------------------------------------------------------------------ | ------------------------------------------------- | ------------------------------------------------- |
| `adhoc_filters` | _array(object)_ | The **Filters** widget |
| `extra_filters` | _array(object)_ | Another pathway to the **Filters** widget.<br/>It is generally used to pass dashboard filter parameters to a chart.<br/>It can be used for appending additional filters to a chart that has been saved with its own filters on an ad-hoc basis if the chart is being used as a standalone widget.<br/><br/>For implementation examples see : [utils test.py](https://github.com/apache/superset/blob/66a4c94a1ed542e69fe6399bab4c01d4540486cf/tests/utils_tests.py#L181)<br/>For insight into how superset processes the contents of this parameter see: [exploreUtils/index.js](https://github.com/apache/superset/blob/93c7f5bb446ec6895d7702835f3157426955d5a9/superset-frontend/src/explore/exploreUtils/index.js#L159) |
| `columns` | _array(string)_ | The **Breakdowns** widget |
| `groupby` | _array(string)_ | The **Group by** or **Series** widget |
| `limit` | _number_ | The **Series Limit** widget |
| `metric`<br/>`metric_2`<br/>`metrics`<br/>`percent_metrics`<br/>`secondary_metric`<br/>`size`<br/>`x`<br/>`y` | _string_,_object_,_array(string)_,_array(object)_ | The metric(s) depending on the visualization type |
| `order_asc` | _boolean_ | The **Sort Descending** widget |
| `row_limit` | _number_ | The **Row limit** widget |
| `timeseries_limit_metric` | _object_ | The **Sort By** widget |
The `metric` (or equivalent) and `timeseries_limit_metric` fields are all composed of either metric names or the JSON representation of the `AdhocMetric` TypeScript type. The `adhoc_filters` is composed of the JSON represent of the `AdhocFilter` TypeScript type (which can comprise of columns or metrics depending on whether it is a WHERE or HAVING clause). The `all_columns`, `all_columns_x`, `columns`, `groupby`, and `order_by_cols` fields all represent column names.
### Chart Options
| Field | Type | Notes |
| -------------- | --------- | --------------------------- |
| `color_picker` | _object_ | The **Fixed Color** widget |
| `label_colors` | _object_ | The **Color Scheme** widget |
| `normalized` | _boolean_ | The **Normalized** widget |
### Y Axis
| Field | Type | Notes |
| ---------------- | -------- | ---------------------------- |
| `y_axis_2_label` | _N/A_ | _Deprecated?_ |
| `y_axis_format` | _string_ | The **Y Axis Format** widget |
| `y_axis_zero` | _N/A_ | _Deprecated?_ |
Note the `y_axis_format` is defined under various section for some charts.
### Other
| Field | Type | Notes |
| -------------- | -------- | ----- |
| `color_scheme` | _string_ | |
### Unclassified
| Field | Type | Notes |
| ----------------------------- | ----- | ----- |
| `add_to_dash` | _N/A_ | |
| `code` | _N/A_ | |
| `collapsed_fieldsets` | _N/A_ | |
| `comparison type` | _N/A_ | |
| `country_fieldtype` | _N/A_ | |
| `default_filters` | _N/A_ | |
| `entity` | _N/A_ | |
| `expanded_slices` | _N/A_ | |
| `filter_immune_slice_fields` | _N/A_ | |
| `filter_immune_slices` | _N/A_ | |
| `flt_col_0` | _N/A_ | |
| `flt_col_1` | _N/A_ | |
| `flt_eq_0` | _N/A_ | |
| `flt_eq_1` | _N/A_ | |
| `flt_op_0` | _N/A_ | |
| `flt_op_1` | _N/A_ | |
| `goto_dash` | _N/A_ | |
| `import_time` | _N/A_ | |
| `label` | _N/A_ | |
| `linear_color_scheme` | _N/A_ | |
| `new_dashboard_name` | _N/A_ | |
| `new_slice_name` | _N/A_ | |
| `num_period_compare` | _N/A_ | |
| `period_ratio_type` | _N/A_ | |
| `perm` | _N/A_ | |
| `rdo_save` | _N/A_ | |
| `refresh_frequency` | _N/A_ | |
| `remote_id` | _N/A_ | |
| `resample_fillmethod` | _N/A_ | |
| `resample_how` | _N/A_ | |
| `rose_area_proportion` | _N/A_ | |
| `save_to_dashboard_id` | _N/A_ | |
| `schema` | _N/A_ | |
| `series` | _N/A_ | |
| `show_bubbles` | _N/A_ | |
| `slice_name` | _N/A_ | |
| `timed_refresh_immune_slices` | _N/A_ | |
| `userid` | _N/A_ | |

View File

@@ -0,0 +1,212 @@
---
title: Creating Your First Dashboard
hide_title: true
sidebar_position: 1
version: 1
---
import useBaseUrl from "@docusaurus/useBaseUrl";
## Creating Your First Dashboard
This section is focused on documentation for end-users who will be using Superset
for the data analysis and exploration workflow
(data analysts, business analysts, data
scientists, etc). In addition to this site, [Preset.io](http://preset.io/) maintains an updated set of end-user
documentation at [docs.preset.io](https://docs.preset.io/).
This tutorial targets someone who wants to create charts and dashboards in Superset. Well show you
how to connect Superset to a new database and configure a table in that database for analysis.
Youll also explore the data youve exposed and add a visualization to a dashboard so that you get a
feel for the end-to-end user experience.
### Connecting to a new database
Superset itself doesn't have a storage layer to store your data but instead pairs with
your existing SQL-speaking database or data store.
First things first, we need to add the connection credentials to your database to be able
to query and visualize data from it. If you're using Superset locally via
[Docker compose](/docs/installation/docker-compose), you can
skip this step because a Postgres database, named **examples**, is included and
pre-configured in Superset for you.
Under the **+** menu in the top right, select Data, and then the _Connect Database_ option:
<img src={useBaseUrl("/img/tutorial/tutorial_01_add_database_connection.png")} width="600" />{" "} <br/><br/>
Then select your database type in the resulting modal:
<img src={useBaseUrl("/img/tutorial/tutorial_02_select_database.png" )} width="600" />{" "} <br/><br/>
Once you've selected a database, you can configure a number of advanced options in this window,
or for the purposes of this walkthrough, you can click the link below all these fields:
<img src={useBaseUrl("/img/tutorial/tutorial_03a_database_connection_string_link.png" )} width="600" />{" "} <br/><br/>
Once you've clicked that link you only need to specify two things (the database name and SQLAlchemy URI):
<img src={useBaseUrl("/img/tutorial/tutorial_03b_connection_string_details.png" )} width="600" />{" "} <br/><br/>
As noted in the text below the form, you should refer to the SQLAlchemy documentation on
[creating new connection URIs](https://docs.sqlalchemy.org/en/12/core/engines.html#database-urls)
for your target database.
Click the **Test Connection** button to confirm things work end to end. If the connection looks good, save the configuration
by clicking the **Connect** button in the bottom right corner of the modal window:
Congratulations, you've just added a new data source in Superset!
### Registering a new table
Now that youve configured a data source, you can select specific tables (called **Datasets** in Superset)
that you want exposed in Superset for querying.
Navigate to **Data ‣ Datasets** and select the **+ Dataset** button in the top right corner.
<img src={useBaseUrl("/img/tutorial/tutorial_08_sources_tables.png" )} />
A modal window should pop up in front of you. Select your **Database**,
**Schema**, and **Table** using the drop downs that appear. In the following example,
we register the **cleaned_sales_data** table from the **examples** database.
<img src={useBaseUrl("/img/tutorial/tutorial_09_add_new_table.png" )} />
To finish, click the **Add** button in the bottom right corner. You should now see your dataset in the list of datasets.
### Customizing column properties
Now that you've registered your dataset, you can configure column properties
for how the column should be treated in the Explore workflow:
- Is the column temporal? (should it be used for slicing & dicing in time series charts?)
- Should the column be filterable?
- Is the column dimensional?
- If it's a datetime column, how should Superset parse
the datetime format? (using the [ISO-8601 string pattern](https://en.wikipedia.org/wiki/ISO_8601))
<img src={useBaseUrl("/img/tutorial/tutorial_column_properties.png" )} />
### Superset semantic layer
Superset has a thin semantic layer that adds many quality of life improvements for analysts.
The Superset semantic layer can store 2 types of computed data:
1. Virtual metrics: you can write SQL queries that aggregate values
from multiple column (e.g. `SUM(recovered) / SUM(confirmed)`) and make them
available as columns for (e.g. `recovery_rate`) visualization in Explore.
Aggregate functions are allowed and encouraged for metrics.
<img src={useBaseUrl("/img/tutorial/tutorial_sql_metric.png" )} />
You can also certify metrics if you'd like for your team in this view.
2. Virtual calculated columns: you can write SQL queries that
customize the appearance and behavior
of a specific column (e.g. `CAST(recovery_rate) as float`).
Aggregate functions aren't allowed in calculated columns.
<img src={useBaseUrl("/img/tutorial/tutorial_calculated_column.png" )} />
### Creating charts in Explore view
Superset has 2 main interfaces for exploring data:
- **Explore**: no-code viz builder. Select your dataset, select the chart,
customize the appearance, and publish.
- **SQL Lab**: SQL IDE for cleaning, joining, and preparing data for Explore workflow
We'll focus on the Explore view for creating charts right now.
To start the Explore workflow from the **Datasets** tab, start by clicking the name
of the dataset that will be powering your chart.
<img src={useBaseUrl("/img/tutorial/tutorial_launch_explore.png" )} /><br/><br/>
You're now presented with a powerful workflow for exploring data and iterating on charts.
- The **Dataset** view on the left-hand side has a list of columns and metrics,
scoped to the current dataset you selected.
- The **Data** preview below the chart area also gives you helpful data context.
- Using the **Data** tab and **Customize** tabs, you can change the visualization type,
select the temporal column, select the metric to group by, and customize
the aesthetics of the chart.
As you customize your chart using drop-down menus, make sure to click the **Run** button
to get visual feedback.
<img src={useBaseUrl("/img/tutorial/tutorial_explore_run.jpg" )} />
In the following screenshot, we craft a grouped Time-series Bar Chart to visualize
our quarterly sales data by product line just by clicking options in drop-down menus.
<img src={useBaseUrl("/img/tutorial/tutorial_explore_settings.jpg" )} />
### Creating a slice and dashboard
To save your chart, first click the **Save** button. You can either:
- Save your chart and add it to an existing dashboard
- Save your chart and add it to a new dashboard
In the following screenshot, we save the chart to a new "Superset Duper Sales Dashboard":
<img src={useBaseUrl("/img/tutorial/tutorial_save_slice.png" )} />
To publish, click **Save and goto Dashboard**.
Behind the scenes, Superset will create a slice and store all the information needed
to create your chart in its thin data layer
(the query, chart type, options selected, name, etc).
<img src={useBaseUrl("/img/tutorial/tutorial_first_dashboard.png" )} style={{width: "100%", maxWidth: "500px"}} />
To resize the chart, start by clicking the Edit Dashboard button in the top right corner.
<img src={useBaseUrl("/img/tutorial/tutorial_edit_button.png" )} width="300" />
Then, click and drag the bottom right corner of the chart until the chart layout snaps
into a position you like onto the underlying grid.
<img src={useBaseUrl("/img/tutorial/tutorial_chart_resize.png" )} style={{width: "100%", maxWidth: "500px"}} />
Click **Save** to persist the changes.
Congrats! Youve successfully linked, analyzed, and visualized data in Superset. There are a wealth
of other table configuration and visualization options, so please start exploring and creating
slices and dashboards of your own
ֿ
### Manage access to Dashboards
Access to dashboards is managed via owners (users that have edit permissions to the dashboard)
Non-owner users access can be managed two different ways:
1. Dataset permissions - if you add to the relevant role permissions to datasets it automatically grants implicit access to all dashboards that uses those permitted datasets
2. Dashboard roles - if you enable **DASHBOARD_RBAC** [feature flag](/docs/configuration/configuring-superset#feature-flags) then you be able to manage which roles can access the dashboard
- Granting a role access to a dashboard will bypass dataset level checks. Having dashboard access implicitly grants read access to all the featured charts in the dashboard, and thereby also all the associated datasets.
- If no roles are specified for a dashboard, regular **Dataset permissions** will apply.
<img src={useBaseUrl("/img/tutorial/tutorial_dashboard_access.png" )} />
### Customizing dashboard
The following URL parameters can be used to modify how the dashboard is rendered:
- `standalone`:
- `0` (default): dashboard is displayed normally
- `1`: Top Navigation is hidden
- `2`: Top Navigation + title is hidden
- `3`: Top Navigation + title + top level tabs are hidden
- `show_filters`:
- `0`: render dashboard without Filter Bar
- `1` (default): render dashboard with Filter Bar if native filters are enabled
- `expand_filters`:
- (default): render dashboard with Filter Bar expanded if there are native filters
- `0`: render dashboard with Filter Bar collapsed
- `1`: render dashboard with Filter Bar expanded
For example, when running the local development build, the following will disable the
Top Nav and remove the Filter Bar:
`http://localhost:8088/superset/dashboard/my-dashboard/?standalone=1&show_filters=0`

View File

@@ -0,0 +1,327 @@
---
title: Exploring Data in Superset
hide_title: true
sidebar_position: 2
version: 1
---
import useBaseUrl from "@docusaurus/useBaseUrl";
## Exploring Data in Superset
In this tutorial, we will introduce key concepts in Apache Superset through the exploration of a
real dataset which contains the flights made by employees of a UK-based organization in 2011. The
following information about each flight is given:
- The travellers department. For the purposes of this tutorial the departments have been renamed
Orange, Yellow and Purple.
- The cost of the ticket.
- The travel class (Economy, Premium Economy, Business and First Class).
- Whether the ticket was a single or return.
- The date of travel.
- Information about the origin and destination.
- The distance between the origin and destination, in kilometers (km).
### Enabling Data Upload Functionality
You may need to enable the functionality to upload a CSV or Excel file to your database. The following section
explains how to enable this functionality for the examples database.
In the top menu, select **Data ‣ Databases**. Find the **examples** database in the list and
select the **Edit** button.
<img src={useBaseUrl("/img/tutorial/edit-record.png" )} />
In the resulting modal window, switch to the **Extra** tab and
tick the checkbox for **Allow Data Upload**. End by clicking the **Save** button.
<img src={useBaseUrl("/img/tutorial/add-data-upload.png" )} />
### Loading CSV Data
Download the CSV dataset to your computer from
[GitHub](https://raw.githubusercontent.com/apache-superset/examples-data/master/tutorial_flights.csv).
In the Superset menu, select **Data ‣ Upload a CSV**.
<img src={useBaseUrl("/img/tutorial/upload_a_csv.png" )} />
Then, enter the **Table Name** as _tutorial_flights_ and select the CSV file from your computer.
<img src={useBaseUrl("/img/tutorial/csv_to_database_configuration.png" )} />
Next enter the text _Travel Date_ into the **Parse Dates** field.
<img src={useBaseUrl("/img/tutorial/parse_dates_column.png" )} />
Leaving all the other options in their default settings, select **Save** at the bottom of the page.
### Table Visualization
You should now see _tutorial_flights_ as a dataset in the **Datasets** tab. Click on the entry to
launch an Explore workflow using this dataset.
In this section, we'll create a table visualization
to show the number of flights and cost per travel class.
By default, Apache Superset only shows the last week of data. In our example, we want to visualize all
of the data in the dataset. Click the **Time ‣ Time Range** section and change
the **Range Type** to **No Filter**.
<img src={useBaseUrl("/img/tutorial/no_filter_on_time_filter.png" )} />
Click **Apply** to save.
Now, we want to specify the rows in our table by using the **Group by** option. Since in this
example, we want to understand different Travel Classes, we select **Travel Class** in this menu.
Next, we can specify the metrics we would like to see in our table with the **Metrics** option.
- `COUNT(*)`, which represents the number of rows in the table
(in this case, quantity of flights in each Travel Class)
- `SUM(Cost)`, which represents the total cost spent by each Travel Class
<img src={useBaseUrl("/img/tutorial/sum_cost_column.png" )} />
Finally, select **Run Query** to see the results of the table.
<img src={useBaseUrl("/img/tutorial/tutorial_table.png" )} />
To save the visualization, click on **Save** in the top left of the screen. In the following modal,
- Select the **Save as**
option and enter the chart name as Tutorial Table (you will be able to find it again through the
**Charts** screen, accessible in the top menu).
- Select **Add To Dashboard** and enter
Tutorial Dashboard. Finally, select **Save & Go To Dashboard**.
<img src={useBaseUrl("/img/tutorial/save_tutorial_table.png" )} />
### Dashboard Basics
Next, we are going to explore the dashboard interface. If youve followed the previous section, you
should already have the dashboard open. Otherwise, you can navigate to the dashboard by selecting
Dashboards on the top menu, then Tutorial dashboard from the list of dashboards.
On this dashboard you should see the table you created in the previous section. Select **Edit
dashboard** and then hover over the table. By selecting the bottom right hand corner of the table
(the cursor will change too), you can resize it by dragging and dropping.
<img src={useBaseUrl("/img/tutorial/resize_tutorial_table_on_dashboard.png" )} />
Finally, save your changes by selecting Save changes in the top right.
### Pivot Table
In this section, we will extend our analysis using a more complex visualization, Pivot Table. By the
end of this section, you will have created a table that shows the monthly spend on flights for the
first six months, by department, by travel class.
Create a new chart by selecting **+ ‣ Chart** from the top right corner. Choose
tutorial_flights again as a datasource, then click on the visualization type to get to the
visualization menu. Select the **Pivot Table** visualization (you can filter by entering text in the
search box) and then **Create New Chart**.
<img src={useBaseUrl("/img/tutorial/create_pivot.png" )} />
In the **Time** section, keep the Time Column as Travel Date (this is selected automatically as we
only have one time column in our dataset). Then select Time Grain to be month as having daily data
would be too granular to see patterns from. Then select the time range to be the first six months of
2011 by click on Last week in the Time Range section, then in Custom selecting a Start / end of 1st
January 2011 and 30th June 2011 respectively by either entering directly the dates or using the
calendar widget (by selecting the month name and then the year, you can move more quickly to far
away dates).
<img src={useBaseUrl("/img/tutorial/select_dates_pivot_table.png" )} />
Next, within the **Query** section, remove the default COUNT(\*) and add Cost, keeping the default
SUM aggregate. Note that Apache Superset will indicate the type of the metric by the symbol on the
left hand column of the list (ABC for string, # for number, a clock face for time, etc.).
In **Group by** select **Time**: this will automatically use the Time Column and Time Grain
selections we defined in the Time section.
Within **Columns**, select first Department and then Travel Class. All set lets **Run Query** to
see some data!
<img src={useBaseUrl("/img/tutorial/tutorial_pivot_table.png" )} />
You should see months in the rows and Department and Travel Class in the columns. Publish this chart
to your existing Tutorial Dashboard you created earlier.
### Line Chart
In this section, we are going to create a line chart to understand the average price of a ticket by
month across the entire dataset.
In the Time section, as before, keep the Time Column as Travel Date and Time Grain as month but this
time for the Time range select No filter as we want to look at entire dataset.
Within Metrics, remove the default `COUNT(*)` metric and instead add `AVG(Cost)`, to show the mean value.
<img src={useBaseUrl("/img/tutorial/average_aggregate_for_cost.png" )} />
Next, select **Run Query** to show the data on the chart.
How does this look? Well, we can see that the average cost goes up in December. However, perhaps it
doesnt make sense to combine both single and return tickets, but rather show two separate lines for
each ticket type.
Lets do this by selecting Ticket Single or Return in the Group by box, and the selecting **Run
Query** again. Nice! We can see that on average single tickets are cheaper than returns and that the
big spike in December is caused by return tickets.
Our chart is looking pretty good already, but lets customize some more by going to the Customize
tab on the left hand pane. Within this pane, try changing the Color Scheme, removing the range
filter by selecting No in the Show Range Filter drop down and adding some labels using X Axis Label
and Y Axis Label.
<img src={useBaseUrl("/img/tutorial/tutorial_line_chart.png" )} />
Once youre done, publish the chart in your Tutorial Dashboard.
### Markup
In this section, we will add some text to our dashboard. If youre there already, you can navigate
to the dashboard by selecting Dashboards on the top menu, then Tutorial dashboard from the list of
dashboards. Got into edit mode by selecting **Edit dashboard**.
Within the Insert components pane, drag and drop a Markdown box on the dashboard. Look for the blue
lines which indicate the anchor where the box will go.
<img src={useBaseUrl("/img/tutorial/blue_bar_insert_component.png" )} />
Now, to edit the text, select the box. You can enter text, in markdown format (see
[this Markdown Cheatsheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) for
more information about this format). You can toggle between Edit and Preview using the menu on the
top of the box.
<img src={useBaseUrl("/img/tutorial/markdown.png" )} />
To exit, select any other part of the dashboard. Finally, dont forget to keep your changes using
**Save changes**.
### Publishing Your Dashboard
If you have followed all of the steps outlined in the previous section, you should have a dashboard
that looks like the below. If you would like, you can rearrange the elements of the dashboard by
selecting **Edit dashboard** and dragging and dropping.
If you would like to make your dashboard available to other users, simply select Draft next to the
title of your dashboard on the top left to change your dashboard to be in Published state. You can
also favorite this dashboard by selecting the star.
<img src={useBaseUrl("/img/tutorial/publish_dashboard.png" )} />
### Annotations
Annotations allow you to add additional context to your chart. In this section, we will add an
annotation to the Tutorial Line Chart we made in a previous section. Specifically, we will add the
dates when some flights were cancelled by the UKs Civil Aviation Authority in response to the
eruption of the Grímsvötn volcano in Iceland (23-25 May 2011).
First, add an annotation layer by navigating to Manage ‣ Annotation Layers. Add a new annotation
layer by selecting the green plus sign to add a new record. Enter the name Volcanic Eruptions and
save. We can use this layer to refer to a number of different annotations.
Next, add an annotation by navigating to Manage ‣ Annotations and then create a new annotation by
selecting the green plus sign. Then, select the Volcanic Eruptions layer, add a short description
Grímsvötn and the eruption dates (23-25 May 2011) before finally saving.
<img src={useBaseUrl("/img/tutorial/edit_annotation.png" )} />
Then, navigate to the line chart by going to Charts then selecting Tutorial Line Chart from the
list. Next, go to the Annotations and Layers section and select Add Annotation Layer. Within this
dialogue:
- Name the layer as Volcanic Eruptions
- Change the Annotation Layer Type to Event
- Set the Annotation Source as Superset annotation
- Specify the Annotation Layer as Volcanic Eruptions
<img src={useBaseUrl("/img/tutorial/annotation_settings.png" )} />
Select **Apply** to see your annotation shown on the chart.
<img src={useBaseUrl("/img/tutorial/annotation.png" )} />
If you wish, you can change how your annotation looks by changing the settings in the Display
configuration section. Otherwise, select **OK** and finally **Save** to save your chart. If you keep
the default selection to overwrite the chart, your annotation will be saved to the chart and also
appear automatically in the Tutorial Dashboard.
### Advanced Analytics
In this section, we are going to explore the Advanced Analytics feature of Apache Superset that
allows you to apply additional transformations to your data. The three types of transformation are:
**Setting up the base chart**
In this section, were going to set up a base chart which we can then apply the different **Advanced
Analytics** features to. Start off by creating a new chart using the same _tutorial_flights_
datasource and the **Line Chart** visualization type. Within the Time section, set the Time Range as
1st October 2011 and 31st October 2011.
Next, in the query section, change the Metrics to the sum of Cost. Select **Run Query** to show the
chart. You should see the total cost per day for each month in October 2011.
<img src={useBaseUrl("/img/tutorial/advanced_analytics_base.png" )} />
Finally, save the visualization as Tutorial Advanced Analytics Base, adding it to the Tutorial
Dashboard.
### Rolling Mean
There is quite a lot of variation in the data, which makes it difficult to identify any trend. One
approach we can take is to show instead a rolling average of the time series. To do this, in the
**Moving Average** subsection of **Advanced Analytics**, select mean in the **Rolling** box and
enter 7 into both Periods and Min Periods. The period is the length of the rolling period expressed
as a multiple of the Time Grain. In our example, the Time Grain is day, so the rolling period is 7
days, such that on the 7th October 2011 the value shown would correspond to the first seven days of
October 2011. Lastly, by specifying Min Periods as 7, we ensure that our mean is always calculated
on 7 days and we avoid any ramp up period.
After displaying the chart by selecting **Run Query** you will see that the data is less variable
and that the series starts later as the ramp up period is excluded.
<img src={useBaseUrl("/img/tutorial/rolling_mean.png" )} />
Save the chart as Tutorial Rolling Mean and add it to the Tutorial Dashboard.
### Time Comparison
In this section, we will compare values in our time series to the value a week before. Start off by
opening the Tutorial Advanced Analytics Base chart, by going to **Charts** in the top menu and then
selecting the visualization name in the list (alternatively, find the chart in the Tutorial
Dashboard and select Explore chart from the menu for that visualization).
Next, in the Time Comparison subsection of **Advanced Analytics**, enter the Time Shift by typing in
“minus 1 week” (note this box accepts input in natural language). Run Query to see the new chart,
which has an additional series with the same values, shifted a week back in time.
<img src={useBaseUrl("/img/tutorial/time_comparison_two_series.png" )} />
Then, change the **Calculation type** to Absolute difference and select **Run Query**. We can now
see only one series again, this time showing the difference between the two series we saw
previously.
<img src={useBaseUrl("/img/tutorial/time_comparison_absolute_difference.png" )} />
Save the chart as Tutorial Time Comparison and add it to the Tutorial Dashboard.
### Resampling the data
In this section, well resample the data so that rather than having daily data we have weekly data.
As in the previous section, reopen the Tutorial Advanced Analytics Base chart.
Next, in the Python Functions subsection of **Advanced Analytics**, enter 7D, corresponding to seven
days, in the Rule and median as the Method and show the chart by selecting **Run Query**.
<img src={useBaseUrl("/img/tutorial/resample.png" )} />
Note that now we have a single data point every 7 days. In our case, the value showed corresponds to
the median value within the seven daily data points. For more information on the meaning of the
various options in this section, refer to the
[Pandas documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.resample.html).
Lastly, save your chart as Tutorial Resample and add it to the Tutorial Dashboard. Go to the
tutorial dashboard to see the four charts side by side and compare the different outputs.