[SQL Lab] Async query results serialization with MessagePack and PyArrow (#8069)

* Add support for msgpack results_backend serialization

* Serialize DataFrame with PyArrow rather than JSON

* Adjust dependencies, de-lint

* Add tests for (de)serialization methods

* Add MessagePack config info to Installation docs

* Enable msgpack/arrow serialization by default

* [Fix] Prevent msgpack serialization on synchronous queries

* Add type annotations
Rob DiCiuccio
2019-08-27 14:23:40 -07:00
committed by Maxime Beauchemin
parent 56566c2645
commit 7595d9e5fd
13 changed files with 362 additions and 28 deletions

@@ -846,6 +846,12 @@ look something like:
    RESULTS_BACKEND = RedisCache(
        host='localhost', port=6379, key_prefix='superset_results')
For performance gains, `MessagePack <https://github.com/msgpack/msgpack-python>`_
and `PyArrow <https://arrow.apache.org/docs/python/>`_ are now used for results
serialization. This can be disabled by setting ``RESULTS_BACKEND_USE_MSGPACK = False``
in your configuration, should any issues arise. Please clear your existing results
cache store when upgrading an existing environment.
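
A minimal sketch of the opt-out described above, assuming the override lives in a
``superset_config.py`` file on the Python path (the file location is an assumption;
only the ``RESULTS_BACKEND_USE_MSGPACK`` flag comes from this change):

```python
# superset_config.py (assumed location of local configuration overrides)

# Turn off MessagePack/PyArrow serialization for the async results backend.
# Per the docs text above, this is meant as a fallback should issues arise.
RESULTS_BACKEND_USE_MSGPACK = False
```

Whichever value is used, the note about clearing the existing results cache store
when upgrading still applies.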
**Important notes**

* It is important that all the worker nodes and web servers in