Improve examples & related tests #7773

mistercrunch · 2019-06-25T06:06:13Z

SUMMARY

related to #7478
This enables setting up an alternative URI for loading/storing/serving the examples data. New URI is defined with config key SQLALCHEMY_EXAMPLES_URI. If nothing not set explicitly, will default to point to the metadata database (SQLALCHEMY_DATABASE_URI). This in turn adds a new default entry to the Database model named examples

As a result, examples data tables can now be reused across Superset environments, and the tables don't get mixed in with the metadata database, leaving it cleaner.

Approach

allowing specifying an alternate database connection for examples
allowing a --only-metadata flag to load_examples to load only
dashboard and chart definitions, no actual data is loaded
fixed a bunch of tests that are in some cases not directly related to this, but were not idempotent and prevented me from making progress.

Longer term we will generate the examples by exporting
them into tarball as in #7472.

codecov-io · 2019-07-06T20:42:49Z

Codecov Report

Merging #7773 into master will decrease coverage by 0.04%.
The diff coverage is 45.78%.

@@            Coverage Diff             @@
##           master    #7773      +/-   ##
==========================================
- Coverage   65.82%   65.77%   -0.05%     
==========================================
  Files         461      461              
  Lines       22105    22167      +62     
  Branches     2425     2425              
==========================================
+ Hits        14551    14581      +30     
- Misses       7433     7465      +32     
  Partials      121      121

Impacted Files	Coverage Δ
superset/examples/misc_dashboard.py	`27.27% <ø> (ø)`
superset/examples/__init__.py	`100% <ø> (ø)`
superset/examples/css_templates.py	`100% <ø> (ø)`
superset/examples/countries.py	`100% <ø> (ø)`
superset/examples/deck.py	`11.11% <0%> (ø)`
superset/examples/helpers.py	`100% <100%> (ø)`
superset/examples/unicode_test_data.py	`100% <100%> (ø)`
superset/examples/tabbed_dashboard.py	`22.22% <100%> (ø)`
superset/examples/birth_names.py	`100% <100%> (ø)`
superset/examples/energy.py	`100% <100%> (ø)`
... and 24 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 86fdceb...29068a7. Read the comment docs.

superset/models/core.py

superset/tasks/cache.py

tests/strategy_tests.py

villebro

A big PR, so difficult to review thoroughly, but many important fixes/improvements and can be further refined later. I suggest merging this and dealing with possible regressions later.

villebro · 2019-07-09T19:24:21Z

superset/data/bart_lines.py

+                "path_json": Text,
+            },
+            index=False,
+        )


Oh, one more thing; when I added csv import functionality for BigQuery, I refactored db_engine_specs so that one can call db_engine_spec.df_to_sql(df, *kwargs) in place of df.to_sql(*kwargs) for engines that don't support df.to_sql(). So to make this work universally here, one would write

database.db_engine_spec.df_to_sql( tbl_name, database.get_sqla_engine(), if_exists="replace", chunksize=500, dtype={ "color": String(255), "name": String(255), "polyline": Text, "path_json": Text, }, index=False, )

This doesn't necessarily have to be addressed in this PR; I can do that later, too, as I have a good test rig for that.

related to apache#7472, longer term we will generate the examples by exporting them into tarball as in apache#7472. In the meantime, we need this subset of the features: * allowing specifying an alternate database connection for examples * allowing a --only-metadata flag to `load_examples` to load only dashboard and chart definitions, no actual data is loaded

The bug was introduced in apache#7773 It uses filter by `cls.table_name == datasource_name`: https://github.com/apache/incubator-superset/pull/7773/files#diff-a8dd5ec8d8decda2e3c5571d1ec0cdb6R740 But export puts `slc.datasource.name` into exported json: https://github.com/apache/incubator-superset/pull/7773/files#diff-ceeb7eee8d573333109e0037299c9711L673 `slc.datasource.name` in case of `SqlaTable` is `"{}.{}".format(self.schema, self.table_name)`

The bug was introduced in #7773 It uses filter by `cls.table_name == datasource_name`: https://github.com/apache/incubator-superset/pull/7773/files#diff-a8dd5ec8d8decda2e3c5571d1ec0cdb6R740 But export puts `slc.datasource.name` into exported json: https://github.com/apache/incubator-superset/pull/7773/files#diff-ceeb7eee8d573333109e0037299c9711L673 `slc.datasource.name` in case of `SqlaTable` is `"{}.{}".format(self.schema, self.table_name)`

* [WiP] improve load_examples related to apache#7472, longer term we will generate the examples by exporting them into tarball as in apache#7472. In the meantime, we need this subset of the features: * allowing specifying an alternate database connection for examples * allowing a --only-metadata flag to `load_examples` to load only dashboard and chart definitions, no actual data is loaded * Improve logging * Rename data->examples * Load only if not exist * By default do not load, add a force flag * fix build * set published to true

The bug was introduced in apache#7773 It uses filter by `cls.table_name == datasource_name`: https://github.com/apache/incubator-superset/pull/7773/files#diff-a8dd5ec8d8decda2e3c5571d1ec0cdb6R740 But export puts `slc.datasource.name` into exported json: https://github.com/apache/incubator-superset/pull/7773/files#diff-ceeb7eee8d573333109e0037299c9711L673 `slc.datasource.name` in case of `SqlaTable` is `"{}.{}".format(self.schema, self.table_name)`

pull-request-size bot added the size/XL label Jun 25, 2019

mistercrunch force-pushed the improve_load_examples branch from c96cc2d to d12747a Compare June 25, 2019 06:19

mistercrunch force-pushed the improve_load_examples branch 14 times, most recently from c11a5f4 to 1b249b5 Compare July 6, 2019 20:33

mistercrunch changed the title ~~[WiP] improve load_examples~~ Improve examples & related tests Jul 6, 2019

mistercrunch commented Jul 6, 2019

View reviewed changes

superset/models/core.py Show resolved Hide resolved

mistercrunch commented Jul 6, 2019

View reviewed changes

superset/models/core.py Show resolved Hide resolved

mistercrunch commented Jul 6, 2019

View reviewed changes

superset/tasks/cache.py Show resolved Hide resolved

mistercrunch commented Jul 6, 2019

View reviewed changes

tests/strategy_tests.py Show resolved Hide resolved

mistercrunch requested a review from betodealmeida July 6, 2019 21:01

mistercrunch mentioned this pull request Jul 6, 2019

bugfix: cli export_datasources (fixes 7437) #7728

Merged

12 tasks

mistercrunch force-pushed the improve_load_examples branch from 8a5d82d to f01b8ff Compare July 8, 2019 16:46

mistercrunch closed this Jul 9, 2019

mistercrunch reopened this Jul 9, 2019

villebro approved these changes Jul 9, 2019

View reviewed changes

villebro reviewed Jul 9, 2019

View reviewed changes

mistercrunch added 4 commits July 15, 2019 21:42

Improve logging

999bc38

Rename data->examples

79c6d93

Load only if not exist

625e013

By default do not load, add a force flag

146f599

mistercrunch force-pushed the improve_load_examples branch from f01b8ff to 146f599 Compare July 16, 2019 22:27

fix build

4ca819d

mistercrunch force-pushed the improve_load_examples branch from 135a85f to 4ca819d Compare July 17, 2019 04:12

set published to true

29068a7

mistercrunch mentioned this pull request Jul 17, 2019

Set examples as published=True #7869

Closed

6 tasks

mistercrunch merged commit d65b039 into apache:master Jul 17, 2019

mistercrunch deleted the improve_load_examples branch July 17, 2019 04:37

smacker mentioned this pull request Jul 17, 2019

Provide a way to export/import dashboards src-d/sourced-ui#165

Closed

smacker mentioned this pull request Jul 18, 2019

Fix incorrect datasource_name in dashboard export #7892

Merged

6 tasks

mistercrunch mentioned this pull request Aug 13, 2019

Prevent 'main' database connection creation #8038

Merged

6 tasks

mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.34.0 labels Feb 28, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve examples & related tests #7773

Improve examples & related tests #7773

mistercrunch commented Jun 25, 2019 •

edited

Loading

codecov-io commented Jul 6, 2019 •

edited

Loading

villebro left a comment

villebro Jul 9, 2019

Improve examples & related tests #7773

Improve examples & related tests #7773

Conversation

mistercrunch commented Jun 25, 2019 • edited Loading

CATEGORY

SUMMARY

Approach

codecov-io commented Jul 6, 2019 • edited Loading

Codecov Report

villebro left a comment

Choose a reason for hiding this comment

villebro Jul 9, 2019

Choose a reason for hiding this comment

mistercrunch commented Jun 25, 2019 •

edited

Loading

codecov-io commented Jul 6, 2019 •

edited

Loading