Reduce time to run tests by using a persistent source schema #482

plypaul · 2023-04-28T22:14:27Z

Describe the Feature
Many tests rely on test tables (e.g. fct_bookings) in a source schema in the target SQL engine to verify that the generated SQL can run and produce valid results. Currently, a source schema with a unique name is created and populated before the tests run, and dropped at the conclusion of the tests. However, this incurs significant overhead when running a single test with engines other than DuckDB. During development, a single test is often run repeatedly to resolve a bug, so the overhead is paid on every run. The same overhead is also present in the test suites that are run in CI.

Since the test tables in the source schema change infrequently, this overhead can be reduced by creating a persistent schema that is reused between testing sessions. By using a hash of the data in the name of the schema, issues with stale data in the schema can be avoided. This also enables more automatic updates when the test data changes without requiring the user to manually drop / update tables.
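As a rough illustration of the hash-based naming idea, here is a minimal sketch. The function name `persistent_schema_name`, the `test_src` prefix, and the in-memory table representation are all hypothetical and not taken from the actual test code; the only point is that identical data maps to the same schema name and changed data maps to a new one.

```python
import hashlib
import json
from typing import Mapping, Sequence


def persistent_schema_name(
    tables: Mapping[str, Sequence[Sequence[object]]],
    prefix: str = "test_src",  # hypothetical prefix, not from the project
) -> str:
    """Derive a schema name from a hash of the test table contents."""
    # Serialize deterministically: sort_keys makes the result independent of
    # the order in which tables were added; default=str handles dates etc.
    payload = json.dumps(
        {name: [list(row) for row in rows] for name, rows in tables.items()},
        sort_keys=True,
        default=str,
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    # A short digest prefix keeps the name within typical identifier limits.
    return f"{prefix}_{digest[:12]}"


# Example data; the rows are made up for illustration.
example_tables = {"fct_bookings": [(1, "2020-01-01"), (2, "2020-01-02")]}
example_name = persistent_schema_name(example_tables)
```

Because the name changes whenever the data changes, a session that sees new test data would simply populate a new schema rather than reuse a stale one. Cleaning up schemas left behind by old hashes is a separate concern that this sketch does not address.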

When using a persistent schema, race conditions may exist when multiple concurrent testing sessions create tables in the schema for the first time, since the name of the schema depends only on the hash. After the schema and the test tables are created, concurrency will not be an issue since the tables do not change. There may be other failure conditions as well, so the persistent schema will be enabled by a flag, as the default behavior is robust. An ideal solution to the concurrency issue during initial schema creation / table population needs more investigation.
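The flag-controlled behavior could look something like the sketch below. The function `choose_schema_name`, its parameters, and the `test_src` prefix are assumptions made for illustration, and the sketch does not attempt to solve the initial-creation race described above.

```python
import uuid


def choose_schema_name(data_hash: str, use_persistent_schema: bool = False) -> str:
    """Pick the source schema name for a testing session.

    Hypothetical helper: the default (flag off) keeps the current behavior
    of a unique schema per session, which is safe under concurrency.
    """
    if use_persistent_schema:
        # Same data hash -> same name, so later sessions can reuse the schema.
        return f"test_src_{data_hash}"
    # Unique name per session; the schema is created and dropped each time.
    return f"test_src_{uuid.uuid4().hex}"
```

With the flag off, two sessions never share a schema, so they cannot interfere with each other. With the flag on, two sessions with the same data hash resolve to the same name, which is exactly where the first-time creation race can occur.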

Would you like to contribute?
Yes.

Anything Else?
N/A


plypaul added the enhancement and triage labels Apr 28, 2023

plypaul mentioned this issue Apr 29, 2023

Reduce time to run tests by using a persistent source schema #483

Merged

Jstein77 added the backlog label and removed the triage label Aug 30, 2023

tlento closed this as completed Sep 8, 2023
