
Use dbt postgres adapter wrapper for integration tests #614

Merged
merged 4 commits into main on Jun 23, 2023

Conversation

tlento
Contributor

@tlento tlento commented Jun 22, 2023

As part of the effort to provide basic metric querying capabilities
for local installations of dbt core, MetricFlow is removing all
engine-specific dependencies and instead delegating warehouse
connection and query management to dbt adapters for integration
tests and command line operations.

This commit is the first step towards moving onto dbt adapters. It
provides a new wrapper class for dbt adapters, conforming to
MetricFlow's SqlClient protocol, and delegating all operations
down to the underlying adapter. It uses this adapter in test cases
only, and sources it from a dummy project in the test fixtures
directory. The dummy project is configured to allow for different
adapters (configured through test runner environment variables), and
in the future it will likely be used as the basis for MetricFlow's
core integration test table management.

Note that the implementation of the SqlClient wrapper class has some
highly temporary workarounds for rough edges that exist between
dbt's adapter interface and the current SqlClient requirements.
Most of these arise from table and schema management calls, which are
only used in the MetricFlow CLI tutorial and integration test suite.
As such, we expect those methods to be removed from the SqlClient
interface as we move management of these datasets to dbt.

Further, the dbt adapter fetching is a bit suspect and not to be
copied: we are relying on the dbtRunner()'s state loading to remain
valid as we access certain internal APIs from dbt core to get the
profile we need. All of this works, and we expect it to continue
working, but we will need to maintain this connection and keep it
up to date as the dbt core project updates its external accessors for
the dbt profile and dbt project which ultimately initialize and house
the adapter instance.
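The delegation pattern described above can be sketched as follows. This is a minimal illustration, not the actual MetricFlow implementation: the method names and the `Adapter` protocol are assumptions standing in for the real dbt adapter interface and SqlClient protocol.

```python
from typing import Any, List, Protocol, Tuple


class Adapter(Protocol):
    """Stand-in for a dbt adapter's execution interface (hypothetical shape)."""

    def execute(self, sql: str, fetch: bool = False) -> Any:
        ...


class AdapterBackedSqlClient:
    """Sketch of a SqlClient-style wrapper that delegates to a dbt adapter.

    The real class conforms to MetricFlow's SqlClient protocol; the two
    methods below are illustrative, not the actual interface.
    """

    def __init__(self, adapter: Adapter) -> None:
        self._adapter = adapter

    def query(self, stmt: str) -> List[Tuple[Any, ...]]:
        # Delegate query execution to the underlying adapter and return rows.
        return self._adapter.execute(stmt, fetch=True)

    def execute(self, stmt: str) -> None:
        # Statements with no result set (DDL, inserts) also pass straight through.
        self._adapter.execute(stmt, fetch=False)
```

The wrapper holds no engine-specific logic of its own, which is the point: warehouse connection and query management live entirely in the adapter.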

Contributor Author

tlento commented Jun 22, 2023

Current dependencies on/for this PR:

This comment was auto-generated by Graphite.

@tlento
Contributor Author

tlento commented Jun 22, 2023

Snyk failures are expected due to the pyproject.toml changes.

@tlento tlento had a problem deploying to DW_INTEGRATION_TESTS June 22, 2023 01:36 — with GitHub Actions Failure
@tlento tlento temporarily deployed to DW_INTEGRATION_TESTS June 22, 2023 01:43 — with GitHub Actions Inactive
Contributor

@plypaul plypaul left a comment


Sorry, clicked approve a little too soon. Taking a look still.

if pd.isnull(cell):
# use null keyword instead of isNA/None/etc.
cells.append("null")
elif type(cell) in [str, pd.Timestamp]:
Contributor

Generally works, but I vaguely recall running into type errors with one SQL engine where specifying a string when a time is expected threw an error. Cross that bridge if / when we get there.

Contributor Author

BigQuery almost certainly has this issue.
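The null-handling approach in the snippet above can be sketched as a standalone cell-rendering helper. This is an illustrative sketch only: the function name is hypothetical, `pd.isnull` is approximated with a plain NaN check to stay dependency-free, and the engine-specific quirks the reviewers mention (e.g. BigQuery rejecting string literals where a typed time value is expected) are deliberately ignored.

```python
import math
from datetime import datetime
from typing import Any


def cell_to_sql_literal(cell: Any) -> str:
    """Render one dataframe cell as a SQL literal (hypothetical helper).

    Mirrors the snippet under review: NULL-ish values become the bare
    `null` keyword, string-like values are quoted, everything else is
    stringified as-is.
    """
    if cell is None or (isinstance(cell, float) and math.isnan(cell)):
        # Use the `null` keyword instead of NaN/None/etc.
        return "null"
    if isinstance(cell, (str, datetime)):
        # Quote string-like values; naive '' escaping only, for the sketch.
        return "'" + str(cell).replace("'", "''") + "'"
    return str(cell)
```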

logger.info(f"Finished running the dry_run in {stop - start:.2f}s")
return

def create_table_from_dataframe(
Contributor

This is straightforward, and yet I suspect we might run into edge cases. Just a comment, no action needed.

Contributor Author

Yeah, this is temporary. It'll be a nightmare to maintain as we extend across dialects, which is why I'm hoping to move to dbt managing dataset creation ASAP.
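To make the dialect-maintenance concern concrete, here is a rough sketch of the kind of statement generation a hand-rolled `create_table_from_dataframe` has to do. The function name and signature are hypothetical; type names are supplied directly rather than inferred from a dataframe, and dialect differences in type names, quoting, and identifier rules are exactly the part this ignores.

```python
from typing import Any, Dict, List, Sequence


def build_create_and_insert(
    table_name: str,
    column_types: Dict[str, str],
    rows: Sequence[Sequence[Any]],
) -> List[str]:
    """Generate a CREATE TABLE statement plus one INSERT per row.

    Hypothetical sketch; real dialects differ on type names, quoting,
    and bulk-insert syntax, which is why delegating table management
    to dbt is attractive.
    """
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in column_types.items())
    statements = [f"CREATE TABLE {table_name} ({cols})"]
    for row in rows:
        values = ", ".join(
            "null" if v is None else f"'{v}'" if isinstance(v, str) else str(v)
            for v in row
        )
        statements.append(f"INSERT INTO {table_name} VALUES ({values})")
    return statements
```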

@@ -40,6 +41,8 @@ def test_query(sql_client: SqlClient) -> None: # noqa: D

def test_query_with_execution_params(sql_client: SqlClient) -> None:
"""Test querying with execution parameters of all supported datatypes."""
if isinstance(sql_client, AdapterBackedSqlClient):
Contributor

Putting this check / skip logic into a function might be nice, so that we can use IDE tooling to update it later when support is added. It also reduces isinstance calls.

Contributor Author

Once all of our production engines are cut over to adapters I'll either remove these tests or keep them solely for a DuckDB "example" implementation of a SQLClient. I don't think the adapter-backed client will support bind parameters any time soon.

If we don't have one already we should add a test to make sure bind parameters are preserved and propagated through the rendering layers, though. I know they are, generally, but I don't remember what kind of testing we have around that behavior.
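A minimal sketch of the helper-function suggestion from this thread, with a stand-in `AdapterBackedSqlClient` class so the example is self-contained (the helper name is hypothetical):

```python
class AdapterBackedSqlClient:
    """Stand-in for the real adapter-backed client class."""


def lacks_bind_parameter_support(sql_client: object) -> bool:
    """Centralized check for clients that cannot run bind-parameter tests.

    Wrapping the isinstance test in one function, as the review suggests,
    gives a single find-references target for IDE tooling when adapter
    support for bind parameters eventually lands.
    """
    return isinstance(sql_client, AdapterBackedSqlClient)
```

A test would then call something like `pytest.skip(...)` when `lacks_bind_parameter_support(sql_client)` is true, instead of repeating the isinstance check at every skip site.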

[tool.hatch.envs.postgres-env]
description = "Dev environment for working with Postgres adapter"
pre-install-commands = [
"pip install dbt-postgres",
Contributor

What's the reason for installing via pip instead of specifying it as a dependency?

Contributor Author

I could've sworn I had a comment about this, but clearly there isn't one. I'll add one before merging.

We don't depend on dbt-postgres, or any specific adapter, anywhere in the codebase, so we don't want to force an install in the normal dependency list. We could make it an env-specific dependency but then we're bound to a version range, which means keeping the version ranges in sync between here and where we specify dbt-core, and dealing with things like dbt-semantic-interfaces version requirements.

This way we get the relevant adapter installed in the environment, the way an end user would, and pip takes care of the rest as best it's able.

Once we have all of the packaging laid out with dbt-metricflow we'll likely update this to use an editable dependency on the local package. The pre-install is apparently the only way hatch supports editable dependency installations, so that'll have to be here anyway, but that should allow us to work around dbt-semantic-interfaces version update conflicts in most circumstances.

Base automatically changed from move-snapshots-to-engine-type-paths to main June 22, 2023 21:54
This helper function was tagged as only being used in tests, but
it was sitting in the main package. This moves it to the test package
to avoid confusion as we restructure around dbt adapter integrations.
The switch to dbt adapters breaks the Postgres test run for
generating snapshots. Since Postgres runs so fast we simply
run all tests via the relevant hatch env command.
The dbt adapter-backed SqlClient needs to skip some tests and
depends on some custom package installations. This commit updates
the logic for the former to depend only on DuckDB, which is not
currently planned for migration to adapters. We also add a comment
explaining the installation configuration in the pyproject.toml.
@tlento tlento merged commit 09281fb into main Jun 23, 2023
@tlento tlento deleted the add-dbt-adapter-shim branch June 23, 2023 19:30