Add testing #22

Merged · 31 commits · Aug 11, 2022

Commits (all authored by JCZuurmond):
- `c6b517d` Add working script to run macro (Dec 28, 2021)
- `b3b0a0a` Add comment about adapters (Dec 28, 2021)
- `5bb9728` Try using a project instead of runtime config (Dec 28, 2021)
- `4e31232` Remove spark credentials and Project (Dec 28, 2021)
- `37aa6e6` Use connection from soda spark (Dec 28, 2021)
- `69ed207` Add test requirements (Dec 28, 2021)
- `236286d` Add pytest ini (Aug 5, 2022)
- `d72ebf2` Move everything into pytest fixtures (Dec 28, 2021)
- `18170a1` Copy connection (Dec 29, 2021)
- `25e5806` Remove pytest-dbt-core code (Jan 28, 2022)
- `409d827` Add pytest dbt core as test requirement (Jan 28, 2022)
- `56c848a` Add workflow for testing (Jan 28, 2022)
- `1560e5e` Bump pytest dbt core version (May 27, 2022)
- `3014264` Add profile to dbt project (May 27, 2022)
- `153708b` Add profiles (May 27, 2022)
- `f9b0db7` Add profiles dir when running pytest (May 27, 2022)
- `52307bb` Remove redundant from future import annotations (May 27, 2022)
- `cb447a2` Bump pytest-dbt-core version (Jul 22, 2022)
- `8b7eb8f` Change version (Jul 22, 2022)
- `6dfd9f7` Add pyspark dependency (Jul 22, 2022)
- `91b6bb1` Change pyspark dependency to dbt-spark session (Jul 22, 2022)
- `30112b0` Change required by to dbt-spark (Jul 23, 2022)
- `59f2139` Add test docstring (Aug 5, 2022)
- `74482a7` Make test less strict (Aug 5, 2022)
- `df29346` Create and delete table with fixture (Aug 5, 2022)
- `ffe50cb` Fix typo (Aug 5, 2022)
- `8b13fda` Add section about testing to the documentation (Aug 5, 2022)
- `29e88d6` Move test macros into tests/unit (Aug 11, 2022)
- `ee25a3e` Run unit tests only in Github action (Aug 11, 2022)
- `bbc7923` Merge dev and test requirements (Aug 11, 2022)
- `d380607` Move conftest into functional (Aug 11, 2022)
29 changes: 29 additions & 0 deletions .github/workflows/workflow.yml
@@ -0,0 +1,29 @@
``` yaml
name: Test

on:
  pull_request:
  push:
    branches:
      - main

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Set up Python 3.9
        uses: actions/setup-python@v2
        with:
          python-version: 3.9

      - name: Install dependencies
        shell: bash
        run: |
          sudo apt-get install libsasl2-dev
          python -m pip install --upgrade pip
          python -m pip install -r dev-requirements.txt

      - name: Run unit tests
        shell: bash
        run: DBT_PROFILES_DIR=$PWD pytest tests/unit
```
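The same check should be reproducible locally from the repository root, assuming `libsasl2-dev` and the packages in `dev-requirements.txt` are installed: `DBT_PROFILES_DIR=$PWD pytest tests/unit`.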
37 changes: 36 additions & 1 deletion README.md
@@ -32,7 +32,7 @@ dispatch:

### Note to maintainers of other packages

The spark-utils package may be able to provide compatibility for your package, especially if your package leverages dbt-utils macros for cross-database compatibility. This package _does not_ need to be specified as a dependency of your package in `packages.yml`. Instead, you should encourage anyone using your package on Apache Spark / Databricks to:
- Install `spark_utils` alongside your package
- Add a `dispatch` config in their root project, like the one above (a typical block is sketched below)
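The `dispatch` config referenced above is collapsed in this diff view; a typical block in `dbt_project.yml` looks like the following sketch (the exact namespaces depend on the package):

``` yaml
dispatch:
  - macro_namespace: dbt_utils
    search_order: ['spark_utils', 'dbt_utils']
```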

@@ -56,6 +56,41 @@ We welcome contributions to this repo! To contribute a new feature or a fix,
please open a Pull Request with 1) your changes and 2) updated documentation for
the `README.md` file.

## Testing

The macros are tested with [`pytest`](https://docs.pytest.org) and
[`pytest-dbt-core`](https://pypi.org/project/pytest-dbt-core/). For example,
the [`get_tables` macro is tested](./tests/unit/test_macros.py) by:

1. Create a test table (test setup):
``` python
spark_session.sql(f"CREATE TABLE {table_name} (id int) USING parquet")
```
2. Call the macro generator:
``` python
tables = macro_generator()
```
3. Assert the test condition:
``` python
assert simple_table in tables
```
4. Delete the test table (test cleanup):
``` python
spark_session.sql(f"DROP TABLE IF EXISTS {table_name}")
```

A macro is fetched using the
[`macro_generator`](https://pytest-dbt-core.readthedocs.io/en/latest/dbt_spark.html#usage)
fixture, with the macro name supplied through
[indirect parameterization](https://docs.pytest.org/en/7.1.x/example/parametrize.html?highlight=indirect#indirect-parametrization):

``` python
@pytest.mark.parametrize(
"macro_generator", ["macro.spark_utils.get_tables"], indirect=True
)
def test_create_table(macro_generator: MacroGenerator) -> None:
```
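Calling the fixture's return value then executes the macro through the adapter, which is why step 2 above is simply `tables = macro_generator()`.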

----

### Getting started with dbt + Spark
3 changes: 2 additions & 1 deletion dbt_project.yml
@@ -1,5 +1,6 @@
name: 'spark_utils'
profile: 'sparkutils'
version: '0.3.0'
config-version: 2
require-dbt-version: [">=1.2.0", "<2.0.0"]
macro-paths: ["macros"]
4 changes: 3 additions & 1 deletion dev-requirements.txt
@@ -2,4 +2,6 @@ pytest
pyodbc==4.0.32
git+https://github.com/dbt-labs/dbt-core.git#egg=dbt-core&subdirectory=core
git+https://github.com/dbt-labs/dbt-core.git#egg=dbt-tests-adapter&subdirectory=tests/adapter
git+https://github.com/dbt-labs/dbt-spark.git#egg=dbt-spark[ODBC,session]
pytest-spark~=0.6.0
pytest-dbt-core~=0.1.0
8 changes: 8 additions & 0 deletions profiles.yml
@@ -0,0 +1,8 @@
``` yaml
sparkutils:
  target: test
  outputs:
    test:
      type: spark
      method: session
      schema: test
      host: NA  # not used, but required by `dbt-spark`
```
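With the `session` connection method, `dbt-spark` runs against an in-process `SparkSession` rather than a remote cluster, which is why `host` is only a placeholder here.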
4 changes: 4 additions & 0 deletions pytest.ini
@@ -6,3 +6,7 @@
``` ini
env_files =
    test.env
testpaths =
    tests/functional
spark_options =
    spark.app.name: spark-utils
    spark.executor.instances: 1
    spark.sql.catalogImplementation: in-memory
```
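For reference, `pytest-spark` reads the `spark_options` above to build the `spark_session` fixture the tests rely on. A hand-rolled equivalent would look roughly like the sketch below (for illustration only; the plugin constructs and manages the session itself):

``` python
from pyspark.sql import SparkSession

# Roughly the session pytest-spark assembles from the options in pytest.ini.
spark_session = (
    SparkSession.builder.appName("spark-utils")
    .config("spark.executor.instances", "1")
    .config("spark.sql.catalogImplementation", "in-memory")
    .getOrCreate()
)
```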
File renamed without changes.
26 changes: 26 additions & 0 deletions tests/unit/test_macros.py
@@ -0,0 +1,26 @@
``` python
import uuid
from typing import Iterator

import pytest
from dbt.clients.jinja import MacroGenerator
from pyspark.sql import SparkSession


@pytest.fixture
def simple_table(spark_session: SparkSession) -> Iterator[str]:
    """Create and delete a simple table used for testing."""
    table_name = f"default.table_{uuid.uuid4()}".replace("-", "_")
    spark_session.sql(f"CREATE TABLE {table_name} (id int) USING parquet")
    yield table_name
    spark_session.sql(f"DROP TABLE IF EXISTS {table_name}")


@pytest.mark.parametrize(
    "macro_generator", ["macro.spark_utils.get_tables"], indirect=True
)
def test_create_table(
    macro_generator: MacroGenerator, simple_table: str
) -> None:
    """The `get_tables` macro should return the created table."""
    tables = macro_generator()
    assert simple_table in tables
```
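Here `spark_session` is provided by `pytest-spark` and `macro_generator` by `pytest-dbt-core`, both of which are pinned in `dev-requirements.txt`.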