Features/add pre commit #64

Merged
merged 23 commits
Apr 18, 2023
Changes from 22 commits
Commits
23 commits
eb4a4c9
Add .pre-commit-config.yaml
MaGering Mar 20, 2023
dfe5c52
Add ruff as package in pyproject.toml
MaGering Mar 20, 2023
eaac75a
Add mypy as package in pyproject.toml
MaGering Mar 20, 2023
b58af6d
Specify options for black in pyproject.toml
MaGering Mar 20, 2023
f1a1f55
Specify options for isort in pyproject.toml
MaGering Mar 20, 2023
fff267c
Specify options for ruff in pyproject.toml
MaGering Mar 20, 2023
acceb0e
Update poetry lock file
MaGering Mar 20, 2023
e9435d0
Update README.md
MaGering Mar 20, 2023
7495aaf
Update CHANGELOG.md
MaGering Mar 20, 2023
d9aa0d2
Comment mypy pre-commit check out for now
MaGering Mar 20, 2023
2d2fdd4
Fix json formats hence checks with check-json pass
MaGering Mar 20, 2023
4e16804
Apply fix end of files
MaGering Mar 20, 2023
1b944f6
Apply trim trailing whitespace
MaGering Mar 20, 2023
0325f98
Apply ruff, isort and black
MaGering Mar 20, 2023
7c05dd7
Revert "Comment mypy pre-commit check out for now"
MaGering Mar 20, 2023
b03611a
Update poetry lock file
MaGering Mar 21, 2023
7c1c509
Merge branch 'features/switch_to_poetry' into features/add_pre-commit_
MaGering Mar 21, 2023
ca8f624
Merge branch 'features/switch_to_poetry' into features/add_pre-commit_
MaGering Apr 17, 2023
eeb0272
Fix env name in conda activation command
nesnoj Apr 18, 2023
c75e0da
Add sudo to system packages install command
nesnoj Apr 18, 2023
d693760
Update .pre-commit-config.yaml
MaGering Apr 18, 2023
5b43f97
Move __init__.py to child dir
MaGering Apr 18, 2023
162e293
Merge branch 'dev' into features/add_pre-commit_
MaGering Apr 18, 2023
89 changes: 89 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,89 @@
# TODO: Find out if something needs to be changed in here in comparison to digiplan

exclude: 'docs|node_modules|vendors|migrations|.git|.tox'
default_stages: [commit]
fail_fast: true

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.3.0
    hooks:
      - id: check-json
      - id: end-of-file-fixer
      - id: trailing-whitespace
      - id: check-added-large-files

  - repo: https://github.com/charliermarsh/ruff-pre-commit
    # Ruff version.
    rev: 'v0.0.244'
    hooks:
      - id: ruff
        args: ["--fix"]

  - repo: https://github.com/pycqa/isort
    rev: 5.12.0
    hooks:
      - id: isort
        name: isort (python)
        args: ["--line-length", "80", "black"]

  - repo: local
    hooks:
      - id: black
        name: black
        entry: black
        language: python
        types: [python]
      - id: flake8
        name: flake8
        args:
          - --max-line-length=80
          - --ignore=DAR101,DAR201,F821,DAR401,W503,E800,E722,B001,B008,C408,W605, B007
        entry: flake8
        language: python
        types: [python]
      - id: pylint
        name: pylint
        entry: env DATABASE_URL=null env PROJ_LIB=null USE_DOCKER=null pylint
        language: system
        types: [python]
        args:
          [
            "-rn", # Only display messages
            "-sn", # Don't display the score
            "--disable=E0602", # Disable the E0602 error
            "--disable=C0114", # Disable the C0114 error TODO: To be fixed
            "--disable=C0116", # Disable the C0116 error TODO: To be fixed
            "--disable=R1729", # Disable the R1729 error TODO: To be fixed
            "--disable=C0103", # Disable the C0103 error TODO: To be fixed
            "--disable=R0801", # Disable the R0801 error TODO: To be fixed
            "--disable=W1514", # Disable the W1514 error TODO: To be fixed
            "--disable=R1734", # Disable the R1734 error TODO: To be fixed
            "--disable=R1735", # Disable the R1735 error TODO: To be fixed
            "--disable=W0612", # Disable the W0612 error TODO: To be fixed
            "--disable=W1401", # Disable the W1401 error TODO: To be fixed
            "--disable=W0511", # Disable the W0511 error TODO: To be fixed
            "--disable=R0913", # Disable the R0913 error TODO: To be fixed
            "--disable=R1705", # Disable the R1705 error TODO: To be fixed
            "--disable=E0401", # Disable the E0401 error TODO: To be fixed
            "--disable=E0611", # Disable the E0611 error TODO: To be fixed
            "--max-line-length=80", # Set the maximum line length to 80
          ]
      - id: mypy
        name: mypy
        entry: mypy
        language: python
        types: [python]
        args:
          [
            "--ignore-missing-imports", # Ignore missing imports
            "--warn-unused-ignores", # Warn about unused ignore comments
            "--disable-error-code=name-defined", # To suppress Name "snakemake"
            # is not defined [name-defined]
            "--disable-error-code=var-annotated",
            "--disable-error-code=var-annotated",
            "--disable-error-code=dict-item",
            "--disable-error-code=arg-type",
            "--disable-error-code=assignment",
            "--disable-error-code=attr-defined",
          ]
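
For orientation, a hedged before/after sketch of the kind of change these hooks enforce (the snippet is made up for illustration): isort regroups imports, black normalizes quoting, spacing and one-liner definitions under the 80-character limit set in pyproject.toml, and the whitespace hooks tidy line ends.

```
# Illustrative only: a snippet as it might look before the hooks run
import yaml
from pathlib import Path
def read_cfg(file: Path)->dict: return yaml.safe_load(open(file,'r')) or {}

# ...and after `pre-commit run -a`: isort puts the stdlib import first,
# black enforces double quotes, spacing and one statement per line
from pathlib import Path

import yaml


def read_cfg(file: Path) -> dict:
    return yaml.safe_load(open(file, "r")) or {}
```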
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -19,6 +19,7 @@ and the versioning aim to respect [Semantic Versioning](http://semver.org/spec/v
- Datasets attribute captions
- Create list of region-specific datasets in the docs
- pyproject.toml and poetry.lock file have been added with the conversion to poetry
- Add pre-commit in order to check for errors and linting bugs before commits

### Changed

@@ -31,6 +32,7 @@ and the versioning aim to respect [Semantic Versioning](http://semver.org/spec/v
- Fix conda installation by removing python-gdal from environment.yml
- The package management in digipipe has been changed to poetry.
- The installation of a virtual environment is done only from the environment.yml file and via conda.
- Apply linters on repo among others: black, isort, check-json and end-of-file-fixer

### Removed

22 changes: 18 additions & 4 deletions README.md
@@ -10,7 +10,7 @@

Pipeline for data and energy system in the Digiplan project.

## Installation
## Installation

**Note: Linux only, Windows is currently not supported.**

@@ -27,7 +27,7 @@ Enter repo folder. Set up a conda environment and activate it with:

```
conda env create -f environment.yml
conda activate digiplan
conda activate digipipe
```

Install [poetry](https://python-poetry.org/) (python dependency manager used
@@ -43,12 +43,26 @@ poetry install

Some additional system packages are required, install them by

apt install gdal-bin python3-gdal libspatialindex-dev imagemagick osmium-tool graphviz graphviz-dev
sudo apt install gdal-bin python3-gdal libspatialindex-dev imagemagick osmium-tool graphviz graphviz-dev

Notes:
- Make sure you have GDAL>=3.0 as older versions will not work.
- `imagemagick` is optional and only required for report creation

## Contributing to digipipe

You can write `issues <https://github.com/rl-institut-private/digipipe/issues>`_ to announce bugs or to propose enhancements.

If you want to participate in the development of digipipe, please make sure you use pre-commit.

You activate it with:

pre-commit install

To trigger a check manually, execute:

pre-commit run -a

## Further reading on structure, pipeline and conventions

- Datasets/data flow: [DATASETS.md](digipipe/store/DATASETS.md)
@@ -131,4 +145,4 @@ cores and requires about 600 GB of disk space.
└── setup.py
```

(created via `tree --dirsfirst -L 4 -a -I '__*|*log|.gitkeep|PKG-INFO|*egg-info*|img|.git|.idea|venv|.snakemake' . > dirtree.txt`)
(created via `tree --dirsfirst -L 4 -a -I '__*|*log|.gitkeep|PKG-INFO|*egg-info*|img|.git|.idea|venv|.snakemake' . > dirtree.txt`)
2 changes: 0 additions & 2 deletions __init__.py

This file was deleted.

1 change: 1 addition & 0 deletions digipipe/__init__.py
@@ -0,0 +1 @@
__version__ = "0.0.1dev"
6 changes: 4 additions & 2 deletions digipipe/scripts/config.py
@@ -3,8 +3,10 @@
"""

import os
import yaml
from pathlib import Path

import yaml

from digipipe.store.utils import get_abs_store_root_path


@@ -21,7 +23,7 @@ def read_config(file: Path) -> dict:
dict
Config dict
"""
with open(file, 'r') as cfg_file:
with open(file, "r") as cfg_file:
try:
cfg = yaml.safe_load(cfg_file) or {}
except yaml.YAMLError as exc:
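
A rough usage sketch of the reformatted helper, assuming the package is installed via `poetry install` (the config path below is hypothetical):

```
from pathlib import Path

from digipipe.scripts.config import read_config

# Hypothetical config file, for illustration only
cfg = read_config(Path("digipipe/store/datasets/my_dataset/config.yml"))
print(cfg.get("name", "unnamed dataset"))
```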
76 changes: 39 additions & 37 deletions digipipe/scripts/datasets/mastr.py
@@ -1,11 +1,12 @@
"""Shared functions for processing data from MaStR dataset"""

from typing import Tuple, Union

import geopandas as gpd
import pandas as pd
from geopy.exc import GeocoderUnavailable
from geopy.extra.rate_limiter import RateLimiter
from geopy.geocoders import Nominatim
from typing import Tuple, Union


def cleanse(
@@ -68,14 +69,18 @@ def add_voltage_level(
).rename(columns={"MastrNummer": "mastr_location_id2"})
gridconn = pd.read_csv(
gridconn_path,
usecols=["NetzanschlusspunktMastrNummer", "Spannungsebene"]
usecols=["NetzanschlusspunktMastrNummer", "Spannungsebene"],
)
locations = (
locations.merge(
gridconn,
left_on="Netzanschlusspunkte",
right_on="NetzanschlusspunktMastrNummer",
how="left",
)
.drop_duplicates()
.rename(columns={"Spannungsebene": "voltage_level"})
)
locations = locations.merge(
gridconn,
left_on="Netzanschlusspunkte",
right_on="NetzanschlusspunktMastrNummer",
how="left",
).drop_duplicates().rename(columns={"Spannungsebene": "voltage_level"})

# Add voltage level to units
units_df = units_df.reset_index().merge(
@@ -98,8 +103,8 @@


def add_geometry(
units_df: pd.DataFrame,
drop_units_wo_coords: bool = True,
units_df: pd.DataFrame,
drop_units_wo_coords: bool = True,
) -> gpd.GeoDataFrame:
"""
Add `geometry` column to MaStR unit data using `lat` and `lon` values.
@@ -129,9 +134,7 @@

units_gdf = gpd.GeoDataFrame(
units_df,
geometry=gpd.points_from_xy(
units_df["lon"], units_df["lat"], crs=4326
),
geometry=gpd.points_from_xy(units_df["lon"], units_df["lat"], crs=4326),
crs=4326,
).to_crs(3035)

@@ -142,10 +145,10 @@


def geocode(
units_df: pd.DataFrame,
user_agent: str = "geocoder",
interval: int = 1,
target_crs: str = "EPSG:3035",
units_df: pd.DataFrame,
user_agent: str = "geocoder",
interval: int = 1,
target_crs: str = "EPSG:3035",
) -> gpd.GeoDataFrame:
"""
Geocode locations from MaStR unit table using zip code and city.
@@ -171,8 +174,8 @@
"""

def geocoder(
user_agent: str,
interval: int,
user_agent: str,
interval: int,
) -> RateLimiter:
"""Setup Nominatim geocoding class.

@@ -195,7 +198,7 @@
locator.geocode,
min_delay_seconds=interval,
)
except GeocoderUnavailable as e:
except GeocoderUnavailable:
print("Geocoder unavailable, aborting geocoding!")
raise

@@ -231,8 +234,9 @@
)
unique_locations = gpd.GeoDataFrame(
unique_locations,
geometry=gpd.points_from_xy(unique_locations.longitude,
unique_locations.latitude),
geometry=gpd.points_from_xy(
unique_locations.longitude, unique_locations.latitude
),
crs="EPSG:4326",
)
# Merge locations back in units
@@ -247,9 +251,9 @@


def geocode_units_wo_geometry(
units_df: pd.DataFrame,
columns_agg_functions: dict,
target_crs: str = "EPSG:3035",
units_df: pd.DataFrame,
columns_agg_functions: dict,
target_crs: str = "EPSG:3035",
) -> Tuple[gpd.GeoDataFrame, gpd.GeoDataFrame]:
"""
Add locations to units without coordinates by geocoding. The units are
@@ -287,8 +291,9 @@
Units grouped by approximated location (1 dataset with >=1 units per
row) with aggregated attributes as given by `columns_agg_functions`.
"""

def aggregate_units_wo_geometry(
units_gdf: gpd.GeoDataFrame
units_gdf: gpd.GeoDataFrame,
) -> gpd.GeoDataFrame:
"""Aggregate units by approximated position

@@ -308,24 +313,21 @@

grouping_columns = ["zip_code", "city", "lat", "lon"]
units_agg_gdf = (
units_gdf[
grouping_columns + columns_agg_names
].groupby(grouping_columns, as_index=False).agg(
**columns_agg_functions
)
units_gdf[grouping_columns + columns_agg_names]
.groupby(grouping_columns, as_index=False)
.agg(**columns_agg_functions)
)

# Create geometry and select columns
units_agg_gdf = gpd.GeoDataFrame(
units_agg_gdf,
geometry=gpd.points_from_xy(units_agg_gdf.lon,
units_agg_gdf.lat),
geometry=gpd.points_from_xy(units_agg_gdf.lon, units_agg_gdf.lat),
crs=target_crs,
)[
["zip_code", "city"] +
list(columns_agg_functions.keys()) +
["geometry"]
]
["zip_code", "city"]
+ list(columns_agg_functions.keys())
+ ["geometry"]
]
return units_agg_gdf.assign(
status="In Betrieb oder in Planung",
geometry_approximated=1,
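
To show how the reformatted MaStR helpers fit together, a minimal sketch; the column names, the named-aggregation spec and the return order are assumptions read off the visible signatures and docstrings, and the geocoding step needs network access to Nominatim:

```
import pandas as pd

from digipipe.scripts.datasets.mastr import (
    add_geometry,
    geocode_units_wo_geometry,
)

# Toy MaStR-like units, for illustration only
units = pd.DataFrame(
    {
        "EinheitMastrNummer": ["SEE001", "SEE002"],
        "capacity_net": [10.0, 25.0],
        "lon": [12.5, None],  # second unit lacks coordinates
        "lat": [52.1, None],
        "zip_code": ["39104", "39104"],
        "city": ["Magdeburg", "Magdeburg"],
    }
)

# Units with coordinates: build point geometries (EPSG:4326 -> EPSG:3035)
units_with_geom = add_geometry(units.dropna(subset=["lon", "lat"]))

# Units without coordinates: geocode via zip code and city, then aggregate
# per approximated location
units_geocoded, units_agg = geocode_units_wo_geometry(
    units[units["lon"].isna()],
    columns_agg_functions={"capacity_net_sum": ("capacity_net", "sum")},
)
```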