diff --git a/.github/RELEASE_TEMPLATE/release_body.md b/.github/RELEASE_TEMPLATE/release_body.md index 630c9436c8..d4b5f9834a 100644 --- a/.github/RELEASE_TEMPLATE/release_body.md +++ b/.github/RELEASE_TEMPLATE/release_body.md @@ -1,3 +1,3 @@ We are pleased to announce the release of a new Darts version. -You can find a list with all changes in the [release notes](https://unit8co.github.io/darts/release_notes/RELEASE_NOTES.html). \ No newline at end of file +You can find a list with all changes in the [release notes](https://unit8co.github.io/darts/release_notes/RELEASE_NOTES.html). diff --git a/.github/codecov.yml b/.github/codecov.yml index 5dd2178631..35cde5cd5e 100644 --- a/.github/codecov.yml +++ b/.github/codecov.yml @@ -1,4 +1,4 @@ coverage: status: project: off - patch: off \ No newline at end of file + patch: off diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md index 77350a2f56..48047b2301 100644 --- a/.github/pull_request_template.md +++ b/.github/pull_request_template.md @@ -1,7 +1,7 @@ Checklist before merging this PR: - [ ] Mentioned all issues that this PR fixes or addresses. - [ ] Summarized the updates of this PR under **Summary**. -- [ ] Added an entry under **Unreleased** in the [Changelog](../CHANGELOG.md). +- [ ] Added an entry under **Unreleased** in the [Changelog](../CHANGELOG.md). Fixes #. diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index eb52467d33..66a1e1f06c 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -1,4 +1,24 @@ +default_language_version: + python: python3 + +ci: + autofix_prs: true + autoupdate_commit_msg: "[pre-commit.ci] pre-commit suggestions" + autoupdate_schedule: quarterly + # submodules: true + repos: + - repo: https://github.com/pre-commit/pre-commit-hooks + rev: v4.6.0 + hooks: + - id: end-of-file-fixer + - id: trailing-whitespace + - id: check-json + - id: check-yaml + exclude: "conda_recipe/darts/meta.yaml" + - id: check-toml + - id: detect-private-key + - repo: https://github.com/psf/black rev: 24.3.0 hooks: diff --git a/CHANGELOG.md b/CHANGELOG.md index 33db100bc3..579e2b613b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -13,6 +13,8 @@ but cannot always guarantee backwards compatibility. Changes that may **break co **Fixed** **Dependencies** +- Improvements to linting via updated pre-commit configurations: [#2324](https://github.com/unit8co/darts/pull/2324) by [Jirka Borovec](https://github.com/borda). + ### For developers of the library: @@ -27,7 +29,7 @@ but cannot always guarantee backwards compatibility. Changes that may **break co - Time aggregated metric `merr()` (Mean Error) - Time aggregated scaled metrics `rmsse()`, and `msse()` : The (Root) Mean Squared Scaled Error. - "Per time step" metrics that return a metric score per time step: `err()` (Error), `ae()` (Absolute Error), `se()` (Squared Error), `sle()` (Squared Log Error), `ase()` (Absolute Scaled Error), `sse` (Squared Scaled Error), `ape()` (Absolute Percentage Error), `sape()` (symmetric Absolute Percentage Error), `arre()` (Absolute Ranged Relative Error), `ql` (Quantile Loss) - - All scaled metrics (`mase()`, ...) now accept `insample` series that can be overlapping into `pred_series` (before they had to end exactly one step before `pred_series`). Darts will handle the correct time extraction for you. + - All scaled metrics (`mase()`, ...) 
now accept `insample` series that can be overlapping into `pred_series` (before they had to end exactly one step before `pred_series`). Darts will handle the correct time extraction for you. - Improvements to the documentation: - Added a summary list of all metrics to the [metrics documentation page](https://unit8co.github.io/darts/generated_api/darts.metrics.html) - Standardized the documentation of each metric (added formula, improved return documentation, ...) @@ -48,7 +50,7 @@ but cannot always guarantee backwards compatibility. Changes that may **break co - πŸ”΄ Improved historical forecasts output consistency based on the type of input `series` : If `series` is a sequence, historical forecasts will now always return a sequence/list of the same length (instead of trying to reduce to a `TimeSeries` object). You can find a detailed description in the [historical forecasts API documentation](https://unit8co.github.io/darts/generated_api/darts.models.forecasting.linear_regression_model.html#darts.models.forecasting.linear_regression_model.LinearRegressionModel.historical_forecasts). - **Backtest**: - Metrics are now computed only once on all `series` and `historical_forecasts`, significantly speeding things up when using a large number of `series`. - - Added support for scaled metrics as `metric` (such as `ase`, `mase`, ...). No extra code required, backtest extracts the correct `insample` series for you. + - Added support for scaled metrics as `metric` (such as `ase`, `mase`, ...). No extra code required, backtest extracts the correct `insample` series for you. - Added support for passing additional metric (-specific) arguments with parameter `metric_kwargs`. This allows for example to parallelize the metric computation with `n_jobs`, customize the metric reduction with `*_reduction`, specify seasonality `m` for scaled metrics, etc. - πŸ”΄ Breaking changes: - Improved backtest output consistency based on the type of input `series`, `historical_forecast`, and the applied backtest reduction. For some scenarios, the output type changed compared to previous Darts versions. You can find a detailed description in the [backtest API documentation](https://unit8co.github.io/darts/generated_api/darts.models.forecasting.linear_regression_model.html#darts.models.forecasting.linear_regression_model.LinearRegressionModel.backtest). @@ -99,13 +101,13 @@ but cannot always guarantee backwards compatibility. Changes that may **break co - πŸš€πŸš€πŸš€ All global models (regression and torch models) now support shifted predictions with model creation parameter `output_chunk_shift`. This will shift the output chunk for training and prediction by `output_chunk_shift` steps into the future. [#2176](https://github.com/unit8co/darts/pull/2176) by [Dennis Bader](https://github.com/dennisbader). - Improvements to `TimeSeries`, [#2196](https://github.com/unit8co/darts/pull/2196) by [Dennis Bader](https://github.com/dennisbader): - πŸš€πŸš€πŸš€ Significant performance boosts for several `TimeSeries` methods resulting increased efficiency across the entire `Darts` library. Up to 2x faster creation times for series indexed with "regular" frequencies (e.g. Daily, hourly, ...), and >100x for series indexed with "special" frequencies (e.g. "W-MON", ...). 
Affects: - - All `TimeSeries` creation methods + - All `TimeSeries` creation methods - Additional boosts for slicing with integers and Timestamps - Additional boosts for `from_group_dataframe()` by performing some of the heavy-duty computations on the entire DataFrame, rather than iteratively on the group level. - Added option to exclude some `group_cols` from being added as static covariates when using `TimeSeries.from_group_dataframe()` with parameter `drop_group_cols`. - πŸš€ New global baseline models that use fixed input and output chunks for prediction. This offers support for univariate, multivariate, single and multiple target series prediction, one-shot- or autoregressive/moving forecasts, optimized historical forecasts, batch prediction, prediction from datasets, and more. [#2261](https://github.com/unit8co/darts/pull/2261) by [Dennis Bader](https://github.com/dennisbader). - - `GlobalNaiveAggregate` : Computes an aggregate (using a custom or built-in `torch` function) for each target component over the last `input_chunk_length` points, and repeats the values `output_chunk_length` times for prediction. Depending on the parameters, this model can be equivalent to `NaiveMean` and `NaiveMovingAverage`. - - `GlobalNaiveDrift` : Takes the slope of each target component over the last `input_chunk_length` points and projects the trend over the next `output_chunk_length` points for prediction. Depending on the parameters, this model can be equivalent to `NaiveDrift`. + - `GlobalNaiveAggregate` : Computes an aggregate (using a custom or built-in `torch` function) for each target component over the last `input_chunk_length` points, and repeats the values `output_chunk_length` times for prediction. Depending on the parameters, this model can be equivalent to `NaiveMean` and `NaiveMovingAverage`. + - `GlobalNaiveDrift` : Takes the slope of each target component over the last `input_chunk_length` points and projects the trend over the next `output_chunk_length` points for prediction. Depending on the parameters, this model can be equivalent to `NaiveDrift`. - `GlobalNaiveSeasonal` : Takes the target component value at the `input_chunk_length`th point before the end of the target `series`, and repeats the values `output_chunk_length` times for prediction. Depending on the parameters, this model can be equivalent to `NaiveSeasonal`. - Improvements to `TorchForecastingModel` : - Added support for additional lr scheduler configuration parameters for more control ("interval", "frequency", "monitor", "strict", "name"). [#2218](https://github.com/unit8co/darts/pull/2218) by [Dennis Bader](https://github.com/dennisbader). @@ -210,10 +212,10 @@ No changes. - Improvements to `EnsembleModel`, [#1815](https://github.com/unit8co/darts/pull/#1815) by [Antoine Madrona](https://github.com/madtoinou) and [Dennis Bader](https://github.com/dennisbader): - πŸ”΄ Renamed model constructor argument `models` to `forecasting_models`. - πŸš€πŸš€ Added support for pre-trained `GlobalForecastingModel` as `forecasting_models` to avoid re-training when ensembling. This requires all models to be pre-trained global models. - - πŸš€ Added support for generating the `forecasting_model` forecasts (used to train the ensemble model) with historical forecasts rather than direct (auto-regressive) predictions. Enable it with `train_using_historical_forecasts=True` at model creation. 
+ - πŸš€ Added support for generating the `forecasting_model` forecasts (used to train the ensemble model) with historical forecasts rather than direct (auto-regressive) predictions. Enable it with `train_using_historical_forecasts=True` at model creation. - Added an example notebook for ensemble models. - Improvements to historical forecasts, backtest and gridsearch, [#1866](https://github.com/unit8co/darts/pull/1866) by [Antoine Madrona](https://github.com/madtoinou): - - Added support for negative `start` values to start historical forecasts relative to the end of the target series. + - Added support for negative `start` values to start historical forecasts relative to the end of the target series. - Added a new argument `start_format` that allows to use an integer `start` either as the index position or index value/label for `series` indexed with a `pd.RangeIndex`. - Added support for `TimeSeries` with a `RangeIndex` starting at a negative integer. - Other improvements: @@ -241,7 +243,7 @@ No changes. **Installation** - πŸ”΄ Removed Prophet, LightGBM, and CatBoost dependencies from PyPI packages (`darts`, `u8darts`, `u8darts[torch]`), and conda-forge packages (`u8darts`, `u8darts-torch`) to avoid installation issues that some users were facing (installation on Apple M1/M2 devices, ...). [#1589](https://github.com/unit8co/darts/pull/1589) by [Julien Herzen](https://github.com/hrzn) and [Dennis Bader](https://github.com/dennisbader). - The models are still supported by installing the required packages as described in our [installation guide](https://github.com/unit8co/darts/blob/master/INSTALL.md#enabling-optional-dependencies). - - The Darts package including all dependencies can still be installed with PyPI package `u8darts[all]` or conda-forge package `u8darts-all`. + - The Darts package including all dependencies can still be installed with PyPI package `u8darts[all]` or conda-forge package `u8darts-all`. - Added new PyPI flavor `u8darts[notorch]`, and conda-forge flavor `u8darts-notorch` which are equivalent to the old `u8darts` installation (all dependencies except neural networks). - πŸ”΄ Removed support for Python 3.7 [#1864](https://github.com/unit8co/darts/pull/1864) by [Dennis Bader](https://github.com/dennisbader). @@ -296,7 +298,7 @@ No changes. - New baseline forecasting model `NaiveMovingAverage`. [#1557](https://github.com/unit8co/darts/pull/1557) by [Janek Fidor](https://github.com/JanFidor). - New models `StatsForecastAutoCES`, and `StatsForecastAutoTheta` from Nixtla's statsforecasts library as local forecasting models without covariates support. AutoTheta supports probabilistic forecasts. [#1476](https://github.com/unit8co/darts/pull/1476) by [Boyd Biersteker](https://github.com/Beerstabr). - Added support for future covariates, and probabilistic forecasts to `StatsForecastAutoETS`. [#1476](https://github.com/unit8co/darts/pull/1476) by [Boyd Biersteker](https://github.com/Beerstabr). - - Added support for logistic growth to `Prophet` with parameters `growth`, `cap`, `floor`. [#1419](https://github.com/unit8co/darts/pull/1419) by [David Kleindienst](https://github.com/DavidKleindienst). + - Added support for logistic growth to `Prophet` with parameters `growth`, `cap`, `floor`. [#1419](https://github.com/unit8co/darts/pull/1419) by [David Kleindienst](https://github.com/DavidKleindienst). 
- Improved the model string / object representation style similar to scikit-learn models. [#1590](https://github.com/unit8co/darts/pull/1590) by [Janek Fidor](https://github.com/JanFidor). - πŸ”΄ Renamed `MovingAverage` to `MovingAverageFilter` to avoid confusion with new `NaiveMovingAverage` model. [#1557](https://github.com/unit8co/darts/pull/1557) by [Janek Fidor](https://github.com/JanFidor). - Improvements to `RegressionModel` : @@ -541,7 +543,7 @@ Patch release - Improved user guide with more sections. [#905](https://github.com/unit8co/darts/pull/905) by [Julien Herzen](https://github.com/hrzn). - New notebook showcasing transfer learning and training forecasting models on large time - series datasets. [#885](https://github.com/unit8co/darts/pull/885) + series datasets. [#885](https://github.com/unit8co/darts/pull/885) by [Julien Herzen](https://github.com/hrzn). @@ -554,7 +556,7 @@ Patch release **Improved** - `LinearRegressionModel` and `LightGBMModel` can now be probabilistic, supporting quantile - and poisson regression. [#831](https://github.com/unit8co/darts/pull/831), + and poisson regression. [#831](https://github.com/unit8co/darts/pull/831), [#853](https://github.com/unit8co/darts/pull/853) by [Gian Wiher](https://github.com/gnwhr). - New models: `BATS` and `TBATS`, based on [tbats](https://github.com/intive-DataScience/tbats). [#816](https://github.com/unit8co/darts/pull/816) by [Julien Herzen](https://github.com/hrzn). @@ -564,7 +566,7 @@ Patch release by [@gsamaras](https://github.com/gsamaras). - Added train and validation loss to PyTorch Lightning progress bar. [#825](https://github.com/unit8co/darts/pull/825) by [Dennis Bader](https://github.com/dennisbader). -- More losses available in `darts.utils.losses` for PyTorch-based models: +- More losses available in `darts.utils.losses` for PyTorch-based models: `SmapeLoss`, `MapeLoss` and `MAELoss`. [#845](https://github.com/unit8co/darts/pull/845) by [Julien Herzen](https://github.com/hrzn). - Improvement to the seasonal decomposition [#862](https://github.com/unit8co/darts/pull/862). @@ -595,7 +597,7 @@ Patch release by [Dennis Bader](https://github.com/dennisbader). - Fixed an issue with the periodic basis functions of N-BEATS. [#804](https://github.com/unit8co/darts/pull/804) by [Vladimir Chernykh](https://github.com/vladimir-chernykh). -- Relaxed requirements for `pandas`; from `pandas>=1.1.0` to `pandas>=1.0.5`. +- Relaxed requirements for `pandas`; from `pandas>=1.1.0` to `pandas>=1.0.5`. [#800](https://github.com/unit8co/darts/pull/800) by [@adelnick](https://github.com/adelnick). @@ -629,7 +631,7 @@ Patch release **Fixed** -- Fixed an issue with tensorboard and gridsearch when `model_name` is provided. +- Fixed an issue with tensorboard and gridsearch when `model_name` is provided. [#759](https://github.com/unit8co/darts/issues/759) by [@gdevos010](https://github.com/gdevos010). - Fixed issues with pip-tools. [#762](https://github.com/unit8co/darts/pull/762) by [Tomas Van Pottelbergh](https://github.com/tomasvanpottelbergh). 
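The changelog hunks above note that `backtest()` now accepts scaled metrics such as `ase()` and `mase()` directly, extracting the required `insample` series itself. A minimal sketch of that workflow — illustrative only, not part of this patch; the dataset and parameter values are assumptions:

```python
from darts.datasets import AirPassengersDataset
from darts.metrics import mase
from darts.models import NaiveSeasonal

series = AirPassengersDataset().load()
model = NaiveSeasonal(K=12)

# Per the changelog, no extra code is needed for scaled metrics: backtest
# extracts the correct `insample` series before computing mase().
score = model.backtest(series, start=0.75, forecast_horizon=12, metric=mase)
print(score)
```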
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index ac2513c308..4e1e504621 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -80,5 +80,5 @@ To ensure you don't need to worry about formatting and linting when contributing Please follow the procedure described in [INSTALL.md](https://github.com/unit8co/darts/blob/master/INSTALL.md#test-environment-appple-m1-processor) to set up a x_64 emulated environment. For the development environment, instead of installing Darts with `pip install darts`, instead go to the darts cloned repo location and install the packages with: `pip install -r requirements/dev-all.txt`. -If necessary, follow the same steps to setup libomp for lightgbm. +If necessary, follow the same steps to setup libomp for lightgbm. Finally, verify your overall environment setup by successfully running all unitTests with gradlew or pytest. diff --git a/Dockerfile b/Dockerfile index 160fbec0a8..98179fbbdc 100644 --- a/Dockerfile +++ b/Dockerfile @@ -21,4 +21,4 @@ RUN pip install -e . # assuming you are working from inside your darts directory: # docker build . -t darts-test:latest -# docker run -it -v $(pwd)/:/app/ darts-test:latest bash \ No newline at end of file +# docker run -it -v $(pwd)/:/app/ darts-test:latest bash diff --git a/INSTALL.md b/INSTALL.md index c69a29ce54..b235d33be2 100644 --- a/INSTALL.md +++ b/INSTALL.md @@ -5,7 +5,7 @@ Below, we detail how to install Darts using either `conda` or `pip`. ## From PyPI Install Darts with all models except the ones from optional dependencies (Prophet, LightGBM, CatBoost, see more on that [here](#enabling-optional-dependencies)): `pip install darts`. -If this fails on your platform, please follow the official installation +If this fails on your platform, please follow the official installation guide for [PyTorch](https://pytorch.org/get-started/locally/), then try installing Darts again. As some dependencies are relatively big or involve non-Python dependencies, @@ -37,8 +37,8 @@ As some models have relatively heavy dependencies, we provide four conda-forge p ## Other Information ### Enabling Optional Dependencies -As of version 0.25.0, the default `darts` package does not install Prophet, CatBoost, and LightGBM dependencies anymore, because their -build processes were too often causing issues. We continue supporting the model wrappers `Prophet`, `CatBoostModel`, and `LightGBMModel` in Darts though. If you want to use any of them, you will need to manually install the corresponding packages (or install a Darts flavor as described above). +As of version 0.25.0, the default `darts` package does not install Prophet, CatBoost, and LightGBM dependencies anymore, because their +build processes were too often causing issues. We continue supporting the model wrappers `Prophet`, `CatBoostModel`, and `LightGBMModel` in Darts though. If you want to use any of them, you will need to manually install the corresponding packages (or install a Darts flavor as described above). #### Prophet Install the `prophet` package (version 1.1.1 or more recent) using the [Prophet install guide](https://facebook.github.io/prophet/docs/installation.html#python) @@ -99,4 +99,4 @@ To build documentation locally just run ```bash ./gradlew buildDocs ``` -After that docs will be available in `./docs/build/html` directory. You can just open `./docs/build/html/index.html` using your favourite browser. \ No newline at end of file +After that docs will be available in `./docs/build/html` directory. 
You can just open `./docs/build/html/index.html` using your favourite browser. diff --git a/README.md b/README.md index 4d482214cc..aa1b4067c3 100644 --- a/README.md +++ b/README.md @@ -19,8 +19,8 @@ on time series. It contains a variety of models, from classics such as ARIMA to deep neural networks. The forecasting models can all be used in the same way, using `fit()` and `predict()` functions, similar to scikit-learn. The library also makes it easy to backtest models, -combine the predictions of several models, and take external data into account. -Darts supports both univariate and multivariate time series and models. +combine the predictions of several models, and take external data into account. +Darts supports both univariate and multivariate time series and models. The ML-based models can be trained on potentially large datasets containing multiple time series, and some of the models offer a rich support for probabilistic forecasting. @@ -59,7 +59,7 @@ Once your environment is set up you can install darts using pip: pip install darts -For more details you can refer to our +For more details you can refer to our [installation instructions](https://github.com/unit8co/darts/blob/master/INSTALL.md). ## Example Usage @@ -166,7 +166,7 @@ series.plot() * **Multivariate Support:** `TimeSeries` can be multivariate - i.e., contain multiple time-varying dimensions instead of a single scalar value. Many models can consume and produce multivariate series. -* **Multiple series training (global models):** All machine learning based models (incl. all neural networks) +* **Multiple series training (global models):** All machine learning based models (incl. all neural networks) support being trained on multiple (potentially multivariate) series. This can scale to large datasets too. * **Probabilistic Support:** `TimeSeries` objects can (optionally) represent stochastic @@ -174,7 +174,7 @@ series.plot() flavours of probabilistic forecasting (such as estimating parametric distributions or quantiles). Some anomaly detection scorers are also able to exploit these predictive distributions. -* **Past and Future Covariates support:** Many models in Darts support past-observed and/or future-known +* **Past and Future Covariates support:** Many models in Darts support past-observed and/or future-known covariate (external data) time series as inputs for producing forecasts. * **Static Covariates support:** In addition to time-dependent data, `TimeSeries` can also contain @@ -262,7 +262,7 @@ on bringing more models and features. ## Community & Contact -Anyone is welcome to join our [Gitter room](https://gitter.im/u8darts/darts) to ask questions, make proposals, +Anyone is welcome to join our [Gitter room](https://gitter.im/u8darts/darts) to ask questions, make proposals, discuss use-cases, and more. If you spot a bug or have suggestions, GitHub issues are also welcome. 
If what you want to tell us is not suitable for Gitter or Github, diff --git a/datasets/heart_rate.csv b/datasets/heart_rate.csv index 189e262631..8af4d5bbcc 100644 --- a/datasets/heart_rate.csv +++ b/datasets/heart_rate.csv @@ -1798,4 +1798,4 @@ Heart rate 101.623 99.5679 99.1835 -98.8567 \ No newline at end of file +98.8567 diff --git a/datasets/ice_cream_heater.csv b/datasets/ice_cream_heater.csv index 2c87a562e9..d45e28f9df 100644 --- a/datasets/ice_cream_heater.csv +++ b/datasets/ice_cream_heater.csv @@ -196,4 +196,4 @@ Month,heater,ice cream 2020-03,25,44 2020-04,25,53 2020-05,27,70 -2020-06,24,74 \ No newline at end of file +2020-06,24,74 diff --git a/datasets/monthly-milk-incomplete.csv b/datasets/monthly-milk-incomplete.csv index fa498a3773..b5f20519d2 100644 --- a/datasets/monthly-milk-incomplete.csv +++ b/datasets/monthly-milk-incomplete.csv @@ -154,4 +154,3 @@ "1975-08",858 "1975-11",797 "1975-12",843 - diff --git a/datasets/monthly-milk.csv b/datasets/monthly-milk.csv index 8c90e1073a..8040820d67 100644 --- a/datasets/monthly-milk.csv +++ b/datasets/monthly-milk.csv @@ -167,4 +167,3 @@ "1975-10",827 "1975-11",797 "1975-12",843 - diff --git a/datasets/monthly-sunspots.csv b/datasets/monthly-sunspots.csv index bddb7f8c20..4817b1e75f 100644 --- a/datasets/monthly-sunspots.csv +++ b/datasets/monthly-sunspots.csv @@ -2818,4 +2818,4 @@ "1983-09",50.3 "1983-10",55.8 "1983-11",33.3 -"1983-12",33.4 \ No newline at end of file +"1983-12",33.4 diff --git a/datasets/temps.csv b/datasets/temps.csv index e9a18f6710..c2c6969cfa 100644 --- a/datasets/temps.csv +++ b/datasets/temps.csv @@ -3648,4 +3648,4 @@ Date,Daily minimum temperatures 12/28/1990,13.6 12/29/1990,13.5 12/30/1990,15.7 -12/31/1990,13 \ No newline at end of file +12/31/1990,13 diff --git a/datasets/us_gasoline.csv b/datasets/us_gasoline.csv index f79de0fd79..89165db6b8 100644 --- a/datasets/us_gasoline.csv +++ b/datasets/us_gasoline.csv @@ -1576,4 +1576,4 @@ Week,Gasoline 03/1/1991,7224 02/22/1991,6582 02/15/1991,6433 -02/8/1991,6621 \ No newline at end of file +02/8/1991,6621 diff --git a/docs/source/userguide.rst b/docs/source/userguide.rst index a1f81fe61c..e25d17922e 100644 --- a/docs/source/userguide.rst +++ b/docs/source/userguide.rst @@ -25,7 +25,7 @@ You will find here some more detailed information about Darts. .. userguide/probabilistic_forecasting.md .. userguide/ensembling.md - + .. userguide/filtering_models.md .. userguide/preprocessing_and_pipelines.md diff --git a/docs/userguide/covariates.md b/docs/userguide/covariates.md index cc4c564b87..97f82c6d92 100644 --- a/docs/userguide/covariates.md +++ b/docs/userguide/covariates.md @@ -90,13 +90,13 @@ Let's have a look at some examples of past, future, and static covariates: - daily average **forecasted** temperatures (known in the future) - day of week, month, year, ... - `static_covariates`: time independent/constant/static `target` characteristics - - categorical: + - categorical: - location of `target` (country, city, .. name) - `target` identifier: (product ID, store ID, ...) - numerical: - population of `target`'s country/market area (assuming it stays constant over the forecasting horizon) - average temperature of `target`'s region (assuming it stays constant over the forecasting horizon) - + Temporal attributes are powerful because they are known in advance and can help models capture trends and / or seasonal patterns of the `target` series. 
Static attributes are powerful when working with multiple `targets` (either multiple `TimeSeries`, or multivariate series containing multiple dimensions each). The time independent information can help models identify the nature/environment of the underlying series and improve forecasts across different `targets`. @@ -148,8 +148,8 @@ GFMs are models that can be trained on multiple target (and covariate) time seri | [NHiTSModel](https://unit8co.github.io/darts/generated_api/darts.models.forecasting.nhits.html#darts.models.forecasting.nhits.NHiTSModel) | βœ… | | | | [TCNModel](https://unit8co.github.io/darts/generated_api/darts.models.forecasting.tcn_model.html#darts.models.forecasting.tcn_model.TCNModel) | βœ… | | | | [TransformerModel](https://unit8co.github.io/darts/generated_api/darts.models.forecasting.transformer_model.html#darts.models.forecasting.transformer_model.TransformerModel) | βœ… | | | -| [TFTModel](https://unit8co.github.io/darts/generated_api/darts.models.forecasting.tft_model.html#darts.models.forecasting.tft_model.TFTModel) | βœ… | βœ… | βœ… | -| [DLinearModel](https://unit8co.github.io/darts/generated_api/darts.models.forecasting.dlinear.html#darts.models.forecasting.dlinear.DLinearModel) | βœ… | βœ… | βœ… | +| [TFTModel](https://unit8co.github.io/darts/generated_api/darts.models.forecasting.tft_model.html#darts.models.forecasting.tft_model.TFTModel) | βœ… | βœ… | βœ… | +| [DLinearModel](https://unit8co.github.io/darts/generated_api/darts.models.forecasting.dlinear.html#darts.models.forecasting.dlinear.DLinearModel) | βœ… | βœ… | βœ… | | [NLinearModel](https://unit8co.github.io/darts/generated_api/darts.models.forecasting.nlinear.html#darts.models.forecasting.nlinear.NLinearModel) | βœ… | βœ… | βœ… | | [TiDEModel](https://unit8co.github.io/darts/generated_api/darts.models.forecasting.tide_model.html#darts.models.forecasting.tide_model.TiDEModel) | βœ… | βœ… | βœ… | | [TSMixerModel](https://unit8co.github.io/darts/generated_api/darts.models.forecasting.tsmixer_model.html#darts.models.forecasting.tsmixer_model.TSMixerModel) | βœ… | βœ… | βœ… | diff --git a/docs/userguide/forecasting_overview.md b/docs/userguide/forecasting_overview.md index b56ad4b568..64484d6922 100644 --- a/docs/userguide/forecasting_overview.md +++ b/docs/userguide/forecasting_overview.md @@ -15,7 +15,7 @@ by calling the `fit()` function, and finally they are used to obtain one or seve from darts.models import NaiveSeasonal naive_model = NaiveSeasonal(K=1) # init -naive_model.fit(train) # fit +naive_model.fit(train) # fit naive_forecast = naive_model.predict(n=36) # predict ``` @@ -111,7 +111,7 @@ These models are shown with a "βœ…" under the `Multivariate` column on the [mode ## Handling multiple series Some models support being fit on multiple time series. To do this, it is enough to simply provide a Python `Sequence` of `TimeSeries` (for instance a list of `TimeSeries`) to `fit()`. When a model is fit this way, the `predict()` function will expect the argument `series` to be set, containing -one or several `TimeSeries` (i.e., a single or a `Sequence` of `TimeSeries`) that need to be forecasted. +one or several `TimeSeries` (i.e., a single or a `Sequence` of `TimeSeries`) that need to be forecasted. The advantage of training on multiple series is that a single model can be exposed to more patterns occurring across all series in the training dataset. That can often be beneficial, especially for larger models with more capacity. 
In turn, the advantage of having `predict()` providing forecasts for potentially several series at once is that the computation can often be batched and vectorized across the multiple series, which is computationally faster than calling `predict()` multiple times on isolated series. @@ -178,9 +178,9 @@ pred.plot(label='forecast') ![Exponential Smoothing](./images/probabilistic/example_ets.png) ### Probabilistic neural networks -All neural networks (torch-based models) in Darts have a rich support to estimate different kinds of probability distributions. -When creating the model, it is possible to provide one of the *likelihood models* available in [darts.utils.likelihood_models](https://unit8co.github.io/darts/generated_api/darts.utils.likelihood_models.html), which determine the distribution that will be estimated by the model. -In such cases, the model will output the parameters of the distribution, and it will be trained by minimising the negative log-likelihood of the training samples. +All neural networks (torch-based models) in Darts have a rich support to estimate different kinds of probability distributions. +When creating the model, it is possible to provide one of the *likelihood models* available in [darts.utils.likelihood_models](https://unit8co.github.io/darts/generated_api/darts.utils.likelihood_models.html), which determine the distribution that will be estimated by the model. +In such cases, the model will output the parameters of the distribution, and it will be trained by minimising the negative log-likelihood of the training samples. Most of the likelihood models also support prior values for the distribution's parameters, in which case the training loss is regularized by a Kullback-Leibler divergence term pushing the resulting distribution in the direction of the distribution specified by the prior parameters. The strength of this regularization term can also be specified when creating the likelihood model object. 
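The "Handling multiple series" passage edited above explains that models accept a Python `Sequence` of `TimeSeries` in `fit()`, after which `predict()` expects a `series` argument naming what to forecast. A small sketch under those assumptions (toy data; the model choice is arbitrary and not part of this patch):

```python
import numpy as np
from darts import TimeSeries
from darts.models import LinearRegressionModel

# Three toy univariate series; the values are arbitrary.
series_list = [
    TimeSeries.from_values(np.sin(np.linspace(0, 10, 100)) + i) for i in range(3)
]

model = LinearRegressionModel(lags=12)
model.fit(series_list)  # train on a Sequence of TimeSeries

# `series` tells the model what to forecast; one forecast per input series
# is returned, and the computation can be batched across series.
preds = model.predict(n=6, series=series_list)
```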
@@ -201,7 +201,7 @@ train = scaler.fit_transform(train) val = scaler.transform(val) series = scaler.transform(series) -model = TCNModel(input_chunk_length=30, +model = TCNModel(input_chunk_length=30, output_chunk_length=12, likelihood=LaplaceLikelihood(prior_b=0.1)) model.fit(train, epochs=400) @@ -232,7 +232,7 @@ train = scaler.fit_transform(train) val = scaler.transform(val) series = scaler.transform(series) -model = TCNModel(input_chunk_length=30, +model = TCNModel(input_chunk_length=30, output_chunk_length=12, likelihood=QuantileRegression(quantiles=[0.01, 0.05, 0.2, 0.5, 0.8, 0.95, 0.99])) model.fit(train, epochs=400) @@ -291,8 +291,8 @@ from darts.models import LinearRegressionModel series = AirPassengersDataset().load() train, val = series[:-36], series[-36:] -model = LinearRegressionModel(lags=30, - likelihood="quantile", +model = LinearRegressionModel(lags=30, + likelihood="quantile", quantiles=[0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95]) model.fit(train) pred = model.predict(n=36, num_samples=500) @@ -304,4 +304,4 @@ pred.plot(label='forecast') ![quantile linear regression](./images/probabilistic/example_linreg_quantile.png) -[1] Yarin Gal, Zoubin Ghahramani, ["Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning"](https://arxiv.org/abs/1506.02142) \ No newline at end of file +[1] Yarin Gal, Zoubin Ghahramani, ["Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning"](https://arxiv.org/abs/1506.02142) diff --git a/docs/userguide/gpu_and_tpu_usage.md b/docs/userguide/gpu_and_tpu_usage.md index 5585a84534..89e6f2b198 100644 --- a/docs/userguide/gpu_and_tpu_usage.md +++ b/docs/userguide/gpu_and_tpu_usage.md @@ -66,9 +66,9 @@ IPU available: False, using: 0 IPUs | Name | Type | Params -------------------------------------- -0 | criterion | MSELoss | 0 -1 | rnn | RNN | 460 -2 | V | Linear | 21 +0 | criterion | MSELoss | 0 +1 | rnn | RNN | 460 +2 | V | Linear | 21 -------------------------------------- 481 Trainable params 0 Non-trainable params @@ -105,9 +105,9 @@ LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0] | Name | Type | Params -------------------------------------- -0 | criterion | MSELoss | 0 -1 | rnn | RNN | 460 -2 | V | Linear | 21 +0 | criterion | MSELoss | 0 +1 | rnn | RNN | 460 +2 | V | Linear | 21 -------------------------------------- 481 Trainable params 0 Non-trainable params @@ -122,11 +122,11 @@ From the output we can see that the GPU is both available and used. The rest of ### Multi GPU support -Darts utilizes [Lightning's multi GPU capabilities](https://pytorch-lightning.readthedocs.io/en/stable/accelerators/gpu_intermediate.html) to be able to capitalize on scalable hardware. +Darts utilizes [Lightning's multi GPU capabilities](https://pytorch-lightning.readthedocs.io/en/stable/accelerators/gpu_intermediate.html) to be able to capitalize on scalable hardware. -Multiple parallelization strategies exist for multiple GPU training, which - because of different strategies for multiprocessing and data handling - interact strongly with the execution environment. +Multiple parallelization strategies exist for multiple GPU training, which - because of different strategies for multiprocessing and data handling - interact strongly with the execution environment. -Currently in Darts the `ddp_spawn` distribution strategy is tested. +Currently in Darts the `ddp_spawn` distribution strategy is tested. 
As per the description of the [Lightning documentation](https://pytorch-lightning.readthedocs.io/en/stable/accelerators/gpu_intermediate.html#distributed-data-parallel-spawn) has some noteworthy limitations, eg. it __can not run__ in: @@ -156,7 +156,7 @@ The `ddp` family of strategies creates indiviual subprocesses for each GPU, so c "Dataloader(num_workers=N), where N is large, bottlenecks training with DDP… ie: it will be VERY slow or won’t work at all. This is a PyTorch limitation." -Usage of other distribution strategies with Darts currently _might_ very well work, but are yet untested and subject to individual setup / experimentation. +Usage of other distribution strategies with Darts currently _might_ very well work, but are yet untested and subject to individual setup / experimentation. ## Use a TPU @@ -197,9 +197,9 @@ IPU available: False, using: 0 IPUs | Name | Type | Params -------------------------------------- -0 | criterion | MSELoss | 0 -1 | rnn | RNN | 460 -2 | V | Linear | 21 +0 | criterion | MSELoss | 0 +1 | rnn | RNN | 460 +2 | V | Linear | 21 -------------------------------------- 481 Trainable params 0 Non-trainable params diff --git a/docs/userguide/hyperparameter_optimization.md b/docs/userguide/hyperparameter_optimization.md index bfd659000b..c5c995b79c 100644 --- a/docs/userguide/hyperparameter_optimization.md +++ b/docs/userguide/hyperparameter_optimization.md @@ -65,7 +65,7 @@ def objective(trial): num_workers = 4 else: num_workers = 0 - + pl_trainer_kwargs = { "accelerator": "auto", "callbacks": callbacks, @@ -80,7 +80,7 @@ def objective(trial): # reproducibility torch.manual_seed(42) - + # build the TCN model model = TCNModel( input_chunk_length=in_len, @@ -101,8 +101,8 @@ def objective(trial): force_reset=True, save_checkpoints=True, ) - - + + # when validating during training, we can use a slightly longer validation # set which also contains the first input_chunk_length time steps model_val_set = scaler.transform(series[-(VAL_LEN + in_len) :]) @@ -116,7 +116,7 @@ def objective(trial): # reload best model over course of training model = TCNModel.load_from_checkpoint("tcn_model") - + # Evaluate how good it is on the validation set, using sMAPE preds = model.predict(series=train, n=VAL_LEN) smapes = smape(val, preds, n_jobs=-1, verbose=True) @@ -140,7 +140,7 @@ if __name__ == "__main__": ## Hyperparameter optimization with Ray Tune [Ray Tune](https://docs.ray.io/en/latest/tune/examples/tune-pytorch-lightning.html) is another option for hyperparameter optimization with automatic pruning. -Here is an example of how to use Ray Tune to with the `NBEATSModel` model using the [Asynchronous Hyperband scheduler](https://blog.ml.cmu.edu/2018/12/12/massively-parallel-hyperparameter-optimization/). +Here is an example of how to use Ray Tune to with the `NBEATSModel` model using the [Asynchronous Hyperband scheduler](https://blog.ml.cmu.edu/2018/12/12/massively-parallel-hyperparameter-optimization/). ```python import pandas as pd @@ -224,13 +224,13 @@ train_fn_with_parameters = tune.with_parameters( train_model, callbacks=[my_stopper, tune_callback], train=train, val=val, ) -# optimize hyperparameters by minimizing the MAPE on the validation set +# optimize hyperparameters by minimizing the MAPE on the validation set analysis = tune.run( train_fn_with_parameters, resources_per_trial=resources_per_trial, - # Using a metric instead of loss allows for + # Using a metric instead of loss allows for # comparison between different likelihood or loss functions. 
- metric="MAPE", # any value in TuneReportCallback. + metric="MAPE", # any value in TuneReportCallback. mode="min", config=config, num_samples=num_samples, diff --git a/docs/userguide/timeseries.md b/docs/userguide/timeseries.md index 7faeb66234..0027290b82 100644 --- a/docs/userguide/timeseries.md +++ b/docs/userguide/timeseries.md @@ -19,7 +19,7 @@ We distinguish univariate from multivariate series: Sometimes the dimensions are called *components*. A single `TimeSeries` object can be either univariate (if it has a single component), or multivariate (if it has multiple components). In a multivariate series, all components share the same time axis. I.e., they all share the same time stamps. -Some models in Darts (and all machine learning models) support multivariate series. This means that they can take multivariate series in inputs (either as targets or as covariates), and the forecasts they produce will have a dimensionality matching that of the targets. +Some models in Darts (and all machine learning models) support multivariate series. This means that they can take multivariate series in inputs (either as targets or as covariates), and the forecasts they produce will have a dimensionality matching that of the targets. In addition, some models can work on *multiple time series*, meaning that they can be trained on multiple `TimeSeries` objects, and used to forecasts multiple `TimeSeries` objects in one go. This is sometimes referred to as panel data. In such cases, the different `TimeSeries` need not share the same time index -- for instance, some series might be in 1990 and others in 2000. In fact, the series need not even have the same frequency. The models handling multiple series expect Python `Sequence`s of `TimeSeries` in inputs (for example, a simple list of `TimeSeries`). diff --git a/docs/userguide/torch_forecasting_models.md b/docs/userguide/torch_forecasting_models.md index 662bc4bc66..5edd7a34b3 100644 --- a/docs/userguide/torch_forecasting_models.md +++ b/docs/userguide/torch_forecasting_models.md @@ -327,7 +327,7 @@ loaded_model.to_cpu() To re-train or fine-tune a model using a different optimizer and/or learning rate scheduler, you can load the weights from the automatic checkpoints into a new model: ```python -# model with identical architecture but different optimizer (default: torch.optim.Adam) +# model with identical architecture but different optimizer (default: torch.optim.Adam) model_finetune = SomeTorchForecastingModel(..., # use identical parameters & values as in original model optimizer_cls=torch.optim.SGD, optimizer_kwargs={"lr": 0.001}) @@ -366,8 +366,8 @@ The code is triggered once the process execution reaches the corresponding hooks Some useful predefined PyTorch Lightning callbacks can be found [here](https://lightning.ai/docs/pytorch/stable/extensions/callbacks.html#built-in-callbacks). #### Example with Early Stopping -Early stopping is an efficient way to avoid overfitting and reduce training time. -It will exit the training process once the validation loss has not significantly improved over some epochs. +Early stopping is an efficient way to avoid overfitting and reduce training time. +It will exit the training process once the validation loss has not significantly improved over some epochs. 
You can use Early Stopping with any `TorchForecastingModel`, leveraging PyTorch Lightning's [EarlyStopping](https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.callbacks.EarlyStopping.html#lightning.pytorch.callbacks.EarlyStopping) callback: ```python @@ -568,5 +568,3 @@ We train two models; `NBEATSModel` and `TFTModel`, with default parameters and ` | `TFTModel` | Energy | 32 | yes | 1024 | 0 | 41s | | `TFTModel` | Energy | 32 | yes | 1024 | 2 | 31s | | `TFTModel` | Energy | 32 | yes | 1024 | 4 | 31s | - - diff --git a/gradlew b/gradlew index fbd7c51583..4f906e0c81 100755 --- a/gradlew +++ b/gradlew @@ -130,7 +130,7 @@ fi if [ "$cygwin" = "true" -o "$msys" = "true" ] ; then APP_HOME=`cygpath --path --mixed "$APP_HOME"` CLASSPATH=`cygpath --path --mixed "$CLASSPATH"` - + JAVACMD=`cygpath --unix "$JAVACMD"` # We build the pattern for arguments to be converted via cygpath diff --git a/pyproject.toml b/pyproject.toml index e023217621..708780cf2b 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -52,4 +52,4 @@ convention = "google" [tool.ruff.lint.mccabe] # Unlike Flake8, default to a complexity level of 10. -max-complexity = 10 \ No newline at end of file +max-complexity = 10
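As a closing illustration of one headline feature from the changelog above — `output_chunk_shift` support in all global models — here is a hedged sketch. The model choice and all parameter values are assumptions, and `n` is kept within `output_chunk_length` since shifted outputs rule out autoregression:

```python
import numpy as np
from darts import TimeSeries
from darts.models import LinearRegressionModel

series = TimeSeries.from_values(np.arange(100, dtype=float))

# Forecasts are shifted 3 steps into the future: no predictions are produced
# for the gap between the series end and the first forecasted point.
model = LinearRegressionModel(lags=12, output_chunk_length=6, output_chunk_shift=3)
model.fit(series)

# The first forecasted point lands output_chunk_shift steps later than usual.
pred = model.predict(n=6)
```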