
ENH: Add sparse top-down reconciliation via TopDownSparse #277

Merged · 9 commits into Nixtla:main on Aug 15, 2024

Conversation

@christophertitchen (Contributor) commented Jul 25, 2024:

This PR adds a top-down reconciliation class called TopDownSparse for sparse "summing" matrices, $S$, to significantly reduce the wall time for reconciling large hierarchical structures using disaggregation proportions, which now take $98 \%$ to $100 \%$ less time to reconcile.

TopDownSparse uses the _get_PW_matrices method to efficiently instantiate and allocate $P$ as a COO matrix and then convert it to a CSR matrix, which is much faster than using a LIL matrix for constructing a large sparse matrix. Although the sparsity structure of $P$ for top-down reconciliation means that a CSC matrix would outperform a CSR matrix for $SP\hat{y}_{h}$, it is not worth converting $S$ to a CSC matrix. As with #276, the "weight" matrix, $W$, is only instantiated and allocated if performing probabilistic reconciliation, not for scenarios where only point reconciliation is required.
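To illustrate the construction pattern described above, here is a minimal sketch (not the library's internal code) of assembling a top-down $P$ in COO format and converting it to CSR with SciPy. The variable names, sizes, and the assumption that the top node is the first series are illustrative only.

```python
import numpy as np
import scipy.sparse as sp

# Illustrative sizes: n total series, n_b bottom-level series.
n, n_b = 5, 3
prop = np.array([0.5, 0.3, 0.2])  # disaggregation proportions p_j (sum to 1)

# For top-down reconciliation, P is (n_b x n) with a single nonzero column:
# the column of the top node (assumed here to be series 0).
rows = np.arange(n_b)
cols = np.zeros(n_b, dtype=np.int64)
P = sp.coo_matrix((prop, (rows, cols)), shape=(n_b, n)).tocsr()

# Reconciled point forecasts would then be y_tilde = S @ (P @ y_hat).
```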

The same tests for TopDown in nbs/methods.ipynb are now evaluated on TopDownSparse as well, and the class supports all of the disaggregation methods of TopDown. This required a small change in _get_child_nodes to check for the presence of a sparse matrix if disaggregating using forecast proportions.

| Reference | Method | `method` | Formula |
| --- | --- | --- | --- |
| Gross and Sohl, 1990 | A | `average_proportions` | $p_j=\frac{1}{T} \sum_{t=1}^T \frac{y_{j, t}}{y_t}$ |
| Gross and Sohl, 1990 | F | `proportion_averages` | $p_j=\sum_{t=1}^T \frac{y_{j, t}}{T} \Big/ \sum_{t=1}^T \frac{y_t}{T}$ |
| Athanasopoulos, Ahmed, and Hyndman, 2009 | FP | `forecast_proportions` | $p_j=\prod_{\ell=0}^{K-1} \frac{\hat{y}_{j, h}^{(\ell)}}{\hat{S}_{j, h}^{(\ell+1)}}$ |
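For readers unfamiliar with methods A and F, a small sketch of the two historical-proportion formulas on synthetic data; the arrays `y_bottom` and `y_total` are hypothetical stand-ins, not part of the library's API.

```python
import numpy as np

rng = np.random.default_rng(0)
y_bottom = rng.uniform(1.0, 10.0, size=(4, 24))  # n_b = 4 bottom series, T = 24
y_total = y_bottom.sum(axis=0)                   # aggregate series y_t

# Method A (average_proportions): average of the per-period shares.
p_a = (y_bottom / y_total).mean(axis=1)

# Method F (proportion_averages): ratio of the average values.
p_f = y_bottom.mean(axis=1) / y_total.mean()

# Both sets of proportions sum to one.
assert np.isclose(p_a.sum(), 1.0) and np.isclose(p_f.sum(), 1.0)
```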

The benchmarks for `fit_predict` across a few large hierarchical datasets are below, measured on a machine with 6 cores / 12 threads at 2.6 GHz (dynamic frequency scaling up to 4.5 GHz) and 16 GB of RAM.

Note: $P$ is only used for methods A and F, not FP, so TopDownSparse gives no performance gain when disaggregating with forecast proportions, only with historical proportions.

Although the private and M5 datasets are not strictly hierarchical, this is not an issue for methods A and F because the proportions do not depend on the disaggregation pathways from the top node to the bottom nodes. It is a problem for method FP, however, which is invalid for these structures because there are multiple $p_j$ for each $j$, depending on the respective disaggregation pathway. There are practical ways to deal with this while accepting further information loss, such as creating a synthetic disaggregation pathway directly from the top node to the bottom nodes so that $p_j=\hat{y}_{j, h} / \hat{y}_{h}$, selecting a single disaggregation pathway for each $j$, or averaging the pathways, but that is beyond the scope of this PR.

Also note that the private dataset is interesting in that the number of aggregated time series exceeds the number of bottom-level time series, which is unusual in practical problems. This is not an issue here, but it becomes more consequential in min trace reconciliation, where inverting $S'W^{-1}S$ in the structural representation becomes more efficient than inverting $CWC'$ in the zero-constrained representation when $n_{a} > n_{b}$. That said, min trace reconciliation does not scale particularly well on large hierarchical structures anyway, apart from perhaps structural scaling.

| Dataset | $\textup{dim}\left( P \right)$ | $K$ | $T$ | $h$ | Method | TopDown | TopDownSparse | Reduction |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Private | $\small{\left( 22\, 705 \times 96\, 249 \right)}$ | $45$ | $36$ | $6$ | A | $59.5\, s$ | $14.3\, ms$ | $100 \%$ |
| Private | $\small{\left( 22\, 705 \times 96\, 249 \right)}$ | $45$ | $36$ | $6$ | F | $59.1\, s$ | $12.6\, ms$ | $100 \%$ |
| M5 | $\small{\left( 30\, 490 \times 42\, 840 \right)}$ | $12$ | $1\, 969$ | $28$ | A | $34.1\, s$ | $590\, ms$ | $98.3 \%$ |
| M5 | $\small{\left( 30\, 490 \times 42\, 840 \right)}$ | $12$ | $1\, 969$ | $28$ | F | $33.9\, s$ | $307\, ms$ | $99.1 \%$ |
| M5 | $\small{\left( 30\, 490 \times 42\, 840 \right)}$ | $12$ | $1\, 969$ | $7$ | A | $33.8\, s$ | $582\, ms$ | $98.3 \%$ |
| M5 | $\small{\left( 30\, 490 \times 42\, 840 \right)}$ | $12$ | $1\, 969$ | $7$ | F | $33.7\, s$ | $298\, ms$ | $99.1 \%$ |
| Tourism | $\small{\left( 304 \times 555 \right)}$ | $8$ | $228$ | $12$ | A | $2.09\, ms$ | $1.91\, ms$ | $8.6 \%$ |
| Tourism | $\small{\left( 304 \times 555 \right)}$ | $8$ | $228$ | $12$ | F | $1.41\, ms$ | $823\, \mu s$ | $41.6 \%$ |
| Tourism | n/a | $8$ | $228$ | $12$ | FP | $172\, ms$ | $173\, ms$ | $-0.6 \%$ |


@elephaint (Contributor) commented:

Great work - can you also maybe add a small test demonstrating output equivalence (up to dtype precision)?

@elephaint self-requested a review on July 25, 2024.
@christophertitchen (Contributor, Author) commented:

> Great work - can you also maybe add a small test demonstrating output equivalence (up to dtype precision)?

Thanks, Olivier! I just added a test to check for equivalence between TopDown and TopDownSparse to within the machine epsilon of a double, $2^{-52}$.
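For reference, a minimal sketch of what such an equivalence check can look like with NumPy; `y_dense` and `y_sparse` are stand-ins for the reconciled outputs of `TopDown` and `TopDownSparse`, and the tolerance is the double-precision machine epsilon, $2^{-52}$.

```python
import numpy as np

eps = np.finfo(np.float64).eps  # 2.0**-52, roughly 2.22e-16

# Stand-ins for the reconciled forecasts from TopDown and TopDownSparse.
y_dense = np.array([[10.0, 12.0], [6.0, 7.2], [4.0, 4.8]])
y_sparse = y_dense.copy()

# Assert element-wise agreement to within machine epsilon.
np.testing.assert_allclose(y_sparse, y_dense, rtol=0.0, atol=eps)
```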

@christophertitchen (Contributor, Author) commented:

> Great work - can you also maybe add a small test demonstrating output equivalence (up to dtype precision)?

Hi, @elephaint. Can you think of any more changes or is it OK to be merged now?

@elephaint (Contributor) commented Aug 14, 2024:

> Great work - can you also maybe add a small test demonstrating output equivalence (up to dtype precision)?

> Hi, @elephaint. Can you think of any more changes or is it OK to be merged now?

Thanks, and sorry for the late reply. Great work! I have one small comment; other than that, I think it's good to go.

Thanks also for adding the output equivalence test!

@elephaint (Contributor) left a review:

See small comment - might want to initialize P as a csr_matrix directly.
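As a purely illustrative aside (a sketch, not the PR's final code), a $P$ with one nonzero per row can also be built directly in CSR form from its `data`/`indices`/`indptr` arrays, skipping the COO intermediate; the sizes and the top-node column index are assumptions.

```python
import numpy as np
import scipy.sparse as sp

n, n_b = 5, 3
prop = np.array([0.5, 0.3, 0.2])

# One nonzero per row (the top node's column, assumed to be column 0),
# so indptr is simply 0..n_b and indices is all zeros.
P = sp.csr_matrix(
    (prop, np.zeros(n_b, dtype=np.int64), np.arange(n_b + 1, dtype=np.int64)),
    shape=(n_b, n),
)
```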

Review thread on hierarchicalforecast/methods.py (outdated, resolved).
@elephaint (Contributor) left a review:

Thanks, awesome! One minor thing, left a suggestion in the code.

Review thread on hierarchicalforecast/methods.py (outdated, resolved).
Co-authored-by: Olivier Sprangers <45119856+elephaint@users.noreply.github.com>
@elephaint (Contributor) left a review:

Great!

@elephaint merged commit 21229a3 into Nixtla:main on Aug 15, 2024.
18 checks passed