Total axes #118

zschira · 2023-08-17T18:23:58Z

This PR changes the extractor behavior so missing Axes will be treated as Totals.
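As a rough illustration of the behavior described above (not the extractor's actual code), the idea can be sketched in plain Python. The function name, the axis names, and the dict-based context are all hypothetical; the lowercase `"total"` label follows the "Lowercase 'Total' for consistency with PUDL" commit in this PR.

```python
# Hypothetical sketch: none of these names come from ferc_xbrl_extractor.
TOTAL = "total"  # lowercase label, per a later commit in this PR


def fill_missing_axes(context_dims: dict, table_axes: list) -> dict:
    """Return one member per table axis, using TOTAL for axes the context omits."""
    return {axis: context_dims.get(axis, TOTAL) for axis in table_axes}


# Assumed axis names, for illustration only.
axes = ["utility_type_axis", "plant_status_axis"]
row_key = fill_missing_axes({"utility_type_axis": "electric"}, axes)
print(row_key)
```

Under this reading, a fact reported with no value for an axis is kept and labeled as the total across that axis, rather than being dropped for lacking a full set of dimensions.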

codecov · 2023-08-17T18:24:32Z

Codecov Report

Patch coverage: 100.00% and project coverage change: +11.46% 🎉

Comparison is base (0572a99) 77.25% compared to head (19ae97c) 88.72%.

Additional details and impacted files

@@             Coverage Diff             @@
##             main     #118       +/-   ##
===========================================
+ Coverage   77.25%   88.72%   +11.46%     
===========================================
  Files           8        8               
  Lines         554      541       -13     
===========================================
+ Hits          428      480       +52     
+ Misses        126       61       -65

| Files Changed | Coverage Δ | |
|---|---|---|
| src/ferc_xbrl_extractor/arelle_interface.py | `100.00% <100.00%> (ø)` | |
| src/ferc_xbrl_extractor/cli.py | `77.27% <100.00%> (ø)` | |
| src/ferc_xbrl_extractor/datapackage.py | `98.52% <100.00%> (+10.41%)` | ⬆️ |
| src/ferc_xbrl_extractor/instance.py | `97.43% <100.00%> (+40.52%)` | ⬆️ |
| src/ferc_xbrl_extractor/taxonomy.py | `86.20% <100.00%> (ø)` | |


jdangerx

The only actually blocking thing is the question about whether dropna() is doing what we want it to do. But I would feel warm and fuzzy inside if you engaged with my various suggestions, too!

Broader questions I wanted to reiterate at the top level:

- Do we need to filter by context dimensions in get_facts? It might make sense to just grab all the facts. Then we can filter contexts by whether or not each context fits with the schema.primary_key, and avoid passing extra state around.
- Should we consider not smearing the concept names into their own columns? I'm sure there was some sort of decision made about this in the ancient times, but what was the rationale there?
- Should we have the output dataframes be indexed by schema.primary_key?
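To make the first and third questions concrete, here is a minimal plain-Python sketch of the alternative being floated: take every fact, keep only those whose context fits the table's axes, and key the resulting rows by those axes. Everything here is an assumption for illustration: `build_rows`, the tuple layout of a fact, and the use of axes alone as the key (the real schema.primary_key presumably carries more columns) are not taken from the extractor.

```python
from collections import defaultdict

TOTAL = "total"


def build_rows(facts, contexts, axes):
    """Group facts into rows keyed by axis members.

    facts:    iterable of (c_id, name, value) tuples (hypothetical layout)
    contexts: mapping of c_id -> {axis: member}
    axes:     the axis columns of the table's primary key
    """
    rows = defaultdict(dict)
    for c_id, name, value in facts:
        dims = contexts[c_id]
        if not set(dims) <= set(axes):
            continue  # context uses an axis this table does not have
        key = tuple(dims.get(axis, TOTAL) for axis in axes)
        rows[key][name] = value  # one column per concept name
    return dict(rows)


contexts = {
    "c1": {"utility_type_axis": "electric"},
    "c2": {},                       # no axes: treated as the total row
    "c3": {"other_axis": "x"},      # does not fit this table
}
facts = [
    ("c1", "revenue", 10),
    ("c2", "revenue", 25),
    ("c3", "revenue", 99),
    ("c1", "expenses", 4),
]
table = build_rows(facts, contexts, ["utility_type_axis"])
print(table)
```

Note that this sketch still spreads concept names into columns (the layout the second question asks about), and a repeated (key, name) pair would silently overwrite the earlier value.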

Finally, when I was testing the code I'm suggesting, I found a couple weird things that I needed to handle:

- empty files, obviously
- filings with duplicated ("c_id", "name") values in facts... that seemed surprising.
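One way to surface both edge cases before building a table is a small check like the following. It is only a sketch: `duplicated_fact_keys` and the `(c_id, name, value)` tuple layout are hypothetical stand-ins for however the facts are actually stored.

```python
from collections import Counter


def duplicated_fact_keys(facts):
    """Return the sorted (c_id, name) pairs that occur more than once in facts."""
    counts = Counter((c_id, name) for c_id, name, _value in facts)
    return sorted(key for key, n in counts.items() if n > 1)


sample_facts = [
    ("c1", "revenue", 10),
    ("c1", "revenue", 12),   # same context and concept reported twice
    ("c2", "revenue", 25),
]
print(duplicated_fact_keys(sample_facts))
print(duplicated_fact_keys([]))  # an empty filing yields no duplicates
```

Reporting the duplicated keys, rather than silently keeping the first or last value, leaves the decision about how to resolve them explicit.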

zschira added 2 commits August 16, 2023 17:26

Fix using filings from zip file

c80cbe9

Allow missing axes and treat as totals

373f0e8

zschira added 2 commits August 17, 2023 14:36

Downgrade pydantic and pandas

7188fad

Revert "Update to Pydantic 2.0; temporarily disable rstcheck."

c8a888a

This reverts commit bb67741.

zschira force-pushed the total_axes branch from c8327d9 to c8a888a Compare August 17, 2023 19:46

Cleanup create_dataframe comments

65dbaf1

zschira requested a review from jdangerx August 17, 2023 21:06

zschira marked this pull request as draft August 17, 2023 21:57

zschira added 2 commits August 18, 2023 11:11

Test instance parsing and dataframe construction

77b72aa

Add more test facts

93447c4

zschira marked this pull request as ready for review August 18, 2023 15:37

zschira added 2 commits August 18, 2023 11:47

Improve comments

8b811a9

Lowercase 'Total' for consistency with PUDL

ad9ed84

jdangerx requested changes Aug 18, 2023

Fix Totals in tests

19ae97c

jdangerx approved these changes Aug 18, 2023

Apply PR suggestions from @jdangerx

4de5756

zschira merged commit f37907f into main Aug 18, 2023
1 of 6 checks passed

zschira deleted the total_axes branch August 18, 2023 20:46

jdangerx mentioned this pull request Aug 21, 2023

Lost facts fix #113

Closed

zaneselvans linked an issue Aug 22, 2023 that may be closed by this pull request

FERC Form 1 Statement of Income Contains Holes catalyst-cooperative/pudl#2755

Closed

zschira restored the total_axes branch August 23, 2023 16:10
