Misc core typescripting #3605

rbuels · 2023-03-22T22:07:19Z

These are some probably worthwhile core changes split out from the REST adapter PR #3503

better if blankStats() does not cast since TS won't check if all the necessary fields are filled when casting improve TS types for feature score stats add jsdoc comments about region stats types we use interbase coords correct stats tests

typescript mst-reflection.ts, might be a tiny bit better than leaving it js improve some ts types some config slot utility types, some better types in mst-reflection.ts StringEnumEditor does not need to use getPropertyType improve more typescripting

codecov · 2023-03-22T22:25:13Z

Codecov Report

Merging #3605 (dab7e28) into main (b766f03) will increase coverage by 0.02%.
The diff coverage is 86.56%.

❗ Current head dab7e28 differs from pull request most recent head 13e0871. Consider uploading reports for the commit 13e0871 to get more accurate results

@@            Coverage Diff             @@
##             main    #3605      +/-   ##
==========================================
+ Coverage   62.89%   62.91%   +0.02%     
==========================================
  Files         869      866       -3     
  Lines       30227    30253      +26     
  Branches     7280     7300      +20     
==========================================
+ Hits        19010    19035      +25     
+ Misses      11031    11030       -1     
- Partials      186      188       +2

Impacted Files	Coverage Δ
packages/core/configuration/configurationSchema.ts	`79.64% <ø> (ø)`
packages/core/configuration/configurationSlot.ts	`63.63% <ø> (ø)`
packages/core/util/types/index.ts	`70.96% <ø> (ø)`
...readsheetView/models/ColumnDataTypes/LocString.tsx	`12.12% <ø> (ø)`
...rc/SpreadsheetView/models/ColumnDataTypes/Text.tsx	`14.81% <ø> (ø)`
plugins/wiggle/src/WiggleRPC/rpcMethods.ts	`65.45% <0.00%> (ø)`
plugins/wiggle/src/util.ts	`71.79% <50.00%> (ø)`
packages/core/util/mst-reflection.ts	`61.53% <57.14%> (ø)`
packages/core/data_adapters/BaseAdapter.ts	`72.41% <72.72%> (-2.29%)`	⬇️
packages/core/util/stats.ts	`98.80% <97.43%> (-1.20%)`	⬇️
... and 3 more

... and 17 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

cmdcolin · 2023-03-24T14:58:34Z

I cherry-picked the misc stuff off this branch (1075f3d) and the remainder is the stats code essentially
I think it's possible that the stats thing is worth reviewing still. It seems like there is "conflation" of concerns where our "stats" code applies to "quantitative stats" for y scalebar, feature density stats for toggling when to display 'too much data' warnings, and then just the issue of maybe undefined's existing in e.g. a score field on just some features

If it helps to decouple these concerns, I think stats estimation could go through a different code path

cmdcolin · 2023-03-24T14:59:29Z

example where 'two different return types' are used for the single function, feature density on one but not the other, indicating the stats estimation vs quantitative stats situation https://github.com/GMOD/jbrowse-components/pull/3605/files#diff-9d5632c87842dea801ef5acbc3f15abac30b437769d9b1ace2fc6f86542a752eR254-R263

rbuels · 2023-03-24T16:18:04Z

Yes, with the current feature adapter API, we don't know exactly what kind of stats getRegionStats will return, so the quantitative stats and feature density stats have to use the same code path.

Having different kinds of stats estimation go through different code paths would necessitate some kind of change to the feature adapter API, probably something like getRegionScoreStats and getRegionFeatureDensityStats, is that what you had in mind?

cmdcolin · 2023-03-24T18:10:55Z

Having different kinds of stats estimation go through different code paths would necessitate some kind of change to the feature adapter API, probably something like getRegionScoreStats and getRegionFeatureDensityStats, is that what you had in mind?

currently there are 'estimateRegionsStats' used essentially for feature density, though it can also return number of bytes estimated also, and 'getMultiRegionStats' which is used by the quantitative tracks. could potentially rename to estimateMultiRegionStats to be more consistently named, and/or, use a more descriptive word than stats

rbuels · 2023-03-24T19:20:47Z

Ah, that's a problem. estimateRegionsStats is supposed to be a protected method of data adapters, but it looks like it's being called directly in quite a number of places now in main. I think that needs to change. Calling code isn't supposed to know how the data adapter is getting the stats, whether they are precomputed or what. All the "estimateStats" code in the main should probably be renamed to call getRegionStats, which in the case of the base adapter just delegates to estimate the region stats, but is meant to be overridden by adapters that might actually have precomputed stats, like bigwig.

But that's kind of a separate question. Again: in order to avoid the same code path being used for feature density, scores, and (later) histogram bins, it seems like we would need to change the adapter API. Is that what you are thinking?

cmdcolin · 2023-03-24T19:29:57Z

I'm open to changing the API, but also don't necessarily understand your proposal to e.g. rename estimateStats to getRegionStats because that would, in the current way things are done, make quantiative and feature density type stats have a single code path again which is what the current approach already IMO properly avoids

rbuels · 2023-03-24T20:07:10Z

The original design of the data adapters API had two public methods for stats: getRegionStats for single regions, and getMultiRegionStats for multiple regions. Those would return whatever stats were available, whether they were density, scores, histogram bins, or whatever. The estimateRegionStats method was an internal method of the base adapter that would fetch feature data and try to calculate some stats from it, and it was for use by data adapters that didn't have precomputed stats (which is a lot of them, but definitely not all). Looks like it wasn't marked protected, but it should have been. This design makes it possible to use any data adapter with any track type and have it pretty much work, which has always been part of the design of JBrowse.

Looks like about a year ago the estimateRegionsStats method started being used by calling code (that is, code outside of the data adapters). The problem with this is, this means that data adapters that have pre-computed feature stats, that don't estimate them, like NCList adapters, either won't have their pre-computed stats used, or they have to override the estimateRegionsStats method to return data that, despite the name of the method, is not actually estimated. So the method name then becomes inaccurate.

But let's focus on where we need to get to: we need a uniform data adapter API that lets (in theory) any data adapter by any track type. The method names need to be accurate for what they do.

And also, it sounds like you think it's important for quantitative stats (scores, stddev, etc) and density stats (featureDensity, bytes, precalculated histogram bins, etc) to be separated, so that the kind of stats that are being returned are easier for calling code to understand, is that right? Is the motivation more to make the calling code easier to write (avoid needing to use type guards or other runtime reflection to figure out what is in the returned data), or more for performance?

cmdcolin · 2023-03-29T22:59:42Z

@rbuels and I met today

we identified that there are (at least) two types of things that the word "stats" refers to, used in different contexts. as a result, we will try to distinguish these two usages by renaming some functions in BaseAdapter

quantitative stats

Quantitative stats are related to e.g. y-scalebars on the quantitative tracks. We propose rename functions like this

getRegionStats -> getRegionQuantitativeStats
getMultiRegionStats -> getMultiRegionQuantitativeStats

"feature density"-type stats

Feature density type stats are related to displaying the "Region too large, zoom in/force load" type messages. We propose renaming functions like this

estimateRegionsStats -> getMultiRegionFeatureDensityStats (alternative considered getRegionTooLargeStats)

Note 1. The getMultiRegionFeatureDensityStats can return feature density of byte size download estimate as a tagged type. this is currently what estimateRegionsStats already does.

Note 2. The singular method estimateRegionStats does not exist currently. Reason being is that we just call the 'multi-region' (estimateRegionsStats) version by default, even though only the first region of the multi-region set is looked at in the baseadapter's version. if we want to add the single region version, might be named getRegionFeatureDensityStats.

cmdcolin · 2023-03-29T23:03:16Z

note: also considered splitting feature density type stats into "byte range" stat estimate and "feature density" estimate, but then the display code would need to figure out how to decide which to call

the proposals there were that e.g. byte size estimate could be undefined by default, and if some calling code received undefined, it could further call feature density estimation. I proposed not doing this in favor of keeping "getMultiRegionFeatureDensityStats" able to return either a true feature density or a byte range estimate, but if there is a strong feeling for splitting it, let me know

rbuels added 2 commits March 22, 2023 15:02

more miscellaneous ts and types improvements

1763568

better if blankStats() does not cast since TS won't check if all the necessary fields are filled when casting improve TS types for feature score stats add jsdoc comments about region stats types we use interbase coords correct stats tests

more typescripting

dab7e28

typescript mst-reflection.ts, might be a tiny bit better than leaving it js improve some ts types some config slot utility types, some better types in mst-reflection.ts StringEnumEditor does not need to use getPropertyType improve more typescripting

github-actions bot added the needs label triage Needs a label to show in changelog (breaking, enhancement, bug, documentation, or internal) label Mar 22, 2023

rbuels added enhancement New feature or request housekeeping needs to be tidied up, is a drag on team performance and removed needs label triage Needs a label to show in changelog (breaking, enhancement, bug, documentation, or internal) enhancement New feature or request labels Mar 22, 2023

Merge origin/main

13e0871

github-actions bot added the needs label triage Needs a label to show in changelog (breaking, enhancement, bug, documentation, or internal) label Mar 24, 2023

cmdcolin mentioned this pull request Mar 30, 2023

Rename APIs for feature density stats and quantitative stats #3618

Merged

rbuels closed this Mar 30, 2023

cmdcolin deleted the misc_core_typescripting branch June 8, 2023 02:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Misc core typescripting #3605

Misc core typescripting #3605

rbuels commented Mar 22, 2023

codecov bot commented Mar 22, 2023 •

edited

Loading

cmdcolin commented Mar 24, 2023

cmdcolin commented Mar 24, 2023

rbuels commented Mar 24, 2023

cmdcolin commented Mar 24, 2023

rbuels commented Mar 24, 2023

cmdcolin commented Mar 24, 2023

rbuels commented Mar 24, 2023

cmdcolin commented Mar 29, 2023

cmdcolin commented Mar 29, 2023

Misc core typescripting #3605

Misc core typescripting #3605

Conversation

rbuels commented Mar 22, 2023

codecov bot commented Mar 22, 2023 • edited Loading

Codecov Report

cmdcolin commented Mar 24, 2023

cmdcolin commented Mar 24, 2023

rbuels commented Mar 24, 2023

cmdcolin commented Mar 24, 2023

rbuels commented Mar 24, 2023

cmdcolin commented Mar 24, 2023

rbuels commented Mar 24, 2023

cmdcolin commented Mar 29, 2023

quantitative stats

"feature density"-type stats

cmdcolin commented Mar 29, 2023

codecov bot commented Mar 22, 2023 •

edited

Loading