Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scatterplot not displaying negative values for ACE2 binding #1626

Closed
joverlee521 opened this issue Jan 19, 2023 · 6 comments · Fixed by #1655
Closed

Scatterplot not displaying negative values for ACE2 binding #1626

joverlee521 opened this issue Jan 19, 2023 · 6 comments · Fixed by #1655
Labels
bug Something isn't working

Comments

@joverlee521
Copy link
Contributor

The scatterplot view does not display the negative values for ACE2 binding:

Screen Shot 2023-01-19 at 2 16 14 PM

The negative values are displayed properly for other continuous color-bys such as Logistic Growth and Mutational Fitness:

Screen Shot 2023-01-19 at 2 31 57 PM

This was a bug reported/discovered during Nextstrain office hours on 2023-01-19.

@joverlee521 joverlee521 added the bug Something isn't working label Jan 19, 2023
@joverlee521
Copy link
Contributor Author

Looked at the underlying Auspice JSON and found that ACE2 binding field values are strings while the Logistic Growth and Mutational Fitness values are floats. The string values would cause the checks for the minX and minY values to not work as expected.

All continuous field values should probably be converted to floats within Auspice, but maybe this is also an issue in augur export?

@corneliusroemer
Copy link
Member

corneliusroemer commented Mar 22, 2023

Thanks for looking into this @joverlee521

Why do they end up as strings? In the nextclade_data_workflows/sars-cov-2 case:

  • in metadata.tsv they are not quoted, just bare floats
  • they are added in auspice_config as "continuous"
  • but export turns them into "ace2_binding": {"value": "-0"}

I presume it's similar in ncov.

However auspice knows that this trait is continuous, so it should always parse it as a number, and not treat it as string.

So I don't think this is an export bug. It's an Auspice bug.

Could we move this out of backlog? I think this would be quite important to fix.

@jameshadfield

Potentially related to #1644

@tsibley
Copy link
Member

tsibley commented Mar 22, 2023

All values in a CSV are strings, since CSV doesn't have data types. Types may be inferred upon parsing, but such inference is often fragile and prone to bugs. So I'd think if augur export knows a field should be numeric, then it should parse it as such and export it to JSON as such.

Auspice should maybe also force it to numeric regardless of its JSON data type, iff Auspice knows the field is supposed to be numeric.

@corneliusroemer
Copy link
Member

Do I understand you correctly @tsibley that you consider it both a bug in export (if export knows, it should use correct type for JSON - which has types) and a bug in Auspice (because export also knows what the type is from coloring metadata) but doesn't cast?

@jameshadfield
Copy link
Member

consider it both a bug in export ... and a bug in Auspice

I think this is fair. For instance, here are the types seen across the tree for the 6 continuous traits in the 21L dataset (used in the duplicate #1653 issue), so it looks like augur is exporting as all strings or all numbers - perhaps conditional on the presence of negative values?

image

I'll add a fix in auspice for this now, which may or may not be enough to fix this particular scatterplot bug.

jameshadfield added a commit that referenced this issue Mar 22, 2023
Conceptually these values only make sense if they are numeric.
Previously some implicit casting was happening, but this resulted in
bugs such as negative values not being displayed in the scatterplot.

Note one slight side-effect which may be considered a regression:
previously string values which were not numbers (e.g. "abc" but not
"1.0") would show up in the hover/click info-boxes. Now such values will
be changed to `undefined` and therefore not shown in those info-boxes.

Timing of the changes introduced here (AppleM1, Firefox 111):
* nextclade/sars-cov-2/21L   7-9ms
* ncov/gisaid/global/6m      4-10ms
* zika                       2ms

Closes #1626
@tsibley
Copy link
Member

tsibley commented Mar 23, 2023

@corneliusroemer Yep, that's right.

jameshadfield added a commit that referenced this issue Mar 27, 2023
Conceptually these values only make sense if they are numeric.
Previously some implicit casting was happening, but this resulted in
bugs such as negative values not being cast and thus not displayed in
the scatterplot.

Note one slight side-effect which may be considered a regression:
previously string values which were not numbers (e.g. "abc" but not
"1.0") would show up in the hover/click info-boxes. Now such values will
be changed to `undefined` and therefore not shown in those info-boxes.

Timing of the changes introduced here (AppleM1, Firefox 111):
* nextclade/sars-cov-2/21L   7-9ms
* ncov/gisaid/global/6m      4-10ms
* zika                       2ms

Closes #1626
jameshadfield added a commit that referenced this issue Jun 6, 2023
Conceptually these values only make sense if they are numeric.
Previously some implicit casting was happening, but this resulted in
bugs such as negative values not being cast and thus not displayed in
the scatterplot.

Note one slight side-effect which may be considered a regression:
previously string values which were not numbers (e.g. "abc" but not
"1.0") would show up in the hover/click info-boxes. Now such values will
be changed to `undefined` and therefore not shown in those info-boxes.

Timing of the changes introduced here (AppleM1, Firefox 111):
* nextclade/sars-cov-2/21L   7-9ms
* ncov/gisaid/global/6m      4-10ms
* zika                       2ms

Closes #1626
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

4 participants