Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Generate expr method signatures, docs #3600

Open
wants to merge 76 commits into
base: main
Choose a base branch
from
Open

Conversation

dangotbanned
Copy link
Member

@dangotbanned dangotbanned commented Sep 19, 2024

Closes #3563

Description

This PR builds on the work of #3466, and provides accurate signatures for alt.expr methods.

Benefits

Originally listed in #3563 (comment)

  • Raising a TypeError at the time of expr definition
    • Rather than silently passing until validated within a ChartType
    • Importantly, this is within python and not javascript
  • Providing a meaningful traceback, identifying both the expr and the missing argument
  • Opens the door for defining default argument values
    • E.g. in waterfall_chart, there are repeat uses of "" for the elseValue
    • If this is a common case, we could use that as a default and it would be clear by the signature alone
  • The current docstrings - which refer to parameters by name - would be easier to understand
    • Right now, you'd need to refer to Vega Expressions to determine the order they must appear in
  • The change would not add any new requirements to currently valid specs
    • Therefore, suitable as a MINOR version feature

Screenshots

image
image
image

Implementation

In #3563 I proposed authoring these changes manually.
However, after revisiting a conversation I had with @jonmmease I thought it would be worth at least trying to generate the definitions.

I found that https://vega.github.io/vega/docs/expressions/ seemed to be the only place they are documented in a single place.
From my understanding, this document is manually maintained but updates occur infrequently.

There was enough consistency in the markup used to work out how various parameters aligned with python syntax https://docs.python.org/3/tutorial/controlflow.html#special-parameters.

I also used this opportunity to clean up the docs themselves, and get them closer to the rest of altair.
There is definitely more that could be explored on that front, and some of the changes here could be utilised to improve the existing generated docs.

Related

Tasks

Deferred

Trying my best to keep the scope of the PR focused.
Working on this sparked a lot of ideas, so I've collected those here to possibly explore in the future:

  • Inline code -> block code (comment)
  • Also parsing properties from expressions.md
    • Wouldn't be that much work to add, but not a great deal of value
    • We're missing MAX_VALUE, MIN_VALUE
  • tools.schemapi.vega_expr.py -> tools.vega_expr.py
    • The module is unrelated to the vega-lite schema
    • I think the review will be simpler if it stays here for now
  • Soft-deprecating camelCase expr methods, replace w/ snake_case
    • Pretty sure adding an _ExprMeta.__getattr__ would make this possible
    • The current names would remain accessible (but undocumented)

Preview

Prior to 5e75051 (#3600) there wasn't any generated source on this branch.
Keeping these screenshots for a quick referenceuntil I've resolved some test issues.

Screenshots

image

image

image

- Only the most common case
- no docs
- No method body
Currently just collects the pieces, but doesn't render the final str
Also renamed constant `EXPR_ANNOTATION` -> `INPUT_ANNOTATION`
- Contains all the functionality
- Needs a lot of tidying up
- Uses the same config as `indent_docstring`
- Can't truly be `numpydoc` though without a parameters section
There were 10+ cases of methods not being defined.
These are now all present

#3600 (comment)
altair/expr/__init__.py Outdated Show resolved Hide resolved
@dangotbanned
Copy link
Member Author

@mattijn following your feedback in #3536 (comment) I'm trying to write a helpful description but having a hard time, without either:

This is otherwise ready for review, so I wanted to ask ahead to see what content you'd like to see there

@dangotbanned dangotbanned changed the title feat: expr method signatures feat: Generate expr method signatures, docs Sep 27, 2024
@mattijn
Copy link
Contributor

mattijn commented Sep 27, 2024

Thanks for requesting a review. You can help me explaining how I can review this. Do I have to run write_expr_module(source_url="static", output=EXPR_FILE) myself or inspect expressions by importing altair.expr.xx? Thanks for the work here.

@dangotbanned
Copy link
Member Author

dangotbanned commented Sep 27, 2024

Thanks for requesting a review. You can help me explaining how I can review this. Do I have to run write_expr_module(source_url="static", output=EXPR_FILE) myself or inspect expressions by importing altair.expr.xx? Thanks for the work here.

@mattijn
So everything should work exactly the same as the current alt.expr.
If you check out this branch - the changes should be reflected in two ways:

  1. The signatures/docs displayed by our IDE when calling alt.expr.xxx()
  2. Python TypeError(s) if you try to call a method without enough arguments

The actual code generation you could call yourself, but this is already in the CI.
I suppose you could check the output is the same?

@dangotbanned dangotbanned marked this pull request as ready for review September 27, 2024 16:04
@mattijn
Copy link
Contributor

mattijn commented Sep 27, 2024

I get proper signatures when doing

import altair as alt

alt.expr.scale()
Expected 2 more positional arguments
(method) def scale(
    name: IntoExpression,
    value: IntoExpression,
    group: IntoExpression = None,
    /
) -> Expression
Applies the named scale transform (or projection) to the specified value.

The optional group argument takes a scenegraph group mark item to indicate the specific scope in which to look up the scale or projection.

But not when doing:

import altair.expr as ae

ae.scale()
"scale" is not a known attribute of module "altair.expr"
(function) scale: Unknown

See animated gif:
ScreenRecording2024-09-27at19 47 03-ezgif com-optimize

Is that something that is expected? Or was it naive to expect that something as such could work?

@dangotbanned
Copy link
Member Author

dangotbanned commented Sep 27, 2024

Is that something that is expected? Or was it naive to expect that something as such could work?

@mattijn I managed to get all of these working correctly:

# ruff: noqa: B018
import altair as alt

# import altair.expr as ae
from altair import expr
from altair import expr as ae
from altair.expr import expr as expr_cls

alt.expr.scale
expr.scale
ae.scale
expr_cls.scale

image

Edit

AFAIK this behavior isn't something I'd expect to have changed vs main - but would be interesting if so

Comment on lines +86 to +97
# NOTE: `altair` types (for annotations)
RETURN_WRAPPER: LiteralString = "FunctionExpression"
RETURN_ANNOTATION: LiteralString = "Expression"
"""
The annotation is intentionally *less* specific than the real type.

``Expression`` is shorter, while preserving all the user-facing functionality
"""

CONST_WRAPPER: LiteralString = "ConstExpression"
CLS_META: LiteralString = "_ExprMeta"
INPUT_ANNOTATION: LiteralString = "IntoExpression"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

Adding this comment to provide a link from #3616

@mattijn
Copy link
Contributor

mattijn commented Sep 28, 2024

Thanks! It works as expected, but that also means it surface the following issue:
This works

import polars as pl
import altair as alt
from altair import expr as ae

df = pl.DataFrame(
    {
        "x": [0.32, 0.86, 0.10, 0.30, 0.47, 0.0, 1.0, 0.91, 0.88, 0.12],
        "color": list("ACABBCABBA"),
    }
)

param_array = alt.param(name="my_array", value=["B", "A", "C"])
alt.Chart(df).mark_tick(thickness=10).encode(
    x=alt.X("x"), 
    color=alt.Color("color").scale(domain=param_array)
).add_params(param_array)
image

But this will not:

param_array = alt.param(name="my_array", value=["B", "A", "C"])
param_sort = alt.param(name="my_sorted_array", expr=ae.sort(param_array))
alt.Chart(df).mark_tick(thickness=10).encode(
    x=alt.X("x"), 
    color=alt.Color("color").scale(domain=param_sort)
).add_params(param_array, param_sort)
Javascript Error: Unrecognized function: sort
This usually means there's a typo in your chart specification. See the javascript console for the full traceback.

And while ae.sort(param_array) is now recognised as valid, since it is now auto-generated from the docs (great!), it is not yet released as part of the current accepted Vega version within Altair therefore raising an JavaScript error.

The compiled vega-lite specification looks correctly though:

{
  "config": {"view": {"continuousWidth": 300, "continuousHeight": 300}},
  "data": {"name": "data-1d2849971aacb1c06bb7718af005021c"},
  "mark": {"type": "tick", "thickness": 10},
  "encoding": {
    "color": {
      "field": "color",
      "scale": {"domain": {"expr": "my_sorted_array"}},
      "type": "nominal"
    },
    "x": {"field": "x", "type": "quantitative"}
  },
  "params": [
    {"name": "my_array", "value": ["B", "A", "C"]},
    {"name": "my_sorted_array", "expr": "sort(my_array)"}
  ],
  "$schema": "https://vega.github.io/schema/vega-lite/v5.20.1.json",
  "datasets": {
    "data-1d2849971aacb1c06bb7718af005021c": [
      {"x": 0.32, "color": "A"},
      {"x": 0.86, "color": "C"},
      {"x": 0.1, "color": "A"},
      {"x": 0.3, "color": "B"},
      {"x": 0.47, "color": "B"},
      {"x": 0, "color": "C"},
      {"x": 1, "color": "A"},
      {"x": 0.91, "color": "B"},
      {"x": 0.88, "color": "B"},
      {"x": 0.12, "color": "A"}
    ]
  }
}

Comment on lines 54 to 60
class Source(str, enum.Enum):
"""Enumerations for ``expressions.md`` source files."""

LIVE = "https://raw.githubusercontent.com/vega/vega/main/docs/docs/expressions.md"
STATIC = "https://raw.githubusercontent.com/vega/vega/fb2e60274071033b4c427410ef43375b6f314cf2/docs/docs/expressions.md"
OLD = "https://raw.githubusercontent.com/vega/vega/ff98519cce32b776a98d01dd982467d76fc9ee34/docs/docs/expressions.md"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Provided a manual solution in d75792b (#3600)

So far I've only come up with this as a source for automating:

https://github.com/vega/vega-lite/blob/c7697dc293b606ba60c7e374dbf9a77226f91225/package.json#L137-L138

@mattijn I guess I have two questions:

  1. Would manually syncing the (vega|vega-lite) versions seem reasonable to you?
  2. Do you know of any simpler sources for getting the corresponding vega minimum version for a given vega-lite release?

I'm unsure whether using package.json is a good/bad idea, it was just the first source I found

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Manually syncing is fine by me. Adding a note in this section that we should update the VEGA_VERSION by eg inspecting the file you found or just the latest released Vega version.

@dangotbanned
Copy link
Member Author

But this will not:

...

And while ae.sort(param_array) is now recognised as valid, since it is now auto-generated from the docs (great!), it is not yet released as part of the current accepted Vega version within Altair therefore raising an JavaScript error.

The compiled vega-lite specification looks correctly though:

{
  "config": {"view": {"continuousWidth": 300, "continuousHeight": 300}},
  "data": {"name": "data-1d2849971aacb1c06bb7718af005021c"},
  "mark": {"type": "tick", "thickness": 10},
  "encoding": {
    "color": {
      "field": "color",
      "scale": {"domain": {"expr": "my_sorted_array"}},
      "type": "nominal"
    },
    "x": {"field": "x", "type": "quantitative"}
  },
  "params": [
    {"name": "my_array", "value": ["B", "A", "C"]},
    {"name": "my_sorted_array", "expr": "sort(my_array)"}
  ],
  "$schema": "https://vega.github.io/schema/vega-lite/v5.20.1.json",
  "datasets": {
    "data-1d2849971aacb1c06bb7718af005021c": [
      {"x": 0.32, "color": "A"},
      {"x": 0.86, "color": "C"},
      {"x": 0.1, "color": "A"},
      {"x": 0.3, "color": "B"},
      {"x": 0.47, "color": "B"},
      {"x": 0, "color": "C"},
      {"x": 1, "color": "A"},
      {"x": 0.91, "color": "B"},
      {"x": 0.88, "color": "B"},
      {"x": 0.12, "color": "A"}
    ]
  }
}

Well spotted @mattijn thank you!

I'll follow this up in #3600 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve expr method signatures
2 participants