A new matcher representation for use in parse_tt #95555

Merged 2 commits on Apr 4, 2022

Commits on Apr 4, 2022

  1. A new matcher representation for use in parse_tt.

    `parse_tt` currently traverses a `&[TokenTree]` to do matching. But this
    is a bad representation for the traversal.
    - `TokenTree` is nested, and there's a bunch of expensive and fiddly
      state required to handle entering and exiting nested submatchers.
    - There are three positions (sequence separators, sequence Kleene ops,
      and end of the matcher) that are represented by an index that exceeds
      the end of the `&[TokenTree]`, which is clumsy and error-prone.
    
    This commit introduces a new representation called `MatcherLoc` that is
    designed specifically for matching. It fixes all the above problems,
    making the code much easier to read. A `&[TokenTree]` is converted to a
    `&[MatcherLoc]` before matching begins. Despite the cost of the
    conversion, it's still a net performance win, because various pieces of
    traversal state are computed once up-front, rather than having to be
    recomputed repeatedly during the macro matching.
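
The flattening step described above can be sketched roughly as follows. This is a simplified illustration, not the rustc code: tokens are reduced to `char`, the `TokenTree` and `MatcherLoc` definitions are cut-down stand-ins, and the helper `flatten` plus the fields `idx_first` and `idx_first_after` are assumptions made for this example. The point is that nesting is resolved once, and each of the three formerly "past the end" positions (separator, Kleene op, end of matcher) gets its own entry in the flat slice.

```rust
// Simplified stand-ins for illustration only; NOT the rustc definitions.
#[derive(Debug, Clone, PartialEq)]
enum TokenTree {
    Token(char),
    MetaVarDecl(&'static str),
    Delimited { open: char, close: char, inner: Vec<TokenTree> },
    Sequence { body: Vec<TokenTree>, separator: Option<char> },
}

#[derive(Debug, Clone, PartialEq)]
enum MatcherLoc {
    Token(char),
    MetaVarDecl(&'static str),
    // Start of a sequence; records where matching resumes if it is skipped.
    Sequence { idx_first_after: usize },
    // The three end-of-sequence positions, each with its own entry.
    SequenceSep(char),
    SequenceKleeneOpAfterSep { idx_first: usize },
    SequenceKleeneOpNoSep { idx_first: usize },
    // End of the whole matcher.
    Eof,
}

// Hypothetical helper: append the flattened form of `tts` to `locs`.
fn flatten(tts: &[TokenTree], locs: &mut Vec<MatcherLoc>) {
    for tt in tts {
        match tt {
            TokenTree::Token(c) => locs.push(MatcherLoc::Token(*c)),
            TokenTree::MetaVarDecl(name) => locs.push(MatcherLoc::MetaVarDecl(*name)),
            TokenTree::Delimited { open, close, inner } => {
                // Nesting disappears: delimiters become ordinary tokens.
                locs.push(MatcherLoc::Token(*open));
                flatten(inner, locs);
                locs.push(MatcherLoc::Token(*close));
            }
            TokenTree::Sequence { body, separator } => {
                let start = locs.len();
                // Placeholder; patched once the end of the sequence is known.
                locs.push(MatcherLoc::Sequence { idx_first_after: 0 });
                let idx_first = start + 1;
                flatten(body, locs);
                match separator {
                    Some(sep) => {
                        locs.push(MatcherLoc::SequenceSep(*sep));
                        locs.push(MatcherLoc::SequenceKleeneOpAfterSep { idx_first });
                    }
                    None => locs.push(MatcherLoc::SequenceKleeneOpNoSep { idx_first }),
                }
                let idx_first_after = locs.len();
                locs[start] = MatcherLoc::Sequence { idx_first_after };
            }
        }
    }
}

// Convert a nested matcher into a flat slice, computed once up-front.
fn compute_locs(matcher: &[TokenTree]) -> Vec<MatcherLoc> {
    let mut locs = Vec::new();
    flatten(matcher, &mut locs);
    locs.push(MatcherLoc::Eof);
    locs
}
```

In this sketch a matcher shaped like `$( $x:ident ),* ;` flattens to six entries: the sequence start, the metavariable declaration, the separator, the Kleene op, the `;` token, and `Eof`. A position in the matcher is then just an index into that vector.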
    
    Some improvements worth noting:
    - `parse_tt_inner` is *much* easier to read. No more having to compare
      `idx` against `len` and read comments to understand what the result
      means.
    - The handling of `Delimited` in `parse_tt_inner` is now trivial.
    - The three end-of-sequence cases in `parse_tt_inner` are now handled in
      three separate match arms, and the control flow is much simpler.
    - `nameize` is no longer recursive.
    - There were two places that issued "missing fragment specifier" errors:
      one in `parse_tt_inner()`, and one in `nameize()`. Presumably the
      latter was never executed. There's now a single place issuing these
      errors, in `compute_locs()`.
    - The number of heap allocations done for a `check full` build of
      `async-std-1.10.0` (an extreme example of heavy macro use) drops from
      11.8M to 2.6M, and most of these occur outside of macro matching.
    - The size of `MatcherPos` drops from 64 bytes to 16 bytes. That is small
      enough that it no longer needs boxing, which partly accounts for the
      reduction in allocations.
    - The rest of the drop in allocations is due to the removal of
      `MatcherKind`, because we no longer need to record anything for the
      parent matcher when entering a submatcher.
    - Overall it reduces code size by 45 lines.
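
To make the 16-byte figure concrete, here is one layout that would come to that size on a 64-bit target: an index into the flat matcher plus a single reference-counted pointer to the bindings collected so far. The field names, the `NamedMatches` alias, and the `advance` helper are assumptions for illustration; the commit message does not spell out the fields.

```rust
use std::mem::size_of;
use std::rc::Rc;

// Hypothetical stand-in for the bindings collected so far.
type NamedMatches = Vec<String>;

// Assumed shape: one index into the flat matcher slice and one shared
// pointer. With no per-parent state to carry, this is two machine words.
#[derive(Clone)]
struct MatcherPos {
    idx: usize,
    matches: Rc<NamedMatches>,
}

// Stepping to the next location copies two words and bumps a refcount;
// no new heap allocation is needed for the position itself.
fn advance(pos: &MatcherPos) -> MatcherPos {
    MatcherPos { idx: pos.idx + 1, matches: Rc::clone(&pos.matches) }
}

// Size of the sketch's position type, in bytes.
fn matcher_pos_size() -> usize {
    size_of::<MatcherPos>()
}
```

A value this small can be stored inline in a `Vec<MatcherPos>` rather than behind a `Box`, which is the kind of change that removes one allocation per position.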
    nnethercote committed Apr 4, 2022
    88f8fbc
  2. Reorder match arms in parse_tt_inner.

    To match the order the variants are declared in.
    nnethercote committed Apr 4, 2022
    0bd47e8