A new matcher representation for use in `parse_tt` #95555

`parse_tt` currently traverses a `&[TokenTree]` to do matching. But this is a bad representation for the traversal. - `TokenTree` is nested, and there's a bunch of expensive and fiddly state required to handle entering and exiting nested submatchers. - There are three positions (sequence separators, sequence Kleene ops, and end of the matcher) that are represented by an index that exceeds the end of the `&[TokenTree]`, which is clumsy and error-prone. This commit introduces a new representation called `MatcherLoc` that is designed specifically for matching. It fixes all the above problems, making the code much easier to read. A `&[TokenTree]` is converted to a `&[MatcherLoc]` before matching begins. Despite the cost of the conversion, it's still a net performance win, because various pieces of traversal state are computed once up-front, rather than having to be recomputed repeatedly during the macro matching. Some improvements worth noting. - `parse_tt_inner` is *much* easier to read. No more having to compare `idx` against `len` and read comments to understand what the result means. - The handling of `Delimited` in `parse_tt_inner` is now trivial. - The three end-of-sequence cases in `parse_tt_inner` are now handled in three separate match arms, and the control flow is much simpler. - `nameize` is no longer recursive. - There were two places that issued "missing fragment specifier" errors: one in `parse_tt_inner()`, and one in `nameize()`. Presumably the latter was never executed. There's now a single place issuing these errors, in `compute_locs()`. - The number of heap allocations done for a `check full` build of `async-std-1.10.0` (an extreme example of heavy macro use) drops from 11.8M to 2.6M, and most of these occur outside of macro matching. - The size of `MatcherPos` drops from 64 bytes to 16 bytes. Small enough that it no longer needs boxing, which partly accounts for the reduction in allocations. - The rest of the drop in allocations is due to the removal of `MatcherKind`, because we no longer need to record anything for the parent matcher when entering a submatcher. - Overall it reduces code size by 45 lines.

To match the order the variants are declared in.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A new matcher representation for use in `parse_tt` #95555

A new matcher representation for use in `parse_tt` #95555

Commits on Apr 4, 2022

A new matcher representation for use in parse_tt #95555

A new matcher representation for use in parse_tt #95555

Commits on Apr 4, 2022

A new matcher representation for use in `parse_tt` #95555

A new matcher representation for use in `parse_tt` #95555