support combined injections #1378

Closed
the-mikedavis opened this issue Dec 26, 2021 · 7 comments
Labels
A-tree-sitter Area: Tree-sitter C-enhancement Category: Improvements

Comments

@the-mikedavis
Member

I think we chatted about this on Matrix a while back. I wanted to make an issue just to make sure it doesn't get lost.

There are a few grammars that would benefit from being able to combine the injections. Mostly it's templating languages like heex (#881) or tree-sitter-embedded-template (erb and some js templates), but also now git-diff (#1373) (which I wrote as a line-based grammar so it would work out-of-the-box without combined injections).

What are combined injections?...

From the tree-sitter docs:

injection.combined - indicates that all of the matching nodes in the tree should have their content parsed as one nested document

So you can write a query in a language's injections.scm that does a (#set! injection.combined), and all matching nodes will be parsed together.

As a practical example, the git-commit grammar would parse a document like this one:

foo

abc
# def
ghi

like so:

(source
  (subject)
  (message)
  (comment)
  (message))

And if you had an injections.scm like so:

((message) @injection.content
 (#set! injection.combined)
 (#set! injection.language "comment"))

Then the message nodes would be parsed as if they were one continuous message node spanning multiple lines. This ends up being important for template languages which usually have control flow spanning multiple nodes, like so:
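To make "parsed as one continuous node" concrete, here is a minimal std-only sketch of the text the injected parser would effectively see for the example above. It does not call tree-sitter; `combined_content` is a hypothetical helper, and the byte ranges for the two `message` nodes are assumed for illustration (the real grammar may place node boundaries differently).

```rust
use std::ops::Range;

/// Hypothetical helper: concatenate the text of every matched node, in order,
/// so the pieces read as a single nested document.
fn combined_content(source: &str, ranges: &[Range<usize>]) -> String {
    ranges.iter().map(|r| &source[r.clone()]).collect()
}

fn main() {
    let source = "foo\n\nabc\n# def\nghi\n";
    // Assumed byte ranges of the two (message) nodes: "abc\n" and "ghi\n".
    let message_ranges = [5..9, 15..19];

    let combined = combined_content(source, &message_ranges);
    // The comment line between the two messages is not part of the result.
    assert_eq!(combined, "abc\nghi\n");
    println!("{combined:?}");
}
```

Without `injection.combined`, each range would instead be handed to the injected language as its own separate document.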

<%= if true do %>
  <p>Hello, combined injections!</p>
<% end %>

Here the two directive nodes need to be combined for the contained do-end block to be parsed as a pair.
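A rough sketch of why the pairing matters, using plain string handling rather than a real parser: `directive_bodies` and `is_balanced` are hypothetical helpers, and the `do`/`end` check is a deliberately naive stand-in for what the Elixir grammar would do.

```rust
/// Hypothetical helper: collect the code inside each `<% ... %>` or
/// `<%= ... %>` directive of a template.
fn directive_bodies(template: &str) -> Vec<&str> {
    let mut bodies = Vec::new();
    let mut rest = template;
    while let Some(open) = rest.find("<%") {
        let after_open = &rest[open + 2..];
        let Some(close) = after_open.find("%>") else { break };
        // Drop the optional `=` output marker and surrounding whitespace.
        bodies.push(after_open[..close].trim_start_matches('=').trim());
        rest = &after_open[close + 2..];
    }
    bodies
}

/// Naive check: every `do` must be closed by a later `end`.
fn is_balanced(code: &str) -> bool {
    let mut depth = 0i32;
    for word in code.split_whitespace() {
        match word {
            "do" => depth += 1,
            "end" => {
                depth -= 1;
                if depth < 0 {
                    return false;
                }
            }
            _ => {}
        }
    }
    depth == 0
}

fn main() {
    let template = "<%= if true do %>\n  <p>Hello, combined injections!</p>\n<% end %>\n";
    let bodies = directive_bodies(template);

    // Parsed separately, neither directive is a complete block.
    assert!(bodies.iter().all(|body| !is_balanced(body)));
    // Parsed as one combined document, the `do` finds its `end`.
    assert!(is_balanced(&bodies.join("\n")));
}
```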


I'm interested in taking a stab at this but I don't really know where to begin. I suspect I probably don't have the Rust chops to take this on :P

@the-mikedavis the-mikedavis added the C-enhancement Category: Improvements label Dec 26, 2021
@kirawi kirawi added the A-tree-sitter Area: Tree-sitter label Dec 26, 2021
@archseer
Member

This stems from a hack in syntax.rs: the code was based on tree-sitter-highlight, which did not do incremental tree parsing, so the code was modified to incrementally parse and reuse the root layer, while injections are still parsed on the fly.

HighlightIterLayer::new() contains code that processes the combined injections:

helix/helix-core/src/syntax.rs

Lines 1111 to 1157 in a4641a8

// Process combined injections.
if let Some(combined_injections_query) = &config.combined_injections_query {
    let mut injections_by_pattern_index = vec![
        (None, Vec::new(), false);
        combined_injections_query
            .pattern_count()
    ];
    let matches = cursor.matches(
        combined_injections_query,
        tree.root_node(),
        RopeProvider(source),
    );
    for mat in matches {
        let entry = &mut injections_by_pattern_index[mat.pattern_index];
        let (language_name, content_node, include_children) =
            injection_for_match(
                config,
                combined_injections_query,
                &mat,
                source,
            );
        if language_name.is_some() {
            entry.0 = language_name;
        }
        if let Some(content_node) = content_node {
            entry.1.push(content_node);
        }
        entry.2 = include_children;
    }
    for (lang_name, content_nodes, includes_children) in
        injections_by_pattern_index
    {
        if let (Some(lang_name), false) = (lang_name, content_nodes.is_empty())
        {
            if let Some(next_config) = (injection_callback)(&lang_name) {
                let ranges = Self::intersect_ranges(
                    &ranges,
                    &content_nodes,
                    includes_children,
                );
                if !ranges.is_empty() {
                    queue.push((next_config, depth + 1, ranges));
                }
            }
        }
    }
}
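The core idea in that snippet is that matches are bucketed by query pattern, and each bucket with a language and at least one content node becomes a single injected layer. A simplified, std-only sketch of that grouping follows; `Match` and `group_by_pattern` are hypothetical stand-ins, not helix or tree-sitter types, and the `include_children` flag is left out for brevity.

```rust
use std::ops::Range;

/// Hypothetical stand-in for one query match (not a tree-sitter type).
struct Match {
    pattern_index: usize,
    language: Option<&'static str>,
    content: Option<Range<usize>>,
}

/// Bucket matches by the pattern that produced them, so every node captured
/// by the same pattern ends up in the same group.
fn group_by_pattern(
    pattern_count: usize,
    matches: &[Match],
) -> Vec<(Option<&'static str>, Vec<Range<usize>>)> {
    let mut groups = vec![(None, Vec::new()); pattern_count];
    for m in matches {
        let entry = &mut groups[m.pattern_index];
        if m.language.is_some() {
            entry.0 = m.language;
        }
        if let Some(range) = &m.content {
            entry.1.push(range.clone());
        }
    }
    groups
}

fn main() {
    let matches = [
        Match { pattern_index: 0, language: Some("comment"), content: Some(5..9) },
        Match { pattern_index: 0, language: Some("comment"), content: Some(15..19) },
    ];
    let groups = group_by_pattern(2, &matches);

    // One layer per pattern that has both a language and some content.
    let layers: Vec<_> = groups
        .iter()
        .filter(|(lang, ranges)| lang.is_some() && !ranges.is_empty())
        .collect();
    assert_eq!(layers.len(), 1);
    assert_eq!(layers[0].1, vec![5..9, 15..19]);
}
```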

But this is sidestepped for the root layer:

// manually craft the root layer based on the existing tree
let layer = HighlightIterLayer {
    highlight_end_stack: Vec::new(),
    scope_stack: vec![LocalScope {
        inherits: false,
        range: 0..usize::MAX,
        local_defs: Vec::new(),
    }],
    cursor,
    depth: 0,
    _tree: None,
    captures,
    config: config_ref,
    ranges: vec![Range {
        start_byte: 0,
        end_byte: usize::MAX,
        start_point: Point::new(0, 0),
        end_point: Point::new(usize::MAX, usize::MAX),
    }],
};

So combined injections only work on nested grammars.

@archseer
Member

I'm working on a rewrite that will be able to incrementally update all the layers. This should resolve this issue (and #1151).

@archseer archseer self-assigned this Dec 29, 2021
@archseer
Member

(Work is ongoing in https://github.com/helix-editor/helix/tree/incremental)

@archseer
Member

Merged into master in 7c9ebd0! Would you be interested in adding https://github.com/tree-sitter/tree-sitter-embedded-template & eex now?

@the-mikedavis
Member Author

(H)eex still has some bugs in my local testing (unrelated to combined injections, needs a fix in the elixir grammar) but embedded-template should be good to go. The only weird thing with it is that it currently has injections for different languages, so the one grammar is used for erb and ejs with different injections queries between them: https://github.com/tree-sitter/tree-sitter-embedded-template/tree/d21df11b0ecc6fd211dbe11278e92ef67bd17e97/queries

Is there a way to specify a config like this in languages.toml? I think jsx and tsx would benefit from something like that too.

@the-mikedavis
Member Author

I should be able to add https://github.com/elixir-lang/tree-sitter-iex though, which is perfect for combined injections.

@the-mikedavis
Member Author

The combined injections work quite well so I'm gonna close this out. Thanks @archseer!!

3 participants