add pcre2_get_match_data_heapframes_size() #191

Merged: 1 commit, Jan 17, 2023

@carenas (Contributor) commented Jan 13, 2023

The proposed code allows an external application to implement logic similar to:

PCRE2 version 10.42 2022-12-11
  re> /\[(a)]{1000}/expand,framesize
Frame size for pcre2_match(): 16128
data> \[a]{1000}\=ovector=1
Matched, but too many substrings
 0: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
data> 
  re> /a/memory,heapframes_size
Memory allocation (code space): 9
Heapframes size for pcre2_match(): 20643840
data> a\=ovector=0
 0: a
data> 
  re> /a/heapframes_size
Heapframes size for pcre2_match(): 20480

This lets the system recover from memory pressure in cases where match_data is being reused.

@carenas carenas marked this pull request as draft January 13, 2023 19:43

Since PCRE2 10.41, the match data contains a pointer to a vector of
frames that are allocated on the heap and used by pcre2_match()
when doing non-JIT matches.

There is, though, no outside visibility of its size, and therefore
the memory it uses is locked away until match_data itself is freed.

Add an API that allows getting that value, so an application could
decide based on its own experienced memory pressure to keep reusing
that match_data or not.

While at it, update the documentation of other related functions for
clarity.
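
The reuse decision described in this commit message could be sketched roughly as follows. This is a minimal illustration, not code from the PR: `should_recycle_match_data` and the byte budget are hypothetical names chosen here, and the PCRE2 calls appear only in a comment so that the block compiles without the library.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical policy helper (not part of PCRE2): report whether a reused
 * match data block holds more heap frame memory than the application is
 * willing to keep around. The budget is an application-chosen assumption.
 */
bool should_recycle_match_data(size_t heapframes_size, size_t budget)
{
    return heapframes_size > budget;
}

/*
 * In an application linked against a PCRE2 build that includes this change,
 * the first argument would come from the new function, roughly:
 *
 *   PCRE2_SIZE held = pcre2_get_match_data_heapframes_size(md);
 *   if (should_recycle_match_data(held, budget)) {
 *       pcre2_match_data_free(md);   // releases the heap frame vector too
 *       md = pcre2_match_data_create_from_pattern(re, NULL);
 *   }
 */
```

With the sizes from the transcript in the PR description, a 1 MiB budget would recycle a block holding 20643840 bytes and keep one holding 20480 bytes.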

@carenas carenas marked this pull request as ready for review January 13, 2023 21:28
@PhilipHazel PhilipHazel merged commit c80c633 into PCRE2Project:master Jan 17, 2023
@carenas carenas deleted the mdheapsize branch January 17, 2023 19:37
@PhilipHazel (Collaborator) commented

I have now played with this a bit. I'm happy with the new function, but I'm going to re-work the changes to pcre2test. It seems wrong to me for the heapframes_size option to be a pattern option; I'm going to change it to a subject option. As a pattern option, it gives information about the previous match, which doesn't seem right.
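
A quick arithmetic check on the transcript in the PR description appears consistent with this observation. The 20643840 bytes printed for /a/memory,heapframes_size is an exact multiple of 16128, the frame size reported for the earlier pattern /\[(a)]{1000}/, which suggests the figure describes the vector left behind by that earlier match. The helper names below are illustrative only and are not part of PCRE2 or pcre2test.

```c
#include <assert.h>
#include <stddef.h>

/* Number of whole frames of frame_size bytes that fit in heapframes_size. */
size_t whole_frames(size_t heapframes_size, size_t frame_size)
{
    assert(frame_size > 0);
    return heapframes_size / frame_size;
}

/* Bytes left over once as many whole frames as possible are packed in. */
size_t leftover_bytes(size_t heapframes_size, size_t frame_size)
{
    assert(frame_size > 0);
    return heapframes_size % frame_size;
}
```

For the values in the transcript, `whole_frames(20643840, 16128)` is 1280 and `leftover_bytes(20643840, 16128)` is 0.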
