Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decref CacheFileRegion after read is done not before #102848

Merged
merged 2 commits into from
Dec 1, 2023

Conversation

tlrx
Copy link
Member

@tlrx tlrx commented Dec 1, 2023

The CacheFileRegion instance is decref before the read operation is executed, meaning that the SharedBytes.IO instance can return to the pool of free regions, being polled and written for another cache file region by another thread, before the first read is effectively completed (and will return incorrect bytes).

@tlrx tlrx added >bug :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.12.0 v8.11.2 labels Dec 1, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Meta label for distributed team label Dec 1, 2023
@elasticsearchmachine
Copy link
Collaborator

Hi @tlrx, I've created a changelog YAML for you.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

Copy link
Member

@original-brownbear original-brownbear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@original-brownbear
Copy link
Member

@tlrx only thing to fix is the title, CacheFileRegion is what you decrement here :)

@tlrx tlrx changed the title Decref SharedBytes.IO after read is done not before Decref CacheFileRegion after read is done not before Dec 1, 2023
@tlrx
Copy link
Member Author

tlrx commented Dec 1, 2023

@tlrx only thing to fix is the title, CacheFileRegion is what you decrement here :)

🤦🏻 Thanks for noticing!

@DaveCTurner
Copy link
Contributor

When assertions are enabled, could we fill regions with junk on release? That would make it easier to catch this kind of thing I think.

@tlrx
Copy link
Member Author

tlrx commented Dec 1, 2023

CI failures are #102575

@tlrx
Copy link
Member Author

tlrx commented Dec 1, 2023

When assertions are enabled, could we fill regions with junk on release? That would make it easier to catch this kind of thing I think.

Probably, I take the point. Thanks for the suggestion!

@tlrx tlrx merged commit 640687a into elastic:main Dec 1, 2023
16 checks passed
@tlrx
Copy link
Member Author

tlrx commented Dec 1, 2023

Thanks Armin!

@tlrx tlrx deleted the 2023/12/01/runAfter branch December 1, 2023 14:14
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Dec 8, 2023
The CacheFileRegion instance is decref before the read operation 
is executed, meaning that the SharedBytes.IO instance can return 
to the pool of free regions, being polled and written by another 
thread for another cache file region, before the first read is 
effectively completed (and will return incorrect bytes).
elasticsearchmachine pushed a commit that referenced this pull request Dec 8, 2023
The CacheFileRegion instance is decref before the read operation 
is executed, meaning that the SharedBytes.IO instance can return 
to the pool of free regions, being polled and written by another 
thread for another cache file region, before the first read is 
effectively completed (and will return incorrect bytes).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Distributed Meta label for distributed team v8.11.3 v8.12.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants