rt: fix bug in work-stealing queue #2387

Merged Apr 9, 2020 (4 commits)

carllerche
Member

Fixes a couple of bugs in the work-stealing queue introduced as
part of #2315. First, the cursor needs to be able to represent more
values than the size of the buffer, so that it is possible to tell whether
`tail` is ahead of `head` or the two are identical. This bug resulted in
the "overflow" path being taken before the buffer was full.
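
As a minimal sketch of the first point (not the actual tokio code): the constant `CAPACITY` and the helpers `len_narrow`, `len_wide`, and `slot` below are hypothetical names chosen for illustration. If cursors are stored modulo the buffer size, a full queue and an empty queue both give `head == tail`. If cursors count freely over a wider integer range and are masked only when indexing, the two states stay distinguishable.

```rust
/// Hypothetical buffer capacity for illustration; must be a power of two.
const CAPACITY: u16 = 4;
const MASK: u16 = CAPACITY - 1;

/// Length when cursors are stored modulo CAPACITY, so a cursor can only
/// take CAPACITY distinct values. A full queue wraps back to length 0.
fn len_narrow(head: u16, tail: u16) -> u16 {
    tail.wrapping_sub(head) & MASK
}

/// Length when cursors count over the full u16 range and are reduced
/// to a slot index only at the point of use.
fn len_wide(head: u16, tail: u16) -> u16 {
    tail.wrapping_sub(head)
}

/// Map a free-running cursor to a buffer slot.
fn slot(cursor: u16) -> usize {
    (cursor & MASK) as usize
}
```

With narrow cursors, a queue has to treat itself as full one slot early (or take an overflow path early) to avoid the ambiguity. With wide cursors, `len_wide(head, tail) == CAPACITY` identifies a full buffer, and wrapping subtraction keeps the result correct when the cursors wrap around the integer range.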

The second bug can happen when a queue is being stolen from concurrently
with stealing into. In this case, it is possible for buffer slots to be
overwritten before they are released by the stealer. This is less likely to
happen in practice because the first bug prevents the queue from
filling up 100%, but it could still happen. It triggered an assertion in
`steal_into`. This bug slipped through because a bug in loom kept it from
correctly catching the case. The loom bug is fixed as part of
tokio-rs/loom#119.
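
The hazard can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: it assumes a design in which a stealer first claims a range of slots and only later releases them, and the `Cursors` struct, its `steal` and `real` fields, and both `free_slots_*` functions are hypothetical names.

```rust
/// Hypothetical buffer capacity for illustration.
const CAPACITY: u16 = 4;

/// Hypothetical snapshot of a queue's cursors.
/// `real`  : first slot not yet claimed by anyone.
/// `steal` : first slot a stealer has claimed but not yet released.
/// `tail`  : next slot to be written.
/// `steal == real` means no steal is in progress.
struct Cursors {
    steal: u16,
    real: u16,
    tail: u16,
}

/// Free space measured from `real`. This counts slots the stealer is
/// still reading as free, so a write could overwrite them.
fn free_slots_ignoring_stealer(c: &Cursors) -> u16 {
    CAPACITY - c.tail.wrapping_sub(c.real)
}

/// Free space measured from `steal`. Claimed slots stay reserved
/// until the stealer releases them.
fn free_slots_respecting_stealer(c: &Cursors) -> u16 {
    CAPACITY - c.tail.wrapping_sub(c.steal)
}
```

For example, with a full buffer of four slots where a stealer has claimed slots 0 and 1 but not yet released them (`steal: 0, real: 2, tail: 4`), the first function reports two free slots while the second reports none. Writing into the buffer based on the first answer would clobber entries that are still being copied out.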

Fixes: #2382

Review thread on tokio/src/runtime/queue.rs (resolved)
Review thread on tokio/src/runtime/queue.rs (outdated, resolved)
@carllerche carllerche marked this pull request as ready for review April 9, 2020 05:09

@hawkw hawkw left a comment

lgtm! 👍

hawkw commented Apr 9, 2020

The FreeBSD CI build failure is due to some kind of CI issue (possibly something in Cirrus is broken?), so per a conversation with Carl, we're ignoring it for now.

@hawkw hawkw merged commit 58ba45a into master Apr 9, 2020
hawkw added a commit that referenced this pull request Apr 9, 2020
- rt: bug in work-stealing queue (#2387)

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
hawkw added a commit that referenced this pull request Apr 9, 2020
# 0.2.17 (April 9, 2020)

### Fixes
- rt: bug in work-stealing queue (#2387)

### Changes
- rt: threadpool uses logical CPU count instead of physical by default
  (#2391)

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
@carllerche carllerche deleted the queue-maybe-bug branch April 15, 2020 20:38
Successfully merging this pull request may close these issues.

Assertion failure in steal_into / steal_into2