Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v13.x] Backport thread-safe function deadlock detection fix #32948

Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 13 additions & 6 deletions doc/api/n-api.md
Original file line number Diff line number Diff line change
Expand Up @@ -5096,11 +5096,18 @@ preventing data from being successfully added to the queue. If set to
`napi_call_threadsafe_function()` never blocks if the thread-safe function was
created with a maximum queue size of 0.

As a special case, when `napi_call_threadsafe_function()` is called from the
main thread, it will return `napi_would_deadlock` if the queue is full and it
was called with `napi_tsfn_blocking`. The reason for this is that the main
thread is responsible for reducing the number of items in the queue so, if it
waits for room to become available on the queue, then it will deadlock.
As a special case, when `napi_call_threadsafe_function()` is called from a
JavaScript thread, it will return `napi_would_deadlock` if the queue is full
and it was called with `napi_tsfn_blocking`. The reason for this is that the
JavaScript thread is responsible for removing items from the queue, thereby
reducing their number. Thus, if it waits for room to become available on the
queue, then it will deadlock.

`napi_call_threadsafe_function()` will also return `napi_would_deadlock` if the
thread-safe function created on one JavaScript thread is called from another
JavaScript thread. The reason for this is to prevent a deadlock arising from the
possibility that the two JavaScript threads end up waiting on one another,
thereby both deadlocking.

The actual call into JavaScript is controlled by the callback given via the
`call_js_cb` parameter. `call_js_cb` is invoked on the main thread once for each
Expand Down Expand Up @@ -5243,7 +5250,7 @@ changes:
pr-url: https://github.com/nodejs/node/pull/32689
description: >
Return `napi_would_deadlock` when called with `napi_tsfn_blocking` from
the main thread and the queue is full.
the main thread or a worker thread and the queue is full.
-->

```C
Expand Down
28 changes: 23 additions & 5 deletions src/node_api.cc
Original file line number Diff line number Diff line change
Expand Up @@ -129,7 +129,6 @@ class ThreadSafeFunction : public node::AsyncResource {
is_closing(false),
context(context_),
max_queue_size(max_queue_size_),
main_thread(uv_thread_self()),
env(env_),
finalize_data(finalize_data_),
finalize_cb(finalize_cb_),
Expand All @@ -149,16 +148,36 @@ class ThreadSafeFunction : public node::AsyncResource {

napi_status Push(void* data, napi_threadsafe_function_call_mode mode) {
node::Mutex::ScopedLock lock(this->mutex);
uv_thread_t current_thread = uv_thread_self();

while (queue.size() >= max_queue_size &&
max_queue_size > 0 &&
!is_closing) {
if (mode == napi_tsfn_nonblocking) {
return napi_queue_full;
} else if (uv_thread_equal(&current_thread, &main_thread)) {
return napi_would_deadlock;
}

// Here we check if there is a Node.js event loop running on this thread.
// If there is, and our queue is full, we return `napi_would_deadlock`. We
// do this for two reasons:
//
// 1. If this is the thread on which our own event loop runs then we
// cannot wait here because that will prevent our event loop from
// running and emptying the very queue on which we are waiting.
//
// 2. If this is not the thread on which our own event loop runs then we
// still cannot wait here because that allows the following sequence of
// events:
//
// 1. JSThread1 calls JSThread2 and blocks while its queue is full and
// because JSThread2's queue is also full.
//
// 2. JSThread2 calls JSThread1 before it's had a chance to remove an
// item from its own queue and blocks because JSThread1's queue is
// also full.
v8::Isolate* isolate = v8::Isolate::GetCurrent();
if (isolate != nullptr && node::GetCurrentEventLoop(isolate) != nullptr)
return napi_would_deadlock;

cond->Wait(lock);
}

Expand Down Expand Up @@ -438,7 +457,6 @@ class ThreadSafeFunction : public node::AsyncResource {
// means we don't need the mutex to read them.
void* context;
size_t max_queue_size;
uv_thread_t main_thread;

// These are variables accessed only from the loop thread.
v8impl::Persistent<v8::Function> ref;
Expand Down