Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in get_annotator_task assigning multiple docs to same annotator #374

Closed
ianroberts opened this issue May 16, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@ianroberts
Copy link
Member

Describe the bug

A nasty race condition exists in Project.get_annotator_task if it is called twice in parallel for the same user (e.g. due to network instability the user's browser believes the first call has failed and retries it - this has happened at least once in the production installation)

def get_annotator_task(self, user):
"""
Gets or creates a new annotation task for user (annotator).
:returns: Dictionary with all information to complete an annotation task. Only project information
is returned if user is waiting to be approved as an annotator. Returns None and removes
user from annotator list if there's no more tasks or user reached quota.
"""
annotation = self.get_current_annotator_task(user)
if annotation:
# User has existing task
return self.get_annotation_task_dict(annotation)
else:
# Tries to generate new task if there's no existing task
if self.annotator_reached_quota(user):
self.remove_annotator(user)
return None # Also return None as we've completed all the task
else:
return self.decide_annotator_task_type_and_assign(user)

The method first checks whether there is already a task in progress for this user, and if so returns it. If not it proceeds to assign a new task and returns that instead. However two parallel calls can both see no existing task, then both attempt to assign a new task, and even though the two calls are both executing inside with transaction.atomic() (in the RPC call), both can validly succeed, leaving the user with multiple pending annotation tasks recorded in the database. This means that all subsequent calls to get_annotator_task will fail with an exception since they enforce the invariant that an annotator can have at most one pending task at any given time.

Analysis

The reason why the atomic transaction is not sufficient protection is, I believe, to do with the way we assign documents to annotators at random. Assigning a task to an annotator does not modify the project or user tables directly, it just inserts a row into the annotation table linking to the user ID and the document ID. Since two calls to assign a new task will generally assign different document IDs, these two inserts do not clash and both transactions can commit successfully.

The simplest fix for this race would be to completely serialize the transactions, i.e. grab some sort of lock before doing the "is there a task already in progress" check and release it only after completing the new task assignment, such that a second simultaneous call is blocked from even checking for pending annotations until the first call is complete and has committed its insert. If it would be too heavy to grab this lock for every call to get_annotator_task then a halfway house might be a kind of "double-checked locking" approach, where it is only in the else case (line 597) that we grab the lock and then check for pending tasks again (this check must happen inside the scope of the lock, even if it has already been checked outside) before assigning a new task and finally releasing the lock.

I don't know enough Django to know how best to implement this, is there a Django concept that maps to the SQL SELECT FOR UPDATE construct? Or a Postgresql-specific API for acquiring locks?

@ianroberts ianroberts added the bug Something isn't working label May 16, 2023
@ianroberts ianroberts changed the title Potential race condition in get_annotator_task assigning multiple docs to same annotator Race condition in get_annotator_task assigning multiple docs to same annotator May 16, 2023
@ianroberts
Copy link
Member Author

If it would be too heavy to grab this lock for every call to get_annotator_task

If we can scope the lock per user (e.g. SELECT FOR UPDATE * FROM service_user WHERE id=:this_user) then it shouldn't be too big a deal as you're only blocking a user against themself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant