Race condition in get_annotator_task assigning multiple docs to same annotator #374

ianroberts · 2023-05-16T17:16:30Z

Describe the bug

A nasty race condition exists in Project.get_annotator_task if it is called twice in parallel for the same user (e.g. due to network instability the user's browser believes the first call has failed and retries it - this has happened at least once in the production installation)

gate-teamware/backend/models.py

Lines 583 to 602 in f9a1a99

    
               def get_annotator_task(self, user): 
        
                   """ 
        
                   Gets or creates a new annotation task for user (annotator). 
        
                   :returns: Dictionary with all information to complete an annotation task. Only project information 
        
                       is returned if user is waiting to be approved as an annotator. Returns None and removes 
        
                       user from annotator list if there's no more tasks or user reached quota. 
        
                   """ 
        
                   annotation = self.get_current_annotator_task(user) 
        
                   if annotation: 
        
                       # User has existing task 
        
                       return self.get_annotation_task_dict(annotation) 
        
                   else: 
        
                       # Tries to generate new task if there's no existing task 
        
                       if self.annotator_reached_quota(user): 
        
                           self.remove_annotator(user) 
        
                           return None  # Also return None as we've completed all the task 
        
                       else: 
        
                           return self.decide_annotator_task_type_and_assign(user)

The method first checks whether there is already a task in progress for this user, and if so returns it. If not it proceeds to assign a new task and returns that instead. However two parallel calls can both see no existing task, then both attempt to assign a new task, and even though the two calls are both executing inside with transaction.atomic() (in the RPC call), both can validly succeed, leaving the user with multiple pending annotation tasks recorded in the database. This means that all subsequent calls to get_annotator_task will fail with an exception since they enforce the invariant that an annotator can have at most one pending task at any given time.

Analysis

The reason why the atomic transaction is not sufficient protection is, I believe, to do with the way we assign documents to annotators at random. Assigning a task to an annotator does not modify the project or user tables directly, it just inserts a row into the annotation table linking to the user ID and the document ID. Since two calls to assign a new task will generally assign different document IDs, these two inserts do not clash and both transactions can commit successfully.

The simplest fix for this race would be to completely serialize the transactions, i.e. grab some sort of lock before doing the "is there a task already in progress" check and release it only after completing the new task assignment, such that a second simultaneous call is blocked from even checking for pending annotations until the first call is complete and has committed its insert. If it would be too heavy to grab this lock for every call to get_annotator_task then a halfway house might be a kind of "double-checked locking" approach, where it is only in the else case (line 597) that we grab the lock and then check for pending tasks again (this check must happen inside the scope of the lock, even if it has already been checked outside) before assigning a new task and finally releasing the lock.

I don't know enough Django to know how best to implement this, is there a Django concept that maps to the SQL SELECT FOR UPDATE construct? Or a Postgresql-specific API for acquiring locks?

The text was updated successfully, but these errors were encountered:

ianroberts · 2023-05-16T17:32:11Z

If it would be too heavy to grab this lock for every call to get_annotator_task

If we can scope the lock per user (e.g. SELECT FOR UPDATE * FROM service_user WHERE id=:this_user) then it shouldn't be too big a deal as you're only blocking a user against themself.

ianroberts added the bug Something isn't working label May 16, 2023

ianroberts changed the title ~~Potential race condition in get_annotator_task assigning multiple docs to same annotator~~ Race condition in get_annotator_task assigning multiple docs to same annotator May 16, 2023

This was referenced May 16, 2023

User-level DB lock in get_annotator_task to avoid race condition #375

Merged

An annotator could not add anymore annotation because they have more than one annotation in a document #323

Open

johann-petrak mentioned this issue Jul 18, 2023

Annotator has nothing left to annotate and marked as completed after only a small part of the annotations #379

Closed

twinkarma closed this as completed in 89d0cd5 Oct 2, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Race condition in get_annotator_task assigning multiple docs to same annotator #374

Race condition in get_annotator_task assigning multiple docs to same annotator #374

ianroberts commented May 16, 2023

ianroberts commented May 16, 2023

Race condition in get_annotator_task assigning multiple docs to same annotator #374

Race condition in get_annotator_task assigning multiple docs to same annotator #374

Comments

ianroberts commented May 16, 2023

Describe the bug

Analysis

ianroberts commented May 16, 2023