
Fix revoking tasks on custom queues #1352

Merged · 1 commit · Apr 19, 2024
Conversation

@cjh1 (Contributor) commented Mar 5, 2024

@ mention of reviewers

@ihsaan-ullah

A brief description of the purpose of the changes contained in this PR.

Currently, cancelling a submission running on a custom queue doesn't work. Custom queues have a different vhost, so we need to use an appropriate Celery configuration with that vhost included in the broker URL.

A checklist for hand testing

  • Tested locally

Checklist

  • Code review by me
  • Hand tested by me
  • I'm proud of my work
  • Code review by reviewer
  • Hand tested by reviewer
  • CircleCi tests are passing
  • Ready to merge

As custom queues have a different vhost, we need to use an
appropriate celery configuration with that vhost included
in the broker URL.
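
A minimal sketch of the idea behind the fix (the helper name and URLs below are illustrative, not from this PR): the vhost is the path component of an AMQP broker URL, so sending or revoking tasks on a custom queue means rewriting the broker URL for that queue's vhost.

```python
# Sketch (helper name and URLs are illustrative): each custom queue lives in
# its own RabbitMQ vhost, and the vhost is carried as the path component of
# the AMQP broker URL, so a Celery app that sends or revokes tasks for that
# queue needs a broker URL rewritten for that vhost.
from urllib.parse import urlsplit, urlunsplit

def broker_url_for_vhost(base_url: str, vhost: str) -> str:
    """Return base_url with its path (the vhost) replaced by `vhost`."""
    parts = urlsplit(base_url)
    return urlunsplit((parts.scheme, parts.netloc, "/" + vhost, parts.query, parts.fragment))

print(broker_url_for_vhost("amqp://guest:guest@rabbit:5672/", "my-custom-vhost"))
# amqp://guest:guest@rabbit:5672/my-custom-vhost
```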
@Didayolo (Collaborator)

I assigned Ihsaan as reviewer since you mentioned him in your message. @ihsaan-ullah If you won't be able to review Chris' PRs, please let me know.

@cjh1 (Contributor, Author) commented Apr 10, 2024

Any movement on this?

@Didayolo Didayolo assigned Didayolo and unassigned ihsaan-ullah Apr 12, 2024
@Didayolo Didayolo merged commit 80b634f into codalab:develop Apr 19, 2024
@liviust commented Jun 7, 2024

If you submit a run to a custom queue and cancel it in the middle of the scoring stage, you still get the output (meaning the run was not cancelled), and the next time you send a run to the same queue you get:

2024-06-07 20:31:07 django-1          | Task competitions.tasks._run_submission[5a3c1e17-56dd-48b3-a048-cc1f436d7bb6] raised unexpected: PreconditionFailed(406, "PRECONDITION_FAILED - inequivalent arg 'x-max-priority' for queue 'compute-worker' in vhost '0ada9d07-8b68-4d8c-bdf3-d9e8dc02054a': received none but current is the value '10' of type 'signedint'", (50, 10), 'Queue.declare')
2024-06-07 20:31:07 django-1          | Traceback (most recent call last):
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 382, in trace_task
2024-06-07 20:31:07 django-1          |     R = retval = fun(*args, **kwargs)
2024-06-07 20:31:07 django-1          |   File "/app/src/apps/competitions/tasks.py", line 316, in _run_submission
2024-06-07 20:31:07 django-1          |     _send_to_compute_worker(submission, is_scoring)
2024-06-07 20:31:07 django-1          |   File "/app/src/apps/competitions/tasks.py", line 197, in _send_to_compute_worker
2024-06-07 20:31:07 django-1          |     task = celery_app.send_task(
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/celery/app/base.py", line 745, in send_task
2024-06-07 20:31:07 django-1          |     amqp.send_task_message(P, name, message, **options)
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/celery/app/amqp.py", line 543, in send_task_message
2024-06-07 20:31:07 django-1          |     ret = producer.publish(
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/messaging.py", line 178, in publish
2024-06-07 20:31:07 django-1          |     return _publish(
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/connection.py", line 533, in _ensured
2024-06-07 20:31:07 django-1          |     return fun(*args, **kwargs)
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/messaging.py", line 194, in _publish
2024-06-07 20:31:07 django-1          |     [maybe_declare(entity) for entity in declare]
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/messaging.py", line 194, in <listcomp>
2024-06-07 20:31:07 django-1          |     [maybe_declare(entity) for entity in declare]
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/messaging.py", line 102, in maybe_declare
2024-06-07 20:31:07 django-1          |     return maybe_declare(entity, self.channel, retry, **retry_policy)
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/common.py", line 121, in maybe_declare
2024-06-07 20:31:07 django-1          |     return _maybe_declare(entity, channel)
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/common.py", line 161, in _maybe_declare
2024-06-07 20:31:07 django-1          |     entity.declare(channel=channel)
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/entity.py", line 611, in declare
2024-06-07 20:31:07 django-1          |     self._create_queue(nowait=nowait, channel=channel)
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/entity.py", line 620, in _create_queue
2024-06-07 20:31:07 django-1          |     self.queue_declare(nowait=nowait, passive=False, channel=channel)
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/entity.py", line 648, in queue_declare
2024-06-07 20:31:07 django-1          |     ret = channel.queue_declare(
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/channel.py", line 1148, in queue_declare
2024-06-07 20:31:07 django-1          |     return queue_declare_ok_t(*self.wait(
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/abstract_channel.py", line 88, in wait
2024-06-07 20:31:07 django-1          |     self.connection.drain_events(timeout=timeout)
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/connection.py", line 508, in drain_events
2024-06-07 20:31:07 django-1          |     while not self.blocking_read(timeout):
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/connection.py", line 514, in blocking_read
2024-06-07 20:31:07 django-1          |     return self.on_inbound_frame(frame)
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/method_framing.py", line 55, in on_frame
2024-06-07 20:31:07 django-1          |     callback(channel, method_sig, buf, None)
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/connection.py", line 520, in on_inbound_method
2024-06-07 20:31:07 django-1          |     return self.channels[channel_id].dispatch_method(
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/abstract_channel.py", line 145, in dispatch_method
2024-06-07 20:31:07 django-1          |     listener(*args)
2024-06-07 20:31:07 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/channel.py", line 279, in _on_close
2024-06-07 20:31:07 django-1          |     raise error_for_code(
2024-06-07 20:31:07 django-1          | amqp.exceptions.PreconditionFailed: Queue.declare: (406) PRECONDITION_FAILED - inequivalent arg 'x-max-priority' for queue 'compute-worker' in vhost '0ada9d07-8b68-4d8c-bdf3-d9e8dc02054a': received none but current is the value '10' of type 'signedint'
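
For context on the traceback above: RabbitMQ treats a queue declaration as an assertion, and rejects a re-declare whose arguments differ from the live queue's (here `x-max-priority` received none vs. existing 10) with 406 PRECONDITION_FAILED. A toy model of that equivalence check (simplified illustration, not RabbitMQ code):

```python
# Toy model (not RabbitMQ code) of why the 406 above happens: re-declaring an
# existing queue is only allowed if the declared arguments match exactly.
class PreconditionFailed(Exception):
    pass

# The worker originally declared the queue with a priority argument.
existing = {"compute-worker": {"x-max-priority": 10}}

def queue_declare(name, arguments=None):
    arguments = arguments or {}
    if name in existing and existing[name] != arguments:
        raise PreconditionFailed(
            f"406 PRECONDITION_FAILED - inequivalent arg for queue '{name}': "
            f"received {arguments} but current is {existing[name]}"
        )
    existing.setdefault(name, arguments)

queue_declare("compute-worker", {"x-max-priority": 10})  # fine: matches
try:
    queue_declare("compute-worker")  # missing x-max-priority -> rejected
except PreconditionFailed as e:
    print(e)
```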

@cjh1 (Contributor, Author) commented Jun 7, 2024

I can take a look at this; it looks like the original value for x-max-priority is not being picked up for some reason.

@cjh1 (Contributor, Author) commented Jun 10, 2024

@liviust I chatted with @ihsaan-ullah, and the fact that the running submission is not cancelling is expected behavior/a missing feature. The submission will only be cancelled if it hasn't started running yet. I am not able to recreate the error you are seeing with a custom queue. Is it consistent? If so, can you provide a more detailed description of the steps?

@liviust commented Jun 21, 2024

> @liviust I chatted with @ihsaan-ullah and the fact that the running submission is not cancelling is expected behavior/a missing feature. The submission will only be cancelled if it hasn't started running yet. I am not able to recreate the error you are seeing with a custom queue, is it consistent? If so can you provide a more detailed description of the steps?

@cjh1 The error still persists on the develop and main branches.

2024-06-21 18:47:56 django-1          | Task competitions.tasks._run_submission[c4692dd3-77dc-441f-a251-4dd1972f247e] raised unexpected: PreconditionFailed(406, "PRECONDITION_FAILED - inequivalent arg 'x-max-priority' for queue 'compute-worker' in vhost 'e69f7998-09ad-4f5e-a9f9-b2bb641a9845': received none but current is the value '10' of type 'signedint'", (50, 10), 'Queue.declare')
2024-06-21 18:47:56 django-1          | Traceback (most recent call last):
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 382, in trace_task
2024-06-21 18:47:56 django-1          |     R = retval = fun(*args, **kwargs)
2024-06-21 18:47:56 django-1          |   File "/app/src/apps/competitions/tasks.py", line 319, in _run_submission
2024-06-21 18:47:56 django-1          |     _send_to_compute_worker(submission, is_scoring)
2024-06-21 18:47:56 django-1          |   File "/app/src/apps/competitions/tasks.py", line 200, in _send_to_compute_worker
2024-06-21 18:47:56 django-1          |     task = celery_app.send_task(
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/celery/app/base.py", line 745, in send_task
2024-06-21 18:47:56 django-1          |     amqp.send_task_message(P, name, message, **options)
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/celery/app/amqp.py", line 543, in send_task_message
2024-06-21 18:47:56 django-1          |     ret = producer.publish(
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/messaging.py", line 178, in publish
2024-06-21 18:47:56 django-1          |     return _publish(
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/connection.py", line 533, in _ensured
2024-06-21 18:47:56 django-1          |     return fun(*args, **kwargs)
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/messaging.py", line 194, in _publish
2024-06-21 18:47:56 django-1          |     [maybe_declare(entity) for entity in declare]
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/messaging.py", line 194, in <listcomp>
2024-06-21 18:47:56 django-1          |     [maybe_declare(entity) for entity in declare]
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/messaging.py", line 102, in maybe_declare
2024-06-21 18:47:56 django-1          |     return maybe_declare(entity, self.channel, retry, **retry_policy)
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/common.py", line 121, in maybe_declare
2024-06-21 18:47:56 django-1          |     return _maybe_declare(entity, channel)
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/common.py", line 161, in _maybe_declare
2024-06-21 18:47:56 django-1          |     entity.declare(channel=channel)
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/entity.py", line 611, in declare
2024-06-21 18:47:56 django-1          |     self._create_queue(nowait=nowait, channel=channel)
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/entity.py", line 620, in _create_queue
2024-06-21 18:47:56 django-1          |     self.queue_declare(nowait=nowait, passive=False, channel=channel)
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/kombu/entity.py", line 648, in queue_declare
2024-06-21 18:47:56 django-1          |     ret = channel.queue_declare(
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/channel.py", line 1148, in queue_declare
2024-06-21 18:47:56 django-1          |     return queue_declare_ok_t(*self.wait(
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/abstract_channel.py", line 88, in wait
2024-06-21 18:47:56 django-1          |     self.connection.drain_events(timeout=timeout)
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/connection.py", line 508, in drain_events
2024-06-21 18:47:56 django-1          |     while not self.blocking_read(timeout):
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/connection.py", line 514, in blocking_read
2024-06-21 18:47:56 django-1          |     return self.on_inbound_frame(frame)
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/method_framing.py", line 55, in on_frame
2024-06-21 18:47:56 django-1          |     callback(channel, method_sig, buf, None)
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/connection.py", line 520, in on_inbound_method
2024-06-21 18:47:56 django-1          |     return self.channels[channel_id].dispatch_method(
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/abstract_channel.py", line 145, in dispatch_method
2024-06-21 18:47:56 django-1          |     listener(*args)
2024-06-21 18:47:56 django-1          |   File "/usr/local/lib/python3.8/site-packages/amqp/channel.py", line 279, in _on_close
2024-06-21 18:47:56 django-1          |     raise error_for_code(
2024-06-21 18:47:56 django-1          | amqp.exceptions.PreconditionFailed: Queue.declare: (406) PRECONDITION_FAILE

Here are the steps to reproduce:

  1. Create a new queue;
  2. Create a CPU worker using the Dockerfile in the project (mine is on the same machine as the server);
  3. Send a job to the queue created at step 1; the moment you press submit on the platform, immediately press cancel;
  4. Send a new job to the same queue. You will get the above error.

From this point forward, all submissions will get stuck with that error.

If you don't cancel the submission, everything works as expected. If you do cancel it, even though it still computes the results and returns them, you will get the above error.

@liviust commented Jun 24, 2024

It could be linked to this: #1445.

@cjh1 (Contributor, Author) commented Jun 24, 2024

I'm afraid I am still unable to recreate this in my dev environment. When I cancel the submission I see the following in the worker:

[2024-06-24 16:35:38,086: INFO/MainProcess] Terminating 3c9c7c49-506f-4520-bad4-2b5d6830210a (Signals.SIGTERM)
[2024-06-24 16:35:38,086: INFO/ForkPoolWorker-1] No return code from Process. Killing it
b79590ea-ea88-4719-bcfd-b53835f803aa
[2024-06-24 16:35:48,353: INFO/ForkPoolWorker-1] Kill process returned 0
[2024-06-24 16:35:48,353: INFO/ForkPoolWorker-1] [exited with None]
[2024-06-24 16:35:48,353: INFO/ForkPoolWorker-1] Program finished
[2024-06-24 16:35:48,353: INFO/ForkPoolWorker-1] CODALAB_IGNORE_CLEANUP_STEP mode enabled, ignoring clean up of: /codabench/tmpa4htnp1h
[2024-06-24 16:35:48,366: ERROR/MainProcess] Task handler raised error: Terminated(15)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/billiard/pool.py", line 1774, in _set_terminated
    raise Terminated(-(signum or 0))
billiard.exceptions.Terminated: 15

This seems to show the submission being cancelled.

@cjh1 (Contributor, Author) commented Jun 24, 2024

@liviust Could you try adding the following debug patch to your setup, to see if we can figure out what is going on?

diff --git a/src/apps/competitions/tasks.py b/src/apps/competitions/tasks.py
index facc804..1f61b3f 100644
--- a/src/apps/competitions/tasks.py
+++ b/src/apps/competitions/tasks.py
@@ -195,6 +195,8 @@ def _send_to_compute_worker(submission, is_scoring):
         # variable above
         celery_app = app_or_default()
         with celery_app.connection() as new_connection:
+            print("task_queues")
+            print(celery_app.conf["task_queues"][0].queue_arguments)
             new_connection.virtual_host = str(submission.phase.competition.queue.vhost)
             task = celery_app.send_task(
                 'compute_worker_run',

If you can apply that and then provide the logs for the site_worker after you have run through the cancel steps, that might help find out what is going on.

@liviust commented Jun 24, 2024

> @liviust Not sure if you can try adding the following debug patch to your setup, to see if we can figure out what is going on? […]

Hi @cjh1,

  1. I submitted a run without any intervention. Everything went as expected. No errors, the submission was successful.
  2. I submitted a run and cancelled it. In the worker logs I have:
2024-06-24 20:33:14 [2024-06-24 17:33:14,595: INFO/MainProcess] Terminating e02b350e-cab9-4084-af41-1003ec5ea7ac (Signals.SIGTERM)
2024-06-24 20:33:14 [2024-06-24 17:33:14,597: INFO/ForkPoolWorker-1] Destroying submission temp dir: /codabench/tmppm9xe3fk
2024-06-24 20:33:14 [2024-06-24 17:33:14,616: ERROR/MainProcess] Task handler raised error: Terminated(15)
2024-06-24 20:33:14 Traceback (most recent call last):
2024-06-24 20:33:14   File "/usr/local/lib/python3.8/site-packages/billiard/pool.py", line 1774, in _set_terminated
2024-06-24 20:33:14     raise Terminated(-(signum or 0))
2024-06-24 20:33:14 billiard.exceptions.Terminated: 15
  3. I submitted another run after step 2, and I get the following output in the django container:
2024-06-24 20:39:36 django-1          | task_queues
2024-06-24 20:39:36 django-1          | Task competitions.tasks._run_submission[669709f9-e1ff-4517-9009-d2d12599e960] raised unexpected: TypeError("'NoneType' object is not subscriptable")
2024-06-24 20:39:36 django-1          | Traceback (most recent call last):
2024-06-24 20:39:36 django-1          |   File "/usr/local/lib/python3.8/site-packages/celery/app/trace.py", line 382, in trace_task
2024-06-24 20:39:36 django-1          |     R = retval = fun(*args, **kwargs)
2024-06-24 20:39:36 django-1          |   File "/app/src/apps/competitions/tasks.py", line 321, in _run_submission
2024-06-24 20:39:36 django-1          |     _send_to_compute_worker(submission, is_scoring)
2024-06-24 20:39:36 django-1          |   File "/app/src/apps/competitions/tasks.py", line 200, in _send_to_compute_worker
2024-06-24 20:39:36 django-1          |     print(celery_app.conf["task_queues"][0].queue_arguments)
2024-06-24 20:39:36 django-1          | TypeError: 'NoneType' object is not subscriptable

There are no logs in the worker.

@cjh1 (Contributor, Author) commented Jun 25, 2024

@liviust Thanks, that is helpful.

@Didayolo (Collaborator)

  • Why do we sometimes call app.conf.task_queues, sometimes app.task_queues, and sometimes app.conf["task_queues"]? Is there a reason for that?

  • Also, could this PR have introduced the bug, or did it just reveal it? Why are we using global variables in celery_config.py?

  • We should also investigate the cancellation code, to see whether it re-initializes the celery config or something.

@cjh1 (Contributor, Author) commented Jun 25, 2024

@Didayolo app.conf.task_queues and app.conf["task_queues"] are the same; the Celery config can be accessed via attributes or as a dict. However, I think app.task_queues may be the cause of the problem (I just found it as you posted this!).
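
A toy mimic of that distinction (illustrative classes, not Celery's actual implementation): the config object exposes one underlying mapping through both attribute and item access, while a plain attribute assigned on the app object never reaches the config at all.

```python
# Illustration (not Celery's real classes): why app.conf.task_queues and
# app.conf["task_queues"] are interchangeable, while app.task_queues is not.
class Conf(dict):
    # A mapping whose keys double as attributes; simplified: returns None
    # for unset keys instead of Celery's full default-resolution machinery.
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            return None

    def __setattr__(self, name, value):
        self[name] = value

class App:
    def __init__(self):
        self.conf = Conf()

app = App()
app.conf.task_queues = ["compute-worker"]
assert app.conf["task_queues"] == app.conf.task_queues  # same storage

vhost_app = App()
vhost_app.task_queues = app.conf.task_queues  # plain attribute on the app object
assert vhost_app.conf.task_queues is None     # the config never saw it
```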

@liviust Could you try the following patch?

diff --git a/src/celery_config.py b/src/celery_config.py
index 7ef8f81..7606147 100644
--- a/src/celery_config.py
+++ b/src/celery_config.py
@@ -33,6 +33,6 @@ def app_for_vhost(vhost):
         django_settings = copy.copy(settings)
         django_settings.CELERY_BROKER_URL = broker_url
         vhost_app.config_from_object(django_settings, namespace='CELERY')
-        vhost_app.task_queues = app.conf.task_queues
+        vhost_app.conf.task_queues = app.conf.task_queues
         _vhost_apps[vhost] = vhost_app
     return _vhost_apps[vhost]

@liviust commented Jun 25, 2024

@cjh1

Sure. I have updated the celery_config.py script. Here is what I tried with this patch:

  1. Uploaded a submission with no intervention. The submission was successful.
  2. Uploaded a submission and then canceled it. There are two behaviors that I observed:
    a) If the submission is canceled before it enters the "Scoring" status (when it is still in "Preparing" status), it is successfully canceled. This action is reflected in the worker's container logs.
    b) If the submission has already entered the "Scoring" status, the cancel request is not registered. There is no log of the cancel request in the worker's container, and the submission results are still returned in Codabench, no matter how many times you press cancel.
  3. Uploaded a new submission after successfully canceling the previous one (step 2a). The new submission was successfully processed, and the previous error disappeared.
  4. Uploaded a new submission after the previous submission (step 2b). The new submission was successfully processed.

If the two behaviors (a and b) in step 2 are expected, then it seems the patch fixed the issue.

@cjh1 (Contributor, Author) commented Jun 25, 2024

Thanks @liviust for testing this out. As far as I understand, (a) and (b) in step 2 are the current expected behaviour; @Didayolo and @ihsaan-ullah can confirm this.

@ihsaan-ullah (Collaborator) commented Jun 25, 2024

> Thanks @liviust for testing this out. As far as I understand (a and b) in step 2 are the current expected behavour, @Didayolo and @ihsaan-ullah can confirm this.

From the code, I don't get how a scoring submission cannot be cancelled while a running submission can be. Previously I believed that a submission, once submitted to a worker, could not be cancelled, but the code says otherwise, and so do liviust's comments:

def cancel(self, status=CANCELLED):

There is a comment

# If a custom queue is set, we need to fetch the appropriate celery app

I think this should be checked

@cjh1 (Contributor, Author) commented Jun 25, 2024

> # If a custom queue is set, we need to fetch the appropriate celery app
>
> I think this should be checked
Yes, this was the code that was added to fix cancellation of submissions on a custom queue?

@ihsaan-ullah (Collaborator)

> Yes, this was the code that was added to fix cancellation of submissions on a custom queue?

Sorry for the confusion, I thought this comment was for a TODO.

@Didayolo (Collaborator)

@liviust @cjh1

Thank you very much for your help. I'll test this patch and incorporate it.

I confirm that this is the current expected behavior (being able to cancel submissions only before they start being computed). Once a worker is working, it does not listen to any cancellation signal. That would be a nice feature for the future (see #872).

@cjh1 (Contributor, Author) commented Jun 25, 2024

@Didayolo Should I push a PR with that change?

@cjh1 (Contributor, Author) commented Jun 25, 2024

I see you have already raised a PR.

@liviust commented Jun 25, 2024

What is strange now is that I encounter the same cancellation behavior for the default queue as well. If I remember correctly, one could cancel a submission on the default queue at any stage. I also tested this on the Codabench website using the default queue, and it cancelled successfully, but on the develop and master branches it doesn't; it behaves the same as the custom queue.

@Didayolo (Collaborator)

@liviust What do you mean it cancels successfully? You are able to interrupt a running submission?

https://codabench.org/ is using the latest master branch.

@Didayolo Didayolo mentioned this pull request Jun 26, 2024