Redis connection error when a pipeline service stops #452

Open
JenySadadia opened this issue Jan 10, 2024 · 0 comments

Issue:
When a pipeline service stops, the API raises a Redis PubSub connection error.

Logs:
Staging server logs from the API service:

today at 12:27:52 INFO:     172.27.0.1:37962 - "POST /latest/subscribe/node HTTP/1.0" 200 OK
today at 12:28:59 INFO:     172.27.0.1:47904 - "GET /latest/listen/14344 HTTP/1.0" 200 OK
today at 12:29:02 INFO:     172.27.0.1:40144 - "POST /latest/unsubscribe/14344 HTTP/1.0" 200 OK
today at 12:29:02 ERROR:    Exception in ASGI application
today at 12:29:02 Traceback (most recent call last):

today at 12:29:02   File "/home/kernelci/./api/main.py", line 646, in listen
today at 12:29:02     return await pubsub.listen(sub_id, user.username)
today at 12:29:02   File "/home/kernelci/./api/pubsub.py", line 136, in listen
today at 12:29:02     msg = await sub['redis_sub'].get_message(
today at 12:29:02   File "/home/kernelci/.local/lib/python3.10/site-packages/redis/asyncio/client.py", line 1032, in get_message
today at 12:29:02     response = await self.parse_response(block=(timeout is None), timeout=timeout)
today at 12:29:02   File "/home/kernelci/.local/lib/python3.10/site-packages/redis/asyncio/client.py", line 905, in parse_response
today at 12:29:02     response = await self._execute(
today at 12:29:02   File "/home/kernelci/.local/lib/python3.10/site-packages/redis/asyncio/client.py", line 885, in _execute
today at 12:29:02     return await conn.retry.call_with_retry(
today at 12:29:02   File "/home/kernelci/.local/lib/python3.10/site-packages/redis/asyncio/retry.py", line 62, in call_with_retry
today at 12:29:02     await fail(error)
today at 12:29:02   File "/home/kernelci/.local/lib/python3.10/site-packages/redis/asyncio/client.py", line 874, in _disconnect_raise_connect
today at 12:29:02     raise error
today at 12:29:02   File "/home/kernelci/.local/lib/python3.10/site-packages/redis/asyncio/retry.py", line 59, in call_with_retry
today at 12:29:02     return await do()
today at 12:29:02   File "/home/kernelci/.local/lib/python3.10/site-packages/redis/asyncio/connection.py", line 502, in read_response
today at 12:29:02     response = await self._parser.read_response(
today at 12:29:02   File "/home/kernelci/.local/lib/python3.10/site-packages/redis/_parsers/resp2.py", line 82, in read_response
today at 12:29:02     response = await self._read_response(disable_decoding=disable_decoding)
today at 12:29:02   File "/home/kernelci/.local/lib/python3.10/site-packages/redis/_parsers/resp2.py", line 90, in _read_response
today at 12:29:02     raw = await self._readline()
today at 12:29:02   File "/home/kernelci/.local/lib/python3.10/site-packages/redis/_parsers/base.py", line 221, in _readline
today at 12:29:02     raise ConnectionError(SERVER_CLOSED_CONNECTION_ERROR)
today at 12:29:02 redis.exceptions.ConnectionError: Connection closed by server.

Root cause:
When a pipeline service starts, it first subscribes to a channel (/subscribe) and then starts listening on it (/listen).
To shut down pubsub connections properly, I added logic to close the connection in the unsubscribe endpoint: https://github.com/kernelci/kernelci-api/blob/main/api/pubsub.py#L117.
So when a service stops, it unsubscribes and the API closes the pubsub connection. The service was still listening to that subscription, so the pending sub.get_message call in pubsub.listen fails with a connection error.
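The race described above can be reproduced without Redis, and one possible way to tolerate it is sketched below. This is only an illustration, not the actual kernelci-api code: `FakeSubscription`, `listen` and `demo` are hypothetical names, and the built-in `ConnectionError` stands in for `redis.exceptions.ConnectionError`, which is what the real handler would presumably need to catch.

```python
import asyncio


class FakeSubscription:
    """Hypothetical stand-in for the Redis pubsub object held by the API."""

    def __init__(self):
        self._closed = asyncio.Event()

    async def get_message(self, timeout=None):
        # Block until the connection is closed, then fail the way a
        # pending read fails when the connection is closed under it.
        await asyncio.wait_for(self._closed.wait(), timeout)
        raise ConnectionError("Connection closed by server.")

    async def close(self):
        self._closed.set()


async def listen(sub):
    """Wait for a message; return None if the subscription was closed."""
    try:
        return await sub.get_message(timeout=5)
    except ConnectionError:
        # The subscriber unsubscribed while this call was pending.
        return None


async def demo():
    sub = FakeSubscription()
    listener = asyncio.create_task(listen(sub))  # pending /listen request
    await asyncio.sleep(0)                       # let the listener start waiting
    await sub.close()                            # /unsubscribe closes the connection
    return await listener


result = asyncio.run(demo())
```

Without the `try`/`except` in `listen`, the error would propagate out of the pending call, which matches the traceback in the logs. Whether returning `None` is the right response for a closed subscription is a design decision for the API; an alternative would be to have /unsubscribe signal the listener before closing the connection.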
