Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RedisCommandException for "Command cannot be issued to a replica" is difficult to handle #2016

Closed
AdamOutcalt opened this issue Feb 26, 2022 · 5 comments · Fixed by #2191
Closed

Comments

@AdamOutcalt
Copy link

Hey all,

I work on a large service using Azure Redis Premium. Our normal pattern for using Redis is something like:

try{
db.StringGet()
}
catch (RedisException)
...
catch (TimeoutException)
...

This works pretty well, but we have some low level of failures that look like this:

RedisCommandException: Exception: 'Command cannot be issued to a slave: UNLINK '

(In newer versions of the SDK, it says "Command cannot be issued to a replica" instead.)

This is a reasonable error, but I am not intentionally targeting a replica. I assume Azure has failed over from under me, changing the recipient to a replica, while it was primary when the command was sent.

I would like to just treat this as a transient failure, log it, and move on with life as though it were a timeout. This is a little tricky though because the exception is RedisCommandException (which doesn't inherit from RedisException). I am hesitant to catch and handle RedisCommandException, because I understand that it usually means the command is bad, like maybe a bad LUA script.

Basically, I'd like some way to differentiate this exception so I can treat it as transient.

My suggestion is to make this case a RedisException or RedisServerException, but I'm open to other resolutions, including being told that I should be catching some different set of exceptions as transient.

Thanks,
Adam

@NickCraver
Copy link
Collaborator

Which version of the library are you using currently?

@AdamOutcalt
Copy link
Author

2.2.79 (currently shows "replica" in the text)

@hao9638422
Copy link

Same issue here:
StackExchange.Redis.RedisCommandException: Command cannot be issued to a replica: SETEX xxx

@pseidel-kcf
Copy link
Contributor

pseidel-kcf commented Jul 15, 2022

Having this issue with UNLINK on AWS clustered Elasticache 6.x using StackExchange 2.6.45 (2.6.48 in our staging environment does not appear to resolve the issue). Explicitly adding DemandMaster as a parameter to KeyDeleteAsync mitigates the issue.

StackExchange.Redis.RedisCommandException: Command cannot be issued to a replica: UNLINK xxx
   at StackExchange.Redis.ConnectionMultiplexer.PrepareToPushMessageToBridge[T](Message message, ResultProcessor`1 processor, IResultBox`1 resultBox, ServerEndPoint& server) in /_/src/StackExchange.Redis/ConnectionMultiplexer.cs:line 1742
   at StackExchange.Redis.ConnectionMultiplexer.TryPushMessageToBridgeAsync[T](Message message, ResultProcessor`1 processor, IResultBox`1 resultBox, ServerEndPoint& server) in /_/src/StackExchange.Redis/ConnectionMultiplexer.cs:line 1788
   at StackExchange.Redis.ConnectionMultiplexer.ExecuteAsyncImpl[T](Message message, ResultProcessor`1 processor, Object state, ServerEndPoint server) in /_/src/StackExchange.Redis/ConnectionMultiplexer.cs:line 1952
   at StackExchange.Redis.RedisBase.ExecuteAsync[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) in /_/src/StackExchange.Redis/RedisBase.cs:line 54
   at StackExchange.Redis.RedisDatabase.KeyDeleteAsync(RedisKey key, CommandFlags flags) in /_/src/StackExchange.Redis/RedisDatabase.cs:line 759

@slorello89
Copy link
Collaborator

slorello89 commented Jul 15, 2022

@NickCraver

Looks like the pattern of checking features and passing in the server might be proving to be a bit problematic.

Looks like the issue is propagating from here

When the feature check is done here, there's nothing that allows the writeability of the command to influence server-selection, so it's just a crap-shoot whether you get back a master or replica to pass further down the line.

NickCraver added a commit that referenced this issue Aug 2, 2022
@NickCraver - This will fix #2016 

The issue is that when `GetFeatures` does the server selection, it uses `PING` as the command it's asking about, so when the multiplexer responds with a server, it responds with any server. Of course, then when it goes to execute the command, it's been explicitly handed a server, so it honors that choice but checks it to make sure it's a valid server to send the command to, so if you're executing a write command and `GetFeatuers` just so happened to output a read-only replica, when the multiplexer does the 'is this a valid server?' check, it determines it is not and propagates the error. that's what causes the blowup. By passing in the RedisCommand to `GetFeatures`, we allow it to make an informed choice of server for that Redis Command.

The other option is to simply not pass the server on down the line when we we've done a feature check, and allow the muxer to make it's own decision, this would cause an extra run of `Select` which the current pattern e.g. see [sort](https://github.com/StackExchange/StackExchange.Redis/blob/main/src/StackExchange.Redis/RedisDatabase.cs#L1816) tries to avoid

One weird case, with SORT/SORT_RO, if the RO command is what we prefer to run (because we don't have a destination key to output it to), and `SORT_RO` is not available (because they aren't on Redis 7+), we cannot trust the output of GetFeatures, so I just set the selected server back to null and let the muxer figure it out down the line.

Co-authored-by: Nick Craver <nrcraver@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants