When the master in a cluster is failed over, the writes fail with InternalError until the next auto discovery cycle #1660

Closed
dreijer opened this issue Jan 5, 2021 · 1 comment · Fixed by #2286

Comments


dreijer commented Jan 5, 2021

I've been testing the StackExchange.Redis library against a Redis cluster with three shards. My topology consists of three physical nodes, each running three Redis instances (one master plus a replica for each of the other two shards), for a total of nine Redis instances. Each physical node has its own public IP address, and each Redis instance on a node listens on its own port.
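
For context, connecting to a cluster like this with StackExchange.Redis looks roughly like the sketch below. The hostnames and ports are placeholders rather than my actual topology; listing one instance per physical node is enough, since the library discovers the rest of the cluster itself.

using StackExchange.Redis;

// Placeholder endpoints; the library discovers the remaining cluster members.
var options = new ConfigurationOptions
{
    AbortOnConnectFail = false, // keep retrying rather than failing fast
};
options.EndPoints.Add("node-a.example.com", 7000);
options.EndPoints.Add("node-b.example.com", 7000);
options.EndPoints.Add("node-c.example.com", 7000);

using var muxer = ConnectionMultiplexer.Connect(options);
var db = muxer.GetDatabase();
db.StringSet("some-key", "some-value"); // routed to the master that owns the key's hash slot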

When the master is failed over to one of its replicas, the library doesn't seem to handle the MOVED response properly for SET commands. When the library processes the MOVED response after trying to set a value on the old master (which is now a replica), it correctly updates ServerSelectionStrategy.map for the given hash slot, i.e. it points the slot's array entry at the ServerEndPoint of the new master. However, when it re-sends the SET command, the logic in ServerSelectionStrategy.Select() picks the old master again, because the new master's ServerEndPoint isn't marked as a master yet:

private ServerEndPoint FindMaster(ServerEndPoint endpoint, RedisCommand command)
{
    int max = 5;
    do
    {
        if (!endpoint.IsReplica && endpoint.IsSelectable(command)) return endpoint;

        endpoint = endpoint.Master;
    } while (endpoint != null && --max != 0);
    return null;
}
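
To make the failure concrete, here is a simplified standalone model of the state right after the MOVED response. The Node type is a stand-in I made up for illustration, not the library's ServerEndPoint: the slot map already points at the promoted node, but its IsReplica flag and Master link are still stale, so the walk above lands back on the demoted node.

using System;

// Stand-in for ServerEndPoint; an assumption for illustration only.
class Node
{
    public string Name = "";
    public bool IsReplica;              // stale until the next CLUSTER NODES refresh
    public Node? Master;                // which node this endpoint believes is its master
    public bool IsSelectable() => true; // connectivity checks omitted
}

class StaleSelectionDemo
{
    // Same walk as FindMaster above, minus the command-specific checks.
    static Node? FindMaster(Node? endpoint)
    {
        int max = 5;
        do
        {
            if (endpoint != null && !endpoint.IsReplica && endpoint.IsSelectable()) return endpoint;
            endpoint = endpoint?.Master;
        } while (endpoint != null && --max != 0);
        return null;
    }

    static void Main()
    {
        // The client's stale view right after the failover:
        var oldMaster = new Node { Name = "old-master", IsReplica = false };                    // actually demoted
        var newMaster = new Node { Name = "new-master", IsReplica = true, Master = oldMaster }; // actually promoted

        // The MOVED handler has already pointed map[slot] at newMaster, but the walk
        // skips it (IsReplica is still true) and returns the demoted node instead.
        Console.WriteLine(FindMaster(newMaster)?.Name); // prints "old-master"
    }
}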

Interestingly, ServerSelectionStrategy.Select() has a comment stating that all the entries in 'map' are masters, which is correct. So why do we need to call FindMaster() on that node at all instead of just using it directly?

ServerEndPoint endpoint = arr[slot], testing;
// but: ^^^ is the MASTER slots; if we want a replica, we need to do some thinking

The end result of the current logic is that the SET operation ultimately fails with an InternalError, since the same endpoint is tried twice and it is no longer the master.
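
For illustration, the simplification I'm asking about might look something like this, reusing the Node stand-in from the sketch above (this is an assumption about intent, not the library's actual Select()):

static class SelectSketch
{
    // For a master-only command, trust the slot map entry that the MOVED handler just
    // rewrote, instead of re-validating it through the stale replica flags in FindMaster().
    internal static Node? PickForMasterOnlyCommand(Node?[] slotMap, int slot)
        => slotMap[slot];
}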


dreijer commented Jan 18, 2021

Several potential solutions come to mind, but I'm not an expert in the library and would love some feedback.

One solution could be to add an override flag that allows the library to write to the ServerEndPoint when we've received a MOVED hint, even though the endpoint is still marked as a replica. That would keep SETs working until the next auto-discovery fixes everything up. In other words, one could extend the logic in PhysicalBridge.WriteMessageToServerInsideWriteLock to check an additional flag on the following line:
if (isMasterOnly && ServerEndPoint.IsReplica && (ServerEndPoint.ReplicaReadOnly || !ServerEndPoint.AllowReplicaWrites))
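
As a standalone sketch of that shape (the movedHintOverride flag is invented here for illustration and doesn't exist in the library):

class WriteGateSketch
{
    // Mirrors the shape of the check in WriteMessageToServerInsideWriteLock.
    static bool ShouldRejectWrite(bool isMasterOnly, bool isReplica,
                                  bool replicaReadOnly, bool allowReplicaWrites,
                                  bool movedHintOverride)
        => isMasterOnly
           && isReplica
           && !movedHintOverride // the proposed escape hatch for a freshly promoted endpoint
           && (replicaReadOnly || !allowReplicaWrites);

    static void Main()
    {
        // Stale view: the endpoint still looks like a read-only replica, but a MOVED
        // response told us it now owns the slot, so the override lets the write through.
        System.Console.WriteLine(ShouldRejectWrite(true, true, true, false, movedHintOverride: false)); // True  (rejected today)
        System.Console.WriteLine(ShouldRejectWrite(true, true, true, false, movedHintOverride: true));  // False (write allowed)
    }
}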

Alternatively, we could kick off an auto-discovery operation (a CLUSTER NODES command) to refresh the map when it looks like a key has moved, but that seems pretty heavy given that we're already getting the MOVED hints.
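
From the application side, something in the spirit of this second option is already possible with the public Configure/ConfigureAsync API. A rough sketch (a workaround for callers, not a fix for the selection logic itself; the names below are mine):

using System.Threading.Tasks;
using StackExchange.Redis;

static class MovedMitigationSketch
{
    // If a write fails in the window right after a failover, ask the multiplexer to
    // re-run its configuration/discovery pass and retry once against the updated map.
    internal static async Task<bool> SetWithTopologyRefreshAsync(
        ConnectionMultiplexer muxer, string key, string value)
    {
        var db = muxer.GetDatabase();
        try
        {
            return await db.StringSetAsync(key, value);
        }
        catch (RedisException)
        {
            await muxer.ConfigureAsync(); // roughly what the periodic auto-discovery does
            return await db.StringSetAsync(key, value);
        }
    }
}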

Thoughts?

NickCraver added a commit that referenced this issue Oct 27, 2022
I'm not 100% sure about this, because if MOVED responses keep happening (e.g. a bad proxy somewhere) this would just continually re-run... but only once every 5 seconds. Overall, though, today we linger in a bad state retrying moves until a discovery happens, and this could be resolved much faster.

Meant to help address #1520, #1660, #2074, and #2020.
NickCraver added a commit that referenced this issue Nov 15, 2022