
tests/robustness: Add safeguards to client and history #15873

Merged: 1 commit merged into etcd-io:main from robustness-safeguards on May 11, 2023

Conversation


serathius force-pushed the robustness-safeguards branch 2 times, most recently from de3cfb7 to f6af467, on May 10, 2023 13:48.
chaochn47 (Member) left a comment:

LGTM.

The CI tests were not kicked off due to a GitHub outage.

@@ -27,10 +28,15 @@ import (
"go.etcd.io/etcd/tests/v3/robustness/model"
)

// RecordingClient provides a semi etcd client (with a different interface than
// clientv3.Client) that records all requests and responses made. It doesn't
// allow concurrent requests, to ensure correct appending to history.
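
To make the intent of that comment concrete, here is a minimal, hypothetical sketch of the pattern (names such as recordingClient and realGet are illustrative stand-ins, not the actual etcd test code): a mutex serializes requests on one client, so appends to the shared history always happen in call order.

package main

import (
	"context"
	"sync"
	"time"
)

// operation mirrors the shape of a recorded request/response pair.
type operation struct {
	input  string
	call   time.Duration // when the request was issued
	ret    time.Duration // when the response was observed
	output string
}

// recordingClient wraps a real client and records every call. The mutex
// forbids concurrent requests on a single client, so appends to history
// are guaranteed to be sequential.
type recordingClient struct {
	mu      sync.Mutex
	begin   time.Time
	history []operation
}

func (c *recordingClient) Get(ctx context.Context, key string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	call := time.Since(c.begin)
	value, err := realGet(ctx, key) // stand-in for the actual etcd call
	c.history = append(c.history, operation{
		input:  "get " + key,
		call:   call,
		ret:    time.Since(c.begin),
		output: value,
	})
	return value, err
}

// realGet is a placeholder so the sketch compiles on its own.
func realGet(ctx context.Context, key string) (string, error) { return "", nil }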
Member:

Forgive me if this is a silly question, I'm still learning... Does using a non-standard client for robustness testing create any risk that the robustness tests will not behave as real-world etcd behaves, and therefore miss potential issues?

chaochn47 (Member), May 10, 2023:

Good question. From my understanding, the current k8s/etcd traffic run in the robustness test does not support concurrent reads/writes on a single recording client.

The nuance between the robustness test and k8s usage is:

  1. k8s re-uses the same etcd client per resource type to achieve concurrent r/w.
  2. The robustness test uses multiple etcd clients to achieve concurrent r/w.

Making the robustness test conform to k8s usage would require some refactoring of the sequential appending to history, which might complicate the code base. Could you please comment on whether there is a plan to do that? @serathius
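
For readers following along, a hedged sketch of the two styles contrasted in the list above (recClient, newRecordingClient, and Put are hypothetical stand-ins): the robustness test gets concurrency from many clients, one per goroutine, each with its own sequential history.

package main

import "sync"

type recClient struct{ history []string }

func newRecordingClient() *recClient { return &recClient{} }

func (c *recClient) Put(key, value string) {
	// A single goroutine owns each client, so this append never races.
	c.history = append(c.history, "put "+key+"="+value)
}

// Robustness-test style: concurrency comes from many clients, one per
// goroutine, each recording its own strictly sequential history.
func runTraffic(goroutines int) {
	var wg sync.WaitGroup
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c := newRecordingClient()
			for j := 0; j < 100; j++ {
				c.Put("key", "value")
			}
		}()
	}
	wg.Wait()
	// k8s style would instead share a single clientv3.Client across all
	// goroutines; that is safe for etcd itself, but interleaved appends
	// to one recorded history would break the sequential per-stream
	// assumption discussed in this thread.
}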

serathius (Member, Author), May 11, 2023:

No, this is not a limitation of the robustness test, nor of the k8s client. What I implemented here are safeguards on top of a naive workaround for a limitation of linearizability checking: "there can be only one concurrent request in a request stream". For now I assign only one request stream to a client at a time. I could assign each request a unique stream, but that would make the visualizations unreadable.

Member:

Thanks for the explanation! That's really helpful. Code usually tells how, but not why.

serathius (Member, Author):

I hoped my comments would help :(
I think the problem is a clash over the term "client". Etcd uses "client" for the library that calls etcd. For porcupine it's more like an identifier for a sequential stream of operations. For example, on failed requests we rotate the stream id for a client to continue executing requests, because from the client's perspective we don't know when the request has finished.

Member:

@serathius Please bear with me asking too many questions~

  1. Why do you call it "we rotate stream id"? It looks like h.idProvider.ClientId() always just increments by 1.
  2. What will break if we keep adding new records into the same stream?

Member:

Just to correlate the statement with the code, for other readers.

h.id is updated with the next stream ID. (I guess porcupine calls it a client ID, but it is actually a stream in the visualization.)

"For porcupine it's more like an identifier for a sequential stream of operations. For example, on failed requests we rotate the stream id for a client to continue executing requests"

func (h *AppendableHistory) appendFailed(request EtcdRequest, start time.Duration, err error) {
	h.failed = append(h.failed, porcupine.Operation{
		ClientId: h.id,
		Input:    request,
		Call:     start.Nanoseconds(),
		Output:   failedResponse(err),
		Return:   0, // For failed writes we don't know when the request has really finished.
	})
	// Operations of a single client need to be sequential.
	// As we don't know the return time of failed operations, all new writes need to be done with a new client id.
	h.id = h.idProvider.ClientId()
}
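
The idProvider itself is not quoted in this thread. A plausible minimal implementation, assuming all it needs to do is hand out unique, dense, zero-based ids (as noted in the next reply, porcupine renders nicely when ids count up from 0):

package main

import "sync/atomic"

// idProvider hands out unique stream ids; dense, zero-based ids keep
// porcupine's visualization readable.
type idProvider struct {
	next int64
}

// ClientId returns the next unused id: 0, 1, 2, ...
func (p *idProvider) ClientId() int {
	return int(atomic.AddInt64(&p.next, 1) - 1)
}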

serathius (Member, Author), May 11, 2023:

@serathius Please bear with me asking too many questions~

  1. Why do you call it "we rotate stream id"? It looks like h.idProvider.ClientId() always just increments by 1.

Yep, by "rotate" I mean pick another unique value. Porcupine requires it to be a counter from 0 to render nicely.

  2. What will break if we keep adding new records into the same stream?

I think you might get an incorrect result from the porcupine linearizer, and the porcupine visualization will render two overlapping blocks, making it unreadable. This was a big problem with failed requests that took me a long time to figure out. See anishathalye/porcupine#8 (comment), #14880
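
To make the overlap concrete, here is a hedged illustration using the porcupine.Operation fields quoted above; the values and inputs are made up:

package main

import "github.com/anishathalye/porcupine"

func overlappingStream() []porcupine.Operation {
	return []porcupine.Operation{
		// A failed put: we never observed it finishing, so its return
		// time must be treated as "possibly still running".
		{ClientId: 0, Input: "put a", Call: 100, Return: 1000, Output: "timeout"},
		// Issued while the put above may still be in flight. With the
		// same ClientId, porcupine sees one client running two
		// operations concurrently, which violates its per-client
		// sequentiality assumption; rotating to a fresh id (as
		// appendFailed does) keeps every stream sequential.
		{ClientId: 0, Input: "put b", Call: 200, Return: 300, Output: "ok"},
	}
}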

serathius (Member, Author):

I tried to document the answers in #15880.

ahrtr (Member) left a comment:

LGTM with a couple of minor comments.

tests/robustness/traffic/client.go (2 resolved review threads)
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
serathius merged commit 9a92209 into etcd-io:main on May 11, 2023.
serathius deleted the robustness-safeguards branch on June 15, 2023, 20:41.
Labels: none yet
4 participants