tests/robustness: Add safeguards to client and history #15873
Conversation
Force-pushed from de3cfb7 to f6af467
LGTM.
The CI tests were not kicked off due to a GitHub outage.
```diff
@@ -27,10 +28,15 @@ import (
 	"go.etcd.io/etcd/tests/v3/robustness/model"
 )

 // RecordingClient provides a semi etcd client (different interface than
 // clientv3.Client) that records all the requests and responses made. Doesn't
 // allow for concurrent requests to ensure correct appending to history.
```
Forgive me if this is a silly question, still learning... Does using a non-standard client for robustness testing create any risk that the robustness tests will not behave as real-world etcd does, and therefore miss potential issues?
Good question. From my understanding, the current k8s/etcd traffic run in the robustness tests does not support concurrent read/write per recording client.
The nuance between the robustness tests and k8s usage is:
- k8s re-uses the same etcd client per resource type to achieve concurrent r/w.
- The robustness tests use multiple etcd clients to achieve concurrent r/w.

Making the robustness tests conform to k8s usage would require some refactoring of the sequential appending to history, which might complicate the code base. Could you please comment on whether there is a plan to do that? @serathius
No, this is not a limitation of the robustness tests, nor of the k8s client. What I implemented here are safeguards over a naive workaround for a limitation of linearizability checking: "There can be only one concurrent request in a request stream." For now I assign only one request stream to a client at a time. I could assign each request a unique stream, but that would make the visualizations unreadable.
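For readers less familiar with the setup, here is a minimal sketch of what such a safeguard can look like. This is not the PR's actual code; `streamGuard`, `do`, and `errConcurrent` are hypothetical names illustrating the "only one concurrent request per request stream" rule:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// streamGuard enforces "only one concurrent request per request stream":
// a second request issued while one is in flight is rejected instead of
// being silently interleaved into the recorded history.
type streamGuard struct {
	mu sync.Mutex
}

var errConcurrent = errors.New("concurrent request on a single stream")

func (g *streamGuard) do(op func()) error {
	// TryLock (Go 1.18+) fails fast instead of blocking, surfacing the
	// concurrency violation to the caller.
	if !g.mu.TryLock() {
		return errConcurrent
	}
	defer g.mu.Unlock()
	op()
	return nil
}

func main() {
	var g streamGuard
	if err := g.do(func() { fmt.Println("request recorded sequentially") }); err != nil {
		fmt.Println("rejected:", err)
	}
}
```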
Thanks for the explanation! That's really helpful. Code usually tells you how, but not why.
I hoped my comments would help :(
I think the problem is a clash over the term "client". Etcd uses "client" for the library used to call etcd. For porcupine it's more like an identifier for a sequential stream of operations. For example, on failed requests we rotate the stream id for the client to continue executing requests, because from the client's perspective we don't know when the request has finished.
@serathius Please bear with me asking too many questions~

- Why do you name it "we rotate stream id"? It looks like `h.idProvider.ClientId()` always increments by 1.
- What if we keep adding new records into the same stream, and what will break?
Just to correlate the statement with the code for other readers:
`h.id` is updated with the next stream ID. (I guess porcupine calls it a client ID, but it is actually a stream in the visualization.)

> For porcupine it's more like an identifier for a sequential stream of operations. For example, on failed requests we rotate the stream id for the client to continue executing requests
etcd/tests/robustness/model/history.go
Lines 335 to 346 in 7c68be4

```go
func (h *AppendableHistory) appendFailed(request EtcdRequest, start time.Duration, err error) {
	h.failed = append(h.failed, porcupine.Operation{
		ClientId: h.id,
		Input:    request,
		Call:     start.Nanoseconds(),
		Output:   failedResponse(err),
		Return:   0, // For failed writes we don't know when request has really finished.
	})
	// Operations of single client needs to be sequential.
	// As we don't know return time of failed operations, all new writes need to be done with new client id.
	h.id = h.idProvider.ClientId()
}
```
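In other words: because a failed request's `Return` time is unknown, any later operation recorded under the same `ClientId` could appear to overlap with it, so subsequent requests are moved to a fresh stream id to keep every stream strictly sequential, which is what porcupine assumes.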
> @serathius Please bear with me asking too many questions~
> - Why do you name it "we rotate stream id"? It looks like `h.idProvider.ClientId()` always increments by 1.

Yep, by "rotate" I mean picking another unique value. Porcupine requires it to be a counter from 0 to render nicely.
> - What if we keep adding new records into the same stream, and what will break?

I think you might get an incorrect result from the porcupine linearizer, and the porcupine visualization would render two blocks overlapping, making it unreadable. This was a big problem with failed requests that took me a long time to figure out. See anishathalye/porcupine#8 (comment), #14880
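To make the "counter from 0" requirement concrete, here is a minimal sketch of such an id provider. Only the `ClientId()` name comes from the code quoted above; the rest is an assumed implementation, and the real `idProvider` in the robustness tests may differ:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// idProvider hands out dense stream ids starting at 0, so porcupine can
// render each stream as its own non-overlapping row in the visualization.
type idProvider struct {
	next int64 // next id to hand out; first call returns 0
}

// ClientId returns the next unused id: 0, 1, 2, ...
func (p *idProvider) ClientId() int {
	return int(atomic.AddInt64(&p.next, 1) - 1)
}

func main() {
	var p idProvider
	fmt.Println(p.ClientId(), p.ClientId(), p.ClientId()) // 0 1 2
}
```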
Tried to document the answers in #15880
LGTM with a couple of minor comments.
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
Force-pushed from f6af467 to 962e150
cc @ahrtr @ptabor