[teamd] Porting teamd patch which solves the race condition between RTM_NEWLINK and port add #2815

Merged on Apr 30, 2019 (1 commit)

Conversation

yxieca
Contributor

@yxieca yxieca commented Apr 23, 2019

- What I did

Porting teamd patch which solves the race condition between RTM_NEWLINK and port add.

Removed previous patches from us: 0006, 0008, 0009.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

- How to verify it
Run continuous warm-reboot tests.

@yxieca
Contributor Author

yxieca commented Apr 23, 2019

@jipanyang here is the RCA you asked for when I checked in patch 0009. I think patches 0006, 0008, 0009 and 0011 incrementally solved the issue, and this should be the last piece of the puzzle.

@jipanyang
Collaborator

I have not gone through the code flow thoroughly, so I may have missed some key points.

When a member is added, its attributes are fetched synchronously via netlink (team_refresh()?). Given the upper-layer dependency control, all of the member's attribute info should already be available in the kernel at that point.

Why do we need to depend on the RTM_NEWLINK event? If the team_refresh() change here works, would the previous patches be unnecessary and just add complexity?

@yxieca
Contributor Author

yxieca commented Apr 24, 2019

I think they are fixing a timing issue too. My previous two patches fixed the sequence (1) create port, (2) RTM_NEWLINK arrives but the initial update fails. This last change fixes the sequence (1) RTM_NEWLINK arrives but the initial update fails, (2) create port.
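
The ordering hazard described above can be illustrated with a toy model. This is a minimal sketch, not teamd or libteam code: `ToyTeam`, `on_newlink`, `on_port_add`, `run`, and the `refresh_on_add` flag are hypothetical names invented for illustration. It is deliberately simplified. It models only a link notification arriving before the port is known, under the assumption that such a notification is then lost, and it does not reproduce the failure modes of the real initial update.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTeam:
    """Toy model of a team device; not the real teamd data structures."""
    refresh_on_add: bool
    # ifname -> link state as held by the (simulated) kernel
    kernel_links: dict = field(default_factory=dict)
    # ifname -> link state as known to the (simulated) daemon
    ports: dict = field(default_factory=dict)

    def on_newlink(self, ifname, state):
        """Handle a link notification (stands in for RTM_NEWLINK)."""
        self.kernel_links[ifname] = state
        if ifname in self.ports:
            # The update can only be applied to a port that already exists.
            self.ports[ifname] = state
        # Otherwise the notification is dropped in this model.

    def on_port_add(self, ifname):
        """Handle a port being added to the team."""
        self.ports[ifname] = None  # link state unknown so far
        if self.refresh_on_add:
            # Re-read the current state synchronously instead of waiting
            # for a notification that may already have been delivered.
            self.ports[ifname] = self.kernel_links.get(ifname)


def run(order, refresh_on_add):
    """Replay events in the given order and return the daemon's view."""
    team = ToyTeam(refresh_on_add=refresh_on_add)
    for step in order:
        if step == "newlink":
            team.on_newlink("Ethernet0", "up")
        else:
            team.on_port_add("Ethernet0")
    return team.ports["Ethernet0"]
```

In this model, `run(["newlink", "add"], refresh_on_add=False)` leaves the port state unknown, while the same ordering with `refresh_on_add=True` recovers it. That is the intuition behind refreshing port information when the port is added rather than relying on event order.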

@pavel-shirshov
Contributor

Have you checked this patch?
jpirko/libteam@45912de

@yxieca
Contributor Author

yxieca commented Apr 28, 2019

@pavel-shirshov The change you identified has passed about 200 test iterations with patches 0006, 0007, 0008 and the patch in this PR removed. It looks like this change could be the fix. Thanks for looking into it!

Port libteam patch which fixes the race condition we observed during
warm reboot.

Remove early patches: 0006, 0008, 0009.

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
@yxieca yxieca changed the title from "[teamd] refresh port information earlier" to "[teamd] Porting teamd patch which solves the race condition between LTM_NEWLINK and port add" on Apr 29, 2019
@yxieca yxieca changed the title from "[teamd] Porting teamd patch which solves the race condition between LTM_NEWLINK and port add" to "[teamd] Porting teamd patch which solves the race condition between RTM_NEWLINK and port add" on Apr 29, 2019
@yxieca yxieca requested a review from stcheng April 29, 2019 15:55
@yxieca yxieca merged commit b9ddae8 into sonic-net:master Apr 30, 2019
@yxieca yxieca deleted the teamd branch April 30, 2019 00:56
yxieca added a commit that referenced this pull request Apr 30, 2019
MichelMoriniaux pushed a commit to criteo-forks/sonic-buildimage that referenced this pull request May 28, 2019
3 participants