Restart SwSS, syncd and dependent services if a critical process in syncd container exits unexpectedly #3534

Merged 2 commits on Nov 9, 2019

jleveque
Contributor

- What I did

Restart SwSS, syncd and dependent services if a critical process in the syncd container exits unexpectedly

- How I did it

Add the same mechanism I developed for the SwSS service in #2845 to the syncd service. In addition, so that the SwSS service also exits and restarts in this situation, I developed a docker-wait-any program, which the SwSS service uses to wait for either the swss or the syncd container to exit.
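The docker-wait-any program itself is not shown in this conversation. As a rough illustration of the idea only, here is a minimal Python sketch that blocks until any one of several containers exits. It assumes the standard `docker wait <container>` CLI command, which blocks until the named container stops. The names `wait_any` and `wait_cmd` are hypothetical and are not taken from the PR.

```python
import subprocess
import threading


def wait_any(names, wait_cmd=("docker", "wait")):
    """Block until any one of the named containers exits; return its name.

    Hypothetical sketch, not the PR's implementation. `wait_cmd` is the
    command prefix run once per container; it must block until that
    container has exited (as `docker wait <name>` does).
    """
    first_exited = []
    lock = threading.Lock()
    done = threading.Event()

    def waiter(name):
        # Blocks here until the wait command for this container returns.
        subprocess.run([*wait_cmd, name], stdout=subprocess.DEVNULL, check=False)
        with lock:
            # Record only the first container seen to exit.
            if not first_exited:
                first_exited.append(name)
        done.set()

    # One daemon thread per container, so stragglers never block exit.
    for name in names:
        threading.Thread(target=waiter, args=(name,), daemon=True).start()

    done.wait()
    return first_exited[0]
```

Under these assumptions, a service script could call `wait_any(["swss", "syncd"])` and proceed to its stop/restart logic as soon as either container is gone, rather than waiting on the swss container alone.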

- How to verify it

Run sudo pkill -11 <critical_process_in_syncd_container> and observe that the syncd service exits, the swss service and all dependent services exit, and then all of those services start back up.

Review thread on files/scripts/swss.sh (outdated, resolved)
@lguohan
Collaborator

lguohan commented Nov 9, 2019

retest broadcom please

@lguohan lguohan merged commit 85b0de3 into sonic-net:master Nov 9, 2019
@jleveque jleveque deleted the restart_swss_syncd_crash branch November 9, 2019 20:45
zhenggen-xu pushed a commit to zhenggen-xu/sonic-buildimage that referenced this pull request Jan 10, 2020
…cal process in syncd container exits unexpectedly (sonic-net#3534)

Add the same mechanism I developed for the SwSS service in sonic-net#2845 to the syncd service. However, in order to cause the SwSS service to also exit and restart in this situation, I developed a docker-wait-any program which the SwSS service uses to wait for either the swss or syncd containers to exit.