Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sai_failure_dump]Invoking dump during SAI failure #2644

Merged
merged 6 commits into from
Feb 8, 2023

Conversation

dgsudharsan
Copy link
Collaborator

@dgsudharsan dgsudharsan commented Jan 27, 2023

HLD: sonic-net/SONiC#1212
Depends on sonic-net/sonic-sairedis#1198
What I did
Added logic to invoke SAI failure dump during any SAI programming failure before invoking abort.

Why I did it
To collect necessary dumps in problem state in syncd before abort is called and all processes restarts

How I verified it
Manual verification. Added UT to cover abort scenario as well.

Details if related

@dgsudharsan
Copy link
Collaborator Author

/azpw run Azure.sonic-swss

@mssonicbld
Copy link
Collaborator

/AzurePipelines run Azure.sonic-swss

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@dgsudharsan
Copy link
Collaborator Author

/azpw run Azure.sonic-swss

@mssonicbld
Copy link
Collaborator

/AzurePipelines run Azure.sonic-swss

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@dgsudharsan
Copy link
Collaborator Author

@prsunny I have added tests to cover all lines except main file (which cannot be included) but still the coverage throws error. I feel there is some issue with coverage here. Can you please help?

@prsunny prsunny merged commit 44ea6a0 into sonic-net:master Feb 8, 2023
@prsunny
Copy link
Collaborator

prsunny commented Feb 8, 2023

Tests are added for coverage but for some reason, it is not showing in the tool. Skipping the check as the changes are in the failure path.

dgsudharsan added a commit to dgsudharsan/sonic-buildimage that referenced this pull request Feb 8, 2023
Update sonic-swss submodule pointer to include the following:
* 44ea6a0 [sai_failure_dump]Invoking dump during SAI failure ([sonic-net#2644](sonic-net/sonic-swss#2644))
* 065a471 [hash]: Add UT infra. ([sonic-net#2660](sonic-net/sonic-swss#2660))
* 9597eb7 [autoneg]Fixing adv interface types to be set when AN is disabled ([sonic-net#2638](sonic-net/sonic-swss#2638))

Signed-off-by: dgsudharsan <sudharsand@nvidia.com>
dgsudharsan added a commit to dgsudharsan/sonic-swss that referenced this pull request Feb 9, 2023
* [sai_failure_dump]Invoking dump during SAI failure
liat-grozovik pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Feb 9, 2023
Update sonic-swss submodule pointer to include the following:
* 44ea6a0 [sai_failure_dump]Invoking dump during SAI failure ([#2644](sonic-net/sonic-swss#2644))
* 065a471 [hash]: Add UT infra. ([#2660](sonic-net/sonic-swss#2660))
* 9597eb7 [autoneg]Fixing adv interface types to be set when AN is disabled ([#2638](sonic-net/sonic-swss#2638))

Signed-off-by: dgsudharsan <sudharsand@nvidia.com>
prsunny pushed a commit that referenced this pull request Feb 9, 2023
* [sai_failure_dump]Invoking dump during SAI failure
* Added logic to invoke SAI failure dump during any SAI programming failure before invoking abort.
@yxieca
Copy link
Contributor

yxieca commented Feb 9, 2023

@dgsudharsan please help create a separate PR for 202205 branch.

@dgsudharsan
Copy link
Collaborator Author

@yxieca I already created and its merged #2661. Can you please update submodule?

StormLiangMS pushed a commit that referenced this pull request Feb 10, 2023
* [sai_failure_dump]Invoking dump during SAI failure
prsunny pushed a commit that referenced this pull request Feb 16, 2023
*Merge remote-tracking branch 'upstream/master' into dash (#2663)
* Modify coppmgr mergeConfig to support preserving copp tables through reboot. (#2548)
* Avoid aborting orchagent when setting TUNNEL attributes (#2591)
* Handle Mac address 'none' (#2593)
* Increase diff coverage to 80% (#2599)
* Add missing parameter to on_switch_shutdown_request method. (#2567)
* Add ZMQ based ProducerStateTable and CustomerStateTable.
* Revert "[voq][chassis]Add show fabric counters port/queue commands (#2522)" (#2611)
* Added new attributes for Vnet and Vxlan ecmp configurations. (#2584)
* added support for monitoring, primary and adv_prefix and overlay_dmac.
* [routesync] Fix for stale dynamic neighbor (#2553)
* [MuxOrch] Enabling neighbor when adding in active state (#2601)
* Changed the BFD default detect multiplier to 10x (#2614)
* Remove TODO comments that are no longer relevant (#2622)
* Fix issue #13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL (#2619)
* [refactor]Refactoring sai handle status (#2621)
* Vxlan tunnel endpoint custom monitoring APPL DB table. (#2589)
* added support for monitoring, primary and adv_prefix. changed filter_mac to overlay_dmac
* Data Structures and code to write APP_DB VNET_MONITOR table entries for custom monitoring of Vxlan tunnel endpoints.
* [bfdorch] add local discriminator to state DB (#2629)
* [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  (#2617)
* [voq][chassis] Remove created ports from the default vlan. (#2607)
* [EVPN]Handling race condition when remote VNI arrives before tunnel map entry (#2642)
*Added check in remote VNI add to ensure vxlan tunnel map is created before adding the remote end point.
* [test_mux] add sleep in test_NH (#2648)
* [autoneg]Fixing adv interface types to be set when AN is disabled (#2638)
* [hash]: Add UT infra. (#2660)
*Added UT infra for Generic Hash feature
*Aligned PBH tests with Generic Hash UT infra
* [sai_failure_dump]Invoking dump during SAI failure (#2644)
* [ResponsePublisher] add pipeline support  (#2511)
* [dash] Fix compilation issue caused by missing include.
@dgsudharsan dgsudharsan deleted the sai_failure branch March 9, 2023 02:02
theasianpianist pushed a commit to theasianpianist/sonic-swss that referenced this pull request Jul 21, 2023
)

*Merge remote-tracking branch 'upstream/master' into dash (sonic-net#2663)
* Modify coppmgr mergeConfig to support preserving copp tables through reboot. (sonic-net#2548)
* Avoid aborting orchagent when setting TUNNEL attributes (sonic-net#2591)
* Handle Mac address 'none' (sonic-net#2593)
* Increase diff coverage to 80% (sonic-net#2599)
* Add missing parameter to on_switch_shutdown_request method. (sonic-net#2567)
* Add ZMQ based ProducerStateTable and CustomerStateTable.
* Revert "[voq][chassis]Add show fabric counters port/queue commands (sonic-net#2522)" (sonic-net#2611)
* Added new attributes for Vnet and Vxlan ecmp configurations. (sonic-net#2584)
* added support for monitoring, primary and adv_prefix and overlay_dmac.
* [routesync] Fix for stale dynamic neighbor (sonic-net#2553)
* [MuxOrch] Enabling neighbor when adding in active state (sonic-net#2601)
* Changed the BFD default detect multiplier to 10x (sonic-net#2614)
* Remove TODO comments that are no longer relevant (sonic-net#2622)
* Fix issue #13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL (sonic-net#2619)
* [refactor]Refactoring sai handle status (sonic-net#2621)
* Vxlan tunnel endpoint custom monitoring APPL DB table. (sonic-net#2589)
* added support for monitoring, primary and adv_prefix. changed filter_mac to overlay_dmac
* Data Structures and code to write APP_DB VNET_MONITOR table entries for custom monitoring of Vxlan tunnel endpoints.
* [bfdorch] add local discriminator to state DB (sonic-net#2629)
* [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  (sonic-net#2617)
* [voq][chassis] Remove created ports from the default vlan. (sonic-net#2607)
* [EVPN]Handling race condition when remote VNI arrives before tunnel map entry (sonic-net#2642)
*Added check in remote VNI add to ensure vxlan tunnel map is created before adding the remote end point.
* [test_mux] add sleep in test_NH (sonic-net#2648)
* [autoneg]Fixing adv interface types to be set when AN is disabled (sonic-net#2638)
* [hash]: Add UT infra. (sonic-net#2660)
*Added UT infra for Generic Hash feature
*Aligned PBH tests with Generic Hash UT infra
* [sai_failure_dump]Invoking dump during SAI failure (sonic-net#2644)
* [ResponsePublisher] add pipeline support  (sonic-net#2511)
* [dash] Fix compilation issue caused by missing include.
theasianpianist pushed a commit to theasianpianist/sonic-swss that referenced this pull request Jul 25, 2023
)

*Merge remote-tracking branch 'upstream/master' into dash (sonic-net#2663)
* Modify coppmgr mergeConfig to support preserving copp tables through reboot. (sonic-net#2548)
* Avoid aborting orchagent when setting TUNNEL attributes (sonic-net#2591)
* Handle Mac address 'none' (sonic-net#2593)
* Increase diff coverage to 80% (sonic-net#2599)
* Add missing parameter to on_switch_shutdown_request method. (sonic-net#2567)
* Add ZMQ based ProducerStateTable and CustomerStateTable.
* Revert "[voq][chassis]Add show fabric counters port/queue commands (sonic-net#2522)" (sonic-net#2611)
* Added new attributes for Vnet and Vxlan ecmp configurations. (sonic-net#2584)
* added support for monitoring, primary and adv_prefix and overlay_dmac.
* [routesync] Fix for stale dynamic neighbor (sonic-net#2553)
* [MuxOrch] Enabling neighbor when adding in active state (sonic-net#2601)
* Changed the BFD default detect multiplier to 10x (sonic-net#2614)
* Remove TODO comments that are no longer relevant (sonic-net#2622)
* Fix issue #13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL (sonic-net#2619)
* [refactor]Refactoring sai handle status (sonic-net#2621)
* Vxlan tunnel endpoint custom monitoring APPL DB table. (sonic-net#2589)
* added support for monitoring, primary and adv_prefix. changed filter_mac to overlay_dmac
* Data Structures and code to write APP_DB VNET_MONITOR table entries for custom monitoring of Vxlan tunnel endpoints.
* [bfdorch] add local discriminator to state DB (sonic-net#2629)
* [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  (sonic-net#2617)
* [voq][chassis] Remove created ports from the default vlan. (sonic-net#2607)
* [EVPN]Handling race condition when remote VNI arrives before tunnel map entry (sonic-net#2642)
*Added check in remote VNI add to ensure vxlan tunnel map is created before adding the remote end point.
* [test_mux] add sleep in test_NH (sonic-net#2648)
* [autoneg]Fixing adv interface types to be set when AN is disabled (sonic-net#2638)
* [hash]: Add UT infra. (sonic-net#2660)
*Added UT infra for Generic Hash feature
*Aligned PBH tests with Generic Hash UT infra
* [sai_failure_dump]Invoking dump during SAI failure (sonic-net#2644)
* [ResponsePublisher] add pipeline support  (sonic-net#2511)
* [dash] Fix compilation issue caused by missing include.
theasianpianist pushed a commit to theasianpianist/sonic-swss that referenced this pull request Jul 25, 2023
)

*Merge remote-tracking branch 'upstream/master' into dash (sonic-net#2663)
* Modify coppmgr mergeConfig to support preserving copp tables through reboot. (sonic-net#2548)
* Avoid aborting orchagent when setting TUNNEL attributes (sonic-net#2591)
* Handle Mac address 'none' (sonic-net#2593)
* Increase diff coverage to 80% (sonic-net#2599)
* Add missing parameter to on_switch_shutdown_request method. (sonic-net#2567)
* Add ZMQ based ProducerStateTable and CustomerStateTable.
* Revert "[voq][chassis]Add show fabric counters port/queue commands (sonic-net#2522)" (sonic-net#2611)
* Added new attributes for Vnet and Vxlan ecmp configurations. (sonic-net#2584)
* added support for monitoring, primary and adv_prefix and overlay_dmac.
* [routesync] Fix for stale dynamic neighbor (sonic-net#2553)
* [MuxOrch] Enabling neighbor when adding in active state (sonic-net#2601)
* Changed the BFD default detect multiplier to 10x (sonic-net#2614)
* Remove TODO comments that are no longer relevant (sonic-net#2622)
* Fix issue #13341 ARP entry can be out of sync between kernel and APPL_DB if multiple updates are received from RTNL (sonic-net#2619)
* [refactor]Refactoring sai handle status (sonic-net#2621)
* Vxlan tunnel endpoint custom monitoring APPL DB table. (sonic-net#2589)
* added support for monitoring, primary and adv_prefix. changed filter_mac to overlay_dmac
* Data Structures and code to write APP_DB VNET_MONITOR table entries for custom monitoring of Vxlan tunnel endpoints.
* [bfdorch] add local discriminator to state DB (sonic-net#2629)
* [acl] Add new ACL key BTH_OPCODE and AETH_SYNDROME  (sonic-net#2617)
* [voq][chassis] Remove created ports from the default vlan. (sonic-net#2607)
* [EVPN]Handling race condition when remote VNI arrives before tunnel map entry (sonic-net#2642)
*Added check in remote VNI add to ensure vxlan tunnel map is created before adding the remote end point.
* [test_mux] add sleep in test_NH (sonic-net#2648)
* [autoneg]Fixing adv interface types to be set when AN is disabled (sonic-net#2638)
* [hash]: Add UT infra. (sonic-net#2660)
*Added UT infra for Generic Hash feature
*Aligned PBH tests with Generic Hash UT infra
* [sai_failure_dump]Invoking dump during SAI failure (sonic-net#2644)
* [ResponsePublisher] add pipeline support  (sonic-net#2511)
* [dash] Fix compilation issue caused by missing include.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants