
[Mellanox][202012] Support Mellanox-SN4600C-C64 as T1 switch in dual-ToR scenario #11032

Merged
merged 10 commits into sonic-net:202012 from dual-tor-t1-c64-202012 on Jun 21, 2022

Conversation

stephenxs
Collaborator

Why I did it

Support Mellanox-SN4600C-C64 as T1 switch in dual-ToR scenario

  1. Support an additional queue and PG in the buffer templates, in both the traditional and dynamic models.
  2. Support mapping DSCP 2/6 to lossless traffic in the QoS template.
  3. Add macros to generate the additional lossless PGs in the dynamic model.
  4. Adjust the order in which the generic/dedicated (with additional lossless queues) macros are checked and called to generate buffer tables in the common template buffers_config.j2 (see the sketch after this list).
    • Buffer tables are rendered using macros.
    • Both generic and dedicated macros are defined on our platform, and the generic one is called whenever it is defined, so it was always the one used on our platform. To avoid this, the dedicated macro is now checked and called first, and only then the generic ones.
  5. Support MAP_PFC_PRIORITY_TO_PRIORITY_GROUP on ports with additional lossless queues.
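
A minimal sketch of the reordering in buffers_config.j2, assuming probe-and-call logic with macro names derived from the suffixes used in this PR (the exact names and signatures are illustrative, not copied from the template):

```jinja
{#- Probe the dedicated (extra lossless queue) macros before the generic
    one. If the generic macro were checked first, it would always win on
    platforms that define both variants. Names here are illustrative. -#}
{%- if generate_buffer_tables_with_extra_queue_with_inactive_port is defined %}
{{ generate_buffer_tables_with_extra_queue_with_inactive_port() }}
{%- elif generate_buffer_tables_with_extra_queue is defined %}
{{ generate_buffer_tables_with_extra_queue() }}
{%- else %}
{{ generate_buffer_tables() }}
{%- endif %}
```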

On Mellanox-SN4600C-C64, the buffer configuration for T1 is calculated as follows (an illustrative rendered fragment appears after the list):

  • 40 * 100G downlink ports with 4 lossless PGs/queues, 1 lossy PG, and 3 lossy queues
  • 16 * 100G uplink ports with 2 lossless PGs/queues, 1 lossy PG, and 5 lossy queues
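
For illustration only, the rendered buffer tables implied by these numbers would contain entries along these lines; the port names, profile names, and the concrete lossless PG sets (2-4 and 6 on downlinks, 3-4 on uplinks) are assumptions based on the common dual-ToR convention, not copied from this PR:

```jinja
{#- Hypothetical rendered BUFFER_PG fragment: one downlink port with 4
    lossless PGs and one uplink port with 2, each with one lossy PG 0. -#}
"BUFFER_PG": {
    "Ethernet0|2-4": {"profile": "pg_lossless_100000_5m_profile"},
    "Ethernet0|6": {"profile": "pg_lossless_100000_5m_profile"},
    "Ethernet0|0": {"profile": "ingress_lossy_profile"},
    "Ethernet256|3-4": {"profile": "pg_lossless_100000_40m_profile"},
    "Ethernet256|0": {"profile": "ingress_lossy_profile"}
}
```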

Signed-off-by: Stephen Sun <stephens@nvidia.com>

How I did it

How to verify it

Run regression tests.

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

stephenxs and others added 4 commits June 5, 2022 02:39
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
…ng macro are defined

Move the '_with_extra_queue' and 'with_extra_queue_with_inactive_port' versions to the beginning.
For some vendors, all versions are defined.
If the generic version were checked first, it would always be called whenever it is defined.

Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
@bingwang-ms
Contributor

https://github.com/Azure/sonic-buildimage/blob/f0f23ebac210777333ca3f6f234317f4e211ff89/device/mellanox/x86_64-mlnx_msn4600c-r0/Mellanox-SN4600C-C64/qos.json.j2#L82
Should it be "7" : "0" or "7" : "7"?
We are using "7" : "7" in current design. But I think "7" : "0" also makes sense. We can save a lossy PG in that way.
@neethajohn What do you think?

@stephenxs
Collaborator Author

https://github.com/Azure/sonic-buildimage/blob/f0f23ebac210777333ca3f6f234317f4e211ff89/device/mellanox/x86_64-mlnx_msn4600c-r0/Mellanox-SN4600C-C64/qos.json.j2#L82

Should it be "7" : "0" or "7" : "7"?
We are using "7" : "7" in current design. But I think "7" : "0" also makes sense. We can save a lossy PG in that way.
@neethajohn What do you think?

Yes. We would like to use 7:0 for the purpose of saving a lossy PG and buffers.

It doesn't make sense, since both cases result in the same code.
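
In qos.json.j2 terms, the agreed 7→0 choice discussed above amounts to folding DSCP 7 into the default lossy traffic class instead of provisioning a dedicated one; the surrounding entries here are illustrative context only:

```jinja
{#- DSCP_TO_TC_MAP fragment: DSCP 7 maps to TC 0, so no extra lossy PG
    is needed for it. The other entries are illustrative, not from the PR. -#}
"DSCP_TO_TC_MAP": {
    "AZURE": {
        "2": "2",
        "6": "6",
        "7": "0"
    }
}
```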

Signed-off-by: Stephen Sun <stephens@nvidia.com>
@stephenxs
Collaborator Author

/azpw run Azure.sonic-buildimage

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@stephenxs
Collaborator Author

Looks like it is failing because #11018 was cherry-picked without updating the unit tests.

@liat-grozovik
Collaborator

@neethajohn, @bingwang-ms could you please help to review and sign off?

@stephenxs
Collaborator Author

stephenxs commented Jun 18, 2022

The build issues appear to be caused by newly introduced commits:

  • the global DSCP_TO_TC map, which is not necessary on our platforms. I will fix it by introducing a vendor-specific generate_global_dscp_to_tc_map to avoid generating the global map (see the sketch below).
  • additional lossy PGs on T1 uplinks, which are still under discussion and will be handled in another PR.
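
A minimal sketch of the first fix, assuming the common QoS template probes for a vendor-defined macro before emitting the global map (only generate_global_dscp_to_tc_map is named in the comment above; the probing side and the fallback name are assumptions):

```jinja
{#- Vendor side: define generate_global_dscp_to_tc_map as a no-op so that
    no global DSCP_TO_TC map is generated on this platform. -#}
{%- macro generate_global_dscp_to_tc_map() %}
{#- intentionally empty -#}
{%- endmacro %}

{#- Common-template side (assumed): emit the global map only when the
    vendor has not supplied an override. The default macro name below
    is hypothetical. -#}
{%- if generate_global_dscp_to_tc_map is defined %}
{{ generate_global_dscp_to_tc_map() }}
{%- else %}
{{ generate_default_global_dscp_to_tc_map() }}
{%- endif %}
```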

- the way to enable the dual-ToR feature has changed
- global DSCP_TO_TC_MAP

Signed-off-by: Stephen Sun <stephens@nvidia.com>
neethajohn merged commit 307d0e2 into sonic-net:202012 on Jun 21, 2022
stephenxs deleted the dual-tor-t1-c64-202012 branch on June 21, 2022 22:17
liat-grozovik pushed a commit that referenced this pull request Jul 20, 2022
…ario (#11261)

- Why I did it
Support Mellanox-SN4600C-C64 as T1 switch in dual-ToR scenario
This is to port #11032 and #11299 from 202012 to master.

  1. Support an additional queue and PG in the buffer templates, in both the traditional and dynamic models.
  2. Support mapping DSCP 2/6 to lossless traffic in the QoS template.
  3. Add macros to generate the additional lossless PGs in the dynamic model.
  4. Adjust the order in which the generic/dedicated (with additional lossless queues) macros are checked and called to generate buffer tables in the common template buffers_config.j2.
    • Buffer tables are rendered using macros.
    • Both generic and dedicated macros are defined on our platform, and the generic one is called whenever it is defined, so it was always the one used on our platform. To avoid this, the dedicated macro is now checked and called first, and only then the generic ones.
  5. Support MAP_PFC_PRIORITY_TO_PRIORITY_GROUP on ports with additional lossless queues.

On Mellanox-SN4600C-C64, the buffer configuration for T1 is calculated as:

  • 40 * 100G downlink ports with 4 lossless PGs/queues, 1 lossy PG, and 3 lossy queues
  • 16 * 100G uplink ports with 2 lossless PGs/queues, 1 lossy PG, and 5 lossy queues

Signed-off-by: Stephen Sun <stephens@nvidia.com>
yxieca pushed a commit that referenced this pull request Jul 28, 2022
…ario (#11261)
skbarista pushed a commit to skbarista/sonic-buildimage that referenced this pull request Aug 17, 2022
…ario (sonic-net#11261)