
[FDB] All MACs are not synced to the kernel in scale scenario #12502

Closed · dgsudharsan opened this issue Oct 26, 2022 · 18 comments

Labels: BRCM · Issue for 202205 · Triaged (this issue has been triaged)

Comments

dgsudharsan (Collaborator) commented Oct 26, 2022

Description

During scale testing, not all MACs are synced to the kernel. This can easily be reproduced by learning 10K MACs on a port and doing a shutdown/no-shutdown. Could this be due to fdbsyncd not checking whether the port is up in the kernel before programming the MAC? (A sketch of such a check follows below.)

The second scenario where the problem happens is when a MAC ages out in the kernel but not in the switch (if the FDB aging time is increased on the switch). This results in fdbsyncd processing stale MAC notifications and reprogramming them based on the status in STATE_DB. In this scenario it is always observed that exactly 8K MACs are reprogrammed (could this be the size of the netlink buffer queue?).

The above two scenarios result in many MACs not being synced to remote VTEPs in EVPN.
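
For illustration, a guard of the kind suggested above could look roughly like the sketch below, which consults PORT_TABLE in STATE_DB before a MAC is programmed into the kernel. This is a minimal sketch using the sonic-swss-common DB API, not fdbsyncd's actual code; the helper name shouldProgramKernelFdb and its call site are hypothetical.

// Hypothetical guard (sketch): skip kernel FDB programming while the port
// is not up. Uses the sonic-swss-common DB API; error handling omitted.
#include <string>
#include "dbconnector.h"
#include "table.h"

static bool shouldProgramKernelFdb(swss::DBConnector &stateDb,
                                   const std::string &port)
{
    swss::Table portTable(&stateDb, "PORT_TABLE");
    std::string operStatus;

    // If STATE_DB has no oper_status yet, or the port is down, programming
    // the MAC now would race with the interface flap described above.
    if (!portTable.hget(port, "oper_status", operStatus))
        return false;
    return operStatus == "up";
}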

Steps to reproduce the issue:

  1. Learn 10K MACs on a port
  2. Shut down the interface
  3. Start up the interface

Describe the results you received:

Describe the results you expected:

Output of show version:

SONiC Software Version: SONiC.202205.42-ea51d9514_Internal
Distribution: Debian 11.5
Kernel: 5.10.0-12-2-amd64
Build commit: ea51d9514
Build date: Fri Oct  7 05:45:56 UTC 2022
Built by: sw-r2d2-bot@r-build-sonic-ci03-243

Platform: x86_64-mlnx_msn2700-r0
HwSKU: Mellanox-SN2700
ASIC: mellanox
ASIC Count: 1
Serial Number: MT1829X20804
Model Number: MSN2700-CB2F
Hardware Revision: A2
Uptime: 07:07:08 up 11:30,  3 users,  load average: 2.17, 2.46, 2.58
Date: Wed 26 Oct 2022 07:07:08

Docker images:
REPOSITORY                                         TAG                            IMAGE ID       SIZE
docker-syncd-mlnx                                  202205.42-ea51d9514_Internal   97dd12cebe1e   859MB
docker-syncd-mlnx                                  latest                         97dd12cebe1e   859MB
docker-orchagent                                   202205.42-ea51d9514_Internal   347f0cdc723f   478MB
docker-orchagent                                   latest                         347f0cdc723f   478MB
docker-fpm-frr                                     202205.42-ea51d9514_Internal   ddadceae2d69   488MB
docker-fpm-frr                                     latest                         ddadceae2d69   488MB
docker-teamd                                       202205.42-ea51d9514_Internal   28f79f968d3c   459MB
docker-teamd                                       latest                         28f79f968d3c   459MB
docker-platform-monitor                            202205.42-ea51d9514_Internal   629c9ea03cf2   861MB
docker-platform-monitor                            latest                         629c9ea03cf2   861MB
docker-macsec                                      latest                         a7ea8b95281f   461MB
docker-snmp                                        202205.42-ea51d9514_Internal   0e96a62d07ee   488MB
docker-snmp                                        latest                         0e96a62d07ee   488MB
docker-dhcp-relay                                  latest                         8cef09a39edf   452MB
docker-lldp                                        202205.42-ea51d9514_Internal   337146c6b971   485MB
docker-lldp                                        latest                         337146c6b971   485MB
docker-mux                                         202205.42-ea51d9514_Internal   464339799d55   492MB
docker-mux                                         latest                         464339799d55   492MB
docker-sonic-telemetry                             202205.42-ea51d9514_Internal   7fc604d28c7c   523MB
docker-sonic-telemetry                             latest                         7fc604d28c7c   523MB
docker-database                                    202205.42-ea51d9514_Internal   98a7bdcfd7e8   443MB
docker-database                                    latest                         98a7bdcfd7e8   443MB
docker-router-advertiser                           202205.42-ea51d9514_Internal   f05c810acb38   443MB
docker-router-advertiser                           latest                         f05c810acb38   443MB
docker-nat                                         202205.42-ea51d9514_Internal   272fda2cdf1a   430MB
docker-nat                                         latest                         272fda2cdf1a   430MB
docker-sflow                                       202205.42-ea51d9514_Internal   5723c8d63918   428MB
docker-sflow                                       latest                         5723c8d63918   428MB
docker-sonic-mgmt-framework                        202205.42-ea51d9514_Internal   0fd3a3d91b98   557MB
docker-sonic-mgmt-framework                        latest                         0fd3a3d91b98   557MB
urm.nvidia.com/sw-nbu-sws-sonic-docker/sonic-wjh   1.3.1-202205                   4e8b9199b984   643MB

Output of show techsupport:


Issue 1 - sonic_dump_qa-eth-vt05-2-2700a1_20221026_070014
Issue 2 - sonic_dump_qa-eth-vt03-2-3700v_20221023_204313

Additional information you deem important (e.g. issue happens only occasionally):

sonic_dump_qa-eth-vt05-2-2700a1_20221026_070014.tar.gz
sonic_dump_qa-eth-vt03-2-3700v_20221023_204313.tar.gz

adyeung (Collaborator) commented Oct 28, 2022

@kishorekunal01 from BRCM to take a look

kishorekunal01 (Contributor) commented Oct 29, 2022

This is not an issue; this is kernel behavior, and the reprogramming is done so that the MAC is added back to the kernel, as per the design.

> The second scenario where the problem happens is when a MAC ages out in the kernel but not in the switch (if the FDB aging time is increased on the switch). This results in fdbsyncd processing stale MAC notifications and reprogramming them based on the status in STATE_DB. In this scenario it is always observed that exactly 8K MACs are reprogrammed (could this be the size of the netlink buffer queue?).

kishorekunal01 (Contributor) commented Oct 29, 2022

@dgsudharsan

When the interface goes down, all the entries in FDB_TABLE for that interface should get deleted by the switch. When the interface comes up (the interface also comes up in the kernel), traffic causes MAC learning to happen again in the ASIC, and the same MACs are added back to FDB_TABLE.

If it is not working, please enable debug logging with "swssloglevel -l INFO -c fdbsyncd" and provide the tech support.

dgsudharsan (Collaborator, Author) commented:

@kishorekunal01 The issue is not with reprogramming the MACs but with the consistency of MACs between the ASIC and the kernel. One of the captured dumps has swssloglevel enabled at INFO. As mentioned, there are two scenarios where MACs are not properly reprogrammed in the kernel. I believe fdbsyncd is not robust enough to handle these scenarios. Please let me know if you need more details.

kishorekunal01 (Contributor) commented Oct 31, 2022

@dgsudharsan
As per the reply above, the issue here is that local MACs are not in sync from the ASIC to the kernel.

I tried the interface up/down test case on a Broadcom chipset, and I don't see any issue with MAC sync between the ASIC and the kernel.

Attaching the tech support in the next comment.

As I replied earlier:
Step 1: When the interface goes down, all entries in FDB_TABLE for the interface should get deleted (triggered from the ASIC).
Step 2: When the interface comes up with the command "config interface startup Ethernetxx", the interface also comes up in the kernel.
Step 3: Traffic causes MAC learning to happen again in the ASIC, and the same MACs are added back to FDB_TABLE.
Step 4: fdbsyncd syncs these MACs (FDB_TABLE) to the kernel; a sketch of this step follows the attached log below.

Testing_With10K_MAC_Broadcom_Chip.log
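
As a rough illustration of step 4, the consuming side could be sketched with the sonic-swss-common subscriber API as below. This is a simplified, hypothetical sketch: the table location (FDB_TABLE in STATE_DB) is inferred from this discussion, and the kernel-programming step is only indicated in comments; the real fdbsyncd has its own netlink plumbing.

// Sketch of step 4: consume FDB_TABLE updates and mirror them to the kernel.
#include <deque>
#include <string>
#include "dbconnector.h"
#include "select.h"
#include "subscriberstatetable.h"

int main()
{
    // Assumption: FDB entries are published in STATE_DB under FDB_TABLE.
    swss::DBConnector stateDb("STATE_DB", 0);
    swss::SubscriberStateTable fdbTable(&stateDb, "FDB_TABLE");

    swss::Select s;
    s.addSelectable(&fdbTable);

    while (true)
    {
        swss::Selectable *sel = nullptr;
        if (s.select(&sel) != swss::Select::OBJECT)
            continue;

        std::deque<swss::KeyOpFieldsValuesTuple> entries;
        fdbTable.pops(entries);
        for (const auto &entry : entries)
        {
            std::string key = swss::kfvKey(entry); // e.g. "Vlan100:<mac>"
            (void)key;
            // On a SET op we would send an RTM_NEWNEIGH netlink request to
            // add the MAC to the kernel bridge FDB, and RTM_DELNEIGH on a
            // DEL op (kernel programming omitted in this sketch).
        }
    }
}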

kishorekunal01 (Contributor) commented:

I have enabled debug logging with "swssloglevel -l INFO -c fdbsyncd" and collected the tech support. Log file attached.

sonic_dump_sonic_20221031_230704.zip

dgsudharsan (Collaborator, Author) commented:

@kishorekunal01 Thanks. For the port up/down scenario, I did some more analysis and found the root cause to be a SAI notification issue, which I am handling internally.

However, I also reported the netlink buffer issue, which you can see in the logs:

Oct 23 20:24:52.707880 qa-eth-vt03-2-3700v ERR swss#fdbsyncd: :- readData: netlink reports out of memory on reading a netlink socket. High possibility of a lost message

Please check sonic_dump_qa-eth-vt03-2-3700v_20221023_204313.tar attached to the bug. This happens when I have 10K MACs on one port.
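
For reference, this error is emitted from the netlink read path when the kernel cannot queue a message because the socket receive buffer is already full; recvmsg() then fails with ENOBUFS, which libnl surfaces as -NLE_NOMEM, and the dropped message is unrecoverable. Below is a condensed sketch of that pattern, modeled on (not copied from) the readData function in sonic-swss-common:

#include <cstdio>
#include <netlink/errno.h>
#include <netlink/netlink.h>

void readData(struct nl_sock *socket)
{
    // Dispatch pending netlink messages to the registered callbacks.
    int err = nl_recvmsgs_default(socket);
    if (err == -NLE_NOMEM)
    {
        // The receive buffer overflowed during a large FDB dump: some
        // RTM_NEWNEIGH notifications were lost, leaving the kernel FDB
        // incomplete (the 10K-vs-8K discrepancy seen in this issue).
        fprintf(stderr, "netlink reports out of memory on reading a "
                        "netlink socket. High possibility of a lost message\n");
    }
}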

kishorekunal01 (Contributor) commented:

@prsunny Can we increase the netlink buffer? Currently it is set to 3 MB; can we increase it to 16 MB?

/* Set socket buffer size to 3MB */
nl_socket_set_buffer_size(m_socket, 3145728, 0);

When there is a dump from the kernel, the netlink buffer can run out of memory at a 10K MAC scale; hence this error is reported.
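
A sketch of the proposed change, assuming the same libnl call site, is below. One caveat: setsockopt(SO_RCVBUF), which libnl uses underneath, is silently capped by the net.core.rmem_max sysctl, so raising only the application-side request may have no effect (this is where the sysctl discussion further down comes in).

#include <netlink/netlink.h>

void growNetlinkBuffer(struct nl_sock *m_socket)
{
    /* Request a 16MB receive buffer; 0 leaves the tx buffer at libnl's
       default. The kernel clamps the request to net.core.rmem_max. */
    nl_socket_set_buffer_size(m_socket, 16777216, 0);
}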

judyjoseph added the BRCM and Triaged (this issue has been triaged) labels on Nov 23, 2022
judyjoseph (Contributor) commented:

  1. Check if the CPU processing speed makes a difference
  2. Increase the netlink memory as a short-term fix

dgsudharsan (Collaborator, Author) commented:

@adyeung Can you please provide an ETA for increasing the netlink buffer size?

adyeung assigned kishorekunal01 and unassigned adyeung on Dec 29, 2022
adyeung (Collaborator) commented Dec 29, 2022

Expecting a fix to be posted by 1/20/23

dgsudharsan (Collaborator, Author) commented:

> Expecting a fix to be posted by 1/20/23

@adyeung @kishorekunal01 Can you please share the fix if it is ready?

kishorekunal01 (Contributor) commented Jan 23, 2023 via email

dgsudharsan (Collaborator, Author) commented:

Hi @kishorekunal01, we are still seeing the issue even after the fix. Should some parameters be adjusted?

root@qa-eth-vt03-2-3700v:~# show logging | grep fdb | grep ERR
Feb 22 18:13:46.561569 qa-eth-vt03-2-3700v ERR swss#fdbsyncd: :- readData: netlink reports out of memory on reading a netlink socket. High possibility of a lost message

dgsudharsan (Collaborator, Author) commented:

sonic_dump_qa-eth-vt03-2-3700v_20230222_185508.tar.gz

I am attaching the techsupport here. You can see that 10K MACs are learnt in the ASIC; however, the kernel shows only 8K MACs.

dgsudharsan (Collaborator, Author) commented:

On a related note, someone reported the same issue, and increasing the buffer to 16 MB didn't help: https://groups.google.com/g/sonicproject/c/Lc0cs-RzNSE

dgsudharsan (Collaborator, Author) commented:

@kishorekunal01 Should we increase the netlink memory here: https://github.com/sonic-net/sonic-buildimage/blob/9ff2e2cff38fa71d0e5ce38f92d4339206849a74/files/image_config/sysctl/sysctl-net.conf? Currently it is 3 MB:

net.core.rmem_max=3145728
net.core.wmem_max=3145728

snider-nokia (Contributor) commented Mar 2, 2023

> @kishorekunal01 Should we increase the netlink memory here: https://github.com/sonic-net/sonic-buildimage/blob/9ff2e2cff38fa71d0e5ce38f92d4339206849a74/files/image_config/sysctl/sysctl-net.conf? Currently it is 3 MB: net.core.rmem_max=3145728 net.core.wmem_max=3145728

Yes, the net.core.rmem_max setting in the referenced file is the proper place to modify this setting. Please see my associated comment at sonic-net/sonic-swss-common#739 (comment).

Issue #12587 relates to netlink messages originated at the kernel and sent to the application (kernel-to-socket), thus only the socket receive buffer needs increasing in the context of that issue.

If the problematic path here is instead netlink socket-to-kernel rather than kernel-to-socket, then net.core.wmem_max will need to be increased for the sake of a larger socket tx buffer.
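
One way to confirm which limit is in effect is a small standalone check like the sketch below (illustrative only, not part of SONiC): the kernel caps a plain setsockopt(SO_RCVBUF) at net.core.rmem_max and reports back twice the granted value, so a granted value far below twice the request points at the sysctl as the limiter.

#include <cstdio>
#include <linux/netlink.h>
#include <sys/socket.h>
#include <unistd.h>

int main()
{
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    int requested = 16 * 1024 * 1024; // 16 MB, as proposed above
    setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &requested, sizeof(requested));

    int granted = 0;
    socklen_t len = sizeof(granted);
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &granted, &len);

    // getsockopt reports double the effective size (kernel bookkeeping).
    printf("requested %d bytes, kernel granted %d (reported doubled)\n",
           requested, granted);
    close(fd);
    return 0;
}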
