
zebra: backpressure - Zebra push back on Buffer/Stream creation #15411

Conversation

raja-rajasekar
Contributor

Currently, zebra creates a pthread per client (BGP is of interest in this case), and this thread loops in zserv_read() checking for incoming data. If there is any, it reads and validates the data and adds it to the ibuf_fifo, signalling the main thread to process the message. When the main thread gets the change, it processes the message and invokes the function pointer registered for the header command (e.g. zserv_handlers).

Finally, if all of this was successful, the task reschedules itself and loops in zserv_read() again.

However, if there are already items on the ibuf FIFO, that means zebra is slow at processing. With the current mechanism, if zebra main is busy, the ibuf FIFO keeps growing, holding up memory.

Show memory zebra (example: 15k streams hoarding ~160 MB of data):
--- qmem libfrr ---
Stream             :       44 variable   3432352    15042 161243800
Stream FIFO        :       45       72      3240       48      3456

Fix:

  • Stop scheduling read events when we know there are already X items on the FIFO (X = zebra zapi-packets <1-10000>, default 1000).

  • In zserv_read(), determine the number of items on the zserv->ibuf_fifo, subtract this from X, and pull only as many items off the socket as would take the ibuf_fifo back up to X items.

  • If the number of items in the ibuf_fifo has reached the maximum, either initially when zserv_read() is called or while processing the remainder of the incoming buffer, then zserv_read(), which runs in the client's pthread, expects zebra main to schedule a wakeup from zserv_process_message() once it has processed all the items on the buffer (a sketch of this logic follows below).
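
A minimal sketch of the read-side limit described above, under stated assumptions: struct client, zebra_packets_max, read_and_enqueue(), and sched_read() are illustrative names, not FRR's actual zserv API.

/* Hedged sketch of the zserv_read() backpressure check; all names here
 * are illustrative stand-ins for the real zserv.c structures. */
#include <stddef.h>

struct client {
	size_t ibuf_fifo_count; /* messages queued for the main thread */
};

/* Cap corresponding to "zebra zapi-packets <1-10000>" (default 1000). */
static size_t zebra_packets_max = 1000;

/* Assumed helpers: read/validate up to 'budget' messages from the socket
 * into the FIFO (returns how many were enqueued), and re-arm the read
 * event on the client's I/O pthread. */
extern size_t read_and_enqueue(struct client *client, size_t budget);
extern void sched_read(struct client *client);

/* One pass of the client I/O loop. */
static void client_read(struct client *client)
{
	/* FIFO already at the cap: stop reading and do not re-arm the
	 * read event; zebra main wakes us after draining the queue. */
	if (client->ibuf_fifo_count >= zebra_packets_max)
		return;

	/* Pull only enough messages to bring the FIFO back up to the cap. */
	size_t budget = zebra_packets_max - client->ibuf_fifo_count;
	client->ibuf_fifo_count += read_and_enqueue(client, budget);

	/* Still below the cap: reschedule ourselves as before. */
	if (client->ibuf_fifo_count < zebra_packets_max)
		sched_read(client);
}

In a real implementation the FIFO count is shared between the client pthread and zebra main, so access would need atomics or the client lock; the cap itself would be adjusted via the zebra zapi-packets command.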

NOTE

These code changes are on the server (Zebra) side.
The code changes for the client (BGP) side will be raised in a separate review.

Ticket: #3390099

@ton31337
Member

Is this related to #15390?

@mjstapp mjstapp self-requested a review February 23, 2024 13:30
@donaldsharp
Member

It is, insofar as #15390 has changes for both zebra and bgp. I believe we are having some internal discussions regarding the bgp changes, and these zebra changes are independent of the bgp changes. So the goal is to get these in while we continue to work on the bgp changes. The changes are independent enough that, at this point in time, all they would do is add more data to the buffer on the bgp side, as opposed to the backup being spread between bgp and zebra as it is currently.

@raja-rajasekar raja-rajasekar force-pushed the rajasekarr/backpressure_bgp_zebra_zserv branch from b8a20dd to f8d4565 Compare February 23, 2024 23:57
@raja-rajasekar
Contributor Author

Testing/observation: Testing.pdf

@riw777 riw777 self-requested a review February 27, 2024 16:15
@ton31337
Member

Looking at testing.pdf, the question that comes to mind is: why is the heap increasing with the code change while the stream size remains OK?

@raja-rajasekar
Contributor Author

raja-rajasekar commented Feb 29, 2024

Looking at testing.pdf, the question that comes to mind is: why is the heap increasing with the code change while the stream size remains OK?

The idea in testing.pdf is to compare the Streams of the "initial state" and "after trigger" for the baseline and for the code changes separately. The heap is larger in one case than the other because the topology used is different for the runs with and without the code changes. The heap increases because dplane_ctx and other modules take extra memory, but the increase from before to after the trigger is ~200-250 MB.

Apologies for the confusion. Here are the new test results, where the setup is the same for the baseline and the code changes, and the complete show memory output is provided.
New_Testing.pdf

@ton31337 ton31337 left a comment
Member

LGTM, but I'll let others look at this also. /cc @mjstapp

@riw777 riw777 left a comment
Member

looks good

@raja-rajasekar raja-rajasekar force-pushed the rajasekarr/backpressure_bgp_zebra_zserv branch from f8d4565 to 20cf48d Compare March 6, 2024 10:35
@github-actions github-actions bot added size/L and removed size/M labels Mar 6, 2024
@raja-rajasekar raja-rajasekar force-pushed the rajasekarr/backpressure_bgp_zebra_zserv branch 2 times, most recently from a24eaa4 to 7efe441 Compare March 6, 2024 10:48
@raja-rajasekar
Contributor Author

Final Test Results @
Latest_Testing.pdf

@mjstapp mjstapp left a comment
Contributor

hmm, gosh, do you really think the cli changes are necessary? I don't really think we need to expose the inner workings of the module.

@raja-rajasekar raja-rajasekar force-pushed the rajasekarr/backpressure_bgp_zebra_zserv branch from 7efe441 to 34cbec9 Compare March 7, 2024 19:47
@github-actions github-actions bot added size/M and removed size/L labels Mar 7, 2024
@raja-rajasekar raja-rajasekar force-pushed the rajasekarr/backpressure_bgp_zebra_zserv branch from 34cbec9 to 7e99688 Compare March 7, 2024 20:05
@raja-rajasekar
Contributor Author

Final test results @
Final_Testing.pdf

Currently, zebra creates a pthread per client (BGP is of interest in
this case), and this thread loops in zserv_read() checking for incoming
data. If there is any, it reads and validates the data and adds it to
the ibuf_fifo, signalling the main thread to process the message. When
the main thread gets the change, it processes the message and invokes
the function pointer registered for the header command
(e.g. zserv_handlers).

Finally, if all of this was successful, the task reschedules itself and
loops in zserv_read() again.

However, if there are already items on the ibuf FIFO, that means zebra
is slow at processing. With the current mechanism, if zebra main is
busy, the ibuf FIFO keeps growing, holding up memory.

Show memory zebra:(Example: 15k streams hoarding ~160 MB of data)
--- qmem libfrr ---
Stream             :       44 variable   3432352    15042 161243800

Fix:
Client IO Thread: (zserv_read)
 - Stop scheduling read events when we know there are already X items
   on the FIFO (X = zebra zapi-packets <1-10000>, default 1000).

 - Determine the number of items on the zserv->ibuf_fifo, subtract
   this from X, and pull only as many items off the socket as would
   take the ibuf_fifo back up to X items.

 - If the number of items in the ibuf_fifo has reached the maximum,
      * either initially when zserv_read() is called, or
      * when processing the remainder of the incoming buffer,
   the client IO thread is woken by zebra main.

Main thread: (zserv_process_message)
After processing the client's ibuf, always schedule a wakeup to the
client IO thread to read more items from the socket buffer. This way we
ensure:
 - the client IO thread always tries to read the socket buffer and add
   more items to the ibuf_fifo (until the max limit), and
 - the hidden config change (zebra zapi-packets <>) is taken into
   account.

Ticket: #3390099

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
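
A matching sketch of the main-thread half of the handshake in the commit message above, reusing the illustrative names from the earlier sketch; drain_messages() is likewise an assumed helper, and sched_read() is taken to be idempotent when the read event is already armed.

/* Hedged sketch of the zserv_process_message() role on zebra main; real
 * code must synchronize these fields with the client I/O pthread. */
#include <stddef.h>

struct client {
	size_t ibuf_fifo_count; /* messages queued for the main thread */
};

extern void sched_read(struct client *client); /* re-arm client read event */
/* Assumed helper: dequeue messages and invoke the registered handlers
 * (the zserv_handlers role); returns how many were processed. */
extern size_t drain_messages(struct client *client);

static void main_process_messages(struct client *client)
{
	client->ibuf_fifo_count -= drain_messages(client);

	/* Always schedule a read wakeup afterwards: the I/O thread either
	 * keeps reading as before or, if it had stopped at the cap, resumes
	 * and tops the FIFO back up toward zebra_packets_max, so the hidden
	 * zapi-packets limit is honored on every cycle. */
	sched_read(client);
}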
@raja-rajasekar raja-rajasekar force-pushed the rajasekarr/backpressure_bgp_zebra_zserv branch from 7e99688 to a8efa99 Compare March 7, 2024 23:16
@mjstapp mjstapp left a comment
Contributor

Thanks - looks good to me now

@mjstapp mjstapp merged commit d0afa12 into FRRouting:master Mar 11, 2024
9 checks passed
lguohan pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Aug 6, 2024
…rns (#19717)

Added the below patches, part of the BGP Zebra backpressure feature, to keep memory usage in check during route churns.

How I did it
New patches that were added:

Patch	FRR Pull request
0030-zebra-backpressure-Zebra-push-back-on-Buffer-Stream-.patch	FRRouting/frr#15411
0031-bgpd-backpressure-Add-a-typesafe-list-for-Zebra-Anno.patch	FRRouting/frr#15524
0032-bgpd-fix-flushing-ipv6-flowspec-entries-when-peering.patch	FRRouting/frr#15326
0033-bgpd-backpressure-cleanup-bgp_zebra_XX-func-args.patch	FRRouting/frr#15524
0034-gpd-backpressure-Handle-BGP-Zebra-Install-evt-Creat.patch	FRRouting/frr#15524
0035-bgpd-backpressure-Handle-BGP-Zebra-EPVN-Install-evt-.patch	FRRouting/frr#15624
0036-zebra-backpressure-Fix-Null-ptr-access-Coverity-Issu.patch	FRRouting/frr#15728
0037-bgpd-Increase-install-uninstall-speed-of-evpn-vpn-vn.patch	FRRouting/frr#15727
0038-zebra-Actually-display-I-O-buffer-sizes.patch	FRRouting/frr#15708
0039-zebra-Actually-display-I-O-buffer-sizes-part-2.patch	FRRouting/frr#15769
0040-bgpd-backpressure-Fix-to-withdraw-evpn-type-5-routes.patch	FRRouting/frr#16034
0041-bgpd-backpressure-Fix-to-avoid-CPU-hog.patch	FRRouting/frr#16035
0042-zebra-Use-built-in-data-structure-counter.patch	FRRouting/frr#16221
0043-zebra-Use-the-ctx-queue-counters.patch	FRRouting/frr#16220
0044-zebra-Modify-dplane-loop-to-allow-backpressure-to-fi.patch	FRRouting/frr#16220
0045-zebra-Limit-queue-depth-in-dplane_fpm_nl.patch	FRRouting/frr#16220
0046-zebra-Modify-show-zebra-dplane-providers-to-give-mor.patch	FRRouting/frr#16220
0047-bgpd-backpressure-fix-evpn-route-sync-to-zebra.patch	FRRouting/frr#16234
0048-bgpd-backpressure-fix-to-properly-remove-dest-for-bg.patch	FRRouting/frr#16368
0049-bgpd-backpressure-Improve-debuggability.patch	FRRouting/frr#16368
0050-bgpd-backpressure-Avoid-use-after-free.patch	FRRouting/frr#16437
0051-bgpd-backpressure-fix-ret-value-evpn_route_select_in.patch	FRRouting/frr#16416
0052-bgpd-backpressure-log-error-for-evpn-when-route-inst.patch	FRRouting/frr#16416
matiAlfaro pushed a commit to Marvell-switching/sonic-buildimage that referenced this pull request Aug 6, 2024
…rns (sonic-net#19717)

@raja-rajasekar raja-rajasekar deleted the rajasekarr/backpressure_bgp_zebra_zserv branch August 30, 2024 06:52