Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flexalgo mpls frrbot #1

Open
wants to merge 65 commits into
base: master
Choose a base branch
from
Open

Flexalgo mpls frrbot #1

wants to merge 65 commits into from

Conversation

louis-6wind
Copy link
Owner

No description provided.

slankdev and others added 17 commits September 12, 2022 11:07
The spftree has a new property called algorithm
which is id used to identify the algorithm that
separates it in the same IGP network. This is
used in Flex-Algo. In other cases than Flex-Algo,
the algorithm id is always zero.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
The information in prefix-sid has a new property
called algorithm id.  This is used to identify
the algorithm that separates it in the same IGP
network. This is used in Flex-Algo.In all other
cases, the algorithm id is basically 0.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
The information in prefix-sid has a new property
called algorithm id.  This is used to identify
the algorithm that separates it in the same IGP
network. This is used in Flex-Algo.In all other
cases, the algorithm id is basically 0.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Eric Kinzie <ekinzie@labn.net>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Before this commit, SR_ALGORITHM_COUNT was set to 2,
and each was hardcoded with router capability tlv.
When Flex-Algo is supported, SR-Algorithm may be
variably supported up to 256.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
isis_tlvs_add_extended_ip_reach adds IS-IS Extended
IP reachability to the LSP. In this case, if the
pcfg argument is not NULL, you can add IGP
Prefix-SID as its sub tlv.

Before this commit, only one Prefix-SID can be added.
After this commit, the argument is not a single
pointer but an array of pointers, and multiple
Prefix-SIDs can be added.

This feature is necessary because Flex-Algo
requires multiple Prefix-SIDs for each Algorithm.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Prefix-SID nexthops and backup nexthops are stored respectively in
isis_route_info->nexthops and isis_route_info->backup->nexthops.

With Flex-Algo, there are multiple Prefix-SIDs for a single prefix in
different algorithms. Each of these Prefix-SIDs performs SPF calculation
with a separate contract and sets a nexthops, so it is necessary to
store a different set nexthops for each Prefix-SID.

Add a nexthops and backup nethops list into the Prefix-SID
isis_sr_psid_info struct and use these lists instead of the  when needed

After this commit, the nexthops for each Prefix-SID is not
taken from route_info, but the nexthop set inside the
Prefix-SID is taken. This works for backup nexthops as well.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Before this commit, there was only one sr psid info
included in route_info.

In fact, in RFC8667, Algorithm ID, which is a property of
Prefix-SID, has 8 bits of information. That is, each Prefix
can hold up to 256 Prefix-SIDs. This commit implements it.
The previously implemented single Prefix-SID will be
continued as Algorithm 0.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Many of the enum definitions defined in isis_tlvs.h
are often extended at the end. The c/c++ allows
commas at the end of a list. This commit simplifies
the patching of later extensions.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
SR Algorithms are independent of specific IGPs
such as IS-IS and OSPF. This commit adds lib/sr to
aggregate IGP agnostic functions and constants.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Basically in frrouting source code principle,
the log string should not be a complicated abstraction
or streamlined for grep.

But for log format for the "TLV size does not match ..."
can be unified, which makes development easier.

> $ grep "TLV size does not match expected size for" isisd/isis_tlvs.c
>    "TLV size does not match expected size for Administrative Group!\n");
>    "TLV size does not match expected size for Local IPv6 address!\n");
>    ...(snip)...
>    "TLV size does not match expected size for Adjacency SID!\n");
>          "TLV size does not match expected size for Adjacency SID!\n");
>          "TLV size does not match expected size for Adjacency SID!\n");
>    "TLV size does not match expected size for LAN-Adjacency SID!\n");
>          "TLV size does not match expected size for LAN-Adjacency SID!\n");
>          "TLV size does not match expected size for LAN-Adjacency SID!\n");
>
> $ grep "TLV size does not match expected size for" isisd/isis_tlvs.c | wc -l
> 25

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
IS-IS Extensions for Segment Routing (RFC8667) defines a variable length
SR-Algorithm Sub-TLV (of the router capability TLV) that includes a list
of supported SR algorithms.  Each algorithm number is one octet.  Only
two algorithms were defined at the time 8667 was written: SPF (0) and
Strict SPF (1).

draft-ietf-lsr-flex-algo-18 reserves the range of algorithm numbers from
128 to 255 for Flex-Algo definitions.  As a result, the SR-Algorithm
Sub-TLV may now, in practice, hold more than two algorithm identifiers.

The internal "struct ls_node", defined in link_state.h, has storage
space for only two algorithm IDs.

Extend this array to 256 entries. Adjust ls_node comparison logic, etc.,
to accommodate the longer array.

Note that the Router Capability TLV allows a maximum of 250 octets for
sub-TLVs and that this is not sufficient to hold a list of all possible
algorithm IDs.  These changes do not account for that limitation.

Signed-off-by: Eric Kinzie <ekinzie@labn.net>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add the ability to configure a Segment-Routing prefix SID for a given
algorithm. For example:

> segment-routing prefix 10.10.10.10/32 algorithm 128 index 100

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add the ability to configure a Segment-Routing prefix SID for a given
algorithm. For example:

> segment-routing prefix 10.10.10.10/32 algorithm 128 index 100

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add a function to copy a bitfield_t structure.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add a library to deal with Flexible Algorithm that will be common to
IS-IS and OSPF. The functions enables to deal with:

- Affinity-maps
- Extended Admin Group (RFC7308)
- Flex-Algo structures that contains the flex-algo configurations

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Define the IS-IS flex-algo structure in yang, the CLI configuration
commands and the skeletons of frontend and backend functions that are
called by the CLI code.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Eric Kinzie <ekinzie@labn.net>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add a function to returns a human readable string of the metric types
that are defined in yang.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind and others added 4 commits September 12, 2022 18:44
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Adds basic functionality to Flex-Algo for IS-IS wrapping lib/flex_algo.
The configuration interface will be added in the next commit.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Eric Kinzie <ekinzie@labn.net>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add the backend functions for the flex-algo configuration.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Eric Kinzie <ekinzie@labn.net>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Add the frontend functions for the flex-algo configuration.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Eric Kinzie <ekinzie@labn.net>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Deal with the packing and unpacking of following Flex-Algo
(Sub-)-Sub-TLVs:

- Router Capability (already defined TLV 242)
	- List of the Flex-Algo Definitions (Sub-TLV 26)
		- Exclude admin group (Sub-Sub-TLV 1)
                - Include-any admin group (Sub-Sub-TLV 2)
                - Include-all admin group (Sub-Sub-TLV 3)
                - Flags (for prefix-metric) (Sub-Sub-TLV 4)
- Extended IS Reachability (already defined TLV 22)
	- Application-Specific Link Attributes (Sub-TLV 16)
	  (to enable the Flex-Algo flag on a link)
		- Admin-group (Sub-Sub-TLV 3)
		- Extended Admin-group (Sub-Sub-TLV 14)

Not that:

- Admin-group deals with affinities.
- List of SR Algorithm (Sub-TLV 19) within Router Capability (TLV 242)
  are already set in a previous commit.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Eric Kinzie <ekinzie@labn.net>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Take into account the flex-algo affinity constraints to compute the SPF
tree.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
Create a temporary "merge" route table that contains the routing
information from all algorithms and install the merge route table
into the FIB.

Signed-off-by: Hiroki Shirokura <hiroki.shirokura@linecorp.com>
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind pushed a commit that referenced this pull request Aug 21, 2024
The asan memory leak has been detected:
> Direct leak of 16 byte(s) in 1 object(s) allocated from:
>     #0 0x7f9066dadd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
>     #1 0x7f9066779b5d in qcalloc lib/memory.c:105
>     #2 0x556d6ca527c2 in vpn_leak_zebra_vrf_sid_update_per_af bgpd/bgp_mplsvpn.c:389
>     #3 0x556d6ca530e1 in vpn_leak_zebra_vrf_sid_update bgpd/bgp_mplsvpn.c:451
>     FRRouting#4 0x556d6ca64b3b in vpn_leak_postchange bgpd/bgp_mplsvpn.h:311
>     FRRouting#5 0x556d6ca64b3b in vpn_leak_postchange_all bgpd/bgp_mplsvpn.c:3751
>     FRRouting#6 0x556d6cb9f116 in bgp_zebra_process_srv6_locator_chunk bgpd/bgp_zebra.c:3337
>     FRRouting#7 0x7f906685a6b6 in zclient_read lib/zclient.c:4490
>     FRRouting#8 0x7f9066826a32 in event_call lib/event.c:2011
>     FRRouting#9 0x7f906675c444 in frr_run lib/libfrr.c:1217
>     FRRouting#10 0x556d6c980d52 in main bgpd/bgp_main.c:545
>     FRRouting#11 0x7f9065784c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)

Fix this by freeing the previous memory chunk.

Fixes: b72c9e1 ("bgpd: cli for SRv6 SID alloc to redirect to vrf (step4)")
Fixes: 527588a ("bgpd: add support for per-VRF SRv6 SID")

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit eea8a8a)
louis-6wind added a commit that referenced this pull request Aug 21, 2024
> ==2334217==ERROR: AddressSanitizer: heap-use-after-free on address 0x61000001d0a0 at pc 0x563828c8de6f bp 0x7fffbdaee560 sp 0x7fffbdaee558
> READ of size 1 at 0x61000001d0a0 thread T0
>     #0 0x563828c8de6e in prefix_sid_cmp isisd/isis_spf.c:187
>     #1 0x7f84b8204f71 in hash_get lib/hash.c:142
>     #2 0x7f84b82055ec in hash_lookup lib/hash.c:184
>     #3 0x563828c8e185 in isis_spf_prefix_sid_lookup isisd/isis_spf.c:209
>     FRRouting#4 0x563828c90642 in isis_spf_add2tent isisd/isis_spf.c:598
>     FRRouting#5 0x563828c91cd0 in process_N isisd/isis_spf.c:824
>     FRRouting#6 0x563828c93852 in isis_spf_process_lsp isisd/isis_spf.c:1041
>     FRRouting#7 0x563828c98dde in isis_spf_loop isisd/isis_spf.c:1821
>     FRRouting#8 0x563828c998de in isis_run_spf isisd/isis_spf.c:1983
>     FRRouting#9 0x563828c99c7b in isis_run_spf_with_protection isisd/isis_spf.c:2009
>     FRRouting#10 0x563828c9a60d in isis_run_spf_cb isisd/isis_spf.c:2090
>     FRRouting#11 0x7f84b835c72d in event_call lib/event.c:2011
>     FRRouting#12 0x7f84b8236d93 in frr_run lib/libfrr.c:1217
>     FRRouting#13 0x563828c21918 in main isisd/isis_main.c:346
>     FRRouting#14 0x7f84b7e4fd09 in __libc_start_main ../csu/libc-start.c:308
>     FRRouting#15 0x563828c20df9 in _start (/usr/lib/frr/isisd+0xf5df9)
>
> 0x61000001d0a0 is located 96 bytes inside of 184-byte region [0x61000001d040,0x61000001d0f8)
> freed by thread T0 here:
>     #0 0x7f84b88a9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>     #1 0x7f84b8263bae in qfree lib/memory.c:130
>     #2 0x563828c8e433 in isis_vertex_del isisd/isis_spf.c:249
>     #3 0x563828c91c95 in process_N isisd/isis_spf.c:811
>     FRRouting#4 0x563828c93852 in isis_spf_process_lsp isisd/isis_spf.c:1041
>     FRRouting#5 0x563828c98dde in isis_spf_loop isisd/isis_spf.c:1821
>     FRRouting#6 0x563828c998de in isis_run_spf isisd/isis_spf.c:1983
>     FRRouting#7 0x563828c99c7b in isis_run_spf_with_protection isisd/isis_spf.c:2009
>     FRRouting#8 0x563828c9a60d in isis_run_spf_cb isisd/isis_spf.c:2090
>     FRRouting#9 0x7f84b835c72d in event_call lib/event.c:2011
>     FRRouting#10 0x7f84b8236d93 in frr_run lib/libfrr.c:1217
>     FRRouting#11 0x563828c21918 in main isisd/isis_main.c:346
>     FRRouting#12 0x7f84b7e4fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
>     #0 0x7f84b88aa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7f84b8263a6c in qcalloc lib/memory.c:105
>     #2 0x563828c8e262 in isis_vertex_new isisd/isis_spf.c:225
>     #3 0x563828c904db in isis_spf_add2tent isisd/isis_spf.c:588
>     FRRouting#4 0x563828c91cd0 in process_N isisd/isis_spf.c:824
>     FRRouting#5 0x563828c93852 in isis_spf_process_lsp isisd/isis_spf.c:1041
>     FRRouting#6 0x563828c98dde in isis_spf_loop isisd/isis_spf.c:1821
>     FRRouting#7 0x563828c998de in isis_run_spf isisd/isis_spf.c:1983
>     FRRouting#8 0x563828c99c7b in isis_run_spf_with_protection isisd/isis_spf.c:2009
>     FRRouting#9 0x563828c9a60d in isis_run_spf_cb isisd/isis_spf.c:2090
>     FRRouting#10 0x7f84b835c72d in event_call lib/event.c:2011
>     FRRouting#11 0x7f84b8236d93 in frr_run lib/libfrr.c:1217
>     FRRouting#12 0x563828c21918 in main isisd/isis_main.c:346
>     FRRouting#13 0x7f84b7e4fd09 in __libc_start_main ../csu/libc-start.c:308
>
> SUMMARY: AddressSanitizer: heap-use-after-free isisd/isis_spf.c:187 in prefix_sid_cmp
> Shadow bytes around the buggy address:
>   0x0c207fffb9c0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
>   0x0c207fffb9d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
>   0x0c207fffb9e0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
>   0x0c207fffb9f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
>   0x0c207fffba00: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
> =>0x0c207fffba10: fd fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd fa
>   0x0c207fffba20: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
>   0x0c207fffba30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
>   0x0c207fffba40: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
>   0x0c207fffba50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
>   0x0c207fffba60: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
> Shadow byte legend (one shadow byte represents 8 application bytes):
>   Addressable:           00
>   Partially addressable: 01 02 03 04 05 06 07
>   Heap left redzone:       fa
>   Freed heap region:       fd
>   Stack left redzone:      f1
>   Stack mid redzone:       f2
>   Stack right redzone:     f3
>   Stack after return:      f5
>   Stack use after scope:   f8
>   Global redzone:          f9
>   Global init order:       f6
>   Poisoned by user:        f7
>   Container overflow:      fc
>   Array cookie:            ac
>   Intra object redzone:    bb
>   ASan internal:           fe
>   Left alloca redzone:     ca
>   Right alloca redzone:    cb
>   Shadow gap:              cc
> ==2334217==ABORTING

Fixes: 2f7cc7b ("isisd: detect Prefix-SID collisions and handle them appropriately")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit e697de5)
louis-6wind pushed a commit that referenced this pull request Aug 21, 2024
- Addressed memory leak by removing `&c->peer_notifier` from the notifier list on termination. Retaining it caused the notifier list to stay active, preventing the deletion of `c->cur.peer`
  thereby causing a memory leak.

- Reordered termination steps to call `vrf_terminate` before `nhrp_vc_terminate`, preventing a heap-use-after-free issue when `nhrp_vc_notify_del` is invoked in `nhrp_peer_check_delete`.

- Added an if statement to avoid passing NULL as hash to `hash_release`, which leads to a SIGSEGV.

The ASan leak log for reference:

```
***********************************************************************************
Address Sanitizer Error detected in nhrp_topo.test_nhrp_topo/r1.asan.nhrpd.20265

=================================================================
==20265==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x7f80270c9b40 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb40)
    #1 0x7f8026ac1eb8 in qmalloc lib/memory.c:100
    #2 0x560fd648f0a6 in nhrp_peer_create nhrpd/nhrp_peer.c:175
    #3 0x7f8026a88d3f in hash_get lib/hash.c:147
    FRRouting#4 0x560fd6490a5d in nhrp_peer_get nhrpd/nhrp_peer.c:228
    FRRouting#5 0x560fd648a51a in nhrp_nhs_resolve_cb nhrpd/nhrp_nhs.c:297
    FRRouting#6 0x7f80266b000f in resolver_cb_literal lib/resolver.c:234
    FRRouting#7 0x7f8026b62e0e in event_call lib/event.c:1969
    FRRouting#8 0x7f8026aa5437 in frr_run lib/libfrr.c:1213
    FRRouting#9 0x560fd6488b4f in main nhrpd/nhrp_main.c:166
    FRRouting#10 0x7f8025eb2c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)

SUMMARY: AddressSanitizer: 112 byte(s) leaked in 1 allocation(s).
***********************************************************************************

***********************************************************************************
Address Sanitizer Error detected in nhrp_topo.test_nhrp_topo/r2.asan.nhrpd.20400

=================================================================
==20400==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x7fb6e3ca5b40 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb40)
    #1 0x7fb6e369deb8 in qmalloc lib/memory.c:100
    #2 0x562652de40a6 in nhrp_peer_create nhrpd/nhrp_peer.c:175
    #3 0x7fb6e3664d3f in hash_get lib/hash.c:147
    FRRouting#4 0x562652de5a5d in nhrp_peer_get nhrpd/nhrp_peer.c:228
    FRRouting#5 0x562652de1e8e in nhrp_packet_recvraw nhrpd/nhrp_packet.c:325
    FRRouting#6 0x7fb6e373ee0e in event_call lib/event.c:1969
    FRRouting#7 0x7fb6e3681437 in frr_run lib/libfrr.c:1213
    FRRouting#8 0x562652dddb4f in main nhrpd/nhrp_main.c:166
    FRRouting#9 0x7fb6e2a8ec86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)

SUMMARY: AddressSanitizer: 112 byte(s) leaked in 1 allocation(s).
***********************************************************************************
```

Signed-off-by: Keelan Cannoo <keelan.cannoo@icloud.com>
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
louis-6wind pushed a commit that referenced this pull request Aug 21, 2024
… the fragmented LSP

1. When the root IS regenerates an LSP, it calls lsp_build() -> lsp_clear_data() to free the TLV memory of the first fragment and all other fragments. If the number of fragments in the regenerated LSP decreases or if no fragmentation is needed, the extra LSP fragments are not immediately deleted. Instead, lsp_seqno_update() -> lsp_purge() is called to set the remaining time to zero and start aging, while also notifying other IS nodes to age these fragments. lsp_purge() usually does not reset lsp->hdr.seqno to zero because the LSP might recover during the aging process.
2. When other IS nodes receive an LSP, they always call process_lsp() -> isis_unpack_tlvs() to allocate TLV memory for the LSP. This does not differentiate whether the received LSP has a remaining lifetime of zero. Therefore, it is rare for an LSP of a non-root IS to have empty TLVs. Of course, if an LSP with a remaining time of zero and already corrupted is received, lsp_update() -> lsp_purge() will be called to free the TLV memory of the LSP, but this scenario is rare.
3. In LFA calculations, neighbors of the root IS are traversed, and each neighbor is taken as a new root to compute the neighbor SPT. During this process, the old root IS will serve as a neighbor of the new root IS, triggering a call to isis_spf_process_lsp() to parse the LSP of the old root IS and obtain its IP vertices and neighboring IS vertices. However, isis_spf_process_lsp() only checks whether the TLVs in the first fragment of the LSP exist, and does not check the TLVs in the fragmented LSP. If the TLV memory of the fragmented LSP of the old root IS has been freed, it can lead to a null pointer access, causing the current crash.

Additionally, for the base SPT, there are only two places where the LSP of the root IS is parsed:
1. When obtaining the UP neighbors of the root IS via spf_adj_list_parse_lsp().
2. When preloading the IP vertices of the root IS via isis_lsp_iterate_ip_reach().
Both of these checks ensure that frag->tlvs is not null, and they do not subsequently call isis_spf_process_lsp() to parse the root IS's LSP. It is very rare for non-root IS LSPs to have empty TLVs unless they are corrupted LSPs awaiting deletion. If it happens, a crash will occur.

The backtrace is as follows:
(gdb) bt
#0  0x00007f3097281fe1 in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f30973a2972 in core_handler (signo=11, siginfo=0x7ffce66c2870, context=0x7ffce66c2740) at ../lib/sigevent.c:261
#2  <signal handler called>
#3  0x000055dfa805512b in isis_spf_process_lsp (spftree=0x55dfa950eee0, lsp=0x55dfa94cb590, cost=10, depth=1, root_sysid=0x55dfa950ef6c "", parent=0x55dfa952fca0)
    at ../isisd/isis_spf.c:898
FRRouting#4  0x000055dfa805743b in isis_spf_loop (spftree=0x55dfa950eee0, root_sysid=0x55dfa950ef6c "") at ../isisd/isis_spf.c:1688
FRRouting#5  0x000055dfa805784f in isis_run_spf (spftree=0x55dfa950eee0) at ../isisd/isis_spf.c:1808
FRRouting#6  0x000055dfa8037ff5 in isis_spf_run_neighbors (spftree=0x55dfa9474440) at ../isisd/isis_lfa.c:1259
FRRouting#7  0x000055dfa803ac17 in isis_spf_run_lfa (area=0x55dfa9477510, spftree=0x55dfa9474440) at ../isisd/isis_lfa.c:2300
FRRouting#8  0x000055dfa8057964 in isis_run_spf_with_protection (area=0x55dfa9477510, spftree=0x55dfa9474440) at ../isisd/isis_spf.c:1827
FRRouting#9  0x000055dfa8057c15 in isis_run_spf_cb (thread=0x7ffce66c38e0) at ../isisd/isis_spf.c:1889
FRRouting#10 0x00007f30973bbf04 in thread_call (thread=0x7ffce66c38e0) at ../lib/thread.c:1990
FRRouting#11 0x00007f309735497b in frr_run (master=0x55dfa91733c0) at ../lib/libfrr.c:1198
FRRouting#12 0x000055dfa8029d5d in main (argc=5, argv=0x7ffce66c3b08, envp=0x7ffce66c3b38) at ../isisd/isis_main.c:273
(gdb) f 3
#3  0x000055dfa805512b in isis_spf_process_lsp (spftree=0x55dfa950eee0, lsp=0x55dfa94cb590, cost=10, depth=1, root_sysid=0x55dfa950ef6c "", parent=0x55dfa952fca0)
    at ../isisd/isis_spf.c:898
898     ../isisd/isis_spf.c: No such file or directory.
(gdb) p te_neighs
$1 = (struct isis_item_list *) 0x120
(gdb) p lsp->tlvs
$2 = (struct isis_tlvs *) 0x0
(gdb) p lsp->hdr
$3 = {pdu_len = 27, rem_lifetime = 0, lsp_id = "\000\000\000\000\000\001\000\001", seqno = 4, checksum = 59918, lsp_bits = 1 '\001'}

The backtrace provided above pertains to version 8.5.4, but it seems that the same issue exists in the code of the master branch as well.

I have reviewed the process for calculating the SPT based on the LSP, and isis_spf_process_lsp() is the only function that does not check whether the TLVs in the fragments are empty. Therefore, I believe that modifying this function alone should be sufficient. If the TLVs of the current fragment are already empty, we do not need to continue processing subsequent fragments. This is consistent with the behavior where we do not process fragments if the TLVs of the first fragment are empty.
Of course, one could argue that lsp_purge() should still retain the TLV memory, freeing it and then reallocating it if needed. However, this is a debatable point because in some scenarios, it is permissible for the LSP to have empty TLVs. For example, after receiving an SNP (Sequence Number PDU) message, an empty LSP (with lsp->hdr.seqno = 0) might be created by calling lsp_new. If the corresponding LSP message is discarded due to domain or area authentication failure, the TLV memory wouldn't be allocated.

Test scenario:
In an LFA network, importing a sufficient number of static routes to cause LSP fragmentation, and then rolling back the imported static routes so that the LSP is no longer fragmented, can easily result in this issue.

Signed-off-by: zhou-run <zhou.run@h3c.com>
(cherry picked from commit e905177)
louis-6wind added a commit that referenced this pull request Aug 21, 2024
Fix the following crash when pim options are (un)configured on an
non-existent interface.

> r1(config)# int fgljdsf
> r1(config-if)# no ip pim unicast-bsm
> vtysh: error reading from pimd: Connection reset by peer (104)Warning: closing connection to pimd because of an I/O error!

> #0  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007f70c8f32249 in core_handler (signo=11, siginfo=0x7fffff88e4f0, context=0x7fffff88e3c0) at lib/sigevent.c:258
> #2  <signal handler called>
> #3  0x0000556cfdd9b16d in lib_interface_pim_address_family_unicast_bsm_modify (args=0x7fffff88f130) at pimd/pim_nb_config.c:1910
> FRRouting#4  0x00007f70c8efdcb5 in nb_callback_modify (context=0x556d00032b60, nb_node=0x556cffeeb9b0, event=NB_EV_APPLY, dnode=0x556d00031670, resource=0x556d00032b48, errmsg=0x7fffff88f710 "", errmsg_len=8192)
>     at lib/northbound.c:1538
> FRRouting#5  0x00007f70c8efe949 in nb_callback_configuration (context=0x556d00032b60, event=NB_EV_APPLY, change=0x556d00032b10, errmsg=0x7fffff88f710 "", errmsg_len=8192) at lib/northbound.c:1888
> FRRouting#6  0x00007f70c8efee82 in nb_transaction_process (event=NB_EV_APPLY, transaction=0x556d00032b60, errmsg=0x7fffff88f710 "", errmsg_len=8192) at lib/northbound.c:2016
> FRRouting#7  0x00007f70c8efd658 in nb_candidate_commit_apply (transaction=0x556d00032b60, save_transaction=true, transaction_id=0x0, errmsg=0x7fffff88f710 "", errmsg_len=8192) at lib/northbound.c:1356
> FRRouting#8  0x00007f70c8efd78e in nb_candidate_commit (context=..., candidate=0x556cffeb0e80, save_transaction=true, comment=0x0, transaction_id=0x0, errmsg=0x7fffff88f710 "", errmsg_len=8192) at lib/northbound.c:1389
> FRRouting#9  0x00007f70c8f03e58 in nb_cli_classic_commit (vty=0x556d00025a80) at lib/northbound_cli.c:51
> FRRouting#10 0x00007f70c8f043f8 in nb_cli_apply_changes_internal (vty=0x556d00025a80,
>     xpath_base=0x7fffff893bb0 "/frr-interface:lib/interface[name='fgljdsf']/frr-pim:pim/address-family[address-family='frr-routing:ipv4']", clear_pending=false) at lib/northbound_cli.c:178
> FRRouting#11 0x00007f70c8f0475d in nb_cli_apply_changes (vty=0x556d00025a80, xpath_base_fmt=0x556cfdde9fe0 "./frr-pim:pim/address-family[address-family='%s']") at lib/northbound_cli.c:234
> FRRouting#12 0x0000556cfdd8298f in pim_process_no_unicast_bsm_cmd (vty=0x556d00025a80) at pimd/pim_cmd_common.c:3493
> FRRouting#13 0x0000556cfddcf782 in no_ip_pim_ucast_bsm (self=0x556cfde40b20 <no_ip_pim_ucast_bsm_cmd>, vty=0x556d00025a80, argc=4, argv=0x556d00031500) at pimd/pim_cmd.c:4950
> FRRouting#14 0x00007f70c8e942f0 in cmd_execute_command_real (vline=0x556d00032070, vty=0x556d00025a80, cmd=0x0, up_level=0) at lib/command.c:1002
> FRRouting#15 0x00007f70c8e94451 in cmd_execute_command (vline=0x556d00032070, vty=0x556d00025a80, cmd=0x0, vtysh=0) at lib/command.c:1061
> FRRouting#16 0x00007f70c8e9499f in cmd_execute (vty=0x556d00025a80, cmd=0x556d00030320 "no ip pim unicast-bsm", matched=0x0, vtysh=0) at lib/command.c:1227
> FRRouting#17 0x00007f70c8f51e44 in vty_command (vty=0x556d00025a80, buf=0x556d00030320 "no ip pim unicast-bsm") at lib/vty.c:616
> FRRouting#18 0x00007f70c8f53bdd in vty_execute (vty=0x556d00025a80) at lib/vty.c:1379
> FRRouting#19 0x00007f70c8f55d59 in vtysh_read (thread=0x7fffff896600) at lib/vty.c:2374
> FRRouting#20 0x00007f70c8f4b209 in event_call (thread=0x7fffff896600) at lib/event.c:2011
> FRRouting#21 0x00007f70c8ed109e in frr_run (master=0x556cffdb4ea0) at lib/libfrr.c:1217
> FRRouting#22 0x0000556cfdddec12 in main (argc=2, argv=0x7fffff896828, envp=0x7fffff896840) at pimd/pim_main.c:165
> (gdb) f 3
> #3  0x0000556cfdd9b16d in lib_interface_pim_address_family_unicast_bsm_modify (args=0x7fffff88f130) at pimd/pim_nb_config.c:1910
> 1910			pim_ifp->ucast_bsm_accept =
> (gdb) list
> 1905		case NB_EV_ABORT:
> 1906			break;
> 1907		case NB_EV_APPLY:
> 1908			ifp = nb_running_get_entry(args->dnode, NULL, true);
> 1909			pim_ifp = ifp->info;
> 1910			pim_ifp->ucast_bsm_accept =
> 1911				yang_dnode_get_bool(args->dnode, NULL);
> 1912
> 1913			break;
> 1914		}
> (gdb) p pim_ifp
> $1 = (struct pim_interface *) 0x0

Fixes: 3bb513c ("lib: adapt to version 2 of libyang")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 6952bea)
louis-6wind added a commit that referenced this pull request Aug 21, 2024
When 'no rpki' is requested and the rtrlib RPKI object was freed, bgpd
is crashing.

RPKI is configured in VRF red.

> ip l set red down
> ip l del red
> printf 'conf\n vrf red\n no rpki' | vtysh

> Core was generated by `/usr/bin/bgpd -A 127.0.0.1 -M snmp -M rpki -M bmp'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=140411103615424) at ./nptl/pthread_kill.c:44
> 44	./nptl/pthread_kill.c: No such file or directory.
> [Current thread is 1 (Thread 0x7fb401f419c0 (LWP 190226))]
> (gdb) bt
> #0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=140411103615424) at ./nptl/pthread_kill.c:44
> #1  __pthread_kill_internal (signo=11, threadid=140411103615424) at ./nptl/pthread_kill.c:78
> #2  __GI___pthread_kill (threadid=140411103615424, signo=signo@entry=11) at ./nptl/pthread_kill.c:89
> #3  0x00007fb4021ad476 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
> FRRouting#4  0x00007fb4025ce22b in core_handler (signo=11, siginfo=0x7fff831b2d70, context=0x7fff831b2c40) at lib/sigevent.c:248
> FRRouting#5  <signal handler called>
> FRRouting#6  rtr_mgr_remove_group (config=0x55fe8789f750, preference=11) at /build/make-pkg/output/source/DIST_RTRLIB/rtrlib/rtrlib/rtr_mgr.c:607
> FRRouting#7  0x00007fb40145f518 in rpki_delete_all_cache_nodes (rpki_vrf=0x55fe8789f4f0) at bgpd/bgp_rpki.c:442
> FRRouting#8  0x00007fb401463098 in no_rpki_magic (self=0x7fb40146bba0 <no_rpki_cmd>, vty=0x55fe877f5130, argc=2, argv=0x55fe877fccd0) at bgpd/bgp_rpki.c:1732
> FRRouting#9  0x00007fb40145c09a in no_rpki (self=0x7fb40146bba0 <no_rpki_cmd>, vty=0x55fe877f5130, argc=2, argv=0x55fe877fccd0) at ./bgpd/bgp_rpki_clippy.c:37
> FRRouting#10 0x00007fb402527abc in cmd_execute_command_real (vline=0x55fe877fd150, vty=0x55fe877f5130, cmd=0x0, up_level=0) at lib/command.c:984
> FRRouting#11 0x00007fb402527c35 in cmd_execute_command (vline=0x55fe877fd150, vty=0x55fe877f5130, cmd=0x0, vtysh=0) at lib/command.c:1043
> FRRouting#12 0x00007fb4025281e5 in cmd_execute (vty=0x55fe877f5130, cmd=0x55fe877fb8c0 "no rpki\n", matched=0x0, vtysh=0) at lib/command.c:1209
> FRRouting#13 0x00007fb4025f0aed in vty_command (vty=0x55fe877f5130, buf=0x55fe877fb8c0 "no rpki\n") at lib/vty.c:615
> FRRouting#14 0x00007fb4025f2a11 in vty_execute (vty=0x55fe877f5130) at lib/vty.c:1378
> FRRouting#15 0x00007fb4025f513d in vtysh_read (thread=0x7fff831b5fa0) at lib/vty.c:2373
> FRRouting#16 0x00007fb4025e9611 in event_call (thread=0x7fff831b5fa0) at lib/event.c:2011
> FRRouting#17 0x00007fb402566976 in frr_run (master=0x55fe871a14a0) at lib/libfrr.c:1212
> FRRouting#18 0x000055fe857829fa in main (argc=9, argv=0x7fff831b6218) at bgpd/bgp_main.c:549

Fixes: 8156765 ("bgpd: Add `no rpki` command")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 4e053d6)

There is also an issue when doing "rpki reset" and then "no rpki".

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind added a commit that referenced this pull request Aug 21, 2024
Fix a heap-after-free that causes zebra to crash even without
address-sanitizer. To reproduce:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l del red
> ip route del blackhole default table 100

> ==2866266==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000154f54 at pc 0x7fa32474b83f bp 0x7ffe94f67d90 sp 0x7ffe94f67d88
> READ of size 1 at 0x606000154f54 thread T0
>     #0 0x7fa32474b83e in rn_hash_node_const_find lib/table.c:28
>     #1 0x7fa32474bab1 in rn_hash_node_find lib/table.c:28
>     #2 0x7fa32474d783 in route_node_get lib/table.c:283
>     #3 0x7fa3247328dd in srcdest_rnode_get lib/srcdest_table.c:231
>     FRRouting#4 0x55b0e4fa8da4 in rib_find_rn_from_ctx zebra/zebra_rib.c:1957
>     FRRouting#5 0x55b0e4fa8e31 in rib_process_result zebra/zebra_rib.c:1988
>     FRRouting#6 0x55b0e4fb9d64 in rib_process_dplane_results zebra/zebra_rib.c:4894
>     FRRouting#7 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#8 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#9 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#10 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>     FRRouting#11 0x55b0e4e2d649 in _start (/usr/lib/frr/zebra+0x1a1649)
>
> 0x606000154f54 is located 20 bytes inside of 56-byte region [0x606000154f40,0x606000154f78)
> freed by thread T0 here:
>     #0 0x7fa324ca9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>     #1 0x7fa324668d8f in qfree lib/memory.c:130
>     #2 0x7fa32474c421 in route_table_free lib/table.c:126
>     #3 0x7fa32474bf96 in route_table_finish lib/table.c:46
>     FRRouting#4 0x55b0e4fbca3a in zebra_router_free_table zebra/zebra_router.c:191
>     FRRouting#5 0x55b0e4fbccea in zebra_router_release_table zebra/zebra_router.c:214
>     FRRouting#6 0x55b0e4fd428e in zebra_vrf_disable zebra/zebra_vrf.c:219
>     FRRouting#7 0x7fa32476fabf in vrf_disable lib/vrf.c:326
>     FRRouting#8 0x7fa32476f5d4 in vrf_delete lib/vrf.c:231
>     FRRouting#9 0x55b0e4e4ad36 in interface_vrf_change zebra/interface.c:1478
>     FRRouting#10 0x55b0e4e4d5d2 in zebra_if_dplane_ifp_handling zebra/interface.c:1949
>     FRRouting#11 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#12 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#13 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#14 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#15 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#16 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
>     #0 0x7fa324caa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fa324668c4d in qcalloc lib/memory.c:105
>     #2 0x7fa32474bf33 in route_table_init_with_delegate lib/table.c:38
>     #3 0x7fa32474e73c in route_table_init lib/table.c:512
>     FRRouting#4 0x55b0e4fbc353 in zebra_router_get_table zebra/zebra_router.c:137
>     FRRouting#5 0x55b0e4fd4da0 in zebra_vrf_table_create zebra/zebra_vrf.c:358
>     FRRouting#6 0x55b0e4fd3d30 in zebra_vrf_enable zebra/zebra_vrf.c:140
>     FRRouting#7 0x7fa32476f9b2 in vrf_enable lib/vrf.c:286
>     FRRouting#8 0x55b0e4e4af76 in interface_vrf_change zebra/interface.c:1533
>     FRRouting#9 0x55b0e4e4d612 in zebra_if_dplane_ifp_handling zebra/interface.c:1968
>     FRRouting#10 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#11 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#12 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#13 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#14 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#15 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind added a commit that referenced this pull request Aug 22, 2024
Fix a heap-after-free that causes zebra to crash even without
address-sanitizer. To reproduce:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l del red
> ip route del blackhole default table 100

> ==2866266==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000154f54 at pc 0x7fa32474b83f bp 0x7ffe94f67d90 sp 0x7ffe94f67d88
> READ of size 1 at 0x606000154f54 thread T0
>     #0 0x7fa32474b83e in rn_hash_node_const_find lib/table.c:28
>     #1 0x7fa32474bab1 in rn_hash_node_find lib/table.c:28
>     #2 0x7fa32474d783 in route_node_get lib/table.c:283
>     #3 0x7fa3247328dd in srcdest_rnode_get lib/srcdest_table.c:231
>     FRRouting#4 0x55b0e4fa8da4 in rib_find_rn_from_ctx zebra/zebra_rib.c:1957
>     FRRouting#5 0x55b0e4fa8e31 in rib_process_result zebra/zebra_rib.c:1988
>     FRRouting#6 0x55b0e4fb9d64 in rib_process_dplane_results zebra/zebra_rib.c:4894
>     FRRouting#7 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#8 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#9 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#10 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>     FRRouting#11 0x55b0e4e2d649 in _start (/usr/lib/frr/zebra+0x1a1649)
>
> 0x606000154f54 is located 20 bytes inside of 56-byte region [0x606000154f40,0x606000154f78)
> freed by thread T0 here:
>     #0 0x7fa324ca9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>     #1 0x7fa324668d8f in qfree lib/memory.c:130
>     #2 0x7fa32474c421 in route_table_free lib/table.c:126
>     #3 0x7fa32474bf96 in route_table_finish lib/table.c:46
>     FRRouting#4 0x55b0e4fbca3a in zebra_router_free_table zebra/zebra_router.c:191
>     FRRouting#5 0x55b0e4fbccea in zebra_router_release_table zebra/zebra_router.c:214
>     FRRouting#6 0x55b0e4fd428e in zebra_vrf_disable zebra/zebra_vrf.c:219
>     FRRouting#7 0x7fa32476fabf in vrf_disable lib/vrf.c:326
>     FRRouting#8 0x7fa32476f5d4 in vrf_delete lib/vrf.c:231
>     FRRouting#9 0x55b0e4e4ad36 in interface_vrf_change zebra/interface.c:1478
>     FRRouting#10 0x55b0e4e4d5d2 in zebra_if_dplane_ifp_handling zebra/interface.c:1949
>     FRRouting#11 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#12 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#13 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#14 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#15 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#16 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
>     #0 0x7fa324caa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fa324668c4d in qcalloc lib/memory.c:105
>     #2 0x7fa32474bf33 in route_table_init_with_delegate lib/table.c:38
>     #3 0x7fa32474e73c in route_table_init lib/table.c:512
>     FRRouting#4 0x55b0e4fbc353 in zebra_router_get_table zebra/zebra_router.c:137
>     FRRouting#5 0x55b0e4fd4da0 in zebra_vrf_table_create zebra/zebra_vrf.c:358
>     FRRouting#6 0x55b0e4fd3d30 in zebra_vrf_enable zebra/zebra_vrf.c:140
>     FRRouting#7 0x7fa32476f9b2 in vrf_enable lib/vrf.c:286
>     FRRouting#8 0x55b0e4e4af76 in interface_vrf_change zebra/interface.c:1533
>     FRRouting#9 0x55b0e4e4d612 in zebra_if_dplane_ifp_handling zebra/interface.c:1968
>     FRRouting#10 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#11 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#12 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#13 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#14 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#15 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308

Linux RT table can be created before it is assigned and remains after it
is unassigned.

Do not release the table at VRF disabling but change the owner to the
default VRF. At VRF enabling, change the table owner to the VRF.

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind added a commit that referenced this pull request Aug 22, 2024
Fix a heap-after-free that causes zebra to crash even without
address-sanitizer. To reproduce:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l del red
> ip route del blackhole default table 100

> ==2866266==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000154f54 at pc 0x7fa32474b83f bp 0x7ffe94f67d90 sp 0x7ffe94f67d88
> READ of size 1 at 0x606000154f54 thread T0
>     #0 0x7fa32474b83e in rn_hash_node_const_find lib/table.c:28
>     #1 0x7fa32474bab1 in rn_hash_node_find lib/table.c:28
>     #2 0x7fa32474d783 in route_node_get lib/table.c:283
>     #3 0x7fa3247328dd in srcdest_rnode_get lib/srcdest_table.c:231
>     FRRouting#4 0x55b0e4fa8da4 in rib_find_rn_from_ctx zebra/zebra_rib.c:1957
>     FRRouting#5 0x55b0e4fa8e31 in rib_process_result zebra/zebra_rib.c:1988
>     FRRouting#6 0x55b0e4fb9d64 in rib_process_dplane_results zebra/zebra_rib.c:4894
>     FRRouting#7 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#8 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#9 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#10 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>     FRRouting#11 0x55b0e4e2d649 in _start (/usr/lib/frr/zebra+0x1a1649)
>
> 0x606000154f54 is located 20 bytes inside of 56-byte region [0x606000154f40,0x606000154f78)
> freed by thread T0 here:
>     #0 0x7fa324ca9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>     #1 0x7fa324668d8f in qfree lib/memory.c:130
>     #2 0x7fa32474c421 in route_table_free lib/table.c:126
>     #3 0x7fa32474bf96 in route_table_finish lib/table.c:46
>     FRRouting#4 0x55b0e4fbca3a in zebra_router_free_table zebra/zebra_router.c:191
>     FRRouting#5 0x55b0e4fbccea in zebra_router_release_table zebra/zebra_router.c:214
>     FRRouting#6 0x55b0e4fd428e in zebra_vrf_disable zebra/zebra_vrf.c:219
>     FRRouting#7 0x7fa32476fabf in vrf_disable lib/vrf.c:326
>     FRRouting#8 0x7fa32476f5d4 in vrf_delete lib/vrf.c:231
>     FRRouting#9 0x55b0e4e4ad36 in interface_vrf_change zebra/interface.c:1478
>     FRRouting#10 0x55b0e4e4d5d2 in zebra_if_dplane_ifp_handling zebra/interface.c:1949
>     FRRouting#11 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#12 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#13 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#14 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#15 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#16 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
>     #0 0x7fa324caa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fa324668c4d in qcalloc lib/memory.c:105
>     #2 0x7fa32474bf33 in route_table_init_with_delegate lib/table.c:38
>     #3 0x7fa32474e73c in route_table_init lib/table.c:512
>     FRRouting#4 0x55b0e4fbc353 in zebra_router_get_table zebra/zebra_router.c:137
>     FRRouting#5 0x55b0e4fd4da0 in zebra_vrf_table_create zebra/zebra_vrf.c:358
>     FRRouting#6 0x55b0e4fd3d30 in zebra_vrf_enable zebra/zebra_vrf.c:140
>     FRRouting#7 0x7fa32476f9b2 in vrf_enable lib/vrf.c:286
>     FRRouting#8 0x55b0e4e4af76 in interface_vrf_change zebra/interface.c:1533
>     FRRouting#9 0x55b0e4e4d612 in zebra_if_dplane_ifp_handling zebra/interface.c:1968
>     FRRouting#10 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#11 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#12 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#13 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#14 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#15 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308

Linux RT table can be created before it is assigned and remains after it
is unassigned.

Do not release the table at VRF disabling but change the owner to the
default VRF. At VRF enabling, change the table owner to the VRF.

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind added a commit that referenced this pull request Aug 22, 2024
Fix a heap-after-free that causes zebra to crash even without
address-sanitizer. To reproduce:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l del red
> ip route del blackhole default table 100

> ==2866266==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000154f54 at pc 0x7fa32474b83f bp 0x7ffe94f67d90 sp 0x7ffe94f67d88
> READ of size 1 at 0x606000154f54 thread T0
>     #0 0x7fa32474b83e in rn_hash_node_const_find lib/table.c:28
>     #1 0x7fa32474bab1 in rn_hash_node_find lib/table.c:28
>     #2 0x7fa32474d783 in route_node_get lib/table.c:283
>     #3 0x7fa3247328dd in srcdest_rnode_get lib/srcdest_table.c:231
>     FRRouting#4 0x55b0e4fa8da4 in rib_find_rn_from_ctx zebra/zebra_rib.c:1957
>     FRRouting#5 0x55b0e4fa8e31 in rib_process_result zebra/zebra_rib.c:1988
>     FRRouting#6 0x55b0e4fb9d64 in rib_process_dplane_results zebra/zebra_rib.c:4894
>     FRRouting#7 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#8 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#9 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#10 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>     FRRouting#11 0x55b0e4e2d649 in _start (/usr/lib/frr/zebra+0x1a1649)
>
> 0x606000154f54 is located 20 bytes inside of 56-byte region [0x606000154f40,0x606000154f78)
> freed by thread T0 here:
>     #0 0x7fa324ca9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>     #1 0x7fa324668d8f in qfree lib/memory.c:130
>     #2 0x7fa32474c421 in route_table_free lib/table.c:126
>     #3 0x7fa32474bf96 in route_table_finish lib/table.c:46
>     FRRouting#4 0x55b0e4fbca3a in zebra_router_free_table zebra/zebra_router.c:191
>     FRRouting#5 0x55b0e4fbccea in zebra_router_release_table zebra/zebra_router.c:214
>     FRRouting#6 0x55b0e4fd428e in zebra_vrf_disable zebra/zebra_vrf.c:219
>     FRRouting#7 0x7fa32476fabf in vrf_disable lib/vrf.c:326
>     FRRouting#8 0x7fa32476f5d4 in vrf_delete lib/vrf.c:231
>     FRRouting#9 0x55b0e4e4ad36 in interface_vrf_change zebra/interface.c:1478
>     FRRouting#10 0x55b0e4e4d5d2 in zebra_if_dplane_ifp_handling zebra/interface.c:1949
>     FRRouting#11 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#12 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#13 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#14 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#15 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#16 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
>     #0 0x7fa324caa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fa324668c4d in qcalloc lib/memory.c:105
>     #2 0x7fa32474bf33 in route_table_init_with_delegate lib/table.c:38
>     #3 0x7fa32474e73c in route_table_init lib/table.c:512
>     FRRouting#4 0x55b0e4fbc353 in zebra_router_get_table zebra/zebra_router.c:137
>     FRRouting#5 0x55b0e4fd4da0 in zebra_vrf_table_create zebra/zebra_vrf.c:358
>     FRRouting#6 0x55b0e4fd3d30 in zebra_vrf_enable zebra/zebra_vrf.c:140
>     FRRouting#7 0x7fa32476f9b2 in vrf_enable lib/vrf.c:286
>     FRRouting#8 0x55b0e4e4af76 in interface_vrf_change zebra/interface.c:1533
>     FRRouting#9 0x55b0e4e4d612 in zebra_if_dplane_ifp_handling zebra/interface.c:1968
>     FRRouting#10 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#11 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#12 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#13 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#14 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#15 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308

Linux RT table can be created before it is assigned and remains after it
is unassigned.

Do not release the table at VRF disabling but change the owner to the
default VRF. At VRF enabling, change the table owner to the VRF.

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind pushed a commit that referenced this pull request Aug 26, 2024
Some json attributes are missing for L3NHG values.

> PE1# show bgp vrf all detail
> [..]
> Instance vrf-purple:
> BGP table version is 1, local router ID is 27.3.0.85, vrf id 7
> Default local pref 100, local AS 65000
> BGP routing table entry for fe80::4620:ff:feff:ff01/128, version 1
> Paths: (1 available, best #1, vrf vrf-purple)
>   Not advertised to any peer
>   Imported from 10.30.30.30:5:[2]:[0]:[48]:[44:20:00:ff:ff:01]:[128]:[fe80::4620:ff:feff:ff01], VNI 4000  Local
>     ::ffff:a1e:1e1e (metric 20) from 10.30.30.30 (10.30.30.30) announce-nh-self
>       Origin IGP, localpref 100, valid, internal, best (First path received)
>       Extended Community: RT:65000:4000 ET:8
>       Last update: Thu Aug 22 18:23:38 2024
>
> Displayed 1 routes and 1 total paths

> PE1# show bgp vrf all json detail
> {
> "vrf-purple":{
>  "vrfId": 7,
>  "vrfName": "vrf-purple",
>  "tableVersion": 1,
>  "routerId": "27.3.0.85",
>  "defaultLocPrf": 100,
>  "localAS": 65000,
>  "routes": { "fe80::4620:ff:feff:ff01/128": [{"importedFrom":"10.30.30.30:5","l3nhg":false,"l3nhgActive":false, "vni": "4000",
> "aspath":{"string":"Local","segments":[],"length":0},"announceNexthopSelf":true,"origin":"IGP","locPrf":100,
> "valid":true,"version":1,"bestpath":{"overall":true,"selectionReason":"First path received"},
> "extendedCommunity":{"string":"RT:65000:4000 ET:8"},"lastUpdate":{"epoch":1724343817,
> "string":"Thu Aug 22 18:23:37 2024\n"},
> "nexthops":[{"ip":"::ffff:a1e:1e1e","hostname":"PE2","afi":"ipv6",
> "scope":"global","metric":20,"accessible":true,"used":true}],
> "peer":{"peerId":"10.30.30.30","routerId":"10.30.30.30","hostname":"PE2","type":"internal"}}]
>  }  }
> }

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
louis-6wind added a commit that referenced this pull request Aug 27, 2024
Fix crash when flex-algo is configured and mpls-te is disabled.

> interface eth0
>  ip router isis 1
> !
> router isis 1
>  flex-algo 129
>   dataplane sr-mpls
>   advertise-definition

> #0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=140486233631168) at ./nptl/pthread_kill.c:44
> #1  __pthread_kill_internal (signo=11, threadid=140486233631168) at ./nptl/pthread_kill.c:78
> #2  __GI___pthread_kill (threadid=140486233631168, signo=signo@entry=11) at ./nptl/pthread_kill.c:89
> #3  0x00007fc5802e9476 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
> FRRouting#4  0x00007fc58076021f in core_handler (signo=11, siginfo=0x7ffd38d42470, context=0x7ffd38d42340) at lib/sigevent.c:248
> FRRouting#5  <signal handler called>
> FRRouting#6  0x000055c527f798c9 in isis_link_params_update_asla (circuit=0x55c52aaed3c0, ifp=0x55c52a1044e0) at isisd/isis_te.c:176
> FRRouting#7  0x000055c527fb29da in isis_instance_flex_algo_create (args=0x7ffd38d43120) at isisd/isis_nb_config.c:2875
> FRRouting#8  0x00007fc58072655b in nb_callback_create (context=0x55c52ab1d2f0, nb_node=0x55c529f72950, event=NB_EV_APPLY, dnode=0x55c52ab06230, resource=0x55c52ab189f8, errmsg=0x7ffd38d43750 "",
>     errmsg_len=8192) at lib/northbound.c:1262
> FRRouting#9  0x00007fc580727625 in nb_callback_configuration (context=0x55c52ab1d2f0, event=NB_EV_APPLY, change=0x55c52ab189c0, errmsg=0x7ffd38d43750 "", errmsg_len=8192) at lib/northbound.c:1662
> FRRouting#10 0x00007fc580727c39 in nb_transaction_process (event=NB_EV_APPLY, transaction=0x55c52ab1d2f0, errmsg=0x7ffd38d43750 "", errmsg_len=8192) at lib/northbound.c:1794
> FRRouting#11 0x00007fc580725f77 in nb_candidate_commit_apply (transaction=0x55c52ab1d2f0, save_transaction=true, transaction_id=0x0, errmsg=0x7ffd38d43750 "", errmsg_len=8192)
>     at lib/northbound.c:1131
> FRRouting#12 0x00007fc5807260d1 in nb_candidate_commit (context=..., candidate=0x55c529f0a730, save_transaction=true, comment=0x0, transaction_id=0x0, errmsg=0x7ffd38d43750 "", errmsg_len=8192)
>     at lib/northbound.c:1164
> FRRouting#13 0x00007fc58072d220 in nb_cli_classic_commit (vty=0x55c52a0fc6b0) at lib/northbound_cli.c:51
> FRRouting#14 0x00007fc58072d839 in nb_cli_apply_changes_internal (vty=0x55c52a0fc6b0,
>     xpath_base=0x7ffd38d477f0 "/frr-isisd:isis/instance[area-tag='1'][vrf='default']/flex-algos/flex-algo[flex-algo='129']", clear_pending=false) at lib/northbound_cli.c:178
> FRRouting#15 0x00007fc58072dbcf in nb_cli_apply_changes (vty=0x55c52a0fc6b0, xpath_base_fmt=0x55c528014de0 "./flex-algos/flex-algo[flex-algo='%ld']") at lib/northbound_cli.c:234
> FRRouting#16 0x000055c527fd3403 in flex_algo_magic (self=0x55c52804f1a0 <flex_algo_cmd>, vty=0x55c52a0fc6b0, argc=2, argv=0x55c52ab00ec0, algorithm=129, algorithm_str=0x55c52ab120d0 "129")
>     at isisd/isis_cli.c:3752
> FRRouting#17 0x000055c527fc97cb in flex_algo (self=0x55c52804f1a0 <flex_algo_cmd>, vty=0x55c52a0fc6b0, argc=2, argv=0x55c52ab00ec0) at ./isisd/isis_cli_clippy.c:6445
> FRRouting#18 0x00007fc5806b9abc in cmd_execute_command_real (vline=0x55c52aaf78f0, vty=0x55c52a0fc6b0, cmd=0x0, up_level=0) at lib/command.c:984
> FRRouting#19 0x00007fc5806b9c35 in cmd_execute_command (vline=0x55c52aaf78f0, vty=0x55c52a0fc6b0, cmd=0x0, vtysh=0) at lib/command.c:1043
> FRRouting#20 0x00007fc5806ba1e5 in cmd_execute (vty=0x55c52a0fc6b0, cmd=0x55c52aae6bd0 "flex-algo 129\n", matched=0x0, vtysh=0) at lib/command.c:1209
> FRRouting#21 0x00007fc580782ae1 in vty_command (vty=0x55c52a0fc6b0, buf=0x55c52aae6bd0 "flex-algo 129\n") at lib/vty.c:615
> FRRouting#22 0x00007fc580784a05 in vty_execute (vty=0x55c52a0fc6b0) at lib/vty.c:1378
> FRRouting#23 0x00007fc580787131 in vtysh_read (thread=0x7ffd38d4ab10) at lib/vty.c:2373
> FRRouting#24 0x00007fc58077b605 in event_call (thread=0x7ffd38d4ab10) at lib/event.c:2011
> FRRouting#25 0x00007fc5806f8976 in frr_run (master=0x55c529df9b30) at lib/libfrr.c:1212
> FRRouting#26 0x000055c527f301bc in main (argc=5, argv=0x7ffd38d4ad58, envp=0x7ffd38d4ad88) at isisd/isis_main.c:350
> (gdb) f 6
> FRRouting#6  0x000055c527f798c9 in isis_link_params_update_asla (circuit=0x55c52aaed3c0, ifp=0x55c52a1044e0) at isisd/isis_te.c:176
> 176                     list_delete_all_node(ext->aslas);
> (gdb) p ext
> $1 = (struct isis_ext_subtlvs *) 0x0

Fixes: ae27101 ("isisd: fix building asla at first flex-algo config")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind pushed a commit that referenced this pull request Aug 30, 2024
…links

When a neighbor connection is disconnected, it may trigger LSP re-generation as a timer task, but this process may be delayed. As a result, the list of neighbors in area->adjacency_list may be inconsistent with the neighbors in lsp->tlvs->oldstyle_reach/extended_reach. For example, the area->adjacency_list may lack certain neighbors even though they are present in the LSP. When computing SPF, the call to isis_spf_build_adj_list() generates the spftree->sadj_list, which reflects the real neighbors in the area->adjacency_list. However, in the case of LAN links, spftree->sadj_list may include additional pseudo neighbors.
The pre-loading of tents through the call to isis_spf_preload_tent involves two steps:
1. isis_spf_process_lsp() is called to generate real neighbor vertices based on the root LSP and pseudo LSP.
2. isis_spf_add_local() is called to add corresponding next hops to the vertex->Adj_N list for the real neighbor vertices.
In the case of LAN links, the absence of corresponding real neighbors in the spftree->sadj_list prevents the execution of the second step. Consequently, the vertex->Adj_N list for the real neighbor vertices lacks corresponding next hops. This leads to a null pointer access when isis_lfa_compute() is called to calculate LFA.
As for P2P links, since there are no pseudo neighbors, only the second step is executed, which does not create real neighbor vertices and therefore does not encounter this issue.
The backtrace is as follows:
(gdb) bt
#0  0x00007fd065277fe1 in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007fd065398972 in core_handler (signo=11, siginfo=0x7ffc5c0636b0, context=0x7ffc5c063580) at ../lib/sigevent.c:261
#2  <signal handler called>
#3  0x00005564d82f8408 in isis_lfa_compute (area=0x5564d8b143f0, circuit=0x5564d8b21d10, spftree=0x5564d8b06bf0, resource=0x7ffc5c064410) at ../isisd/isis_lfa.c:2134
FRRouting#4  0x00005564d82f8d78 in isis_spf_run_lfa (area=0x5564d8b143f0, spftree=0x5564d8b06bf0) at ../isisd/isis_lfa.c:2344
FRRouting#5  0x00005564d8315964 in isis_run_spf_with_protection (area=0x5564d8b143f0, spftree=0x5564d8b06bf0) at ../isisd/isis_spf.c:1827
FRRouting#6  0x00005564d8315c15 in isis_run_spf_cb (thread=0x7ffc5c064590) at ../isisd/isis_spf.c:1889
FRRouting#7  0x00007fd0653b1f04 in thread_call (thread=0x7ffc5c064590) at ../lib/thread.c:1990
FRRouting#8  0x00007fd06534a97b in frr_run (master=0x5564d88103c0) at ../lib/libfrr.c:1198
FRRouting#9  0x00005564d82e7d5d in main (argc=5, argv=0x7ffc5c0647b8, envp=0x7ffc5c0647e8) at ../isisd/isis_main.c:273
(gdb) f 3
#3  0x00005564d82f8408 in isis_lfa_compute (area=0x5564d8b143f0, circuit=0x5564d8b21d10, spftree=0x5564d8b06bf0, resource=0x7ffc5c064410) at ../isisd/isis_lfa.c:2134
2134    ../isisd/isis_lfa.c: No such file or directory.
(gdb) p vadj_primary
$1 = (struct isis_vertex_adj *) 0x0
(gdb) p vertex->Adj_N->head
$2 = (struct listnode *) 0x0
(gdb) p (struct isis_vertex *)spftree->paths->l.list->head->next->next->next->next->data
$8 = (struct isis_vertex *) 0x5564d8b5b240
(gdb) p $8->type
$9 = VTYPE_NONPSEUDO_TE_IS
(gdb) p $8->N.id
$10 = "\000\000\000\000\000\002"
(gdb) p $8->Adj_N->count
$11 = 0
(gdb) p (struct isis_vertex *)spftree->paths->l.list->head->next->next->next->next->next->data
$12 = (struct isis_vertex *) 0x5564d8b73dd0
(gdb) p $12->type
$13 = VTYPE_NONPSEUDO_TE_IS
(gdb) p $12->N.id
$14 = "\000\000\000\000\000\003"
(gdb) p $12->Adj_N->count
$15 = 0
(gdb) p area->adjacency_list->count
$16 = 0
The backtrace provided above pertains to version 8.5.4, but it seems that the same issue exists in the code of the master branch as well.
The scenario where a vertex has no next hop is normal. For example, the "clear isis neighbor" command invokes isis_vertex_adj_del() to delete the next hop of a vertex. Upon reviewing all the instances where the vertex->Adj_N list is used, I found that only isis_lfa_compute() lacks a null check. Therefore, I believe that modifying this part will be sufficient. Additionally, the vertex->parents list for IP vertices is guaranteed not to be empty.
Test scenario:
Setting up LFA for LAN links and executing the "clear isis neighbor" command easily reproduces the issue.

Signed-off-by: zhou-run <zhou.run@h3c.com>
(cherry picked from commit a970bb5)
louis-6wind pushed a commit that referenced this pull request Aug 30, 2024
… the fragmented LSP

1. When the root IS regenerates an LSP, it calls lsp_build() -> lsp_clear_data() to free the TLV memory of the first fragment and all other fragments. If the number of fragments in the regenerated LSP decreases or if no fragmentation is needed, the extra LSP fragments are not immediately deleted. Instead, lsp_seqno_update() -> lsp_purge() is called to set the remaining time to zero and start aging, while also notifying other IS nodes to age these fragments. lsp_purge() usually does not reset lsp->hdr.seqno to zero because the LSP might recover during the aging process.
2. When other IS nodes receive an LSP, they always call process_lsp() -> isis_unpack_tlvs() to allocate TLV memory for the LSP. This does not differentiate whether the received LSP has a remaining lifetime of zero. Therefore, it is rare for an LSP of a non-root IS to have empty TLVs. Of course, if an LSP with a remaining time of zero and already corrupted is received, lsp_update() -> lsp_purge() will be called to free the TLV memory of the LSP, but this scenario is rare.
3. In LFA calculations, neighbors of the root IS are traversed, and each neighbor is taken as a new root to compute the neighbor SPT. During this process, the old root IS will serve as a neighbor of the new root IS, triggering a call to isis_spf_process_lsp() to parse the LSP of the old root IS and obtain its IP vertices and neighboring IS vertices. However, isis_spf_process_lsp() only checks whether the TLVs in the first fragment of the LSP exist, and does not check the TLVs in the fragmented LSP. If the TLV memory of the fragmented LSP of the old root IS has been freed, it can lead to a null pointer access, causing the current crash.

Additionally, for the base SPT, there are only two places where the LSP of the root IS is parsed:
1. When obtaining the UP neighbors of the root IS via spf_adj_list_parse_lsp().
2. When preloading the IP vertices of the root IS via isis_lsp_iterate_ip_reach().
Both of these checks ensure that frag->tlvs is not null, and they do not subsequently call isis_spf_process_lsp() to parse the root IS's LSP. It is very rare for non-root IS LSPs to have empty TLVs unless they are corrupted LSPs awaiting deletion. If it happens, a crash will occur.

The backtrace is as follows:
(gdb) bt
#0  0x00007f3097281fe1 in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f30973a2972 in core_handler (signo=11, siginfo=0x7ffce66c2870, context=0x7ffce66c2740) at ../lib/sigevent.c:261
#2  <signal handler called>
#3  0x000055dfa805512b in isis_spf_process_lsp (spftree=0x55dfa950eee0, lsp=0x55dfa94cb590, cost=10, depth=1, root_sysid=0x55dfa950ef6c "", parent=0x55dfa952fca0)
    at ../isisd/isis_spf.c:898
FRRouting#4  0x000055dfa805743b in isis_spf_loop (spftree=0x55dfa950eee0, root_sysid=0x55dfa950ef6c "") at ../isisd/isis_spf.c:1688
FRRouting#5  0x000055dfa805784f in isis_run_spf (spftree=0x55dfa950eee0) at ../isisd/isis_spf.c:1808
FRRouting#6  0x000055dfa8037ff5 in isis_spf_run_neighbors (spftree=0x55dfa9474440) at ../isisd/isis_lfa.c:1259
FRRouting#7  0x000055dfa803ac17 in isis_spf_run_lfa (area=0x55dfa9477510, spftree=0x55dfa9474440) at ../isisd/isis_lfa.c:2300
FRRouting#8  0x000055dfa8057964 in isis_run_spf_with_protection (area=0x55dfa9477510, spftree=0x55dfa9474440) at ../isisd/isis_spf.c:1827
FRRouting#9  0x000055dfa8057c15 in isis_run_spf_cb (thread=0x7ffce66c38e0) at ../isisd/isis_spf.c:1889
FRRouting#10 0x00007f30973bbf04 in thread_call (thread=0x7ffce66c38e0) at ../lib/thread.c:1990
FRRouting#11 0x00007f309735497b in frr_run (master=0x55dfa91733c0) at ../lib/libfrr.c:1198
FRRouting#12 0x000055dfa8029d5d in main (argc=5, argv=0x7ffce66c3b08, envp=0x7ffce66c3b38) at ../isisd/isis_main.c:273
(gdb) f 3
#3  0x000055dfa805512b in isis_spf_process_lsp (spftree=0x55dfa950eee0, lsp=0x55dfa94cb590, cost=10, depth=1, root_sysid=0x55dfa950ef6c "", parent=0x55dfa952fca0)
    at ../isisd/isis_spf.c:898
898     ../isisd/isis_spf.c: No such file or directory.
(gdb) p te_neighs
$1 = (struct isis_item_list *) 0x120
(gdb) p lsp->tlvs
$2 = (struct isis_tlvs *) 0x0
(gdb) p lsp->hdr
$3 = {pdu_len = 27, rem_lifetime = 0, lsp_id = "\000\000\000\000\000\001\000\001", seqno = 4, checksum = 59918, lsp_bits = 1 '\001'}

The backtrace provided above pertains to version 8.5.4, but it seems that the same issue exists in the code of the master branch as well.

I have reviewed the process for calculating the SPT based on the LSP, and isis_spf_process_lsp() is the only function that does not check whether the TLVs in the fragments are empty. Therefore, I believe that modifying this function alone should be sufficient. If the TLVs of the current fragment are already empty, we do not need to continue processing subsequent fragments. This is consistent with the behavior where we do not process fragments if the TLVs of the first fragment are empty.
Of course, one could argue that lsp_purge() should still retain the TLV memory, freeing it and then reallocating it if needed. However, this is a debatable point because in some scenarios, it is permissible for the LSP to have empty TLVs. For example, after receiving an SNP (Sequence Number PDU) message, an empty LSP (with lsp->hdr.seqno = 0) might be created by calling lsp_new. If the corresponding LSP message is discarded due to domain or area authentication failure, the TLV memory wouldn't be allocated.

Test scenario:
In an LFA network, importing a sufficient number of static routes to cause LSP fragmentation, and then rolling back the imported static routes so that the LSP is no longer fragmented, can easily result in this issue.

Signed-off-by: zhou-run <zhou.run@h3c.com>
(cherry picked from commit e905177)
louis-6wind added a commit that referenced this pull request Aug 30, 2024
Fix the following crash when pim options are (un)configured on an
non-existent interface.

> r1(config)# int fgljdsf
> r1(config-if)# no ip pim unicast-bsm
> vtysh: error reading from pimd: Connection reset by peer (104)Warning: closing connection to pimd because of an I/O error!

> #0  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007f70c8f32249 in core_handler (signo=11, siginfo=0x7fffff88e4f0, context=0x7fffff88e3c0) at lib/sigevent.c:258
> #2  <signal handler called>
> #3  0x0000556cfdd9b16d in lib_interface_pim_address_family_unicast_bsm_modify (args=0x7fffff88f130) at pimd/pim_nb_config.c:1910
> FRRouting#4  0x00007f70c8efdcb5 in nb_callback_modify (context=0x556d00032b60, nb_node=0x556cffeeb9b0, event=NB_EV_APPLY, dnode=0x556d00031670, resource=0x556d00032b48, errmsg=0x7fffff88f710 "", errmsg_len=8192)
>     at lib/northbound.c:1538
> FRRouting#5  0x00007f70c8efe949 in nb_callback_configuration (context=0x556d00032b60, event=NB_EV_APPLY, change=0x556d00032b10, errmsg=0x7fffff88f710 "", errmsg_len=8192) at lib/northbound.c:1888
> FRRouting#6  0x00007f70c8efee82 in nb_transaction_process (event=NB_EV_APPLY, transaction=0x556d00032b60, errmsg=0x7fffff88f710 "", errmsg_len=8192) at lib/northbound.c:2016
> FRRouting#7  0x00007f70c8efd658 in nb_candidate_commit_apply (transaction=0x556d00032b60, save_transaction=true, transaction_id=0x0, errmsg=0x7fffff88f710 "", errmsg_len=8192) at lib/northbound.c:1356
> FRRouting#8  0x00007f70c8efd78e in nb_candidate_commit (context=..., candidate=0x556cffeb0e80, save_transaction=true, comment=0x0, transaction_id=0x0, errmsg=0x7fffff88f710 "", errmsg_len=8192) at lib/northbound.c:1389
> FRRouting#9  0x00007f70c8f03e58 in nb_cli_classic_commit (vty=0x556d00025a80) at lib/northbound_cli.c:51
> FRRouting#10 0x00007f70c8f043f8 in nb_cli_apply_changes_internal (vty=0x556d00025a80,
>     xpath_base=0x7fffff893bb0 "/frr-interface:lib/interface[name='fgljdsf']/frr-pim:pim/address-family[address-family='frr-routing:ipv4']", clear_pending=false) at lib/northbound_cli.c:178
> FRRouting#11 0x00007f70c8f0475d in nb_cli_apply_changes (vty=0x556d00025a80, xpath_base_fmt=0x556cfdde9fe0 "./frr-pim:pim/address-family[address-family='%s']") at lib/northbound_cli.c:234
> FRRouting#12 0x0000556cfdd8298f in pim_process_no_unicast_bsm_cmd (vty=0x556d00025a80) at pimd/pim_cmd_common.c:3493
> FRRouting#13 0x0000556cfddcf782 in no_ip_pim_ucast_bsm (self=0x556cfde40b20 <no_ip_pim_ucast_bsm_cmd>, vty=0x556d00025a80, argc=4, argv=0x556d00031500) at pimd/pim_cmd.c:4950
> FRRouting#14 0x00007f70c8e942f0 in cmd_execute_command_real (vline=0x556d00032070, vty=0x556d00025a80, cmd=0x0, up_level=0) at lib/command.c:1002
> FRRouting#15 0x00007f70c8e94451 in cmd_execute_command (vline=0x556d00032070, vty=0x556d00025a80, cmd=0x0, vtysh=0) at lib/command.c:1061
> FRRouting#16 0x00007f70c8e9499f in cmd_execute (vty=0x556d00025a80, cmd=0x556d00030320 "no ip pim unicast-bsm", matched=0x0, vtysh=0) at lib/command.c:1227
> FRRouting#17 0x00007f70c8f51e44 in vty_command (vty=0x556d00025a80, buf=0x556d00030320 "no ip pim unicast-bsm") at lib/vty.c:616
> FRRouting#18 0x00007f70c8f53bdd in vty_execute (vty=0x556d00025a80) at lib/vty.c:1379
> FRRouting#19 0x00007f70c8f55d59 in vtysh_read (thread=0x7fffff896600) at lib/vty.c:2374
> FRRouting#20 0x00007f70c8f4b209 in event_call (thread=0x7fffff896600) at lib/event.c:2011
> FRRouting#21 0x00007f70c8ed109e in frr_run (master=0x556cffdb4ea0) at lib/libfrr.c:1217
> FRRouting#22 0x0000556cfdddec12 in main (argc=2, argv=0x7fffff896828, envp=0x7fffff896840) at pimd/pim_main.c:165
> (gdb) f 3
> #3  0x0000556cfdd9b16d in lib_interface_pim_address_family_unicast_bsm_modify (args=0x7fffff88f130) at pimd/pim_nb_config.c:1910
> 1910			pim_ifp->ucast_bsm_accept =
> (gdb) list
> 1905		case NB_EV_ABORT:
> 1906			break;
> 1907		case NB_EV_APPLY:
> 1908			ifp = nb_running_get_entry(args->dnode, NULL, true);
> 1909			pim_ifp = ifp->info;
> 1910			pim_ifp->ucast_bsm_accept =
> 1911				yang_dnode_get_bool(args->dnode, NULL);
> 1912
> 1913			break;
> 1914		}
> (gdb) p pim_ifp
> $1 = (struct pim_interface *) 0x0

Fixes: 3bb513c ("lib: adapt to version 2 of libyang")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 6952bea)
louis-6wind added a commit that referenced this pull request Aug 30, 2024
When 'no rpki' is requested and the rtrlib RPKI object was freed, bgpd
is crashing.

RPKI is configured in VRF red.

> ip l set red down
> ip l del red
> printf 'conf\n vrf red\n no rpki' | vtysh

> Core was generated by `/usr/bin/bgpd -A 127.0.0.1 -M snmp -M rpki -M bmp'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=140411103615424) at ./nptl/pthread_kill.c:44
> 44	./nptl/pthread_kill.c: No such file or directory.
> [Current thread is 1 (Thread 0x7fb401f419c0 (LWP 190226))]
> (gdb) bt
> #0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=140411103615424) at ./nptl/pthread_kill.c:44
> #1  __pthread_kill_internal (signo=11, threadid=140411103615424) at ./nptl/pthread_kill.c:78
> #2  __GI___pthread_kill (threadid=140411103615424, signo=signo@entry=11) at ./nptl/pthread_kill.c:89
> #3  0x00007fb4021ad476 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
> FRRouting#4  0x00007fb4025ce22b in core_handler (signo=11, siginfo=0x7fff831b2d70, context=0x7fff831b2c40) at lib/sigevent.c:248
> FRRouting#5  <signal handler called>
> FRRouting#6  rtr_mgr_remove_group (config=0x55fe8789f750, preference=11) at /build/make-pkg/output/source/DIST_RTRLIB/rtrlib/rtrlib/rtr_mgr.c:607
> FRRouting#7  0x00007fb40145f518 in rpki_delete_all_cache_nodes (rpki_vrf=0x55fe8789f4f0) at bgpd/bgp_rpki.c:442
> FRRouting#8  0x00007fb401463098 in no_rpki_magic (self=0x7fb40146bba0 <no_rpki_cmd>, vty=0x55fe877f5130, argc=2, argv=0x55fe877fccd0) at bgpd/bgp_rpki.c:1732
> FRRouting#9  0x00007fb40145c09a in no_rpki (self=0x7fb40146bba0 <no_rpki_cmd>, vty=0x55fe877f5130, argc=2, argv=0x55fe877fccd0) at ./bgpd/bgp_rpki_clippy.c:37
> FRRouting#10 0x00007fb402527abc in cmd_execute_command_real (vline=0x55fe877fd150, vty=0x55fe877f5130, cmd=0x0, up_level=0) at lib/command.c:984
> FRRouting#11 0x00007fb402527c35 in cmd_execute_command (vline=0x55fe877fd150, vty=0x55fe877f5130, cmd=0x0, vtysh=0) at lib/command.c:1043
> FRRouting#12 0x00007fb4025281e5 in cmd_execute (vty=0x55fe877f5130, cmd=0x55fe877fb8c0 "no rpki\n", matched=0x0, vtysh=0) at lib/command.c:1209
> FRRouting#13 0x00007fb4025f0aed in vty_command (vty=0x55fe877f5130, buf=0x55fe877fb8c0 "no rpki\n") at lib/vty.c:615
> FRRouting#14 0x00007fb4025f2a11 in vty_execute (vty=0x55fe877f5130) at lib/vty.c:1378
> FRRouting#15 0x00007fb4025f513d in vtysh_read (thread=0x7fff831b5fa0) at lib/vty.c:2373
> FRRouting#16 0x00007fb4025e9611 in event_call (thread=0x7fff831b5fa0) at lib/event.c:2011
> FRRouting#17 0x00007fb402566976 in frr_run (master=0x55fe871a14a0) at lib/libfrr.c:1212
> FRRouting#18 0x000055fe857829fa in main (argc=9, argv=0x7fff831b6218) at bgpd/bgp_main.c:549

Fixes: 8156765 ("bgpd: Add `no rpki` command")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit 4e053d6)
louis-6wind added a commit that referenced this pull request Aug 30, 2024
Fix crash when flex-algo is configured and mpls-te is disabled.

> interface eth0
>  ip router isis 1
> !
> router isis 1
>  flex-algo 129
>   dataplane sr-mpls
>   advertise-definition

> #0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=140486233631168) at ./nptl/pthread_kill.c:44
> #1  __pthread_kill_internal (signo=11, threadid=140486233631168) at ./nptl/pthread_kill.c:78
> #2  __GI___pthread_kill (threadid=140486233631168, signo=signo@entry=11) at ./nptl/pthread_kill.c:89
> #3  0x00007fc5802e9476 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
> FRRouting#4  0x00007fc58076021f in core_handler (signo=11, siginfo=0x7ffd38d42470, context=0x7ffd38d42340) at lib/sigevent.c:248
> FRRouting#5  <signal handler called>
> FRRouting#6  0x000055c527f798c9 in isis_link_params_update_asla (circuit=0x55c52aaed3c0, ifp=0x55c52a1044e0) at isisd/isis_te.c:176
> FRRouting#7  0x000055c527fb29da in isis_instance_flex_algo_create (args=0x7ffd38d43120) at isisd/isis_nb_config.c:2875
> FRRouting#8  0x00007fc58072655b in nb_callback_create (context=0x55c52ab1d2f0, nb_node=0x55c529f72950, event=NB_EV_APPLY, dnode=0x55c52ab06230, resource=0x55c52ab189f8, errmsg=0x7ffd38d43750 "",
>     errmsg_len=8192) at lib/northbound.c:1262
> FRRouting#9  0x00007fc580727625 in nb_callback_configuration (context=0x55c52ab1d2f0, event=NB_EV_APPLY, change=0x55c52ab189c0, errmsg=0x7ffd38d43750 "", errmsg_len=8192) at lib/northbound.c:1662
> FRRouting#10 0x00007fc580727c39 in nb_transaction_process (event=NB_EV_APPLY, transaction=0x55c52ab1d2f0, errmsg=0x7ffd38d43750 "", errmsg_len=8192) at lib/northbound.c:1794
> FRRouting#11 0x00007fc580725f77 in nb_candidate_commit_apply (transaction=0x55c52ab1d2f0, save_transaction=true, transaction_id=0x0, errmsg=0x7ffd38d43750 "", errmsg_len=8192)
>     at lib/northbound.c:1131
> FRRouting#12 0x00007fc5807260d1 in nb_candidate_commit (context=..., candidate=0x55c529f0a730, save_transaction=true, comment=0x0, transaction_id=0x0, errmsg=0x7ffd38d43750 "", errmsg_len=8192)
>     at lib/northbound.c:1164
> FRRouting#13 0x00007fc58072d220 in nb_cli_classic_commit (vty=0x55c52a0fc6b0) at lib/northbound_cli.c:51
> FRRouting#14 0x00007fc58072d839 in nb_cli_apply_changes_internal (vty=0x55c52a0fc6b0,
>     xpath_base=0x7ffd38d477f0 "/frr-isisd:isis/instance[area-tag='1'][vrf='default']/flex-algos/flex-algo[flex-algo='129']", clear_pending=false) at lib/northbound_cli.c:178
> FRRouting#15 0x00007fc58072dbcf in nb_cli_apply_changes (vty=0x55c52a0fc6b0, xpath_base_fmt=0x55c528014de0 "./flex-algos/flex-algo[flex-algo='%ld']") at lib/northbound_cli.c:234
> FRRouting#16 0x000055c527fd3403 in flex_algo_magic (self=0x55c52804f1a0 <flex_algo_cmd>, vty=0x55c52a0fc6b0, argc=2, argv=0x55c52ab00ec0, algorithm=129, algorithm_str=0x55c52ab120d0 "129")
>     at isisd/isis_cli.c:3752
> FRRouting#17 0x000055c527fc97cb in flex_algo (self=0x55c52804f1a0 <flex_algo_cmd>, vty=0x55c52a0fc6b0, argc=2, argv=0x55c52ab00ec0) at ./isisd/isis_cli_clippy.c:6445
> FRRouting#18 0x00007fc5806b9abc in cmd_execute_command_real (vline=0x55c52aaf78f0, vty=0x55c52a0fc6b0, cmd=0x0, up_level=0) at lib/command.c:984
> FRRouting#19 0x00007fc5806b9c35 in cmd_execute_command (vline=0x55c52aaf78f0, vty=0x55c52a0fc6b0, cmd=0x0, vtysh=0) at lib/command.c:1043
> FRRouting#20 0x00007fc5806ba1e5 in cmd_execute (vty=0x55c52a0fc6b0, cmd=0x55c52aae6bd0 "flex-algo 129\n", matched=0x0, vtysh=0) at lib/command.c:1209
> FRRouting#21 0x00007fc580782ae1 in vty_command (vty=0x55c52a0fc6b0, buf=0x55c52aae6bd0 "flex-algo 129\n") at lib/vty.c:615
> FRRouting#22 0x00007fc580784a05 in vty_execute (vty=0x55c52a0fc6b0) at lib/vty.c:1378
> FRRouting#23 0x00007fc580787131 in vtysh_read (thread=0x7ffd38d4ab10) at lib/vty.c:2373
> FRRouting#24 0x00007fc58077b605 in event_call (thread=0x7ffd38d4ab10) at lib/event.c:2011
> FRRouting#25 0x00007fc5806f8976 in frr_run (master=0x55c529df9b30) at lib/libfrr.c:1212
> FRRouting#26 0x000055c527f301bc in main (argc=5, argv=0x7ffd38d4ad58, envp=0x7ffd38d4ad88) at isisd/isis_main.c:350
> (gdb) f 6
> FRRouting#6  0x000055c527f798c9 in isis_link_params_update_asla (circuit=0x55c52aaed3c0, ifp=0x55c52a1044e0) at isisd/isis_te.c:176
> 176                     list_delete_all_node(ext->aslas);
> (gdb) p ext
> $1 = (struct isis_ext_subtlvs *) 0x0

Fixes: ae27101 ("isisd: fix building asla at first flex-algo config")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
(cherry picked from commit cd81d28)
louis-6wind added a commit that referenced this pull request Sep 9, 2024
The following causes a isisd crash.

> # cat config
> affinity-map green bit-position 0
> router isis 1
>  flex-algo 129
>   affinity exclude-any green
> # vtysh -f config

> #0  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007f650cd32756 in core_handler (signo=6, siginfo=0x7ffc56f93070, context=0x7ffc56f92f40) at lib/sigevent.c:258
> #2  <signal handler called>
> #3  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> FRRouting#4  0x00007f650c91c537 in __GI_abort () at abort.c:79
> FRRouting#5  0x00007f650cd007c9 in nb_running_get_entry_worker (dnode=0x0, xpath=0x0, abort_if_not_found=true, rec_search=true) at lib/northbound.c:2531
> FRRouting#6  0x00007f650cd007f9 in nb_running_get_entry (dnode=0x55d9ad406e00, xpath=0x0, abort_if_not_found=true) at lib/northbound.c:2537
> FRRouting#7  0x000055d9ab302248 in isis_instance_flex_algo_affinity_set (args=0x7ffc56f947a0, type=2) at isisd/isis_nb_config.c:2998
> FRRouting#8  0x000055d9ab3027c0 in isis_instance_flex_algo_affinity_exclude_any_create (args=0x7ffc56f947a0) at isisd/isis_nb_config.c:3155
> FRRouting#9  0x00007f650ccfe284 in nb_callback_create (context=0x7ffc56f94d20, nb_node=0x55d9ad28b540, event=NB_EV_VALIDATE, dnode=0x55d9ad406e00, resource=0x0, errmsg=0x7ffc56f94de0 "",
>     errmsg_len=8192) at lib/northbound.c:1487
> FRRouting#10 0x00007f650ccff067 in nb_callback_configuration (context=0x7ffc56f94d20, event=NB_EV_VALIDATE, change=0x55d9ad406d40, errmsg=0x7ffc56f94de0 "", errmsg_len=8192) at lib/northbound.c:1884
> FRRouting#11 0x00007f650ccfda31 in nb_candidate_validate_code (context=0x7ffc56f94d20, candidate=0x55d9ad20d710, changes=0x7ffc56f94d38, errmsg=0x7ffc56f94de0 "", errmsg_len=8192)
>     at lib/northbound.c:1246
> FRRouting#12 0x00007f650ccfdc67 in nb_candidate_commit_prepare (context=..., candidate=0x55d9ad20d710, comment=0x0, transaction=0x7ffc56f94da0, skip_validate=false, ignore_zero_change=false,
>     errmsg=0x7ffc56f94de0 "", errmsg_len=8192) at lib/northbound.c:1317
> FRRouting#13 0x00007f650ccfdec4 in nb_candidate_commit (context=..., candidate=0x55d9ad20d710, save_transaction=true, comment=0x0, transaction_id=0x0, errmsg=0x7ffc56f94de0 "", errmsg_len=8192)
>     at lib/northbound.c:1381
> FRRouting#14 0x00007f650cd045ba in nb_cli_classic_commit (vty=0x55d9ad3f7490) at lib/northbound_cli.c:57
> FRRouting#15 0x00007f650cd04749 in nb_cli_pending_commit_check (vty=0x55d9ad3f7490) at lib/northbound_cli.c:96
> FRRouting#16 0x00007f650cc94340 in cmd_execute_command_real (vline=0x55d9ad3eea10, vty=0x55d9ad3f7490, cmd=0x0, up_level=0) at lib/command.c:1000
> FRRouting#17 0x00007f650cc94599 in cmd_execute_command (vline=0x55d9ad3eea10, vty=0x55d9ad3f7490, cmd=0x0, vtysh=0) at lib/command.c:1080
> FRRouting#18 0x00007f650cc94a0c in cmd_execute (vty=0x55d9ad3f7490, cmd=0x55d9ad401d30 "XFRR_end_configuration", matched=0x0, vtysh=0) at lib/command.c:1228
> FRRouting#19 0x00007f650cd523a4 in vty_command (vty=0x55d9ad3f7490, buf=0x55d9ad401d30 "XFRR_end_configuration") at lib/vty.c:625
> FRRouting#20 0x00007f650cd5413d in vty_execute (vty=0x55d9ad3f7490) at lib/vty.c:1388
> FRRouting#21 0x00007f650cd56353 in vtysh_read (thread=0x7ffc56f99370) at lib/vty.c:2400
> FRRouting#22 0x00007f650cd4b6fd in event_call (thread=0x7ffc56f99370) at lib/event.c:1996
> FRRouting#23 0x00007f650ccd1365 in frr_run (master=0x55d9ad103cf0) at lib/libfrr.c:1231
> FRRouting#24 0x000055d9ab29036e in main (argc=2, argv=0x7ffc56f99598, envp=0x7ffc56f995b0) at isisd/isis_main.c:354

Configuring the same in vtysh configure interactive mode works properly.
When using "vtysh -f", the northbound compatible configuration is
committed together whereas, in interactive mode, it committed line by
line. In the first situation, in validation state nb_running_get_entry()
fails because the area not yet in running.

Do not use nb_running_get_entry() northbound validation state.

Fixes: 893882e ("isisd: add isis flex-algo configuration backend")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind added a commit that referenced this pull request Sep 10, 2024
Fix a crash when modifying a route-map with set as-path exclude without
as-path-access-list:

> router(config)# route-map routemaptest deny 1
> router(config-route-map)# set as-path exclude 33 34 35
> router(config-route-map)# set as-path exclude as-path-access-list test

> #0  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007fb3959327de in core_handler (signo=11, siginfo=0x7ffd122da530, context=0x7ffd122da400) at lib/sigevent.c:258
> #2  <signal handler called>
> #3  0x000055ab2762a1bd in as_list_list_del (h=0x55ab27897680 <as_exclude_list_orphan>, item=0x55ab28204e20) at ./bgpd/bgp_aspath.h:77
> FRRouting#4  0x000055ab2762d1a8 in as_exclude_remove_orphan (ase=0x55ab28204e20) at bgpd/bgp_aspath.c:1574
> FRRouting#5  0x000055ab27550538 in route_aspath_exclude_free (rule=0x55ab28204e20) at bgpd/bgp_routemap.c:2366
> FRRouting#6  0x00007fb39591f00c in route_map_rule_delete (list=0x55ab28203498, rule=0x55ab28204170) at lib/routemap.c:1357
> FRRouting#7  0x00007fb39591f87c in route_map_add_set (index=0x55ab28203460, set_name=0x55ab276ad2aa "as-path exclude", set_arg=0x55ab281e4f70 "as-path-access-list test") at lib/routemap.c:1674
> FRRouting#8  0x00007fb39591d3f3 in generic_set_add (index=0x55ab28203460, command=0x55ab276ad2aa "as-path exclude", arg=0x55ab281e4f70 "as-path-access-list test", errmsg=0x7ffd122db870 "",
>     errmsg_len=8192) at lib/routemap.c:533
> FRRouting#9  0x000055ab2755e78e in lib_route_map_entry_set_action_rmap_set_action_exclude_as_path_modify (args=0x7ffd122db290) at bgpd/bgp_routemap_nb_config.c:2427
> FRRouting#10 0x00007fb3958fe417 in nb_callback_modify (context=0x55ab28205aa0, nb_node=0x55ab27cb31e0, event=NB_EV_APPLY, dnode=0x55ab28202690, resource=0x55ab27c32148, errmsg=0x7ffd122db870 "",
>     errmsg_len=8192) at lib/northbound.c:1538
> FRRouting#11 0x00007fb3958ff0ab in nb_callback_configuration (context=0x55ab28205aa0, event=NB_EV_APPLY, change=0x55ab27c32110, errmsg=0x7ffd122db870 "", errmsg_len=8192) at lib/northbound.c:1888
> FRRouting#12 0x00007fb3958ff5e4 in nb_transaction_process (event=NB_EV_APPLY, transaction=0x55ab28205aa0, errmsg=0x7ffd122db870 "", errmsg_len=8192) at lib/northbound.c:2016
> FRRouting#13 0x00007fb3958fddba in nb_candidate_commit_apply (transaction=0x55ab28205aa0, save_transaction=true, transaction_id=0x0, errmsg=0x7ffd122db870 "", errmsg_len=8192)
>     at lib/northbound.c:1356
> FRRouting#14 0x00007fb3958fdef0 in nb_candidate_commit (context=..., candidate=0x55ab27c2c9a0, save_transaction=true, comment=0x0, transaction_id=0x0, errmsg=0x7ffd122db870 "", errmsg_len=8192)
>     at lib/northbound.c:1389
> FRRouting#15 0x00007fb3959045ba in nb_cli_classic_commit (vty=0x55ab281f6680) at lib/northbound_cli.c:57
> FRRouting#16 0x00007fb395904b5a in nb_cli_apply_changes_internal (vty=0x55ab281f6680, xpath_base=0x7ffd122dfd10 "/frr-route-map:lib/route-map[name='routemaptest']/entry[sequence='1']",
>     clear_pending=false) at lib/northbound_cli.c:184
> FRRouting#17 0x00007fb395904ebf in nb_cli_apply_changes (vty=0x55ab281f6680, xpath_base_fmt=0x0) at lib/northbound_cli.c:240
> --Type <RET> for more, q to quit, c to continue without paging--
> FRRouting#18 0x000055ab27557d2e in set_aspath_exclude_access_list_magic (self=0x55ab2775c300 <set_aspath_exclude_access_list_cmd>, vty=0x55ab281f6680, argc=5, argv=0x55ab28204c80,
>     as_path_filter_name=0x55ab28202040 "test") at bgpd/bgp_routemap.c:6397
> FRRouting#19 0x000055ab2754bdea in set_aspath_exclude_access_list (self=0x55ab2775c300 <set_aspath_exclude_access_list_cmd>, vty=0x55ab281f6680, argc=5, argv=0x55ab28204c80)
>     at ./bgpd/bgp_routemap_clippy.c:856
> FRRouting#20 0x00007fb39589435d in cmd_execute_command_real (vline=0x55ab281e61f0, vty=0x55ab281f6680, cmd=0x0, up_level=0) at lib/command.c:1003
> FRRouting#21 0x00007fb3958944be in cmd_execute_command (vline=0x55ab281e61f0, vty=0x55ab281f6680, cmd=0x0, vtysh=0) at lib/command.c:1062
> FRRouting#22 0x00007fb395894a0c in cmd_execute (vty=0x55ab281f6680, cmd=0x55ab28200f20 "set as-path exclude as-path-access-list test", matched=0x0, vtysh=0) at lib/command.c:1228
> FRRouting#23 0x00007fb39595242c in vty_command (vty=0x55ab281f6680, buf=0x55ab28200f20 "set as-path exclude as-path-access-list test") at lib/vty.c:625
> FRRouting#24 0x00007fb3959541c5 in vty_execute (vty=0x55ab281f6680) at lib/vty.c:1388
> FRRouting#25 0x00007fb3959563db in vtysh_read (thread=0x7ffd122e2bb0) at lib/vty.c:2400
> FRRouting#26 0x00007fb39594b785 in event_call (thread=0x7ffd122e2bb0) at lib/event.c:1996
> FRRouting#27 0x00007fb3958d1365 in frr_run (master=0x55ab27b56d70) at lib/libfrr.c:1231
> FRRouting#28 0x000055ab2747f1cc in main (argc=3, argv=0x7ffd122e2e08) at bgpd/bgp_main.c:555

Fixes: 094dcc3 ("bgpd: fix "bgp as-pah access-list" with "set aspath exclude" set/unset issues")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind added a commit that referenced this pull request Sep 12, 2024
Level 2 adjacency list is not supposed to be always set.

> #0  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1  0x00007f9f0353274f in core_handler (signo=6, siginfo=0x7ffe95260770, context=0x7ffe95260640) at lib/sigevent.c:258
> #2  <signal handler called>
> #3  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> FRRouting#4  0x00007f9f0324e537 in __GI_abort () at abort.c:79
> FRRouting#5  0x00007f9f035744ea in _zlog_assert_failed (xref=0x7f9f0362c6c0 <_xref.15>, extra=0x0) at lib/zlog.c:789
> FRRouting#6  0x00007f9f034d25ee in listnode_head (list=0x0) at lib/linklist.c:316
> FRRouting#7  0x000055cd65aaa481 in lib_interface_state_isis_adjacencies_adjacency_get_next (args=0x7ffe95261730) at isisd/isis_nb_state.c:101
> FRRouting#8  0x00007f9f034feadd in nb_callback_get_next (nb_node=0x55cd673c0190, parent_list_entry=0x55cd67570d30, list_entry=0x55cd6758f8a0) at lib/northbound.c:1748
> FRRouting#9  0x00007f9f0350bf07 in __walk (ys=0x55cd675782b0, is_resume=false) at lib/northbound_oper.c:1264
> FRRouting#10 0x00007f9f0350deaa in nb_op_walk_start (ys=0x55cd675782b0) at lib/northbound_oper.c:1741
> FRRouting#11 0x00007f9f0350e079 in nb_oper_iterate_legacy (xpath=0x55cd67595c60 "/frr-interface:lib", translator=0x0, flags=0, cb=0x0, cb_arg=0x0, tree=0x7ffe952621b0) at lib/northbound_oper.c:1803
> FRRouting#12 0x00007f9f03507661 in show_yang_operational_data_magic (self=0x7f9f03634a80 <show_yang_operational_data_cmd>, vty=0x55cd675a61f0, argc=4, argv=0x55cd6758eab0,
>     xpath=0x55cd67595c60 "/frr-interface:lib", json=0x0, xml=0x0, translator_family=0x0, with_config=0x0) at lib/northbound_cli.c:1576
> FRRouting#13 0x00007f9f035037f0 in show_yang_operational_data (self=0x7f9f03634a80 <show_yang_operational_data_cmd>, vty=0x55cd675a61f0, argc=4, argv=0x55cd6758eab0)
>     at ./lib/northbound_cli_clippy.c:906
> FRRouting#14 0x00007f9f0349435d in cmd_execute_command_real (vline=0x55cd6758e490, vty=0x55cd675a61f0, cmd=0x0, up_level=0) at lib/command.c:1003
> FRRouting#15 0x00007f9f03494477 in cmd_execute_command (vline=0x55cd67585340, vty=0x55cd675a61f0, cmd=0x0, vtysh=0) at lib/command.c:1053
> FRRouting#16 0x00007f9f03494a0c in cmd_execute (vty=0x55cd675a61f0, cmd=0x55cd67579040 "do show yang operational-data /frr-interface:lib", matched=0x0, vtysh=0) at lib/command.c:1228
> FRRouting#17 0x00007f9f0355239d in vty_command (vty=0x55cd675a61f0, buf=0x55cd67579040 "do show yang operational-data /frr-interface:lib") at lib/vty.c:625
> FRRouting#18 0x00007f9f03554136 in vty_execute (vty=0x55cd675a61f0) at lib/vty.c:1388
> FRRouting#19 0x00007f9f0355634c in vtysh_read (thread=0x7ffe952647a0) at lib/vty.c:2400
> FRRouting#20 0x00007f9f0354b6f6 in event_call (thread=0x7ffe952647a0) at lib/event.c:1996
> FRRouting#21 0x00007f9f034d1365 in frr_run (master=0x55cd67204da0) at lib/libfrr.c:1231
> FRRouting#22 0x000055cd65a3236e in main (argc=7, argv=0x7ffe952649c8, envp=0x7ffe95264a08) at isisd/isis_main.c:354

Fixes: 2a1c520 ("isisd: split northbound callbacks into multiple files")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind added a commit that referenced this pull request Sep 14, 2024
There is a race condition with addpath withdrawal. The withdrawal never
happens if it was request when coalesce timer running.

It can be demonstrated by adding to bgp_snmp_bgp4v2mib/rr/bgpd.conf:

> router bgp 65004
> + coalesce-time 10000

But it also happens in other conditions. For example, when using
"vtysh -f" to load the configuration.

It results in having two identical paths but with different addpath on
r2:

> r2# sh bgp ipv4 10.0.0.0/31
> BGP routing table entry for 10.0.0.0/31, version 3
> Paths: (3 available, best #1, table default)
>   65004
>     192.168.12.4 from 192.168.12.4 (192.168.12.4)
>       Origin IGP, metric 1, valid, external, multipath, best (AS Path)
>       AddPath ID: RX 0, TX-All 2 TX-Best-Per-AS 0 TX-Best-Selected 0
>       Advertised to: 192.168.12.4
>       Last update: Thu Sep 12 16:13:59 2024
>   65004
>     192.168.12.4 from 192.168.12.4 (192.168.12.4)
>       Origin IGP, metric 1, valid, external, multipath
>       AddPath ID: RX 3, TX-All 4 TX-Best-Per-AS 0 TX-Best-Selected 0
>       Advertised to: 192.168.12.4
>       Last update: Thu Sep 12 16:13:59 2024

The first path has been created first but should be withdrawn later.

Withdraw the stale addpath even the coalesce timer is running.

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind added a commit that referenced this pull request Sep 20, 2024
Fix a heap-after-free that causes zebra to crash even without
address-sanitizer. To reproduce:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l del red
> ip route del blackhole default table 100

Zebra manages routing tables for all existing Linux RT tables,
regardless of whether they are assigned to a VRF interface. When a table
is not assigned to any VRF, zebra arbitrarily assigns it to the default
VRF, even though this is not strictly accurate (the code expects this
behavior).

When an RT table is created after a VRF, zebra correctly assigns the
table to the VRF. However, if a VRF interface is assigned to an existing
RT table, zebra does not update the table owner, which remains as the
default VRF. As a result, existing routing entries remain under the
default VRF, while new entries are correctly assigned to the VRF. The
VRF mismatch is unexpected in the code and creates crashes and memory
related issues.

Furthermore, Linux does not automatically delete RT tables when they are
unassigned from a VRF. It is incorrect to delete these tables from zebra.

Instead, at VRF disabling, do not release the table but reassign it to
the default VRF. At VRF enabling, change the table owner back to the
appropriate VRF.

> ==2866266==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000154f54 at pc 0x7fa32474b83f bp 0x7ffe94f67d90 sp 0x7ffe94f67d88
> READ of size 1 at 0x606000154f54 thread T0
>     #0 0x7fa32474b83e in rn_hash_node_const_find lib/table.c:28
>     #1 0x7fa32474bab1 in rn_hash_node_find lib/table.c:28
>     #2 0x7fa32474d783 in route_node_get lib/table.c:283
>     #3 0x7fa3247328dd in srcdest_rnode_get lib/srcdest_table.c:231
>     FRRouting#4 0x55b0e4fa8da4 in rib_find_rn_from_ctx zebra/zebra_rib.c:1957
>     FRRouting#5 0x55b0e4fa8e31 in rib_process_result zebra/zebra_rib.c:1988
>     FRRouting#6 0x55b0e4fb9d64 in rib_process_dplane_results zebra/zebra_rib.c:4894
>     FRRouting#7 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#8 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#9 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#10 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>     FRRouting#11 0x55b0e4e2d649 in _start (/usr/lib/frr/zebra+0x1a1649)
>
> 0x606000154f54 is located 20 bytes inside of 56-byte region [0x606000154f40,0x606000154f78)
> freed by thread T0 here:
>     #0 0x7fa324ca9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>     #1 0x7fa324668d8f in qfree lib/memory.c:130
>     #2 0x7fa32474c421 in route_table_free lib/table.c:126
>     #3 0x7fa32474bf96 in route_table_finish lib/table.c:46
>     FRRouting#4 0x55b0e4fbca3a in zebra_router_free_table zebra/zebra_router.c:191
>     FRRouting#5 0x55b0e4fbccea in zebra_router_release_table zebra/zebra_router.c:214
>     FRRouting#6 0x55b0e4fd428e in zebra_vrf_disable zebra/zebra_vrf.c:219
>     FRRouting#7 0x7fa32476fabf in vrf_disable lib/vrf.c:326
>     FRRouting#8 0x7fa32476f5d4 in vrf_delete lib/vrf.c:231
>     FRRouting#9 0x55b0e4e4ad36 in interface_vrf_change zebra/interface.c:1478
>     FRRouting#10 0x55b0e4e4d5d2 in zebra_if_dplane_ifp_handling zebra/interface.c:1949
>     FRRouting#11 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#12 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#13 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#14 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#15 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#16 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
>     #0 0x7fa324caa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fa324668c4d in qcalloc lib/memory.c:105
>     #2 0x7fa32474bf33 in route_table_init_with_delegate lib/table.c:38
>     #3 0x7fa32474e73c in route_table_init lib/table.c:512
>     FRRouting#4 0x55b0e4fbc353 in zebra_router_get_table zebra/zebra_router.c:137
>     FRRouting#5 0x55b0e4fd4da0 in zebra_vrf_table_create zebra/zebra_vrf.c:358
>     FRRouting#6 0x55b0e4fd3d30 in zebra_vrf_enable zebra/zebra_vrf.c:140
>     FRRouting#7 0x7fa32476f9b2 in vrf_enable lib/vrf.c:286
>     FRRouting#8 0x55b0e4e4af76 in interface_vrf_change zebra/interface.c:1533
>     FRRouting#9 0x55b0e4e4d612 in zebra_if_dplane_ifp_handling zebra/interface.c:1968
>     FRRouting#10 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#11 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#12 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#13 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#14 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#15 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind added a commit that referenced this pull request Sep 20, 2024
Fix a heap-after-free that causes zebra to crash even without
address-sanitizer. To reproduce:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l del red
> ip route del blackhole default table 100

Zebra manages routing tables for all existing Linux RT tables,
regardless of whether they are assigned to a VRF interface. When a table
is not assigned to any VRF, zebra arbitrarily assigns it to the default
VRF, even though this is not strictly accurate (the code expects this
behavior).

When an RT table is created after a VRF, zebra correctly assigns the
table to the VRF. However, if a VRF interface is assigned to an existing
RT table, zebra does not update the table owner, which remains as the
default VRF. As a result, existing routing entries remain under the
default VRF, while new entries are correctly assigned to the VRF. The
VRF mismatch is unexpected in the code and creates crashes and memory
related issues.

Furthermore, Linux does not automatically delete RT tables when they are
unassigned from a VRF. It is incorrect to delete these tables from zebra.

Instead, at VRF disabling, do not release the table but reassign it to
the default VRF. At VRF enabling, change the table owner back to the
appropriate VRF.

> ==2866266==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000154f54 at pc 0x7fa32474b83f bp 0x7ffe94f67d90 sp 0x7ffe94f67d88
> READ of size 1 at 0x606000154f54 thread T0
>     #0 0x7fa32474b83e in rn_hash_node_const_find lib/table.c:28
>     #1 0x7fa32474bab1 in rn_hash_node_find lib/table.c:28
>     #2 0x7fa32474d783 in route_node_get lib/table.c:283
>     #3 0x7fa3247328dd in srcdest_rnode_get lib/srcdest_table.c:231
>     FRRouting#4 0x55b0e4fa8da4 in rib_find_rn_from_ctx zebra/zebra_rib.c:1957
>     FRRouting#5 0x55b0e4fa8e31 in rib_process_result zebra/zebra_rib.c:1988
>     FRRouting#6 0x55b0e4fb9d64 in rib_process_dplane_results zebra/zebra_rib.c:4894
>     FRRouting#7 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#8 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#9 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#10 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>     FRRouting#11 0x55b0e4e2d649 in _start (/usr/lib/frr/zebra+0x1a1649)
>
> 0x606000154f54 is located 20 bytes inside of 56-byte region [0x606000154f40,0x606000154f78)
> freed by thread T0 here:
>     #0 0x7fa324ca9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>     #1 0x7fa324668d8f in qfree lib/memory.c:130
>     #2 0x7fa32474c421 in route_table_free lib/table.c:126
>     #3 0x7fa32474bf96 in route_table_finish lib/table.c:46
>     FRRouting#4 0x55b0e4fbca3a in zebra_router_free_table zebra/zebra_router.c:191
>     FRRouting#5 0x55b0e4fbccea in zebra_router_release_table zebra/zebra_router.c:214
>     FRRouting#6 0x55b0e4fd428e in zebra_vrf_disable zebra/zebra_vrf.c:219
>     FRRouting#7 0x7fa32476fabf in vrf_disable lib/vrf.c:326
>     FRRouting#8 0x7fa32476f5d4 in vrf_delete lib/vrf.c:231
>     FRRouting#9 0x55b0e4e4ad36 in interface_vrf_change zebra/interface.c:1478
>     FRRouting#10 0x55b0e4e4d5d2 in zebra_if_dplane_ifp_handling zebra/interface.c:1949
>     FRRouting#11 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#12 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#13 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#14 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#15 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#16 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
>     #0 0x7fa324caa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fa324668c4d in qcalloc lib/memory.c:105
>     #2 0x7fa32474bf33 in route_table_init_with_delegate lib/table.c:38
>     #3 0x7fa32474e73c in route_table_init lib/table.c:512
>     FRRouting#4 0x55b0e4fbc353 in zebra_router_get_table zebra/zebra_router.c:137
>     FRRouting#5 0x55b0e4fd4da0 in zebra_vrf_table_create zebra/zebra_vrf.c:358
>     FRRouting#6 0x55b0e4fd3d30 in zebra_vrf_enable zebra/zebra_vrf.c:140
>     FRRouting#7 0x7fa32476f9b2 in vrf_enable lib/vrf.c:286
>     FRRouting#8 0x55b0e4e4af76 in interface_vrf_change zebra/interface.c:1533
>     FRRouting#9 0x55b0e4e4d612 in zebra_if_dplane_ifp_handling zebra/interface.c:1968
>     FRRouting#10 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#11 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#12 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#13 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#14 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#15 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind added a commit that referenced this pull request Sep 24, 2024
Fix a heap-after-free that causes zebra to crash even without
address-sanitizer. To reproduce:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l del red
> ip route del blackhole default table 100

Zebra manages routing tables for all existing Linux RT tables,
regardless of whether they are assigned to a VRF interface. When a table
is not assigned to any VRF, zebra arbitrarily assigns it to the default
VRF, even though this is not strictly accurate (the code expects this
behavior).

When an RT table is created after a VRF, zebra correctly assigns the
table to the VRF. However, if a VRF interface is assigned to an existing
RT table, zebra does not update the table owner, which remains as the
default VRF. As a result, existing routing entries remain under the
default VRF, while new entries are correctly assigned to the VRF. The
VRF mismatch is unexpected in the code and creates crashes and memory
related issues.

Furthermore, Linux does not automatically delete RT tables when they are
unassigned from a VRF. It is incorrect to delete these tables from zebra.

Instead, at VRF disabling, do not release the table but reassign it to
the default VRF. At VRF enabling, change the table owner back to the
appropriate VRF.

> ==2866266==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000154f54 at pc 0x7fa32474b83f bp 0x7ffe94f67d90 sp 0x7ffe94f67d88
> READ of size 1 at 0x606000154f54 thread T0
>     #0 0x7fa32474b83e in rn_hash_node_const_find lib/table.c:28
>     #1 0x7fa32474bab1 in rn_hash_node_find lib/table.c:28
>     #2 0x7fa32474d783 in route_node_get lib/table.c:283
>     #3 0x7fa3247328dd in srcdest_rnode_get lib/srcdest_table.c:231
>     FRRouting#4 0x55b0e4fa8da4 in rib_find_rn_from_ctx zebra/zebra_rib.c:1957
>     FRRouting#5 0x55b0e4fa8e31 in rib_process_result zebra/zebra_rib.c:1988
>     FRRouting#6 0x55b0e4fb9d64 in rib_process_dplane_results zebra/zebra_rib.c:4894
>     FRRouting#7 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#8 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#9 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#10 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>     FRRouting#11 0x55b0e4e2d649 in _start (/usr/lib/frr/zebra+0x1a1649)
>
> 0x606000154f54 is located 20 bytes inside of 56-byte region [0x606000154f40,0x606000154f78)
> freed by thread T0 here:
>     #0 0x7fa324ca9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>     #1 0x7fa324668d8f in qfree lib/memory.c:130
>     #2 0x7fa32474c421 in route_table_free lib/table.c:126
>     #3 0x7fa32474bf96 in route_table_finish lib/table.c:46
>     FRRouting#4 0x55b0e4fbca3a in zebra_router_free_table zebra/zebra_router.c:191
>     FRRouting#5 0x55b0e4fbccea in zebra_router_release_table zebra/zebra_router.c:214
>     FRRouting#6 0x55b0e4fd428e in zebra_vrf_disable zebra/zebra_vrf.c:219
>     FRRouting#7 0x7fa32476fabf in vrf_disable lib/vrf.c:326
>     FRRouting#8 0x7fa32476f5d4 in vrf_delete lib/vrf.c:231
>     FRRouting#9 0x55b0e4e4ad36 in interface_vrf_change zebra/interface.c:1478
>     FRRouting#10 0x55b0e4e4d5d2 in zebra_if_dplane_ifp_handling zebra/interface.c:1949
>     FRRouting#11 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#12 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#13 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#14 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#15 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#16 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
>     #0 0x7fa324caa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fa324668c4d in qcalloc lib/memory.c:105
>     #2 0x7fa32474bf33 in route_table_init_with_delegate lib/table.c:38
>     #3 0x7fa32474e73c in route_table_init lib/table.c:512
>     FRRouting#4 0x55b0e4fbc353 in zebra_router_get_table zebra/zebra_router.c:137
>     FRRouting#5 0x55b0e4fd4da0 in zebra_vrf_table_create zebra/zebra_vrf.c:358
>     FRRouting#6 0x55b0e4fd3d30 in zebra_vrf_enable zebra/zebra_vrf.c:140
>     FRRouting#7 0x7fa32476f9b2 in vrf_enable lib/vrf.c:286
>     FRRouting#8 0x55b0e4e4af76 in interface_vrf_change zebra/interface.c:1533
>     FRRouting#9 0x55b0e4e4d612 in zebra_if_dplane_ifp_handling zebra/interface.c:1968
>     FRRouting#10 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#11 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#12 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#13 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#14 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#15 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind added a commit that referenced this pull request Sep 24, 2024
Fix a heap-after-free that causes zebra to crash even without
address-sanitizer. To reproduce:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l del red
> ip route del blackhole default table 100

Zebra manages routing tables for all existing Linux RT tables,
regardless of whether they are assigned to a VRF interface. When a table
is not assigned to any VRF, zebra arbitrarily assigns it to the default
VRF, even though this is not strictly accurate (the code expects this
behavior).

When an RT table is created after a VRF, zebra correctly assigns the
table to the VRF. However, if a VRF interface is assigned to an existing
RT table, zebra does not update the table owner, which remains as the
default VRF. As a result, existing routing entries remain under the
default VRF, while new entries are correctly assigned to the VRF. The
VRF mismatch is unexpected in the code and creates crashes and memory
related issues.

Furthermore, Linux does not automatically delete RT tables when they are
unassigned from a VRF. It is incorrect to delete these tables from zebra.

Instead, at VRF disabling, do not release the table but reassign it to
the default VRF. At VRF enabling, change the table owner back to the
appropriate VRF.

> ==2866266==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000154f54 at pc 0x7fa32474b83f bp 0x7ffe94f67d90 sp 0x7ffe94f67d88
> READ of size 1 at 0x606000154f54 thread T0
>     #0 0x7fa32474b83e in rn_hash_node_const_find lib/table.c:28
>     #1 0x7fa32474bab1 in rn_hash_node_find lib/table.c:28
>     #2 0x7fa32474d783 in route_node_get lib/table.c:283
>     #3 0x7fa3247328dd in srcdest_rnode_get lib/srcdest_table.c:231
>     FRRouting#4 0x55b0e4fa8da4 in rib_find_rn_from_ctx zebra/zebra_rib.c:1957
>     FRRouting#5 0x55b0e4fa8e31 in rib_process_result zebra/zebra_rib.c:1988
>     FRRouting#6 0x55b0e4fb9d64 in rib_process_dplane_results zebra/zebra_rib.c:4894
>     FRRouting#7 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#8 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#9 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#10 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>     FRRouting#11 0x55b0e4e2d649 in _start (/usr/lib/frr/zebra+0x1a1649)
>
> 0x606000154f54 is located 20 bytes inside of 56-byte region [0x606000154f40,0x606000154f78)
> freed by thread T0 here:
>     #0 0x7fa324ca9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>     #1 0x7fa324668d8f in qfree lib/memory.c:130
>     #2 0x7fa32474c421 in route_table_free lib/table.c:126
>     #3 0x7fa32474bf96 in route_table_finish lib/table.c:46
>     FRRouting#4 0x55b0e4fbca3a in zebra_router_free_table zebra/zebra_router.c:191
>     FRRouting#5 0x55b0e4fbccea in zebra_router_release_table zebra/zebra_router.c:214
>     FRRouting#6 0x55b0e4fd428e in zebra_vrf_disable zebra/zebra_vrf.c:219
>     FRRouting#7 0x7fa32476fabf in vrf_disable lib/vrf.c:326
>     FRRouting#8 0x7fa32476f5d4 in vrf_delete lib/vrf.c:231
>     FRRouting#9 0x55b0e4e4ad36 in interface_vrf_change zebra/interface.c:1478
>     FRRouting#10 0x55b0e4e4d5d2 in zebra_if_dplane_ifp_handling zebra/interface.c:1949
>     FRRouting#11 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#12 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#13 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#14 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#15 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#16 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
>     #0 0x7fa324caa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fa324668c4d in qcalloc lib/memory.c:105
>     #2 0x7fa32474bf33 in route_table_init_with_delegate lib/table.c:38
>     #3 0x7fa32474e73c in route_table_init lib/table.c:512
>     FRRouting#4 0x55b0e4fbc353 in zebra_router_get_table zebra/zebra_router.c:137
>     FRRouting#5 0x55b0e4fd4da0 in zebra_vrf_table_create zebra/zebra_vrf.c:358
>     FRRouting#6 0x55b0e4fd3d30 in zebra_vrf_enable zebra/zebra_vrf.c:140
>     FRRouting#7 0x7fa32476f9b2 in vrf_enable lib/vrf.c:286
>     FRRouting#8 0x55b0e4e4af76 in interface_vrf_change zebra/interface.c:1533
>     FRRouting#9 0x55b0e4e4d612 in zebra_if_dplane_ifp_handling zebra/interface.c:1968
>     FRRouting#10 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#11 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#12 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#13 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#14 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#15 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind added a commit that referenced this pull request Sep 24, 2024
Fix a heap-after-free that causes zebra to crash even without
address-sanitizer. To reproduce:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l del red
> ip route del blackhole default table 100

Zebra manages routing tables for all existing Linux RT tables,
regardless of whether they are assigned to a VRF interface. When a table
is not assigned to any VRF, zebra arbitrarily assigns it to the default
VRF, even though this is not strictly accurate (the code expects this
behavior).

When an RT table is created after a VRF, zebra correctly assigns the
table to the VRF. However, if a VRF interface is assigned to an existing
RT table, zebra does not update the table owner, which remains as the
default VRF. As a result, existing routing entries remain under the
default VRF, while new entries are correctly assigned to the VRF. The
VRF mismatch is unexpected in the code and creates crashes and memory
related issues.

Furthermore, Linux does not automatically delete RT tables when they are
unassigned from a VRF. It is incorrect to delete these tables from zebra.

Instead, at VRF disabling, do not release the table but reassign it to
the default VRF. At VRF enabling, change the table owner back to the
appropriate VRF.

> ==2866266==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000154f54 at pc 0x7fa32474b83f bp 0x7ffe94f67d90 sp 0x7ffe94f67d88
> READ of size 1 at 0x606000154f54 thread T0
>     #0 0x7fa32474b83e in rn_hash_node_const_find lib/table.c:28
>     #1 0x7fa32474bab1 in rn_hash_node_find lib/table.c:28
>     #2 0x7fa32474d783 in route_node_get lib/table.c:283
>     #3 0x7fa3247328dd in srcdest_rnode_get lib/srcdest_table.c:231
>     FRRouting#4 0x55b0e4fa8da4 in rib_find_rn_from_ctx zebra/zebra_rib.c:1957
>     FRRouting#5 0x55b0e4fa8e31 in rib_process_result zebra/zebra_rib.c:1988
>     FRRouting#6 0x55b0e4fb9d64 in rib_process_dplane_results zebra/zebra_rib.c:4894
>     FRRouting#7 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#8 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#9 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#10 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>     FRRouting#11 0x55b0e4e2d649 in _start (/usr/lib/frr/zebra+0x1a1649)
>
> 0x606000154f54 is located 20 bytes inside of 56-byte region [0x606000154f40,0x606000154f78)
> freed by thread T0 here:
>     #0 0x7fa324ca9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>     #1 0x7fa324668d8f in qfree lib/memory.c:130
>     #2 0x7fa32474c421 in route_table_free lib/table.c:126
>     #3 0x7fa32474bf96 in route_table_finish lib/table.c:46
>     FRRouting#4 0x55b0e4fbca3a in zebra_router_free_table zebra/zebra_router.c:191
>     FRRouting#5 0x55b0e4fbccea in zebra_router_release_table zebra/zebra_router.c:214
>     FRRouting#6 0x55b0e4fd428e in zebra_vrf_disable zebra/zebra_vrf.c:219
>     FRRouting#7 0x7fa32476fabf in vrf_disable lib/vrf.c:326
>     FRRouting#8 0x7fa32476f5d4 in vrf_delete lib/vrf.c:231
>     FRRouting#9 0x55b0e4e4ad36 in interface_vrf_change zebra/interface.c:1478
>     FRRouting#10 0x55b0e4e4d5d2 in zebra_if_dplane_ifp_handling zebra/interface.c:1949
>     FRRouting#11 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#12 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#13 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#14 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#15 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#16 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
>     #0 0x7fa324caa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fa324668c4d in qcalloc lib/memory.c:105
>     #2 0x7fa32474bf33 in route_table_init_with_delegate lib/table.c:38
>     #3 0x7fa32474e73c in route_table_init lib/table.c:512
>     FRRouting#4 0x55b0e4fbc353 in zebra_router_get_table zebra/zebra_router.c:137
>     FRRouting#5 0x55b0e4fd4da0 in zebra_vrf_table_create zebra/zebra_vrf.c:358
>     FRRouting#6 0x55b0e4fd3d30 in zebra_vrf_enable zebra/zebra_vrf.c:140
>     FRRouting#7 0x7fa32476f9b2 in vrf_enable lib/vrf.c:286
>     FRRouting#8 0x55b0e4e4af76 in interface_vrf_change zebra/interface.c:1533
>     FRRouting#9 0x55b0e4e4d612 in zebra_if_dplane_ifp_handling zebra/interface.c:1968
>     FRRouting#10 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#11 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#12 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#13 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#14 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#15 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind pushed a commit that referenced this pull request Sep 25, 2024
```
==5445==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000008 (pc 0x7ff4c6bedb19 bp 0x7ffc95f2e400 sp 0x7ffc95f2e3c0 T0)
==5445==The signal is caused by a READ memory access.
==5445==Hint: address points to the zero page.
    #0 0x7ff4c6bedb19 in hash_iterate lib/hash.c:246
    #1 0x5618f41f5f59 in bgp_evpn_nh_finish bgpd/bgp_evpn_mh.c:4663
    #2 0x5618f41dcbe8 in bgp_evpn_vrf_delete bgpd/bgp_evpn.c:7336
    #3 0x5618f43bdd35 in bgp_delete bgpd/bgpd.c:4098
    FRRouting#4 0x5618f417ef6e in bgp_exit bgpd/bgp_main.c:206
    FRRouting#5 0x5618f417ef6e in sigint bgpd/bgp_main.c:164
    FRRouting#6 0x7ff4c6cac6c4 in frr_sigevent_process lib/sigevent.c:117
    FRRouting#7 0x7ff4c6cd8258 in event_fetch lib/event.c:1767
    FRRouting#8 0x7ff4c6c0dcbc in frr_run lib/libfrr.c:1230
    FRRouting#9 0x5618f418080d in main bgpd/bgp_main.c:555
    FRRouting#10 0x7ff4c670c249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    FRRouting#11 0x7ff4c670c304 in __libc_start_main_impl ../csu/libc-start.c:360
    FRRouting#12 0x5618f417ea20 in _start (/usr/lib/frr/bgpd+0x2e4a20)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV lib/hash.c:246 in hash_iterate
```

Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
louis-6wind added a commit that referenced this pull request Sep 25, 2024
Fix a heap-after-free that causes zebra to crash even without
address-sanitizer. To reproduce:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l del red
> ip route del blackhole default table 100

Zebra manages routing tables for all existing Linux RT tables,
regardless of whether they are assigned to a VRF interface. When a table
is not assigned to any VRF, zebra arbitrarily assigns it to the default
VRF, even though this is not strictly accurate (the code expects this
behavior).

When an RT table is created after a VRF, zebra correctly assigns the
table to the VRF. However, if a VRF interface is assigned to an existing
RT table, zebra does not update the table owner, which remains as the
default VRF. As a result, existing routing entries remain under the
default VRF, while new entries are correctly assigned to the VRF. The
VRF mismatch is unexpected in the code and creates crashes and memory
related issues.

Furthermore, Linux does not automatically delete RT tables when they are
unassigned from a VRF. It is incorrect to delete these tables from zebra.

Instead, at VRF disabling, do not release the table but reassign it to
the default VRF. At VRF enabling, change the table owner back to the
appropriate VRF.

> ==2866266==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000154f54 at pc 0x7fa32474b83f bp 0x7ffe94f67d90 sp 0x7ffe94f67d88
> READ of size 1 at 0x606000154f54 thread T0
>     #0 0x7fa32474b83e in rn_hash_node_const_find lib/table.c:28
>     #1 0x7fa32474bab1 in rn_hash_node_find lib/table.c:28
>     #2 0x7fa32474d783 in route_node_get lib/table.c:283
>     #3 0x7fa3247328dd in srcdest_rnode_get lib/srcdest_table.c:231
>     FRRouting#4 0x55b0e4fa8da4 in rib_find_rn_from_ctx zebra/zebra_rib.c:1957
>     FRRouting#5 0x55b0e4fa8e31 in rib_process_result zebra/zebra_rib.c:1988
>     FRRouting#6 0x55b0e4fb9d64 in rib_process_dplane_results zebra/zebra_rib.c:4894
>     FRRouting#7 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#8 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#9 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#10 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>     FRRouting#11 0x55b0e4e2d649 in _start (/usr/lib/frr/zebra+0x1a1649)
>
> 0x606000154f54 is located 20 bytes inside of 56-byte region [0x606000154f40,0x606000154f78)
> freed by thread T0 here:
>     #0 0x7fa324ca9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
>     #1 0x7fa324668d8f in qfree lib/memory.c:130
>     #2 0x7fa32474c421 in route_table_free lib/table.c:126
>     #3 0x7fa32474bf96 in route_table_finish lib/table.c:46
>     FRRouting#4 0x55b0e4fbca3a in zebra_router_free_table zebra/zebra_router.c:191
>     FRRouting#5 0x55b0e4fbccea in zebra_router_release_table zebra/zebra_router.c:214
>     FRRouting#6 0x55b0e4fd428e in zebra_vrf_disable zebra/zebra_vrf.c:219
>     FRRouting#7 0x7fa32476fabf in vrf_disable lib/vrf.c:326
>     FRRouting#8 0x7fa32476f5d4 in vrf_delete lib/vrf.c:231
>     FRRouting#9 0x55b0e4e4ad36 in interface_vrf_change zebra/interface.c:1478
>     FRRouting#10 0x55b0e4e4d5d2 in zebra_if_dplane_ifp_handling zebra/interface.c:1949
>     FRRouting#11 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#12 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#13 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#14 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#15 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#16 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
>     #0 0x7fa324caa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fa324668c4d in qcalloc lib/memory.c:105
>     #2 0x7fa32474bf33 in route_table_init_with_delegate lib/table.c:38
>     #3 0x7fa32474e73c in route_table_init lib/table.c:512
>     FRRouting#4 0x55b0e4fbc353 in zebra_router_get_table zebra/zebra_router.c:137
>     FRRouting#5 0x55b0e4fd4da0 in zebra_vrf_table_create zebra/zebra_vrf.c:358
>     FRRouting#6 0x55b0e4fd3d30 in zebra_vrf_enable zebra/zebra_vrf.c:140
>     FRRouting#7 0x7fa32476f9b2 in vrf_enable lib/vrf.c:286
>     FRRouting#8 0x55b0e4e4af76 in interface_vrf_change zebra/interface.c:1533
>     FRRouting#9 0x55b0e4e4d612 in zebra_if_dplane_ifp_handling zebra/interface.c:1968
>     FRRouting#10 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
>     FRRouting#11 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
>     FRRouting#12 0x7fa32476689c in event_call lib/event.c:1996
>     FRRouting#13 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
>     FRRouting#14 0x55b0e4e6c32a in main zebra/main.c:526
>     FRRouting#15 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind added a commit that referenced this pull request Sep 25, 2024
Fix heap-after-free seen with bgp_vrf_dynamic_route_leak /
test_bgp_vrf_dynamic_route_leak_topo1 topotest.

> =================================================================
> ==1899==ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000acba4 at pc 0x559fe9aee213 bp 0x7fffef3d3ef0 sp 0x7fffef3d3ee8
> READ of size 4 at 0x6160000acba4 thread T0
>     #0 0x559fe9aee212 in ctx_info_from_zns zebra/zebra_dplane.c:3331
>     #1 0x559fe9aee212 in dplane_ctx_ns_init zebra/zebra_dplane.c:3347
>     #2 0x559fe9af50a2 in dplane_ctx_nexthop_init zebra/zebra_dplane.c:3696
>     #3 0x559fe9afadd1 in dplane_nexthop_update_internal zebra/zebra_dplane.c:4503
>     FRRouting#4 0x559fe9afb0ff in dplane_nexthop_delete zebra/zebra_dplane.c:4730
>     FRRouting#5 0x559fe9b6fddf in zebra_nhg_uninstall_kernel zebra/zebra_nhg.c:3278
>     FRRouting#6 0x559fe9b700bf in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1768
>     FRRouting#7 0x559fe9b8a5d3 in route_entry_update_nhe zebra/zebra_rib.c:457
>     FRRouting#8 0x559fe9b932f7 in rib_re_nhg_free zebra/zebra_rib.c:2691
>     FRRouting#9 0x559fe9b932f7 in rib_unlink zebra/zebra_rib.c:4088
>     FRRouting#10 0x559fe9b93485 in zebra_rtable_node_cleanup zebra/zebra_rib.c:958
>     FRRouting#11 0x7fd4040c6672 in route_node_free lib/table.c:75
>     FRRouting#12 0x7fd4040c7378 in route_table_free lib/table.c:111
>     FRRouting#13 0x7fd4040c7378 in route_table_finish lib/table.c:46
>     FRRouting#14 0x559fe9b9cd50 in zebra_router_free_table zebra/zebra_router.c:191
>     FRRouting#15 0x559fe9b9e3fe in zebra_router_terminate zebra/zebra_router.c:244
>     FRRouting#16 0x559fe9aa3c40 in zebra_finalize zebra/main.c:244
>     FRRouting#17 0x7fd4040da2c1 in event_call lib/event.c:1996
>     FRRouting#18 0x7fd40400dcc6 in frr_run lib/libfrr.c:1237
>     FRRouting#19 0x559fe9aa435e in main zebra/main.c:526
>     FRRouting#20 0x7fd403b0c249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>     FRRouting#21 0x7fd403b0c304 in __libc_start_main_impl ../csu/libc-start.c:360
>     FRRouting#22 0x559fe9a777a0 in _start (/usr/lib/frr/zebra+0x1a47a0)
>
> 0x6160000acba4 is located 36 bytes inside of 584-byte region [0x6160000acb80,0x6160000acdc8)
> freed by thread T0 here:
>     #0 0x7fd4044b76a8 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52
>     #1 0x7fd40402d060 in qfree lib/memory.c:131
>     #2 0x559fe9b73aaa in zebra_ns_delete zebra/zebra_ns.c:74
>     #3 0x559fe9b7415a in zebra_ns_final_shutdown zebra/zebra_ns.c:192
>     FRRouting#4 0x7fd40404d45e in ns_walk_func lib/netns_linux.c:372
>     FRRouting#5 0x559fe9aa3c36 in zebra_finalize zebra/main.c:241
>     FRRouting#6 0x7fd4040da2c1 in event_call lib/event.c:1996
>     FRRouting#7 0x7fd40400dcc6 in frr_run lib/libfrr.c:1237
>     FRRouting#8 0x559fe9aa435e in main zebra/main.c:526
>     FRRouting#9 0x7fd403b0c249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> previously allocated by thread T0 here:
>     #0 0x7fd4044b83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
>     #1 0x7fd40402c76d in qcalloc lib/memory.c:106
>     #2 0x559fe9b73b9b in zebra_ns_new zebra/zebra_ns.c:55
>     #3 0x559fe9b74283 in zebra_ns_init zebra/zebra_ns.c:221
>     FRRouting#4 0x559fe9aa4159 in main zebra/main.c:440
>     FRRouting#5 0x7fd403b0c249 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
louis-6wind pushed a commit that referenced this pull request Oct 2, 2024
```
ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000aecf0 at pc 0x5555557ecdb9 bp 0x7fffffffe350 sp 0x7fffffffe340
READ of size 4 at 0x6160000aecf0 thread T0
    #0 0x5555557ecdb8 in igmp_source_delete pimd/pim_igmpv3.c:340
    #1 0x5555557ed475 in igmp_source_delete_expired pimd/pim_igmpv3.c:405
    #2 0x5555557de574 in igmp_group_timer pimd/pim_igmp.c:1346
    #3 0x7ffff7275421 in event_call lib/event.c:1996
    FRRouting#4 0x7ffff7140797 in frr_run lib/libfrr.c:1237
    FRRouting#5 0x5555557f5840 in main pimd/pim_main.c:166
    FRRouting#6 0x7ffff6a54082 in __libc_start_main ../csu/libc-start.c:308
    FRRouting#7 0x555555686eed in _start (/usr/lib/frr/pimd+0x132eed)

0x6160000aecf0 is located 112 bytes inside of 600-byte region [0x6160000aec80,0x6160000aeed8)
freed by thread T0 here:
    #0 0x7ffff767b40f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:122
    #1 0x7ffff716ed34 in qfree lib/memory.c:131
    #2 0x5555557169ae in pim_channel_oil_free pimd/pim_oil.c:84
    #3 0x555555717981 in pim_channel_oil_del pimd/pim_oil.c:199
    FRRouting#4 0x55555573c42c in tib_sg_gm_prune pimd/pim_tib.c:196
    FRRouting#5 0x5555557d6d04 in igmp_source_forward_stop pimd/pim_igmp.c:229
    FRRouting#6 0x5555557d5855 in igmp_anysource_forward_stop pimd/pim_igmp.c:61
    FRRouting#7 0x5555557de539 in igmp_group_timer pimd/pim_igmp.c:1344
    FRRouting#8 0x7ffff7275421 in event_call lib/event.c:1996
    FRRouting#9 0x7ffff7140797 in frr_run lib/libfrr.c:1237
    FRRouting#10 0x5555557f5840 in main pimd/pim_main.c:166
    FRRouting#11 0x7ffff6a54082 in __libc_start_main ../csu/libc-start.c:308

previously allocated by thread T0 here:
    #0 0x7ffff767ba06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153
    #1 0x7ffff716ebe1 in qcalloc lib/memory.c:106
    #2 0x555555716eb7 in pim_channel_oil_add pimd/pim_oil.c:133
    #3 0x55555573b2b9 in tib_sg_oil_setup pimd/pim_tib.c:30
    FRRouting#4 0x55555573bdd3 in tib_sg_gm_join pimd/pim_tib.c:119
    FRRouting#5 0x5555557d6788 in igmp_source_forward_start pimd/pim_igmp.c:193
    FRRouting#6 0x5555557d5771 in igmp_anysource_forward_start pimd/pim_igmp.c:51
    FRRouting#7 0x5555557ecaa0 in group_exclude_fwd_anysrc_ifempty pimd/pim_igmpv3.c:310
    FRRouting#8 0x5555557ef937 in toex_incl pimd/pim_igmpv3.c:839
    FRRouting#9 0x5555557f00a2 in igmpv3_report_toex pimd/pim_igmpv3.c:938
    FRRouting#10 0x5555557f543d in igmp_v3_recv_report pimd/pim_igmpv3.c:2000
    FRRouting#11 0x5555557da2b4 in pim_igmp_packet pimd/pim_igmp.c:787
    FRRouting#12 0x5555556ee46a in process_igmp_packet pimd/pim_mroute.c:763
    FRRouting#13 0x5555556ee5f3 in pim_mroute_msg pimd/pim_mroute.c:787
    FRRouting#14 0x5555556eef58 in mroute_read pimd/pim_mroute.c:877
    FRRouting#15 0x7ffff7275421 in event_call lib/event.c:1996
    FRRouting#16 0x7ffff7140797 in frr_run lib/libfrr.c:1237
    FRRouting#17 0x5555557f5840 in main pimd/pim_main.c:166
    FRRouting#18 0x7ffff6a54082 in __libc_start_main ../csu/libc-start.c:308

SUMMARY: AddressSanitizer: heap-use-after-free pimd/pim_igmpv3.c:340 in igmp_source_delete
Shadow bytes around the buggy address:
  0x0c2c8000dd40: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2c8000dd50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2c8000dd60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2c8000dd70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2c8000dd80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c2c8000dd90: fd fd fd fd fd fd fd fd fd fd fd fd fd fd[fd]fd
  0x0c2c8000dda0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2c8000ddb0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2c8000ddc0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c2c8000ddd0: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa
  0x0c2c8000dde0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
```

Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants