Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prefix yang'ification of code performance is beyond awful #6658

Closed
donaldsharp opened this issue Jun 30, 2020 · 2 comments
Closed

prefix yang'ification of code performance is beyond awful #6658

donaldsharp opened this issue Jun 30, 2020 · 2 comments

Comments

@donaldsharp
Copy link
Member

7.3.1 load up of a 65k bgpd.conf that has prefix-lists and route-maps is aproximately 42 seconds of cpu. When running against a master with the yang code, I gave up on the read in with over > 2 hours of run time. This is completely unaceptable.

I cannot attach the config file as that it was given to me in confidence:

sharpd@eva:/frr4$ grep "prefix-list" /home/sharpd/frr.conf | wc
25958 201502 1542898
sharpd@eva:
/frr4$ grep "route-map" /home/sharpd/frr.conf | wc
7699 33219 296423
sharpd@eva:~/frr4$

I think we should be able to come up with a configuration that shows this.

@donaldsharp donaldsharp added the triage Needs further investigation label Jun 30, 2020
@donaldsharp
Copy link
Member Author

bgp

@donaldsharp
Copy link
Member Author

30.05% libyang.so.1.8.4 [.] moveto_node_check
4.90% libc-2.30.so [.] __memmove_sse2_unaligned_erms
4.37% libyang.so.1.8.4 [.] moveto_node
3.65% libyang.so.1.8.4 [.] lyd_node_module
2.79% libc-2.30.so [.] __memset_avx2_unaligned_erms
1.24% libpthread-2.30.so [.] __pthread_getspecific
1.09% libyang.so.1.8.4 [.] lyd_node_module@plt
1.05% libasan.so.5.0.0 [.] __interceptor_free
0.78% slack [.] v8::internal::ConcurrentMarking::Run
0.56% libasan.so.5.0.0 [.] 0x000000000002c188
0.55% libasan.so.5.0.0 [.] 0x000000000010b360

rwestphal added a commit to opensourcerouting/frr that referenced this issue Jul 13, 2020
When using the default CLI mode, the northbound layer needs to create
a separate transaction to process each YANG-modeled command since
they are supposed to be applied immediately (there's no candidate
configuration nor the "commit" command like in the transactional
CLI). The problem is that configuration transactions have an overhead
associated to them, in big part because of the use of some heavy
libyang functions like `lyd_validate()` and `lyd_diff()`. As of
now this overhead is substantial and doesn't scale well when large
numbers of transactions need to be performed in sequence.

As an example, loading 50k prefix-lists using a single transaction
takes about 2 seconds on a modern CPU. Loading the same 50k
prefix-lists using 50k transactions can take more than an hour
to complete (which is unacceptable by any standard). To fix this
problem, some heavy optimization work needs to be done on libyang and
on the FRR northbound itself too (e.g. perform partial configuration
diffs whenever possible).  This, however, should be a long term
effort since these optimizations shouldn't be trivial to implement
and we're far from having the performance numbers we need.

In the meanwhile, this commit introduces a simple but efficient
workaround to alleviate the issue. In short, a new back-off timer
was introduced in the CLI to monitor and detect when too many
YANG-modeled commands are being received at the same time. When
a certain threshold is reached (100 YANG-modeled commands within
one second), the northbound starts to group all subsequent commands
into a single large transaction, which allows them to be processed
much faster (e.g. seconds and not hours).  It's essentially a
protection mechanism that creates dynamically-sized transactions
when necessary to prevent performance issues from happening. This
mechanism is enabled both when parsing configuration files and when
reading commands from a terminal.

The downside of this optimization is that, if several YANG-modeled
commands are grouped into the same transaction and at least one of
them fails, the whole transaction is rejected. This is undesirable
since users don't expect transactional behavior when that's not
enabled explicitly. To minimize this issue, the CLI will log all
commands that were rejected whenever that happens, to make the
user aware of what happened and have enough information to fix
the problem. Commands that fail due to parsing errors or CLI-level
validations in general are rejected separately.

Again, this proposed workaround is intended to be temporary. The
goal is to provided a quick fix to issues like FRRouting#6658 while we work
on better long-term solutions.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
rwestphal added a commit to opensourcerouting/frr that referenced this issue Jul 28, 2020
When using the default CLI mode, the northbound layer needs to create
a separate transaction to process each YANG-modeled command since
they are supposed to be applied immediately (there's no candidate
configuration nor the "commit" command like in the transactional
CLI). The problem is that configuration transactions have an overhead
associated to them, in big part because of the use of some heavy
libyang functions like `lyd_validate()` and `lyd_diff()`. As of
now this overhead is substantial and doesn't scale well when large
numbers of transactions need to be performed in sequence.

As an example, loading 50k prefix-lists using a single transaction
takes about 2 seconds on a modern CPU. Loading the same 50k
prefix-lists using 50k transactions can take more than an hour
to complete (which is unacceptable by any standard). To fix this
problem, some heavy optimization work needs to be done on libyang and
on the FRR northbound itself too (e.g. perform partial configuration
diffs whenever possible).  This, however, should be a long term
effort since these optimizations shouldn't be trivial to implement
and we're far from having the performance numbers we need.

In the meanwhile, this commit introduces a simple but efficient
workaround to alleviate the issue. In short, a new back-off timer
was introduced in the CLI to monitor and detect when too many
YANG-modeled commands are being received at the same time. When
a certain threshold is reached (100 YANG-modeled commands within
one second), the northbound starts to group all subsequent commands
into a single large transaction, which allows them to be processed
much faster (e.g. seconds and not hours).  It's essentially a
protection mechanism that creates dynamically-sized transactions
when necessary to prevent performance issues from happening. This
mechanism is enabled both when parsing configuration files and when
reading commands from a terminal.

The downside of this optimization is that, if several YANG-modeled
commands are grouped into the same transaction and at least one of
them fails, the whole transaction is rejected. This is undesirable
since users don't expect transactional behavior when that's not
enabled explicitly. To minimize this issue, the CLI will log all
commands that were rejected whenever that happens, to make the
user aware of what happened and have enough information to fix
the problem. Commands that fail due to parsing errors or CLI-level
validations in general are rejected separately.

Again, this proposed workaround is intended to be temporary. The
goal is to provided a quick fix to issues like FRRouting#6658 while we work
on better long-term solutions.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
rwestphal added a commit to opensourcerouting/frr that referenced this issue Aug 3, 2020
When using the default CLI mode, the northbound layer needs to create
a separate transaction to process each YANG-modeled command since
they are supposed to be applied immediately (there's no candidate
configuration nor the "commit" command like in the transactional
CLI). The problem is that configuration transactions have an overhead
associated to them, in big part because of the use of some heavy
libyang functions like `lyd_validate()` and `lyd_diff()`. As of
now this overhead is substantial and doesn't scale well when large
numbers of transactions need to be performed in sequence.

As an example, loading 50k prefix-lists using a single transaction
takes about 2 seconds on a modern CPU. Loading the same 50k
prefix-lists using 50k transactions can take more than an hour
to complete (which is unacceptable by any standard). To fix this
problem, some heavy optimization work needs to be done on libyang and
on the FRR northbound itself too (e.g. perform partial configuration
diffs whenever possible).  This, however, should be a long term
effort since these optimizations shouldn't be trivial to implement
and we're far from having the performance numbers we need.

In the meanwhile, this commit introduces a simple but efficient
workaround to alleviate the issue. In short, a new back-off timer
was introduced in the CLI to monitor and detect when too many
YANG-modeled commands are being received at the same time. When
a certain threshold is reached (100 YANG-modeled commands within
one second), the northbound starts to group all subsequent commands
into a single large transaction, which allows them to be processed
much faster (e.g. seconds and not hours).  It's essentially a
protection mechanism that creates dynamically-sized transactions
when necessary to prevent performance issues from happening. This
mechanism is enabled both when parsing configuration files and when
reading commands from a terminal.

The downside of this optimization is that, if several YANG-modeled
commands are grouped into the same transaction and at least one of
them fails, the whole transaction is rejected. This is undesirable
since users don't expect transactional behavior when that's not
enabled explicitly. To minimize this issue, the CLI will log all
commands that were rejected whenever that happens, to make the
user aware of what happened and have enough information to fix
the problem. Commands that fail due to parsing errors or CLI-level
validations in general are rejected separately.

Again, this proposed workaround is intended to be temporary. The
goal is to provided a quick fix to issues like FRRouting#6658 while we work
on better long-term solutions.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
maduri111 pushed a commit to maduri111/frr that referenced this issue Aug 9, 2020
When using the default CLI mode, the northbound layer needs to create
a separate transaction to process each YANG-modeled command since
they are supposed to be applied immediately (there's no candidate
configuration nor the "commit" command like in the transactional
CLI). The problem is that configuration transactions have an overhead
associated to them, in big part because of the use of some heavy
libyang functions like `lyd_validate()` and `lyd_diff()`. As of
now this overhead is substantial and doesn't scale well when large
numbers of transactions need to be performed in sequence.

As an example, loading 50k prefix-lists using a single transaction
takes about 2 seconds on a modern CPU. Loading the same 50k
prefix-lists using 50k transactions can take more than an hour
to complete (which is unacceptable by any standard). To fix this
problem, some heavy optimization work needs to be done on libyang and
on the FRR northbound itself too (e.g. perform partial configuration
diffs whenever possible).  This, however, should be a long term
effort since these optimizations shouldn't be trivial to implement
and we're far from having the performance numbers we need.

In the meanwhile, this commit introduces a simple but efficient
workaround to alleviate the issue. In short, a new back-off timer
was introduced in the CLI to monitor and detect when too many
YANG-modeled commands are being received at the same time. When
a certain threshold is reached (100 YANG-modeled commands within
one second), the northbound starts to group all subsequent commands
into a single large transaction, which allows them to be processed
much faster (e.g. seconds and not hours).  It's essentially a
protection mechanism that creates dynamically-sized transactions
when necessary to prevent performance issues from happening. This
mechanism is enabled both when parsing configuration files and when
reading commands from a terminal.

The downside of this optimization is that, if several YANG-modeled
commands are grouped into the same transaction and at least one of
them fails, the whole transaction is rejected. This is undesirable
since users don't expect transactional behavior when that's not
enabled explicitly. To minimize this issue, the CLI will log all
commands that were rejected whenever that happens, to make the
user aware of what happened and have enough information to fix
the problem. Commands that fail due to parsing errors or CLI-level
validations in general are rejected separately.

Again, this proposed workaround is intended to be temporary. The
goal is to provided a quick fix to issues like FRRouting#6658 while we work
on better long-term solutions.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
donaldsharp pushed a commit to donaldsharp/frr that referenced this issue Oct 29, 2020
When using the default CLI mode, the northbound layer needs to create
a separate transaction to process each YANG-modeled command since
they are supposed to be applied immediately (there's no candidate
configuration nor the "commit" command like in the transactional
CLI). The problem is that configuration transactions have an overhead
associated to them, in big part because of the use of some heavy
libyang functions like `lyd_validate()` and `lyd_diff()`. As of
now this overhead is substantial and doesn't scale well when large
numbers of transactions need to be performed in sequence.

As an example, loading 50k prefix-lists using a single transaction
takes about 2 seconds on a modern CPU. Loading the same 50k
prefix-lists using 50k transactions can take more than an hour
to complete (which is unacceptable by any standard). To fix this
problem, some heavy optimization work needs to be done on libyang and
on the FRR northbound itself too (e.g. perform partial configuration
diffs whenever possible).  This, however, should be a long term
effort since these optimizations shouldn't be trivial to implement
and we're far from having the performance numbers we need.

In the meanwhile, this commit introduces a simple but efficient
workaround to alleviate the issue. In short, a new back-off timer
was introduced in the CLI to monitor and detect when too many
YANG-modeled commands are being received at the same time. When
a certain threshold is reached (100 YANG-modeled commands within
one second), the northbound starts to group all subsequent commands
into a single large transaction, which allows them to be processed
much faster (e.g. seconds and not hours).  It's essentially a
protection mechanism that creates dynamically-sized transactions
when necessary to prevent performance issues from happening. This
mechanism is enabled both when parsing configuration files and when
reading commands from a terminal.

The downside of this optimization is that, if several YANG-modeled
commands are grouped into the same transaction and at least one of
them fails, the whole transaction is rejected. This is undesirable
since users don't expect transactional behavior when that's not
enabled explicitly. To minimize this issue, the CLI will log all
commands that were rejected whenever that happens, to make the
user aware of what happened and have enough information to fix
the problem. Commands that fail due to parsing errors or CLI-level
validations in general are rejected separately.

Again, this proposed workaround is intended to be temporary. The
goal is to provided a quick fix to issues like FRRouting#6658 while we work
on better long-term solutions.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
@qlyoung qlyoung closed this as completed Aug 2, 2021
@qlyoung qlyoung added libyang performance and removed triage Needs further investigation labels Aug 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants