Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Zbus Benchmark Sample is smashing the stack and is too fast #53563

Merged
merged 4 commits into from
Jan 10, 2023

Conversation

alpsayin
Copy link
Contributor

@alpsayin alpsayin commented Jan 6, 2023

ZBus subscriber threads were smashing their threads. Upped their stack with something safe by utilising IDLE_STACK_SIZE for subscribers and MAIN_STACK_SIZE for producer.

Benchmark also finishes way too fast for most if not all targets. Used cycles instead of uptime as originally suggested by @rodrigopex and used nanosecond conversions to get around the duration==0 check.

Native posix still returns start and end times as 0 though, so with further commits utilised native_rtc with RTC_CLOCK_PSEUDOHOSTREALTIME to get as much resolution as possible.

@alpsayin alpsayin changed the title Fix Zbus Sample is smashing the stack and too fast Fix Zbus Sample is smashing the stack and is too fast Jan 6, 2023
@alpsayin alpsayin changed the title Fix Zbus Sample is smashing the stack and is too fast Fix Zbus Benchmark Sample is smashing the stack and is too fast Jan 6, 2023
@alpsayin alpsayin marked this pull request as ready for review January 6, 2023 18:13
@alpsayin alpsayin requested a review from nashif as a code owner January 6, 2023 18:13
@alpsayin
Copy link
Contributor Author

alpsayin commented Jan 6, 2023

@rodrigopex for reviews?

Copy link
Contributor

@rodrigopex rodrigopex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only small changes are necessary.

samples/subsys/zbus/benchmark/src/benchmark.c Outdated Show resolved Hide resolved
samples/subsys/zbus/benchmark/src/benchmark.c Outdated Show resolved Hide resolved
@rodrigopex
Copy link
Contributor

rodrigopex commented Jan 6, 2023

@alpsayin maybe we could leave the duration zero occurs. The only thing we need to do is to avoid the zero division and force fand i also to be zero. What do you think? And that would remove platform_exclude: native_posix from the sample yaml you are suggesting.

@asayin-xlnx
Copy link

@alpsayin maybe we could leave the duration zero occurs. The only thing we need to do is to avoid the zero division and force fand i also to be zero. What do you think? And that would remove platform_exclude: native_posix from the sample yaml you are suggesting.

I found a somewhat reliable way to measure time for native_posix. "Reliable" is when you up the BYTES_TO_BE_SENT to MBs then duration becomes consistently directly proportional (as one would expect) to BYTES_TO_BE_SENT. With 256kB, the deviation is about 30% on the machine that I'm testing (workstation with 0 load).

Eyeballed means/deviations are 30+/-10us for native_posix_64 and 35+/-5us for native_posix.

Let's see what CI/CD will end up showing.

Copy link
Contributor

@rodrigopex rodrigopex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would avoid abbreviations for one-time used variables and declare variables inside ifdefs.

samples/subsys/zbus/benchmark/src/benchmark.c Outdated Show resolved Hide resolved
samples/subsys/zbus/benchmark/src/benchmark.c Outdated Show resolved Hide resolved
samples/subsys/zbus/benchmark/src/benchmark.c Outdated Show resolved Hide resolved
Stack sizes need to be dependent on something arch specific such as
idle stack size because they're smashing their stacks.

Signed-off-by: Alp Sayin <alp.sayin@amd.com>
…time

Needed because benchmark finishes faster than a tick, especially on
emulated platforms where tick rate is 100Hz. Output units are also updated
to print numbers with less digits.

Signed-off-by: Alp Sayin <alp.sayin@amd.com>
rodrigopex
rodrigopex previously approved these changes Jan 7, 2023
@carlescufi
Copy link
Member

@alpsayin can you please fix the failing checks?

Copy link
Contributor

@rodrigopex rodrigopex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just waiting for @cfriedt to comment on the native_posix get-time approach.


#if defined(CONFIG_ARCH_POSIX)
#include "native_rtc.h"
#define GET_ARCH_TIME_NS() (native_rtc_gettime_us(RTC_CLOCK_PSEUDOHOSTREALTIME) * NSEC_PER_USEC)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cfriedt Could you please give us your opinion about line 17? Is this a good practice for native_posix?

@alpsayin
Copy link
Contributor Author

alpsayin commented Jan 9, 2023

Hi @carlescufi thanks for showing interest.
I think 1st failure here is a checkpatch bug, checkpatch either complains line's too long or I'm "unnecessarily" breaking it. Please see in fact our discussion on 2nd line. What should I do?
https://github.com/zephyrproject-rtos/zephyr/actions/runs/3871989306/jobs/6603001449#step:9:31
#53563 (comment)

K_THREAD_DEFINE(producer_thread_id, PRODUCER_STACK_SIZE, producer_thread, NULL, NULL, NULL, 5, 0, 0);

2nd failure I see here is just me being somehow very unlucky:
https://github.com/zephyrproject-rtos/zephyr/actions/runs/3871989309/jobs/6603001548#step:5:54

Err:1 http://azure.archive.ubuntu.com/ubuntu focal/universe amd64 ocaml-base-nox amd64 4.08.1-8
  503  Service Unavailable [IP: 20.106.104.242 80]
E: Failed to fetch http://azure.archive.ubuntu.com/ubuntu/pool/universe/o/ocaml/ocaml-base-nox_4.08.1-8_amd64.deb  503  Service Unavailable [IP: 20.106.104.242 80]

@rodrigopex
Copy link
Contributor

2nd failure I see here is just me being somehow very unlucky: https://github.com/zephyrproject-rtos/zephyr/actions/runs/3871989309/jobs/6603001548#step:5:54

Err:1 http://azure.archive.ubuntu.com/ubuntu focal/universe amd64 ocaml-base-nox amd64 4.08.1-8
  503  Service Unavailable [IP: 20.106.104.242 80]
E: Failed to fetch http://azure.archive.ubuntu.com/ubuntu/pool/universe/o/ocaml/ocaml-base-nox_4.08.1-8_amd64.deb  503  Service Unavailable [IP: 20.106.104.242 80]

@alpsayin

Let's try it again in some minutes.

@carlescufi
Copy link
Member

I think 1st failure here is a checkpatch bug, checkpatch either complains line's too long or I'm "unnecessarily" breaking it. Please see in fact our discussion on 2nd line. What should I do?

I think we can let this one go, but double-checking with @stephanosio

#include "native_rtc.h"
#define GET_ARCH_TIME_NS() (native_rtc_gettime_us(RTC_CLOCK_PSEUDOHOSTREALTIME) * NSEC_PER_USEC)
#else
#define GET_ARCH_TIME_NS() (k_cyc_to_ns_near32(sys_clock_cycle_get_32()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was seeing issues on the ARC QEMU platform where the tick count runs at 100Hz; does this API provide a higher resolution value for that case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems so

-- west build: running target run_qemu
[0/1] To exit from QEMU enter: 'CTRL+a, x'[QEMU] CPU: arcem
*** Booting Zephyr OS build zephyr-v3.2.0-3158-g7bb6557ba710 ***
I: Benchmark 1 to 8: Dynamic memory, SYNC transmission and message size 256
I: Bytes sent = 262144, received = 262144
I: Average data rate: 2.11MB/s
I: Duration: 118061.400us

@118061400
[0/1] To exit from QEMU enter: 'CTRL+a, x'[QEMU] CPU: archs
*** Booting Zephyr OS build zephyr-v3.2.0-3158-g7bb6557ba710 ***
I: Benchmark 1 to 8: Dynamic memory, SYNC transmission and message size 256
I: Bytes sent = 262144, received = 262144
I: Average data rate: 2.12MB/s
I: Duration: 117709.400us

@117709400
[0/1] To exit from QEMU enter: 'CTRL+a, x'[QEMU] CPU: hs6x
*** Booting Zephyr OS build zephyr-v3.2.0-3158-g7bb6557ba710 ***
I: Benchmark 1 to 8: Dynamic memory, SYNC transmission and message size 256
I: Bytes sent = 262144, received = 262144
I: Average data rate: 3.12MB/s
I: Duration: 80011.600us

@80011600

@stephanosio
Copy link
Member

I think we can let this one go, but double-checking with @stephanosio

The package download failure has been fixed and can be ignored. The LINE_CONTINUATIONS failure, however, is valid -- it is warning that \ is not necessary and should be removed.

…POSIX

Because posix emulated execution is significantly faster than posix
emulated timer and benchmark finishes before even the first tick occurs.

Signed-off-by: Alp Sayin <alp.sayin@amd.com>
The sample is now able to run on all previously excluded platforms.

Signed-off-by: Alp Sayin <alp.sayin@amd.com>
@alpsayin
Copy link
Contributor Author

Of course 🤦🏼 Dont need \ when inside parenthesis...
Thanks both!

@alpsayin
Copy link
Contributor Author

@rodrigopex I have a feeling I should also compare ticks for good measure.
I couldn't trust that cycles isn't going to overflow.
Some numbers I saw are 100s of Mbytes/s whereas arc returned 2-3.
Can we really expect this much deviation?

@carlescufi carlescufi requested review from asayin-xlnx and keith-packard and removed request for asayin-xlnx January 10, 2023 13:40
@carlescufi
Copy link
Member

Merging this now, additional improvements can be done in a subsequent PR.

@carlescufi carlescufi merged commit 345b3bc into zephyrproject-rtos:main Jan 10, 2023
@rodrigopex
Copy link
Contributor

@rodrigopex I have a feeling I should also compare ticks for good measure. I couldn't trust that cycles isn't going to overflow. Some numbers I saw are 100s of Mbytes/s whereas arc returned 2-3. Can we really expect this much deviation?

@alpsayin I need to think more about it. We can discuss it in the following days. Ok?

@keith-packard
Copy link
Collaborator

This doesn't fix the problem on qemu_nios2 where the cycle counter resolution is the the same as the system clock (100Hz by default). Cranking up the tick rate (CONFIG_SYS_CLOCK_TICKS_PER_SEC) to 1kHz fixes the problem.

Output with minimal libc (with 1kHz tick):

*** Booting Zephyr OS build zephyr-v3.2.0-3310-g4ec18e5fbf52 ***
I: Benchmark 1 to 8: Dynamic memory, SYNC transmission and message size 256
I: Bytes sent = 262144, received = 262144
I: Average data rate: 7.35MB/s
I: Duration: 34000.0us

@34000000

Output with picolibc (with 1kHz tick):

*** Booting Zephyr OS build zephyr-v3.2.0-3310-g4ec18e5fbf52 ***
I: Benchmark 1 to 8: Dynamic memory, SYNC transmission and message size 256
I: Bytes sent = 262144, received = 262144
I: Average data rate: 62.50MB/s
I: Duration: 4000.0us

@4000000

As you can see, when using picolibc, the test completes in 4ms, well under the 10ms system tick.

You should be able to verify this using this command:

$ west build -b qemu_nios2 samples/subsys/zbus/benchmark -T sample.zbus.benchmark_sync -- -DCONFIG_PICOLIBC=y

I'll submit another PR that just increases the TICK value for these tests.

@rodrigopex
Copy link
Contributor

That would be great.

@alpsayin alpsayin deleted the zbus_fix branch January 11, 2023 10:29
@alpsayin
Copy link
Contributor Author

@rodrigopex I have a feeling I should also compare ticks for good measure. I couldn't trust that cycles isn't going to overflow. Some numbers I saw are 100s of Mbytes/s whereas arc returned 2-3. Can we really expect this much deviation?

@alpsayin I need to think more about it. We can discuss it in the following days. Ok?

I checked, and we get away with (mostly by chance). In the slowest platforms, tests still finish before cycles can roll over.
It also seems nanoseconds was a bit of an overkill but oh well ¯\_(ツ)_/¯

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants