Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg: tlsf: initialize memory pool early #4490

Closed
OlegHahm opened this issue Dec 16, 2015 · 9 comments · May be fixed by #12032
Closed

pkg: tlsf: initialize memory pool early #4490

OlegHahm opened this issue Dec 16, 2015 · 9 comments · May be fixed by #12032
Assignees
Labels
Area: pkg Area: External package ports State: duplicate State: The issue/PR is a duplicate of another issue/PR State: stale State: The issue / PR has no activity for >185 days Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)

Comments

@OlegHahm
Copy link
Member

TLSF and should be initialized with a memory pool for its heap early on. Preferably this could be done inside the CPU startup. This would allow using TLSF's malloc implementation for basically every other module (including BSP or libc stuff) and choosing an appropriate size for the heap (i.e. RAM minus stack usage).

@OlegHahm OlegHahm added Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation Area: pkg Area: External package ports labels Dec 16, 2015
@OlegHahm OlegHahm added this to the Release 2016.03 milestone Dec 16, 2015
@miri64
Copy link
Member

miri64 commented Dec 16, 2015

That's also interesting in regards to lwIP. However, they bring their own version of pool-based memory allocation. We need to look into this, especially if we want to use e.g. CCN-lite together with lwIP.

@OlegHahm
Copy link
Member Author

I think in the long-run we should do something similar to PRNG for malloc and offer various alternative dynamic memory allocations. For CCN-Lite I was even thinking about reusing (misusing) pktbuf.

https://github.com/emeryberger/Malloc-Implementations may be a good starting point.

@haukepetersen
Copy link
Contributor

wont be done for this release -> postponed

@haukepetersen haukepetersen modified the milestones: Release 2016.07, Release 2016.04 Apr 20, 2016
@kYc0o
Copy link
Contributor

kYc0o commented Jul 26, 2016

And postponed again...

@kaspar030
Copy link
Contributor

+1

Maybe using the linker variables to find out about unused memory areas?

@OlegHahm
Copy link
Member Author

OlegHahm commented Sep 1, 2016

Yes, if I remember correctly that was also an approach that @haukepetersen and me discussed some time ago.

@miri64 miri64 modified the milestones: Release 2017.01, Release 2016.10 Oct 31, 2016
@PeterKietzmann PeterKietzmann modified the milestones: Release 2017.01, Release 2017.04 Jan 26, 2017
@aabadie aabadie modified the milestone: Release 2017.10 Jul 14, 2017
jcarrano added a commit to jcarrano/RIOT that referenced this issue Jul 18, 2018
This is a rough experiment at initializing the allocator before
program startup. It should fix RIOT-OS#4490.
@jcarrano
Copy link
Contributor

Please see https://github.com/jcarrano/RIOT/blob/tlsf-init-experiments/examples/tlsf-init-test/main.c.
It seems to solve the problem in both embedded (only tried the samr21) and native. ATM it's a rough sketch.

Interestingly, newlib's preinit_array is a cross file array!!

@stale
Copy link

stale bot commented Aug 10, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

@stale stale bot added the State: stale State: The issue / PR has no activity for >185 days label Aug 10, 2019
jcarrano added a commit to jcarrano/RIOT that referenced this issue Aug 19, 2019
This test is currently failing because of RIOT-OS#4490, RIOT-OS#5796 and RIOT-OS#12021.

When using TLSF as the system allocator it should be initialized

- Automatically, as that is what the user expects.
- Early in the boot process, since the C library mallocs internal buffers.

Failing to do so will lead to a crash as the issues and this test shows.

The test is blacklisted and will be whitelisted in the next commit with
the fix.
jcarrano added a commit to jcarrano/RIOT that referenced this issue Aug 19, 2019
The TLSF allocator needs to be initialized before use. This is an issue when
it is used as a default system allocator since the user expects to be able to
call malloc right away. This is made worse by the fact that the C library uses
malloc internally to create buffers and that may happen before the user's code
has a chance to run.

As a consequence, even doing printf when using USEMODULE=tlsf-malloc will lead
to a crash.

A mechanism is needed to:

1. Initialize the pool early.
2. Determine which memory should be used as a heap and reserve it.

Issue (1) is solved by adding the initializer to the C library's `.preinit_array`,
which is a cross-file array of function pointers that run before the library is
initialized -that is before _init(). See the newlib source code for more details.

Point (2) is important because TLSF dows not support growing the pool, only adding
new ones. We would like to initialize it with a pool as big as possible.

In native (2) is handled by defining a static array of fixed size (given by
TLSF_NATIVE_HEAPSIZE). Memory is plentiful in native and we down't care about
the overhead of zeroing out this array.

On embedded targets using newlib (this may be working on other plaforms, I only
tested ARM) `sbrk()` is used to find the start of the heap and reserve it and
the `_eheap` linker symbol is used to determine the end of the usable heap.
An array is a bad choice here because the size would be board dependent and hard
to determine without build-system magic and because it would be zeroed by default,
making the boot sequence way longer.

sbrk() does nothing more than move a pointer that marks the fraction of the space
between _sheap and _eheap that is reserved. Since we are using the whole heap it
might be tempting to just use the symbols to derive the pool location and size and
to sidestep sbrk(). Especially since the memory allocation functions are expected to
be the only users of such a feature. That "trick" would make the OS impossible to
debug in case the was a mistake and some of the original allocation functions slipped
through non-overriden. If sbrk is used to reserve the entirety of the space then that
rogue function will try to call it and fail as no more heap is available. In fact
this is how I found out that I was overriding the wrong functions (put a breakpoint
int sbrk and show a traceback.) If sbrk is sidestepped one would have nasty and
impossible to debug memory corruption errors.

A third option could be to use the heap space directly and not define sbrk. This
is beyond the scope of this change, but is probably the route to go for platform
that do not define this call (but first do a thoroug investigation of how the libc
works in that platform).

Messing with the global system allocator is not an easy thing to do. I would say
that tslf-malloc is ATM _only_ supported in native and cortex-m.

Testing procedure:

Run `tests/pkg_tlsf_malloc`.

Fixes: RIOT-OS#4490, RIOT-OS#5796.
Closes: RIOT-OS#12021
jcarrano added a commit to jcarrano/RIOT that referenced this issue Aug 19, 2019
This test is currently failing because of RIOT-OS#4490, RIOT-OS#5796 and RIOT-OS#12021.

When using TLSF as the system allocator it should be initialized

- Automatically, as that is what the user expects.
- Early in the boot process, since the C library mallocs internal buffers.

Failing to do so will lead to a crash as the issues and this test shows.

The test is blacklisted and will be whitelisted in the next commit with
the fix.
jcarrano added a commit to jcarrano/RIOT that referenced this issue Aug 19, 2019
The TLSF allocator needs to be initialized before use. This is an issue when
it is used as a default system allocator since the user expects to be able to
call malloc right away. This is made worse by the fact that the C library uses
malloc internally to create buffers and that may happen before the user's code
has a chance to run.

As a consequence, even doing printf when using USEMODULE=tlsf-malloc will lead
to a crash.

A mechanism is needed to:

1. Initialize the pool early.
2. Determine which memory should be used as a heap and reserve it.

Issue (1) is solved by adding the initializer to the C library's `.preinit_array`,
which is a cross-file array of function pointers that run before the library is
initialized -that is before _init(). See the newlib source code for more details.

Point (2) is important because TLSF dows not support growing the pool, only adding
new ones. We would like to initialize it with a pool as big as possible.

In native (2) is handled by defining a static array of fixed size (given by
TLSF_NATIVE_HEAPSIZE). Memory is plentiful in native and we down't care about
the overhead of zeroing out this array.

On embedded targets using newlib (this may be working on other plaforms, I only
tested ARM) `sbrk()` is used to find the start of the heap and reserve it and
the `_eheap` linker symbol is used to determine the end of the usable heap.
An array is a bad choice here because the size would be board dependent and hard
to determine without build-system magic and because it would be zeroed by default,
making the boot sequence way longer.

sbrk() does nothing more than move a pointer that marks the fraction of the space
between _sheap and _eheap that is reserved. Since we are using the whole heap it
might be tempting to just use the symbols to derive the pool location and size and
to sidestep sbrk(). Especially since the memory allocation functions are expected to
be the only users of such a feature. That "trick" would make the OS impossible to
debug in case the was a mistake and some of the original allocation functions slipped
through non-overriden. If sbrk is used to reserve the entirety of the space then that
rogue function will try to call it and fail as no more heap is available. In fact
this is how I found out that I was overriding the wrong functions (put a breakpoint
int sbrk and show a traceback.) If sbrk is sidestepped one would have nasty and
impossible to debug memory corruption errors.

A third option could be to use the heap space directly and not define sbrk. This
is beyond the scope of this change, but is probably the route to go for platform
that do not define this call (but first do a thoroug investigation of how the libc
works in that platform).

Messing with the global system allocator is not an easy thing to do. I would say
that tslf-malloc is ATM _only_ supported in native and cortex-m.

Testing procedure:

Run `tests/pkg_tlsf_malloc`.

Fixes: RIOT-OS#4490, RIOT-OS#5796.
Closes: RIOT-OS#12021
@stale stale bot closed this as completed Sep 10, 2019
@jcarrano jcarrano added State: duplicate State: The issue/PR is a duplicate of another issue/PR Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) and removed Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation labels Sep 10, 2019
@jcarrano
Copy link
Contributor

Duplicate of #5796

@jcarrano jcarrano marked this as a duplicate of #5796 Sep 10, 2019
jcarrano added a commit to jcarrano/RIOT that referenced this issue Sep 28, 2019
This test is currently failing because of RIOT-OS#4490, RIOT-OS#5796 and RIOT-OS#12021.

When using TLSF as the system allocator it should be initialized

- Automatically, as that is what the user expects.
- Early in the boot process, since the C library mallocs internal buffers.

Failing to do so will lead to a crash as the issues and this test shows.

The test is blacklisted and will be whitelisted in the next commit with
the fix.
jcarrano added a commit to jcarrano/RIOT that referenced this issue Sep 28, 2019
The TLSF allocator needs to be initialized before use. This is an issue when
it is used as a default system allocator since the user expects to be able to
call malloc right away. This is made worse by the fact that the C library uses
malloc internally to create buffers and that may happen before the user's code
has a chance to run.

As a consequence, even doing printf when using USEMODULE=tlsf-malloc will lead
to a crash.

A mechanism is needed to:

1. Initialize the pool early.
2. Determine which memory should be used as a heap and reserve it.

Issue (1) is solved by adding the initializer to the C library's `.preinit_array`,
which is a cross-file array of function pointers that run before the library is
initialized -that is before _init(). See the newlib source code for more details.

Point (2) is important because TLSF dows not support growing the pool, only adding
new ones. We would like to initialize it with a pool as big as possible.

In native (2) is handled by defining a static array of fixed size (given by
TLSF_NATIVE_HEAPSIZE). Memory is plentiful in native and we down't care about
the overhead of zeroing out this array.

On embedded targets using newlib (this may be working on other plaforms, I only
tested ARM) `sbrk()` is used to find the start of the heap and reserve it and
the `_eheap` linker symbol is used to determine the end of the usable heap.
An array is a bad choice here because the size would be board dependent and hard
to determine without build-system magic and because it would be zeroed by default,
making the boot sequence way longer.

sbrk() does nothing more than move a pointer that marks the fraction of the space
between _sheap and _eheap that is reserved. Since we are using the whole heap it
might be tempting to just use the symbols to derive the pool location and size and
to sidestep sbrk(). Especially since the memory allocation functions are expected to
be the only users of such a feature. That "trick" would make the OS impossible to
debug in case the was a mistake and some of the original allocation functions slipped
through non-overriden. If sbrk is used to reserve the entirety of the space then that
rogue function will try to call it and fail as no more heap is available. In fact
this is how I found out that I was overriding the wrong functions (put a breakpoint
int sbrk and show a traceback.) If sbrk is sidestepped one would have nasty and
impossible to debug memory corruption errors.

A third option could be to use the heap space directly and not define sbrk. This
is beyond the scope of this change, but is probably the route to go for platform
that do not define this call (but first do a thoroug investigation of how the libc
works in that platform).

Messing with the global system allocator is not an easy thing to do. I would say
that tslf-malloc is ATM _only_ supported in native and cortex-m.

Testing procedure:

Run `tests/pkg_tlsf_malloc`.

Fixes: RIOT-OS#4490, RIOT-OS#5796.
Closes: RIOT-OS#12021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: pkg Area: External package ports State: duplicate State: The issue/PR is a duplicate of another issue/PR State: stale State: The issue / PR has no activity for >185 days Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants