Attempt to Use Library Crashes with GLIBC Error #349

Open
luarvique opened this issue Jul 15, 2024 · 14 comments
@luarvique

When trying to use the nrsc5 library from OpenWebRX+ on the amd64/x64 architecture, things immediately crash with the following message:

double free or corruption (!prev)

Both the arm64 and armhf architectures are fine. The command-line nrsc5 utility also appears to be fine.

I have been trying to debug the issue but failed: the process gets killed before GDB has a chance to look at it. I also tried MALLOC_CHECK_=2, which did not help.

@argilo
Collaborator

argilo commented Jul 15, 2024

This could happen if nrsc5_close is called twice for the same nrsc5_t object.

Perhaps you could try building OpenWebRX+ and libnrsc5 using ASAN (-fsanitize=address) or running OpenWebRX+ inside valgrind.

@luarvique
Author

Are there any known issues with creating multiple instances of the NRSC5 Python class? That might be one of the causes for multiple closes.

@argilo
Collaborator

argilo commented Jul 15, 2024

I have not tried creating multiple instances before. It's possible there could be bugs which come up only in that scenario. If you are able to make a script that reproduces the issue, or learn more about the crash, please let me know.

@luarvique
Author

luarvique commented Jul 15, 2024

Right now, I seem to hit the issue on every call to the NRSC5 object: open_pipe() crashes, and start() crashes as well, even with everything else commented out. I do seem to have only one instance at the moment, so it may not be a multiple-instance issue. Trying ASAN.

@luarvique
Author

Ok, so ASAN did not help; I still cannot pin down the crash. It does happen with 100% repeatability, though, as long as you install and run OpenWebRX+. Is that something you would be willing to do in order to catch the cause? You will need an amd64 PC with Ubuntu Jammy or Debian Bullseye and an RTLSDR dongle.

@luarvique
Author

Multiple trial-and-error experiments have pointed to this code as causing the crash:

void sync_init(sync_t *st, input_t *input)
{
    float loop_bw = 0.05, damping = 0.70710678;
    float denom = 1 + (2 * damping * loop_bw) + (loop_bw * loop_bw);
    st->alpha = (4 * damping * loop_bw) / denom;
    st->beta = (4 * loop_bw * loop_bw) / denom;

    st->input = input;
    sync_reset(st);
}

As soon as at least one st-> assignment is made here, things crash afterwards (even if the rest of the initialization is commented out). Interestingly, the sync_t member is the last field of the input_t structure.

@luarvique
Author

Found and fixed the issue. Apparently, open_pipe() (and likely some other NRSC5 calls) cannot be called from a Python thread. No idea why.

@argilo
Collaborator

argilo commented Jul 17, 2024

Let's keep this issue open until the root cause is understood. It's possible there's an issue in nrsc5 that needs to be fixed so that it can be used from a Python thread.

@argilo argilo reopened this Jul 17, 2024
@markjfine

markjfine commented Jul 17, 2024

As Python 3.9 and 3.11 were both updated in Homebrew on my Mac today, I'm reminded that this might be a Python version issue. I'm unsure which version is being used to access the nrsc5 API, but there were lots of changes in 3.12 that broke quite a number of things. You may want to try down-versioning in Ubuntu to see if the same problems exist.

@luarvique
Author

> Let's keep this issue open until the root cause is understood. It's possible there's an issue in nrsc5 that needs to be fixed so that it can be used from a Python thread.

Well, one possible reason, as pointed out by @jketterl, would be the unfortunate fact that fftw3 is not thread-safe, and OWRX obviously uses fftw3 a lot. The simplest workaround is to avoid creating fftw3 plans in threads, so at the very least the NRSC5 open() should happen in the main thread.

The disturbing thing is that when I commented out the fftw3 plan initialization in NRSC5, that alone did not eliminate the crash.

Then there is the nature of the crash: from the look of things, it somehow corrupts the malloc() heap, so the next free() call trips the safety checks.

@TheDaChicken
Contributor

Yeah, this isn't an easy issue to fix. It is possible to call fftw_make_planner_thread_safe somewhere in OWRX to see if that fixes it, but it's not recommended. I guess it's good to note that open() needs to be called from the main thread?

@luarvique
Author

It is possible in theory, but it will lead to performance degradation. The current change (calling open() from the Python object constructor) seems to do the job, though.

@argilo
Collaborator

argilo commented Jul 18, 2024

> I guess it's good to note that open() needs to be called from the main thread?

Yeah, if the nrsc5_open_* functions are not thread-safe then we should probably document that. A better alternative might be to make them thread-safe. There's no reason the FFTW plans need to be made more than once, so we could perhaps move the fftwf_plan_dft_1d calls out into a global initialization function.

@TheDaChicken
Contributor

> so we could perhaps move the fftwf_plan_dft_1d calls out into a global initialization function.

I did a quick Google search. That seems like a great idea, and there shouldn't be any problems with it. Note that fftw_execute_dft should be used as well, so that a given plan can be executed on different arrays; that prevents any multi-instance issues!

Maybe it would be a good idea to make the global initialization optional?

@argilo argilo added the bug label Jul 22, 2024