OOM not always signaled properly in Fedora 33 #44539

Closed
chflood opened this issue Mar 9, 2022 · 1 comment

chflood commented Mar 9, 2022

This fails pretty spectacularly, taking the surrounding process (terminal, gdb, script, ...) with it.

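using Random
function oom()
    i = 2147483648  # 2^31 elements to start
    j = 31          # log2(i), printed for reference
    rng = MersenneTwister(12345)
    while true
        i = i * 2
        j = j + 1
        println("j =  ", j, "  i = ", i)
        temp = rand(rng, Int, i)  # allocate and fill an i-element Vector{Int}
    end
end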

oom()
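
For a sense of scale, the request sizes can be worked out with a little shell arithmetic. This is only a sketch: it assumes `Int` is 8 bytes (a 64-bit build), and `alloc_gib` is a hypothetical helper name, not part of the reproducer.

```shell
# Size in GiB of an array holding 2^j elements at 8 bytes per element.
# Assumption: Int is 64-bit; alloc_gib is an illustrative helper only.
alloc_gib() {
    j=$1
    echo $(( ((1 << j) * 8) >> 30 ))
}

# The loop above starts at j = 32 and doubles on every pass.
for j in 32 33 34; do
    echo "j = $j  size = $(alloc_gib "$j") GiB"
done
```

Under that assumption the very first `rand` call already asks for 32 GiB, and each later iteration doubles the request (64 GiB, 128 GiB, ...).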

My machine:

[chf@gotland chfloodjulia]$ uname -a
Linux gotland 5.16.12-200.fc35.x86_64 #1 SMP PREEMPT Wed Mar 2 19:06:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

It fails on both the Linux distribution of Julia and my personal development version.

[chf@gotland chfloodjulia]$ /home/chf/chfloodjulia/julia -version
julia version 1.9.0-DEV
[chf@gotland chfloodjulia]$ julia -version
julia version 1.7.2
[chf@gotland chfloodjulia]$ 

I tracked it in gdb to just before the failure:

(gdb) info threads
  Id   Target Id                                  Frame 
  1    Thread 0x7ffff7d7f740 (LWP 159211) "julia" 0x00007fff8f4493ae in dsfmt_fill_array_close1_open2 () from /home/chf/chfloodjulia/usr/bin/../lib/libdSFMT.so
* 2    Thread 0x7fffe4eb1640 (LWP 159212) "julia" __pthread_kill_implementation (threadid=<optimized out>, signo=2, no_tid=<optimized out>) at pthread_kill.c:44
  3    Thread 0x7fffd1d57640 (LWP 159213) "julia" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7fffd3b618e0 <thread_status+96>) at futex-internal.c:57
  4    Thread 0x7fffc9556640 (LWP 159214) "julia" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7fffd3b61960 <thread_status+224>) at futex-internal.c:57
  5    Thread 0x7fffc8d55640 (LWP 159215) "julia" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7fffd3b619e0 <thread_status+352>) at futex-internal.c:57
  6    Thread 0x7fffc0554640 (LWP 159216) "julia" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7fffd3b61a60 <thread_status+480>) at futex-internal.c:57
  7    Thread 0x7fffafd53640 (LWP 159217) "julia" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7fffd3b61ae0 <thread_status+608>) at futex-internal.c:57
  8    Thread 0x7fffa7552640 (LWP 159218) "julia" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7fffd3b61b60 <thread_status+736>) at futex-internal.c:57
  9    Thread 0x7fff9ed51640 (LWP 159219) "julia" __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x7fffd3b61be0 <thread_status+864>) at futex-internal.c:57
(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff7d7f740 (LWP 159211))]
#0  0x00007fff8f4493ae in dsfmt_fill_array_close1_open2 () from /home/chf/chfloodjulia/usr/bin/../lib/libdSFMT.so
(gdb) where
#0  0x00007fff8f4493ae in dsfmt_fill_array_close1_open2 () from /home/chf/chfloodjulia/usr/bin/../lib/libdSFMT.so
#1  0x00007fff8f94de80 in julia_dsfmt_fill_array_close1_open2!_16 (s=..., A=140701228888128, n=566948310) at /home/chf/chfloodjulia/usr/share/julia/stdlib/v1.9/Random/src/DSFMT.jl:85
#2  0x00007fff8f94e41c in fill_array! () at /home/chf/chfloodjulia/usr/share/julia/stdlib/v1.9/Random/src/RNGs.jl:527
#3  fill_array! () at /home/chf/chfloodjulia/usr/share/julia/stdlib/v1.9/Random/src/RNGs.jl:521
#4  julia_rand!_25 (r=..., A=..., I=...) at /home/chf/chfloodjulia/usr/share/julia/stdlib/v1.9/Random/src/RNGs.jl:550
#5  0x00007fff8f94e5f4 in rand! () at /home/chf/chfloodjulia/usr/share/julia/stdlib/v1.9/Random/src/Random.jl:267
#6  julia_rand!_18 (r=..., A=...) at /home/chf/chfloodjulia/usr/share/julia/stdlib/v1.9/Random/src/RNGs.jl:622
#7  0x00007fff8f94ed79 in rand! () at /home/chf/chfloodjulia/usr/share/julia/stdlib/v1.9/Random/src/Random.jl:268
#8  rand! () at /home/chf/chfloodjulia/usr/share/julia/stdlib/v1.9/Random/src/Random.jl:268
#9  julia_rand!_9 (r=..., A=...) at /home/chf/chfloodjulia/usr/share/julia/stdlib/v1.9/Random/src/RNGs.jl:654
#10 0x00007fff8f94efda in rand! () at /home/chf/chfloodjulia/usr/share/julia/stdlib/v1.9/Random/src/RNGs.jl:645
#11 rand! () at /home/chf/chfloodjulia/usr/share/julia/stdlib/v1.9/Random/src/Random.jl:268
#12 rand () at /home/chf/chfloodjulia/usr/share/julia/stdlib/v1.9/Random/src/Random.jl:289
#13 rand () at /home/chf/chfloodjulia/usr/share/julia/stdlib/v1.9/Random/src/Random.jl:292
#14 julia_oom_4 () at /home/chf/gc_benchmarks/oom.jl:10
#15 0x00007fff8f94f000 in jfptr_oom_5 ()
#16 0x00007ffff757ab23 in _jl_invoke (F=0x7fffe45942a0, args=0x7fffffffb368, nargs=0, mfunc=0x7fffe051a6d0, world=32356) at /home/chf/chfloodjulia/src/gf.c:2367
#17 0x00007ffff757b4e6 in ijl_apply_generic (F=0x7fffe45942a0, args=0x7fffffffb368, nargs=0) at /home/chf/chfloodjulia/src/gf.c:2549
#18 0x00007ffff759985f in jl_apply (args=0x7fffffffb360, nargs=1) at /home/chf/chfloodjulia/src/julia.h:1827
#19 0x00007ffff7599cfd in do_call (args=0x7fffe050f778, nargs=1, s=0x7fffffffb780) at /home/chf/chfloodjulia/src/interpreter.c:126
#20 0x00007ffff759a551 in eval_value (e=0x7fffe055f430, s=0x7fffffffb780) at /home/chf/chfloodjulia/src/interpreter.c:215
#21 0x00007ffff759a01f in eval_stmt_value (stmt=0x7fffe055f430, s=0x7fffffffb780) at /home/chf/chfloodjulia/src/interpreter.c:166
#22 0x00007ffff759c6c5 in eval_body (stmts=0x7fffe050f710, s=0x7fffffffb780, ip=0, toplevel=1) at /home/chf/chfloodjulia/src/interpreter.c:594
#23 0x00007ffff759d483 in jl_interpret_toplevel_thunk (m=0x7fffd67e3540 <jl_system_image_data+8651584>, src=0x7fffe0568690) at /home/chf/chfloodjulia/src/interpreter.c:750
#24 0x00007ffff75c78bc in jl_toplevel_eval_flex (m=0x7fffd67e3540 <jl_system_image_data+8651584>, e=0x7fffe055e2b0, fast=1, expanded=0) at /home/chf/chfloodjulia/src/toplevel.c:906
#25 0x00007ffff75c736b in jl_toplevel_eval_flex (m=0x7fffd67e3540 <jl_system_image_data+8651584>, e=0x7fffe055e370, fast=1, expanded=0) at /home/chf/chfloodjulia/src/toplevel.c:850
#26 0x00007ffff75c7918 in ijl_toplevel_eval (m=0x7fffd67e3540 <jl_system_image_data+8651584>, v=0x7fffe055e370) at /home/chf/chfloodjulia/src/toplevel.c:915
#27 0x00007ffff75c7baa in ijl_toplevel_eval_in (m=0x7fffd67e3540 <jl_system_image_data+8651584>, ex=0x7fffe055e370) at /home/chf/chfloodjulia/src/toplevel.c:965
#28 0x00007fffd584d695 in eval () at boot.jl:368
#29 japi1_include_string_38875 (mapexpr=..., mod=0x7fffe4439348, code=0xe8, filename=0x1e) at loading.jl:1295
#30 0x00007ffff7579ebe in jl_fptr_args (f=0x7fffd6a1ece0 <jl_system_image_data+10992352>, args=0x7fffffffc090, nargs=4, m=0x7fffd6a1efa0 <jl_system_image_data+10993056>) at /home/chf/chfloodjulia/src/gf.c:2128
#31 0x00007ffff757aa44 in _jl_invoke (F=0x7fffd6a1ece0 <jl_system_image_data+10992352>, args=0x7fffffffc090, nargs=4, mfunc=0x7fffd6a1ef50 <jl_system_image_data+10992976>, world=32355)
    at /home/chf/chfloodjulia/src/gf.c:2348
#32 0x00007ffff757b4e6 in ijl_apply_generic (F=0x7fffd6a1ece0 <jl_system_image_data+10992352>, args=0x7fffffffc090, nargs=4) at /home/chf/chfloodjulia/src/gf.c:2549
#33 0x00007fffd57a3858 in japi1__include_48728 (mapexpr=0x7fffd6148d60 <jl_system_image_data+1727328>, mod=0x7fffe4439348, _path=0x6) at loading.jl:1352
#34 0x00007fffd57a3aca in julia_include_30539 (mod=0x7fffe4439348, _path=0x6) at Base.jl:422
#35 0x00007fffd57a3b20 in jfptr_include_30540 () from /home/chf/chfloodjulia/usr/lib/julia/sys-debug.so
#36 0x00007ffff757aa44 in _jl_invoke (F=0x7fffd6454750 <jl_system_image_data+4920656>, args=0x7fffffffd7c0, nargs=2, mfunc=0x7fffd64549c0 <jl_system_image_data+4921280>, world=32355)
    at /home/chf/chfloodjulia/src/gf.c:2348
#37 0x00007ffff757b4e6 in ijl_apply_generic (F=0x7fffd6454750 <jl_system_image_data+4920656>, args=0x7fffffffd7c0, nargs=2) at /home/chf/chfloodjulia/src/gf.c:2549
#38 0x00007fffd5d4fdae in julia_exec_options_43705 (opts=...) at client.jl:303
#39 0x00007fffd5d50a8a in julia__start_41863 () at client.jl:522
#40 0x00007fffd5d50bd9 in jfptr.start_41864 () from /home/chf/chfloodjulia/usr/lib/julia/sys-debug.so
#41 0x00007ffff757aa44 in _jl_invoke (F=0x7fffd61c74d0 <jl_system_image_data+2245328>, args=0x7fffffffdc70, nargs=0, mfunc=0x7fffd61c7310 <jl_system_image_data+2244880>, world=32355)
    at /home/chf/chfloodjulia/src/gf.c:2348
#42 0x00007ffff757b4e6 in ijl_apply_generic (F=0x7fffd61c74d0 <jl_system_image_data+2245328>, args=0x7fffffffdc70, nargs=0) at /home/chf/chfloodjulia/src/gf.c:2549
#43 0x00007ffff75fa00c in jl_apply (args=0x7fffffffdc68, nargs=1) at /home/chf/chfloodjulia/src/julia.h:1827
#44 0x00007ffff75fbbb1 in true_main (argc=1, argv=0x7fffffffe0b0) at /home/chf/chfloodjulia/src/jlapi.c:562
#45 0x00007ffff75fc14a in jl_repl_entrypoint (argc=1, argv=0x7fffffffe0a8) at /home/chf/chfloodjulia/src/jlapi.c:706
#46 0x00007ffff7d93991 in jl_load_repl (argc=2, argv=0x7fffffffe0a8) at /home/chf/chfloodjulia/cli/loader_lib.c:271
#47 0x000000000040117a in _fini ()
#48 0x00007fffffffe0a8 in ?? ()
#49 0x0000000225467400 in ?? ()
#50 0x00007ffff7d9e878 in __frame_dummy_init_array_entry () from /home/chf/chfloodjulia/usr/bin/../lib/libjulia-debug.so.1
#51 0x00007fffffffdf90 in ?? ()
#52 0x0000000000000002 in ?? ()
#53 0x00007ffff7dd0560 in __libc_start_call_main (main=main@entry=0x40115a, argc=2, argc@entry=-8336, argv=argv@entry=0x7fffffffe0a8) at ../sysdeps/nptl/libc_start_call_main.h:58
#54 0x00007ffff7dd060c in __libc_start_main_impl (main=0x40115a, argc=-8336, argv=0x7fffffffe0a8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe098)
    at ../csu/libc-start.c:409
#55 0x0000000000401075 in _start ()
(gdb) list
39	         delivery of all pending signals after unblocking in the code
40	         below.  POSIX only guarantees delivery of a single signal,
41	         which may not be the right one.)  */
42	      pid_t tid = INTERNAL_SYSCALL_CALL (gettid);
43	      int ret = INTERNAL_SYSCALL_CALL (tgkill, __getpid (), tid, signo);
44	      return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
45	    }
46	
47	  /* Block all signals, as required by pd->exit_lock.  */
48	  sigset_t old_mask

chflood commented Mar 9, 2022

I'm closing this. You can work around the issue by turning off Linux memory overcommit (put vm.overcommit_memory = 2 in /etc/sysctl.conf). I don't think this is a Julia-specific issue.
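
As a rough sketch of checking and applying that setting: the snippet below reads the current mode from procfs and prints what it means. `describe_overcommit` is a hypothetical helper, and the mode descriptions paraphrase the kernel's overcommit documentation. The commands that actually change the setting need root, so they are left commented out.

```shell
# Map a vm.overcommit_memory value to a short description.
# describe_overcommit is an illustrative helper, not a system command.
describe_overcommit() {
    case "$1" in
        0) echo "heuristic overcommit (default)" ;;
        1) echo "always overcommit" ;;
        2) echo "strict accounting, no overcommit" ;;
        *) echo "unknown mode" ;;
    esac
}

# Report the current mode if the procfs entry is readable on this machine.
if [ -r /proc/sys/vm/overcommit_memory ]; then
    mode=$(cat /proc/sys/vm/overcommit_memory)
    echo "vm.overcommit_memory = $mode: $(describe_overcommit "$mode")"
fi

# To apply the workaround persistently (requires root, so not run here):
#   echo 'vm.overcommit_memory = 2' | sudo tee -a /etc/sysctl.conf
#   sudo sysctl -p
```

In mode 2 the kernel caps total committed memory at swap plus a configurable fraction of RAM (`vm.overcommit_ratio`), so an oversized allocation should fail at request time instead of the process being killed later when the pages are touched.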

chflood closed this as completed Mar 9, 2022