I've been chasing a segfault that is triggered on MacOSX. To set up and reproduce:
opam install . --deps-only --with-test
I can reproduce it pretty consistently (9/10 times or so) by running the following:
$ dune exec src/lazy/lazy_lin_test.exe -- -v -s 249901845
random seed: 249901845
generated error fail pass / total time test name
[ ] 19 0 0 19 / 100 22.4s Linearizable lazy test with DomainSegmentation fault: 11
An attempt at reducing the problem is also available. It does not crash as consistently, but the code is a bit simpler and has fewer dependencies:
$ dune exec src/lazy/lazy_lin_reduced.exe
0 t
1 t
2 t
3 t
4 t
5 t
6 CamlinternalLazy.Undefined
7 CamlinternalLazy.Undefined
8 Segmentation fault: 11
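For readers without the repository at hand, here is a minimal sketch of the kind of program that can produce the `t` and `CamlinternalLazy.Undefined` results above. It is not the code of lazy_lin_reduced.exe: the names `force_to_string` and `run_once`, the two-domain setup, and the printed format are all assumptions made for illustration. It only shows two domains forcing one shared lazy value on OCaml 5, where a concurrent force may raise `Lazy.Undefined` (the same exception as `CamlinternalLazy.Undefined`).

```ocaml
(* Hypothetical sketch, NOT the actual reduced test. Requires OCaml 5 (Domain). *)

(* Force a shared lazy value and render the outcome as a string.
   A concurrent force by another domain may raise Lazy.Undefined. *)
let force_to_string (l : bool Lazy.t) : string =
  match Lazy.force l with
  | true -> "t"
  | false -> "f"
  | exception Lazy.Undefined -> "CamlinternalLazy.Undefined"

(* Spawn two domains that race to force the same lazy value. *)
let run_once () : string * string =
  let l =
    lazy
      ((* a little work to widen the race window *)
       ignore (Sys.opaque_identity (List.init 1000 Fun.id));
       true)
  in
  let d1 = Domain.spawn (fun () -> force_to_string l) in
  let d2 = Domain.spawn (fun () -> force_to_string l) in
  (Domain.join d1, Domain.join d2)

let () =
  for i = 0 to 9 do
    let r1, r2 = run_once () in
    Printf.printf "%d %s %s\n%!" i r1 r2
  done
```

Which result each domain reports depends on scheduling, so the output differs from run to run. This sketch is not claimed to reproduce the segfault.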
What (I think) I know so far:
- at_exit: I followed a suggestion by @dra27 and tried on "Refine Domain.{at_exit,at_startup,at_first_spawn,at_each_spawn} callback semantics" (#11213), where it also crashes.
- Lazy values: Uncommenting the Lazy tests still crashed our CI tests.

Here's first the output of an lldb run without the debug runtime, which stops with an EXC_BAD_ACCESS:
$ lldb _build/default/src/lazy/lazy_lin_reduced.exe
(lldb) target create "_build/default/src/lazy/lazy_lin_reduced.exe"
Current executable set to '/Users/jmi/software/ocaml-04-28-2022-11213/multicoretests/_build/default/src/lazy/lazy_lin_reduced.exe' (x86_64).
(lldb) run
Process 1503 launched: '/Users/jmi/software/ocaml-04-28-2022-11213/multicoretests/_build/default/src/lazy/lazy_lin_reduced.exe' (x86_64)
0 t
1 t
2 t
3 t
4 t
5 t
6 CamlinternalLazy.Undefined
7 CamlinternalLazy.Undefined
8 Process 1503 stopped
* thread #3, name = 'Domain3', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
frame #0: 0x00000001000b7cf4 lazy_lin_reduced.exe`caml_c_call + 4
lazy_lin_reduced.exe`caml_c_call:
-> 0x1000b7cf4 <+4>: movq %rsp, (%r10)
0x1000b7cf7 <+7>: movq 0x30(%r14), %r11
0x1000b7cfb <+11>: movq %rsp, 0x8(%r11)
0x1000b7cff <+15>: movq %r10, (%r11)
Target 0: (lazy_lin_reduced.exe) stopped.
(lldb) bt all
lazy_lin_reduced.exe was compiled with optimization - stepping may behave oddly; variables may not be available.
thread #1, name = 'Domain0', queue = 'com.apple.main-thread'
frame #0: 0x00007fff20608cce libsystem_kernel.dylib`__psynch_cvwait + 10
frame #1: 0x00007fff2063be49 libsystem_pthread.dylib`_pthread_cond_wait + 1298
frame #2: 0x00000001000b2fd8 lazy_lin_reduced.exe`caml_ml_condition_wait [inlined] sync_condvar_wait(c=0x0000000100515290, m=0x0000000100515250) at sync_posix.h:122:10 [opt]
frame #3: 0x00000001000b2fcd lazy_lin_reduced.exe`caml_ml_condition_wait(wcond=<unavailable>, wmut=<unavailable>) at sync.c:172:13 [opt]
frame #4: 0x00000001000b7d0b lazy_lin_reduced.exe`caml_c_call + 27
frame #5: 0x00000001000556bc lazy_lin_reduced.exe`camlStdlib__Domain__loop_718 + 44
frame #6: 0x000000010005565d lazy_lin_reduced.exe`camlStdlib__Domain__join_713 + 141
frame #7: 0x000000010000837f lazy_lin_reduced.exe`camlDune__exe__Lazy_lin_reduced__lin_prop_domain_754 + 287
frame #8: 0x000000010000907f lazy_lin_reduced.exe`camlUtil__repeat_268 + 95
frame #9: 0x0000000100008522 lazy_lin_reduced.exe`camlDune__exe__Lazy_lin_reduced__exec_test_802 + 146
frame #10: 0x000000010003b228 lazy_lin_reduced.exe`camlStdlib__List__map_483 + 56
frame #11: 0x000000010003b23f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #12: 0x000000010003b23f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #13: 0x000000010003b23f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #14: 0x000000010003b23f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #15: 0x000000010003b23f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #16: 0x000000010003b23f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #17: 0x000000010003b23f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #18: 0x000000010003b23f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #19: 0x0000000100008fec lazy_lin_reduced.exe`camlDune__exe__Lazy_lin_reduced__entry + 2012
frame #20: 0x0000000100002b8b lazy_lin_reduced.exe`caml_program + 747
frame #21: 0x00000001000b7dc4 lazy_lin_reduced.exe`caml_start_program + 112
frame #22: 0x00000001000b760b lazy_lin_reduced.exe`caml_main [inlined] caml_startup(argv=<unavailable>) at startup_nat.c:136:7 [opt]
frame #23: 0x00000001000b7604 lazy_lin_reduced.exe`caml_main(argv=<unavailable>) at startup_nat.c:142:3 [opt]
frame #24: 0x00000001000a787c lazy_lin_reduced.exe`main(argc=<unavailable>, argv=<unavailable>) at main.c:37:3 [opt]
frame #25: 0x00007fff20656f3d libdyld.dylib`start + 1
thread #2, name = 'Backup0'
frame #0: 0x00000001000af98e lazy_lin_reduced.exe`pool_sweep(local=<unavailable>, plist=<unavailable>, sz=1, release_to_global_pool=1) at shared_heap.c:457:31 [opt]
frame #1: 0x00000001000af524 lazy_lin_reduced.exe`caml_sweep(local=0x0000000111008200, work=512) at shared_heap.c:545:7 [opt]
frame #2: 0x00000001000a89f0 lazy_lin_reduced.exe`major_collection_slice(howmuch=<unavailable>, participant_count=0, barrier_participants=0x0000000000000000, mode=Slice_opportunistic) at major_gc.c:1208:14 [opt]
frame #3: 0x0000000100094e98 lazy_lin_reduced.exe`handle_incoming at domain.c:1248:9 [opt]
frame #4: 0x0000000100094e59 lazy_lin_reduced.exe`handle_incoming(s=<unavailable>) at domain.c:305:5 [opt]
frame #5: 0x00000001000970e2 lazy_lin_reduced.exe`backup_thread_func [inlined] caml_handle_incoming_interrupts at domain.c:318:3 [opt]
frame #6: 0x00000001000970cd lazy_lin_reduced.exe`backup_thread_func(v=0x000000010014a810) at domain.c:956:13 [opt]
frame #7: 0x00007fff2063b8fc libsystem_pthread.dylib`_pthread_start + 224
frame #8: 0x00007fff20637443 libsystem_pthread.dylib`thread_start + 15
* thread #3, name = 'Domain3', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
* frame #0: 0x00000001000b7cf4 lazy_lin_reduced.exe`caml_c_call + 4
frame #1: 0x000000010007beef lazy_lin_reduced.exe`camlStdlib__Format__buffered_out_flush_1279 + 111
frame #2: 0x000000010007f83e lazy_lin_reduced.exe`camlStdlib__Format__flush_standard_formatters_2002 + 62
frame #3: 0x0000000100055139 lazy_lin_reduced.exe`camlStdlib__Domain__new_exit_673 + 41
frame #4: 0x00000001000554a7 lazy_lin_reduced.exe`camlStdlib__Domain__body_706 + 135
frame #5: 0x00000001000b7dc4 lazy_lin_reduced.exe`caml_start_program + 112
frame #6: 0x000000010009364e lazy_lin_reduced.exe`caml_callback_exn(closure=<unavailable>, arg=1) at callback.c:169:1 [opt]
frame #7: 0x0000000100093af9 lazy_lin_reduced.exe`caml_callback(closure=<unavailable>, arg=1) at callback.c:253:34 [opt]
frame #8: 0x0000000100096151 lazy_lin_reduced.exe`domain_thread_func(v=<unavailable>) at domain.c:1085:5 [opt]
frame #9: 0x00007fff2063b8fc libsystem_pthread.dylib`_pthread_start + 224
frame #10: 0x00007fff20637443 libsystem_pthread.dylib`thread_start + 15
thread #4, name = 'Backup3'
frame #0: 0x00007fff206084ba libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff206392ab libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_wait + 76
frame #2: 0x00007fff20637192 libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_slow + 204
frame #3: 0x0000000100097078 lazy_lin_reduced.exe`backup_thread_func [inlined] caml_plat_lock(m=0x000000010014aca8) at platform.h:144:21 [opt]
frame #4: 0x0000000100097070 lazy_lin_reduced.exe`backup_thread_func(v=0x000000010014abe8) at domain.c:975:9 [opt]
frame #5: 0x00007fff2063b8fc libsystem_pthread.dylib`_pthread_start + 224
frame #6: 0x00007fff20637443 libsystem_pthread.dylib`thread_start + 15
thread #5, name = 'Domain2'
frame #0: 0x00000001000969ca lazy_lin_reduced.exe`caml_try_run_on_all_domains_with_spin_work [inlined] caml_wait_interrupt_serviced at domain.c:342:14 [opt]
frame #1: 0x00000001000969b7 lazy_lin_reduced.exe`caml_try_run_on_all_domains_with_spin_work(handler=(lazy_lin_reduced.exe`caml_stw_empty_minor_heap at minor_gc.c:721), data=<unavailable>, leader_setup=<unavailable>, enter_spin_callback=<unavailable>, enter_spin_data=0x0000000000000000) at domain.c:1429:5 [opt]
frame #2: 0x00000001000acc1d lazy_lin_reduced.exe`caml_empty_minor_heaps_once [inlined] caml_try_stw_empty_minor_heap_on_all_domains at minor_gc.c:758:10 [opt]
frame #3: 0x00000001000acbf1 lazy_lin_reduced.exe`caml_empty_minor_heaps_once at minor_gc.c:778:5 [opt]
frame #4: 0x00000001000961d8 lazy_lin_reduced.exe`domain_thread_func [inlined] domain_terminate at domain.c:1654:5 [opt]
frame #5: 0x0000000100096151 lazy_lin_reduced.exe`domain_thread_func(v=<unavailable>) at domain.c:1086:5 [opt]
frame #6: 0x00007fff2063b8fc libsystem_pthread.dylib`_pthread_start + 224
frame #7: 0x00007fff20637443 libsystem_pthread.dylib`thread_start + 15
thread #6, name = 'Backup2'
frame #0: 0x00007fff206084ba libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff206392ab libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_wait + 76
frame #2: 0x00007fff20637192 libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_slow + 204
frame #3: 0x0000000100097078 lazy_lin_reduced.exe`backup_thread_func [inlined] caml_plat_lock(m=0x000000010014ab60) at platform.h:144:21 [opt]
frame #4: 0x0000000100097070 lazy_lin_reduced.exe`backup_thread_func(v=0x000000010014aaa0) at domain.c:975:9 [opt]
frame #5: 0x00007fff2063b8fc libsystem_pthread.dylib`_pthread_start + 224
frame #6: 0x00007fff20637443 libsystem_pthread.dylib`thread_start + 15
(lldb)
And here's another one with the debug runtime, which stops with EXC_BREAKPOINT:
$ lldb _build/default/src/lazy/lazy_lin_reduced.exe
(lldb) target create "_build/default/src/lazy/lazy_lin_reduced.exe"
Current executable set to '/Users/jmi/software/ocaml-04-28-2022-11213/multicoretests/_build/default/src/lazy/lazy_lin_reduced.exe' (x86_64).
(lldb) run
Process 1548 launched: '/Users/jmi/software/ocaml-04-28-2022-11213/multicoretests/_build/default/src/lazy/lazy_lin_reduced.exe' (x86_64)
### OCaml runtime: debug mode ###
0 t
1 t
2 t
3 t
4 t
5 t
6 CamlinternalLazy.Undefined
7 CamlinternalLazy.Undefined
8 Process 1548 stopped
* thread #5, name = 'Domain3', stop reason = EXC_BREAKPOINT (code=EXC_I386_BPT, subcode=0x0)
frame #0: 0x00000001000b9cf8 lazy_lin_reduced.exe`caml_c_call + 48
lazy_lin_reduced.exe`caml_c_call:
-> 0x1000b9cf8 <+48>: movq 0x20(%r14), %r11
0x1000b9cfc <+52>: movq (%r11), %r11
0x1000b9cff <+55>: cmpq %r11, 0x8(%rsp)
0x1000b9d04 <+60>: je 0x1000b9d07 ; <+63>
Target 0: (lazy_lin_reduced.exe) stopped.
(lldb) bt all
lazy_lin_reduced.exe was compiled with optimization - stepping may behave oddly; variables may not be available.
thread #1, name = 'Domain0', queue = 'com.apple.main-thread'
frame #0: 0x00000001000b1380 lazy_lin_reduced.exe`pool_sweep(local=<unavailable>, plist=<unavailable>, sz=2, release_to_global_pool=1) at shared_heap.c:0:5 [opt]
frame #1: 0x00000001000b0d94 lazy_lin_reduced.exe`caml_sweep(local=0x0000000111808200, work=512) at shared_heap.c:545:7 [opt]
frame #2: 0x00000001000a8e00 lazy_lin_reduced.exe`major_collection_slice(howmuch=<unavailable>, participant_count=0, barrier_participants=0x0000000000000000, mode=Slice_opportunistic) at major_gc.c:1208:14 [opt]
frame #3: 0x0000000100094e18 lazy_lin_reduced.exe`handle_incoming at domain.c:1248:9 [opt]
frame #4: 0x0000000100094dd6 lazy_lin_reduced.exe`handle_incoming(s=<unavailable>) at domain.c:305:5 [opt]
frame #5: 0x0000000100097145 lazy_lin_reduced.exe`caml_handle_gc_interrupt [inlined] caml_handle_incoming_interrupts at domain.c:318:3 [opt]
frame #6: 0x0000000100097130 lazy_lin_reduced.exe`caml_handle_gc_interrupt at domain.c:1531:5 [opt]
frame #7: 0x00000001000b2649 lazy_lin_reduced.exe`caml_process_pending_actions at signals.c:236:3 [opt]
frame #8: 0x00000001000b9785 lazy_lin_reduced.exe`caml_garbage_collection at signals_nat.c:104:7 [opt]
frame #9: 0x00000001000b9ba1 lazy_lin_reduced.exe`caml_call_gc + 241
frame #10: 0x00000001000552cd lazy_lin_reduced.exe`camlStdlib__Domain__join_713 + 173
frame #11: 0x0000000100007fdd lazy_lin_reduced.exe`camlDune__exe__Lazy_lin_reduced__lin_prop_domain_754 + 301
frame #12: 0x0000000100008ccf lazy_lin_reduced.exe`camlUtil__repeat_268 + 95
frame #13: 0x0000000100008172 lazy_lin_reduced.exe`camlDune__exe__Lazy_lin_reduced__exec_test_802 + 146
frame #14: 0x000000010003ae78 lazy_lin_reduced.exe`camlStdlib__List__map_483 + 56
frame #15: 0x000000010003ae8f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #16: 0x000000010003ae8f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #17: 0x000000010003ae8f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #18: 0x000000010003ae8f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #19: 0x000000010003ae8f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #20: 0x000000010003ae8f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #21: 0x000000010003ae8f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #22: 0x000000010003ae8f lazy_lin_reduced.exe`camlStdlib__List__map_483 + 79
frame #23: 0x0000000100008c3c lazy_lin_reduced.exe`camlDune__exe__Lazy_lin_reduced__entry + 2012
frame #24: 0x0000000100002610 lazy_lin_reduced.exe`caml_program + 752
frame #25: 0x00000001000b9e02 lazy_lin_reduced.exe`caml_start_program + 150
frame #26: 0x00000001000b955b lazy_lin_reduced.exe`caml_main [inlined] caml_startup(argv=<unavailable>) at startup_nat.c:136:7 [opt]
frame #27: 0x00000001000b9554 lazy_lin_reduced.exe`caml_main(argv=<unavailable>) at startup_nat.c:142:3 [opt]
frame #28: 0x00000001000a794c lazy_lin_reduced.exe`main(argc=<unavailable>, argv=<unavailable>) at main.c:37:3 [opt]
frame #29: 0x00007fff20656f3d libdyld.dylib`start + 1
thread #2, name = 'Backup0'
frame #0: 0x00007fff206084ba libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff206392ab libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_wait + 76
frame #2: 0x00007fff20637192 libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_slow + 204
frame #3: 0x0000000100097848 lazy_lin_reduced.exe`backup_thread_func [inlined] caml_plat_lock(m=0x000000010014e9f0) at platform.h:144:21 [opt]
frame #4: 0x0000000100097839 lazy_lin_reduced.exe`backup_thread_func(v=0x000000010014e930) at domain.c:975:9 [opt]
frame #5: 0x00007fff2063b8fc libsystem_pthread.dylib`_pthread_start + 224
frame #6: 0x00007fff20637443 libsystem_pthread.dylib`thread_start + 15
thread #3, name = 'Domain2'
frame #0: 0x0000000100096eea lazy_lin_reduced.exe`caml_try_run_on_all_domains_with_spin_work [inlined] caml_wait_interrupt_serviced at domain.c:342:14 [opt]
frame #1: 0x0000000100096ed7 lazy_lin_reduced.exe`caml_try_run_on_all_domains_with_spin_work(handler=(lazy_lin_reduced.exe`caml_stw_empty_minor_heap at minor_gc.c:721), data=<unavailable>, leader_setup=<unavailable>, enter_spin_callback=<unavailable>, enter_spin_data=0x00000001000adce0) at domain.c:1429:5 [opt]
frame #2: 0x00000001000add86 lazy_lin_reduced.exe`caml_empty_minor_heaps_once [inlined] caml_try_stw_empty_minor_heap_on_all_domains at minor_gc.c:758:10 [opt]
frame #3: 0x00000001000add5a lazy_lin_reduced.exe`caml_empty_minor_heaps_once at minor_gc.c:778:5 [opt]
frame #4: 0x0000000100096434 lazy_lin_reduced.exe`domain_thread_func [inlined] domain_terminate at domain.c:1654:5 [opt]
frame #5: 0x00000001000963ac lazy_lin_reduced.exe`domain_thread_func(v=<unavailable>) at domain.c:1086:5 [opt]
frame #6: 0x00007fff2063b8fc libsystem_pthread.dylib`_pthread_start + 224
frame #7: 0x00007fff20637443 libsystem_pthread.dylib`thread_start + 15
thread #4, name = 'Backup2'
frame #0: 0x00007fff206084ba libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff206392ab libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_wait + 76
frame #2: 0x00007fff20637192 libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_slow + 204
frame #3: 0x0000000100097848 lazy_lin_reduced.exe`backup_thread_func [inlined] caml_plat_lock(m=0x000000010014ec80) at platform.h:144:21 [opt]
frame #4: 0x0000000100097839 lazy_lin_reduced.exe`backup_thread_func(v=0x000000010014ebc0) at domain.c:975:9 [opt]
frame #5: 0x00007fff2063b8fc libsystem_pthread.dylib`_pthread_start + 224
frame #6: 0x00007fff20637443 libsystem_pthread.dylib`thread_start + 15
* thread #5, name = 'Domain3', stop reason = EXC_BREAKPOINT (code=EXC_I386_BPT, subcode=0x0)
* frame #0: 0x00000001000b9cf8 lazy_lin_reduced.exe`caml_c_call + 48
frame #1: 0x000000010002f69f lazy_lin_reduced.exe`camlStdlib__output_substring_258 + 79
frame #2: 0x000000010007bb2e lazy_lin_reduced.exe`camlStdlib__Format__buffered_out_flush_1279 + 94
frame #3: 0x000000010007f48e lazy_lin_reduced.exe`camlStdlib__Format__flush_standard_formatters_2002 + 62
frame #4: 0x0000000100054d89 lazy_lin_reduced.exe`camlStdlib__Domain__new_exit_673 + 41
frame #5: 0x00000001000550f7 lazy_lin_reduced.exe`camlStdlib__Domain__body_706 + 135
frame #6: 0x00000001000b9e02 lazy_lin_reduced.exe`caml_start_program + 150
frame #7: 0x000000010009334f lazy_lin_reduced.exe`caml_callback_exn(closure=<unavailable>, arg=1) at callback.c:169:1 [opt]
frame #8: 0x00000001000938f9 lazy_lin_reduced.exe`caml_callback(closure=<unavailable>, arg=1) at callback.c:253:34 [opt]
frame #9: 0x00000001000963ac lazy_lin_reduced.exe`domain_thread_func(v=<unavailable>) at domain.c:1085:5 [opt]
frame #10: 0x00007fff2063b8fc libsystem_pthread.dylib`_pthread_start + 224
frame #11: 0x00007fff20637443 libsystem_pthread.dylib`thread_start + 15
thread #6, name = 'Backup3'
frame #0: 0x00007fff206084ba libsystem_kernel.dylib`__psynch_mutexwait + 10
frame #1: 0x00007fff206392ab libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_wait + 76
frame #2: 0x00007fff20637192 libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_slow + 204
frame #3: 0x0000000100097848 lazy_lin_reduced.exe`backup_thread_func [inlined] caml_plat_lock(m=0x000000010014edc8) at platform.h:144:21 [opt]
frame #4: 0x0000000100097839 lazy_lin_reduced.exe`backup_thread_func(v=0x000000010014ed08) at domain.c:975:9 [opt]
frame #5: 0x00007fff2063b8fc libsystem_pthread.dylib`_pthread_start + 224
frame #6: 0x00007fff20637443 libsystem_pthread.dylib`thread_start + 15
(lldb)