A RetroSearch Logo

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Search Query:

Showing content from https://github.com/ocaml/ocaml/issues/13586 below:

`flush` can raise `Sys_error` when used in parallel · Issue #13586 · ocaml/ocaml · GitHub

We've encountered that Stdlib/Out_channel.flush may raise a Sys_error exception when used in parallel with a close.
Consider this reproducer program:

let path = Filename.temp_file "stm-" ""

let test () =
  let channel = Atomic.make (Out_channel.open_text path) in

  let wait = Atomic.make true in

  let d1 = Domain.spawn (fun () ->
      while Atomic.get wait do Domain.cpu_relax() done;
      Out_channel.close (Atomic.get channel)) in

  let d2 = Domain.spawn (fun () ->
      while Atomic.get wait do Domain.cpu_relax() done;
      (try
        Out_channel.flush (Atomic.get channel)
       with (Sys_error msg) -> Printf.printf "Out_channel.flush raised Sys_error %S\n%!" msg; assert false)) in

  (* Output a sufficiently long string *)
  Out_channel.output_string (Atomic.get channel) (String.make (65531 (*65536*)) 'a');
  (* Let the races begin! *)
  Atomic.set wait false;
  let () = Domain.join d1 in
  let () = Domain.join d2 in
  (* Please leave the torture chamber nice and clean as you found it *)
  (try Out_channel.close (Atomic.get channel) with Sys_error _ -> ());
  Sys.remove path

let _ =
  for i = 1 to 50_000 do
    if i mod 250 = 0 then Printf.printf "#%!";
    test ()
  done

with this behaviour:

$ ocamlopt -g flushexc.ml
$ ./a.out
##############################################################################################################Out_channel.flush raised Sys_error "Bad file descriptor"
Fatal error: exception File "flushexc.ml", line 21, characters 94-100: Assertion failed

The documentation does not warn of unsynchronized access to Out_channel operations, as they are generally protected by internal channel locks.

The documentation for close mentions that flush should not raise Sys_error when called on a closed channel, but does
strictly speaking not specify the behaviour of flush in parallel with close:

val close : t -> unit
(** Close the given channel, flushing all buffered write operations.  Output
    functions raise a [Sys_error] exception when they are applied to a closed
    output channel, except {!close} and {!flush}, which do nothing when applied
    to an already closed channel.  Note that {!close} may raise [Sys_error] if
    the operating system signals an error when flushing or closing. *)

The problem is that flush on a still-open channel ends up in caml_flush_partial which calls check_pending that may
temporarily unlock the channel to process pending actions. This creates a small window for caml_ml_close_channel to lock and close the
underlying file descriptor, and offset channel->curr by 1 from channel->buff into dummy_buff. As a result, when caml_flush_partial resumes, it will attempt to output 1 character, fail, and raise an exception.

Before jumping into discussions of fixes, we may want to discuss how to proceed.
Here's a few suggestions:

Note: the above example program can easily be modified to race on two closes also resulting
in a Sys_error. The last sentence of the documentation could account for that behaviour
though.

Thanks to @ncik-roberts for figuring out the above explanation.

CC to @damiendoligez who last had his fingers in these parts in #12678


RetroSearch is an open source project built by @garambo | Open a GitHub Issue

Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo

HTML: 3.2 | Encoding: UTF-8 | Version: 0.7.4