There are various uses of _unsafe
functions in the implementation of Buffer. In 4.x, it's impossible1 to write OCaml code which can
invalidate the checks done before these _unsafe
functions are used, but in 5.x it's relatively easy, e.g.:
let worker b n () = let old_size = ref (Buffer.length b) in let s = String.make 500 'x' in while true do let l = Buffer.length b in if !old_size <> l then begin Format.eprintf "%d size: %d\n%!" n l; old_size := l; end; let () = Buffer.reset b in try Buffer.add_string b s with e -> Printf.eprintf "%s\n%!" (Printexc.to_string e) done let _ = let buffer = Buffer.create 1024 in let _ = Domain.spawn (worker buffer 1) in let _ = Domain.spawn (worker buffer 2) in let _ = Domain.spawn (worker buffer 3) in let _ = Domain.spawn (worker buffer 4) in let _ = Domain.spawn (worker buffer 5) in let _ = Domain.spawn (worker buffer 6) in let _ = Domain.spawn (worker buffer 7) in let _ = Domain.spawn (worker buffer 8) in while true do () done
Obviously Buffers should not be accessed in parallel without being guarded by some kind of lock, but parallel accesses may happen by mistake and these should never cause the running program to segfault.
There are three possible solutions:
_unsafe
functions entirely. This will impose a measurable penalty on correct single-domain use of Buffer (which is why effort was put into switching to the _unsafe
functions before)bytes
value. In particular, relaxing the invariant on the cached length field and the position fields should yield a buffer which has a very similar fast path for the adding functions and only one additional check required for the less-frequent retrieval functions (contents
, etc.)I have a partial implementation of 3, but it's clearly not for 5.0 and, if we're going to do something that crazy, it will need benchmarks to justify it.
2 is straightforward, and I intend to open a PR for it - I'm just making sure there's a tracking issue, as I'm on vacation next week 🙂
(credit to @jmid's property testing work and @jonludlam for spinning the repro case out from it)
well, almost impossible: I think it's possible to engineer a program which might just manage to execute a parallel access to a Buffer which interleaves the checks with a reset (using either signals or some Gc evil), but the only reason this would happen is because you were actually writing a program which tried to do this, so it's much less relevant than in 5.x where such programs could be written by mistake. ↩
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4