Date: Fri, 29 May 2026 19:51:11 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 295707] aio_write: O_APPEND write ordering guarantee is not enforced
Message-ID: <bug-295707-227@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=295707

            Bug ID: 295707
           Summary: aio_write: O_APPEND write ordering guarantee is not
                    enforced
           Product: Base System
           Version: 15.0-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: i.maximets@ovn.org
 Attachment #271331 text/plain
         mime type:

Created attachment 271331
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=271331&action=edit
reproducer

The man page [1] says:

  If O_APPEND is set for iocb->aio_fildes, write operations append to the
  file in the same order as the calls were made.

[1] https://man.freebsd.org/cgi/man.cgi?query=aio_write

Open vSwitch is using aio for logging, and we see a fairly frequent log reordering or even interleaving in our tests on FreeBSD. The reason seems to be that the kernel doesn't actually enforce the ordering for O_APPEND on the same file descriptor. It appears that kernel workers just pick up new requests whenever they can and so the writes end up out of order on systems with more than one core.

AFAICT, POSIX technically has an exemption for the ordering rule for multiprocessor systems. However, this is not mentioned in the man page and the spirit of the exemption seems to actually be an exceptional case and not a general rule for how things should work. And aio in general would not be very useful if we had to wait for every request to be completed before submitting a new one.

Attached a relatively simple reproducer program that mimics the usage pattern we have in OVS. It makes 50K writes with numbered lines with at most 256 requests in-flight at the same time. A ring buffer is used to track the requests. On EAGAIN - waits for one request to be done and tries again. At the end checks the file for the order of the written lines and the correctness of the written data.

This test always passes on Linux, which has the same ordering claim in their man page, but different implementation, of course.
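For readers without the attachment at hand, the final order check described above might look roughly like the sketch below. This is not the attached reproducer: the line format ("seq <N>"), the struct, and the function name check_order are assumptions made for illustration. The only thing taken from the report is the shape of the diagnostics, which looks consistent with comparing each line against the previous sequence number plus one.

```c
#include <assert.h>
#include <stdio.h>

/* Counters mirroring the "N lines, N reordered, N corrupted" summary.
 * Hypothetical struct; the real reproducer may track this differently. */
struct check_result {
    long lines;
    long reordered;
    long corrupted;
};

/* Scan a file of numbered lines and count those that do not follow their
 * predecessor.  Assumes each line starts with "seq <N>" and that numbering
 * starts at 0; both are guesses, not facts from the attachment. */
static struct check_result
check_order(FILE *f)
{
    struct check_result r = {0, 0, 0};
    char buf[512];
    long expected = 0;

    assert(f != NULL);

    while (fgets(buf, sizeof buf, f) != NULL) {
        long got;

        r.lines++;
        if (sscanf(buf, "seq %ld", &got) != 1) {
            /* Unparseable line: count as corrupted, keep expectation. */
            r.corrupted++;
            continue;
        }
        if (got != expected) {
            printf("REORDERED at line %ld: expected seq %ld, got %ld\n",
                   r.lines, expected, got);
            r.reordered++;
        }
        /* The next line is expected to follow whatever was just seen. */
        expected = got + 1;
    }
    return r;
}
```

With this scheme a single displaced line is typically reported more than once (where it is missing, where it shows up, and on the line after it), so the "reordered" count is an upper bound on the number of writes that actually landed out of place.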
On FreeBSD the test fails in our CI with ~25% of rows getting reordered:

  $ clang -o aio-append aio-append.c
  $ ./aio-append
  REORDERED at line 2: expected seq 1, got 2
  REORDERED at line 5: expected seq 5, got 1
  REORDERED at line 6: expected seq 2, got 5
  REORDERED at line 11: expected seq 10, got 11
  REORDERED at line 12: expected seq 12, got 10
  REORDERED at line 13: expected seq 11, got 12
  REORDERED at line 27: expected seq 26, got 27
  REORDERED at line 28: expected seq 28, got 29
  REORDERED at line 31: expected seq 32, got 26
  REORDERED at line 32: expected seq 27, got 28
  50000 lines, 13445 reordered, 0 corrupted

While, I guess, that can be fixed by updating the docs while still being sort of POSIX compliant, would be nice to actually have kernel enforcing the currently documented behavior.

WDYT?

--
You are receiving this mail because:
You are the assignee for the bug.
