Date: Mon, 17 Feb 2014 15:10:28 -0700
From: Alan Somers <asomers@freebsd.org>
To: FreeBSD Net <freebsd-net@freebsd.org>, rwatson@freebsd.org
Subject: kern/185812: send(2) on a UNIX domain SEQPACKET socket returns EMSGSIZE instead of EAGAIN
Message-ID: <CAOtMX2imCCeKHW3FALdU1mD927H45zvF94snRb4eSVKbK8fxeg@mail.gmail.com>
SOCK_SEQPACKET Unix domain sockets don't work on FreeBSD. kern/185812 is one of the reasons. When send(2) ought to block on a seqpacket socket, it returns EMSGSIZE instead. If the socket is nonblocking, send(2) returns EMSGSIZE instead of EAGAIN. The problem can be demonstrated on FreeBSD 10 or head by the ATF testcase sys/kern/unix_seqpacket_test:eagain_8k_8k.

The problem dates to an old hack. It's at least as old as 4.4BSD Lite. When you write to a Unix domain socket, the data goes directly to the receiving socket's sockbuf, bypassing the sending socket's sockbuf. However, sosend_generic doesn't know anything about Unix domain sockets, and it doesn't know anything about the receiving socket. Without some form of backpressure, sosend_generic would never block. So, uipc_send lowers the _sending_ sockbuf's sb_hiwat to account for whatever it wrote to the _receiving_ sockbuf. (For those not in the know, sb_hiwat is the maximum allowed amount of data in the buffer.) The next time that sosend_generic gets called, it sees that the sending sockbuf is empty, but that it has a lower maximum size than before. If the maximum size is 0, sosend_generic will block.

This hack worked fine for SOCK_STREAM sockets, but it breaks SOCK_SEQPACKET sockets, since the latter consist of messages that must be sent atomically. When sb_hiwat is too low to fit an entire message, sosend_generic will return EMSGSIZE instead of blocking or returning EAGAIN.

Fortunately, we have a template for how to fix this bug. DragonFlyBSD fixed it back in 2008. Instead of applying backpressure through sb_hiwat, it uses a new sockbuf flag called SSB_STOP. When the receiving sockbuf runs out of space, uipc_send sets SSB_STOP on the sending sockbuf. Then, sosend_generic will block (or return EAGAIN) on the next attempt to write. This solution is very clean and simple. It might also be slightly faster than the legacy method, because it eliminates the need to call chgsbsize() on every send() and recv().
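
To make the difference between the two schemes concrete, here is a rough toy model in userland C. It is only a sketch, not kernel code: struct toy_sockbuf and the toy_* functions are hypothetical names invented for illustration, only the sb_hiwat check is modeled (sb_lowat, sb_mbmax and partial stream writes are ignored), and EAGAIN stands in for "block, or return EAGAIN if nonblocking".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for the sending sockbuf; NOT the kernel's struct sockbuf. */
struct toy_sockbuf {
	size_t	hiwat;	/* nominal maximum size, playing the role of sb_hiwat */
	int	stop;	/* plays the role of DragonFly's SSB_STOP flag */
};

/*
 * Legacy scheme: after delivering len bytes straight into the receiver,
 * the sender's limit is lowered by the same amount.
 */
static void
toy_account_legacy(struct toy_sockbuf *snd, size_t len)
{
	assert(len <= snd->hiwat);
	snd->hiwat -= len;
}

/*
 * Legacy check. The sending buffer itself is always empty, so the only
 * backpressure signal is how far hiwat has been lowered.
 */
static int
toy_check_legacy(const struct toy_sockbuf *snd, size_t len, int atomic)
{
	if (atomic && len > snd->hiwat)
		return (EMSGSIZE);	/* looks like "message can never fit" */
	if (!atomic && snd->hiwat == 0)
		return (EAGAIN);	/* stream socket: block or EAGAIN */
	return (0);
}

/*
 * Flag-based check. hiwat is never lowered, so EMSGSIZE is reserved for
 * messages that really are larger than the buffer's nominal size.
 */
static int
toy_check_stop(const struct toy_sockbuf *snd, size_t len, int atomic)
{
	if (atomic && len > snd->hiwat)
		return (EMSGSIZE);
	if (snd->stop)
		return (EAGAIN);	/* receiver is full: block or EAGAIN */
	return (0);
}
```

In this model, an 8k atomic message sent through an 8k buffer succeeds once under either scheme. After that, the legacy check reports EMSGSIZE because hiwat has dropped to 0, whereas the flag-based check reports EAGAIN once the stop flag has been set.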

I am aware of one drawback: since ssb_space() will only ever return 0 or ssb_hiwat, sosend_generic will allow the sockbuf to exceed its nominal maximum size by at most one packet of size less than ssb_hiwat. I don't think that's a serious problem. In fact, I'm not even positive that FreeBSD guarantees a socket will always stay within its nominal size limit.

Does this solution sound acceptable in FreeBSD? Is there any reason that I shouldn't port it? Note that DragonFly long ago refactored struct sockbuf into two separate structures: struct sockbuf and struct signalsockbuf. I won't make that change as part of the port.

In case you're wondering, NetBSD 6.0 suffers from the same bug, OpenBSD 5.4 doesn't appear to support SOCK_SEQPACKET Unix domain sockets, and Linux 3.2.0 does not suffer from it.

The relevant commit in DragonFlyBSD:
https://github.com/DragonFlyBSD/DragonFlyBSD/commit/3a6117bbe0ed6a87605c1e43e12a1438d8844380

-Alan
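
P.S. For anyone who wants to poke at the behavior without running the ATF suite, something along these lines should do. It is a minimal sketch, not the ATF testcase itself: seqpacket_fill_errno is a hypothetical helper name, the socket buffer sizes are left at their defaults, and the 100000-iteration bound is an arbitrary safety limit. It fills a nonblocking seqpacket socketpair until send(2) fails and reports the errno of the failing call.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Send pktsize-byte messages on a nonblocking AF_UNIX SOCK_SEQPACKET
 * socketpair until send(2) fails. Returns the errno of the failing send,
 * or -1 if setup failed or the loop bound was reached. The number of
 * messages sent successfully is stored in *sent_out if it is non-NULL.
 */
static int
seqpacket_fill_errno(size_t pktsize, int *sent_out)
{
	char buf[8192];
	int sv[2], flags, sent = 0, err = -1;

	if (pktsize > sizeof(buf))
		return (-1);
	memset(buf, 0, sizeof(buf));

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) != 0)
		return (-1);

	/* Make the sending side nonblocking so a full socket cannot hang us. */
	flags = fcntl(sv[0], F_GETFL, 0);
	if (flags == -1 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) == -1)
		goto out;

	/* Nobody reads from sv[1], so the socket must eventually fill up. */
	while (sent < 100000) {
		if (send(sv[0], buf, pktsize, 0) == -1) {
			err = errno;
			break;
		}
		sent++;
	}
out:
	close(sv[0]);
	close(sv[1]);
	if (sent_out != NULL)
		*sent_out = sent;
	return (err);
}
```

A correct implementation should return EAGAIN here. On a system with the bug described above, I would expect EMSGSIZE instead.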