From owner-svn-src-all@freebsd.org Sun Feb 4 05:15:23 2018 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5835EEE7C24; Sun, 4 Feb 2018 05:15:23 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail107.syd.optusnet.com.au (mail107.syd.optusnet.com.au [211.29.132.53]) by mx1.freebsd.org (Postfix) with ESMTP id 9F09F6DBA0; Sun, 4 Feb 2018 05:15:21 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au [110.21.101.228]) by mail107.syd.optusnet.com.au (Postfix) with ESMTPS id E80D6D4E645; Sun, 4 Feb 2018 16:15:16 +1100 (AEDT) Date: Sun, 4 Feb 2018 16:15:16 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Conrad Meyer cc: Bruce Evans , src-committers , svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r327495 - head/usr.sbin/rpcbind In-Reply-To: Message-ID: <20180204150751.V909@besplex.bde.org> References: <201801021725.w02HPDaj068477@repo.freebsd.org> <20180203232725.U1389@besplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=YbvN30Zf c=1 sm=1 tr=0 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=6COIcUhGSTRIru9ACGAA:9 a=CjuIK1q_8ugA:10 X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 04 Feb 2018 05:15:23 -0000 On Sat, 3 Feb 2018, Conrad Meyer wrote: > On Sat, Feb 3, 2018 at 6:46 AM, Bruce Evans wrote: >> On Tue, 2 Jan 2018, Conrad Meyer wrote: >>> ... >> Today I learned from the POSIX list that the pselect() function is designed >> to avoid this bug. It takes a signal mask arg (like msleep() in the >> kernel). >> >> I haven't noticed any function related to poll() or kevent that takes a >> signal mask. > > There is the similar function ppoll(), although it complies only with > the Linux standard, not POSIX. ppoll() is relatively new in FreeBSD (2014-11-13). Its man page says that it is first appeared in 11.0, but it is now in 10.x, and according to bsd-family-tree it actually appeared in the 10.2 release more than a year before it appeared in the 11.0 release: - 2014-11-13 pre-11.0 source tree commit - 2014-12-21 10.x source tree commit - 2015-08-13 10.2 release - 2016-10-10 11.0 release. > With kevent, you can simply mask all (or most) signals and watch on a > signal event. Conversion to kevent is more complicated, though. > >> Programs broken by buggy conversion: >> ... >> - rpcbind(8). This is not interactive and normally doesn't use ttys >> which might block. However, the -d flag makes it do fprintf() to >> stderr. This may block forever (e.g., for flow control), and you >> would like to be able to kill it using a signal. But write() will >> restart. rpcbind also uses plain signal() and doesn't know anything >> about SA_RESTART. > > This was not broken by conversion -- it was already broken in this > case. If the signal delivery raced with an fprintf, we ended up > invoking the stdio atexit handlers via exit(3) call in terminate(), > which of course encountered corrupted state. Now it is broken in a > slightly different way, yes, if signal delivery races fprintf *and* > fprintf is blocked in flow control. This might happen with a slow > serial for stderr but seems extraordinarily unlikely otherwise. Unsafe signal handlers are broken by their existence, but sometimes work. It is unlikely for ttys too. I forgot that stdio is clueless about ttys so it doesn't wait for output to drain even for stderr. So the output is normally buffered in the kernel and write() returns. > ... >> perror() >> is broken as designed since it uses stdio, so it is unsafe in signal >> handlers. The err() family copies this bug. Even *s*printf() is not >> required to be safe in signal handlers. I would fix the latter first. > > It does seem like the printf family of routines should be > signal-handler safe. Unfortunately, they are implemented in terms of > the unsafe stdio streams :-(. Even sprintf() uses the generic function __vfprintf() on a fake FILE. I think it has problems with locales. No ctype functions are in the list of async-signal-safe functions in at least the 2001 version of POSIX. It is easy to see why they might not be safe. They access lots of static storage, and setlocale() or a simpler function that changes state might run in another or just a nested signal handler, so locking is required... So the safe sprintf() would probably have to use the C locale. The locale is already passed to __vfprintf(), as needed for sprintf_l(). >... >>> Modified: head/usr.sbin/rpcbind/rpcbind.c >>> ... >>> terminate(int signum) >>> ... >>> + wr = write(terminate_wfd, &c, 1); >> >> >> Not async-signal-safe. Acccesses in signal handlers to objects with >> static storage duration give undefined behaviour except for assignment to >> objects of type volatile sig_atomic_t, but the access to terminate_wfd is >> a read and the type is plain int. > > The type can be changed to volatile sig_atomic_t if you believe plain > int will trigger nonsensical compiler behavior. The value is > initialized once, before the signal handler is registered, so unless > the compiler does something totally insane it should be fine on all > architectures FreeBSD runs on. sig_atomic_t is no better than plain int. This behaviour now makes complete sense. It is just like the undefined behaviour with the ctype functions, except since we own terminate_wfd we can guarantee that it doesn't change while the handler is active (and is valid when the handler is entered). We could also use atomic ops. However, the C standard doesn't require anything that we do to work (except maybe in C11, atomic ops might be explicitly or implicitly specifed to work for things like this). >>> + if (wr < 1) >>> + _exit(2); >> >> >> Best to not check for write errors, since the error handling of using >> _exit() >> is worse than none. It loses stdio flushing to handle an almost-harmless >> error. The main problem with keeping everything in a safe handler is that >> it >> is impossible to keep stdio flushing there and we would prefer to not lost >> the stdio flushing. > > I don't necessarily agree. If the write fails, we missed the signal > telling us to terminate the program and will never exit. That said, > how would the write ever fail? We still set the flag, so can only fail to exit() in the race case, but the write() failure affects all cases. I can't see how the write() can fail. That was part of the excuse for removing the check. Bruce