From owner-svn-src-all@freebsd.org  Sun Feb  4 05:15:23 2018
Return-Path: <owner-svn-src-all@freebsd.org>
Delivered-To: svn-src-all@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5835EEE7C24;
 Sun,  4 Feb 2018 05:15:23 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from mail107.syd.optusnet.com.au (mail107.syd.optusnet.com.au
 [211.29.132.53])
 by mx1.freebsd.org (Postfix) with ESMTP id 9F09F6DBA0;
 Sun,  4 Feb 2018 05:15:21 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from [192.168.0.102] (c110-21-101-228.carlnfd1.nsw.optusnet.com.au
 [110.21.101.228])
 by mail107.syd.optusnet.com.au (Postfix) with ESMTPS id E80D6D4E645;
 Sun,  4 Feb 2018 16:15:16 +1100 (AEDT)
Date: Sun, 4 Feb 2018 16:15:16 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Conrad Meyer <cem@freebsd.org>
cc: Bruce Evans <brde@optusnet.com.au>, 
 src-committers <src-committers@freebsd.org>, svn-src-all@freebsd.org, 
 svn-src-head@freebsd.org
Subject: Re: svn commit: r327495 - head/usr.sbin/rpcbind
In-Reply-To: <CAG6CVpXodjf7HfagpaY6K1t69a1jx37FRqvQq2acg5eT0r1Z-A@mail.gmail.com>
Message-ID: <20180204150751.V909@besplex.bde.org>
References: <201801021725.w02HPDaj068477@repo.freebsd.org>
 <20180203232725.U1389@besplex.bde.org>
 <CAG6CVpXodjf7HfagpaY6K1t69a1jx37FRqvQq2acg5eT0r1Z-A@mail.gmail.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.2 cv=YbvN30Zf c=1 sm=1 tr=0
 a=PalzARQSbocsUSjMRkwAPg==:117 a=PalzARQSbocsUSjMRkwAPg==:17
 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=6COIcUhGSTRIru9ACGAA:9
 a=CjuIK1q_8ugA:10
X-BeenThere: svn-src-all@freebsd.org
X-Mailman-Version: 2.1.25
Precedence: list
List-Id: "SVN commit messages for the entire src tree \(except for &quot;
 user&quot; and &quot; projects&quot; \)" <svn-src-all.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/svn-src-all>,
 <mailto:svn-src-all-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/svn-src-all/>
List-Post: <mailto:svn-src-all@freebsd.org>
List-Help: <mailto:svn-src-all-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/svn-src-all>,
 <mailto:svn-src-all-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 04 Feb 2018 05:15:23 -0000

On Sat, 3 Feb 2018, Conrad Meyer wrote:

> On Sat, Feb 3, 2018 at 6:46 AM, Bruce Evans <brde@optusnet.com.au> wrote:
>> On Tue, 2 Jan 2018, Conrad Meyer wrote:
>>> ...
>> Today I learned from the POSIX list that the pselect() function is designed
>> to avoid this bug.  It takes a signal mask arg (like msleep() in the
>> kernel).
>>
>> I haven't noticed any function related to poll() or kevent that takes a
>> signal mask.
>
> There is the similar function ppoll(), although it complies only with
> the Linux standard, not POSIX.

ppoll() is relatively new in FreeBSD (2014-11-13).  Its man page says
that it is first appeared in 11.0, but it is now in 10.x, and according
to bsd-family-tree it actually appeared in the 10.2 release more than a
year before it appeared in the 11.0 release:
- 2014-11-13  pre-11.0 source tree commit
- 2014-12-21  10.x source tree commit
- 2015-08-13  10.2 release
- 2016-10-10  11.0 release.

> With kevent, you can simply mask all (or most) signals and watch on a
> signal event.  Conversion to kevent is more complicated, though.
>
>> Programs broken by buggy conversion:
>> ...
>> - rpcbind(8).  This is not interactive and normally doesn't use ttys
>>   which might block.  However, the -d flag makes it do fprintf() to
>>   stderr.  This may block forever (e.g., for flow control), and you
>>   would like to be able to kill it using a signal.  But write() will
>>   restart.  rpcbind also uses plain signal() and doesn't know anything
>>   about SA_RESTART.
>
> This was not broken by conversion -- it was already broken in this
> case.  If the signal delivery raced with an fprintf, we ended up
> invoking the stdio atexit handlers via exit(3) call in terminate(),
> which of course encountered corrupted state.  Now it is broken in a
> slightly different way, yes, if signal delivery races fprintf *and*
> fprintf is blocked in flow control.  This might happen with a slow
> serial for stderr but seems extraordinarily unlikely otherwise.

Unsafe signal handlers are broken by their existence, but sometimes work.

It is unlikely for ttys too.  I forgot that stdio is clueless about ttys
so it doesn't wait for output to drain even for stderr.  So the output is
normally buffered in the kernel and write() returns.

> ...
>> perror()
>> is broken as designed since it uses stdio, so it is unsafe in signal
>> handlers.  The err() family copies this bug.  Even *s*printf() is not
>> required to be safe in signal handlers.  I would fix the latter first.
>
> It does seem like the printf family of routines should be
> signal-handler safe.  Unfortunately, they are implemented in terms of
> the unsafe stdio streams :-(.

Even sprintf() uses the generic function __vfprintf() on a fake FILE.  I
think it has problems with locales.  No ctype functions are in the list
of async-signal-safe functions in at least the 2001 version of POSIX.  It
is easy to see why they might not be safe.  They access lots of static
storage, and setlocale() or a simpler function that changes state might
run in another or just a nested signal handler, so locking is required...

So the safe sprintf() would probably have to use the C locale.  The locale
is already passed to __vfprintf(), as needed for sprintf_l().

>...
>>> Modified: head/usr.sbin/rpcbind/rpcbind.c
>>> ...
>>> terminate(int signum)
>>> ...
>>> +       wr = write(terminate_wfd, &c, 1);
>>
>>
>> Not async-signal-safe.  Acccesses in signal handlers to objects with
>> static storage duration give undefined behaviour except for assignment to
>> objects of type volatile sig_atomic_t, but the access to terminate_wfd is
>> a read and the type is plain int.
>
> The type can be changed to volatile sig_atomic_t if you believe plain
> int will trigger nonsensical compiler behavior.  The value is
> initialized once, before the signal handler is registered, so unless
> the compiler does something totally insane it should be fine on all
> architectures FreeBSD runs on.

sig_atomic_t is no better than plain int.  This behaviour now makes complete
sense.  It is just like the undefined behaviour with the ctype functions,
except since we own terminate_wfd we can guarantee that it doesn't change
while the handler is active (and is valid when the handler is entered).
We could also use atomic ops.  However, the C standard doesn't require
anything that we do to work (except maybe in C11, atomic ops might be
explicitly or implicitly specifed to work for things like this).

>>> +       if (wr < 1)
>>> +               _exit(2);
>>
>>
>> Best to not check for write errors, since the error handling of using
>> _exit()
>> is worse than none.  It loses stdio flushing to handle an almost-harmless
>> error.  The main problem with keeping everything in a safe handler is that
>> it
>> is impossible to keep stdio flushing there and we would prefer to not lost
>> the stdio flushing.
>
> I don't necessarily agree.  If the write fails, we missed the signal
> telling us to terminate the program and will never exit.  That said,
> how would the write ever fail?

We still set the flag, so can only fail to exit() in the race case,
but the write() failure affects all cases.

I can't see how the write() can fail.  That was part of the excuse for
removing the check.

Bruce