Date:      Sat, 7 Mar 2020 00:33:02 -0500
From:      Keno Fischer <keno@juliacomputing.com>
To:        Konstantin Belousov <kib@freebsd.org>
Cc:        freebsd-hackers@freebsd.org, Elliot Saba <elliot.saba@juliacomputing.com>
Subject:   Re: FreeBSD Pipe behavior in pipe OOM situations
Message-ID:  <CABV8kRxvYvZtsPA4o5Hugx%2BPBOdE6akdgRFbLNT9r4D3D=Z-DA@mail.gmail.com>
In-Reply-To: <20200304233906.GB98340@kib.kiev.ua>
References:  <CABV8kRy2Uu6fZwQR37135LvgUCxYFd6eiNt4NMQLg_jpHq42Lg@mail.gmail.com> <20200304233906.GB98340@kib.kiev.ua>

Hi Konstantin,

Thanks for getting back to me.

> First, there is a requirement that an atomic write size exists, i.e. writes
> less than SC_PIPE_BUF are guaranteed to not interleave if succeeded.  Our
> PIPE_BUF is 512 bytes.
>

Useful to know, thanks.
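
For anyone following along, here is a minimal sketch of how a program could
ask the OS for this limit rather than hard-coding it. It assumes Python on a
POSIX system; the helper name pipe_buf_size is made up for illustration, and
querying through fpathconf with the PC_PIPE_BUF name is my choice, not
something taken from the quoted text.

```python
import os


def pipe_buf_size():
    """Return the atomic-write limit (PIPE_BUF) reported for a fresh pipe.

    Hypothetical helper for illustration only.
    """
    r, w = os.pipe()
    try:
        # PC_PIPE_BUF is the pathconf name for the largest write that is
        # guaranteed not to interleave with writes from other processes.
        return os.fpathconf(w, "PC_PIPE_BUF")
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print("PIPE_BUF for this pipe:", pipe_buf_size())
```

Writes at or below the reported size are the ones covered by the atomicity
guarantee; larger writes may be split and interleaved.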


> I think that unexplained blocking (it is very hard to track down such
> state) is worse then ENOMEM outcome.
>

This is probably true, but the ENOMEM behavior is by no means benign.
If a process exhausts all pipe KVA, the next process that tries to allocate
a pipe and write to it will probably crash. That could be basically anything
in the system, from the ssh server to (in our case) the infrastructure that
runs jobs.
Of course, arguably the same thing happens in a regular ENOMEM situation,
but between paging and user-space monitoring of memory usage, such
situations seem easier to manage. I guess there may also be a concern about
DoS-style attacks: it seems pretty easy to allocate enough pipes to exhaust
that limit.

Regardless, I just wanted to raise this, since I considered the behavior odd
and we didn't see it elsewhere. We have since found the culprit for the OOM
condition: one of our tests tries to provoke an EMFILE condition to test our
handling of this corner case, so it just fills every unallocated fd with
pipes. However, since it doesn't write to them, we never actually see a
failure there. Instead, some random other process in the system will crash:
sometimes another test run (where we saw the error), but occasionally also
the CI system itself. I suspect this is responsible for a fair number of
mysterious failures we observed.

From our perspective this issue should be resolved. I guess I'll leave it to
you to decide whether there's anything to be done about the denial of
service concern. I don't know what guarantees FreeBSD makes for kernel
resource usage, particularly in the context of jails, so I don't know if
this is of concern at all. In regular usage, without a malicious program (or
well-it's-malicious-but-it's-a-test-script), the admin would probably just
bump the sysctl and everything would keep running nicely.
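
To make the pattern concrete, here is a rough sketch of what such a test
does. This is not our actual test code: it is written in Python for brevity,
the function name fill_fds_with_pipes and the limit of 64 are invented for
illustration, and it lowers the soft RLIMIT_NOFILE first so that it opens only
a few dozen pipes instead of thousands.

```python
import errno
import os
import resource


def fill_fds_with_pipes(soft_limit=64):
    """Open pipes until os.pipe() fails.

    Returns (errno_name, pipes_opened). Hypothetical helper, not the
    original test; soft_limit is an arbitrary small value chosen so the
    sketch stays cheap to run.
    """
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Only ever lower the soft limit; keep it if it is already smaller.
    if old_soft == resource.RLIM_INFINITY or old_soft > soft_limit:
        new_soft = soft_limit
    else:
        new_soft = old_soft
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, old_hard))

    fds = []
    try:
        while True:
            try:
                r, w = os.pipe()
            except OSError as exc:
                # EMFILE is what the test wants to provoke. Other errors
                # (e.g. ENOMEM) are reported by name as well.
                name = errno.errorcode.get(exc.errno, str(exc.errno))
                return name, len(fds) // 2
            # Nothing is ever written to these pipes; they are only held open.
            fds.extend((r, w))
    finally:
        # Release every descriptor and restore the original limit.
        for fd in fds:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))


if __name__ == "__main__":
    print(fill_fds_with_pipes())
```

The point is that the process holding the pipes sees only the error it
expects, because it never writes to them; any trouble shows up in whichever
other process needs a pipe next.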

Thanks again for your detailed reply.

Keno


