Date:      Fri, 27 Jul 2012 12:07:07 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Garrett Cooper <yanegomi@gmail.com>
Cc:        freebsd-bugs@FreeBSD.org, freebsd-gnats-submit@FreeBSD.org
Subject:   Re: kern/170203: [kern] piped dd's don't behave sanely when dealing with a fifo
Message-ID:  <20120727103622.B933@besplex.bde.org>
In-Reply-To: <201207262256.q6QMurVf077480@red.freebsd.org>
References:  <201207262256.q6QMurVf077480@red.freebsd.org>

On Thu, 26 Jul 2012, Garrett Cooper wrote:

>> Description:
> Creating a fifo and then dd'ing across the fifo using /dev/zero doesn't seem to yield the behavior one would expect to have; dd should either exit thanks to SIGPIPE being sent or the count being completed.
>
> Furthermore, the count is bogus:
>
> Terminal 1:
>
> $ dd if=fifo bs=512k count=4
> 0+4 records in
> 0+4 records out
> 32768 bytes transferred in 0.002121 secs (15449523 bytes/sec)
> $ dd if=fifo bs=512k count=4
> 0+4 records in
> 0+4 records out
> 32768 bytes transferred in 0.001483 secs (22096295 bytes/sec)
> ...

I think it's working almost as expected.  Large blocks give non-atomic
I/O, so the reader sees small blocks, then EOF when it gets ahead of
the writer.  This always happens without SMP.
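
A rough illustration of the short-read effect, written in Python rather
than dd's C.  It uses a nameless pipe instead of a fifo to stay
self-contained, and the function name and sizes are made up for the
sketch.  The reader asks for a 512K block but should only get what the
writer has managed to put in the pipe, which dd would count as a
partial record:

```python
import os

def short_read(available=8192, requested=512 * 1024):
    """Write `available` bytes into a pipe, then ask for `requested` bytes.

    Hypothetical helper: returns the number of bytes the single read()
    actually delivered.
    """
    rfd, wfd = os.pipe()
    try:
        # 8K is assumed to fit in the pipe buffer, so this does not block.
        os.write(wfd, b"\0" * available)
        # read() returns what is available instead of waiting for a full block.
        data = os.read(rfd, requested)
        return len(data)
    finally:
        os.close(rfd)
        os.close(wfd)

print(short_read())
```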

The missing SIGPIPE is a bug (debugged below).  There is no SIGPIPE at the start of
write() because there is a reader then, and no SIGPIPE for the next
write() because there is no next write() -- the current one doesn't
notice when the reader goes away.

This is what happens under FreeBSD-~5.2 with the old fifo implementation,
at least.  It also shows a bug in truss(1) -- the current write() is not
shown, because it hasn't returned.  kdump shows that the write() has
started but not returned.

> $ dd if=fifo bs=512M count=4
> 0+4 records in
> 0+4 records out
> 32768 bytes transferred in 0.003908 secs (8384514 bytes/sec)
>
> Terminal 2:
>
> $ dd if=/dev/zero bs=512k count=4 of=fifo
> ^T
> load: 0.40  cmd: dd 1779 [sbwait] 2.63r 0.00u 0.00s 0% 1800k

FreeBSD-~5.2 shows [runnable] for the wait channel.  This is
strange.  dd should be blocked waiting for a reader, and only
sbwait makes sense for that.  FreeBSD-9 apparently doesn't
have the new named pipe implementation either.  -current shows
[pipdwt].  This makes it clearer that it is waiting in write()
and not in open().  dd probably does the wrong thing for
fifos, by always trying to open files in O_RDWR mode first.
This breaks the normal synchronization of readers and writers.
In fact, this explains why there is no SIGPIPE -- there is
always a reader since dd can always talk to itself.  First
the open succeeds without blocking as expected.
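
The talking-to-itself effect can be sketched in a few lines of Python
(not dd's code; the helper name is invented here).  POSIX leaves O_RDWR
on a fifo undefined, so this assumes a system such as FreeBSD or Linux
where it is allowed.  The open returns without any other process, and
the write succeeds because the same descriptor counts as a reader:

```python
import os
import tempfile

def fifo_rdwr_selftalk(payload=b"\0" * 512):
    """Open a fresh fifo O_RDWR, write to it and read the data back.

    Hypothetical helper: returns (bytes_written, data_read_back).
    """
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "fifo")
    os.mkfifo(path)
    # With O_RDWR the open does not wait for a peer (assumed behaviour,
    # undefined by POSIX), since this descriptor is both reader and writer.
    fd = os.open(path, os.O_RDWR)
    try:
        written = os.write(fd, payload)       # no EPIPE: we are our own reader
        echoed = os.read(fd, len(payload))    # the data comes straight back
        return written, echoed
    finally:
        os.close(fd)
        os.unlink(path)
        os.rmdir(tmpdir)

n, echoed = fifo_rdwr_selftalk()
print(n, len(echoed))
```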

After changing the O_RDWR to O_WRONLY in FreeBSD-~5.2, dd almost
works as expected.  The reader reads 4 blocks of size 8K and
then exits.  The writer first blocks in open.  Then it is
killed by SIGPIPE.  Its SIGPIPE handling is broken (nonexistent),
and the signal kills it without it printing a status message:

%   1266 dd       RET   read 524288/0x80000
%   1266 dd       CALL  write(0x4,0x8063000,0x80000)
%   1266 dd       RET   write -1 errno 32 Broken pipe
%   1266 dd       PSIG  SIGPIPE SIG_DFL

The read is from /dev/zero.  The write is of 512K to the fifo.
This delivers 4*8K then is killed.  If dd caught the signal
like it should, then we would expect to see a short
write().  The signal handling should clear SA_RESTART, else
the write() would be restarted and would deliver endless
SIGPIPEs, now for failing writes.  Reporting of short writes
is quite broken and this is an interesting test for it.
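
For comparison, a sketch of a writer that survives the reader going
away and can still report what it wrote.  This is Python, not a patch
for dd, and the helper name is hypothetical.  It relies on CPython
ignoring SIGPIPE at startup, so the failed write() shows up as EPIPE
(BrokenPipeError) instead of killing the process:

```python
import errno
import os
import tempfile

def write_after_reader_exit(block=b"\0" * 8192):
    """Write one block with a reader present and one after it has gone.

    Hypothetical helper: returns (bytes_written, errno of the failed
    write, or 0 if nothing failed).
    """
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "fifo")
    os.mkfifo(path)
    # Non-blocking read-only open so one process can hold both ends.
    rfd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    # A reader exists now, so the write-only open does not block.
    wfd = os.open(path, os.O_WRONLY)
    written, err = 0, 0
    try:
        written += os.write(wfd, block)   # reader still there: succeeds
        os.close(rfd)                     # the reader goes away
        rfd = -1
        try:
            written += os.write(wfd, block)
        except BrokenPipeError as exc:
            err = exc.errno               # EPIPE, as in the kdump output above
        return written, err
    finally:
        if rfd != -1:
            os.close(rfd)
        os.close(wfd)
        os.unlink(path)
        os.rmdir(tmpdir)

written, err = write_after_reader_exit()
print("%d bytes written, errno %d (%s)" % (written, err, errno.errorcode.get(err)))
```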

-current delivers 4*64K instead of 4*8K.  This is because
the i/o unit is BIG_PIPE_SIZE = 64K for nameless pipes and
now for named pipes.  Apparently the unit is 8K for
sockets.  I think the unit of atomicity is only 512 bytes
for both.  Certainly, PIPE_BUF is still 512 in limits.h.
I think limits.h is broken since the unit isn't actually
512 bytes for _all_ file types.  For sockets, you can control
the watermarks and I think this changes the unit of atomicity.
I wonder if the socket ioctls for this worked with the old named pipe
implementation.
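
The advertised atomicity limit can be checked at run time instead of
being read from limits.h.  A small Python sketch (the helper name is
made up) that asks fpathconf() for PIPE_BUF on a fresh nameless pipe;
POSIX only guarantees that the answer is at least 512, and the value
says nothing about the larger i/o unit discussed above:

```python
import os

def pipe_buf_for_new_pipe():
    """Return PIPE_BUF as reported by fpathconf() for a new pipe."""
    rfd, wfd = os.pipe()
    try:
        # Writes of at most this many bytes are supposed to be atomic.
        return os.fpathconf(wfd, "PC_PIPE_BUF")
    finally:
        os.close(rfd)
        os.close(wfd)

print(pipe_buf_for_new_pipe())
```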

The pipe wait channel names are less than perfect.  "pipdw"
means "pipe direct write".  "wt" looks like an abbreviation
for "write", but there are 3 waits in pipe_direct_write()
and they are distinguished by the suffixes "w", "c" and "t".
It isn't clear what these mean.

>> How-To-Repeat:
> mkfifo fifo
>
> Terminal 1:
>
> dd if=fifo bs=512k count=4
>
> Terminal 2:
>
> dd if=/dev/zero bs=512k count=4 of=fifo

Remember to kill the writing dd if you stop it with ^Z.  Otherwise, since
the unhacked version is talking to itself, the fifo acts strangely for
other tests.

conv=block and conv=noerror (with cbs=512k) change the behaviour only
slightly (slightly worse).  What works easily is omitting the count.
dd then reads until EOF, in 256 records of size exactly 8K each under
FreeBSD-~5.2.  Not giving the count is normal practice, since you
rarely know the block size for pipes and many other file types.  If
there is another bug here, then it is conv=foo not working.  But
reblocking is confusing, and I probably did it wrong.
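
Reading until EOF without a count amounts to a loop like the following
Python sketch.  It is only an approximation of dd's record accounting,
the function name and sizes are invented, and a nameless pipe stands in
for the fifo.  Full and partial records are counted the way dd reports
them ("full+partial records in"):

```python
import os

def read_until_eof(total=10000, bs=4096):
    """Put `total` bytes in a pipe, close the writer, read in `bs` blocks.

    Hypothetical helper: returns (full_records, partial_records, bytes_read).
    """
    rfd, wfd = os.pipe()
    try:
        # `total` is assumed small enough to fit in the pipe buffer.
        os.write(wfd, b"\0" * total)
    finally:
        os.close(wfd)                 # closing the writer lets the reader see EOF
    full = partial = nbytes = 0
    try:
        while True:
            data = os.read(rfd, bs)
            if not data:              # zero-length read means EOF
                break
            nbytes += len(data)
            if len(data) == bs:
                full += 1
            else:
                partial += 1
    finally:
        os.close(rfd)
    return full, partial, nbytes

full, partial, nbytes = read_until_eof()
print("%d+%d records in, %d bytes" % (full, partial, nbytes))
```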

Another thing that doesn't work well here is trying to control the
writer with SIGPIPE from the reader.  Even if you can get the reblocking
right and read precisely 2MB, and fix SIGPIPE, then the SIGPIPE may be
delivered after the writer has dirtied the fifo with a little more than
2MB.  The unread data then remains to bite the next reader.

Bruce


