Date: Fri, 27 Jul 2012 16:52:22 +0800
From: David Xu <listlog2011@gmail.com>
To: Bruce Evans <brde@optusnet.com.au>
Cc: Garrett Cooper <yanegomi@gmail.com>, freebsd-bugs@FreeBSD.org, freebsd-gnats-submit@FreeBSD.org
Subject: Re: kern/170203: [kern] piped dd's don't behave sanely when dealing with a fifo
Message-ID: <501256C6.5000307@gmail.com>
In-Reply-To: <20120727103622.B933@besplex.bde.org>
References: <201207262256.q6QMurVf077480@red.freebsd.org> <20120727103622.B933@besplex.bde.org>
On 2012/7/27 10:07, Bruce Evans wrote:
> On Thu, 26 Jul 2012, Garrett Cooper wrote:
>
>>> Description:
>> Creating a fifo and then dd'ing across the fifo using /dev/zero
>> doesn't seem to yield the behavior one would expect to have; dd
>> should either exit thanks to SIGPIPE being sent or the count being
>> completed.
>>
>> Furthermore, the count is bogus:
>>
>> Terminal 1:
>>
>> $ dd if=fifo bs=512k count=4
>> 0+4 records in
>> 0+4 records out
>> 32768 bytes transferred in 0.002121 secs (15449523 bytes/sec)
>> $ dd if=fifo bs=512k count=4
>> 0+4 records in
>> 0+4 records out
>> 32768 bytes transferred in 0.001483 secs (22096295 bytes/sec)
>> ...
>
> I think it's working almost as expected.  Large blocks give non-atomic
> I/O, so the reader sees small blocks, then EOF when it gets ahead of
> the writer.  This always happens without SMP.
>
> Not is a bug (debugged below).  There is no SIGPIPE at the start of
> write() because there is a reader then, and no SIGPIPE for the next
> write() because there is no next write() -- the current one doesn't
> notice when the reader goes away.
>

After fixing dd so that it no longer opens the fifo output file in
O_RDWR mode, I still found that the writer stays blocked even though
the reader has already exited.  I think this is definitely a bug: if
the reader has exited, the writer should be aborted too, but I found it
still blocked in state "pipdwt".  Obviously, the code in
/sys/fs/fifo_vnops.c wants to wake up the writer when the reader closes
the fifo, but it fails to do so: the writer forgot to set the
PIPE_WANTW flag bit, so the close path skips wakeup(), and the writer
never gets a chance to see that the EOF flag bit is set.
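To illustrate the O_RDWR point outside of dd, here is a minimal Python sketch. It is not the dd or kernel code, and the function name write_after_reader_exit is made up for this example. It assumes a system where opening a fifo with O_RDWR is allowed (POSIX leaves that undefined; FreeBSD and Linux permit it). It only covers a write() that starts after the reader is gone, not a write() that is already blocked, which is the case the kernel patch is about.

```python
import errno
import os
import tempfile


def write_after_reader_exit(writer_flags):
    """Open a fifo for writing, let the only other reader go away, then write.

    Returns the number of bytes written, or the errno name if write() fails.
    CPython ignores SIGPIPE at startup, so a failed write() raises an
    OSError with EPIPE instead of killing the process.
    """
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "fifo")
        os.mkfifo(path)
        # A non-blocking read-only open returns at once, even with no writer.
        rfd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        # A reader exists now, so this open does not block either.
        wfd = os.open(path, writer_flags)
        os.close(rfd)  # the separate reader goes away
        try:
            return os.write(wfd, b"\0" * 512)
        except OSError as e:
            return errno.errorcode[e.errno]
        finally:
            os.close(wfd)


if __name__ == "__main__":
    # Write-only descriptor: no reader is left, so write() fails with EPIPE.
    print("O_WRONLY:", write_after_reader_exit(os.O_WRONLY))
    # Read-write descriptor: it counts as its own reader, so write() succeeds.
    print("O_RDWR:  ", write_after_reader_exit(os.O_RDWR))
```

With O_RDWR the writer never learns that the other side has gone away, which matches the "dd can always talk to itself" explanation quoted below.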
I had to apply the following two patches to make the bug go away:

http://people.freebsd.org/~davidxu/patch/fifopipe/kernel_pipe.diff
http://people.freebsd.org/~davidxu/patch/fifopipe/dd.diff

> This is what happens under FreeBSD-~5.2 with the old fifo implementation,
> at least.  It also shows a bug in truss(1) -- the current write() is not
> shown, because it hasn't returned.  kdump shows that the write() has
> started but not returned.
>
>> $ dd if=fifo bs=512M count=4
>> 0+4 records in
>> 0+4 records out
>> 32768 bytes transferred in 0.003908 secs (8384514 bytes/sec)
>>
>> Terminal 2:
>>
>> $ dd if=/dev/zero bs=512k count=4 of=fifo
>> ^T
>> load: 0.40 cmd: dd 1779 [sbwait] 2.63r 0.00u 0.00s 0% 1800k
>
> FreeBSD-~5.2 shows [runnable] for the wait channel.  This is
> strange.  dd should be blocked waiting for a reader, and only
> sbwait makes sense for that.  FreeBSD-9 apparently doesn't
> have the new named pipe implementation either.  -current shows
> [pipdwt].  This makes it clearer that it is waiting in write()
> and not in open().  dd probably does the wrong thing for
> fifos, by always trying to open files in O_RDWR mode first.
> This breaks the normal synchronization of readers and writers.
> In fact, this explains why there is no SIGPIPE -- there is
> always a reader since dd can always talk to itself.  First
> the open succeeds without blocking as expected.
>
> After changing the O_RDWR to O_WRONLY in FreeBSD-~5.2, dd almost
> works as expected.  The reader reads 4 blocks of size 8K and
> then exits.  The writer first blocks in open.  Then it is
> killed by SIGPIPE.
> Its SIGPIPE handling is broken (nonexistent),
> and the signal kills it without it printing a status message:
>
> %   1266 dd       RET   read 524288/0x80000
> %   1266 dd       CALL  write(0x4,0x8063000,0x80000)
> %   1266 dd       RET   write -1 errno 32 Broken pipe
> %   1266 dd       PSIG  SIGPIPE SIG_DFL
>
> The read is from /dev/zero.  The write is of 512K to the fifo.
> This delivers 4*8K then is killed.  If dd caught the signal
> like it should, then we would expect to see a short
> write().  The signal handling should clear SA_RESTART, else
> the write() would be restarted and would deliver endless
> SIGPIPEs, now for failing writes.  Reporting of short writes
> is quite broken and this is an interesting test for it.
>
> -current delivers 4*64K instead of 4*8K.  This is because
> the i/o unit is BIG_PIPE_SIZE = 64K for nameless pipes and
> now for named pipes.  Apparently the unit is 8K for
> sockets.  I think the unit of atomicity is only 512 bytes
> for both.  Certainly, PIPE_BUF is still 512 in limits.h.
> I think limits.h is broken since the unit isn't actually
> 512 bytes for _all_ file types.  For sockets, you can control
> the watermarks and I think this changes the unit of atomicity.
> I wonder if the socket ioctls for this worked with the old named
> pipe implementation.
>
> The pipe wait channel names are less than perfect.  "pipdw"
> means "pipe direct write".  "wt" looks like an abbreviation
> for "write", but there are 3 waits in pipe_direct_write()
> and they are distinguished by the suffixes "w", "c" and "t".
> It isn't clear what these mean.
>
>>> How-To-Repeat:
>> mkfifo fifo
>>
>> Terminal 1:
>>
>> dd if=fifo bs=512k count=4
>>
>> Terminal 2:
>>
>> dd if=/dev/zero bs=512k count=4 of=fifo
>
> Remember to kill the writing dd if you stop it with ^Z.  Otherwise, since
> the unhacked version is talking to itself, the fifo acts strangely for
> other tests.
>
> conv=block and conv=noerror (with cbs=512k) change the behaviour only
> slightly (slightly worse).  What works easily is omitting the count.
> dd then reads until EOF, in 256 records of size exactly 8K each under
> FreeBSD-~5.2.  Not giving the count is normal practice, since you
> rarely know the block size for pipes and many other file types.  If
> there is another bug here, then it is conv=foo not working.  But
> reblocking is confusing, and I probably did it wrong.
>
> Another thing that doesn't work well here is trying to control the
> writer with SIGPIPE from the reader.  Even if you can get the reblocking
> right and read precisely 2MB, and fix SIGPIPE, then the SIGPIPE may be
> delivered after the writer has dirtied the fifo with a little more than
> 2MB.  The unread data then remains to bite the next reader.
>
> Bruce
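The "omit the count and read until EOF" approach can be sketched in Python as follows. This is only an illustration under stated assumptions, not what dd does internally: the names read_until_eof, _writer, BLOCK and COUNT are invented for the example, and the writer runs in a thread instead of a second terminal. Each read() asks for 512K but normally gets much less, so the reader counts bytes rather than read() calls.

```python
import os
import tempfile
import threading

BLOCK = 512 * 1024  # same size as bs=512k in the report
COUNT = 4           # same as count=4 on the writing side


def _writer(path):
    # A write-only open blocks until the reader has opened its end.
    fd = os.open(path, os.O_WRONLY)
    try:
        block = b"\0" * BLOCK
        for _ in range(COUNT):
            view = memoryview(block)
            while view:  # write() may accept only part of the block
                view = view[os.write(fd, view):]
    finally:
        os.close(fd)  # once the last writer closes, the reader sees EOF


def read_until_eof():
    """Return (total_bytes, number_of_read_calls) for one fifo transfer."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "fifo")
        os.mkfifo(path)
        t = threading.Thread(target=_writer, args=(path,))
        t.start()
        fd = os.open(path, os.O_RDONLY)  # blocks until the writer opens
        total = reads = 0
        try:
            while True:
                chunk = os.read(fd, BLOCK)  # may return fewer than BLOCK bytes
                if not chunk:
                    break  # EOF: every writer has closed the fifo
                total += len(chunk)
                reads += 1
        finally:
            os.close(fd)
        t.join()
        return total, reads


if __name__ == "__main__":
    total, reads = read_until_eof()
    print(total, "bytes in", reads, "reads")
```

The byte total is the full 2MB regardless of how the kernel splits the transfer; the number of reads depends on the pipe buffer size of the system it runs on.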