Date: Sun, 19 Oct 2008 02:19:28 +0200
From: "Ivan Voras" <ivoras@freebsd.org>
To: "Dan Nelson" <dnelson@allantgroup.com>
Cc: freebsd-hackers@freebsd.org
Subject: Re: Pipes, cat buffer size
Message-ID: <9bbcef730810181719x4387a14yec74bdb6893d1a2a@mail.gmail.com>
In-Reply-To: <20081018231201.GM99270@dan.emsphone.com>
References: <gddjoj$apg$1@ger.gmane.org> <20081018213502.GL99270@dan.emsphone.com> <gddmip$im0$1@ger.gmane.org> <20081018231201.GM99270@dan.emsphone.com>

2008/10/19 Dan Nelson <dnelson@allantgroup.com>:
> In the last episode (Oct 19), Ivan Voras said:
>> Of course. But that's not the point :) From what I see (didn't look at
>> the code), Linux for example does some kind of internal buffering that
>> decouples how the reader and the writer interact. I think that with
>> FreeBSD's current behaviour the writer could write 1-byte buffers and
>> the reader will be forced to read each byte individually. I don't know
>> if there's some ulterior reason for this.
>
> No; take a look at /sys/kern/sys_pipe.c . Depending on how much data
> is in the pipe, it switches between async in-kernel buffering (<8192
> bytes), and direct page wiring between sender and receiver (basically
> zero-copy).

Ok, maybe it's just not behaving as I thought it should. See this test program:

----
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>     /* read() */

#define BSIZE (1024*1024)

int main(void) {
    ssize_t r;
    char buf[BSIZE];

    /* Report the size returned by each read() on stdin until EOF or error. */
    while (1) {
        r = read(0, buf, BSIZE);
        fprintf(stderr, "read %zd bytes\n", r);
        if (r <= 0)
            break;
    }
    return 0;
}
----

and this command line:

> dd bs=1 if=/dev/zero| ./reader

The output of this on RELENG_7 is:

read 8764 bytes
read 1 bytes
read 1 bytes
read 1 bytes
read 1 bytes
read 1 bytes
read 1 bytes
read 1 bytes
read 1 bytes
read 1 bytes
read 1 bytes
read 1 bytes
read 1 bytes
read 1 bytes
read 1 bytes
read 1 bytes
read 1 bytes
read 1 bytes
read 1 bytes
...

The first value puzzles me - so it actually is doing some kind of buffering.

Linux isn't actually much better, but the intention is there:

$ dd if=/dev/zero bs=1 | ./bla
read 1 bytes
read 38 bytes
read 8 bytes
read 2 bytes
read 2 bytes
read 2 bytes
read 2 bytes
read 4 bytes
read 3 bytes
read 2 bytes
read 2 bytes
read 2 bytes
read 2 bytes
read 2 bytes
read 2 bytes
read 3 bytes
read 3 bytes
read 112 bytes
read 2 bytes
read 2 bytes
...

Maybe FreeBSD switches between the writer and the reader too soon so the buffer doesn't get filled?

Using cat (which started all this), FreeBSD consistently processes 4096 byte buffers, while Linux's sizes are all over the place - from 4 kB to 1 MB, randomly fluctuating.

My goal would be (if it's possible - it might not be) to maximize coalescing in an environment where the reader does something with the data (e.g. compression) so there should be a reasonable amount of backlogged input data. But if it works in general, it may simply be that it isn't really applicable to my purpose (and I should modify the reader to read multiple blocks). Though it won't help me, I still think that modifying cat is worth it :)
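
One way to "modify the reader to read multiple blocks" is to loop on read() until a target amount has accumulated. The following is only a minimal sketch under stated assumptions: the helper name read_full() and its return convention are hypothetical and do not come from cat or from any FreeBSD source.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not an existing libc or FreeBSD function):
 * keep calling read() on fd until `want` bytes are collected, EOF is
 * reached, or an error occurs. Returns the number of bytes stored in
 * buf (possibly fewer than `want` at EOF), or -1 if an error occurred
 * before any byte was read.
 */
ssize_t read_full(int fd, char *buf, size_t want)
{
    size_t got = 0;

    assert(buf != NULL);
    while (got < want) {
        ssize_t r = read(fd, buf + got, want - got);
        if (r == 0)
            break;                  /* EOF: return what we have */
        if (r < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return got > 0 ? (ssize_t)got : -1;
        }
        got += (size_t)r;           /* short read: keep accumulating */
    }
    return (ssize_t)got;
}
```

A reader would call read_full(0, buf, BSIZE) in place of a bare read(0, buf, BSIZE) and hand each filled buffer to the compressor. The trade-off is latency: the call blocks until the whole block has arrived or the writer closes the pipe, so it only suits readers that do not need to react to partial input.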