From: Maxim Sobolev
Organization: Porta Software Ltd
Date: Wed, 03 Aug 2005 16:55:33 +0300
To: "current@freebsd.org"
Subject: Sub-optimal libc's read-ahead buffering behaviour
Message-ID: <42F0CCD5.9090200@portaone.com>
Reply-To: Maxim.Sobolev@portaone.com
List-Id: Discussions about the use of FreeBSD-current

Hi,

I have found a scenario in which our libc behaves utterly suboptimally.
Consider the following piece of code, which reads and processes every other 512-byte block in a file (error handling intentionally omitted):

    FILE *f;
    int i;
    char buf[512];

    f = fopen(...);
    for (i = 0; feof(f) == 0; i++) {
        fread(buf, sizeof(buf), 1, f);
        do_process(buf);
        fseek(f, i * 2 * sizeof(buf), SEEK_SET);
    }

What I have discovered in this case is that libc reads 4096 bytes from the file for *each* fread(3) call, even though it could issue just one actual read(2) for every fourth fread(3) and satisfy the rest from its internal 4096-byte buffer. However, if I replace fseek(3) with just another dummy fread(3), everything works as expected: libc does only one read for every 8 fread(3) calls (4 dummy and 4 real).

Is this something that should be fixed, or are there some subtle reasons for the current behaviour?

The following piece of code illustrates the problem:

    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        FILE *f;
        int i;
        char buf[512];

        f = fopen("/dev/zero", "r");
        for (i = 0; i < 16; i++) {
            fread(buf, sizeof(buf), 1, f);
            if (argc == 1)
                fread(buf, sizeof(buf), 1, f);
            else
                fseek(f, i * 2 * sizeof(buf), SEEK_SET);
        }
        exit(0);
    }

When run with no arguments, the relevant truss output looks like this:

    open("/dev/zero",0x0,0666) = 3 (0x3)
    fstat(3,0xbfbfe900) = 0 (0x0)
    readlink("/etc/malloc.conf",0xbfbfe8c0,63) ERR#2 'No such file or directory'
    issetugid() = 0 (0x0)
    mmap(0x0,4096,(0x3)PROT_READ|PROT_WRITE,(0x1002)MAP_ANON|MAP_PRIVATE,-1,0x0) = 1209335808 (0x48150000)
    break(0x804b000) = 0 (0x0)
    break(0x804c000) = 0 (0x0)
    ioctl(3,TIOCGETA,0xbfbfe940) ERR#19 'Operation not supported by device'
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    exit(0x0)

When I specify an argument, it becomes:

    open("/dev/zero",0x0,0666) = 3 (0x3)
    fstat(3,0xbfbfe900) = 0 (0x0)
    readlink("/etc/malloc.conf",0xbfbfe8c0,63) ERR#2 'No such file or directory'
    issetugid() = 0 (0x0)
    mmap(0x0,4096,(0x3)PROT_READ|PROT_WRITE,(0x1002)MAP_ANON|MAP_PRIVATE,-1,0x0) = 1209335808 (0x48150000)
    break(0x804b000) = 0 (0x0)
    break(0x804c000) = 0 (0x0)
    ioctl(3,TIOCGETA,0xbfbfe940) ERR#19 'Operation not supported by device'
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0x0,SEEK_SET) = 0 (0x0)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0x400,SEEK_SET) = 1024 (0x400)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0x800,SEEK_SET) = 2048 (0x800)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0xc00,SEEK_SET) = 3072 (0xc00)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0x1000,SEEK_SET) = 4096 (0x1000)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0x1400,SEEK_SET) = 5120 (0x1400)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0x1800,SEEK_SET) = 6144 (0x1800)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0x1c00,SEEK_SET) = 7168 (0x1c00)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0x2000,SEEK_SET) = 8192 (0x2000)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0x2400,SEEK_SET) = 9216 (0x2400)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0x2800,SEEK_SET) = 10240 (0x2800)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0x2c00,SEEK_SET) = 11264 (0x2c00)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0x3000,SEEK_SET) = 12288 (0x3000)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0x3400,SEEK_SET) = 13312 (0x3400)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0x3800,SEEK_SET) = 14336 (0x3800)
    read(0x3,0x804b000,0x1000) = 4096 (0x1000)
    lseek(3,0x3c00,SEEK_SET) = 15360 (0x3c00)
    exit(0x0)

The output speaks for itself (32 syscalls instead of 4)!

-Maxim
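
The traces in this message show that a dummy fread(3) is served from the stdio buffer, while each fseek(3) is followed by a fresh 4096-byte read(2). That suggests a possible application-side workaround for short forward skips: read and discard the unwanted bytes instead of seeking. The following is only a minimal sketch, not a fix for libc itself. The helper name skip_forward is hypothetical, and the 512-byte scratch size is an assumption chosen to match the block size used in the example.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libc): advance the stream by `count`
 * bytes by reading and discarding them rather than calling fseek(3).
 * Returns the number of bytes actually skipped; this is less than
 * `count` only if EOF or a read error was hit first.
 */
static size_t
skip_forward(FILE *f, size_t count)
{
	char scratch[512];	/* assumed scratch size, matches the example's block size */
	size_t skipped = 0;

	while (skipped < count) {
		size_t want = count - skipped;
		size_t got;

		if (want > sizeof(scratch))
			want = sizeof(scratch);
		got = fread(scratch, 1, want, f);
		skipped += got;
		if (got < want)
			break;	/* EOF or read error: stop early */
	}
	return (skipped);
}
```

In the loop from the example, the fseek(3) call would become skip_forward(f, sizeof(buf)). This is only likely to pay off when the skipped distance is small relative to the stdio buffer; for long jumps, reading and discarding data will probably cost more than a seek, and it cannot move backwards at all.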