From owner-freebsd-stable@FreeBSD.ORG Sat Mar 25 10:39:46 2006
Return-Path:
X-Original-To: stable@freebsd.org
Delivered-To: freebsd-stable@FreeBSD.ORG
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 60A5216A401;
	Sat, 25 Mar 2006 10:39:46 +0000 (UTC)
	(envelope-from peterjeremy@optushome.com.au)
Received: from mail14.syd.optusnet.com.au (mail14.syd.optusnet.com.au [211.29.132.195])
	by mx1.FreeBSD.org (Postfix) with ESMTP id AB5D743D48;
	Sat, 25 Mar 2006 10:39:45 +0000 (GMT)
	(envelope-from peterjeremy@optushome.com.au)
Received: from turion.vk2pj.dyndns.org (c220-239-19-236.belrs4.nsw.optusnet.com.au [220.239.19.236])
	by mail14.syd.optusnet.com.au (8.12.11/8.12.11) with ESMTP id k2PAdSDs029257
	(version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO);
	Sat, 25 Mar 2006 21:39:29 +1100
Received: from turion.vk2pj.dyndns.org (localhost.vk2pj.dyndns.org [127.0.0.1])
	by turion.vk2pj.dyndns.org (8.13.4/8.13.4) with ESMTP id k2PAdSc5005884;
	Sat, 25 Mar 2006 21:39:28 +1100 (EST)
	(envelope-from peter@turion.vk2pj.dyndns.org)
Received: (from peter@localhost)
	by turion.vk2pj.dyndns.org (8.13.4/8.13.4/Submit) id k2PAdRpa005883;
	Sat, 25 Mar 2006 21:39:27 +1100 (EST)
	(envelope-from peter)
Date: Sat, 25 Mar 2006 21:39:27 +1100
From: Peter Jeremy
To: Mikhail Teterin
Message-ID: <20060325103927.GE703@turion.vk2pj.dyndns.org>
References: <200603232352.k2NNqPS8018729@gate.bitblocks.com>
	<200603241518.01027.mi+mx@aldan.algebra.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200603241518.01027.mi+mx@aldan.algebra.com>
X-PGP-Key: http://members.optusnet.com.au/peterjeremy/pubkey.asc
User-Agent: Mutt/1.5.11
Cc: alc@freebsd.org, stable@freebsd.org
Subject: Re: Reading via mmap stinks (Re: weird bugs with mmap-ing via NFS)
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Sat, 25 Mar 2006 10:39:46 -0000

On Fri, 2006-Mar-24 15:18:00 -0500, Mikhail Teterin wrote:
>which there is not with the read. Read also requires fairly large buffers in
>the user space to be efficient -- *in addition* to the buffers in the kernel.

I disagree.  With a filesystem read, the kernel is solely responsible for
handling physical I/O with an efficient buffer size.  The userland buffers
simply amortise the cost of the system call and copyout overheads.

>I'm also quite certain, that fulfilling my "demands" would add quite a bit of
>complexity to the mmap support in kernel, but hey, that's what the kernel is
>there for :-)

Unfortunately, your patches to implement this seem to have become detached
from your e-mail. :-)

>Unlike grep, which seems to use only 32k buffers anyway (and does not use
>madvise -- see attachment), my program mmaps gigabytes of the input file at
>once, trusting the kernel to do a better job at reading the data in the most
>efficient manner :-)

mmap can lend itself to cleaner implementations because there's no need
to have a nested loop to read buffers and then process them.  You can
mmap the entire file and process it.  One downside is that on a 32-bit
architecture, this limits you to processing files that are somewhat
less than 2GB.  Another downside is that touching an uncached page
triggers a trap, which may not be as efficient as reading a block of
data through the filesystem interface, and I/O errors are delivered via
signals (which may not be as easy to handle).

>Peter Jeremy wrote:
>> On an amd64 system running about 6-week old -stable, both ['grep' and 'grep
>> --mmap' -mi] behave pretty much identically.
>
>Peter, I read grep's source -- it is not using madvise (because it hurts
>performance on SunOS-4.1!) and reads in chunks of 32k anyway. Would you care
>to look at my program instead?
>Thanks:
>
>	http://aldan.algebra.com/mzip.c

fetch: http://aldan.algebra.com/mzip.c: Not Found

I tried writing a program that just mmap'd my entire (2GB) test file
and summed all the longwords in it.  This gave me similar results to
grep.  Setting MADV_SEQUENTIAL and/or MADV_WILLNEED made no noticeable
difference.  I suspect something about your code or system is disabling
the mmap read-ahead functionality.

What happens if you simulate read-ahead yourself?  Have your main
program fork and the child access pages slightly ahead of the parent
but do nothing else.

-- 
Peter Jeremy
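
For anyone wanting to repeat the experiment described above, here is a
minimal sketch in C.  It is not the program used for the measurements in
this message.  The function name mmap_sum, its parameters and the 4096-byte
fallback page size are all illustrative assumptions.  It maps the whole
file, passes one madvise() hint, sums the 64-bit words and, if asked, forks
a child that only touches one byte per page.  As a simplification, the
child is not kept "slightly ahead" of the parent; it simply runs from the
start of the mapping as fast as the I/O allows.

```c
#define _DEFAULT_SOURCE          /* exposes madvise() on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Illustrative helper (hypothetical name): sum every 64-bit word of a
 * file through a single mmap() of the whole file.
 *   advice   - a madvise() hint such as MADV_NORMAL, MADV_SEQUENTIAL
 *              or MADV_WILLNEED
 *   prefetch - non-zero: fork a child that touches one byte per page
 * Returns 0 and stores the sum in *out, or -1 on error.
 * An I/O error while touching a mapped page is not reported here; it
 * arrives as a signal (typically SIGBUS), which this sketch does not
 * handle.
 */
int mmap_sum(const char *path, int advice, int prefetch, uint64_t *out)
{
    assert(path != NULL && out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    void *base = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping outlives the descriptor */
    if (base == MAP_FAILED)
        return -1;

    (void)madvise(base, len, advice);    /* only a hint; failure is ignored */

    pid_t child = -1;
    if (prefetch) {
        child = fork();              /* on failure we just run without it */
        if (child == 0) {
            /* Child: fault pages in, do nothing else, then exit. */
            const volatile unsigned char *p = base;
            long pg = sysconf(_SC_PAGESIZE);
            size_t page = pg > 0 ? (size_t)pg : 4096;
            for (size_t off = 0; off < len; off += page)
                (void)p[off];
            _exit(0);
        }
    }

    /* Parent: sum whole 64-bit words; any trailing bytes are skipped. */
    const uint64_t *words = base;
    size_t nwords = len / sizeof(uint64_t);
    uint64_t sum = 0;
    for (size_t i = 0; i < nwords; i++)
        sum += words[i];

    if (child > 0)
        (void)waitpid(child, NULL, 0);
    munmap(base, len);
    *out = sum;
    return 0;
}
```

Timing a call such as mmap_sum(path, MADV_SEQUENTIAL, 0, &sum) against
the same call with the prefetch argument set to 1, on a file that is not
already cached, would show whether the forked reader changes anything on
a given system.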