From owner-freebsd-current@FreeBSD.ORG Tue Mar 21 21:24:02 2006
Date: Tue, 21 Mar 2006 13:23:22 -0800 (PST)
From: Matthew Dillon
Message-Id: <200603212123.k2LLNMhO006344@apollo.backplane.com>
To: Mikhail Teterin
References: <200603211607.30372.mi+mx@aldan.algebra.com>
Cc: alc@freebsd.org, current@freebsd.org
Subject: Re: weird bugs with mmap-ing via NFS
List-Id: Discussions about the use of FreeBSD-current

:Hello!
:
:I have a program, that writes a file via mmap. Normally the target is on a
:local filesystem, so there are no issues.
:
:Today, however, I tried running it on another machine writing via NFS.
:
:If the output share is mounted with default parameters, the writing succeeds,
:but involves very high READ bandwidth (the client is not reading anything).
:For example, here is the output of `netstat -1' on the client:
:
:            input        (Total)           output
:   packets  errs      bytes    packets  errs      bytes colls
:         2     0        152          0     0          0     0
:      3081     0    4369834        519     0      82006     0
:...

    You might be doing just writes to the mmap()'d memory, but the system
    doesn't know that. The moment you touch any mmap()'d page, reading or
    writing, the system has to fault it in, which means it has to read it
    and load valid data into the page.

:When I mount with large read and write sizes:
:
:	mount_nfs -r 65536 -w 65536 -U -ointr pandora:/backup /backup
:
:it changes -- for the worse. Short time into it -- the file stops growing
:according to the `ls -sl' run on the NFS server (pandora) at exactly 3200 FS
:blocks (the FS was created with `-b 65536 -f 8129').
:
:At the same time, according to `systat -if' on both client and server, the
:client continues to send (and the server continues to receive) about 30Mb of
:some (?) data per second.
:
:The client is the freshly rebuilt FreeBSD-6.1/i386 -- with alc's recent big
:MFC included. The server is an older 6.1/amd64 from Feb 7.
:
:Please, advise. Thanks!
:
:	-mi

    It kinda sounds like the buffer cache is getting blown out, but not
    having seen the program I can't really analyze it.

    It will always be more efficient to write to a file using write() than
    using mmap(), and it will always be far, far more efficient to write
    to an NFS file in NFS block-sized chunks rather than in smaller chunks,
    due to the way the buffer cache works.

    The only case that is optimized for write lengths less than the NFS
    block size is the file-append case. All other cases (when writing less
    than the NFS block size) will have to perform a read-before-write to
    validate the buffer cache buffer. Writes that are multiples of the NFS
    block size (and aligned to the NFS block size) should be optimized and
    will not have to perform a read-before-write.

					-Matt
					Matthew Dillon
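
    The aligned-write case described above can be sketched in C. This is
    only a sketch under stated assumptions, not code from the thread: the
    helper names write_in_blocks() and write_file_in_blocks() are
    hypothetical, and blksz is assumed to match the NFS block size in
    effect for the mount (for example, the 65536 given to mount_nfs -r/-w
    above). Starting from offset 0, it issues one pwrite() per block, so
    every full write is both a multiple of blksz and aligned to it; only
    the final write may be shorter, and it extends the file.

```c
#define _XOPEN_SOURCE 700	/* expose pwrite() under strict compiler modes */

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes from buf to fd, starting at file
 * offset 0, using one pwrite() per blksz-sized, blksz-aligned block.
 * blksz is an assumption: it should match the NFS block size of the mount.
 * Returns 0 on success, -1 on failure.
 */
static int
write_in_blocks(int fd, const char *buf, size_t len, size_t blksz)
{
	size_t off = 0;

	assert(blksz > 0);
	while (off < len) {
		/* Full block, or the shorter tail at the end of the data. */
		size_t want = (len - off < blksz) ? len - off : blksz;
		size_t done = 0;

		/*
		 * Finish the block if pwrite() is interrupted or short.
		 * Note that such a retry is necessarily a sub-block write.
		 */
		while (done < want) {
			ssize_t n = pwrite(fd, buf + off + done, want - done,
			    (off_t)(off + done));
			if (n < 0 && errno == EINTR)
				continue;
			if (n <= 0)
				return -1;
			done += (size_t)n;
		}
		off += want;
	}
	return 0;
}

/*
 * Hypothetical convenience wrapper: create or truncate path, then write
 * the buffer with write_in_blocks(). Returns 0 on success, -1 on failure.
 */
static int
write_file_in_blocks(const char *path, const char *buf, size_t len,
    size_t blksz)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	int rc;

	if (fd < 0)
		return -1;
	rc = write_in_blocks(fd, buf, len, blksz);
	if (close(fd) != 0)
		rc = -1;
	return rc;
}
```

    A caller would pass the block size it expects the mount to use, e.g.
    write_file_in_blocks(path, buf, len, 65536) for a file under /backup.
    Whether this actually avoids the read-before-write depends on the
    client's real NFS block size, which the sketch does not query.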