From: Mikhail Teterin
Organization: Virtual Estates, Inc.
To: Matthew Dillon
Cc: alc@freebsd.org, stable@freebsd.org
Date: Tue, 21 Mar 2006 17:17:33 -0500
Subject: Re: weird bugs with mmap-ing via NFS
Message-Id: <200603211717.34348.mi+mx@aldan.algebra.com>
In-Reply-To: <200603212123.k2LLNMhO006344@apollo.backplane.com>
References: <200603211607.30372.mi+mx@aldan.algebra.com> <200603212123.k2LLNMhO006344@apollo.backplane.com>
List-Id: Production branch of FreeBSD source code

[Moved from -current to -stable]

On Tuesday 21 March 2006 16:23, Matthew Dillon wrote:
>     You might be doing just writes to the mmap()'d memory, but the system
>     doesn't know that.

Actually, it does. The program tells it that I don't care to read what's
currently there, by specifying only the PROT_WRITE flag.

>     The moment you touch any mmap()'d page, reading or writing, the system
>     has to fault it in, which means it has to read it and load valid data
>     into the page.

Sounds like a missed optimization opportunity :-(

> :When I mount with large read and write sizes:
> :
> :       mount_nfs -r 65536 -w 65536 -U -ointr pandora:/backup /backup
> :
> :it changes -- for the worse. Short time into it -- the file stops growing
> :according to the `ls -sl' run on the NFS server (pandora) at exactly 3200
> : FS blocks (the FS was created with `-b 65536 -f 8129').
> :
> :At the same time, according to `systat -if' on both client and server, the
> : client continues to send (and the server continues to receive) about
> : 30Mb of some (?) data per second.
>
>     It kinda sounds like the buffer cache is getting blown out, but not
>     having seen the program I can't really analyze it.

See http://aldan.algebra.com/~mi/mzip.c

>     It will always be more efficient to write to a file using write() than
>     using mmap()

I understand that write() is much better optimized at the moment, but the
mmap interface has some advantages that may let future OSes optimize it
further. The application can hint at its planned usage of the data via
madvise, for example.

Unfortunately, my problem so far is that it is not writing _at all_...

>     and it will always be far, far more efficient to write to an NFS file in
>     nfs block-sized chunks rather than in smaller chunks
>     due to the way the buffer cache works.

Yes, this is an example of how a well-implemented mmap can be better than
write. Without explicit writes by the application and without doubling the
memory requirements, the data can be written in the optimal way.

Thanks for your help. Yours,

	-mi