Date: Fri, 15 Jun 2012 11:34:48 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Konstantin Belousov
Cc: freebsd-fs@freebsd.org, Pavlo
Message-ID: <1116727909.1836239.1339774488001.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20120614122456.GZ2337@deviant.kiev.zoral.com.ua>
Subject: Re: mmap() incoherency on hi I/O load (FS is zfs)

Kostik wrote:
> On Thu, Jun 14, 2012 at 07:32:36AM -0400, Rick Macklem wrote:
> > Pavlo wrote:
> > > There is a case where some parts of files that are mapped and
> > > then modified get corrupted. By corrupted I mean some of the data
> > > is fine (the part that was written using write()/pwrite()), but
> > > some looks like it never existed. The data sat in buffers for a
> > > while, during which several processes (with access properly
> > > synchronised, of course) used the shared pages and saw it; after
> > > some time, those processes found it was gone. Only the part
> > > written with pwrite() was still there; everything written via
> > > mmap() was zero.
> > >
> > > As I said, this occurs under high I/O load, when 4+ background
> > > processes are indexing a huge amount of data. I also want to note
> > > that it never occurred in the life of our project while we used
> > > mmap() under the same I/O stress conditions, as long as we mapped
> > > either a whole file or just a part (the header) starting at the
> > > beginning of the file. The first time we used mappings of
> > > individual pages, just to save RAM, this popped up.
> > >
> > > The fix for this problem is to msync() before any munmap(). But
> > > the man page says:
> > >
> > > The msync() system call is usually not needed since BSD
> > > implements a coherent file system buffer cache. However, it may
> > > be used to associate dirty VM pages with file system buffers and
> > > thus cause them to be flushed to physical media sooner rather
> > > than later.
> > >
> > > Any thoughts? Thanks.
> > >
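[A minimal, self-contained sketch of the workaround Pavlo describes:
map a single page of a file (the access pattern that triggered the
problem) and msync() the dirty page before munmap(). The file name,
offset, and record contents are illustrative assumptions, and error
handling is abbreviated.]

#include <sys/mman.h>
#include <err.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    off_t off = 4 * pagesz;     /* mmap offsets must be page-aligned */

    /* "data.idx" is a placeholder; it must extend past off + pagesz. */
    int fd = open("data.idx", O_RDWR);
    if (fd == -1)
        err(1, "open");

    /* Map one page of the file, shared so stores reach the file. */
    char *p = mmap(NULL, pagesz, PROT_READ | PROT_WRITE,
        MAP_SHARED, fd, off);
    if (p == MAP_FAILED)
        err(1, "mmap");

    memcpy(p, "record", 6);     /* dirty the shared page */

    /*
     * The workaround: synchronously flush the dirty page before
     * unmapping, rather than relying on the coherent buffer cache
     * the msync(2) man page describes.
     */
    if (msync(p, pagesz, MS_SYNC) == -1)
        err(1, "msync");
    if (munmap(p, pagesz) == -1)
        err(1, "munmap");
    close(fd);
    return (0);
}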
> > With a recent kernel from head, I am seeing dirty mmap'd pages
> > being written quite late for the NFSv4 client, even after the NFS
> > client's VOP_RECLAIM() has been called, it seems. I didn't observe
> > this behaviour in a kernel from head in March. (I don't know enough
> > about the vm/mmap area to know whether this is correct behaviour or
> > not.)
> >
> > I thought I'd mention this, since you didn't say how recent a
> > kernel you were running, and thought it might be caused by the same
> > change?
> Can you please comment more on this?
> How is this possible at all?
>
> Could you please show at least a backtrace for the moment when a
> write request is made for a page that belongs to an already reclaimed
> vnode?
After some off-list discussion, it was determined that my problem was
doing nfsrpc_close() before vnode_destroy_object() in the NFSv4
client's VOP_RECLAIM(). This is an NFSv4-specific bug and wouldn't be
related to the above issue.

Sorry about the noise, rick
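[The problem described above implies the fix is to reverse that
ordering: destroy the VM object before sending the NFSv4 Close. The
sketch below is purely an illustration of that corrected sequence,
built from the two function names mentioned in the message; the
argument lists are abbreviated assumptions, and the real reclaim
routine does considerably more work.]

/*
 * Illustrative sketch only, not the actual FreeBSD source.
 */
static int
nfs_reclaim(struct vop_reclaim_args *ap)
{
    struct vnode *vp = ap->a_vp;

    /*
     * Destroy the VM object first, so that any dirty mmap'd pages
     * are flushed back while the NFSv4 Open state is still valid.
     */
    vnode_destroy_object(vp);

    /*
     * Only then do the NFSv4 Close. With the old ordering (Close
     * first), dirty pages could still be written out after the
     * Open had already been closed.
     */
    nfsrpc_close(vp);

    return (0);
}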