From owner-cvs-all  Wed Jun 19 21:38: 2 2002
Delivered-To: cvs-all@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
	by hub.freebsd.org (Postfix) with ESMTP
	id 8C9A437B407; Wed, 19 Jun 2002 21:37:56 -0700 (PDT)
Received: from apollo.backplane.com (localhost [127.0.0.1])
	by apollo.backplane.com (8.12.3/8.12.3) with ESMTP id g5K4buCV036455;
	Wed, 19 Jun 2002 21:37:56 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.12.3/8.12.3/Submit) id g5K4buMu036454;
	Wed, 19 Jun 2002 21:37:56 -0700 (PDT)
	(envelope-from dillon)
Date: Wed, 19 Jun 2002 21:37:56 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <200206200437.g5K4buMu036454@apollo.backplane.com>
To: Bruce Evans <bde@zeta.org.au>
Cc: cvs-committers@FreeBSD.ORG, <cvs-all@FreeBSD.ORG>
Subject: Re: cvs commit: src/sys/ufs/ufs ufs_readwrite.c
Sender: owner-cvs-all@FreeBSD.ORG
Precedence: bulk
List-ID: <cvs-all.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20cvs-all>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20cvs-all>
X-Loop: FreeBSD.ORG

:>   Log:
:>   In rev 1.72 a situation related to write/mmap was fixed which could result
:>   in a user process gaining visibility into the 'old' contents of a filesystem
:>   block.  There were two cases:  (1) when uiomove() fails (user process issues
:>   illegal write), and (2) when uiomove() overlaps a mmap() of the same file at
:>   the same offset (fault -> recursive buffer I/O reads contents of old block).
:
:I fixed (1) in FreeBSD-1 by always backing out the write in the EFAULT case:

    Yah, #1 is fairly easy to deal with.  #2 is a real mess.  Even with the
    fix I originally had in there (and just moved around a little in this
    commit), Tor was able to write a little two-line program to demonstrate
    that there are still issues with fragment extension.

    My little fix doesn't actually change the outstanding issues at all,
    it just hacks around the read-before-write that the original fix
    introduced.  It's necessary because the read-before-write cuts rewrite
    performance by 75% (e.g. 20 MBytes/sec -> 5 MBytes/sec), and on
    hardware RAID systems the write performance *LOSS* can be 80-90%,
    depending on the configuration.

    We still have serious issues with fragment extension (which wasn't
    covered by the original fix or this commit)... Tor has a two-line
    program which demonstrates data visibility during fragment extension.
    Kirk, Tor and I (mainly Tor and I) are exploring options for a real fix.
    Constructive comments are welcome, but this particular area of the
    codebase is extremely complex and I doubt more than a handful of people
    even understand how it works.

    The case in question here is write()ing an overlapped mmap()'d
    buffer to the same descriptor.  The code path is:  write() ->
    ufs_readwrite() -> uiomove() -> (fault) -> ffs_getpages() ... I/O,
    (fault return) resume-uiomove() -> bdwrite().  Approximately.  Very,
    very nasty.
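
    For anyone who wants to poke at that path from userland, here is a
    minimal sketch of the overlapped case.  It is not Tor's program and it
    does not demonstrate the data visibility problem; it only drives a
    write() whose source buffer is a mapping of the same file at the same
    offset, so the copy-in has to fault on the file's own pages.  The
    function name overlapped_write() and the scratch path are made up for
    illustration, and on any reasonable kernel the call should simply
    succeed.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the commit): create a scratch file of
 * `len` bytes, map it shared, then write the mapping back to the same
 * descriptor at offset 0.  Returns the byte count from pwrite(), or -1.
 * `len` must be greater than zero.
 */
ssize_t
overlapped_write(const char *path, size_t len)
{
    ssize_t n = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;

    /* Give the file a size so the mapping is backed by the file. */
    if (ftruncate(fd, (off_t)len) == 0) {
        char *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
            MAP_SHARED, fd, 0);

        if (map != MAP_FAILED) {
            /*
             * The source buffer is the file's own mapping and has not
             * been touched yet, so the kernel's copy-in is expected to
             * page-fault in the middle of the write.
             */
            n = pwrite(fd, map, len, 0);
            munmap(map, len);
        }
    }

    close(fd);
    unlink(path);   /* scratch file, remove it again */
    return n;
}
```

    Calling overlapped_write("/tmp/overlap_demo.dat", 8192) should return
    8192 when the write completes normally.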

						-Matt

