Date:      Wed, 19 Jun 2002 21:37:56 -0700 (PDT)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        cvs-committers@FreeBSD.ORG, <cvs-all@FreeBSD.ORG>
Subject:   Re: cvs commit: src/sys/ufs/ufs ufs_readwrite.c
Message-ID:  <200206200437.g5K4buMu036454@apollo.backplane.com>

:>   Log:
:>   In rev 1.72 a situation related to write/mmap was fixed which could result
:>   in a user process gaining visibility into the 'old' contents of a filesystem
:>   block.  There were two cases:  (1) when uiomove() fails (user process issues
:>   illegal write), and (2) when uiomove() overlaps a mmap() of the same file at
:>   the same offset (fault -> recursive buffer I/O reads contents of old block).
:
:I fixed (1) in FreeBSD-1 by always backing out the write in the EFAULT case:

    Yah, #1 is fairly easy to deal with.  #2 is a real mess.  Even with the
    fix I originally had in there (and just moved around a little in this
    commit), Tor was able to write a little two line program to demonstrate
    that there are still issues with fragment extension.

    My little fix doesn't actually change the outstanding issues at all,
    it just hacks around the read-before-write that the original fix
    introduced.  It's necessary because the read-before-write kills rewrite
    performance by 75% (e.g. 20 MBytes/sec -> 5 MBytes/sec), and on
    hardware RAID systems it can be an 80-90% write performance *LOSS*,
    depending on the configuration.

    We still have serious issues with fragment extension (which wasn't 
    covered by the original fix or this commit)... Tor has a two-line
    program which demonstrates data visibility during fragment extension.
    Kirk, Tor and I (mainly Tor and I) are exploring options for a real fix.
    Constructive comments are welcome, but this particular area of the
    codebase is extremely complex and I doubt more than a handful of people
    even understand how it works.

    The case in question here is write()ing an overlapped mmap()'d
    buffer to the same descriptor.  The code path is:  write() ->
    ufs_readwrite() -> uiomove() -> (fault) -> ffs_getpages() ... I/O,
    (fault return) resume-uiomove() -> bdwrite().  Approximately.  Very,
    very nasty.
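
    The userland side of that code path can be sketched roughly as
    follows.  This is a minimal illustration of the call pattern only: it
    is not Tor's test program, it does not attempt to expose stale block
    contents, and the helper name write_from_own_mapping() and the
    temporary file path are assumptions made for the example.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the FreeBSD tree): mmap() a file and then
 * write() the mapped pages back to the same descriptor at the same offset.
 * Returns 0 if the whole range was written, -1 on any failure.
 */
int write_from_own_mapping(size_t len)
{
    char tmpl[] = "/tmp/overlapXXXXXX";  /* assumed scratch location */
    int fd = mkstemp(tmpl);
    if (fd < 0)
        return -1;
    unlink(tmpl);                        /* file goes away on close() */

    /* Give the file real contents so the mapping is backed by blocks. */
    char *fill = malloc(len);
    if (fill == NULL) {
        close(fd);
        return -1;
    }
    memset(fill, 'A', len);
    ssize_t n = write(fd, fill, len);
    free(fill);
    if (n != (ssize_t)len) {
        close(fd);
        return -1;
    }

    /* Map the same range of the same file. */
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /*
     * The source buffer is the mapping and the destination is the same
     * file at the same offset.  If the mapped pages are not resident,
     * the kernel's copy-in has to fault them in from the file while the
     * write to that file is already in progress.
     */
    int rc = -1;
    if (lseek(fd, 0, SEEK_SET) == 0) {
        n = write(fd, map, len);
        if (n == (ssize_t)len)
            rc = 0;
    }

    munmap(map, len);
    close(fd);
    return rc;
}
```

    Calling write_from_own_mapping(4096) from a small main() exercises
    the pattern.  Since the pages were just written they will usually be
    resident, so this is not expected to take the fault path on its own;
    it only shows which user-visible calls are involved.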

						-Matt

