From owner-freebsd-current  Wed Apr 21 13:23:25 1999
Delivered-To: freebsd-current@freebsd.org
Received: from granite.sentex.net (granite.sentex.ca [199.212.134.1])
	by hub.freebsd.org (Postfix) with ESMTP id A893D1588B
	for <current@FreeBSD.ORG>; Wed, 21 Apr 1999 13:22:02 -0700 (PDT)
	(envelope-from mike@sentex.net)
Received: from simoeon (simeon.sentex.ca [209.112.4.47]) by granite.sentex.net (8.8.8/8.6.9) with SMTP id QAA26690; Wed, 21 Apr 1999 16:19:23 -0400 (EDT)
Message-Id: <3.0.5.32.19990421161837.01beabf0@staff.sentex.ca>
X-Sender: mdtpop@staff.sentex.ca
X-Mailer: QUALCOMM Windows Eudora Light Version 3.0.5 (32)
Date: Wed, 21 Apr 1999 16:18:37 -0400
To: Matthew Dillon <dillon@apollo.backplane.com>
From: Mike Tancsa <mike@sentex.net>
Subject: Re: solid NFS patch #6 avail for -current - need testers files)
Cc: current@FreeBSD.ORG
In-Reply-To: <199904212009.NAA08036@apollo.backplane.com>
References: <199904211751.NAA31922@misha.cisco.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

At 01:09 PM 4/21/99 -0700, Matthew Dillon wrote:
>:Speaking of, when can we expect to see this wonderful _stability_
>:improvement in -stable? I'm setting up a server here, and would
>:rather have fixed NFS code in it... Yet, jumping to -current is
>:officially wrong... Thanks!
>:
>:	-mi
>
>    Well, you already see a lot of the pure bug fixes being backported.
>    What you don't see in -stable are the bug fixes that also depend on the
>    rewritten portions of the system, nor do you see the rewritten portions 
>    of the system themselves.  The latest NFS patch is borderline -- it
>    would be possible to backport in time enough for the 3.2 deadline, but
>    it wouldn't be fun.

Hi,
Just wondering if these changes also have the side effect of fixing the
mmap problem that crashes 3.x boxes?  I.e., as you wrote back on 3/4/99:


>    The problem is a deadlock caused by the fgrep.  The fgrep is mmap()ing
>    the file, but then it does some really weird crap when dealing with
>    larger files.
>
>    It's the most idiotic code I've ever seen.  The code uses a PRIVATE+RW
>    mmap() until it gets to an odd point in the file, at which point it read()'s
>    additional information from the file into the mapped space ( that might
>    contain a previous mmap'd portion of the file ).
>
>    So what happens is this:
>
>	* read() call
>	* shared lock obtained on vnode
>
>	( some other process attempts to get shared lock on vnode and
>	succeeds... for example, a namei operation is attempted by
>	another grep )
>
>	* access MMAP'd area
>	* exclusive lock attempted on same vnode.  This blocks because
>	  some other process has a shared lock on the vnode.
>
>	( the other process then attempts to get an exclusive lock on the
>	vnode; this also blocks )
>
>    Deadlock.  
>
>    Even worse, the gnu grep does not bother munmap()'ing the space so, in
>    fact, the deadlock can occur between two unrelated files as well as with 
>    the same file.  This is the more likely deadlock scenario.
>
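For anyone who wants to poke at this, the access pattern described above (a PRIVATE+RW mapping of a file, followed by a read() of the same file into the mapped space) can be sketched roughly as follows. This is only an illustration in Python, not the grep source: the function name and parameters are made up for the example, and it assumes that Python's mmap.ACCESS_COPY gives a private copy-on-write mapping and that an unbuffered file object issues a plain read() straight into the buffer it is handed.

```python
import mmap
import os


def read_into_private_mapping(path, file_offset, map_offset, length):
    """Map `path` private and writable, then read() file data into the mapping.

    Hypothetical helper for illustration. Returns the bytes that end up in
    the mapping at [map_offset, map_offset + n), where n is the number of
    bytes the read() actually delivered.
    """
    # buffering=0 yields a raw file object, so readinto() goes directly
    # to the read() system call with our buffer as the destination.
    with open(path, "rb", buffering=0) as f:
        size = os.fstat(f.fileno()).st_size
        # ACCESS_COPY: writes land in private copy-on-write pages, the
        # file on disk is never modified.
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_COPY) as mm:
            f.seek(file_offset)
            # The kernel copies file data into the mapped pages here; the
            # destination pages are themselves backed by the same file.
            with memoryview(mm) as view, \
                    view[map_offset:map_offset + length] as target:
                n = f.readinto(target)
            return mm[map_offset:map_offset + n]
```

Calling it with a file_offset that differs from map_offset overwrites part of the mapping with data from elsewhere in the file, which is the case where the destination of the read() may hold a previously mapped portion of the file. On a system without the bug the call should simply return; it shows the access pattern, it does not force the deadlock.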
>    The solution is more difficult.   We could hack an exception for PRIVATE
>    mmap's... there really is no need for the vm_fault code to lock the
>    vnode.  However, other situations can occur where this hack would not work.
>
>    This is 'kinda a known problem' in FreeBSD.   We really need to find a
>    solution to it.  Other similar deadlocks can occur if you mmap() one file
>    and read() or write() data from it to another file, and vice versa at
>    the same time.
>
>    Personally, I think the only real solution is to make vn_read() and 
>    vn_write() lock the uio space as well as the vnode being read or
>    written.  It would have to do it in the right order, and it would have
>    to deal with the situation where the uio space covers multiple vnodes.
>
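The "right order" idea in the paragraph above is ordinary lock ordering, and a toy version may make it concrete. The sketch below is not FreeBSD code: Vnode, lock_order and locked_for_io are hypothetical names, each vnode gets a plain exclusive lock instead of a shared/exclusive lockmgr lock, and the rank used for ordering is simply a creation counter. The point is only that the vnode being read or written and every vnode backing the uio space are collected, deduplicated, and locked in one global order.

```python
import itertools
import threading
from contextlib import contextmanager

_rank = itertools.count()  # hypothetical global ordering source


class Vnode:
    """Toy stand-in for a vnode: a name, one lock, and a fixed rank."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.rank = next(_rank)


def lock_order(target, buffer_vnodes):
    """Return every involved vnode once, sorted by rank."""
    # Deduplicate first: the buffer may be a mapping of the target itself,
    # and taking a non-reentrant lock twice would block forever.
    unique = {v.rank: v for v in [target, *buffer_vnodes]}
    return [unique[r] for r in sorted(unique)]


@contextmanager
def locked_for_io(target, buffer_vnodes):
    """Hold all involved vnode locks, always acquired in rank order."""
    held = []
    try:
        for v in lock_order(target, buffer_vnodes):
            v.lock.acquire()
            held.append(v)
        yield held
    finally:
        # Release in reverse order of acquisition.
        for v in reversed(held):
            v.lock.release()
```

Because two I/O operations that touch the same pair of vnodes always take the locks in the same order, neither can end up holding one lock while waiting for the other. A real kernel version would also have to cope with the set of vnodes behind the buffer changing while it waits, which this sketch ignores.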
>    Alternately, vnodes need to be redesigned without these fraggin 
>    all-encompassing locks for data R+W ops.
>
>						-Matt
>
>(kgdb) back
>#0  mi_switch () at ../../kern/kern_synch.c:827
>#1  0xf0151919 in tsleep (ident=0xf0a2e500, priority=0x8, wmesg=0xf0263a9c "inode", timo=0x0)
>    at ../../kern/kern_synch.c:443
>#2  0xf014b774 in acquire (lkp=0xf0a2e500, extflags=0x1000040, wanted=0x700) at ../../kern/kern_lock.c:145
>#3  0xf014b835 in lockmgr (lkp=0xf0a2e500, flags=0x1030041, interlkp=0xf51719b0, p=0xf5151200)
>    at ../../kern/kern_lock.c:209
>#4  0xf0171df0 in vop_stdlock (ap=0xf5176b94) at ../../kern/vfs_default.c:210
>#5  0xf0204b39 in ufs_vnoperate (ap=0xf5176b94) at ../../ufs/ufs/ufs_vnops.c:2309
>#6  0xf017aba4 in vn_lock (vp=0xf5171940, flags=0x1030041, p=0xf5151200) at vnode_if.h:811
>#7  0xf01747b0 in vget (vp=0xf5171940, flags=0x1020041, p=0xf5151200) at ../../kern/vfs_subr.c:1348
>#8  0xf0212f1e in vnode_pager_lock (object=0xf02af2b4) at ../../vm/vnode_pager.c:960
>#9  0xf0206a56 in vm_fault (map=0xf51497c0, vaddr=0x805d000, fault_type=0x3, fault_flags=0x8)
>    at ../../vm/vm_fault.c:243
>#10 0xf022aebe in trap_pfault (frame=0xf5176d14, usermode=0x0, eva=0x805d038) at ../../i386/i386/trap.c:816
>#11 0xf022ab92 in trap (frame={tf_es = 0x10, tf_ds = 0x10, tf_edi = 0x805d000, tf_esi = 0xf223e000,
>      tf_ebp = 0xf5176e0c, tf_isp = 0xf5176d3c, tf_ebx = 0x1060, tf_edx = 0x0, tf_ecx = 0x700, tf_eax = 0x0,
>      tf_trapno = 0xc, tf_err = 0x3, tf_eip = 0xf0229ce4, tf_cs = 0x8, tf_eflags = 0x10287, tf_esp = 0xffff1272,
>      tf_ss = 0xffff0000}) at ../../i386/i386/trap.c:437
>#12 0xf0229ce4 in fastmove_loop ()
>#13 0xf0229b2b in i586_copyout ()
>#14 0xf01fd707 in ffs_read (ap=0xf5176ef0) at ../../ufs/ufs/ufs_readwrite.c:289
>#15 0xf017a689 in vn_read (fp=0xf09dda80, uio=0xf5176f38, cred=0xf09ef480) at vnode_if.h:303
>#16 0xf015a757 in read (p=0xf5151200, uap=0xf5176f94) at ../../kern/sys_generic.c:121
>#17 0xf022b4a0 in syscall (frame={tf_es = 0x2f, tf_ds = 0x2f, tf_edi = 0x0, tf_esi = 0x8000, tf_ebp = 0xefbf85c8,
>      tf_isp = 0xf5176fe4, tf_ebx = 0xffffffff, tf_edx = 0x8059000, tf_ecx = 0xa0000, tf_eax = 0x3,
>      tf_trapno = 0xc, tf_err = 0x2, tf_eip = 0x280b303c, tf_cs = 0x1f, tf_eflags = 0x206, tf_esp = 0xefbf85a0,
>      tf_ss = 0x2f}) at ../../i386/i386/trap.c:1100
>#18 0xf0220dcc in Xint0x80_syscall ()
>#19 0x804dc46 in ?? ()
>#20 0x804e85c in ?? ()
>#21 0x8048f7d in ?? ()
>(kgdb) 
>
>
>
------------------------------------------------------------------------
Mike Tancsa,                          	          tel 01.519.651.3400
Network Administrator,     			  mike@sentex.net
Sentex Communications                 		  www.sentex.net
Cambridge, Ontario Canada

