Date: Mon, 16 Nov 1998 10:23:07 -0700 (MST)
From: "David G. Andersen" <danderse@cs.utah.edu>
To: hackers@FreeBSD.ORG
Subject: nfs intr hangs.
Message-ID: <13904.24008.516709.742888@torrey.cs.utah.edu>

NFS/FS people care to comment?
(Regarding the looping 'tsleep' in vfs_subr.c: vinvalbuf() which
causes a system hang).
To reiterate a bit, the code in question is:

        while (vp->v_numoutput) {
                vp->v_flag |= VBWAIT;
                tsleep((caddr_t)&vp->v_numoutput,
                        slpflag | (PRIBIO + 1),
                        "vinvlbuf", slptimeo);
        }

When the filesystem is NFS mounted with the 'intr' flag, this tsleep
gets interrupted occasionally, and the system begins infinitely
looping here.
The discussion on which we would like comments:
Lo and Behold, Mike Hibler said:
> > From: David G Andersen <danderse@cs>
>
> > I can see a few options for the way to go, but I'm not sure which is
> > right.
> >
> > 1 - return EINTR on the close ('man close' says that's a possible error
> > code)
> >
> > 2 - retry the flush a few times, then return EINTR.
> > (more likely to make clients happy)
> >
> > 3 - For those of us who are lazy bastards, ignore SIGINT during
> > NFS flushes. This seems like a bad idea.
> >
> > 4 - Something else?
> >
>
> There are really two issues involved. One is whether the FreeBSD change
> to vinvalbuf is even necessary/correct... Ok, I just did a cvs annotate
> and found what the change was:
> ==================
>
> revision 1.156
> date: 1998/06/10 22:02:14; author: julian; state: Exp; lines: +4 -2
> Replace 'sleep()' with 'tsleep()'
> Accidentally imported from Kirk's codebase.
>
> Pointed out by: various.
> ----------------------------
> revision 1.155
> date: 1998/06/10 18:13:19; author: julian; state: Exp; lines: +18 -8
> Submitted by: Kirk McKusick <mckusick@McKusick.COM>
>
> Fix for potential hang when trying to reboot the system or
> to forcibly unmount a soft update enabled filesystem.
> FreeBSD already handled the reboot case differently, this is however a better
> fix.
>
> ==================
> So as 1.155 indicates, this change came directly from The Source so I believe
> it is necessary. The change in 1.156 is the key: by changing from the 4.4bsd
> non-interruptible "sleep" to the possibly interruptible "tsleep" and OR'ing
> in the "slpflag" the problem was introduced--now the sleep became
> interruptible when called on an interruptible NFS mount.
>
> That brings us to issue #2: what is the correct behavior in this case?
> The easy way out is to just not OR in slpflag and go back to full-time non-
> interruptibility (your #3). However, that probably isn't necessary. I'm a
> bettin' that you could just splx() and return the tsleep value (your #1)
> and all will be fine. (well, as fine as it ever is in the NFS world...)
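
For anyone who wants to see the two behaviors side by side, here is a
minimal userspace sketch, not kernel code. fake_vnode, fake_tsleep(),
signal_pending, sleep_calls and max_spins are all hypothetical stand-ins
invented for the illustration, and the model assumes that an interruptible
tsleep() with a signal pending returns EINTR immediately without the
outstanding write completing. The first function mirrors the loop as
posted (return value ignored); the second is a guess at what option #1
might look like.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vnode: only the fields the loop uses. */
struct fake_vnode {
    int v_numoutput;            /* writes still in flight */
    int v_flag;
};
#define VBWAIT 0x1              /* illustrative value, not the kernel's */

/* Simulation knobs (assumptions for the demo, not kernel state). */
static int signal_pending;      /* nonzero: an interrupting signal is pending */
static int sleep_calls;         /* number of fake_tsleep() invocations */

/*
 * Hypothetical model of tsleep(): an interruptible sleep with a signal
 * pending returns EINTR at once and no I/O completes; otherwise one
 * outstanding write is retired and 0 is returned.
 */
static int fake_tsleep(struct fake_vnode *vp, int interruptible)
{
    sleep_calls++;
    if (interruptible && signal_pending)
        return EINTR;
    vp->v_numoutput--;
    return 0;
}

/*
 * The loop as posted: the sleep's return value is dropped, so a pending
 * signal makes it spin. max_spins only exists to keep the demo finite;
 * -1 means the cap was hit.
 */
static int wait_ignoring_error(struct fake_vnode *vp, int interruptible,
                               int max_spins)
{
    assert(vp != NULL);
    while (vp->v_numoutput) {
        if (sleep_calls >= max_spins)
            return -1;
        vp->v_flag |= VBWAIT;
        (void)fake_tsleep(vp, interruptible);
    }
    return 0;
}

/*
 * Sketch of option #1: hand the sleep's error back to the caller.
 * Real kernel code would also need to restore the spl level first.
 */
static int wait_returning_error(struct fake_vnode *vp, int interruptible)
{
    int error;

    assert(vp != NULL);
    while (vp->v_numoutput) {
        vp->v_flag |= VBWAIT;
        error = fake_tsleep(vp, interruptible);
        if (error)
            return error;
    }
    return 0;
}
```

In this model the first variant never makes progress while the signal is
pending, and the second gives up after a single sleep with EINTR. Whether
the callers of vinvalbuf() cope with that error is exactly the open
question above.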
Thanks in advance.
-Dave
--
work: danderse@cs.utah.edu me: angio@pobox.com
University of Utah http://www.angio.net/
Department of Computer Science
