Date: Tue, 17 Nov 1998 12:53:40 -0700 (MST)
From: David G Andersen <danderse@cs.utah.edu>
To: FreeBSD-gnats-submit@FreeBSD.ORG
Subject: kern/8732: nfs mounts with 'intr' can cause system hang
Message-ID: <199811171953.MAA21319@torrey.cs.utah.edu>

>Number:         8732
>Category:       kern
>Synopsis:       nfs mounts with 'intr' can cause system hang
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue Nov 17 12:00:01 PST 1998
>Last-Modified:
>Originator:     David G Andersen
>Organization:   University of Utah
>Release:        FreeBSD 3.0-CURRENT i386
>Environment:

FreeBSD 3.0, on dual PII-350, 128M RAM. Moderate NFS usage.

The problem is independent of the number of processors, memory, or
hardware configuration as far as we could test. We tested primarily
with NFSv2. We were using amd, but the problem is separate from amd.

>Description:

If a program gets a SIGINT while performing a close() on an NFS file
descriptor, the system will hang. This only occurs if the NFS
filesystem is mounted with the 'intr' flag and the system is running
nfsiod processes.

In sys/kern/vfs_subr.c, vinvalbuf():

        while (vp->v_numoutput) {
                vp->v_flag |= VBWAIT;
=>              tsleep((caddr_t)&vp->v_numoutput,
                        slpflag | (PRIBIO + 1),
                        "vinvlbuf", slptimeo);
        }

The test program is stuck in this loop in vinvalbuf because there is a
SIGINT pending. This causes tsleep to return immediately (without
sleeping) with the return value EINTR or ERESTART, but the loop never
checks the return value! Hence, it spins forever in this loop because...

Meanwhile one of the pending nfsiods has been awakened because its
reply to the write request has arrived, but it never gets to run. The
other three nfsiods are blocked because only one nfsiod can be in the
socket receive at a time. And until the nfsiods return, v_numoutput
won't be decremented.

It works with no nfsiods because the test program does all the buffer
writes itself, so by the time it gets to vinvalbuf, v_numoutput is 0.

>How-To-Repeat:

Run the following program, with args:

        ./program <path to NFS file> 1000

(the 1000 tells it to do 1000 opens/closes) and ctrl-C it while it's
running.
It may take a few runs to hang, because the interrupt has to arrive
during the flush.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv) {
        char *filename;
        int fd;
        char buffer[10500];
        char filebuf[2048];
        int i;

        if (argc != 3) {
                fprintf(stderr, "usage: %s <path to NFS file> <count>\n",
                        argv[0]);
                exit(1);
        }
        filename = argv[1];
        memset(buffer, 0, sizeof(buffer));
        for (i = 0; i < atoi(argv[2]); i++) {
                /* filename doubles as a format string, so it may contain %d */
                snprintf(filebuf, sizeof(filebuf), filename, i);
                printf("creating %s\n", filebuf);
                /* O_CREAT requires a mode argument */
                fd = open(filebuf, O_CREAT | O_WRONLY, 0644);
                if (fd < 0) {
                        perror("open");
                        exit(1);
                }
                write(fd, buffer, 8192);
                /* the hang occurs if SIGINT arrives during this close */
                close(fd);
                unlink(filebuf);
        }
        printf("I did it, I did it\n");
        exit(0);
}

>Fix:

[With commentary stolen shamelessly from Mike Hibler. Thanks, Mike]

There appear to be a few options for the fix:

  1 - Return EINTR on the close (close(2) indicates it's a potential
      error code). This could break a lot of clients.
  2 - Ignore SIGINT during NFS flushes. This seems like a bad idea too.
  3 - Something else?

There are really two issues involved. One is whether the FreeBSD
change to vinvalbuf is even necessary/correct... A cvs annotate shows:

==================
revision 1.156
date: 1998/06/10 22:02:14;  author: julian;  state: Exp;  lines: +4 -2
Replace 'sleep()' with 'tsleep()'
Accidentally imported from Kirk's codebase.
Pointed out by: various.
----------------------------
revision 1.155
date: 1998/06/10 18:13:19;  author: julian;  state: Exp;  lines: +18 -8
Submitted by: Kirk McKusick <mckusick@McKusick.COM>
Fix for potential hang when trying to reboot the system or
to forcibly unmount a soft update enabled filesystem.
FreeBSD already handled the reboot case differently, this is however
a better fix.
==================

So as 1.155 indicates, this change came directly from The Source, so I
believe it is necessary. The change in 1.156 is the key: by changing
from the 4.4BSD non-interruptible "sleep" to the possibly interruptible
"tsleep" and OR'ing in the "slpflag", the problem was introduced -- now
the sleep became interruptible when called on an interruptible NFS
mount.
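
As an illustration of the spin described under >Description, and of
what option 1 above would change, here is a minimal user-space sketch.
It is not kernel code and not a proposed patch: model_tsleep(), struct
model_vnode, wait_output_unchecked(), wait_output_checked() and the
spin_limit cap are all hypothetical names invented for this model. It
assumes that a normal sleep lets one pending write complete, while an
interrupted sleep returns EINTR at once and lets nothing else run.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the vnode field the loop waits on. */
struct model_vnode {
        int v_numoutput;        /* writes still in flight */
};

/*
 * Hypothetical stand-in for tsleep(): with a signal pending and an
 * interruptible sleep it returns EINTR immediately, without sleeping.
 * Otherwise it "sleeps" and returns 0.
 */
static int
model_tsleep(int interruptible, int signal_pending)
{
        if (interruptible && signal_pending)
                return (EINTR);
        return (0);
}

/*
 * Models the reported loop: the sleep result is ignored. Returns 0 once
 * the output has drained, or -1 when spin_limit iterations pass without
 * progress (the model's stand-in for the hang).
 */
int
wait_output_unchecked(struct model_vnode *vp, int interruptible,
    int signal_pending, long spin_limit)
{
        long spins = 0;

        assert(spin_limit > 0);
        while (vp->v_numoutput) {
                int error = model_tsleep(interruptible, signal_pending);

                if (error == 0)
                        vp->v_numoutput--;  /* a write completed while asleep */
                /* a non-zero error is dropped here, so nothing changes */
                if (++spins >= spin_limit)
                        return (-1);
        }
        return (0);
}

/*
 * Models option 1: hand the sleep error back to the caller instead of
 * looping again. Returns 0 once drained, or the error from the sleep.
 */
int
wait_output_checked(struct model_vnode *vp, int interruptible,
    int signal_pending)
{
        while (vp->v_numoutput) {
                int error = model_tsleep(interruptible, signal_pending);

                if (error)
                        return (error);
                vp->v_numoutput--;
        }
        return (0);
}
```

In this model, starting with v_numoutput = 3 and a signal pending on an
interruptible mount, wait_output_unchecked() reaches the spin limit with
v_numoutput unchanged, whereas wait_output_checked() returns EINTR on
the first pass. Whether returning that error from close() is acceptable
to callers is the open question raised in option 1.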

That brings us to issue #2, which is: what is the correct behavior in
this case? The easy way out is to just not OR in slpflag and go back to
full-time non-interruptibility (#2). However, that probably isn't
necessary. I'm a bettin' that you could just splx() and return the
tsleep value (#1) and all will be fine. (well, as fine as it ever is in
the NFS world...)

>Audit-Trail:
>Unformatted:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message