Date: Wed, 17 Jul 2002 15:37:52 -0700 (PDT) From: David Xu <bsddiy@yahoo.com> To: John Baldwin <jhb@FreeBSD.org> Cc: "cvs-all@FreeBSD.org" <cvs-all@FreeBSD.org>, "cvs-committers@FreeBSD.org" <cvs-committers@FreeBSD.org> Subject: Re: cvs commit: src/sys/kern kern_descrip.c Message-ID: <20020717223752.70126.qmail@web20905.mail.yahoo.com> In-Reply-To: <XFMail.20020713121856.jhb@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
----- Original Message -----
From: "John Baldwin" <jhb@FreeBSD.org>
To: "David Xu" <davidx@viasoft.com.cn>
Cc: <cvs-all@FreeBSD.org>; <cvs-committers@FreeBSD.org>
Sent: Wednesday, July 17, 2002 9:48 PM
Subject: Re: cvs commit: src/sys/kern kern_descrip.c
>
> On 17-Jul-2002 David Xu wrote:
> > ----- Original Message -----
> > From: "John Baldwin" <jhb@FreeBSD.ORG>
> > To: <cvs-committers@FreeBSD.ORG>; <cvs-all@FreeBSD.ORG>
> > Sent: Wednesday, July 17, 2002 10:48 AM
> > Subject: cvs commit: src/sys/kern kern_descrip.c
> >
> >
> >> jhb 2002/07/16 19:48:43 PDT
> >>
> >> Modified files:
> >> sys/kern kern_descrip.c
> >> Log:
> >> Preallocate a struct file as the first thing in falloc() before we lock
> >> the filelist_lock and check nfiles. This closes a race where we had to
> >> unlock the filedesc to re-lock the filelist_lock.
> >>
> >> Reported by: David Xu
> >> Reviewed by: bde (mostly)
> >>
> >> Revision Changes Path
> >> 1.150 +5 -16 src/sys/kern/kern_descrip.c
> >>
> >
> > holds filelist_lock and does not release until function returns,
> > this lock seems too big for me. it almostly forces the whole system
> > to serialize through open() syscall. I see filelist_lock only maintain
> > global opened file list. why should a program calling falloc be blocked
> > out by another program calling falloc so long time?
>
> Look at the code in question that is under the lock. It may look long but
> it really isn't. It's mostly just doing simple variable assignments and
> shuffles. The only function it calls is fdalloc(). Oh, geez. And that
> unlocks the filedesc lock. So every place that calls fdalloc() is
> actually a race. *sigh* Actually, it doesn't need to. Hmm, but that
> can still be potentially long.
>
> Well, we could do this:
>
> Index: kern_descrip.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/kern/kern_descrip.c,v
> retrieving revision 1.150
> diff -u -r1.150 kern_descrip.c
> --- kern_descrip.c 17 Jul 2002 02:48:43 -0000 1.150
> +++ kern_descrip.c 17 Jul 2002 13:22:43 -0000
> @@ -169,7 +169,6 @@
> int i, error;
>
> FILEDESC_LOCK(fdp);
> -retry:
> if (old >= fdp->fd_nfiles ||
> fdp->fd_ofiles[old] == NULL ||
> new >= p->p_rlimit[RLIMIT_NOFILE].rlim_cur ||
> @@ -187,10 +186,6 @@
> FILEDESC_UNLOCK(fdp);
> return (error);
> }
> - /*
> - * fdalloc() may block, retest everything.
> - */
> - goto retry;
> }
> error = do_dup(fdp, (int)old, (int)new, td->td_retval, td);
> return(error);
> @@ -1015,25 +1010,9 @@
> nfiles = NDEXTENT;
> else
> nfiles = 2 * fdp->fd_nfiles;
> - FILEDESC_UNLOCK(fdp);
> - mtx_lock(&Giant);
> MALLOC(newofile, struct file **, nfiles * OFILESIZE,
> M_FILEDESC, M_WAITOK);
> - mtx_unlock(&Giant);
> - FILEDESC_LOCK(fdp);
>
> - /*
> - * deal with file-table extend race that might have occured
> - * when malloc was blocked.
> - */
> - if (fdp->fd_nfiles >= nfiles) {
> - FILEDESC_UNLOCK(fdp);
> - mtx_lock(&Giant);
> - FREE(newofile, M_FILEDESC);
> - mtx_unlock(&Giant);
> - FILEDESC_LOCK(fdp);
> - continue;
> - }
> newofileflags = (char *) &newofile[nfiles];
> /*
> * Copy the existing ofile and ofileflags arrays
> @@ -1053,13 +1032,8 @@
> fdp->fd_ofileflags = newofileflags;
> fdp->fd_nfiles = nfiles;
> fdexpand++;
> - if (oldofile != NULL) {
> - FILEDESC_UNLOCK(fdp);
> - mtx_lock(&Giant);
> + if (oldofile != NULL)
> FREE(oldofile, M_FILEDESC);
> - mtx_unlock(&Giant);
> - FILEDESC_LOCK(fdp);
> - }
> }
> return (0);
> }
> @@ -1122,28 +1096,26 @@
> * descriptor to the list of open files at that point, otherwise
> * put it at the front of the list of open files.
> */
> - FILEDESC_LOCK(p->p_fd);
> - if ((error = fdalloc(td, 0, &i))) {
> - FILEDESC_UNLOCK(p->p_fd);
> - nfiles--;
> - sx_xunlock(&filelist_lock);
> - uma_zfree(file_zone, fp);
> - return (error);
> - }
> fp->f_mtxp = mtx_pool_alloc();
> fp->f_gcflag = 0;
> fp->f_count = 1;
> fp->f_cred = crhold(td->td_ucred);
> fp->f_ops = &badfileops;
> fp->f_seqcount = 1;
> + FILEDESC_LOCK(p->p_fd);
> if ((fq = p->p_fd->fd_ofiles[0])) {
> LIST_INSERT_AFTER(fq, fp, f_list);
> } else {
> LIST_INSERT_HEAD(&filehead, fp, f_list);
> }
> + sx_xunlock(&filelist_lock);
> + if ((error = fdalloc(td, 0, &i))) {
> + FILEDESC_UNLOCK(p->p_fd);
> + fdrop(fp, td);
> + return (error);
> + }
> p->p_fd->fd_ofiles[i] = fp;
> FILEDESC_UNLOCK(p->p_fd);
> - sx_xunlock(&filelist_lock);
> if (resultfp)
> *resultfp = fp;
> if (resultfd)
>
> This doesn't hold filelist_lock across fdalloc() and thus possibly across
> malloc(). It also just holds the filedesc pointer across malloc() so
> that fdalloc() is no longer racy (no internal races or introducing ones
> into calling code).
>
> --
>
> John Baldwin <jhb@FreeBSD.org> <>< http://www.FreeBSD.org/~jhb/
> "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
your patch seems reasonable, current dup and fcntl with F_DUPFD have
this race condition, to-be-dup'd descriptor could be close by another
thread beforce fdalloc returns, I sent a patch to alfred who I think is
FILEDESC locking maintainer to fix this race in do_dup, I didn't touch
fdalloc() because I think a big patch could be refused.
David Xu
__________________________________________________
Do You Yahoo!?
Yahoo! Autos - Get free new car price quotes
http://autos.yahoo.com
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe cvs-all" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20020717223752.70126.qmail>
