Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 17 Jul 2002 15:37:52 -0700 (PDT)
From:      David Xu <bsddiy@yahoo.com>
To:        John Baldwin <jhb@FreeBSD.org>
Cc:        "cvs-all@FreeBSD.org" <cvs-all@FreeBSD.org>, "cvs-committers@FreeBSD.org" <cvs-committers@FreeBSD.org>
Subject:   Re: cvs commit: src/sys/kern kern_descrip.c
Message-ID:  <20020717223752.70126.qmail@web20905.mail.yahoo.com>
In-Reply-To: <XFMail.20020713121856.jhb@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help

----- Original Message ----- 
From: "John Baldwin" <jhb@FreeBSD.org>
To: "David Xu" <davidx@viasoft.com.cn>
Cc: <cvs-all@FreeBSD.org>; <cvs-committers@FreeBSD.org>
Sent: Wednesday, July 17, 2002 9:48 PM
Subject: Re: cvs commit: src/sys/kern kern_descrip.c


> 
> On 17-Jul-2002 David Xu wrote:
> > ----- Original Message ----- 
> > From: "John Baldwin" <jhb@FreeBSD.ORG>
> > To: <cvs-committers@FreeBSD.ORG>; <cvs-all@FreeBSD.ORG>
> > Sent: Wednesday, July 17, 2002 10:48 AM
> > Subject: cvs commit: src/sys/kern kern_descrip.c
> > 
> > 
> >> jhb         2002/07/16 19:48:43 PDT
> >> 
> >>   Modified files:
> >>     sys/kern             kern_descrip.c 
> >>   Log:
> >>   Preallocate a struct file as the first thing in falloc() before we lock
> >>   the filelist_lock and check nfiles.  This closes a race where we had to
> >>   unlock the filedesc to re-lock the filelist_lock.
> >>   
> >>   Reported by:    David Xu
> >>   Reviewed by:    bde (mostly)
> >>   
> >>   Revision  Changes    Path
> >>   1.150     +5 -16     src/sys/kern/kern_descrip.c
> >> 
> > 
> > holds filelist_lock and does not release until function returns, 
> > this lock seems too big for me. it almostly forces the whole system
> > to serialize through open() syscall. I see filelist_lock only maintain
> > global opened file list. why should a program calling falloc be blocked
> > out by another program calling falloc so long time?
> 
> Look at the code in question that is under the lock.  It may look long but
> it really isn't.  It's mostly just doing simple variable assignments and
> shuffles.  The only function it calls is fdalloc().  Oh, geez.  And that
> unlocks the filedesc lock.  So every place that calls fdalloc() is
> actually a race.  *sigh*  Actually, it doesn't need to.  Hmm, but that
> can still be potentially long.
> 
> Well, we could do this:
> 
> Index: kern_descrip.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/kern/kern_descrip.c,v
> retrieving revision 1.150
> diff -u -r1.150 kern_descrip.c
> --- kern_descrip.c      17 Jul 2002 02:48:43 -0000      1.150
> +++ kern_descrip.c      17 Jul 2002 13:22:43 -0000
> @@ -169,7 +169,6 @@
>         int i, error;
>  
>         FILEDESC_LOCK(fdp);
> -retry:
>         if (old >= fdp->fd_nfiles ||
>             fdp->fd_ofiles[old] == NULL ||
>             new >= p->p_rlimit[RLIMIT_NOFILE].rlim_cur ||
> @@ -187,10 +186,6 @@
>                         FILEDESC_UNLOCK(fdp);
>                         return (error);
>                 }
> -               /*
> -                * fdalloc() may block, retest everything.
> -                */
> -               goto retry;
>         }
>         error = do_dup(fdp, (int)old, (int)new, td->td_retval, td);
>         return(error);
> @@ -1015,25 +1010,9 @@
>                         nfiles = NDEXTENT;
>                 else
>                         nfiles = 2 * fdp->fd_nfiles;
> -               FILEDESC_UNLOCK(fdp);
> -               mtx_lock(&Giant);
>                 MALLOC(newofile, struct file **, nfiles * OFILESIZE,
>                     M_FILEDESC, M_WAITOK);
> -               mtx_unlock(&Giant);
> -               FILEDESC_LOCK(fdp);
>  
> -               /*
> -                * deal with file-table extend race that might have occured
> -                * when malloc was blocked.
> -                */
> -               if (fdp->fd_nfiles >= nfiles) {
> -                       FILEDESC_UNLOCK(fdp);
> -                       mtx_lock(&Giant);
> -                       FREE(newofile, M_FILEDESC);
> -                       mtx_unlock(&Giant);
> -                       FILEDESC_LOCK(fdp);
> -                       continue;
> -               }
>                 newofileflags = (char *) &newofile[nfiles];
>                 /*
>                  * Copy the existing ofile and ofileflags arrays
> @@ -1053,13 +1032,8 @@
>                 fdp->fd_ofileflags = newofileflags;
>                 fdp->fd_nfiles = nfiles;
>                 fdexpand++;
> -               if (oldofile != NULL) {
> -                       FILEDESC_UNLOCK(fdp);
> -                       mtx_lock(&Giant);
> +               if (oldofile != NULL)
>                         FREE(oldofile, M_FILEDESC);
> -                       mtx_unlock(&Giant);
> -                       FILEDESC_LOCK(fdp);
> -               }
>         }
>         return (0);
>  }
> @@ -1122,28 +1096,26 @@
>          * descriptor to the list of open files at that point, otherwise
>          * put it at the front of the list of open files.
>          */
> -       FILEDESC_LOCK(p->p_fd);
> -       if ((error = fdalloc(td, 0, &i))) {
> -               FILEDESC_UNLOCK(p->p_fd);
> -               nfiles--;
> -               sx_xunlock(&filelist_lock);
> -               uma_zfree(file_zone, fp);
> -               return (error);
> -       }
>         fp->f_mtxp = mtx_pool_alloc();
>         fp->f_gcflag = 0;
>         fp->f_count = 1;
>         fp->f_cred = crhold(td->td_ucred);
>         fp->f_ops = &badfileops;
>         fp->f_seqcount = 1;
> +       FILEDESC_LOCK(p->p_fd);
>         if ((fq = p->p_fd->fd_ofiles[0])) {
>                 LIST_INSERT_AFTER(fq, fp, f_list);
>         } else {
>                 LIST_INSERT_HEAD(&filehead, fp, f_list);
>         }
> +       sx_xunlock(&filelist_lock);
> +       if ((error = fdalloc(td, 0, &i))) {
> +               FILEDESC_UNLOCK(p->p_fd);
> +               fdrop(fp, td);
> +               return (error);
> +       }
>         p->p_fd->fd_ofiles[i] = fp;
>         FILEDESC_UNLOCK(p->p_fd);
> -       sx_xunlock(&filelist_lock);
>         if (resultfp)
>                 *resultfp = fp;
>         if (resultfd)
> 
> This doesn't hold filelist_lock across fdalloc() and thus possibly across
> malloc().  It also just holds the filedesc pointer across malloc() so
> that fdalloc() is no longer racy (no internal races or introducing ones
> into calling code).
> 
> -- 
> 
> John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
> "Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

your patch seems reasonable, current dup and fcntl with F_DUPFD have
this race condition, to-be-dup'd descriptor could be close by another
thread beforce fdalloc returns, I sent a patch to alfred who I think is 
FILEDESC locking maintainer to fix this race in do_dup, I didn't touch
fdalloc() because I think a big patch could be refused.

David Xu


__________________________________________________
Do You Yahoo!?
Yahoo! Autos - Get free new car price quotes
http://autos.yahoo.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe cvs-all" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20020717223752.70126.qmail>