Date: Mon, 8 Aug 2005 15:24:01 +0200 From: Antoine Pelisse <apelisse@gmail.com> To: Don Lewis <truckman@freebsd.org>, freebsd-current@freebsd.org Subject: Re: Fix for some stress panics Message-ID: <61c746830508080624312e35a4@mail.gmail.com> In-Reply-To: <200508080913.j789DPW4094068@gw.catspoiler.org> References: <61c7468305080712591f8c7fda@mail.gmail.com> <200508080913.j789DPW4094068@gw.catspoiler.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On 8/8/05, Don Lewis <truckman@freebsd.org> wrote: >=20 > On 7 Aug, Antoine Pelisse wrote: > > http://people.freebsd.org/~pho/stress/log/cons149.html > > http://people.freebsd.org/~pho/stress/log/cons130.html > > > > I've been working on this panic today (the two are obviously > > the same) and here is a patch to fix it: > > --- sys/kern/kern_proc.c.orig Mon Apr 18 04:10:36 2005 > > +++ sys/kern/kern_proc.c Sun Aug 7 21:18:03 2005 > > @@ -884,10 +884,8 @@ > > _PHOLD(p); > > FOREACH_THREAD_IN_PROC(p, td) { > > fill_kinfo_thread(td, &kinfo_proc); > > - PROC_UNLOCK(p); > > error =3D SYSCTL_OUT(req, (caddr_t)&kinfo_proc, > > sizeof(kinfo_proc)); > > - PROC_LOCK(p); > > if (error) > > break; > > } > > > > As a matter of fact, if td is removed from the list through=20 > thread_unlink > > while > > the mutex is released and the next thread is removed just after, the=20 > FOREACH > > > > is looping through an unlinked list where the td_ksegrp has been set to= =20 > NULL > > > > by thread_exit. > > If we absolutely have to release the lock, then it's probably safer to= =20 > check > > if > > td_ksegroup !=3D NULL in the fill_kinfo_thread function. >=20 > Calling SYSCTL_OUT(), which calls copyout(), is not allowed while > holding a mutex unless the userland buffer is wired. The reason is that > if any of the userland buffer is not resident in RAM, the thread can > block while the buffer is paged in, and this is not legal while holding > a mutex. >=20 > There are two ways to fix this: >=20 > Allocate a temporary buffer of the appropriate size, grab the > mutex, traverse the list and copy the data into the temporary > buffer, release the mutex, call SYSCTL_OUT() to copy the > contents of the temporary buffer to the user buffer, and free > the temporary buffer. >=20 > Wire the appropriate amount of the userland buffer using > sysctl_wire_old_buffer(), grab the mutex, traverse the list, > copying each item to userland with SYSCTL_OUT(), release the > mutex, and unwire the userland buffer. Is that what you meant ? --- sys/kern/kern_proc.c.orig Mon Apr 18 04:10:36 2005 +++ sys/kern/kern_proc.c Mon Aug 8 15:09:50 2005 @@ -868,7 +868,7 @@ { struct thread *td; struct kinfo_proc kinfo_proc; - int error =3D 0; + int error, buffersize =3D 0; struct proc *np; pid_t pid =3D p->p_pid; @@ -883,11 +883,23 @@ } else { _PHOLD(p); FOREACH_THREAD_IN_PROC(p, td) { - fill_kinfo_thread(td, &kinfo_proc); - PROC_UNLOCK(p); + buffersize +=3D sizeof(struct kinfo_proc); + } + PROC_UNLOCK(p); + + error =3D sysctl_wire_old_buffer(req, buffersize); + if (error) + { + _PRELE(p); + return error; + } + + PROC_LOCK(p); + FOREACH_THREAD_IN_PROC(p, td) { + fill_kinfo_thread(td, &kinfo_proc); error =3D SYSCTL_OUT(req, (caddr_t)&kinfo_proc, sizeof(kinfo_proc)); - PROC_LOCK(p); + if (error) break; } In either case, the appropriate buffer size must be calculated (possibly > requiring the mutex to be grabbed and released) before executing an > operation that could block, grabbing the mutex, and traversing the list. > It is quite possible for the pre-calculated size to not match the actual > size needed when the list is traversed to actually copy the data. >=20 >=20 How can this issue be addressed ? Maybe some extra memory has to be wired ?
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?61c746830508080624312e35a4>