Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 8 Aug 2005 15:24:01 +0200
From:      Antoine Pelisse <apelisse@gmail.com>
To:        Don Lewis <truckman@freebsd.org>, freebsd-current@freebsd.org
Subject:   Re: Fix for some stress panics
Message-ID:  <61c746830508080624312e35a4@mail.gmail.com>
In-Reply-To: <200508080913.j789DPW4094068@gw.catspoiler.org>
References:  <61c7468305080712591f8c7fda@mail.gmail.com> <200508080913.j789DPW4094068@gw.catspoiler.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On 8/8/05, Don Lewis <truckman@freebsd.org> wrote:
>=20
> On 7 Aug, Antoine Pelisse wrote:
> > http://people.freebsd.org/~pho/stress/log/cons149.html
> > http://people.freebsd.org/~pho/stress/log/cons130.html
> >
> > I've been working on this panic today (the two are obviously
> > the same) and here is a patch to fix it:
> > --- sys/kern/kern_proc.c.orig Mon Apr 18 04:10:36 2005
> > +++ sys/kern/kern_proc.c Sun Aug 7 21:18:03 2005
> > @@ -884,10 +884,8 @@
> > _PHOLD(p);
> > FOREACH_THREAD_IN_PROC(p, td) {
> > fill_kinfo_thread(td, &kinfo_proc);
> > - PROC_UNLOCK(p);
> > error =3D SYSCTL_OUT(req, (caddr_t)&kinfo_proc,
> > sizeof(kinfo_proc));
> > - PROC_LOCK(p);
> > if (error)
> > break;
> > }
> >
> > As a matter of fact, if td is removed from the list through=20
> thread_unlink
> > while
> > the mutex is released and the next thread is removed just after, the=20
> FOREACH
> >
> > is looping through an unlinked list where the td_ksegrp has been set to=
=20
> NULL
> >
> > by thread_exit.
> > If we absolutely have to release the lock, then it's probably safer to=
=20
> check
> > if
> > td_ksegroup !=3D NULL in the fill_kinfo_thread function.
>=20
> Calling SYSCTL_OUT(), which calls copyout(), is not allowed while
> holding a mutex unless the userland buffer is wired. The reason is that
> if any of the userland buffer is not resident in RAM, the thread can
> block while the buffer is paged in, and this is not legal while holding
> a mutex.
>=20
> There are two ways to fix this:
>=20
> Allocate a temporary buffer of the appropriate size, grab the
> mutex, traverse the list and copy the data into the temporary
> buffer, release the mutex, call SYSCTL_OUT() to copy the
> contents of the temporary buffer to the user buffer, and free
> the temporary buffer.
>=20
> Wire the appropriate amount of the userland buffer using
> sysctl_wire_old_buffer(), grab the mutex, traverse the list,
> copying each item to userland with SYSCTL_OUT(), release the
> mutex, and unwire the userland buffer.



Is that what you meant ?

--- sys/kern/kern_proc.c.orig Mon Apr 18 04:10:36 2005
+++ sys/kern/kern_proc.c Mon Aug 8 15:09:50 2005
@@ -868,7 +868,7 @@
{
struct thread *td;
struct kinfo_proc kinfo_proc;
- int error =3D 0;
+ int error, buffersize =3D 0;
struct proc *np;
pid_t pid =3D p->p_pid;

@@ -883,11 +883,23 @@
} else {
_PHOLD(p);
FOREACH_THREAD_IN_PROC(p, td) {
- fill_kinfo_thread(td, &kinfo_proc);
- PROC_UNLOCK(p);
+ buffersize +=3D sizeof(struct kinfo_proc);
+ }
+ PROC_UNLOCK(p);
+
+ error =3D sysctl_wire_old_buffer(req, buffersize);
+ if (error)
+ {
+ _PRELE(p);
+ return error;
+ }
+
+ PROC_LOCK(p);
+ FOREACH_THREAD_IN_PROC(p, td) {
+ fill_kinfo_thread(td, &kinfo_proc);
error =3D SYSCTL_OUT(req, (caddr_t)&kinfo_proc,
sizeof(kinfo_proc));
- PROC_LOCK(p);
+
if (error)
break;
}


In either case, the appropriate buffer size must be calculated (possibly
> requiring the mutex to be grabbed and released) before executing an
> operation that could block, grabbing the mutex, and traversing the list.
> It is quite possible for the pre-calculated size to not match the actual
> size needed when the list is traversed to actually copy the data.
>=20
>=20
How can this issue be addressed ? Maybe some extra memory has to be wired ?



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?61c746830508080624312e35a4>