Date: Thu, 25 Aug 2011 22:28:49 -0500
From: Brandon Gooch <jamesbrandongooch@gmail.com>
To: Kostik Belousov <kostikbel@gmail.com>
Cc: freebsd-hackers@freebsd.org, Charlie Martin <crmartin@sgi.com>
Subject: Re: Where to ask about a 7.2 bug, and debugging sys/queue.h errors
Message-ID: <CALBk6yLS%2Bm4tp4qJq0Lxx37tiyV295-LR01OtecV44UN0MUT2A@mail.gmail.com>
In-Reply-To: <20110825222001.GX17489@deviant.kiev.zoral.com.ua>
References: <4E56BB99.6030706@sgi.com> <20110825215348.GW17489@deviant.kiev.zoral.com.ua> <CALBk6yL=T5nfL3rqJtskNS3ahafr0PM-4j5hpqzk8WkK_UXJNQ@mail.gmail.com> <20110825222001.GX17489@deviant.kiev.zoral.com.ua>
2011/8/25 Kostik Belousov <kostikbel@gmail.com>:
> On Thu, Aug 25, 2011 at 05:12:09PM -0500, Brandon Gooch wrote:
>> On Thu, Aug 25, 2011 at 4:53 PM, Kostik Belousov <kostikbel@gmail.com> wrote:
>> > On Thu, Aug 25, 2011 at 03:16:09PM -0600, Charlie Martin wrote:
>> >> We're having a crash in some internal code running on FreeBSD 7.2
>> >> (specifically 7.2-PRERELEASE FreeBSD 7.2-PRERELEASE and yeah, I know
>> >> it's quite a bit behind) in which after 18-30 hours of running load
>> >> tests, the code panics with:
>> >>
>> >> panic: Bad link elm 0xffffff0044c09600 next->prev != elm
>> >> cpuid = 0
>> >> KDB: stack backtrace:
>> >> db_trace_self_wrapper() at 0xffffffff8019119a = db_trace_self_wrapper+0x2a
>> >> panic() at 0xffffffff80307c72 = panic+0x182
>> >> devfs_populate_loop() at 0xffffffff802a43a8 = devfs_populate_loop+0x548
>> >>
>> >>
>> >> First question: where's the most appropriate place to ask about this
>> >> kind of bug on a back version.
>> > It is fine to ask there.
>> >
>> >>
>> >> Second: does this remind anyone of any bugs? Googling came up with a
>> >> few somewhat similar things but hasn't provided much insight so far.
>> > In 99% of the cases, it means that you forgot to dev_ref() some cdev.
>>
>> So dev_ref increments the reference count for a cdev. Even though the
>> word "loop" seems to indicate that we will iterate over a list of
>> objects (one of which we may be missing a reference to via a missing
>> dev_ref()), I'm not seeing how this can cause a panic from inside
>> devfs_populate_loop().
>>
>> Can you help me understand this?
>>
> Missing dev_ref() means that the memory for the cdev (and cdev_priv) is
> freed prematurely. If this happens before destroy_dev() is called,
> then the list which is iterated over by populate_loop(), is corrupted.
>
> See e.g. MAKEDEV_REF flag for make_dev(9) and its use in the (old) clone
> handlers.
>

Ahhh, thanks Kostik. Reading make_dev(9) (and more source code) now...

-Brandon
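For anyone reading this thread later, here is a rough userland sketch of the failure mode Kostik describes. It is not devfs code. The names fake_cdev, next_link_ok(), first_bad_link(), recycle_without_unlinking() and demo_corruption() are made up for illustration, and the use of a TAILQ is an assumption on my part. The check in next_link_ok() is meant to mirror the condition behind the "next->prev != elm" panic message. The "premature free" is modelled by overwriting a node that is still linked, since really touching freed memory would be undefined behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a device node; NOT the kernel's struct cdev_priv. */
struct fake_cdev {
	int unit;
	TAILQ_ENTRY(fake_cdev) link;
};

TAILQ_HEAD(fake_cdev_list, fake_cdev);

/*
 * True when elm's successor still points back at elm. This is the
 * condition assumed to sit behind "Bad link elm %p next->prev != elm".
 */
static bool
next_link_ok(struct fake_cdev *elm)
{
	struct fake_cdev *next = TAILQ_NEXT(elm, link);

	return (next == NULL || next->link.tqe_prev == &elm->link.tqe_next);
}

/* Return the first element whose forward link is inconsistent, or NULL. */
static struct fake_cdev *
first_bad_link(struct fake_cdev_list *head)
{
	struct fake_cdev *elm;

	TAILQ_FOREACH(elm, head, link) {
		if (!next_link_ok(elm))
			return (elm);
	}
	return (NULL);
}

/*
 * Model "freed and recycled while still on the list": a new owner
 * overwrites the node and nobody ever calls TAILQ_REMOVE() on it.
 */
static void
recycle_without_unlinking(struct fake_cdev *elm)
{
	*elm = (struct fake_cdev){ 0 };
}

/* Build a -> b -> c, clobber b, and report the unit of the flagged node. */
static int
demo_corruption(void)
{
	struct fake_cdev_list head = TAILQ_HEAD_INITIALIZER(head);
	struct fake_cdev a = { .unit = 10 }, b = { .unit = 11 }, c = { .unit = 12 };
	struct fake_cdev *bad;

	TAILQ_INSERT_TAIL(&head, &a, link);
	TAILQ_INSERT_TAIL(&head, &b, link);
	TAILQ_INSERT_TAIL(&head, &c, link);
	assert(first_bad_link(&head) == NULL);	/* intact list passes */

	recycle_without_unlinking(&b);		/* b's links are now stale */
	bad = first_bad_link(&head);
	return (bad == NULL ? -1 : bad->unit);
}
```

Calling demo_corruption() from a main() returns 10: the element that gets flagged is the predecessor of the clobbered node, not the clobbered node itself. That would be consistent with the address in a panic like the one above belonging to a perfectly healthy neighbour of the cdev that was freed too early.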