Date: Thu, 17 Nov 2011 11:06:53 +0200 From: Kostik Belousov <kostikbel@gmail.com> To: Alexander Motin <mav@freebsd.org> Cc: freebsd-current@freebsd.org, Andriy Gapon <avg@freebsd.org> Subject: Re: Stop scheduler on panic Message-ID: <20111117090653.GR50300@deviant.kiev.zoral.com.ua> In-Reply-To: <4EC4C89A.2040208@FreeBSD.org> References: <20111113083215.GV50300@deviant.kiev.zoral.com.ua> <20111116202714.5ee4bd53@fabiankeil.de> <4EC43764.1020202@FreeBSD.org> <4EC4423A.3020904@FreeBSD.org> <20111117081533.GP50300@deviant.kiev.zoral.com.ua> <4EC4C89A.2040208@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
--L9AyDL8231zxsA/w Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Nov 17, 2011 at 10:40:58AM +0200, Alexander Motin wrote: > On 11/17/11 10:15, Kostik Belousov wrote: > > On Thu, Nov 17, 2011 at 01:07:38AM +0200, Alexander Motin wrote: > >> On 17.11.2011 00:21, Andriy Gapon wrote: > >>> on 16/11/2011 21:27 Fabian Keil said the following: > >>>> Kostik Belousov<kostikbel@gmail.com> wrote: > >>>> > >>>>> I was tricked into finishing the work by Andrey Gapon, who developed > >>>>> the patch to reliably stop other processors on panic. The patch > >>>>> greatly improves the chances of getting dump on panic on SMP host. > >>>> > >>>> I tested the patch trying to get a dump (from the debugger) for > >>>> kern/162036, which currently results in the double fault reported in: > >>>> http://lists.freebsd.org/pipermail/freebsd-current/2011-September/02= 7766.html > >>>> > >>>> It didn't help, but also didn't make anything worse. > >>>> > >>>> Fabian > >>> > >>> The mi_switch recursion looks very familiar to me: > >>> mi_switch() at mi_switch+0x270 > >>> critical_exit() at critical_exit+0x9b > >>> spinlock_exit() at spinlock_exit+0x17 > >>> mi_switch() at mi_switch+0x275 > >>> critical_exit() at critical_exit+0x9b > >>> spinlock_exit() at spinlock_exit+0x17 > >>> [several pages of the previous three lines skipped] > >>> mi_switch() at mi_switch+0x275 > >>> critical_exit() at critical_exit+0x9b > >>> spinlock_exit() at spinlock_exit+0x17 > >>> intr_even_schedule_thread() at intr_event_schedule_thread+0xbb > >>> ahci_end_transaction() at ahci_end_transaction+0x398 > >>> ahci_ch_intr() at ahci_ch_intr+0x2b5 > >>> ahcipoll() at ahcipoll+0x15 > >>> xpt_polled_action() at xpt_polled_action+0xf7 > >>> > >>> In fact I once discussed with jhb this recursion triggered from a dif= ferent > >>> place. To quote myself: > >>> <avg> spinlock_exit -> critical_exit -> mi_switch -> kdb_switch= -> > >>> thread_unlock -> spinlock_exit -> critical_exit -> mi_switch -> .= .. > >>> <avg> in the kdb context > >>> <avg> this issue seems to be triggered by td_owepreempt being true= at=20 > >>> the time > >>> kdb is entered > >>> <avg> and there of course has to be an initial spinlock_exit call= =20 > >>> somewhere > >>> <avg> in my case it's because of usb keyboard > >>> <avg> I wonder if it would make sense to clear td_owepreempt right= =20 > >>> before > >>> calling kdb_switch in mi_switch > >>> <avg> instead of in sched_switch() > >>> <avg> clearing td_owepreempt seems like a scheduler-independent=20 > >>> operation to me > >>> <avg> or is it better to just skip locking in usb when kdb_active = is set > >>> <avg> ? > >>> > >>> The workaround described above should work in this case. > >>> Another possibility is to pessimize mtx_unlock_spin() implementations= to=20 > >>> check > >>> SCHEDULER_STOPPED() and to bypass any further actions in that case. = But=20 > >>> that > >>> would add unnecessary overhead to the sunny day code paths. > >>> > >>> Going further up the stack one can come up with the following proposa= ls: > >>> - check SCHEDULER_STOPPED() swi_sched() and return early > >>> - do not call swi_sched() from xpt_done() if we somehow know that we = are=20 > >>> in a > >>> polling mode > >> > >> There is no flag in CAM now to indicate polling mode, but if needed, i= t=20 > >> should not be difficult to add one and not call swi_sched(). > >=20 > > I have the following change for eons on my test boxes. Without it, > > I simply cannot get _any_ dump. > >=20 > > diff --git a/sys/cam/cam_xpt.c b/sys/cam/cam_xpt.c > > index 10b89c7..a38e42f 100644 > > --- a/sys/cam/cam_xpt.c > > +++ b/sys/cam/cam_xpt.c > > @@ -4230,7 +4230,7 @@ xpt_done(union ccb *done_ccb) > > TAILQ_INSERT_TAIL(&cam_simq, sim, links); > > mtx_unlock(&cam_simq_lock); > > sim->flags |=3D CAM_SIM_ON_DONEQ; > > - if (first) > > + if (first && panicstr =3D=3D NULL) > > swi_sched(cambio_ih, 0); > > } > > } >=20 > That should be OK for kernel dumping. I was thinking about CAM abusing > polling not only for dumping. But looking on cases where it does it now, > may be it is better to rewrite them instead. So, should I interpret your response as 'Reviewed by" ? --L9AyDL8231zxsA/w Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (FreeBSD) iEYEARECAAYFAk7Ezq0ACgkQC3+MBN1Mb4hbuQCgpbGVw/gU8OWAA/HefdszTLdk YYMAoItZnmQ4LsGQZhC5yOMhXgpJq5Xd =30t7 -----END PGP SIGNATURE----- --L9AyDL8231zxsA/w--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20111117090653.GR50300>