Date:      Tue, 12 Nov 2013 23:46:55 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Justin Hibbits <jhibbits@freebsd.org>
Cc:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: Strange panic on ppc64
Message-ID:  <20131112214655.GZ59496@kib.kiev.ua>
In-Reply-To: <CAHSQbTCnFnVBtL%2B9VOn%2B9zNMJqt=cLyKB6AYjDzjHrddZq65ug@mail.gmail.com>
References:  <CAHSQbTD6%2BDd-So88gSArTtpcA=w4D-GibGpoFLoHQuFPjUrKuA@mail.gmail.com> <20131112205142.GY59496@kib.kiev.ua> <CAHSQbTCnFnVBtL%2B9VOn%2B9zNMJqt=cLyKB6AYjDzjHrddZq65ug@mail.gmail.com>


On Tue, Nov 12, 2013 at 01:13:28PM -0800, Justin Hibbits wrote:
> On Tue, Nov 12, 2013 at 12:51 PM, Konstantin Belousov
> <kostikbel@gmail.com>wrote:
>
> > On Tue, Nov 12, 2013 at 08:32:31AM -0800, Justin Hibbits wrote:
> > > The log is attached.  I'm not sure what exactly is going on here.  The
> > > conditions were: building something on zfs, while also accessing files
> > > over NFS.  It seems each of those individually is fine, but doing both
> > > brings my system down.  I _think_ the actual panic message (recursed on
> > > non-recursive mutex) is a red herring, since it already trapped in the
> > > kernel, twice.  Any clues?  It's 100% reproducible by me.
> > >
> > This does not seem related to NFS or ZFS proper.  What happens is
> > that tc_windup(), executing in the interrupt context, decided to enter
> > a debugger.  I am not sure why the debugger is entered.
> >
> > Apart from this, the situation is clear:
> > the interrupt happens while the referenced mutex was owned. The debugger
> > is entered, and tries to read a char from keyboard, which is USB. For
> > USB to function, it has to access a lot of the kernel services, in
> > particular, busdma, which, in turn, requires some pmap calls, and you
> > end up accessing the same mutex.
> >
> > The bug there is that code executed from interrupt or debugger context
> > must not lock mutexes, or generally, call into top-half of the kernel
> > (now top half is essentially the whole kernel).  I am not sure if
> > USB could ever work in such mode.
> >
>
> I discussed this with Nathan on IRC earlier.  You're right that it's not
> related to NFS nor ZFS, at least not directly.  It's actually most likely a
> stack overflow, since currently there are only 4 pages for stack, so when
> it takes the DECR trap it ends up blowing the stack.  This is only made
> evident because ZFS is very stack hungry.  I'm upping the stack to 8 pages,
> and testing tonight.
>
> As for your assessment of the situation, you're spot on, and I have no idea
> how to properly fix it.

If this were a stack overflow, I would not expect to see the frames I
described.  The panic clearly states that you get a recursion on a mutex,
and a sleepable mutex must not be locked from interrupt or debugger context.
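
To illustrate the recursion, here is a minimal userland sketch in C.  It is
not kernel code and does not use the real mtx(9) interface: toy_mtx,
toy_mtx_lock(), toy_debugger_getc() and toy_scenario() are hypothetical names
made up for this example.  It assumes only what was said above: the debugger
input path runs in the context of the interrupted thread and ends up taking
the same lock that thread may already own.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a non-recursive mutex; not FreeBSD's struct mtx. */
struct toy_mtx {
    const void *owner;              /* NULL while unlocked */
};

enum toy_result { TOY_OK = 0, TOY_RECURSED = 1 };

/*
 * Non-recursive acquire: refuse if the caller already owns the lock.
 * A real kernel would panic at this point instead of returning an error.
 * Contention from other owners is not modeled; the sketch is single-threaded.
 */
static enum toy_result
toy_mtx_lock(struct toy_mtx *m, const void *self)
{
    if (m->owner == self)
        return TOY_RECURSED;
    m->owner = self;
    return TOY_OK;
}

static void
toy_mtx_unlock(struct toy_mtx *m)
{
    m->owner = NULL;
}

/*
 * Stand-in for the debugger reading a character: in the description above
 * the path is USB -> busdma -> pmap, ending at the same lock.
 */
static enum toy_result
toy_debugger_getc(struct toy_mtx *pmap_lock, const void *self)
{
    enum toy_result r = toy_mtx_lock(pmap_lock, self);

    if (r == TOY_OK)
        toy_mtx_unlock(pmap_lock);
    return r;
}

/*
 * The "interrupt" borrows the interrupted thread's identity, so the lock
 * sees the same owner twice when the thread was interrupted while holding it.
 */
enum toy_result
toy_scenario(bool interrupted_while_locked)
{
    struct toy_mtx pmap_lock = { NULL };
    int thread;                     /* its address serves as thread identity */
    enum toy_result r;

    if (interrupted_while_locked)
        toy_mtx_lock(&pmap_lock, &thread);
    r = toy_debugger_getc(&pmap_lock, &thread);
    if (interrupted_while_locked)
        toy_mtx_unlock(&pmap_lock);
    return r;
}
```

Calling toy_scenario(true) models the interrupt arriving while the mutex is
owned and reports the recursion; toy_scenario(false) models the same input
path with the lock free and succeeds.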



