Date:      Thu, 2 May 2019 16:22:54 +0200
From:      Roger Pau Monné <roger.pau@citrix.com>
To:        Kyle Evans <kevans@freebsd.org>
Cc:        Cy Schubert <Cy.Schubert@cschubert.com>, src-committers <src-committers@freebsd.org>, <svn-src-all@freebsd.org>, <svn-src-head@freebsd.org>
Subject:   Re: svn commit: r346670 - head/sys/net
Message-ID:  <20190502142254.ywpxblnztanbmu6v@Air-de-Roger.citrite.net>
In-Reply-To: <CACNAnaFKT4nEW5gqZEDN8qr7H-fY3NOQzWMDwiWtnCuznj_t0w@mail.gmail.com>
References:  <roger.pau@citrix.com> <20190502111106.pfosaq73kgo6g33j@Air-de-Roger.citrite.net> <201905021158.x42BwduG006865@slippy.cwsent.com> <20190502125334.ly3putfkfnxvbhqv@Air-de-Roger.citrite.net> <CACNAnaFKT4nEW5gqZEDN8qr7H-fY3NOQzWMDwiWtnCuznj_t0w@mail.gmail.com>

On Thu, May 02, 2019 at 08:14:08AM -0500, Kyle Evans wrote:
> On Thu, May 2, 2019 at 7:54 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Thu, May 02, 2019 at 04:58:39AM -0700, Cy Schubert wrote:
> > > In message <20190502111106.pfosaq73kgo6g33j@Air-de-Roger.citrite.net>,
> > > Roger Pau Monné writes:
> > > > On Thu, Apr 25, 2019 at 12:44:08PM +0000, Kyle Evans wrote:
> > > > Apr 26 16:23:57.662653 panic: mtx_lock() of spin mutex (null) @ /usr/home/osstest/build.135317.build-amd64-freebsd/freebsd/sys/kern/subr_bus.c:620
> > > > Apr 26 16:23:57.674650 cpuid = 2
> > > > Apr 26 16:23:57.686653 time = 1
> > > > Apr 26 16:23:57.686720 KDB: stack backtrace:
> > > > Apr 26 16:23:57.686797 db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe003abe8710
> > > > Apr 26 16:23:57.686879 vpanic() at vpanic+0x19d/frame 0xfffffe003abe8760
> > > > Apr 26 16:23:57.698637 panic() at panic+0x43/frame 0xfffffe003abe87c0
> > > > Apr 26 16:23:57.698700 __mtx_lock_flags() at __mtx_lock_flags+0x145/frame 0xfffffe003abe8810
> > > > Apr 26 16:23:57.710640 devctl_queue_data_f() at devctl_queue_data_f+0x6a/frame 0xfffffe003abe8840
> > > > Apr 26 16:23:57.722625 g_dev_taste() at g_dev_taste+0x463/frame 0xfffffe003abe8a00
> > > > Apr 26 16:23:57.722690 g_load_class() at g_load_class+0x1bc/frame 0xfffffe003abe8a30
> > > > Apr 26 16:23:57.734638 g_run_events() at g_run_events+0x197/frame 0xfffffe003abe8a70
> > > > Apr 26 16:23:57.734704 fork_exit() at fork_exit+0x84/frame 0xfffffe003abe8ab0
> > > > Apr 26 16:23:57.746655 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe003abe8ab0
> > > > Apr 26 16:23:57.746721 --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> > > > Apr 26 16:23:57.758797 KDB: enter: panic
> > > > Apr 26 16:23:57.758913 [ thread pid 13 tid 100029 ]
> > > > Apr 26 16:23:57.758943 Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
> > > > Apr 26 16:23:57.770557 db>
> > > >
> > > > The automatic bisector has pointed at this commit as the culprit; you
> > > > can see the full bisection at:
> > > >
> > > > https://lists.xenproject.org/archives/html/xen-devel/2019-04/msg02061.html
> > > >
> > > > And an example of a failed test at:
> > > >
> > > > https://lists.xenproject.org/archives/html/xen-devel/2019-05/msg00104.html
> > > > http://logs.test-lab.xenproject.org/osstest/logs/135458/
> > > >
> > > > Thanks, Roger.
> > > >
> > > >
> > >
> > > It made a strange connection to this commit. The panic has geom written
> > > all over it.
> >
> > I agree it's a strange connection, but the results from the bisection
> > are quite clear: the previous commit, 070cf1ede1850d8c (r346664),
> > works fine, and this commit, d61e108233bfdb3 (r346670), fails.
> >
> > The bisection looks reliable as there are no skipped revisions or
> > spurious failures.
> >
> 
> This panic generally makes sense, if I read things right, but
> I'm not immediately sure how my commit triggered it. The mutex in
> question is initialized in devinit, invoked by the root_bus_mod at
> SI_SUB_DRIVERS + SI_ORDER_FIRST [0]. geom classes are also declared at
> SI_SUB_DRIVERS + SI_ORDER_FIRST [1], which takes us on a trip through
> g_modevent -> g_init. g_init creates the g_event thread [2] ->
> g_event_procbody -> g_run_events -> ... -> (boom). I guess the
> timing/ordering of these things normally works out so that devinit
> gets the mutex ready before the g_event thread does any
> loading/tasting, but not here.

My bet is that your usage of SX_SYSINIT somehow changed the order of
the geom stuff and now it's exploding.
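
To illustrate what I mean by the order changing, here is a minimal
user-space sketch, not kernel code. Everything in it is an assumption
made up for illustration: struct init_entry, the *_model functions and
the simple in-place sort are hypothetical stand-ins, and I am not
claiming this is how mi_startup() or SX_SYSINIT behave. It only shows
that when two init hooks share the same (subsystem, order) key, adding
an unrelated hook elsewhere can flip which of the two runs first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a SYSINIT entry; not the kernel's struct. */
struct init_entry {
    int subsystem;              /* models e.g. SI_SUB_DRIVERS */
    int order;                  /* models e.g. SI_ORDER_FIRST */
    const char *name;
    void (*func)(void);
};

static int mutex_ready;         /* models the mutex set up by devinit */
static int bad_lock;            /* set if the mutex is used before init */

static void devinit_model(void) { mutex_ready = 1; }
static void geom_model(void)    { if (!mutex_ready) bad_lock = 1; }
static void noop_model(void)    { /* unrelated hook, e.g. a new lock init */ }

/* Entries are compared on (subsystem, order) only, so two entries with
 * the same key have no defined relative order. */
static int in_order(const struct init_entry *a, const struct init_entry *b)
{
    if (a->subsystem != b->subsystem)
        return a->subsystem < b->subsystem;
    return a->order <= b->order;
}

/* Sort the table in place, run every hook, and return 1 if the modelled
 * mutex was used before it was initialized, 0 otherwise. */
static int run_boot(struct init_entry *tab, size_t n)
{
    assert(tab != NULL);

    /* Swap-based sort: a swap across an equal-key pair can reorder it. */
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (!in_order(&tab[i], &tab[j])) {
                struct init_entry tmp = tab[i];
                tab[i] = tab[j];
                tab[j] = tmp;
            }
        }
    }

    mutex_ready = 0;
    bad_lock = 0;
    for (size_t i = 0; i < n; i++)
        tab[i].func();
    return bad_lock;
}
```

With a table holding only the devinit and geom models (same key, in
that link order), run_boot() returns 0. Appending a single unrelated
entry with an earlier subsystem makes the sort swap it to the front,
which leaves the geom model ahead of the devinit model, and run_boot()
returns 1.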

I think I might have a fix for this, but I need to reproduce it on the
CI test boxes, since my own test box doesn't trigger the issue.

Roger.


