Date:      Wed, 26 Aug 2015 16:36:16 -0700
From:      John-Mark Gurney <jmg@funkthat.com>
To:        Andriy Gapon <avg@FreeBSD.org>
Cc:        FreeBSD Current <freebsd-current@FreeBSD.org>, Lawrence Stewart <lstewart@room52.net>, Pawel Pekala <pawel@FreeBSD.org>, "K. Macy" <kmacy@FreeBSD.org>
Subject:   Re: Instant panic while trying run ports-mgmt/poudriere
Message-ID:  <20150826233616.GU33167@funkthat.com>
In-Reply-To: <55D96E24.9060106@FreeBSD.org>
References:  <20150713231205.627bab36@FreeBSD.org> <20150714223829.GY8523@funkthat.com> <20150715174616.652d0aea@FreeBSD.org> <20150715180526.GM8523@funkthat.com> <20150715223703.78b9197c@FreeBSD.org> <CAHM0Q_PLRP4t6JgkstXHNOVV%2B2DyathOgi8bg4-RQkW-BcGXow@mail.gmail.com> <20150806233328.47a02594@FreeBSD.org> <55CB5428.2090505@room52.net> <55D96E24.9060106@FreeBSD.org>

Andriy Gapon wrote this message on Sun, Aug 23, 2015 at 09:54 +0300:
> On 12/08/2015 17:11, Lawrence Stewart wrote:
> > On 08/07/15 07:33, Pawel Pekala wrote:
> >> Hi K.,
> >>
> >> On 2015-08-06 12:33 -0700, "K. Macy" <kmacy@freebsd.org> wrote:
> >>> Is this still happening?
> >>
> >> Still crashes:
> > 
> > +1 for me running r286617
> 
> Here is another +1 with r286922.
> I can add a couple of bits of debugging data:
> 
> (kgdb) fr 8
> #8  0xffffffff80639d60 in knote (list=0xfffff8019a733ea0,
> hint=2147483648, lockflags=<value optimized out>) at
> /usr/src/sys/kern/kern_event.c:1964
> 1964                    } else if ((lockflags & KNF_NOKQLOCK) != 0) {
> (kgdb) p *list
> $2 = {kl_list = {slh_first = 0x0}, kl_lock = 0xffffffff8063a1e0

We should not (and cannot) get here with an empty list.  If we do, then
something is seriously wrong.  The current kn (which we must have, as we
are here) MUST be on the list, but as you just showed, there are no
knotes on the list.

Can you get me a print of the knote?  That way I can see what flags
are on it.

> <knlist_mtx_lock>, kl_unlock = 0xffffffff8063a200 <knlist_mtx_unlock>,
>   kl_assert_locked = 0xffffffff8063a220 <knlist_mtx_assert_locked>,
> kl_assert_unlocked = 0xffffffff8063a240 <knlist_mtx_assert_unlocked>,
>   kl_lockarg = 0xfffff8019a733bb0}
> (kgdb) disassemble
> Dump of assembler code for function knote:
> 0xffffffff80639d00 <knote+0>:   push   %rbp
> 0xffffffff80639d01 <knote+1>:   mov    %rsp,%rbp
> 0xffffffff80639d04 <knote+4>:   push   %r15
> 0xffffffff80639d06 <knote+6>:   push   %r14
> 0xffffffff80639d08 <knote+8>:   push   %r13
> 0xffffffff80639d0a <knote+10>:  push   %r12
> 0xffffffff80639d0c <knote+12>:  push   %rbx
> 0xffffffff80639d0d <knote+13>:  sub    $0x18,%rsp
> 0xffffffff80639d11 <knote+17>:  mov    %edx,%r12d
> 0xffffffff80639d14 <knote+20>:  mov    %rsi,-0x30(%rbp)
> 0xffffffff80639d18 <knote+24>:  mov    %rdi,%rbx
> 0xffffffff80639d1b <knote+27>:  test   %rbx,%rbx
> 0xffffffff80639d1e <knote+30>:  je     0xffffffff80639ef6 <knote+502>
> 0xffffffff80639d24 <knote+36>:  mov    %r12d,%eax
> 0xffffffff80639d27 <knote+39>:  and    $0x1,%eax
> 0xffffffff80639d2a <knote+42>:  mov    %eax,-0x3c(%rbp)
> 0xffffffff80639d2d <knote+45>:  mov    0x28(%rbx),%rdi
> 0xffffffff80639d31 <knote+49>:  je     0xffffffff80639d38 <knote+56>
> 0xffffffff80639d33 <knote+51>:  callq  *0x18(%rbx)
> 0xffffffff80639d36 <knote+54>:  jmp    0xffffffff80639d42 <knote+66>
> 0xffffffff80639d38 <knote+56>:  callq  *0x20(%rbx)
> 0xffffffff80639d3b <knote+59>:  mov    0x28(%rbx),%rdi
> 0xffffffff80639d3f <knote+63>:  callq  *0x8(%rbx)
> 0xffffffff80639d42 <knote+66>:  mov    %rbx,-0x38(%rbp)
> 0xffffffff80639d46 <knote+70>:  mov    (%rbx),%rbx
> 0xffffffff80639d49 <knote+73>:  test   %rbx,%rbx
> 0xffffffff80639d4c <knote+76>:  je     0xffffffff80639ee5 <knote+485>
> 0xffffffff80639d52 <knote+82>:  and    $0x2,%r12d
> 0xffffffff80639d56 <knote+86>:  nopw   %cs:0x0(%rax,%rax,1)
> 0xffffffff80639d60 <knote+96>:  mov    0x28(%rbx),%r14
> 
> Panic is in the last quoted instruction.
> And:
> (kgdb) i reg
> rax            0x246    582
> rbx            0xdeadc0dedeadc0de       -2401050962867404578
> rcx            0x0      0
> rdx            0x12e    302
> rsi            0xffffffff80a26a5a       -2136839590
> rdi            0xffffffff80e81b80       -2132272256
> rbp            0xfffffe02b7efea20       0xfffffe02b7efea20
> rsp            0xfffffe02b7efe9e0       0xfffffe02b7efe9e0
> r8             0xffffffff80a269ce       -2136839730
> r9             0xffffffff80e82838       -2132269000
> r10            0x10000  65536
> r11            0xffffffff80fabd10       -2131051248
> r12            0x0      0
> r13            0xfffff801ff84a818       -8787511171048
> r14            0xfffff801ff84a800       -8787511171072
> r15            0xfffff8019a6974f0       -8789207452432
> rip            0xffffffff80639d60       0xffffffff80639d60 <knote+96>
> eflags         0x10286  66182
> 
> I think that $rbx stands out here (this is a kernel with INVARIANTS).

Yeah, that is probably from r284861, which I added to catch
use-after-free bugs like this.  You could try reverting r284861 to see
if the panic goes away.  If it does, then this is most likely a
use-after-free bug.
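
To illustrate why the $rbx value points at use after free: 0xdeadc0de
is the junk pattern an INVARIANTS kernel writes over freed memory, so a
"next" pointer loaded from a freed object reads back as that pattern
repeated.  The following is only a userland sketch of the idea, not the
kernel code; struct item, junk_fill(), next_bits() and junk_pointer()
are made-up names, and the fill routine here is an assumption about the
general technique rather than a copy of the kernel's.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JUNK 0xdeadc0deU	/* 32-bit fill pattern, repeated over the object */

/* Hypothetical stand-in for a list element such as a knote. */
struct item {
	struct item *next;
	int value;
};

/* Overwrite an object with the repeating junk pattern, as a debug free might. */
static void
junk_fill(void *p, size_t len)
{
	const uint32_t pat = JUNK;
	unsigned char *b = p;
	size_t i;

	for (i = 0; i + sizeof(pat) <= len; i += sizeof(pat))
		memcpy(b + i, &pat, sizeof(pat));
}

/* Return the raw bits of the next pointer without dereferencing it. */
static uintptr_t
next_bits(const struct item *it)
{
	uintptr_t bits;

	memcpy(&bits, &it->next, sizeof(bits));
	return (bits);
}

/* The junk pattern as it would appear in a pointer-sized register. */
static uintptr_t
junk_pointer(void)
{
	uintptr_t bits;

	junk_fill(&bits, sizeof(bits));
	return (bits);
}
```

A list walk that loads next_bits() from a junk-filled item and then
dereferences it would fault the same way the mov at knote+96 does.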

> Looking at the code, is it possible that one of the calls from within
> the loop's body modifies the list?  If that is so and provided that is a
> valid behavior, then maybe using SLIST_FOREACH_SAFE would help.
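
For reference, the difference the quoted suggestion relies on can be
sketched in userland C as below.  This is not the kern_event.c code;
struct node, node_add(), walk_and_reap() and list_len() are
hypothetical names used only for illustration.  The fallback macro is
there because some non-FreeBSD <sys/queue.h> headers do not provide
SLIST_FOREACH_SAFE.

```c
#include <stdlib.h>
#include <sys/queue.h>

/* Some <sys/queue.h> versions lack the _SAFE variant; FreeBSD's has it. */
#ifndef SLIST_FOREACH_SAFE
#define SLIST_FOREACH_SAFE(var, head, field, tvar)			\
	for ((var) = SLIST_FIRST((head));				\
	    (var) != NULL && ((tvar) = SLIST_NEXT((var), field), 1);	\
	    (var) = (tvar))
#endif

/* Hypothetical stand-in for struct knote. */
struct node {
	int id;
	int oneshot;		/* remove this node when it is visited */
	SLIST_ENTRY(node) link;
};
SLIST_HEAD(nodelist, node);

static struct node *
node_add(struct nodelist *head, int id, int oneshot)
{
	struct node *n = malloc(sizeof(*n));

	if (n == NULL)
		return (NULL);
	n->id = id;
	n->oneshot = oneshot;
	SLIST_INSERT_HEAD(head, n, link);
	return (n);
}

/*
 * Visit every node, freeing the oneshot ones.  The next pointer is saved
 * in tmp before the body runs, so freeing n does not break the walk.
 * Plain SLIST_FOREACH would read n's link field after the free.
 */
static int
walk_and_reap(struct nodelist *head)
{
	struct node *n, *tmp;
	int visited = 0;

	SLIST_FOREACH_SAFE(n, head, link, tmp) {
		visited++;
		if (n->oneshot) {
			SLIST_REMOVE(head, n, node, link);
			free(n);
		}
	}
	return (visited);
}

/* Count the nodes remaining on the list. */
static int
list_len(struct nodelist *head)
{
	struct node *n;
	int len = 0;

	SLIST_FOREACH(n, head, link)
		len++;
	return (len);
}
```

Note that the _SAFE form only covers the body removing the current
element.  If something removes or frees the saved next element during
the body, the walk is still broken.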

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."


