Date:      Sun, 26 Jul 2015 14:07:12 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-pf@FreeBSD.org
Subject:   [Bug 201879] panic: boot time panic with a scrub rule on "exclusive sleep mutex pf fragments"...
Message-ID:  <bug-201879-17777-VqZnNy7ijY@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-201879-17777@https.bugs.freebsd.org/bugzilla/>
References:  <bug-201879-17777@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201879

--- Comment #2 from Jason Unovitch <jason.unovitch@gmail.com> ---
Created attachment 159239
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=159239&action=edit
r285884M panic with some extra debug statements in  pf_purge_expired_fragments

(In reply to Kristof Provost from comment #1)

I'm with you, and I'm still trying to understand it myself.  Last night I
sprinkled some debug prints around pf_purge_expired_fragments, since line
237 pointed me in that direction.

I managed to find a replication case.  I can get a stable boot with the scrub
rules if I turn off OpenNTPD, along with Puppet and Monit since they would
otherwise try to start it.  The full boot log is attached and the tail end of
it is below.  Once I start OpenNTPD, the burst of network traffic causes an
instant panic the next time the pf purge thread fires.

root@xju-rtr:~ # service openntpd onestart
Starting openntpd.
root@xju-rtr:~ # Jul 26 03:43:23 xju-rtr ntpd[23153]: constraint certificate
verification turned off
DEBUG: Entry of pf_purge_expired_fragments()
DEBUG: Trying to PR_FRAG_LOCK()()
DEBUG: Finished PF_FRAG_LOCK()
DEBUG: Start fragment purge()
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex pf fragments (pf fragments) r = 0 (0xc9fe2458) locked @
/usr/src/head/sys/modules/pf/../../netpfil/pf/pf_norm.c:239
KDB: stack backtrace:
db_trace_self_wrapper(c1538c45,702f6670,6f6e5f66,632e6d72,3933323a,...) at
db_trace_self_wrapper+0x2a/frame 0xeb7719a0
kdb_backtrace(c153cfd1,0,c9fe2458,c9fdfc3d,ef,...) at kdb_backtrace+0x2d/frame
0xeb771a08
witness_warn(5,0,c16ffc72,c1960a9c,c764f330,...) at witness_warn+0x40f/frame
0xeb771a58
trap_pfault(deadc0de,c,c7e62cc0,7f,c1960a10,...) at trap_pfault+0x58/frame
0xeb771ad0
trap(eb771c1c) at trap+0x6c1/frame 0xeb771c10
calltrap() at calltrap+0x6/frame 0xeb771c10
--- trap 0xc, eip = 0xc9fd00c6, esp = 0xeb771c5c, ebp = 0xeb771c74 ---
pf_purge_expired_fragments(c9fe20a0,c9fdea6a,5b8,c9fdeca0,1999997c,...) at
pf_purge_expired_fragments+0x96/frame 0xeb771c74
pf_purge_thread(0,eb771ce8,c152c72d,3e6,0,...) at pf_purge_thread+0x15/frame
0xeb771cac
fork_exit(c9fb2240,0,eb771ce8) at fork_exit+0x7e/frame 0xeb771cd4
fork_trampoline() at fork_trampoline+0x8/frame 0xeb771cd4
--- trap 0, eip = 0, esp = 0xeb771d20, ebp = 0 ---


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xdeadc0de
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc9fd00c6
stack pointer           = 0x28:0xeb771c5c
frame pointer           = 0x28:0xeb771c74
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 5280 (pf purge)
[ thread pid 5280 tid 100108 ]
Stopped at      pf_purge_expired_fragments+0x96:        movl    0(%eax),%esi
db>
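
A note on the fault address: the faulting instruction is a load through %eax,
and the fault virtual address is 0xdeadc0de.  That value looks like a
free-memory poison pattern rather than a random bad pointer.  I am assuming
here that this kernel has INVARIANTS enabled and that freed allocator memory
gets filled with that word; if so, the purge loop probably followed a pointer
stored in an already-freed entry.  The userland sketch below only illustrates
that mechanism.  struct node, poison() and looks_poisoned() are made-up names
for illustration, not pf code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed poison word; it matches the fault virtual address in the trace. */
#define POISON_WORD 0xdeadc0deU

/* Hypothetical stand-in for a queue element such as a fragment entry. */
struct node {
	struct node	*next;
	uint32_t	 id;
};

/* Overwrite a "freed" object with the poison word, 32 bits at a time. */
static void
poison(void *mem, size_t len)
{
	uint32_t word = POISON_WORD;
	unsigned char *p = mem;
	size_t i;

	assert(len % sizeof(word) == 0);
	for (i = 0; i < len; i += sizeof(word))
		memcpy(p + i, &word, sizeof(word));
}

/* Return 1 if the low 32 bits of a pointer value equal the poison word. */
static int
looks_poisoned(const void *ptr)
{
	return (((uintptr_t)ptr & 0xffffffffU) == POISON_WORD);
}
```

On a 32-bit machine like this one, a next pointer read back from a poisoned
entry is exactly 0xdeadc0de, so dereferencing it faults at that address.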

-- 
You are receiving this mail because:
You are the assignee for the bug.


