From owner-freebsd-stable@freebsd.org Tue Apr 17 11:00:10 2018 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2AFFCF99A52 for ; Tue, 17 Apr 2018 11:00:10 +0000 (UTC) (envelope-from eugen@grosbein.net) Received: from hz.grosbein.net (hz.grosbein.net [78.47.246.247]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "hz.grosbein.net", Issuer "hz.grosbein.net" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 751076A015; Tue, 17 Apr 2018 11:00:09 +0000 (UTC) (envelope-from eugen@grosbein.net) Received: from eg.sd.rdtc.ru (root@eg.sd.rdtc.ru [62.231.161.221] (may be forged)) by hz.grosbein.net (8.15.2/8.15.2) with ESMTPS id w3HB01Wo056928 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 17 Apr 2018 13:00:01 +0200 (CEST) (envelope-from eugen@grosbein.net) X-Envelope-From: eugen@grosbein.net X-Envelope-To: freebsd-stable@FreeBSD.org Received: from eg.sd.rdtc.ru (eugen@localhost [127.0.0.1]) by eg.sd.rdtc.ru (8.15.2/8.15.2) with ESMTP id w3HAxqqJ000153; Tue, 17 Apr 2018 17:59:52 +0700 (+07) (envelope-from eugen@grosbein.net) Subject: Re: stable/11 r332356 started panicing in bpf_dtor/__mtx_lock_sleep To: FreeBSD Stable References: <5AD5BEA5.6080306@grosbein.net> Cc: Matt Joras From: Eugene Grosbein X-Enigmail-Draft-Status: N1110 Message-ID: <5AD5D3A8.4060204@grosbein.net> Date: Tue, 17 Apr 2018 17:59:52 +0700 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0 MIME-Version: 1.0 In-Reply-To: <5AD5BEA5.6080306@grosbein.net> Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=4.7 required=5.0 tests=BAYES_00, DATE_IN_FUTURE_96_Q, LOCAL_FROM, RDNS_NONE autolearn=no autolearn_force=no version=3.4.1 X-Spam-Report: * 2.5 DATE_IN_FUTURE_96_Q Date: is 4 days to 4 months after Received: date * -2.3 BAYES_00 BODY: Bayes spam probability is 0 to 1% * [score: 0.0000] * 2.6 LOCAL_FROM From my domains * 1.9 RDNS_NONE Delivered to internal network by a host with no rDNS X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on hz.grosbein.net X-Spam-Level: **** X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 Apr 2018 11:00:10 -0000 On 17.04.2018 16:30, Eugene Grosbein wrote: CCing mjoras@ as author of suspicious change https://svnweb.freebsd.org/base?view=revision&revision=323477 > I have a server that was running stable/11 rock-stable for many months. It was running stable/11 r314043 before last update. > A week ago I've updates it to 11.1-STABLE r332356 and today it paniced and I have crashdump. > Any thoughts? > > Unread portion of the kernel message buffer: > Sleeping thread (tid 100444, pid 28400) owns a non-sleepable lock > KDB: stack backtrace of thread 100444: > sched_switch() at sched_switch+0x626/frame 0xfffffe03550f9740 > mi_switch() at mi_switch+0xc5/frame 0xfffffe03550f9770 > sleepq_wait() at sleepq_wait+0x2c/frame 0xfffffe03550f97a0 > _sx_xlock_hard() at _sx_xlock_hard+0x2f0/frame 0xfffffe03550f9830 > vlan_ioctl() at vlan_ioctl+0x53f/frame 0xfffffe03550f98b0 > ifpromisc() at ifpromisc+0x106/frame 0xfffffe03550f9910 > bpf_detachd_locked() at bpf_detachd_locked+0x1b4/frame 0xfffffe03550f9960 > bpf_dtor() at bpf_dtor+0x9a/frame 0xfffffe03550f9990 > devfs_fpdrop() at devfs_fpdrop+0x9c/frame 0xfffffe03550f99b0 > devfs_close_f() at devfs_close_f+0x45/frame 0xfffffe03550f99e0 > closef() at closef+0x209/frame 0xfffffe03550f9a70 > closefp() at closefp+0x8b/frame 0xfffffe03550f9ab0 > amd64_syscall() at amd64_syscall+0x25c/frame 0xfffffe03550f9bf0 > fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe03550f9bf0 > --- syscall (6, FreeBSD ELF64, sys_close), rip = 0x4a317a, rsp = 0x7fffffffe3d8, rbp = 0x7fffffffe3f0 --- > panic: sleeping thread > cpuid = 1 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0354c6d7c0 > vpanic() at vpanic+0x177/frame 0xfffffe0354c6d820 > panic() at panic+0x43/frame 0xfffffe0354c6d880 > propagate_priority() at propagate_priority+0x183/frame 0xfffffe0354c6d8b0 > turnstile_wait() at turnstile_wait+0x2bc/frame 0xfffffe0354c6d900 > __mtx_lock_sleep() at __mtx_lock_sleep+0x151/frame 0xfffffe0354c6d960 > bpf_dtor() at bpf_dtor+0x191/frame 0xfffffe0354c6d990 > devfs_fpdrop() at devfs_fpdrop+0x9c/frame 0xfffffe0354c6d9b0 > devfs_close_f() at devfs_close_f+0x45/frame 0xfffffe0354c6d9e0 > closef() at closef+0x209/frame 0xfffffe0354c6da70 > closefp() at closefp+0x8b/frame 0xfffffe0354c6dab0 > amd64_syscall() at amd64_syscall+0x25c/frame 0xfffffe0354c6dbf0 > fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0354c6dbf0 > --- syscall (6, FreeBSD ELF64, sys_close), rip = 0x4a317a, rsp = 0x7fffffffe3d8, rbp = 0x7fffffffe3f0 --- > Uptime: 6d2h0m17s > Dumping 1414 out of 12265 MB:..2%..11%..21%..31%..41%..51%..62%..71%..81%..91% > > > (kgdb) bt > #0 doadump (textdump=1) at pcpu.h:229 > #1 0xffffffff804f3b9a in kern_reboot (howto=260) at /usr/local/src/sys/kern/kern_shutdown.c:383 > #2 0xffffffff804f3f81 in vpanic (fmt=, ap=) > at /usr/local/src/sys/kern/kern_shutdown.c:776 > #3 0xffffffff804f3dc3 in panic (fmt=) at /usr/local/src/sys/kern/kern_shutdown.c:707 > #4 0xffffffff80546b33 in propagate_priority (td=) at /usr/local/src/sys/kern/subr_turnstile.c:226 > #5 0xffffffff8054720c in turnstile_wait (ts=0xfffff80151ac4180, owner=, queue=) > at /usr/local/src/sys/kern/subr_turnstile.c:742 > #6 0xffffffff804dda81 in __mtx_lock_sleep (c=0xffffffff80c9c910, v=) > at /usr/local/src/sys/kern/kern_mutex.c:627 > #7 0xffffffff805cd061 in bpf_dtor (data=0xfffff801512df800) at /usr/local/src/sys/net/bpf.c:778 > #8 0xffffffff80455e5c in devfs_fpdrop (fp=) at /usr/local/src/sys/fs/devfs/devfs_vnops.c:193 > #9 0xffffffff804595d5 in devfs_close_f (fp=0xfffff802af2fe6e0, td=) > at /usr/local/src/sys/fs/devfs/devfs_vnops.c:675 > #10 0xffffffff804b1549 in closef (fp=0xfffff802af2fe6e0, td=0xfffff8000cf75000) at file.h:346 > #11 0xffffffff804af00b in closefp (fdp=0xfffff80083b48000, fd=, fp=0xfffff802af2fe6e0, > td=0xfffff8000cf75000, holdleaders=0) at /usr/local/src/sys/kern/kern_descrip.c:1191 > #12 0xffffffff807afb6c in amd64_syscall (td=0xfffff8000cf75000, traced=0) at subr_syscall.c:132 > #13 0xffffffff80791c3d in fast_syscall_common () at /usr/local/src/sys/amd64/amd64/exception.S:480 > #14 0x00000000004a317a in ?? () > Previous frame inner to this frame (corrupt stack?) Some more details: this monitoring server has many vlans over em1. It runs some custom perl code utilizing Net::Pcap over bpf. Many times per minute it opens/closes bpf to send/receive custom PPPoE frames using vlans (sends over multiple if_vlan and receives using their parent em1). This server also multiple times per minute starts and stops receiving IPTV multicast traffic using another em0 interface as part of another monitoring job. Again, r314043 was pretty stable.