From: Glen Barber
To: Konstantin Belousov
Cc: freebsd-current@FreeBSD.org
Date: Mon, 10 Mar 2014 14:05:08 -0400
Subject: Re: panic: vm_fault: fault on nofault entry
Message-ID: <20140310180508.GI1746@glenbarber.us>
In-Reply-To: <20140310155115.GH1746@glenbarber.us>
X-Operating-System: FreeBSD 11.0-CURRENT amd64

On Mon, Mar 10, 2014 at 11:51:15AM -0400, Glen Barber wrote:
> On Mon, Mar 10, 2014 at 05:46:06PM +0200, Konstantin Belousov wrote:
> > On Sun, Mar 09, 2014 at 02:16:57PM -0400, Glen Barber wrote:
> > > panic: vm_fault: fault on nofault entry, addr: fffffe03becbc000
> >
> > I see, this panic is for access to the kernel map, not for the direct map.
> > I think that this is a race with other CPU unmapping some page in the
> > kernel map, which cannot be solved by access checks.
> >
> > Please try the following.  I booted with the patch and checked that
> > kgdb /boot/kernel/kernel /dev/mem works, but did not tried to reproduce
> > the issue.
> >
>
> Thank you for looking into this.  I will report back.
>

The machine this was tested on panicked again, but a bit differently.

This is the kgdb session from this vmcore:

Script started on Mon Mar 10 17:58:33 2014
command: /bin/sh
# kgdb ./kernel.debug /var/crash/vmcore.last
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Unread portion of the kernel message buffer:
Sleeping thread (tid 100702, pid 24712) owns a non-sleepable lock
KDB: stack backtrace of thread 100702:
sched_switch() at sched_switch+0x29e/frame 0xfffffe18390b8820
mi_switch() at mi_switch+0xe1/frame 0xfffffe18390b8860
sleepq_catch_signals() at sleepq_catch_signals+0xab/frame 0xfffffe18390b88e0
sleepq_wait_sig() at sleepq_wait_sig+0xf/frame 0xfffffe18390b8910
_sleep() at _sleep+0x2a3/frame 0xfffffe18390b8990
pipe_read() at pipe_read+0x34a/frame 0xfffffe18390b89f0
dofileread() at dofileread+0x95/frame 0xfffffe18390b8a40
kern_readv() at kern_readv+0x68/frame 0xfffffe18390b8a90
sys_read() at sys_read+0x63/frame 0xfffffe18390b8ae0
amd64_syscall() at amd64_syscall+0x3fb/frame 0xfffffe18390b8bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe18390b8bf0
--- syscall (3, FreeBSD ELF64, sys_read), rip = 0x800b8443a, rsp = 0x7fffffffac88, rbp = 0x7fffffffb500 ---
panic: sleeping thread
cpuid = 19
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe18392db010
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe18392db0c0
panic() at panic+0x155/frame 0xfffffe18392db140
propagate_priority() at propagate_priority+0x259/frame 0xfffffe18392db170
turnstile_wait() at turnstile_wait+0x3fe/frame 0xfffffe18392db1c0
__mtx_lock_sleep() at __mtx_lock_sleep+0x163/frame 0xfffffe18392db240
vm_map_lookup() at vm_map_lookup+0x38/frame 0xfffffe18392db2c0
vm_fault_hold() at vm_fault_hold+0xd1/frame 0xfffffe18392db510
vm_fault() at vm_fault+0x77/frame 0xfffffe18392db550
trap_pfault() at trap_pfault+0x199/frame 0xfffffe18392db5f0
trap() at trap+0x4a0/frame 0xfffffe18392db800
calltrap() at calltrap+0x8/frame 0xfffffe18392db800
--- trap 0xc, rip = 0xffffffff80d972cd, rsp = 0xfffffe18392db8c0, rbp = 0xfffffe18392db920 ---
copyin() at copyin+0x3d/frame 0xfffffe18392db920
pipe_write() at pipe_write+0x10ea/frame 0xfffffe18392db9f0
dofilewrite() at dofilewrite+0x87/frame 0xfffffe18392dba40
kern_writev() at kern_writev+0x68/frame 0xfffffe18392dba90
sys_write() at sys_write+0x63/frame 0xfffffe18392dbae0
amd64_syscall() at amd64_syscall+0x3fb/frame 0xfffffe18392dbbf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe18392dbbf0
--- syscall (4, FreeBSD ELF64, sys_write), rip = 0x800b35afc, rsp = 0x7fffffffd3b8, rbp = 0x41 ---
KDB: enter: panic

Reading symbols from /boot/kernel/zfs.ko.symbols...done.
Loaded symbols for /boot/kernel/zfs.ko.symbols
Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
Loaded symbols for /boot/kernel/opensolaris.ko.symbols
Reading symbols from /boot/kernel/ums.ko.symbols...done.
Loaded symbols for /boot/kernel/ums.ko.symbols
Reading symbols from /boot/kernel/tmpfs.ko.symbols...done.
Loaded symbols for /boot/kernel/tmpfs.ko.symbols
Reading symbols from /boot/kernel/nullfs.ko.symbols...done.
Loaded symbols for /boot/kernel/nullfs.ko.symbols
Reading symbols from /boot/kernel/linprocfs.ko.symbols...done.
Loaded symbols for /boot/kernel/linprocfs.ko.symbols
Reading symbols from /boot/kernel/linux.ko.symbols...done.
Loaded symbols for /boot/kernel/linux.ko.symbols
#0  doadump (textdump=-959294432) at pcpu.h:219
219         __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) bt
#0  doadump (textdump=-959294432) at pcpu.h:219
#1  0xffffffff8034a175 in db_fncall (dummy1=, dummy2=, dummy3=, dummy4=) at /usr/src/sys/ddb/db_command.c:578
#2  0xffffffff80349e5d in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:449
#3  0xffffffff80349bd4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:502
#4  0xffffffff8034c630 in db_trap (type=, code=0) at /usr/src/sys/ddb/db_main.c:231
#5  0xffffffff80987329 in kdb_trap (type=3, code=0, tf=) at /usr/src/sys/kern/subr_kdb.c:656
#6  0xffffffff80d99059 in trap (frame=0xfffffe18392daff0) at /usr/src/sys/amd64/amd64/trap.c:571
#7  0xffffffff80d7dd22 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231
#8  0xffffffff80986a8e in kdb_enter (why=0xffffffff8100edaf "panic", msg=) at cpufunc.h:63
#9  0xffffffff809462b5 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:752
#10 0xffffffff80999949 in propagate_priority (td=) at /usr/src/sys/kern/subr_turnstile.c:226
#11 0xffffffff8099a3ce in turnstile_wait (ts=, owner=, queue=) at /usr/src/sys/kern/subr_turnstile.c:742
#12 0xffffffff8092f923 in __mtx_lock_sleep (c=0xfffff800020000b8, tid=18446735278394692752, opts=, file=0x80, line=-16843009) at /usr/src/sys/kern/kern_mutex.c:508
#13 0xffffffff80c14138 in vm_map_lookup (var_map=0xfffffe18392db4a8, vaddr=18446741977052954624, fault_typea=2 '\002', out_entry=0xfffffe18392db4b0, object=0xfffffe18392db498, pindex=0xfffffe18392db4a0) at /usr/src/sys/vm/vm_map.c:3843
#14 0xffffffff80c07a71 in vm_fault_hold (map=0xfffff80002000000, vaddr=18446741977052954624, fault_type=, fault_flags=0, m_hold=0x0) at /usr/src/sys/vm/vm_fault.c:255
#15 0xffffffff80c07957 in vm_fault (map=0xfffff80002000000, vaddr=, fault_type=2 '\002', fault_flags=128) at /usr/src/sys/vm/vm_fault.c:217
#16 0xffffffff80d99849 in trap_pfault (frame=0xfffffe18392db810, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:767
#17 0xffffffff80d99070 in trap (frame=0xfffffe18392db810) at /usr/src/sys/amd64/amd64/trap.c:455
#18 0xffffffff80d7dd22 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:231
#19 0xffffffff80d972cd in copyin () at /usr/src/sys/amd64/amd64/support.S:292
#20 0xffffffff8099bb5f in uiomove_faultflag (cp=, n=, uio=0xfffffe18392dbab0, nofault=) at /usr/src/sys/kern/subr_uio.c:194
#21 0xffffffff809a53ba in pipe_write (fp=0xfffff80adc4e2640, uio=0xfffffe18392dbab0, active_cred=, flags=8, td=0x0) at /usr/src/sys/kern/sys_pipe.c:1215
#22 0xffffffff809a1297 in dofilewrite (td=0xfffff8002e61d490, fd=1, fp=0xfffff80adc4e2640, auio=0xfffffe18392dbab0, offset=, flags=0) at file.h:307
#23 0xffffffff809a0fc8 in kern_writev (td=0xfffff8002e61d490, fd=1, auio=0xfffffe18392dbab0) at /usr/src/sys/kern/sys_generic.c:467
#24 0xffffffff809a0f53 in sys_write (td=, uap=) at /usr/src/sys/kern/sys_generic.c:382
#25 0xffffffff80d9a0bb in amd64_syscall (td=0xfffff8002e61d490, traced=0) at subr_syscall.c:133
#26 0xffffffff80d7e00b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:390
#27 0x0000000800b35afc in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language:  auto; currently minimal
(kgdb) frame 10
#10 0xffffffff80999949 in propagate_priority (td=) at /usr/src/sys/kern/subr_turnstile.c:226
226                     panic("sleeping thread");
(kgdb) l
221             if (TD_IS_SLEEPING(td)) {
222                     printf(
223             "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
224                         td->td_tid, td->td_proc->p_pid);
225                     kdb_backtrace_thread(td);
226                     panic("sleeping thread");
227             }
228
229             /*
230              * If this thread already has higher priority than the
(kgdb) tid 100702
[Switching to thread 624 (Thread 100702)]#0  sched_switch (td=0xfffff8001797a920, newtd=, flags=) at /usr/src/sys/kern/sched_ule.c:1933
1933            cpuid = PCPU_GET(cpuid);
(kgdb) p cpuid
No symbol "cpuid" in current context.
(kgdb) quit
# exit

Script done on Mon Mar 10 17:59:07 2014

Glen