Date: Wed, 23 Dec 2015 10:58:37 +0100
From: Fabian Keil
To: freebsd-fs@freebsd.org
Subject: Re: ZFS:dmu_objset_find_dp_impl() - panic: vm_fault: fault on nofault entry, addr: fffffe0094653000
Message-ID: <20151223105837.53b2c1ae@fabiankeil.de>
In-Reply-To: <20151222161200.19ab1832@fabiankeil.de>
References: <20151222161200.19ab1832@fabiankeil.de>

Fabian Keil wrote:

> Using a kernel based on r292334, I got this panic while importing
> a ZFS pool with vfs.zfs.spa_load_verify_data and
> vfs.zfs.spa_load_verify_metadata set to 0.
>
> I've not been able to reproduce it yet and the changed sysctl's above
> may not actually matter (but I usually use the defaults).

I unintentionally reproduced it yesterday with the same kernel
using the default values for the sysctls above.

> The pool has a single leaf vdev that is backed by ggatec which transfers the
> data over a slow and easily saturated connection (< ~120 kB/s up). Graph:
> https://www.fabiankeil.de/talks/versteckter-block-speicher/mgp00030.html
>
> fk@r500 /usr/crash $kgdb /usr/lib/debug/boot/kernel/kernel.debug vmcore.2
> [...]
> Unread portion of the kernel message buffer:
> [11912] panic: vm_fault: fault on nofault entry, addr: fffffe0094653000
> [11912] cpuid = 0
> [11912] KDB: stack backtrace:
> [...]
> #0 doadump (textdump=0) at pcpu.h:221
> 221 pcpu.h: No such file or directory.
> in pcpu.h
> (kgdb) where
> #0 doadump (textdump=0) at pcpu.h:221
> #1 0xffffffff8031752b in db_dump (dummy=, dummy2=false, dummy3=0, dummy4=0x0) at /usr/src/sys/ddb/db_command.c:533
> #2 0xffffffff8031731e in db_command (cmd_table=0x0) at /usr/src/sys/ddb/db_command.c:440
> #3 0xffffffff803170b4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:493
> #4 0xffffffff80319bbb in db_trap (type=, code=0) at /usr/src/sys/ddb/db_main.c:251
> #5 0xffffffff805e2dc3 in kdb_trap (type=3, code=0, tf=) at /usr/src/sys/kern/subr_kdb.c:654
> #6 0xffffffff8087f207 in trap (frame=0xfffffe0094f8f220) at /usr/src/sys/amd64/amd64/trap.c:549
> #7 0xffffffff808641b7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:234
> #8 0xffffffff805e24ab in kdb_enter (why=0xffffffff8097216b "panic", msg=0x32 ) at cpufunc.h:63
> #9 0xffffffff8059ea4f in vpanic (fmt=, ap=) at /usr/src/sys/kern/kern_shutdown.c:750
> #10 0xffffffff8059e8a3 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:688
> #11 0xffffffff80835650 in vm_fault_hold (map=, vaddr=, fault_type=, fault_flags=, m_hold=)
>     at /usr/src/sys/vm/vm_fault.c:332
> #12 0xffffffff808332f8 in vm_fault (map=0xfffff80002000000, vaddr=, fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_fault.c:277
> #13 0xffffffff8087f97a in trap_pfault (frame=0xfffffe0094f8f8d0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:734
> #14 0xffffffff8087f21e in trap (frame=0xfffffe0094f8f8d0) at /usr/src/sys/amd64/amd64/trap.c:435
> #15 0xffffffff808641b7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:234
> #16 0xffffffff81900c9a in dmu_objset_find_dp_impl (dcp=0xfffff80078cb0200) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:1630
> #17 0xffffffff81901189 in dmu_objset_find_dp_cb (arg=0xfffff80078cb0200) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:1746
[...]
> Given the location of the trap, this could be a regression caused
> by the import of illumos #5269 (zpool import slow) in r286686:
> https://svnweb.freebsd.org/base?view=revision&revision=r286686

On the other hand I've never seen the issue with previous kernels
and two times with the one based on r292334.

I've updated to a kernel based on r292616 to see if it makes
a difference (there were quite a few vm changes).

Fabian