Date: Wed, 21 Jun 2017 09:59:14 +0200
From: "O. Hartmann" <ohartmann@walstatt.org>
To: "Kenneth D. Merry" <ken@FreeBSD.ORG>
Cc: Andriy Gapon <avg@FreeBSD.org>, svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org
Subject: Re: svn commit: r320156 - in head: cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/cmd/ztest cddl/contrib/opensolaris/lib/libzfs/common sys/cddl/contrib/opensolaris/common/zfs sys/cddl/contri...
Message-ID: <20170621095903.4fefe0b5@thor.intern.walstatt.dynvpn.de>
In-Reply-To: <20170620212553.GA30559@mithlond.kdm.org>
References: <201706201739.v5KHdPhO051256@repo.freebsd.org> <81F84BCA-E973-4D78-B81C-1D398ADFA47E@freebsd.org> <fc648de9-576d-b5c4-0436-e9597decadf2@FreeBSD.org> <20170620212553.GA30559@mithlond.kdm.org>
On Tue, 20 Jun 2017 17:25:53 -0400, "Kenneth D. Merry" <ken@FreeBSD.ORG> wrote:

> On Tue, Jun 20, 2017 at 23:37:10 +0300, Andriy Gapon wrote:
> > On 20/06/2017 23:29, Ken Merry wrote:
> > > I don't know for sure that this commit is the cause, but it (and r320153) are the
> > > only ZFS commits between a version of head from June 14th that boots off a ZFS
> > > mirror, and one that panics.

r320153 runs well and is stable here, but with r320156 the kernels on all my
ZFS machines panic immediately (they have ZFS built into the kernel, not loaded
as a module). For now, I have gone back to r320153. I'm sorry that I have no
debugging information; the boxes are built without debugging options at the
moment.

> > >
> > > Here's the stack trace:
> > >
> > > Fatal trap 12: page fault while in kernel mode
> > > cpuid = 22;
> > >
> > > Fatal trap 12: page fault while in kernel mode
> > > cpuid = 9; apic id = 09
> > > fault virtual address = 0x0
> > > fault code = supervisor read data, page not present
> > > instruction pointer = 0x20:0xffffffff81e47f21
> > > stack pointer = 0x28:0xfffffe08b37f8810
> > > frame pointer = 0x28:0xfffffe08b37f8860
> > > code segment = base 0x0, limit 0xfffff, type 0x1b
> > > = DPL 0, pres 1, long 1, def32 0, gran 1
> > > processor eflags = interrupt enabled, resume, IOPL = 0
> > > current process = 0 (zio_free_issue_0_3)
> > > [ thread pid 0 tid 100478 ]
> > > Stopped at 0xffffffff81e47f21 = zio_vdev_io_start+0x1f1: testb $0x1,(%rax)
> > > db> bt
> > > Tracing pid 0 tid 100478 td 0xfffff80193156000
> > > zio_vdev_io_start() at 0xffffffff81e47f21 = zio_vdev_io_start+0x1f1/frame 0xfffffe08b37f8860
> > > zio_execute() at 0xffffffff81e4312c = zio_execute+0x36c/frame 0xfffffe08b37f88b0
> > > zio_nowait() at 0xffffffff81e422b8 = zio_nowait+0xb8/frame 0xfffffe08b37f88e0
> > > vdev_mirror_io_start() at 0xffffffff81e224fc = vdev_mirror_io_start+0x38c/frame 0xfffffe08b37f8930
> > > zio_vdev_io_start() at 0xffffffff81e48030 = zio_vdev_io_start+0x300/frame 0xfffffe08b37f8990
> > > zio_execute() at 0xffffffff81e4312c = zio_execute+0x36c/frame 0xfffffe08b37f89e0
> > > taskqueue_run_locked() at 0xffffffff809a9d6d = taskqueue_run_locked+0x13d/frame 0xfffffe08b37f8a40
> > > taskqueue_thread_loop() at 0xffffffff809aab28 = taskqueue_thread_loop+0x88/frame 0xfffffe08b37f8a70
> > > fork_exit() at 0xffffffff8091e3e4 = fork_exit+0x84/frame 0xfffffe08b37f8ab0
> > > fork_trampoline() at 0xffffffff80d930fe = fork_trampoline+0xe/frame 0xfffffe08b37f8ab0
> > > --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> > > db>
> > >
> > > (kgdb) list *(zio_vdev_io_start+0x1f1)
> > > 0xd9f21 is in zio_vdev_io_start
> > > (/usr/home/kenm/perforce4/kenm/FreeBSD-test/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:350).
> > > 345
> > > 346     /*
> > > 347      * Ensure that anyone expecting this zio to contain a linear ABD isn't
> > > 348      * going to get a nasty surprise when they try to access the data.
> > > 349      */
> > > 350     IMPLY(abd_is_linear(zio->io_abd), abd_is_linear(data));
> > > 351
> > > 352     zt->zt_orig_abd = zio->io_abd;
> > > 353     zt->zt_orig_size = zio->io_size;
> > > 354     zt->zt_bufsize = bufsize;
> > >
> > > I'll try rebooting and see if the problem goes away. If not, I'll roll back
> > > the ABD change and see if the problem goes away.
> >
> > Judging from the thread that panic-ed the problem may have to do with our TRIM
> > support. Unfortunately, I didn't have a chance to test the change on a system
> > with working TRIM and, so, I missed it.
> > I will look into this further, but it's almost obvious that the problem is
> > caused by zio->io_abd being NULL for a zio of type ZIO_TYPE_FREE.
>
> FWIW, avg sent me a patch for this particular problem (by checking for NULL
> before dereferencing the pointer), and although it got me past the above
> problem, I hit another related panic:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 6;
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 14; apic id = 22
> fault virtual address = 0x4
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff81d92a2d
> stack pointer = 0x0:0xfffffe08b36e0710
> frame pointer = 0x0:0xfffffe08b36e0730
> code segment = base 0x0, limit 0xfffff, type 0x1b
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 11; apic id = 0b
> fault virtual address = 0x4
> Fatal trap 12: page fault while in kernel mode
> cpuid = 8; apic id = 08
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 0 (zio_free_issue_4_1)
> [ thread pid 0 tid 100799 ]
> Stopped at 0xffffffff81d92a2d = abd_verify+0xd: movl 0x4(%r14),%eax
> db> bt
> Tracing pid 0 tid 100799 td 0xfffff801931b8560
> abd_verify() at 0xffffffff81d92a2d = abd_verify+0xd/frame 0xfffffe08b36e0730
> abd_put() at 0xffffffff81d92eff = abd_put+0xf/frame 0xfffffe08b36e0750
> vdev_raidz_map_free() at 0xffffffff81e26312 = vdev_raidz_map_free+0x82/frame 0xfffffe08b36e0780
> zio_vdev_io_assess() at 0xffffffff81e48646 = zio_vdev_io_assess+0x116/frame 0xfffffe08b36e07b0
> zio_execute() at 0xffffffff81e4312c = zio_execute+0x36c/frame 0xfffffe08b36e0800
> zio_vdev_io_start() at 0xffffffff81e48184 = zio_vdev_io_start+0x454/frame 0xfffffe08b36e0860
> zio_execute() at 0xffffffff81e4312c = zio_execute+0x36c/frame 0xfffffe08b36e08b0
> zio_nowait() at 0xffffffff81e422b8 = zio_nowait+0xb8/frame 0xfffffe08b36e08e0
> vdev_mirror_io_start() at 0xffffffff81e224fc = vdev_mirror_io_start+0x38c/frame 0xfffffe08b36e0930
> zio_vdev_io_start() at 0xffffffff81e48030 = zio_vdev_io_start+0x300/frame 0xfffffe08b36e0990
> zio_execute() at 0xffffffff81e4312c = zio_execute+0x36c/frame 0xfffffe08b36e09e0
> taskqueue_run_locked() at 0xffffffff809a9d6d = taskqueue_run_locked+0x13d/frame 0xfffffe08b36e0a40
> taskqueue_thread_loop() at 0xffffffff809aab28 = taskqueue_thread_loop+0x88/frame 0xfffffe08b36e0a70
> fork_exit() at 0xffffffff8091e3e4 = fork_exit+0x84/frame 0xfffffe08b36e0ab0
> fork_trampoline() at 0xffffffff80d930fe = fork_trampoline+0xe/frame 0xfffffe08b36e0ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> db>
>
> (kgdb) list *(abd_verify+0xd)
>
> 0x24a2d is in abd_verify
> (/usr/home/kenm/perforce4/kenm/FreeBSD-test/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c:231).
> 226     }
> 227
> 228     static inline void
> 229     abd_verify(abd_t *abd)
> 230     {
> 231         ASSERT3U(abd->abd_size, >, 0);
> 232         ASSERT3U(abd->abd_size, <=, SPA_MAXBLOCKSIZE);
> 233         ASSERT3U(abd->abd_flags, ==, abd->abd_flags & (ABD_FLAG_LINEAR |
> 234             ABD_FLAG_OWNER | ABD_FLAG_META));
> 235         IMPLY(abd->abd_parent != NULL, !(abd->abd_flags & ABD_FLAG_OWNER));
> (kgdb) list *(abd_put+0xf)
> 0x24eff is in abd_put
> (/usr/home/kenm/perforce4/kenm/FreeBSD-test/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c:514).
> 509      */
> 510     void
> 511     abd_put(abd_t *abd)
> 512     {
> 513         abd_verify(abd);
> 514         ASSERT(!(abd->abd_flags & ABD_FLAG_OWNER));
> 515
> 516         if (abd->abd_parent != NULL) {
> 517             (void) refcount_remove_many(&abd->abd_parent->abd_children,
> 518                 abd->abd_size, abd);
> (kgdb) list *(vdev_raidz_map_free+0x82)
> 0xb8312 is in vdev_raidz_map_free
> (/usr/home/kenm/perforce4/kenm/FreeBSD-test/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c:281).
> 276                 zio_buf_free(rm->rm_col[c].rc_gdata,
> 277                     rm->rm_col[c].rc_size);
> 278         }
> 279
> 280         size = 0;
> 281         for (c = rm->rm_firstdatacol; c < rm->rm_cols; c++) {
> 282             abd_put(rm->rm_col[c].rc_abd);
> 283             size += rm->rm_col[c].rc_size;
> 284         }
> 285
> (kgdb) list *(zio_vdev_io_assess+0x116)
> 0xda646 is in zio_vdev_io_assess
> (/usr/home/kenm/perforce4/kenm/FreeBSD-test/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3315).
> 3310        if (vd == NULL && !(zio->io_flags & ZIO_FLAG_CONFIG_WRITER))
> 3311            spa_config_exit(zio->io_spa, SCL_ZIO, zio);
> 3312
> 3313        if (zio->io_vsd != NULL) {
> 3314            zio->io_vsd_ops->vsd_free(zio);
> 3315            zio->io_vsd = NULL;
> 3316        }
> 3317
> 3318        if (zio_injection_enabled && zio->io_error == 0)
> 3319            zio->io_error = zio_handle_fault_injection(zio, EIO);
> (kgdb)
>
> So, I disabled trim by setting vfs.zfs.trim.enabled=0 in the loader, and I
> can boot now.
>
> Ken

-- 
O. Hartmann

I object to the use or transfer of my data for advertising purposes or for
market or opinion research (§ 28 Abs. 4 BDSG).
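
For illustration only, here is a minimal userland sketch of the kind of guard
the quoted thread describes ("checking for NULL before dereferencing the
pointer"). It is not avg's actual patch and not the kernel code: every model_*
name below is a hypothetical stand-in, and the only assumption taken from the
thread is that a zio of type ZIO_TYPE_FREE can carry a NULL io_abd. The second
guard mirrors the later panic, where the fault address 0x4 in abd_verify() is
consistent with abd_put() having been handed a NULL pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
enum model_zio_type { MODEL_ZIO_TYPE_WRITE, MODEL_ZIO_TYPE_FREE };

#define MODEL_ABD_FLAG_LINEAR 0x1u

struct model_abd {
	unsigned int flags;
	unsigned int size;
};

struct model_zio {
	enum model_zio_type type;
	struct model_abd *abd;	/* NULL for a free (TRIM) zio in this model */
};

/* Unguarded accessor: dereferences abd unconditionally. */
static bool
model_abd_is_linear(const struct model_abd *abd)
{
	return ((abd->flags & MODEL_ABD_FLAG_LINEAR) != 0);
}

/*
 * Guarded form of "zio buffer is linear implies data is linear".
 * A zio without a buffer satisfies the implication vacuously, so the
 * pointer is tested before anything is dereferenced.
 */
static bool
model_linear_implication_holds(const struct model_zio *zio,
    const struct model_abd *data)
{
	if (zio->abd == NULL)
		return (true);
	if (!model_abd_is_linear(zio->abd))
		return (true);
	return (data != NULL && model_abd_is_linear(data));
}

/*
 * Guarded release: returns the number of bytes accounted for, or 0 when
 * there was no buffer to release.
 */
static unsigned int
model_abd_put(const struct model_abd *abd)
{
	if (abd == NULL)
		return (0);
	return (abd->size);
}
```

In this model a free zio with no buffer passes the check and releases zero
bytes instead of faulting. Whether the real fix belongs in the callers or in
the ABD routines themselves is a design decision the thread leaves open.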
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20170621095903.4fefe0b5>