From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 250297] OpenZFS crash -- zvol_geom_bio_getattr called when volmode=dev
Date: Mon, 12 Oct 2020 15:22:56 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=250297

            Bug ID: 250297
           Summary: OpenZFS crash -- zvol_geom_bio_getattr called when
                    volmode=dev
           Product: Base System
           Version: CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: vangyzen@FreeBSD.org

There seems to be a race when creating a zvol with volmode=dev.  We create
the zvol with the default mode, then destroy it and re-create it with dev
mode.  It seems that if I/O occurs in that window, it takes the geom code
path and dereferences a NULL pointer.

This seems to have been introduced by the OpenZFS merge.  I've used this
workflow often for over a year, mostly on head, and the crash appeared only
recently.  I first saw it on r366500+84ccaf49083c-c272054.

I hit it reliably on my workstation with the following command.  I also hit
it on a VM, though it takes more tries to hit the window.

# zfs create -s -V 20G -o primarycache=none -o volmode=dev head_root/testvol
zvol_create_minor_impl:1250[1]: Creating ZVOL head_root/testvol...
zvol_create_minor_impl:1371[1]: ZVOL head_root/testvol created.


Fatal trap 12: page fault while in kernel mode
cpuid = 7; apic id = 07
fault virtual address   = 0x110
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff82167fca
stack pointer           = 0x28:0xfffffe000edcdb30
frame pointer           = 0x28:0xfffffe000edcdb70
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 13 (g_down)
trap number             = 12

db> acttrace

Tracing command zfskern pid 21 tid 100478 td 0xfffffe00610c9800 (CPU 6)
cpustop_handler() at cpustop_handler+0x28/frame 0xfffffe0011880e00
ipi_nmi_handler() at ipi_nmi_handler+0x39/frame 0xfffffe0011880e10
trap() at trap+0x56/frame 0xfffffe0011880f20
nmi_calltrap() at nmi_calltrap+0x8/frame 0xfffffe0011880f20
--- trap 0x13, rip = 0xffffffff80c25fb2, rsp = 0xfffffe006168c820, rbp = 0xfffffe006168c830 ---
lock_delay() at lock_delay+0x42/frame 0xfffffe006168c830
_mtx_lock_spin_cookie() at _mtx_lock_spin_cookie+0xc1/frame 0xfffffe006168c8a0
__mtx_lock_spin_flags() at __mtx_lock_spin_flags+0xd5/frame 0xfffffe006168c8e0
cnputs() at cnputs+0x58/frame 0xfffffe006168c910
vprintf() at vprintf+0xcd/frame 0xfffffe006168c9e0
printf() at printf+0x43/frame 0xfffffe006168ca40
zvol_free() at zvol_free+0x53/frame 0xfffffe006168ca80
zvol_task_cb() at zvol_task_cb+0x271/frame 0xfffffe006168cae0
taskq_run() at taskq_run+0x1f/frame 0xfffffe006168cb00
taskqueue_run_locked() at taskqueue_run_locked+0xaa/frame 0xfffffe006168cb80
taskqueue_thread_loop() at taskqueue_thread_loop+0x94/frame 0xfffffe006168cbb0
fork_exit() at fork_exit+0x80/frame 0xfffffe006168cbf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe006168cbf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

Tracing command geom pid 13 tid 100049 td 0xfffffe0011862700 (CPU 7)
kdb_enter() at kdb_enter+0x37/frame 0xfffffe000edcd7e0
vpanic() at vpanic+0x19e/frame 0xfffffe000edcd830
panic() at panic+0x43/frame 0xfffffe000edcd890
trap_fatal() at trap_fatal+0x387/frame 0xfffffe000edcd8f0
trap_pfault() at trap_pfault+0x97/frame 0xfffffe000edcd950
trap() at trap+0x2ab/frame 0xfffffe000edcda60
calltrap() at calltrap+0x8/frame 0xfffffe000edcda60
--- trap 0xc, rip = 0xffffffff82167fca, rsp = 0xfffffe000edcdb30, rbp = 0xfffffe000edcdb70 ---
zvol_geom_bio_start() at zvol_geom_bio_start+0x2a/frame 0xfffffe000edcdb70
g_io_schedule_down() at g_io_schedule_down+0x134/frame 0xfffffe000edcdba0
g_down_procbody() at g_down_procbody+0x5c/frame 0xfffffe000edcdbb0
fork_exit() at fork_exit+0x80/frame 0xfffffe000edcdbf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe000edcdbf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

(The other CPUs were idle.)
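
On a machine where the window is harder to hit, such as the VM mentioned in the report, the create command can be retried in a loop. What follows is only a sketch and is not part of the original report: the function name `zvol_race_repro`, the `ZFS_CMD` override, the default of 50 attempts, and the `zfs destroy` cleanup between attempts are all assumptions made for illustration. On an affected kernel the loop is expected to end with the panic shown in the report, so it should only be run on a disposable system.

```sh
#!/bin/sh
# Hypothetical helper (not from the original report): repeat the zvol
# creation from the report until the kernel panics or the attempts run out.
# ZFS_CMD is an assumed override so the loop can be dry-run with a stub.
zvol_race_repro() {
    dataset="${1:-head_root/testvol}"   # dataset name used in the report
    attempts="${2:-50}"                 # arbitrary default (assumption)
    zfs_cmd="${ZFS_CMD:-zfs}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        echo "attempt $i"
        # Same options as the command in the report.
        "$zfs_cmd" create -s -V 20G -o primarycache=none \
            -o volmode=dev "$dataset" || return 1
        # Assumed cleanup so the next iteration can create the zvol again.
        "$zfs_cmd" destroy "$dataset" || return 1
        i=$((i + 1))
    done
    echo "no panic after $attempts attempts"
}
```

Calling `zvol_race_repro head_root/testvol 100` as root would make up to 100 attempts. Setting `ZFS_CMD=true` first exercises the loop without touching any pool.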