Subject: Re: Crash in base/head in abd_put() after r320156
To: Trond Endrestøl, FreeBSD current
References: <3987075c-08cd-4add-11dc-24b1e4d071fc@freebsd.org>
From: Andriy Gapon
Date: Wed, 21 Jun 2017 11:18:01 +0300

On 21/06/2017 00:45, Trond Endrestøl wrote:
> On Tue, 20 Jun 2017 17:31-0400, Allan Jude wrote:
>
>> On 2017-06-20 17:27, Trond Endrestøl wrote:
>>> Has anyone else seen a crash in base/head in abd_put() after r320156?
>>>
>>> One of my experimental VMs at home crashed spectacularly after
>>> upgrading to r320156. I even wiped my /usr/obj, recompiled everything
>>> and got the same result. Everything's back to normal when I boot
>>> r320146.
>>>
>>> Here's the backtrace:
>>>
>>> Fatal trap 12: page fault while in kernel mode
>>> cpuid = 3; apic id = 03
>>>
>>> fault virtual address = 0x8
>>>
>>> Fatal trap 12: page fault while in kernel mode
>>>
>>> cpuid = 2;
>>> Fatal trap 12: page fault while in kernel mode
>>> apic id = 02
>>> fault virtual address = 0x8
>>> cpuid = 0; apic id = 00
>>> fault virtual address = 0x8
>>> fault code = supervisor read data, page not present
>>> fault code = supervisor read data, page not present
>>> instruction pointer = 0x20:0xffffffff803260fa
>>> stack pointer = 0x28:0xfffffe01b0231860
>>> frame pointer = 0x28:0xfffffe01b0231870
>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>>
>>> = DPL 0, pres 1, long 1, def32 0, gran 1
>>>
>>> Fatal trap 12: page fault while in kernel mode
>>> fault code = supervisor read data, page not present
>>> processor eflags = interrupt enabled, resume, IOPL = 0
>>> current process = 0 (zio_free_issue_5_2)
>>> trap number = 12
>>> instruction pointer = 0x20:0xffffffff803260fa
>>> stack pointer = 0x28:0xfffffe01b022c860
>>> frame pointer = 0x28:0xfffffe01b022c870
>>> panic: page fault
>>> cpuid = 0
>>> time = 4
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper() at 0xffffffff8044f93b = db_trace_self_wrapper+0x2b/frame 0xfffffe01b0231440
>>> vpanic() at 0xffffffff8067ec0c = vpanic+0x19c/frame 0xfffffe01b02314c0
>>> panic() at 0xffffffff8067ea63 = panic+0x43/frame 0xfffffe01b0231520
>>> trap_fatal() at 0xffffffff80983b32 = trap_fatal+0x322/frame 0xfffffe01b0231570
>>> trap_pfault() at 0xffffffff80983b89 = trap_pfault+0x49/frame 0xfffffe01b02315d0
>>> trap() at 0xffffffff809833c5 = trap+0x295/frame 0xfffffe01b0231790
>>> calltrap() at 0xffffffff80968c21 = calltrap+0x8/frame 0xfffffe01b0231790
>>> --- trap 0xc, rip = 0xffffffff803260fa, rsp = 0xfffffe01b0231860, rbp = 0xfffffe01b0231870 ---
>>> abd_put() at 0xffffffff803260fa = abd_put+0xa/frame 0xfffffe01b0231870
>>> vdev_raidz_map_free() at 0xffffffff803aa7c2 = vdev_raidz_map_free+0x82/frame 0xfffffe01b02318a0
>>> zio_vdev_io_assess() at 0xffffffff803ecc04 = zio_vdev_io_assess+0x74/frame 0xfffffe01b02318e0
>>> zio_execute() at 0xffffffff803e913c = zio_execute+0xac/frame 0xfffffe01b0231930
>>> zio_vdev_io_start() at 0xffffffff803ec894 = zio_vdev_io_start+0x2b4/frame 0xfffffe01b0231990
>>> zio_execute() at 0xffffffff803e913c = zio_execute+0xac/frame 0xfffffe01b02319e0
>>> zio_nowait() at 0xffffffff803e8a8b = zio_nowait+0xcb/frame 0xfffffe01b0231a20
>>> vdev_mirror_io_start() at 0xffffffff803a744c = vdev_mirror_io_start+0x35c/frame 0xfffffe01b0231a70
>>> zio_vdev_io_start() at 0xffffffff803ec86c = zio_vdev_io_start+0x28c/frame 0xfffffe01b0231ad0
>>> zio_execute() at 0xffffffff803e913c = zio_execute+0xac/frame 0xfffffe01b0231b20
>>> taskqueue_run_locked() at 0xffffffff806d3d27 = taskqueue_run_locked+0x127/frame 0xfffffe01b0231b80
>>> taskqueue_thread_loop() at 0xffffffff806d4ee8 = taskqueue_thread_loop+0xc8/frame 0xfffffe01b0231bb0
>>> fork_exit() at 0xffffffff80640df5 = fork_exit+0x85/frame 0xfffffe01b0231bf0
>>> fork_trampoline() at 0xffffffff8096915e = fork_trampoline+0xe/frame 0xfffffe01b0231bf0
>>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>>> Uptime: 4s
>>>
>>
>> This seems to be an unintended consequence of some code that was pulled
>> in from upstream today.
>>
>> Try adding: vfs.zfs.trim.enabled=0
>> to /boot/loader.conf
>>
>> (you can set it manually from the boot loader menu with the set command
>> to get the system to boot)
>
> That worked. Thanks.
>
> BTW, the call to abd_put() was given a NULL pointer.
>

Could you please re-enable ZFS TRIM support and test r320186 or later?
ZFS ABD is a rather large upstream change and our TRIM support is sprinkled over
a non-trivial amount of code as well.
Thank you.

-- 
Andriy Gapon