From owner-freebsd-current@freebsd.org  Wed Jun 21 08:18:40 2017
Subject: Re: Crash in base/head in abd_put() after r320156
To: =?UTF-8?Q?Trond_Endrest=c3=b8l?= <Trond.Endrestol@fagskolen.gjovik.no>,
 FreeBSD current <freebsd-current@FreeBSD.org>
References: <alpine.BSF.2.21.1706202259370.37790@mail.fig.ol.no>
 <3987075c-08cd-4add-11dc-24b1e4d071fc@freebsd.org>
 <alpine.BSF.2.21.1706202344010.37790@mail.fig.ol.no>
From: Andriy Gapon <avg@FreeBSD.org>
Message-ID: <e046e1c2-d39a-9d59-5f2d-277fc7b19ee6@FreeBSD.org>
Date: Wed, 21 Jun 2017 11:18:01 +0300
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:52.0) Gecko/20100101
 Thunderbird/52.1.1
MIME-Version: 1.0
In-Reply-To: <alpine.BSF.2.21.1706202344010.37790@mail.fig.ol.no>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 8bit

On 21/06/2017 00:45, Trond Endrestøl wrote:
> On Tue, 20 Jun 2017 17:31-0400, Allan Jude wrote:
> 
>> On 2017-06-20 17:27, Trond Endrestøl wrote:
>>> Has anyone else seen a crash in base/head in abd_put() after r320156?
>>>
>>> One of my experimental VMs at home crashed spectacularly after 
>>> upgrading to r320156. I even wiped my /usr/obj, recompiled everything 
>>> and got the same result. Everything's back to normal when I boot 
>>> r320146.
>>>
>>> Here's the backtrace:
>>>
>>> Fatal trap 12: page fault while in kernel mode
>>> cpuid = 3; apic id = 03
>>>
>>> fault virtual address	= 0x8
>>>
>>> Fatal trap 12: page fault while in kernel mode
>>>
>>> cpuid = 2; 
>>> Fatal trap 12: page fault while in kernel mode
>>> apic id = 02
>>> fault virtual address	= 0x8
>>> cpuid = 0; apic id = 00
>>> fault virtual address	= 0x8
>>> fault code		= supervisor read data, page not present
>>> fault code		= supervisor read data, page not present
>>> instruction pointer	= 0x20:0xffffffff803260fa
>>> stack pointer	        = 0x28:0xfffffe01b0231860
>>> frame pointer	        = 0x28:0xfffffe01b0231870
>>> code segment		= base 0x0, limit 0xfffff, type 0x1b
>>>
>>> 			= DPL 0, pres 1, long 1, def32 0, gran 1
>>>
>>> Fatal trap 12: page fault while in kernel mode
>>> fault code		= supervisor read data, page not present
>>> processor eflags	= interrupt enabled, resume, IOPL = 0
>>> current process		= 0 (zio_free_issue_5_2)
>>> trap number		= 12
>>> instruction pointer	= 0x20:0xffffffff803260fa
>>> stack pointer	        = 0x28:0xfffffe01b022c860
>>> frame pointer	        = 0x28:0xfffffe01b022c870
>>> panic: page fault
>>> cpuid = 0
>>> time = 4
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper() at 0xffffffff8044f93b = db_trace_self_wrapper+0x2b/frame 0xfffffe01b0231440
>>> vpanic() at 0xffffffff8067ec0c = vpanic+0x19c/frame 0xfffffe01b02314c0
>>> panic() at 0xffffffff8067ea63 = panic+0x43/frame 0xfffffe01b0231520
>>> trap_fatal() at 0xffffffff80983b32 = trap_fatal+0x322/frame 0xfffffe01b0231570
>>> trap_pfault() at 0xffffffff80983b89 = trap_pfault+0x49/frame 0xfffffe01b02315d0
>>> trap() at 0xffffffff809833c5 = trap+0x295/frame 0xfffffe01b0231790
>>> calltrap() at 0xffffffff80968c21 = calltrap+0x8/frame 0xfffffe01b0231790
>>> --- trap 0xc, rip = 0xffffffff803260fa, rsp = 0xfffffe01b0231860, rbp = 0xfffffe01b0231870 ---
>>> abd_put() at 0xffffffff803260fa = abd_put+0xa/frame 0xfffffe01b0231870
>>> vdev_raidz_map_free() at 0xffffffff803aa7c2 = vdev_raidz_map_free+0x82/frame 0xfffffe01b02318a0
>>> zio_vdev_io_assess() at 0xffffffff803ecc04 = zio_vdev_io_assess+0x74/frame 0xfffffe01b02318e0
>>> zio_execute() at 0xffffffff803e913c = zio_execute+0xac/frame 0xfffffe01b0231930
>>> zio_vdev_io_start() at 0xffffffff803ec894 = zio_vdev_io_start+0x2b4/frame 0xfffffe01b0231990
>>> zio_execute() at 0xffffffff803e913c = zio_execute+0xac/frame 0xfffffe01b02319e0
>>> zio_nowait() at 0xffffffff803e8a8b = zio_nowait+0xcb/frame 0xfffffe01b0231a20
>>> vdev_mirror_io_start() at 0xffffffff803a744c = vdev_mirror_io_start+0x35c/frame 0xfffffe01b0231a70
>>> zio_vdev_io_start() at 0xffffffff803ec86c = zio_vdev_io_start+0x28c/frame 0xfffffe01b0231ad0
>>> zio_execute() at 0xffffffff803e913c = zio_execute+0xac/frame 0xfffffe01b0231b20
>>> taskqueue_run_locked() at 0xffffffff806d3d27 = taskqueue_run_locked+0x127/frame 0xfffffe01b0231b80
>>> taskqueue_thread_loop() at 0xffffffff806d4ee8 = taskqueue_thread_loop+0xc8/frame 0xfffffe01b0231bb0
>>> fork_exit() at 0xffffffff80640df5 = fork_exit+0x85/frame 0xfffffe01b0231bf0
>>> fork_trampoline() at 0xffffffff8096915e = fork_trampoline+0xe/frame 0xfffffe01b0231bf0
>>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>>> Uptime: 4s
>>>
>>
>> This seems to be an unintended consequence of some code that was pulled
>> in from upstream today.
>>
>> Try adding: vfs.zfs.trim.enabled=0
>> to /boot/loader.conf
>>
>> (you can set it manually from the boot loader menu with the set command
>> to get the system to boot)
> 
> That worked. Thanks.
> 
> BTW, the call to abd_put() was given a NULL pointer.
> 

Could you please re-enable ZFS TRIM support and test r320186 or later?
ZFS ABD is a rather large upstream change, and our TRIM support is spread over
a non-trivial amount of code as well.
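
For reference, a minimal sketch of re-enabling TRIM, assuming it was disabled
with the loader.conf line suggested earlier in the thread. If the tunable was
only set by hand at the loader prompt, there is nothing to undo.

```
# /boot/loader.conf
# Either delete the workaround line, or set the tunable back to 1
# (1 should be the default, so deleting the line ought to be equivalent).
vfs.zfs.trim.enabled=1
```

After rebooting into the updated kernel, `sysctl vfs.zfs.trim.enabled` should
report 1, and `uname -v` should show which revision is actually running.
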
Thank you.

-- 
Andriy Gapon