From: rainer@ultra-secure.de
To: stable@freebsd.org
Subject: 11.2-RELEASE panics with a bit of load
Date: Wed, 03 Oct 2018 14:55:19 +0200
Message-ID: <7b8edb650b9d50a03e60335ebc13ada8@ultra-secure.de>

Hi,

I created a PR for this, but maybe somebody here can help.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231296

I have an HP DL380 Gen10 server with a smartpqi(4) HBA and some disks:

smartpqi0: port 0x4000-0x40ff mem 0xe2800000-0xe2807fff at device 0.0 numa-domain 0 on pci4
smartpqi0: using MSI-X interrupts (16 vectors)
smartpqi1: port 0xc000-0xc0ff mem 0xf3800000-0xf3807fff at device 0.0 numa-domain 0 on pci9
smartpqi1: using MSI-X interrupts (16 vectors)
ses0 at smartpqi0 bus 0 scbus0 target 187 lun 0
ses1 at smartpqi1 bus 0 scbus1 target 187 lun 0
da2 at smartpqi1 bus 0 scbus1 target 64 lun 0
da7 at smartpqi1 bus 0 scbus1 target 69 lun 0
da5 at smartpqi1 bus 0 scbus1 target 67 lun 0
da3 at smartpqi1 bus 0 scbus1 target 65 lun 0
da8 at smartpqi1 bus 0 scbus1 target 70 lun 0
da4 at smartpqi1 bus 0 scbus1 target 66 lun 0
da9 at smartpqi1 bus 0 scbus1 target 71 lun 0
pass3 at smartpqi0 bus 0 scbus0 target 1088 lun 0
da0 at smartpqi0 bus 0 scbus0 target 64 lun 0
da6 at smartpqi1 bus 0 scbus1 target 68 lun 0
pass13 at smartpqi1 bus 0 scbus1 target 1088 lun 0
da1 at smartpqi0 bus 0 scbus0 target 66 lun 0

This server can be made to panic relatively easily by rsyncing packed logfiles over to it and unpacking them.
This is (hopefully) a backtrace of a crashdump resulting from one of those panics:

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address   = 0x5a
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80dff90d
stack pointer           = 0x28:0xfffffe084ed93f00
frame pointer           = 0x28:0xfffffe084ed93f40
code segment            = base rx0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (zio_write_issue_10)
trap number             = 12
panic: page fault
cpuid = 3
KDB: stack backtrace:
#0 0xffffffff80b3d567 at kdb_backtrace+0x67
#1 0xffffffff80af6b07 at vpanic+0x177
#2 0xffffffff80af6983 at panic+0x43
#3 0xffffffff80f77fcf at trap_fatal+0x35f
#4 0xffffffff80f78029 at trap_pfault+0x49
#5 0xffffffff80f777f7 at trap+0x2c7
#6 0xffffffff80f57dac at calltrap+0x8
#7 0xffffffff80dee7e2 at kmem_back+0xf2
#8 0xffffffff80dee6c0 at kmem_malloc+0x60
#9 0xffffffff80de6172 at keg_alloc_slab+0xe2
#10 0xffffffff80de8b7e at keg_fetch_slab+0x14e
#11 0xffffffff80de83b4 at zone_fetch_slab+0x64
#12 0xffffffff80de848f at zone_import+0x3f
#13 0xffffffff80de4b99 at uma_zalloc_arg+0x3d9
#14 0xffffffff82351ab2 at zio_write_compress+0x1e2
#15 0xffffffff8235074c at zio_execute+0xac
#16 0xffffffff80b4ed74 at taskqueue_run_locked+0x154
#17 0xffffffff80b4fed8 at taskqueue_thread_loop+0x98
Uptime: 40m34s
Dumping 5489 out of 32379 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/geom_mirror.ko...Reading symbols from /usr/lib/debug//boot/kernel/geom_mirror.ko.debug...done.
done.
Loaded symbols for /boot/kernel/geom_mirror.ko
Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /usr/lib/debug//boot/kernel/zfs.ko.debug...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /usr/lib/debug//boot/kernel/opensolaris.ko.debug...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/accf_data.ko...Reading symbols from /usr/lib/debug//boot/kernel/accf_data.ko.debug...done.
done.
Loaded symbols for /boot/kernel/accf_data.ko
Reading symbols from /boot/kernel/accf_http.ko...Reading symbols from /usr/lib/debug//boot/kernel/accf_http.ko.debug...done.
done.
Loaded symbols for /boot/kernel/accf_http.ko
Reading symbols from /boot/kernel/cc_htcp.ko...Reading symbols from /usr/lib/debug//boot/kernel/cc_htcp.ko.debug...done.
done.
Loaded symbols for /boot/kernel/cc_htcp.ko
Reading symbols from /boot/kernel/ums.ko...Reading symbols from /usr/lib/debug//boot/kernel/ums.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ums.ko
Reading symbols from /boot/kernel/tmpfs.ko...Reading symbols from /usr/lib/debug//boot/kernel/tmpfs.ko.debug...done.
done.
Loaded symbols for /boot/kernel/tmpfs.ko
#0  0xffffffff80af68fb in doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:309
309             if (dumping)
(kgdb) bt
#0  0xffffffff80af68fb in doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:309
#1  0xffffffff80af6925 in doadump (textdump=) at /usr/src/sys/kern/kern_shutdown.c:315
#2  0xffffffff80af671b in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:382
#3  0xffffffff80af6b41 in vpanic (fmt=, ap=0xfffffe084ed93c50) at /usr/src/sys/kern/kern_shutdown.c:769
#4  0xffffffff80af6983 in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:706
#5  0xffffffff80f77fcf in trap_fatal (frame=0xfffffe084ed93e40, eva=90) at /usr/src/sys/amd64/amd64/trap.c:875
#6  0xffffffff80f78029 in trap_pfault (frame=0xfffffe084ed93e40, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:712
#7  0xffffffff80f777f7 in trap (frame=0xfffffe084ed93e40) at /usr/src/sys/amd64/amd64/trap.c:514
#8  0xffffffff80f57dac in Xtss_pti () at /usr/src/sys/amd64/amd64/exception.S:159
#9  0xffffffff80dff90d in vm_page_rename (m=0x3ff, new_object=0xfffff80018d8d000, new_pindex=) at /usr/src/sys/vm/vm_page.c:1342
#10 0xffffffff80dee7e2 in kmem_suballoc (parent=0x262, min=0x14000, max=0xffffffff81ebc558, size=874980, superpage_align=) at /usr/src/sys/vm/vm_kern.c:290
#11 0xffffffff80dee6c0 in kmem_alloc_contig (vmem=0xfffffe00d59d0000, size=18446744071594296576, flags=, low=18446735303990395200, high=257, alignment=18446735278033391616, boundary=18446735278033391616, memattr=-16 '\360') at /usr/src/sys/vm/vm_kern.c:254
#12 0xffffffff80de6172 in uma_prealloc (zone=0x0, items=1322860228) at /usr/src/sys/vm/uma_core.c:3150
#13 0xfffff806240140f0 in ?? ()
#14 0xfffffe00c51f357e in ?? ()
#15 0xfffffe00d59b0000 in ?? ()
#16 0xfffff8000d460498 in ?? ()
#17 0xfffff80624014140 in ?? ()
#18 0x02fffe00c520c000 in ?? ()
#19 0xfffff8000d460480 in ?? ()
#20 0xfffff8000d4641c0 in ?? ()
#21 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal

As I said in the PR, I've had memtest86 running for 8h with no reported problem.
So I think I can rule out memory problems.

I don't really have any experience debugging panics because in the last 20-odd years of running FreeBSD, there rarely were any...

Best Regards
Rainer
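
A note on comparing the two traces: several return addresses show up in both the KDB backtrace and the kgdb backtrace, but not always under the same function name. For example, 0xffffffff80dee7e2 is kmem_back+0xf2 in the KDB output and kmem_suballoc in kgdb. One possible explanation, and this is only an assumption, is that kgdb was resolving against symbols that do not match the kernel that panicked. The comparison can be done mechanically. Below is a minimal Python sketch; the frame lines are copied from the traces in this message (kgdb argument lists shortened), and the helper names are made up for illustration.

```python
import re

# A few frames copied from the "KDB: stack backtrace" in the panic message.
KDB_TRACE = """\
#2 0xffffffff80af6983 at panic+0x43
#3 0xffffffff80f77fcf at trap_fatal+0x35f
#7 0xffffffff80dee7e2 at kmem_back+0xf2
#8 0xffffffff80dee6c0 at kmem_malloc+0x60
#9 0xffffffff80de6172 at keg_alloc_slab+0xe2
"""

# The same return addresses as kgdb resolved them (arguments shortened).
KGDB_TRACE = """\
#4 0xffffffff80af6983 in panic (fmt=0x0)
#5 0xffffffff80f77fcf in trap_fatal (frame=0xfffffe084ed93e40, eva=90)
#10 0xffffffff80dee7e2 in kmem_suballoc (parent=0x262)
#11 0xffffffff80dee6c0 in kmem_alloc_contig (vmem=0xfffffe00d59d0000)
#12 0xffffffff80de6172 in uma_prealloc (zone=0x0, items=1322860228)
"""

# "#N <addr> at <symbol>+<offset>" as printed by KDB.
KDB_RE = re.compile(r"#\d+\s+(0x[0-9a-f]+) at (\w+)\+0x[0-9a-f]+")
# "#N <addr> in <symbol> (" as printed by kgdb.
KGDB_RE = re.compile(r"#\d+\s+(0x[0-9a-f]+) in (\w+) \(")


def symbols_by_address(trace, pattern):
    """Map each return address in a trace to the symbol printed for it."""
    return {m.group(1): m.group(2) for m in pattern.finditer(trace)}


def mismatches(kdb, kgdb):
    """Return addresses present in both traces whose symbol names differ."""
    common = kdb.keys() & kgdb.keys()
    return {
        addr: (kdb[addr], kgdb[addr])
        for addr in sorted(common)
        if kdb[addr] != kgdb[addr]
    }


kdb_syms = symbols_by_address(KDB_TRACE, KDB_RE)
kgdb_syms = symbols_by_address(KGDB_TRACE, KGDB_RE)
diff = mismatches(kdb_syms, kgdb_syms)

for addr, (kdb_name, kgdb_name) in diff.items():
    print(f"{addr}: KDB says {kdb_name}, kgdb says {kgdb_name}")
```

With the frames above, the panic and trap_fatal addresses agree in both traces, while the three allocator frames do not. If that pattern holds for the full traces, the KDB names for those frames may be the more reliable ones, though that would need to be confirmed against the exact kernel build.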