Date:      Wed, 03 Oct 2018 14:55:19 +0200
From:      rainer@ultra-secure.de
To:        stable@freebsd.org
Subject:   11.2-RELEASE panics with a bit of load
Message-ID:  <7b8edb650b9d50a03e60335ebc13ada8@ultra-secure.de>

Hi,

I created a PR for this, but maybe somebody here can help.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=231296


I have an HP DL380 Gen10 server with two smartpqi(4) HBAs and some disks:

smartpqi0: <E208i-p SR Gen10> port 0x4000-0x40ff mem 0xe2800000-0xe2807fff at device 0.0 numa-domain 0 on pci4
smartpqi0: using MSI-X interrupts (16 vectors)
smartpqi1: <P408i-a SR Gen10> port 0xc000-0xc0ff mem 0xf3800000-0xf3807fff at device 0.0 numa-domain 0 on pci9
smartpqi1: using MSI-X interrupts (16 vectors)
ses0 at smartpqi0 bus 0 scbus0 target 187 lun 0
ses1 at smartpqi1 bus 0 scbus1 target 187 lun 0
da2 at smartpqi1 bus 0 scbus1 target 64 lun 0
da7 at smartpqi1 bus 0 scbus1 target 69 lun 0
da5 at smartpqi1 bus 0 scbus1 target 67 lun 0
da3 at smartpqi1 bus 0 scbus1 target 65 lun 0
da8 at smartpqi1 bus 0 scbus1 target 70 lun 0
da4 at smartpqi1 bus 0 scbus1 target 66 lun 0
da9 at smartpqi1 bus 0 scbus1 target 71 lun 0
pass3 at smartpqi0 bus 0 scbus0 target 1088 lun 0
da0 at smartpqi0 bus 0 scbus0 target 64 lun 0
da6 at smartpqi1 bus 0 scbus1 target 68 lun 0
pass13 at smartpqi1 bus 0 scbus1 target 1088 lun 0
da1 at smartpqi0 bus 0 scbus0 target 66 lun 0


The server panics fairly reliably when I rsync compressed log files over 
to it and unpack them there.
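
For anyone who wants to generate a similar write load locally, here is a minimal sketch of the unpacking half of that workload. It is only an approximation: the file names, counts and sizes are hypothetical, it does not use rsync at all, and it is not confirmed to trigger the panic. To put load on ZFS, the target directory would need to live on a ZFS dataset instead of the temporary directory used here.

```python
import gzip
import os
import tempfile


def make_packed_logs(directory, count=4, lines=1000):
    """Write `count` gzip-compressed fake logfiles into `directory`."""
    paths = []
    for i in range(count):
        # Hypothetical file names; the real logfiles are not described.
        path = os.path.join(directory, f"app-{i}.log.gz")
        with gzip.open(path, "wt") as fh:
            for n in range(lines):
                fh.write(f"2018-10-03T14:55:19 host app[{i}]: request {n} ok\n")
        paths.append(path)
    return paths


def unpack_logs(paths):
    """Decompress each .gz file next to itself; return total bytes written."""
    total = 0
    for path in paths:
        out_path = path[:-3]  # strip the ".gz" suffix
        with gzip.open(path, "rb") as src, open(out_path, "wb") as dst:
            while chunk := src.read(1 << 20):
                total += dst.write(chunk)
    return total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as target:
        written = unpack_logs(make_packed_logs(target))
        print(f"unpacked {written} bytes")
```

Raising `count` and `lines` increases the amount of data written during the unpack step.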

Below is the console output from one of those panics, followed by what is 
(hopefully) a usable kgdb backtrace from the resulting crash dump:


Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address	= 0x5a
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff80dff90d
stack pointer	        = 0x28:0xfffffe084ed93f00
frame pointer	        = 0x28:0xfffffe084ed93f40
code segment		= base rx0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 0 (zio_write_issue_10)
trap number		= 12
panic: page fault
cpuid = 3
KDB: stack backtrace:
#0 0xffffffff80b3d567 at kdb_backtrace+0x67
#1 0xffffffff80af6b07 at vpanic+0x177
#2 0xffffffff80af6983 at panic+0x43
#3 0xffffffff80f77fcf at trap_fatal+0x35f
#4 0xffffffff80f78029 at trap_pfault+0x49
#5 0xffffffff80f777f7 at trap+0x2c7
#6 0xffffffff80f57dac at calltrap+0x8
#7 0xffffffff80dee7e2 at kmem_back+0xf2
#8 0xffffffff80dee6c0 at kmem_malloc+0x60
#9 0xffffffff80de6172 at keg_alloc_slab+0xe2
#10 0xffffffff80de8b7e at keg_fetch_slab+0x14e
#11 0xffffffff80de83b4 at zone_fetch_slab+0x64
#12 0xffffffff80de848f at zone_import+0x3f
#13 0xffffffff80de4b99 at uma_zalloc_arg+0x3d9
#14 0xffffffff82351ab2 at zio_write_compress+0x1e2
#15 0xffffffff8235074c at zio_execute+0xac
#16 0xffffffff80b4ed74 at taskqueue_run_locked+0x154
#17 0xffffffff80b4fed8 at taskqueue_thread_loop+0x98
Uptime: 40m34s
Dumping 5489 out of 32379 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
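
A small aside on the fault address: 0x5a is 90 in decimal, which is the same value kgdb reports as eva=90 in the trap_fatal frame. An address that low lies in the first page of the address space, which usually points to a read through a NULL or near-NULL pointer plus a small field offset, though the trace alone does not prove that. The helper below is a hypothetical illustration of that arithmetic, not a FreeBSD tool.

```python
PAGE_SIZE = 4096  # amd64 base page size


def describe_fault(addr_text):
    """Decode a hex fault address and check whether it is in the first page."""
    addr = int(addr_text, 16)
    return {"decimal": addr, "in_first_page": addr < PAGE_SIZE}


info = describe_fault("0x5a")
print(info)  # {'decimal': 90, 'in_first_page': True}
```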

Reading symbols from /boot/kernel/geom_mirror.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/geom_mirror.ko.debug...done.
done.
Loaded symbols for /boot/kernel/geom_mirror.ko
Reading symbols from /boot/kernel/zfs.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/zfs.ko.debug...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/opensolaris.ko.debug...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/accf_data.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/accf_data.ko.debug...done.
done.
Loaded symbols for /boot/kernel/accf_data.ko
Reading symbols from /boot/kernel/accf_http.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/accf_http.ko.debug...done.
done.
Loaded symbols for /boot/kernel/accf_http.ko
Reading symbols from /boot/kernel/cc_htcp.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/cc_htcp.ko.debug...done.
done.
Loaded symbols for /boot/kernel/cc_htcp.ko
Reading symbols from /boot/kernel/ums.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/ums.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ums.ko
Reading symbols from /boot/kernel/tmpfs.ko...Reading symbols from 
/usr/lib/debug//boot/kernel/tmpfs.ko.debug...done.
done.
Loaded symbols for /boot/kernel/tmpfs.ko
#0  0xffffffff80af68fb in doadump (textdump=0) at 
/usr/src/sys/kern/kern_shutdown.c:309
309		if (dumping)
(kgdb) bt
#0  0xffffffff80af68fb in doadump (textdump=0) at 
/usr/src/sys/kern/kern_shutdown.c:309
#1  0xffffffff80af6925 in doadump (textdump=<value optimized out>) at 
/usr/src/sys/kern/kern_shutdown.c:315
#2  0xffffffff80af671b in kern_reboot (howto=260) at 
/usr/src/sys/kern/kern_shutdown.c:382
#3  0xffffffff80af6b41 in vpanic (fmt=<value optimized out>, 
ap=0xfffffe084ed93c50) at /usr/src/sys/kern/kern_shutdown.c:769
#4  0xffffffff80af6983 in panic (fmt=0x0) at 
/usr/src/sys/kern/kern_shutdown.c:706
#5  0xffffffff80f77fcf in trap_fatal (frame=0xfffffe084ed93e40, eva=90) 
at /usr/src/sys/amd64/amd64/trap.c:875
#6  0xffffffff80f78029 in trap_pfault (frame=0xfffffe084ed93e40, 
usermode=0) at /usr/src/sys/amd64/amd64/trap.c:712
#7  0xffffffff80f777f7 in trap (frame=0xfffffe084ed93e40) at 
/usr/src/sys/amd64/amd64/trap.c:514
#8  0xffffffff80f57dac in Xtss_pti () at 
/usr/src/sys/amd64/amd64/exception.S:159
#9  0xffffffff80dff90d in vm_page_rename (m=0x3ff, 
new_object=0xfffff80018d8d000, new_pindex=<value optimized out>) at 
/usr/src/sys/vm/vm_page.c:1342
#10 0xffffffff80dee7e2 in kmem_suballoc (parent=0x262, min=0x14000, 
max=0xffffffff81ebc558, size=874980, superpage_align=<value optimized 
out>) at /usr/src/sys/vm/vm_kern.c:290
#11 0xffffffff80dee6c0 in kmem_alloc_contig (vmem=0xfffffe00d59d0000, 
size=18446744071594296576, flags=<value optimized out>, 
low=18446735303990395200, high=257, alignment=18446735278033391616,
     boundary=18446735278033391616, memattr=-16 '\360') at 
/usr/src/sys/vm/vm_kern.c:254
#12 0xffffffff80de6172 in uma_prealloc (zone=0x0, items=1322860228) at 
/usr/src/sys/vm/uma_core.c:3150
#13 0xfffff806240140f0 in ?? ()
#14 0xfffffe00c51f357e in ?? ()
#15 0xfffffe00d59b0000 in ?? ()
#16 0xfffff8000d460498 in ?? ()
#17 0xfffff80624014140 in ?? ()
#18 0x02fffe00c520c000 in ?? ()
#19 0xfffff8000d460480 in ?? ()
#20 0xfffff8000d4641c0 in ?? ()
#21 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal
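
One thing that stands out when the two traces are put side by side: several return addresses resolve to different function names in the KDB backtrace and in the kgdb output. The sketch below copies a few frames from both traces (kgdb frames are cut to their first line) and lists the addresses where the names disagree. The regular expressions and helper names are made up for this illustration. One possible explanation, and it is only a guess, is that kgdb is resolving those addresses against symbols that do not match the running kernel, in which case the kgdb argument values in those frames should not be trusted.

```python
import re

# Frames copied from the KDB stack backtrace above.
KDB = """
#2 0xffffffff80af6983 at panic+0x43
#3 0xffffffff80f77fcf at trap_fatal+0x35f
#7 0xffffffff80dee7e2 at kmem_back+0xf2
#8 0xffffffff80dee6c0 at kmem_malloc+0x60
#9 0xffffffff80de6172 at keg_alloc_slab+0xe2
"""

# First line of the corresponding frames from the kgdb `bt` output above.
KGDB = """
#4  0xffffffff80af6983 in panic (fmt=0x0) at
#5  0xffffffff80f77fcf in trap_fatal (frame=0xfffffe084ed93e40, eva=90)
#10 0xffffffff80dee7e2 in kmem_suballoc (parent=0x262, min=0x14000,
#11 0xffffffff80dee6c0 in kmem_alloc_contig (vmem=0xfffffe00d59d0000,
#12 0xffffffff80de6172 in uma_prealloc (zone=0x0, items=1322860228) at
"""

KDB_RE = re.compile(r"#\d+\s+(0x[0-9a-f]+) at (\w+)\+0x[0-9a-f]+")
KGDB_RE = re.compile(r"#\d+\s+(0x[0-9a-f]+) in (\w+) \(")


def symbols(text, pattern):
    """Map each return address in `text` to the function name next to it."""
    return {addr: name for addr, name in pattern.findall(text)}


def mismatches(kdb_text, kgdb_text):
    """Return (address, kdb_name, kgdb_name) where the two traces disagree."""
    kdb = symbols(kdb_text, KDB_RE)
    kgdb = symbols(kgdb_text, KGDB_RE)
    shared = kdb.keys() & kgdb.keys()
    return sorted((a, kdb[a], kgdb[a]) for a in shared if kdb[a] != kgdb[a])


for addr, kdb_name, kgdb_name in mismatches(KDB, KGDB):
    print(f"{addr}: KDB says {kdb_name}, kgdb says {kgdb_name}")
```

The panic and trap_fatal frames agree in both traces, while the three VM and UMA frames do not. If the KDB names are the correct ones, the call path is zio_write_compress, uma_zalloc_arg, keg_alloc_slab, kmem_malloc, kmem_back.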


As I said in the PR, I ran memtest86 for 8 hours without any reported 
errors, so I think I can rule out faulty RAM.

I have little experience debugging panics: in 20-odd years of running 
FreeBSD, I have rarely seen one.



Best Regards
Rainer
