Date: Tue, 28 Feb 2017 17:07:56 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 217422] Fatal trap 12: page fault while in kernel mode during heavy IO
Message-ID: <bug-217422-8@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=217422

            Bug ID: 217422
           Summary: Fatal trap 12: page fault while in kernel mode during
                    heavy IO
           Product: Base System
           Version: 11.0-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: cam@neo-zeon.de

I get 2-4 crashes (followed by automatic soft reboots) a week. Every night at
1 am PST, a cron job backs up another system using rsync over ssh.

Last night's crash occurred at ~3:09 am, and the crash before that occurred
around 3:24 am. I've noticed that the "periodic daily" cron job, which I
believe causes some IO load of its own, is set to start at 3:01 am. The
crashes all occur during this heavy backup via rsync.

I went through all the bug reports I could, and I don't *think* this is a
duplicate.

Note: runs on physical hardware w/8 core Intel Avoton CPU.
Motherboard: Supermicro A1SAi-2750F
Memory: 16G ECC
Root file system is ZFS on 2x Intel 730 240G SSDs (ada1 & ada2).
Backup drive is an 8TB Seagate ST8000NM0055-1RM112 spinning disk. It's less
than a year old, on a SATA2 port (ada0). Filesystem is ZFS.
(I recently remade the filesystem so I could create a swap partition for
kernel crash dumps.)

zpool status
  pool: storage
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Mon Feb 27 13:01:41 2017
config:

        NAME        STATE     READ WRITE CKSUM
        storage     ONLINE       0     0     0
          ada0p2    ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
  scan: scrub repaired 0 in 0h2m with 0 errors on Sun Feb 26 01:54:37 2017
config:

        NAME                                            STATE     READ WRITE CKSUM
        zroot                                           ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/3f64a6eb-0faa-11e4-8b78-002590f1cfc0  ONLINE       0     0     0
            gptid/2eb24e92-1555-11e4-9076-002590f1cfc0  ONLINE       0     0     0

errors: No known data errors

I'm using a custom kernel with very few changes (I'll switch to GENERIC to
see if it makes a difference).
Here's the diff:

diff -u GENERIC VASTEEL
--- GENERIC     2016-09-05 10:40:05.944395438 -0700
+++ VASTEEL     2016-09-05 10:40:22.326390926 -0700
@@ -357,3 +357,18 @@
 # The crypto framework is required by IPSEC
 device         crypto          # Required by IPSEC
+
+# Enable disk quota.
+options        QUOTA
+
+device         pf
+device         pflog
+device         pfsync
+
+options        ALTQ
+options        ALTQ_CBQ        # Class Bases Queuing (CBQ)
+options        ALTQ_RED        # Random Early Detection (RED)
+options        ALTQ_RIO        # RED In/Out
+options        ALTQ_HFSC       # Hierarchical Packet Scheduler (HFSC)
+options        ALTQ_PRIQ       # Priority Queuing (PRIQ)
+options        ALTQ_NOPCC      # Required for SMP build

kldstat
Id Refs Address            Size     Name
 1   25 0xffffffff80200000 20058c0  kernel
 2    1 0xffffffff82207000 30b650   zfs.ko
 3    2 0xffffffff82513000 adb0     opensolaris.ko
 4    1 0xffffffff8251e000 4c60     coretemp.ko
 5    1 0xffffffff82621000 587b     fdescfs.ko
 6    1 0xffffffff82627000 3710     ums.ko
 7    1 0xffffffff8262b000 abf1     linprocfs.ko
 8    1 0xffffffff82636000 7b18     linux_common.ko

Kernel crash dump:

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
cpuid = 7; apic id = 0e
fault virtual address   = 0x8
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80b7b5d0
stack pointer           = 0x28:0xfffffe04669a87c0
frame pointer           = 0x28:0xfffffe04669a8800
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 10403 (rsync)
trap number             = 12
panic: page fault
cpuid = 7
KDB: stack backtrace:
#0 0xffffffff80b2dcb7 at kdb_backtrace+0x67
#1 0xffffffff80ae2302 at vpanic+0x182
#2 0xffffffff80ae2173 at panic+0x43
#3 0xffffffff80ff2c71 at trap_fatal+0x351
#4 0xffffffff80ff2e63 at trap_pfault+0x1e3
#5 0xffffffff80ff240d at trap+0x26d
#6 0xffffffff80fd5441 at calltrap+0x8
#7 0xffffffff80b7a398 at sbdestroy+0x18
#8 0xffffffff80b7cd9a at sofree+0x22a
#9 0xffffffff80b7d516 at soclose+0x516
#10 0xffffffff80a7ad0d at _fdrop+0x1d
#11 0xffffffff80a7e90d at closef+0x2ed
#12 0xffffffff80a7e35d at fdescfree_fds+0x7d
#13 0xffffffff80a7dee9 at fdescfree+0x6b9
#14 0xffffffff80a9011e at exit1+0x75e
#15 0xffffffff80a8f9bd at sys_sys_exit+0xd
#16 0xffffffff80ff35e3 at amd64_syscall+0x4e3
#17 0xffffffff80fd572b at Xfast_syscall+0xfb
Uptime: 13h43m7s
Dumping 2913 out of 16321 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /usr/lib/debug//boot/kernel/zfs.ko.debug...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /usr/lib/debug//boot/kernel/opensolaris.ko.debug...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/coretemp.ko...Reading symbols from /usr/lib/debug//boot/kernel/coretemp.ko.debug...done.
done.
Loaded symbols for /boot/kernel/coretemp.ko
Reading symbols from /boot/kernel/fdescfs.ko...Reading symbols from /usr/lib/debug//boot/kernel/fdescfs.ko.debug...done.
done.
Loaded symbols for /boot/kernel/fdescfs.ko
Reading symbols from /boot/kernel/ums.ko...Reading symbols from /usr/lib/debug//boot/kernel/ums.ko.debug...done.
done.
Loaded symbols for /boot/kernel/ums.ko
Reading symbols from /boot/kernel/linprocfs.ko...Reading symbols from /usr/lib/debug//boot/kernel/linprocfs.ko.debug...done.
done.
Loaded symbols for /boot/kernel/linprocfs.ko
Reading symbols from /boot/kernel/linux_common.ko...Reading symbols from /usr/lib/debug//boot/kernel/linux_common.ko.debug...done.
done.
Loaded symbols for /boot/kernel/linux_common.ko
Reading symbols from /boot/kernel/snp.ko...Reading symbols from /usr/lib/debug//boot/kernel/snp.ko.debug...done.
done.
Loaded symbols for /boot/kernel/snp.ko
#0  doadump (textdump=<value optimized out>) at pcpu.h:221
221             __asm("movq %%gs:%1,%0" : "=r" (td

dmesg:

Copyright (c) 1992-2016 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.0-RELEASE-p8 #16 r314186: Thu Feb 23 16:40:33 PST 2017
    root@vasteel.neo-zeon.de:/usr/obj/usr/src/sys/VASTEEL amd64
FreeBSD clang version 3.8.0 (tags/RELEASE_380/final 262564) (based on LLVM 3.8.0)
VT(vga): resolution 640x480
CPU: Intel(R) Atom(TM) CPU C2750 @ 2.40GHz (2400.07-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x406d8  Family=0x6  Model=0x4d  Stepping=8
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x43d8e3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,AESNI,RDRAND>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x101<LAHF,Prefetch>
  Structured Extended Features=0x2282<TSCADJ,SMEP,ERMS,NFPUSG>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
  TSC: P-state invariant, performance statistics
real memory  = 17179869184 (16384 MB)
avail memory = 16515948544 (15750 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <INTEL TIANO >
WARNING: L1 data cache covers less APIC IDs than a core
0 < 1
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
FreeBSD/SMP: 1 package(s) x 8 core(s)
random: unblocking device.
ioapic0 <Version 2.0> irqs 0-23 on motherboard
random: entropy device external interface
kbd1 at kbdmux0
netmap: loaded module
module_register_init: MOD_LOAD (vesa, 0xffffffff8106e9a0, 0) error 19
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
vtvga0: <VT VGA driver> on motherboard
cryptosoft0: <software crypto> on motherboard
acpi0: <ALASKA A M I > on motherboard
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
cpu4: <ACPI CPU> on acpi0
cpu5: <ACPI CPU> on acpi0
cpu6: <ACPI CPU> on acpi0
cpu7: <ACPI CPU> on acpi0
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 350
Event timer "HPET1" frequency 14318180 Hz quality 340
Event timer "HPET2" frequency 14318180 Hz quality 340
atrtc0: <AT realtime clock> port 0x70-0x77 irq 8 on acpi0
atrtc0: Warning: Couldn't map I/O.
Event timer "RTC" frequency 32768 Hz quality 0
attimer0: <AT timer> port 0x40-0x43,0x50-0x53 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pcib0: _OSC returned error 0x10
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> mem 0xdf2c0000-0xdf2dffff irq 16 at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> at device 0.0 on pci1
pci2: <ACPI PCI bus> on pcib2
vgapci0: <VGA-compatible display> port 0xd000-0xd07f mem 0xde000000-0xdeffffff,0xdf000000-0xdf01ffff irq 16 at device 0.0 on pci2
vgapci0: Boot video device
pcib3: <ACPI PCI-PCI bridge> mem 0xdf2a0000-0xdf2bffff irq 16 at device 2.0 on pci0
pci3: <ACPI PCI bus> on pcib3
xhci0: <XHCI (generic) USB 3.0 controller> mem 0xdf100000-0xdf101fff irq 17 at device 0.0 on pci3
xhci0: 64 bytes context size, 32-bit DMA
xhci0: Unable to map MSI-X table
usbus0 on xhci0
pcib4: <ACPI PCI-PCI bridge> mem 0xdf280000-0xdf29ffff irq 20 at device 3.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pci0: <base peripheral, IOMMU> at device 15.0 (no driver attached)
igb0: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe0c0-0xe0df mem 0xdf260000-0xdf27ffff,0xdf2ec000-0xdf2effff irq 20 at device 20.0 on pci0
igb0: Using MSIX interrupts with 9 vectors
igb0: Ethernet address: 00:25:90:f1:cf:c0
igb0: Bound queue 0 to cpu 0
igb0: Bound queue 1 to cpu 1
igb0: Bound queue 2 to cpu 2
igb0: Bound queue 3 to cpu 3
igb0: Bound queue 4 to cpu 4
igb0: Bound queue 5 to cpu 5
igb0: Bound queue 6 to cpu 6
igb0: Bound queue 7 to cpu 7
igb0: netmap queues/slots: TX 8/1024, RX 8/1024
igb1: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe0a0-0xe0bf mem 0xdf240000-0xdf25ffff,0xdf2e8000-0xdf2ebfff irq 21 at device 20.1 on pci0
igb1: Using MSIX interrupts with 9 vectors
igb1: Ethernet address: 00:25:90:f1:cf:c1
igb1: Bound queue 0 to cpu 0
igb1: Bound queue 1 to cpu 1
igb1: Bound queue 2 to cpu 2
igb1: Bound queue 3 to cpu 3
igb1: Bound queue 4 to cpu 4
igb1: Bound queue 5 to cpu 5
igb1: Bound queue 6 to cpu 6
igb1: Bound queue 7 to cpu 7
igb1: netmap queues/slots: TX 8/1024, RX 8/1024
igb2: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe080-0xe09f mem 0xdf220000-0xdf23ffff,0xdf2e4000-0xdf2e7fff irq 22 at device 20.2 on pci0
igb2: Using MSIX interrupts with 9 vectors
igb2: Ethernet address: 00:25:90:f1:cf:c2
igb2: Bound queue 0 to cpu 0
igb2: Bound queue 1 to cpu 1
igb2: Bound queue 2 to cpu 2
igb2: Bound queue 3 to cpu 3
igb2: Bound queue 4 to cpu 4
igb2: Bound queue 5 to cpu 5
igb2: Bound queue 6 to cpu 6
igb2: Bound queue 7 to cpu 7
igb2: netmap queues/slots: TX 8/1024, RX 8/1024
igb3: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xe060-0xe07f mem 0xdf200000-0xdf21ffff,0xdf2e0000-0xdf2e3fff irq 23 at device 20.3 on pci0
igb3: Using MSIX interrupts with 9 vectors
igb3: Ethernet address: 00:25:90:f1:cf:c3
igb3: Bound queue 0 to cpu 0
igb3: Bound queue 1 to cpu 1
igb3: Bound queue 2 to cpu 2
igb3: Bound queue 3 to cpu 3
igb3: Bound queue 4 to cpu 4
igb3: Bound queue 5 to cpu 5
igb3: Bound queue 6 to cpu 6
igb3: Bound queue 7 to cpu 7
igb3: netmap queues/slots: TX 8/1024, RX 8/1024
ehci0: <Intel Avoton USB 2.0 controller> mem 0xdf2f3000-0xdf2f33ff irq 23 at device 22.0 on pci0
usbus1: EHCI version 1.0
usbus1 on ehci0
ahci0: <Intel Avoton AHCI SATA controller> port 0xe150-0xe157,0xe140-0xe143,0xe130-0xe137,0xe120-0xe123,0xe040-0xe05f mem 0xdf2f2000-0xdf2f27ff irq 19 at device 23.0 on pci0
ahci0: AHCI v1.30 with 4 3Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ahcich2: <AHCI channel> at channel 2 on ahci0
ahcich3: <AHCI channel> at channel 3 on ahci0
ahci1: <Intel Avoton AHCI SATA controller> port 0xe110-0xe117,0xe100-0xe103,0xe0f0-0xe0f7,0xe0e0-0xe0e3,0xe020-0xe03f mem 0xdf2f1000-0xdf2f17ff irq 19 at device 24.0 on pci0
ahci1: AHCI v1.30 with 2 6Gbps ports, Port Multiplier not supported
ahcich4: <AHCI channel> at channel 0 on ahci1
ahcich5: <AHCI channel> at channel 1 on ahci1
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
ppc0: cannot reserve I/O port range
coretemp0: <CPU On-Die Thermal Sensors> on cpu0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
coretemp1: <CPU On-Die Thermal Sensors> on cpu1
est1: <Enhanced SpeedStep Frequency Control> on cpu1
coretemp2: <CPU On-Die Thermal Sensors> on cpu2
est2: <Enhanced SpeedStep Frequency Control> on cpu2
coretemp3: <CPU On-Die Thermal Sensors> on cpu3
est3: <Enhanced SpeedStep Frequency Control> on cpu3
coretemp4: <CPU On-Die Thermal Sensors> on cpu4
est4: <Enhanced SpeedStep Frequency Control> on cpu4
coretemp5: <CPU On-Die Thermal Sensors> on cpu5
est5: <Enhanced SpeedStep Frequency Control> on cpu5
coretemp6: <CPU On-Die Thermal Sensors> on cpu6
est6: <Enhanced SpeedStep Frequency Control> on cpu6
coretemp7: <CPU On-Die Thermal Sensors> on cpu7
est7: <Enhanced SpeedStep Frequency Control> on cpu7
usbus0: 5.0Gbps Super Speed USB v3.0
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
Timecounters tick every 1.000 msec
nvme cam probe device init
usbus1: 480Mbps High Speed USB v2.0
ugen0.1: <0x1912> at usbus0
uhub0: <0x1912 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
ugen1.1: <Intel> at usbus1
uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
ada0 at ahcich1 bus 0 scbus1 target 0 lun 0
ada0: <ST8000NM0055-1RM112 SN02> ACS-3 ATA SATA 3.x device
ada0: Serial Number ZA11E7R9
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 7630885MB (15628053168 512 byte sectors)
ada1 at ahcich4 bus 0 scbus4 target 0 lun 0
ada1: <INTEL SSDSC2BP240G4 L2010410> ATA8-ACS SATA 3.x device
ada1: Serial Number BTJR408202C3240AGN
ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada1: Command Queueing enabled
ada1: 228936MB (468862128 512 byte sectors)
ada2 at ahcich5 bus 0 scbus5 target 0 lun 0
ada2: <INTEL SSDSC2BP240G4 L2010410> ATA8-ACS SATA 3.x device
ada2: Serial Number BTJR40820CQN240AGN
ada2: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada2: Command Queueing enabled
ada2: 228936MB (468862128 512 byte sectors)
SMP: AP CPU #6 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #5 Launched!
SMP: AP CPU #7 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #4 Launched!
SMP: AP CPU #1 Launched!
Timecounter "TSC-low" frequency 1200035112 Hz quality 1000
Trying to mount root from zfs:zroot/ROOT/default []...
Root mount waiting for: usbus1 usbus0
uhub0: 8 ports with 8 removable, self powered
Root mount waiting for: usbus1
Root mount waiting for: usbus1
uhub1: 8 ports with 8 removable, self powered
Root mount waiting for: usbus1
ugen1.2: <vendor 0x8087> at usbus1
uhub2: <vendor 0x8087 product 0x07db, class 9/0, rev 2.00/0.02, addr 2> on usbus1
Root mount waiting for: usbus1
uhub2: 4 ports with 4 removable, self powered
ugen1.3: <vendor 0x0000> at usbus1
uhub3: <vendor 0x0000 product 0x0001, class 9/0, rev 2.00/0.00, addr 3> on usbus1
Root mount waiting for: usbus1
uhub3: 4 ports with 3 removable, self powered
ugen1.4: <vendor 0x0557> at usbus1
ukbd0: <vendor 0x0557 product 0x2419, class 0/0, rev 1.10/1.00, addr 4> on usbus1
kbd2 at ukbd0
igb0: link state changed to UP
ums0: <vendor 0x0557 product 0x2419, class 0/0, rev 1.10/1.00, addr 4> on usbus1
ums0: 3 buttons and [Z] coordinates ID=0
pflog0: promiscuous mode enabled
igb1: link state changed to UP

-- 
You are receiving this mail because:
You are the assignee for the bug.