Date: Thu, 21 Apr 2016 14:47:22 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-amd64@FreeBSD.org
Subject: [Bug 208957] Kernel panic (page fault) on 10.3-STABLE with VIMAGE & Infiniband modules
Message-ID: <bug-208957-6@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=208957

    Bug ID: 208957
   Summary: Kernel panic (page fault) on 10.3-STABLE with VIMAGE & Infiniband modules
   Product: Base System
   Version: 10.3-RELEASE
  Hardware: amd64
        OS: Any
    Status: New
  Severity: Affects Some People
  Priority: ---
 Component: kern
  Assignee: freebsd-bugs@FreeBSD.org
  Reporter: justin@postgresql.org
        CC: freebsd-amd64@FreeBSD.org

The VIMAGE option is causing a kernel panic (page fault) when compiled along
with the Infiniband options on 10.3-STABLE. It's 100% reproducible, and easily
triggered. ;)

Note - compiled this multiple times over the last few days, across several
systems, just to ensure it's not due to bad hw in a system. It panics reliably
every time, on them all. Definitely a software bug of some sort.

Note - Anecdotal evidence suggests the repeated problems of VIMAGE + Infiniband
are a large part of the reason Infiniband isn't supported on FreeNAS. The
NAS4Free project also has difficulties with Infiniband, very likely also due to
this. :(

  https://bugs.freenas.org/issues/2014#note-18

Anyway, backtrace info below in case it helps:

  (commands taken from
  https://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-gdb.html)

***********************************************************************************

root@cluster1:/usr/obj/usr/src/sys/CONNECTX # kgdb kernel.debug /var/crash/vmcore.0
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (irq271: mlx4_core0)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff807263d0 at kdb_backtrace+0x60
#1 0xffffffff806e8c76 at vpanic+0x126
#2 0xffffffff806e8b43 at panic+0x43
#3 0xffffffff80b8bf3b at trap_fatal+0x36b
#4 0xffffffff80b8c23d at trap_pfault+0x2ed
#5 0xffffffff80b8b8ba at trap+0x47a
#6 0xffffffff80b71892 at calltrap+0x8
#7 0xffffffff807be1a2 at netisr_dispatch_src+0x62
#8 0xffffffff808f89fa at ipoib_cm_handle_rx_wc+0x22a
#9 0xffffffff808fcc98 at ipoib_ib_completion+0x78
#10 0xffffffff80930c43 at mlx4_cq_completion+0x63
#11 0xffffffff80933d43 at mlx4_eq_int+0x2c3
#12 0xffffffff80932fac at mlx4_msi_x_interrupt+0xc
#13 0xffffffff806b35cb at intr_event_execute_handlers+0xab
#14 0xffffffff806b3a16 at ithread_loop+0x96
#15 0xffffffff806b104a at fork_exit+0x9a
#16 0xffffffff80b71dce at fork_trampoline+0xe
Uptime: 3m47s
Dumping 485 out of 7857 MB:..4%..14%..24%..33%..43%..53%..63%..73%..83%..93%

Reading symbols from /boot/kernel/ums.ko.symbols...done.
Loaded symbols for /boot/kernel/ums.ko.symbols
#0  doadump (textdump=<value optimized out>) at pcpu.h:219
219             __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) list *0xffffffff808f89fa
0xffffffff808f89fa is in ipoib_cm_handle_rx_wc
(/usr/src/sys/ofed/drivers/infiniband/ulp/ipoib/ipoib_cm.c:565).
560             mb->m_pkthdr.rcvif = dev;
561             proto = *mtod(mb, uint16_t *);
562             m_adj(mb, IPOIB_ENCAP_LEN);
563
564             IPOIB_MTAP_PROTO(dev, mb, proto);
565             ipoib_demux(dev, mb, ntohs(proto));
566
567     repost:
568             if (has_srq) {
569                     if (unlikely(ipoib_cm_post_receive_srq(priv, wr_id)))
Current language:  auto; currently minimal
(kgdb) list *0xffffffff807be1a2
0xffffffff807be1a2 is in netisr_dispatch_src (/usr/src/sys/net/netisr.c:976).
971             if (dispatch_policy == NETISR_DISPATCH_DIRECT) {
972                     nwsp = DPCPU_PTR(nws);
973                     npwp = &nwsp->nws_work[proto];
974                     npwp->nw_dispatched++;
975                     npwp->nw_handled++;
976                     netisr_proto[proto].np_handler(m);
977                     error = 0;
978                     goto out_unlock;
979             }
980
(kgdb) list *0xffffffff80b71892
0xffffffff80b71892 is at /usr/src/sys/amd64/amd64/exception.S:238.
233             .type   calltrap,@function
234     calltrap:
235             movq    %rsp,%rdi
236             call    trap
237             MEXITCOUNT
238             jmp     doreti                  /* Handle any pending ASTs */
239
240             /*
241              * alltraps_noen entry point.  Unlike alltraps above, we want to
242              * leave the interrupts disabled.  This corresponds to
(kgdb) list *0xffffffff80b8b8ba
0xffffffff80b8b8ba is in trap (/usr/src/sys/amd64/amd64/trap.c:447).
442
443             KASSERT(cold || td->td_ucred != NULL,
444                 ("kernel trap doesn't have ucred"));
445             switch (type) {
446             case T_PAGEFLT:                         /* page fault */
447                     (void) trap_pfault(frame, FALSE);
448                     goto out;
449
450             case T_DNA:
451                     KASSERT(!PCB_USER_FPU(td->td_pcb),
(kgdb)

***********************************************************************************

Kernel configuration used:

---
#
# GENERIC -- Generic kernel configuration file for FreeBSD/amd64
#
# For more information on this file, please read the config(5) manual page,
# and/or the handbook section on Kernel Configuration Files:
#
#    http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig-config.html
#
# The handbook is also available locally in /usr/share/doc/handbook
# if you've installed the doc distribution, otherwise always see the
# FreeBSD World Wide Web server (http://www.FreeBSD.org/) for the
# latest information.
#
# An exhaustive list of options and more detailed explanations of the
# device lines is also present in the ../../conf/NOTES and NOTES files.
# If you are in doubt as to the purpose or necessity of a line, check first
# in NOTES.
#
# $FreeBSD: stable/10/sys/amd64/conf/GENERIC 286132 2015-07-31 15:25:07Z gjb $

cpu     HAMMER
ident   CONNECTX2

makeoptions DEBUG=-g    # Build kernel with gdb(1) debug symbols
makeoptions WITH_CTF=1  # Run ctfconvert(1) for DTrace support

#####################################################################
# NETWORKING OPTIONS

#
# DEVICE_POLLING adds support for mixed interrupt-polling handling
# of network device drivers, which has significant benefits in terms
# of robustness to overloads and responsivity, as well as permitting
# accurate scheduling of the CPU time between kernel network processing
# and other activities.  The drawback is a moderate (up to 1/HZ seconds)
# potential increase in response times.
# It is strongly recommended to use HZ=1000 or 2000 with DEVICE_POLLING
# to achieve smoother behaviour.
# Additionally, you can enable/disable polling at runtime with help of
# the ifconfig(8) utility, and select the CPU fraction reserved to
# userland with the sysctl variable kern.polling.user_frac
# (default 50, range 0..100).
#
# Not all device drivers support this mode of operation at the time of
# this writing.  See polling(4) for more details.

options DEVICE_POLLING

# BPF_JITTER adds support for BPF just-in-time compiler.
options BPF_JITTER

# OpenFabrics Enterprise Distribution (Infiniband).
options OFED
options OFED_DEBUG_INIT

# Sockets Direct Protocol
options SDP
options SDP_DEBUG

# IP over Infiniband
options IPOIB
options IPOIB_DEBUG
options IPOIB_CM

#####################################################################

options SCHED_ULE  # ULE scheduler
options PREEMPTION  # Enable kernel thread preemption
options INET  # InterNETworking
options INET6  # IPv6 communications protocols
options TCP_OFFLOAD  # TCP offload
options SCTP  # Stream Control Transmission Protocol
options FFS  # Berkeley Fast Filesystem
options SOFTUPDATES  # Enable FFS soft updates support
options UFS_ACL  # Support for access control lists
options UFS_DIRHASH  # Improve performance on big directories
options UFS_GJOURNAL  # Enable gjournal-based UFS journaling
options QUOTA  # Enable disk quotas for UFS
options MD_ROOT  # MD is a potential root device
options NFSCL  # New Network Filesystem Client
options NFSD  # New Network Filesystem Server
options NFSLOCKD  # Network Lock Manager
options NFS_ROOT  # NFS usable as /, requires NFSCL
options MSDOSFS  # MSDOS Filesystem
options CD9660  # ISO 9660 Filesystem
options PROCFS  # Process filesystem (requires PSEUDOFS)
options PSEUDOFS  # Pseudo-filesystem framework
options GEOM_PART_GPT  # GUID Partition Tables.
options GEOM_RAID  # Soft RAID functionality.
options GEOM_LABEL  # Provides labelization
options COMPAT_FREEBSD32  # Compatible with i386 binaries
options COMPAT_FREEBSD4  # Compatible with FreeBSD4
options COMPAT_FREEBSD5  # Compatible with FreeBSD5
options COMPAT_FREEBSD6  # Compatible with FreeBSD6
options COMPAT_FREEBSD7  # Compatible with FreeBSD7
options SCSI_DELAY=5000  # Delay (in ms) before probing SCSI
options KTRACE  # ktrace(1) support
options STACK  # stack(9) support
options SYSVSHM  # SYSV-style shared memory
options SYSVMSG  # SYSV-style message queues
options SYSVSEM  # SYSV-style semaphores
options _KPOSIX_PRIORITY_SCHEDULING  # POSIX P1003_1B real-time extensions
options PRINTF_BUFR_SIZE=128  # Prevent printf output being interspersed.
options KBD_INSTALL_CDEV  # install a CDEV entry in /dev
options HWPMC_HOOKS  # Necessary kernel hooks for hwpmc(4)
options AUDIT  # Security event auditing
options CAPABILITY_MODE  # Capsicum capability mode
options CAPABILITIES  # Capsicum capabilities
options PROCDESC  # Support for process descriptors
options MAC  # TrustedBSD MAC Framework
options KDTRACE_FRAME  # Ensure frames are compiled in
options KDTRACE_HOOKS  # Kernel DTrace hooks
options DDB_CTF  # Kernel ELF linker loads CTF data
options INCLUDE_CONFIG_FILE  # Include this file in kernel
options RACCT  # Resource accounting framework
options RACCT_DEFAULT_TO_DISABLED  # Set kern.racct.enable=0 by default
options RCTL  # Resource limits

# Debugging support.  Always need this:
options KDB  # Enable kernel debugger support.
options KDB_TRACE  # Print a stack trace for a panic.

# Make an SMP-capable kernel by default
options SMP  # Symmetric MultiProcessor Kernel

# CPU frequency control
device  cpufreq

# Bus support.
device  acpi
options ACPI_DMAR
device  pci

# Floppy drives
#device fdc

# ATA controllers
device  ahci  # AHCI-compatible SATA controllers
device  ata  # Legacy ATA/SATA controllers
options ATA_STATIC_ID  # Static device numbering
device  mvs  # Marvell 88SX50XX/88SX60XX/88SX70XX/SoC SATA
device  siis  # SiliconImage SiI3124/SiI3132/SiI3531 SATA

# SCSI Controllers
device  ahc  # AHA2940 and onboard AIC7xxx devices
options AHC_REG_PRETTY_PRINT  # Print register bitfields in debug
                              # output.  Adds ~128k to driver.
device  ahd  # AHA39320/29320 and onboard AIC79xx devices
options AHD_REG_PRETTY_PRINT  # Print register bitfields in debug
                              # output.  Adds ~215k to driver.
device  esp  # AMD Am53C974 (Tekram DC-390(T))
device  hptiop  # Highpoint RocketRaid 3xxx series
device  isp  # Qlogic family
device  ispfw  # Firmware for QLogic HBAs- normally a module
device  mpt  # LSI-Logic MPT-Fusion
device  mps  # LSI-Logic MPT-Fusion 2
device  mpr  # LSI-Logic MPT-Fusion 3
device  ncr  # NCR/Symbios Logic
device  sym  # NCR/Symbios Logic (newer chipsets + those of `ncr')
device  trm  # Tekram DC395U/UW/F DC315U adapters

device  adv  # Advansys SCSI adapters
device  adw  # Advansys wide SCSI adapters
device  aic  # Adaptec 15[012]x SCSI adapters, AIC-6[23]60.
device  bt  # Buslogic/Mylex MultiMaster SCSI adapters
device  isci  # Intel C600 SAS controller

# ATA/SCSI peripherals
device  scbus  # SCSI bus (required for ATA/SCSI)
device  ch  # SCSI media changers
device  da  # Direct Access (disks)
device  sa  # Sequential Access (tape etc)
device  cd  # CD
device  pass  # Passthrough device (direct ATA/SCSI access)
device  ses  # Enclosure Services (SES and SAF-TE)
#device ctl  # CAM Target Layer

# RAID controllers interfaced to the SCSI subsystem
device  amr  # AMI MegaRAID
device  arcmsr  # Areca SATA II RAID
#XXX it is not 64-bit clean, -scottl
#device asr  # DPT SmartRAID V, VI and Adaptec SCSI RAID
device  ciss  # Compaq Smart RAID 5*
device  dpt  # DPT Smartcache III, IV - See NOTES for options
device  hptmv  # Highpoint RocketRAID 182x
device  hptnr  # Highpoint DC7280, R750
device  hptrr  # Highpoint RocketRAID 17xx, 22xx, 23xx, 25xx
device  hpt27xx  # Highpoint RocketRAID 27xx
device  iir  # Intel Integrated RAID
device  ips  # IBM (Adaptec) ServeRAID
device  mly  # Mylex AcceleRAID/eXtremeRAID
device  twa  # 3ware 9000 series PATA/SATA RAID
device  tws  # LSI 3ware 9750 SATA+SAS 6Gb/s RAID controller

# RAID controllers
#device aac  # Adaptec FSA RAID
#device aacp  # SCSI passthrough for aac (requires CAM)
#device aacraid  # Adaptec by PMC RAID
#device ida  # Compaq Smart RAID
#device mfi  # LSI MegaRAID SAS
#device mlx  # Mylex DAC960 family
#device mrsas  # LSI/Avago MegaRAID SAS/SATA, 6Gb/s and 12Gb/s
#XXX PCI ID conflicts with ahd(4) and mvs(4)
#device pmspcv  # PMC-Sierra SAS/SATA Controller driver
#XXX pointer/int warnings
#device pst  # Promise Supertrak SX6000
#device twe  # 3ware ATA RAID

# NVM Express (NVMe) support
device  nvme  # base NVMe driver
device  nvd  # expose NVMe namespaces as disks, depends on nvme

# atkbdc0 controls both the keyboard and the PS/2 mouse
device  atkbdc  # AT keyboard controller
device  atkbd  # AT keyboard
device  psm  # PS/2 mouse

device  kbdmux  # keyboard multiplexer

device  vga  # VGA video card driver
options VESA  # Add support for VESA BIOS Extensions (VBE)

device  splash  # Splash screen and screen saver support

# syscons is the default console driver, resembling an SCO console
device  sc
options SC_PIXEL_MODE  # add support for the raster text mode

# vt is the new video console driver
device  vt
device  vt_vga
device  vt_efifb

device  agp  # support several AGP chipsets

# PCCARD (PCMCIA) support
# PCMCIA and cardbus bridge support
device  cbb  # cardbus (yenta) bridge
device  pccard  # PC Card (16-bit) bus
device  cardbus  # CardBus (32-bit) bus

# Serial (COM) ports
device  uart  # Generic UART driver

# Parallel port
device  ppc
device  ppbus  # Parallel port bus (required)
device  lpt  # Printer
device  ppi  # Parallel port interface device
device  vpo  # Requires scbus and da

device  puc  # Multi I/O cards and multi-channel UARTs

# PCI Ethernet NICs.
#device bxe  # Broadcom NetXtreme II BCM5771X/BCM578XX 10GbE
#device de  # DEC/Intel DC21x4x (``Tulip'')
device  em  # Intel PRO/1000 Gigabit Ethernet Family
#device igb  # Intel PRO/1000 PCIE Server Gigabit Family
#device ix  # Intel PRO/10GbE PCIE PF Ethernet
#device ixv  # Intel PRO/10GbE PCIE VF Ethernet
#device ixl  # Intel XL710 40Gbe PCIE Ethernet
#device ixlv  # Intel XL710 40Gbe VF PCIE Ethernet
device  mlx4ib  # Mellanox ConnectX HCA InfiniBand
device  mlxen  # Mellanox ConnectX HCA Ethernet
device  mthca  # Mellanox HCA InfiniBand
#device le  # AMD Am7900 LANCE and Am79C9xx PCnet
#device ti  # Alteon Networks Tigon I/II gigabit Ethernet
#device txp  # 3Com 3cR990 (``Typhoon'')
#device vx  # 3Com 3c590, 3c595 (``Vortex'')

# PCI Ethernet NICs that use the common MII bus controller code.
# NOTE: Be sure to keep the 'device miibus' line in order to use these NICs!
device  miibus  # MII bus support
#device ae  # Attansic/Atheros L2 FastEthernet
#device age  # Attansic/Atheros L1 Gigabit Ethernet
#device alc  # Atheros AR8131/AR8132 Ethernet
#device ale  # Atheros AR8121/AR8113/AR8114 Ethernet
#device bce  # Broadcom BCM5706/BCM5708 Gigabit Ethernet
#device bfe  # Broadcom BCM440x 10/100 Ethernet
#device bge  # Broadcom BCM570xx Gigabit Ethernet
#device cas  # Sun Cassini/Cassini+ and NS DP83065 Saturn
#device dc  # DEC/Intel 21143 and various workalikes
#device et  # Agere ET1310 10/100/Gigabit Ethernet
#device fxp  # Intel EtherExpress PRO/100B (82557, 82558)
#device gem  # Sun GEM/Sun ERI/Apple GMAC
#device hme  # Sun HME (Happy Meal Ethernet)
#device jme  # JMicron JMC250 Gigabit/JMC260 Fast Ethernet
#device lge  # Level 1 LXT1001 gigabit Ethernet
#device msk  # Marvell/SysKonnect Yukon II Gigabit Ethernet
#device nfe  # nVidia nForce MCP on-board Ethernet
#device nge  # NatSemi DP83820 gigabit Ethernet
#device nve  # nVidia nForce MCP on-board Ethernet Networking
#device pcn  # AMD Am79C97x PCI 10/100 (precedence over 'le')
device  re  # RealTek 8139C+/8169/8169S/8110S
#device rl  # RealTek 8129/8139
#device sf  # Adaptec AIC-6915 (``Starfire'')
#device sge  # Silicon Integrated Systems SiS190/191
#device sis  # Silicon Integrated Systems SiS 900/SiS 7016
#device sk  # SysKonnect SK-984x & SK-982x gigabit Ethernet
#device ste  # Sundance ST201 (D-Link DFE-550TX)
#device stge  # Sundance/Tamarack TC9021 gigabit Ethernet
#device tl  # Texas Instruments ThunderLAN
#device tx  # SMC EtherPower II (83c170 ``EPIC'')
#device vge  # VIA VT612x gigabit Ethernet
#device vr  # VIA Rhine, Rhine II
#device wb  # Winbond W89C840F
#device xl  # 3Com 3c90x (``Boomerang'', ``Cyclone'')

# ISA Ethernet NICs.  pccard NICs included.
#device cs  # Crystal Semiconductor CS89x0 NIC
# 'device ed' requires 'device miibus'
#device ed  # NE[12]000, SMC Ultra, 3c503, DS8390 cards
#device ex  # Intel EtherExpress Pro/10 and Pro/10+
#device ep  # Etherlink III based cards
#device fe  # Fujitsu MB8696x based cards
#device sn  # SMC's 9000 series of Ethernet chips
#device xe  # Xircom pccard Ethernet

# Wireless NIC cards
#device wlan  # 802.11 support
#options IEEE80211_DEBUG  # enable debug msgs
#options IEEE80211_AMPDU_AGE  # age frames in AMPDU reorder q's
#options IEEE80211_SUPPORT_MESH  # enable 802.11s draft support
#device wlan_wep  # 802.11 WEP support
#device wlan_ccmp  # 802.11 CCMP support
#device wlan_tkip  # 802.11 TKIP support
#device wlan_amrr  # AMRR transmit rate control algorithm
#device an  # Aironet 4500/4800 802.11 wireless NICs.
#device ath  # Atheros NICs
#device ath_pci  # Atheros pci/cardbus glue
#device ath_hal  # pci/cardbus chip support
#options AH_SUPPORT_AR5416  # enable AR5416 tx/rx descriptors
#options AH_AR5416_INTERRUPT_MITIGATION  # AR5416 interrupt mitigation
#options ATH_ENABLE_11N  # Enable 802.11n support for AR5416 and later
#device ath_rate_sample  # SampleRate tx rate control for ath
#device bwi  # Broadcom BCM430x/BCM431x wireless NICs.
#device bwn  # Broadcom BCM43xx wireless NICs.
#device ipw  # Intel 2100 wireless NICs.
#device iwi  # Intel 2200BG/2225BG/2915ABG wireless NICs.
#device iwn  # Intel 4965/1000/5000/6000 wireless NICs.
#device malo  # Marvell Libertas wireless NICs.
#device mwl  # Marvell 88W8363 802.11n wireless NICs.
#device ral  # Ralink Technology RT2500 wireless NICs.
#device wi  # WaveLAN/Intersil/Symbol 802.11 wireless NICs.
#device wpi  # Intel 3945ABG wireless NICs.

# Pseudo devices.
device  loop  # Network loopback
device  random  # Entropy device
device  padlock_rng  # VIA Padlock RNG
device  rdrand_rng  # Intel Bull Mountain RNG
device  ether  # Ethernet support
device  vlan  # 802.1Q VLAN support
device  tun  # Packet tunnel.
device  md  # Memory "disks"
device  gif  # IPv6 and IPv4 tunneling
device  faith  # IPv6-to-IPv4 relaying (translation)
device  firmware  # firmware assist module

# The `bpf' device enables the Berkeley Packet Filter.
# Be aware of the administrative consequences of enabling this!
# Note that 'bpf' is required for DHCP.
device  bpf  # Berkeley packet filter

# USB support
options USB_DEBUG  # enable debug msgs
device  uhci  # UHCI PCI->USB interface
device  ohci  # OHCI PCI->USB interface
device  ehci  # EHCI PCI->USB interface (USB 2.0)
device  xhci  # XHCI PCI->USB interface (USB 3.0)
device  usb  # USB Bus (required)
device  ukbd  # Keyboard
device  umass  # Disks/Mass storage - Requires scbus and da

# Sound support
device  sound  # Generic sound driver (required)
#device snd_cmi  # CMedia CMI8338/CMI8738
#device snd_csa  # Crystal Semiconductor CS461x/428x
#device snd_emu10kx  # Creative SoundBlaster Live! and Audigy
#device snd_es137x  # Ensoniq AudioPCI ES137x
device  snd_hda  # Intel High Definition Audio
device  snd_ich  # Intel, NVidia and other ICH AC'97 Audio
device  snd_via8233  # VIA VT8233x Audio

# MMC/SD
device  mmc  # MMC/SD bus
device  mmcsd  # MMC/SD memory card
device  sdhci  # Generic PCI SD Host Controller

# VirtIO support
device  virtio  # Generic VirtIO bus (required)
device  virtio_pci  # VirtIO PCI device
device  vtnet  # VirtIO Ethernet device
device  virtio_blk  # VirtIO Block device
device  virtio_scsi  # VirtIO SCSI device
device  virtio_balloon  # VirtIO Memory Balloon device

# HyperV drivers and enchancement support
# NOTE: HYPERV depends on hyperv.  They must be added or removed together.
options HYPERV  # Hyper-V kernel infrastructure
device  hyperv  # HyperV drivers

# Xen HVM Guest Optimizations
# NOTE: XENHVM depends on xenpci.  They must be added or removed together.
options XENHVM  # Xen HVM kernel infrastructure
device  xenpci  # Xen HVM Hypervisor services driver

# VMware support
device  vmx  # VMware VMXNET3 Ethernet

# 2016-04-21 JC Added VIMAGE just to verify it's the crash causer
options VIMAGE
---

-- 
You are receiving this mail because:
You are on the CC list for the bug.