From owner-freebsd-virtualization@freebsd.org Tue May 22 22:25:22 2018
From: bugzilla-noreply@freebsd.org
To: virtualization@FreeBSD.org
Subject: [Bug 225791] ena driver causing kernel panics on AWS EC2
Date: Tue, 22 May 2018 22:25:20 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 11.1-RELEASE
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: terje@elde.net
X-Bugzilla-Status: New
X-Bugzilla-Assigned-To: virtualization@FreeBSD.org
X-Bugzilla-Changed-Fields: cc
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
List-Id: "Discussion of various virtualization techniques FreeBSD supports."

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225791

Terje Elde changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |terje@elde.net

--- Comment #1 from Terje Elde ---
We're also affected by this, running c5.large, handling about 13 000
connections through haproxy, then varnish and on to other systems.  Activity
was about 4000 requests per minute leading up to the crash, which doesn't seem
all that high.  It's possible that it could have spiked shortly before the
crash though, without getting that in the logs.

This is:
FreeBSD [host snipped] 11.1-RELEASE-p8 FreeBSD 11.1-RELEASE-p8 #0: Tue Mar 13 17:07:05 UTC 2018     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64

It's a lightly modified/configured version of one of the usual FreeBSD AMIs;
I don't recall the AMI ID exactly, sorry.  The kernel etc. is stock; we've just
made additions in terms of software etc. for our own AMI.

We have two virtually identical machines exposed under the same hostname,
receiving a near-identical load, and have so far only noticed this on one of
the machines.  Could be coincidental, but figured it worthwhile to mention.

It strikes me as noteworthy that the data rate was only about 700 kBps at the
last data point I have before the crash.  Unfortunately I don't know anything
about the packet rate, and again it's possible that there could have been a
peak leading up to the crash, without it showing up in the logs.

If anyone is interested in any other data from this, please do let me know.
Also, this is part of a redundant setup, allowing some extra room for moving
things around if anyone wants anything tested or tried on the setup.

>> Crash itself:

Limiting open port RST response from 457 to 200 packets/sec
Limiting open port RST response from 487 to 200 packets/sec
Limiting open port RST response from 541 to 200 packets/sec
Limiting open port RST response from 517 to 200 packets/sec
Limiting open port RST response from 586 to 200 packets/sec
Limiting open port RST response from 237 to 200 packets/sec
ena0: Found a Tx that wasn't completed on time, qid 1, index 324.
pid 3639 (varnishd), uid 429: exited on signal 6
Limiting open port RST response from 259 to 200 packets/sec
Limiting open port RST response from 380 to 200 packets/sec
ena0: Found a Tx that wasn't completed on time, qid 1, index 181.

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x1c
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff82173f8c
stack pointer           = 0x28:0xfffffe0110f43180
frame pointer           = 0x28:0xfffffe0110f43260
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (irq261: ena0)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff80aadac7 at kdb_backtrace+0x67
#1 0xffffffff80a6bba6 at vpanic+0x186
#2 0xffffffff80a6ba13 at panic+0x43
#3 0xffffffff80ee3092 at trap_fatal+0x322
#4 0xffffffff80ee30eb at trap_pfault+0x4b
#5 0xffffffff80ee290a at trap+0x2ca
#6 0xffffffff80ec3d40 at calltrap+0x8
#7 0xffffffff80a321ec at intr_event_execute_handlers+0xec
#8 0xffffffff80a324d6 at ithread_loop+0xd6
#9 0xffffffff80a2f845 at fork_exit+0x85
#10 0xffffffff80ec4a0e at fork_trampoline+0xe
Uptime: 8d22h59m55s
Rebooting...

>> boot log:

Copyright (c) 1992-2017 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.1-RELEASE-p8 #0: Tue Mar 13 17:07:05 UTC 2018
    root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 4.0.0)
VT(vga): text 80x25
CPU: HammerEM64T (3000.05-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x50653  Family=0x6  Model=0x55  Stepping=3
  Features=0x1f83fbff
  Features2=0xfffa3203
  AMD Features=0x2c100800
  AMD Features2=0x121
  Structured Extended Features=0xd11f4fbb
  Structured Extended Features2=0x8
  XSAVE Features=0xf
  TSC: P-state invariant, performance statistics
Hypervisor: Origin = "KVMKVMKVM"
real memory  = 5114953728 (4878 MB)
avail memory = 3844890624 (3666 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 1 core(s) x 2 hardware threads
random: unblocking device.
ioapic0 irqs 0-23 on motherboard
SMP: AP CPU #1 Launched!
random: entropy device external interface
kbd1 at kbdmux0
netmap: loaded module
module_register_init: MOD_LOAD (vesa, 0xffffffff80f5eb40, 0) error 19
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
nexus0
vtvga0: on motherboard
cryptosoft0: on motherboard
acpi0: on motherboard
acpi0: Power Button (fixed)
cpu0: on acpi0
cpu1: on acpi0
atrtc0: port 0x70-0x71,0x72-0x77 irq 8 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0xb008-0xb00b on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
isab0: at device 1.0 on pci0
isa0: on isab0
pci0: at device 1.3 (no driver attached)
vgapci0: mem 0xfe400000-0xfe7fffff at device 3.0 on pci0
vgapci0: Boot video device
nvme0: mem 0xfebf0000-0xfebf3fff irq 11 at device 4.0 on pci0
ena0: mem 0xfebf4000-0xfebf7fff at device 5.0 on pci0
ena0: Elastic Network Adapter (ENA)ena v0.7.0
ena0: initalize 2 io queues
ena0: Ethernet address: 02:2b:3a:f4:70:8c
ena0: Allocated msix_entries, vectors (cnt: 3)
nvme1: mem 0xfebf8000-0xfebfbfff irq 11 at device 31.0 on pci0
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
uart0: port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: console (115200,n,8,1)
orm0: at iomem 0xef000-0xeffff on isa0
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
attimer0: at port 0x40 on isa0
Timecounter "i8254" frequency 1193182 Hz quality 0
attimer0: Can't map interrupt.
ppc0: cannot reserve I/O port range
ena0: link is UP
ena0: link state changed to UP
Timecounters tick every 1.000 msec
usb_needs_explore_all: no devclass
nvme cam probe device init
nvme0: temperature threshold not supported
nvd0: NVMe namespace
nvd0: 20480MB (41943040 512 byte sectors)
nvme1: temperature threshold not supported
nvd1: NVMe namespace
GEOM: nvd1: corrupt or invalid GPT detected.
nvd1: 20480MB (41943040 512 byte sectors)
GEOM: nvd1: GPT rejected -- may not be recoverable.
Trying to mount root from ufs:/dev/gpt/rootfs [rw]...

-- 
You are receiving this mail because:
You are the assignee for the bug.