Date: Tue, 22 Oct 2019 18:41:46 +0000
From: bugzilla-noreply@freebsd.org
To: virtualization@FreeBSD.org
Subject: [Bug 234838] ena drop-outs on 12.0-RELEASE
Message-ID: <bug-234838-27103-FNo4ssDSdn@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-234838-27103@https.bugs.freebsd.org/bugzilla/>
References: <bug-234838-27103@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=234838

Ryan Langseth <langseth@iteris.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |langseth@iteris.com

--- Comment #6 from Ryan Langseth <langseth@iteris.com> ---
It seems like there is still an issue with this.

We are running FreeBSD 12.0-RELEASE-p10 on a c5.2xl instance, and the system
has reset the network device twice in the last 24 hours.

The current traffic to it is a zfs recv over ssh running at ~40MiB/s. The first
time it dropped out it was just the ena device. The second time I also got nvme
'Missing Interrupts'.

The system has 6 gp2 volumes for the zpool.

`grep kern.crit /var/log/messages`

Oct 21 12:26:52 <kern.crit> apache-00 kernel: Trying to mount root from ufs:/dev/gpt/rootfs [rw]...
Oct 21 12:26:52 <kern.crit> apache-00 kernel: ena0: device is going UP
Oct 21 12:26:52 <kern.crit> apache-00 kernel: ena0: device is going DOWN
Oct 21 12:26:52 <kern.crit> apache-00 kernel: ena0: device is going UP
Oct 21 12:26:52 <kern.crit> apache-00 kernel: intsmb0: <Intel PIIX4 SMBUS Interface> port 0xb100-0xb10f at device 1.3 on pci0
Oct 21 12:26:52 <kern.crit> apache-00 kernel: intsmb0: intr IRQ 9 enabled revision 255
Oct 21 12:26:52 <kern.crit> apache-00 kernel: smbus0: <System Management Bus> on intsmb0
Oct 21 12:26:53 <kern.crit> apache-00 kernel: Security policy loaded: MAC/ntpd (mac_ntpd)
Oct 22 06:33:55 <kern.crit> apache-00 kernel: ena0: The number of lost tx completion is above the threshold (129 > 128). Reset the device
Oct 22 06:33:55 <kern.crit> apache-00 kernel: ena0: Trigger reset is on
Oct 22 06:33:55 <kern.crit> apache-00 kernel: ena0: device is going DOWN
Oct 22 06:34:02 <kern.crit> apache-00 kernel: ena0: free uncompleted tx mbuf qid 0 idx 0x1f2
Oct 22 06:34:03 <kern.crit> apache-00 kernel: ena0: ena0: device is going UP
Oct 22 06:34:03 <kern.crit> apache-00 kernel: link is UP
Oct 22 13:18:10 <kern.crit> apache-00 kernel: ena0: The number of lost tx completion is above the threshold (129 > 128). Reset the device
Oct 22 13:18:10 <kern.crit> apache-00 kernel: ena0: Trigger reset is on
Oct 22 13:18:10 <kern.crit> apache-00 kernel: ena0: device is going DOWN
Oct 22 13:18:16 <kern.crit> apache-00 kernel: ena0: free uncompleted tx mbuf qid 4 idx 0x3a6
Oct 22 13:18:16 <kern.crit> apache-00 kernel:
Oct 22 13:18:16 <kern.crit> apache-00 kernel: ena0: device is going UP
Oct 22 13:18:16 <kern.crit> apache-00 kernel: ena0: link is UP
Oct 22 13:18:47 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:18:51 <kern.crit> apache-00 kernel: nvme5: Missing interrupt
Oct 22 13:18:51 <kern.crit> apache-00 kernel: nvme2: Missing interrupt
Oct 22 13:19:17 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme2: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme5: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme5: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme6: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme2: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme5: Missing interrupt
Oct 22 13:20:47 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme4: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme2: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme5: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme5: nvme2: nvme4: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 syslogd: last message repeated 1 times
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme2: Missing interrupt
Oct 22 13:22:17 <kern.crit> apache-00 kernel: nvme0:
Oct 22 13:22:17 <kern.crit> apache-00 kernel:
Oct 22 13:22:17 <kern.crit> apache-00 kernel: Missing interrupt
Oct 22 13:22:21 <kern.crit> apache-00 kernel: nvme2: nvme6: Missing interrupt
Oct 22 13:22:21 <kern.crit> apache-00 kernel: nvme5: Missing interrupt
Oct 22 13:22:21 <kern.crit> apache-00 kernel: Missing interrupt
Oct 22 13:22:21 <kern.crit> apache-00 kernel: nvme4: Missing interrupt
Oct 22 13:22:51 <kern.crit> apache-00 kernel: nvme6: nvme4: Missing interrupt
Oct 22 13:22:51 <kern.crit> apache-00 kernel: Missing interrupt
Oct 22 13:22:51 <kern.crit> apache-00 kernel: nvme5: Missing interrupt
Oct 22 13:22:51 <kern.crit> apache-00 kernel: nvme2: Missing interrupt
Oct 22 13:23:16 <kern.crit> apache-00 kernel: nvme0: Missing interrupt
Oct 22 13:23:21 <kern.crit> apache-00 kernel: nvme2: Missing interrupt
Oct 22 13:23:21 <kern.crit> apache-00 kernel: nvme6: Missing interrupt
Oct 22 13:23:26 <kern.crit> apache-00 kernel: nvme4: Missing interrupt

I will add that this instance was originally a FreeBSD 11.x system that was
freebsd-update'd to 12. As an 11 system it was panicking on the transfer every
3-4 hours. I am bringing up a fresh 12.x system to do additional testing.

-- 
You are receiving this mail because:
You are the assignee for the bug.