From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 236989] AWS EC2 lockups "Missing interrupt"
Date: Wed, 03 Apr 2019 13:31:10 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236989

Bug ID: 236989
Summary: AWS EC2 lockups "Missing interrupt"
Product: Base System
Version: 12.0-RELEASE
Hardware: amd64
OS: Any
Status: New
Severity: Affects Only Me
Priority: ---
Component: kern
Assignee: bugs@FreeBSD.org
Reporter: cao@bus.net

I am experiencing lockups on a production c5d.2xlarge instance running FreeBSD 12.0-RELEASE. Frequency is about once a week.
The harbinger of these lockups is the appearance of "nvmeX: Missing interrupt" in the logs:

Apr 3 00:56:32 host kernel: nvme0: Missing interrupt
Apr 3 00:57:43 host syslogd: last message repeated 1 times
Apr 3 00:58:43 host kernel: nvme4: Missing interrupt
Apr 3 00:58:43 host kernel: nvme1: Missing interrupt
Apr 3 00:58:43 host kernel: nvme0: Missing interrupt
Apr 3 00:58:43 host kernel: nvme4: Missing interrupt
Apr 3 00:58:43 host kernel: nvme1: Missing interrupt
Apr 3 00:58:43 host kernel: nvme0: Missing interrupt
Apr 3 00:59:43 host kernel: nvme4: Missing interrupt
Apr 3 00:59:43 host kernel: nvme1: Missing interrupt
Apr 3 00:59:43 host kernel: nvme0: Missing interrupt
Apr 3 00:59:43 host kernel: nvme1: nvme4: Missing interrupt
Apr 3 00:59:43 host kernel: Missing interrupt
Apr 3 00:59:43 host kernel: nvme0: Missing interrupt
Apr 3 00:59:44 host kernel: nvme1: Missing interrupt
Apr 3 01:00:05 host kernel: nvme0: Missing interrupt
Apr 3 01:20:01 host kernel: nvme0:
Apr 3 01:20:01 host kernel: Missing interrupt
Apr 3 01:22:10 host kernel: sonewconn: pcb 0xfffff802988adb00: Listen queue overflow: 151 already in queue awaiting acceptance (1 occurrences)
Apr 3 01:24:33 host kernel: sonewconn: pcb 0xfffff802988adb00: Listen queue overflow: 151 already in queue awaiting acceptance (6 occurrences)
Apr 3 01:25:35 host kernel: sonewconn: pcb 0xfffff802988adb00: Listen queue overflow: 151 already in queue awaiting acceptance (4 occurrences)
Apr 3 01:26:45 host syslogd: last message repeated 1 times
Apr 3 01:27:49 host syslogd: last message repeated 1 times

Within a few hours the machine becomes unresponsive, with the CPU pegged at 100% and high disk read and write activity. It does not respond to an EC2 "stop" command and requires a forced (hard) reset.

c5d.2xlarge
FreeBSD 12.0-RELEASE-p3 GENERIC amd64
ZFS is in use on some drives, but not all.
I am running several instances with this same configuration, but only one of them has had this issue so far, and it happens to be the host that has the highest disk activity.
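
Since the "Missing interrupt" messages show up a few hours before the lockup, tallying them per controller may help to spot an affected host early. The following is only a rough sketch, not part of the original report: the function name `count_missing_interrupts` and the "unknown" bucket are made up for illustration, and it assumes log lines in the format shown above. Messages collapsed by syslogd into "last message repeated N times" are not counted.

```python
import re
from collections import Counter

# Matches a controller name such as "nvme0:" anywhere on the line.
DEVICE_RE = re.compile(r"\b(nvme\d+):")


def count_missing_interrupts(lines):
    """Count "Missing interrupt" kernel messages per NVMe controller.

    Interleaved lines that name several controllers are counted once per
    controller. Lines that carry the message but no controller name are
    counted under the hypothetical key "unknown".
    """
    counts = Counter()
    for line in lines:
        if "Missing interrupt" not in line:
            continue
        devices = DEVICE_RE.findall(line)
        for dev in devices or ["unknown"]:
            counts[dev] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Apr 3 00:56:32 host kernel: nvme0: Missing interrupt",
        "Apr 3 00:58:43 host kernel: nvme4: Missing interrupt",
        "Apr 3 00:59:43 host kernel: nvme1: nvme4: Missing interrupt",
        "Apr 3 00:59:43 host kernel: Missing interrupt",
        "Apr 3 01:22:10 host kernel: sonewconn: Listen queue overflow",
    ]
    for dev, n in sorted(count_missing_interrupts(sample).items()):
        print(dev, n)
```

In practice the lines would come from the system log (for example /var/log/messages) rather than from a hard-coded list, and any alerting threshold would have to be chosen by the operator.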