From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 243225] "mpr0: Out of chain frames" boot hang after clang 9.0.1 import (probably timing, not compiler related)
Date: Thu, 09 Jan 2020 19:00:53 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243225

    Bug ID:    243225
    Summary:   "mpr0: Out of chain frames" boot hang after clang 9.0.1
               import (probably timing, not compiler related)
    Product:   Base System
    Version:   12.0-STABLE
    Hardware:  Any
    OS:        Any
    Status:    New
    Severity:  Affects Some People
    Priority:  ---
    Component: kern
    Assignee:  bugs@FreeBSD.org
    Reporter:  terry-freebsd@glaver.org

I updated my test system from r356239 to r356557 (which crosses the clang
9.0.1 import) and started receiving "mpr0: Out of chain frames" at boot time.
This causes a boot hang: the mpr0 controller is reset and reinitialized, and
then the error happens again. This happens before the device (tape drive) is
detected, and happens regardless of whether anything is connected to the mpr
controller.

I had this before (many months ago) on this system and worked with Dell
service, replacing boards / cables / tape drive, etc. The solution at that
point was to put the controller into a different slot, which apparently hid
whatever timing problem is causing the boot hang. That's why I say in the PR
title that I don't think it is a clang 9.0.1 problem (incorrect code
generation). Presumably clang 9 generates faster (hopefully) or slower code
that is triggering the problem.

Escaping to the boot loader and killing time, then saying "boot" without
changing anything, will sometimes let the system boot normally. Again, this
points to a possible timing problem.

The boot messages from r356239 are:

mpr0: port 0x8000-0x80ff mem 0xc9100000-0xc910ffff,0xc8000000-0xc80fffff irq 64 at device 0.0 on pci17
mpr0: Firmware: 16.00.08.00, Driver: 23.00.00.00-fbsd
mpr0: IOCCapabilities: 7a85c
mpr0: Found device ,End Device> <6.0Gbps> handle<0x0009> enclosureHandle<0x0001> slot 7
mpr0: At enclosure level 0 and connector name (1 )
sa0 at mpr0 bus 0 scbus14 target 7 lun 0

In r356557, only the first of those 3 lines appear, followed by:

mpr0: Out of chain frames, consider increasing hw.mpr.max_chains

and then, eventually, by:

mpr0: Calling Reinit from mpr_wait_command, timeout=60, elapsed=60
mpr0: Reinitializing controller

At that point we're in a perpetual loop of reinit / timeout.

I can make the problem system available via remote console access (Dell
iDRAC 8) or can try any suggestions for debugging this further myself.
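
The driver message above names a tunable, hw.mpr.max_chains. As a sketch only, assuming the loader tunable described in mpr(4), it could be raised in /boot/loader.conf as shown here. The value 32768 is an arbitrary illustration rather than a tested or recommended setting, and nothing in this report establishes that raising it resolves the hang.

```
# /boot/loader.conf
# Hypothetical value, for illustration only; pick one suited to the system.
hw.mpr.max_chains="32768"
```

For a one-off test without editing the file, the same variable can be set at the boot loader prompt with "set hw.mpr.max_chains=32768" followed by "boot".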