From owner-freebsd-current@freebsd.org Mon Apr 16 20:26:47 2018
Delivered-To: freebsd-current@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
	by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2AEA6FA3144;
	Mon, 16 Apr 2018 20:26:47 +0000 (UTC) (envelope-from satan@ukr.net)
Received: from hell.ukr.net (hell.ukr.net [212.42.67.68])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client did not present a certificate)
	by mx1.freebsd.org (Postfix) with ESMTPS id B201779D33;
	Mon, 16 Apr 2018 20:26:46 +0000 (UTC) (envelope-from satan@ukr.net)
Received: from satan by hell.ukr.net with local ID 1f8Ai3-000E76-8W;
	Mon, 16 Apr 2018 23:26:43 +0300
Date: Mon, 16 Apr 2018 23:26:43 +0300
From: Vitalij Satanivskij
To: Stephen Hurd
Cc: Vitalij Satanivskij, cem@freebsd.org, "freebsd-hackers@freebsd.org",
	freebsd-current, Stephen Hurd, Sean Bruno, Matthew Macy
Subject: Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)
Message-ID: <20180416202643.GA54226@hell.ukr.net>
Reply-To: satan@ukr.net
References: <20180416102710.GA90028@hell.ukr.net> <20180416195128.GA53754@hell.ukr.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.9.4 (2018-02-28)
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.25
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Mon, 16 Apr 2018 20:26:47 -0000

Oh, the BIOS.

It's already the latest BIOS available, with AGESA 1.0.0.5 in it.

It's dated 2/14/2018, so most likely a new version will not appear soon.

Stephen Hurd wrote:
SH> Yeah, this looks like some sort of general MSI issue, not igb specific.
SH> I'm not familiar with that part of the kernel, but maybe check if there's a
SH> BIOS update available?
SH>
SH> On Mon, Apr 16, 2018 at 3:51 PM, Vitalij Satanivskij wrote:
SH>
SH> > Dear Stephen
SH> >
SH> > I disabled MSI-X on both igb0 and igb1
SH> > and enabled HPET in the BIOS.
SH> >
SH> > That gave a panic in hpet_attach: http://hell.ukr.net/panic/recorder_hpet.webm
SH> > So I disabled HPET again and got the panic in msi_alloc and so on:
SH> > http://hell.ukr.net/panic/recorder_msi.webm
SH> >
SH> > So, as a test, I set hw.pci.enable_msi=0 and got a panic in ccp_hw_attach,
SH> > which is autoloaded later, while the system runs the rc scripts.
SH> >
SH> > The panic is here: http://hell.ukr.net/panic/recorder_ccp.webm
SH> >
SH> > To me it looks like some kind of resource management problem?
SH> >
SH> >
SH> > Stephen Hurd wrote:
SH> > SH> If you disable msix just for igb0, does it crash somewhere else?
SH> > SH>
SH> > SH> On Mon, Apr 16, 2018 at 3:13 PM, Stephen Hurd wrote:
SH> > SH>
SH> > SH> > Oh, you may need to disable msix to boot...
SH> > SH> >
SH> > SH> > dev.igb.0.iflib.disable_msix=1
SH> > SH> >
SH> > SH> > On Mon, Apr 16, 2018 at 3:02 PM, Stephen Hurd wrote:
SH> > SH> >
SH> > SH> >> Hrm, it should be trying to allocate three msi-x vectors there, and it
SH> > SH> >> appears that it's reported that 10 are available.  What's the output of
SH> > SH> >> ``pciconf -lcv pci1:0:0''?
SH> > SH> >>
SH> > SH> >> On Mon, Apr 16, 2018 at 1:27 PM, Conrad Meyer wrote:
SH> > SH> >>
SH> > SH> >>> Hi Vitalij,
SH> > SH> >>>
SH> > SH> >>> On Mon, Apr 16, 2018 at 3:27 AM, Vitalij Satanivskij <satan@ukr.net> wrote:
SH> > SH> >>> > DUMP can be found here http://hell.ukr.net/panic/panic.jpg
SH> > SH> >>> > or even video record from screen http://hell.ukr.net/panic/recorder.webm
SH> > SH> >>>
SH> > SH> >>> Looks like the panic message is printed directly after: "igb0: using 2
SH> > SH> >>> rx queues 2 tx queues" (iflib_msix_init(), called by
SH> > SH> >>> iflib_device_register()).
SH> > SH> >>>
SH> > SH> >>> And stack is indeed coming from iflib in probe (0:17 in linked video):
SH> > SH> >>>
SH> > SH> >>> panic()
SH> > SH> >>> nexus_add_irq()
SH> > SH> >>> msix_alloc()
SH> > SH> >>> pci_alloc_msix_method()
SH> > SH> >>> iflib_device_register()
SH> > SH> >>> iflib_device_attach()
SH> > SH> >>> device_attach()
SH> > SH> >>> ...
SH> > SH> >>>
SH> > SH> >>> Stephen, Matt, or Sean might be able to help diagnose further.
SH> > SH> >>>
SH> > SH> >>> Best,
SH> > SH> >>> Conrad
SH> > SH> >>>
SH> > SH> >>
SH> > SH> >> --
SH> > SH> >> Stephen Hurd, Principal Engineer
SH> > SH> >> EXPERIENCE FIRST.
SH> > SH> >> +1 616 848 0643
SH> > SH> >> www.limelight.com