From owner-freebsd-scsi@freebsd.org Tue Jun 5 07:18:15 2018
Subject: Re: What is ENXIO – MSI allocation regression in :[Was Re: svn commit: r321714 - in head/sys/dev: mpr mps]
To: Scott Long
Cc: scsi@freebsd.org
References: <201707300653.v6U6rwLN099096@repo.freebsd.org>
 <597DA578.6030101@omnilan.de> <597F56A8.1060603@omnilan.de>
 <59804C8C.1020003@omnilan.de> <78611650-D7A4-4B1D-A254-DB058E1AC1C6@samsco.org>
From: Harry Schmalzbauer
Organization: OmniLAN
Date: Tue, 5 Jun 2018 09:18:06 +0200
In-Reply-To: <78611650-D7A4-4B1D-A254-DB058E1AC1C6@samsco.org>

On 05.06.2018 at 00:22, Scott Long wrote:
>
>
>> On Jun 4, 2018, at 4:51 AM, Harry Schmalzbauer wrote:
>>
>> On 01.08.2017 at 11:40, Harry Schmalzbauer wrote:
>>> Regarding Scott Long's message of 31.07.2017 18:56 (localtime):
>>>
>>> …
>>>>> I'd like to report one I hadn't expected:
>>>>>
>>>>> mps0: port 0x4000-0x40ff mem 0xc3bc0000-0xc3bc3fff,0xc3b80000-0xc3bbffff irq 19 at device 0.0 on pci7
>>>>>
>>>>> mps0: Firmware: 20.00.04.00, Driver: 21.02.00.00-fbsd
>>>>> mps0: IOCCapabilities:
>>>>> 185c
>>>>> mps0: Cannot allocate INTx interrupt
>>>>> mps0: mps_iocfacts_allocate failed to setup interrupts
>>>>> mps0: mps_attach IOC Facts based allocation failed with error 6
>>>>> panic: resource_list_release: resource entry is not busy
>>>>> cpuid = 6
>>>>> KDB: stack backtrace:
>>>>> #0 0xffffffff805e32d7 at kdb_backtrace+0x67
>>>>> #1 0xffffffff805a1d26 at vpanic+0x186
>>>>> #2 0xffffffff805a1b93 at panic+0x43
>>>>> #3 0xffffffff805d71c6 at resource_list_release+0x1c6
>>>>> #4 0xffffffff8040fef1 at mps_pci_free+0xe1
>>>>> #5 0xffffffff8040fa23 at mps_pci_attach+0x1b3
>>>>> #6 0xffffffff805d6594 at device_attach+0x3a4
>>>>> #7 0xffffffff805d774d at bus_generic_attach+0x3d
>>>>> #8 0xffffffff8044ac05 at pci_attach+0xd5
>>>>> #9 0xffffffff805d6594 at device_attach+0x3a4
>>>>> #10 0xffffffff805d774d at bus_generic_attach+0x3d
>>>>> #11 0xffffffff80364761 at acpi_pcib_pci_attach+0xa1
>>>>> #12 0xffffffff805d6594 at device_attach+0x3a4
>>>>> #13 0xffffffff805d774d at bus_generic_attach+0x3d
>>>>> #14 0xffffffff8044ac05 at pci_attach+0xd5
>>>>> #15 0xffffffff805d6594 at device_attach+0x3a4
>>>>> #16 0xffffffff805d774d at bus_generic_attach+0x3d
>>>>> #17 0xffffffff80363e4d at acpi_pcib_acpi_attach+0x42d
>>>>> Uptime: 1s
>>> …
>>>
>>>> Fixed in r321799, thanks for the report.
>>> Fix confirmed; merged together with r321733 (and 321737) to 11.1 and
>>> panic vanished.
>>
>> Late in the 11.2 phase, I identified this commit as a regression for MSI (non-x) allocation.
>> I have an idea what probably causes the problem here (INTx allocation, although MSI (and MSI-x) capability):
>> disable_msix is not 0 (I need to disable MSI-x because of ESXi-passthru…).
>>
>> Corresponding lines:
>> {
>>         device_t dev;
>>         int error, msgs;
>>
>>         dev = sc->mps_dev;
>>         error = 0;
>>         msgs = 0;
>>
>>         if ((sc->disable_msix == 0) &&
>>             ((msgs = pci_msix_count(dev)) >= MPS_MSI_COUNT))
>>                 error = mps_alloc_msix(sc, MPS_MSI_COUNT);
>>         if ((error != 0) && (sc->disable_msi == 0) &&
>>             ((msgs = pci_msi_count(dev)) >= MPS_MSI_COUNT))
>>                 error = mps_alloc_msi(sc, MPS_MSI_COUNT);
>>         if (error != 0)
>>                 msgs = 0;
>>
>>         sc->msi_msgs = msgs;
>>         return (error);
>> }
>>
>> Before r321714, error was initialized to ENXIO, which, if it is != 0, would help me understand the problem.
>> Unfortunately I have no idea what ENXIO means, where it's defined and, most importantly, how to find the place where the declaration/definition happens. Only joe and vi are available here; any hints are highly appreciated.
>>
>> I can confirm that MSI allocation works with mps.ko_21.02.00.00-fbsd-r321415 with my ESXi-passthru-non_msi-x setup.
>> However, the driver emits no message that an MSI was allocated, like other drivers do. That's a cosmetic one though.
>> But the MSI->INTx regression is a severe one for me, which I'd like to fix myself but I'm missing so many fundamental skills :-(
>>
>
> Hi Harry,
>
> You are correct about the bug. Please change the line at the top of the function that reads
>
> error = 0;
>
> to
>
> error = ENXIO;
>
> Let me know if that fixes the MSI problem for you.

Hello Scott,

thanks for your hint.
Unfortunately I have a lot more problems – the system (11.2-RC1) deadlocks for some seconds under iSCSI load...
This is far easier to reproduce and has a heavier impact with mps(4) and INTx allocation than with MSI, but backup runs overnight triggered that extreme slowdown although mps(4) was allocating MSI – locks of up to 20 sec, during which not even the terminal updates. All those updates are queued though, so after about 10-20 seconds the screen flickers, showing all queued output.

One symptom is that systat(1) shows 25% intr usage, which is one core.
It's a ZFS machine, so high sys usage is normal, but intr usually is about 10% with GbE traffic.
Only when the slowdown/lockup happens does intr usage stay constantly at 25%.

I can't imagine ctld(8) or ZFS is causing this, but who knows – I don't at the moment.
I will have to revert to 11.1 and see if things change; the machine was 10.? before, without such problems.

BTW, does anybody have a link where I can get info about ENXIO?

Thanks,

-harry