Date:      Fri, 21 Sep 2012 13:59:12 -0700
From:      Jim Harris <jimharris@freebsd.org>
To:        Mike Tancsa <mike@sentex.net>, delphij@freebsd.org
Cc:        FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Subject:   Re: tws bug ? (LSI SAS 9750)
Message-ID:  <CAJP=Hc9=Rk5EvbmSe=XJJq_r0WO7DW3oUvxxK3ALAbJRUSgX7g@mail.gmail.com>
In-Reply-To: <505CC8EC.4030608@sentex.net>
References:  <505CC8EC.4030608@sentex.net>

On Fri, Sep 21, 2012 at 1:07 PM, Mike Tancsa <mike@sentex.net> wrote:
> Hi,
> I have been trying out a nice new tws controller and decided to enable
> debugging in the kernel and run some stress tests.  With a regular
> GENERIC kernel, it boots up fine.  But with debugging, it panics on
> boot.  Anyone know what's up?  Is this something that should be sent
> directly to LSI ?

From code inspection, this mutex is recursed whether or not debugging
is enabled; the driver has no code path specific to INVARIANTS.  The
non-debug kernel apparently just does not assert on the recursion.
The main I/O path in this driver always recurses on this lock, so the
problem is not specific to the initialization call stack you listed
below.

The best course of action seems to be initializing the lock with
MTX_RECURSE, since the driver seems to expect to be able to recurse on
the io_lock.  Can you try the following patch?

diff --git a/sys/dev/tws/tws.c b/sys/dev/tws/tws.c
index b1615db..d156d40 100644
--- a/sys/dev/tws/tws.c
+++ b/sys/dev/tws/tws.c
@@ -197,7 +197,7 @@ tws_attach(device_t dev)
     mtx_init( &sc->q_lock, "tws_q_lock", NULL, MTX_DEF);
     mtx_init( &sc->sim_lock,  "tws_sim_lock", NULL, MTX_DEF);
     mtx_init( &sc->gen_lock,  "tws_gen_lock", NULL, MTX_DEF);
-    mtx_init( &sc->io_lock,  "tws_io_lock", NULL, MTX_DEF);
+    mtx_init( &sc->io_lock,  "tws_io_lock", NULL, MTX_DEF | MTX_RECURSE);

     if ( tws_init_trace_q(sc) == FAILURE )
         printf("trace init failure\n");



>
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
> pci1: <ACPI PCI bus> on pcib1
> pcib2: <ACPI PCI-PCI bridge> irq 17 at device 1.1 on pci0
> pci2: <ACPI PCI bus> on pcib2
> LSI 3ware device driver for SAS/SATA storage controllers, version:
> 10.80.00.003
> tws0: <LSI 3ware SAS/SATA Storage Controller> port 0x4000-0x40ff mem
> 0xc2460000-0xc2463fff,0xc2400000-0xc243ffff irq 17 at device 0.0 on pci2
> tws0: Using legacy INTx
> panic: _mtx_lock_sleep: recursed on non-recursive mutex tws_io_lock @
> /usr/HEAD/src/sys/dev/tws/tws_hdm.c:287
>
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> kdb_backtrace() at kdb_backtrace+0x37
> panic() at panic+0x1d8
> _mtx_lock_sleep() at _mtx_lock_sleep+0x27f
> _mtx_lock_flags() at _mtx_lock_flags+0xf1
> tws_submit_command() at tws_submit_command+0x3f
> tws_dmamap_data_load_cbfn() at tws_dmamap_data_load_cbfn+0xb7
> bus_dmamap_load() at bus_dmamap_load+0x16c
> tws_map_request() at tws_map_request+0x78
> tws_get_param() at tws_get_param+0xe1
> tws_display_ctlr_info() at tws_display_ctlr_info+0x4c
> tws_init_ctlr() at tws_init_ctlr+0x6d
> tws_attach() at tws_attach+0x68c
> device_attach() at device_attach+0x72
> bus_generic_attach() at bus_generic_attach+0x1a
> acpi_pci_attach() at acpi_pci_attach+0x164
> device_attach() at device_attach+0x72
> bus_generic_attach() at bus_generic_attach+0x1a
> acpi_pcib_attach() at acpi_pcib_attach+0x1a7
> acpi_pcib_pci_attach() at acpi_pcib_pci_attach+0x9b
> device_attach() at device_attach+0x72
> bus_generic_attach() at bus_generic_attach+0x1a
> acpi_pci_attach() at acpi_pci_attach+0x164
> device_attach() at device_attach+0x72
> bus_generic_attach() at bus_generic_attach+0x1a
> acpi_pcib_attach() at acpi_pcib_attach+0x1a7
> acpi_pcib_acpi_attach() at acpi_pcib_acpi_attach+0x1f6
> device_attach() at device_attach+0x72
> bus_generic_attach() at bus_generic_attach+0x1a
> acpi_attach() at acpi_attach+0xbc1
> device_attach() at device_attach+0x72
> bus_generic_attach() at bus_generic_attach+0x1a
> nexus_acpi_attach() at nexus_acpi_attach+0x69
> device_attach() at device_attach+0x72
> bus_generic_new_pass() at bus_generic_new_pass+0xd6
> bus_set_pass() at bus_set_pass+0x7a
> configure() at configure+0xa
> mi_startup() at mi_startup+0x77
> btext() at btext+0x2c
> KDB: enter: panic
> [ thread pid 0 tid 100000 ]
> Stopped at      kdb_enter+0x3b: movq    $0,0x993262(%rip)
> db>
>
>
> int
> tws_submit_command(struct tws_softc *sc, struct tws_request *req)
> {
>     u_int32_t regl, regh;
>     u_int64_t mfa=0;
>
>     /*
>      * mfa register  read and write must be in order.
>      * Get the io_lock to protect against simultinous
>      * passthru calls
>      */
>     mtx_lock(&sc->io_lock);
>
>     if ( sc->obfl_q_overrun ) {
>         tws_init_obfl_q(sc);
>     }
>
>
>
> With no debugging in the kernel, it boots up fine:
>
> pcib2: <ACPI PCI-PCI bridge> irq 17 at device 1.1 on pci0
> pci2: <ACPI PCI bus> on pcib2
> LSI 3ware device driver for SAS/SATA storage controllers, version:
> 10.80.00.003
> tws0: <LSI 3ware SAS/SATA Storage Controller> port 0x4000-0x40ff mem
> 0xc2460000-0xc2463fff,0xc2400000-0xc243ffff irq 17 at device 0.0 on pci2
> tws0: Using legacy INTx
> tws0: Controller details: Model 9750-4i, 8 Phys, Firmware FH9X
> 5.12.00.007, BIOS BE9X 5.11.00.006
> em0: <Intel(R) PRO/1000 Network Connection 7.3.2> port 0x5040-0x505f mem
> 0xc2500000-0xc251ffff,0xc2570000-0xc2570fff irq 19 at device 25.0 on pci0
> em0: Using an MSI interrupt
> em0: Ethernet address: 00:1e:67:45:b6:29
> ehci0: <EHCI (generic) USB 2.0 controller> mem 0xc2560000-0xc25603ff irq
> 22 at device 26.0 on pci0
> usbus0: EHCI version 1.0
> usbus0 on ehci0
>
>
> tws0@pci0:2:0:0:        class=0x010400 card=0x000113c1 chip=0x101013c1
> rev=0x05 hdr=0x00
>     vendor     = '3ware Inc'
>     device     = '9750 SAS2/SATA-II RAID PCIe'
>     class      = mass storage
>     subclass   = RAID
>     bar   [10] = type I/O Port, range 32, base 0x4000, size 256, enabled
>     bar   [14] = type Memory, range 64, base 0xc2460000, size 16384, enabled
>     bar   [1c] = type Memory, range 64, base 0xc2400000, size 262144,
> enabled
>     cap 01[50] = powerspec 3  supports D0 D1 D2 D3  current D0
>     cap 10[68] = PCI-Express 2 endpoint max data 128(4096) link x4(x8)
>     cap 03[d0] = VPD
>     cap 05[a8] = MSI supports 1 message, 64 bit
> ecap 0001[100] = AER 1 1 fatal 0 non-fatal 0 corrected
> ecap 0004[138] = unknown 1
>   PCI-e errors = Fatal Error Detected
>                  Unsupported Request Detected
>          Fatal = Unsupported Request
>
>
>
>
> Also, any reason NOT to set hw.tws.enable_msi=1 in /boot/loader.conf ?
>
>         ---Mike
>
>
>
> --
> -------------------
> Mike Tancsa, tel +1 519 651 3400
> Sentex Communications, mike@sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada   http://www.tancsa.com/
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"


