Date: Wed, 22 May 2013 23:37:33 -0700
From: Andreas Turriff <maillist@turriff.net>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: freebsd-current@freebsd.org
Subject: Re: tws(4) kernel panic on boot
Message-ID: <519DB92D.9030105@turriff.net>
In-Reply-To: <20130523062306.GR3047@kib.kiev.ua>
References: <519AD765.3060601@turriff.net> <20130521123306.GD3047@kib.kiev.ua> <519B9FED.8000806@turriff.net> <519D09B2.2030003@turriff.net> <20130523062306.GR3047@kib.kiev.ua>

On 5/22/2013 11:23 PM, Konstantin Belousov wrote:
> On Wed, May 22, 2013 at 11:08:50AM -0700, Andreas Turriff wrote:
>> On 5/21/2013 9:25 AM, Andreas Turriff wrote:
>>> On 5/21/2013 5:33 AM, Konstantin Belousov wrote:
>>>> On Mon, May 20, 2013 at 07:09:41PM -0700, Andreas Turriff wrote:
>>>>> On migrating one of my servers to -current, I discovered that the tws
>>>>> driver panics on boot; I will follow up with a full backtrace once I
>>>>> have a chance to extract it. In the meantime, there is a PR about a
>>>>> very
>>>>> similar error in twa - 177020. Is it possible those are related, and
>>>>> the
>>>>> same sort of change needs to be made to tws?
>>>> It is possible that the regression was in r246713, but the code is
>>>> structured differently, and there were more tws(4) changes since then.
>>>> You need to provide data for somebody to start looking into the problem.
>>> I know. That's why I said, I'd follow up with more info once I can
>>> extract it.
>>>
>>> The system in question is a Dell PowerEdge 840 server, 8 GiByte RAM,
>>> with an Intel NIC driven by em(4) and a 3Ware 9750-4i RAID controller.
>>> There is no src.conf
>>>
>>> /etc/make.conf:
>>> CPUTYPE?=core2
>>>
>>> Error message:
>>>
>>> LSI 3ware device driver for SAS/SATA storage controllers, version:
>>> 10.80.00.005
>>> tws0: <LSI 3ware SAS/SATA Storage Controller> port 0xec00-0xecff mem
>>> 0xfe9fc000-0xfe9fffff,0xfe980000-0xfe9bffff irq 16 at device 0.0 o1
>>> tws0: Using MSIng APIC ID to 4
>>> panic: _bus_dmamap_load_ccb: Unsupported func code 0
>>> cpuid = 0Version 2.0> irqs 0-23 on motherboard
>>> KDB: enter: panic2.0> irqs 32-55 on motherboard
>>> [ thread pid 0 tid 100000 ]
>>> Stopped at kdb_enter+0x3e: movq $0,kdb_why
>>>
>>> Backtrace
>>>
>>> Tracing pid 0 tid 100000 td 0xffffffff81376610
>>> kdb_enter() at kdb_enter+0x3e/frame 0xffffffff8191a340
>>> panic() at panic+0x175/frame 0xffffffff8191a3c0
>>> _bus_dmamap_load_ccb() at _bus_dmamap_load_ccb+0x1c3/frame
>>> 0xffffffff8191a420
>>> bus_dmamap_load_ccb() at bus_dmamap_load_ccb+0x91/frame
>>> 0xffffffff8191a480
>>> tws_map_request() at tws_map_request+0x71/frame 0xffffffff8191a4c0
>>> tws_get_param() at tws_get_param+0xdd/frame 0xffffffff8191a520
>>> tws_display_ctlr_info() at tws_display_ctlr_info+0x38/frame
>>> 0xffffffff8191a590
>>> tws_init_ctlr() at tws_init_ctlr+0x6b/frame 0xffffffff8191a5b0
>>> tws_attach() at tws_attach+0xd79/frame 0xffffffff8191a670
>>> device_attach() at device_attach+0x396/frame 0xffffffff8191a6c0
>>> bus_generic_attach() at bus_generic_attach+0x2d/frame 0xffffffff8191a6e0
>>> acpi_pci_attach() at acpi_pci_attach+0x15f/frame 0xffffffff8191a730
>>> device_attach() at device_attach+0x396/frame 0xffffffff8191a780
>>> bus_generic_attach() at bus_generic_attach+0x2d/frame 0xffffffff8191a7a0
>>> acpi_pcib_attach() at acpi_pcib_attach+0x24d/frame 0xffffffff8191a7f0
>>> acpi_pcib_pci_attach() at acpi_pcib_pci_attach+0x9f/frame
>>> 0xffffffff8191a830
>>> device_attach() at device_attach+0x396/frame 0xffffffff8191a880
>>> bus_generic_attach() at bus_generic_attach+0x2d/frame 0xffffffff8191a8a0
>>> acpi_pci_attach() at acpi_pci_attach+0x15f/frame 0xffffffff8191a8f0
>>> device_attach() at device_attach+0x396/frame 0xffffffff8191a940
>>> bus_generic_attach() at bus_generic_attach+0x2d/frame 0xffffffff8191a960
>>> acpi_pcib_attach() at acpi_pcib_attach+0x24d/frame 0xffffffff8191a9b0
>>> acpi_pcib_acpi_attach() at acpi_pcib_acpi_attach+0x299/frame
>>> 0xffffffff8191aa00
>>> device_attach() at device_attach+0x396/frame 0xffffffff8191aa50
>>> bus_generic_attach() at bus_generic_attach+0x2d/frame 0xffffffff8191aa70
>>> acpi_attach() at acpi_attach+0xdd6/frame 0xffffffff8191ab30
>>> device_attach() at device_attach+0x396/frame 0xffffffff8191ab80
>>> bus_generic_attach() at bus_generic_attach+0x2d/frame 0xffffffff8191aba0
>>> nexus_acpi_attach() at nexus_acpi_attach+0x76/frame 0xffffffff8191abd0
>>> device_attach() at device_attach+0x396/frame 0xffffffff8191ac20
>>> bus_generic_new_pass() at bus_generic_new_pass+0xe9/frame
>>> 0xffffffff8191ac50
>>> bus_set_pass() at bus_set_pass+0x8f/frame 0xffffffff8191ac80
>>> configure() at configure+0xa/frame 0xffffffff8191ac90
>>> mi_startup() at mi_startup+0x118/frame 0xffffffff8191acb0
>>> btext() at btext+0x2c
>>>
>>>
>>>
>>> _______________________________________________
>>> freebsd-current@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>>> To unsubscribe, send any mail to
>>> "freebsd-current-unsubscribe@freebsd.org"
>> And after taking a very close look at the source code for tws, I spotted
>> the problem. Patch included.
>>
>> ~Andreas
>>
>> Index: sys/dev/tws/tws.h
>> ===================================================================
>> --- sys/dev/tws/tws.h (revision 250856)
>> +++ sys/dev/tws/tws.h (working copy)
>> @@ -137,7 +137,7 @@
>> TWS_DIR_IN = 0x2,
>> TWS_DIR_OUT = 0x4,
>> TWS_DIR_NONE = 0x8,
>> - TWS_DATA_CCB = 0x16,
>> + TWS_DATA_CCB = 0x10,
>> };
>>
>> enum tws_intrs {
>>
> Do you mean that this change alone fixes your panic and the controller
> works after the boot ?
>
> I started looking at the code, and thought that there some issues
> with DATA_CCB flag set too eagerly.

I've been running that kernel all day, rebuilding userland (ports) on a
4-drive ZFS RAID-Z on that controller, and not seen a single crash,
slowdown, hiccup or untoward log message.

~Andreas
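
For anyone reading this in the archive, here is a rough sketch of why the one-digit change in the patch could matter. The other values in the enum are single bits (0x2, 0x4, 0x8), so the next free bit would be 0x10 (decimal 16); 0x16 is binary 10110 and shares bits with TWS_DIR_IN and TWS_DIR_OUT. The sketch assumes the driver tests the flag with a bitwise AND, which would fit the backtrace (tws_get_param ending up in bus_dmamap_load_ccb) but is not shown anywhere in this thread. The enum name, the _OLD/_NEW suffixes and both helper functions are made up for illustration and are not part of tws(4).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Flag values copied from the patch. The enum name and the _OLD/_NEW
 * suffixes are hypothetical, added only so both values can coexist here.
 */
enum demo_req_flags {
	TWS_DIR_IN       = 0x2,
	TWS_DIR_OUT      = 0x4,
	TWS_DIR_NONE     = 0x8,
	TWS_DATA_CCB_OLD = 0x16,	/* binary 10110: overlaps 0x2 and 0x4 */
	TWS_DATA_CCB_NEW = 0x10,	/* binary 10000: a bit of its own */
};

/*
 * Hypothetical stand-in for a driver test of the form
 * (flags & TWS_DATA_CCB); the real check is an assumption.
 */
bool
looks_like_ccb(unsigned int flags, unsigned int ccb_mask)
{
	return ((flags & ccb_mask) != 0);
}

/* Aborts via assert() if the overlap described above does not hold. */
void
check_flag_overlap(void)
{
	/* Old value: a plain data-in or data-out request matches the mask. */
	assert(looks_like_ccb(TWS_DIR_IN, TWS_DATA_CCB_OLD));
	assert(looks_like_ccb(TWS_DIR_OUT, TWS_DATA_CCB_OLD));

	/* New value: only a request that really carries the flag matches. */
	assert(!looks_like_ccb(TWS_DIR_IN, TWS_DATA_CCB_NEW));
	assert(!looks_like_ccb(TWS_DIR_OUT, TWS_DATA_CCB_NEW));
	assert(looks_like_ccb(TWS_DIR_IN | TWS_DATA_CCB_NEW, TWS_DATA_CCB_NEW));
}
```

Calling check_flag_overlap() from a small test program should return without tripping an assertion. Under the assumption above, a non-CCB request with a direction flag set would have been routed to the CCB loader with the old value, which is one plausible reading of the "Unsupported func code 0" panic.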