Date: Sat, 14 Mar 2015 15:47:33 +0200
From: Konstantin Belousov <kostikbel@gmail.com>
To: Michael Fuckner <michael@fuckner.net>
Cc: Ryan Stone <rysto32@gmail.com>, Steven Hartland <killing@multiplay.co.uk>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject: Re: Server with 3TB Crashing at boot
Message-ID: <20150314134733.GI2379@kib.kiev.ua>
In-Reply-To: <5504362C.9000904@fuckner.net>
References: <5501DD57.7030305@freebsd.org> <CAJ-Vmom%2BZy_6ZCcTA_h4LFL-wDdaCmbqAgN5shpOsBBgWJw1mg@mail.gmail.com> <5503234A.4060103@fuckner.net> <55032C1B.5030004@multiplay.co.uk> <CAFMmRNzmr0Z1vXYK5y2QLk3eyUqXyK8TELgh4KHZ%2BDQVDXWZAg@mail.gmail.com> <275339388.395931.1426279324879.JavaMail.open-xchange@ptangptang.store> <CAFMmRNya8ZV=D3z0dE1%2BDsprAVBvcHZFGN1g1Ezd8k84nE3kVA@mail.gmail.com> <5503DC66.40409@fuckner.net> <55041EF5.9080200@multiplay.co.uk> <5504362C.9000904@fuckner.net>
On Sat, Mar 14, 2015 at 02:22:52PM +0100, Michael Fuckner wrote:
> On 3/14/2015 12:43 PM, Steven Hartland wrote:
> >
> >
> > On 14/03/2015 06:59, Michael Fuckner wrote:
> >> On 3/13/2015 10:17 PM, Ryan Stone wrote:
> >>> On Fri, Mar 13, 2015 at 4:42 PM, Michael Fuckner <michael@fuckner.net>
> >>> wrote:
> >>>
> >>>> Now I can kldload zfs without exploding kernel. I'll do some more
> >>>> tests
> >>>> tomorrow, but this looks fine!
> >>>>
> >>>
> >>> Excellent news! I'd be interested to know whether this fixes the panics
> >>> that you saw when zfs.ko was loaded by the bootloader. It's definitely
> >>> possible, as the symptoms of this bug are likely to be random memory
> >>> corruption after zfs initializes, but your crash happened pretty
> >>> early on
> >>> and I'm not sure whether zfs would have had a chance to do anything that
> >>> early.
> >>>
> >>> Thanks for all of the work that you did to debug this.
> >>
> >>
> >> Currently there is another issue that prevents me from testing ZFS:
> >> only one HBA gets initialized.
> >>
> >> mpr0: 9300-8i with 8x Intel S3700
> >> mpr1: 9300-4i4e with 4x Intel S3700 and an external JBOD with 24HDD.
> >>
> >> mpr0 initializes fine, mpr1 fails
> >>
> >> root@s4l:~ # dmesg |grep mpr
> >> mpr1: <LSI SAS3008> port 0xf000-0xf0ff mem 0xfb100000-0xfb10ffff irq
> >> 112 at device 0.0 on pci195
> >> mpr1: IOCFacts :
> >> mpr1: Firmware: 07.00.01.00, Driver: 05.255.05.00-fbsd
> >> mpr1: IOCCapabilities:
> >> 7a85c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,MSIXIndex,HostDisc>
> >>
> >> mpr1: Cannot allocate queues memory
> >> mpr1: mpr_iocfacts_allocate failed to alloc queues with error 12
> >> mpr1: mpr_attach IOC Facts based allocation failed with error 12
> >> device_attach: mpr1 attach returned 12
>
> did your patch for queue size and sense size.
>
> I just saw:
>
> http://dedi3.fuckner.net/~molli123/temp/mpr2.cap (just the second half
> of the boot, but nothing changed but the two debug echos)
>
>
> root@s4l:~ # dmesg |grep mpr
> mpr0: <LSI SAS3008> port 0x2000-0x20ff mem 0xaba00000-0xaba0ffff irq 42
> at device 0.0 on pci4
> mpr0: IOCFacts :
> mpr0: Firmware: 07.00.01.00, Driver: 05.255.05.00-fbsd
> mpr0: IOCCapabilities:
> 7a85c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,MSIXIndex,HostDisc>
> mpr0: attempting to allocate 1 MSI-X vectors (96 supported)
> mpr0: using IRQ 300 for MSI-X
> mpr1: <LSI SAS3008> port 0xf000-0xf0ff mem 0xfb100000-0xfb10ffff irq 112
> at device 0.0 on pci195
> mpr1: IOCFacts :
> mpr1: Firmware: 07.00.01.00, Driver: 05.255.05.00-fbsd
> mpr1: IOCCapabilities:
> 7a85c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,MSIXIndex,HostDisc>
> mpr1: Cannot allocate sense size 258048 memory
> mpr1: mpr_iocfacts_allocate failed to alloc queues with error 12
> mpr1: mpr_attach IOC Facts based allocation failed with error 12
> device_attach: mpr1 attach returned 12
> (probe0:mpr0:0:0:0): Down reving Protocol Version from 4 to 0?
> da0 at mpr0 bus 0 scbus0 target 0 lun 0
> pass0 at mpr0 bus 0 scbus0 target 0 lun 0
>
> In Bios I have VT-d enabled. In the VT-d Menu there are two other options:
> ATS (Non-Iscoh VT-D Engine ATS Support), default: enabled

ATS is not used by FreeBSD right now, and your device probably does not
support it either.

> Coherency Support(Non Iscoh VT_D Engine Coherency Support), default:
> Disabled

There is a bug in 10.1 which incorrectly invalidates the IOMMU page cache.
Enabling coherency works around the bug.

>
> In the Processor Configuration tab there also was
> extended apic support (default is disabled)

On 10.1 this option should not result in any behaviour change.

>
> Should I give this option a try?

Up to you.

I am somewhat curious whether it boots with DMAR enabled, and what happens
if it does not.
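
A note for readers following the dmesg output above, not part of the original message: the "error 12" that mpr1 returns from attach is the errno value ENOMEM, which matches the "Cannot allocate ... memory" lines. The following is a minimal sketch for decoding such a number; describe_errno is a hypothetical helper name, and the message text comes from the local C library, so it may be worded differently on other systems.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'N = NAME (message)' for an errno value (hypothetical helper)."""
    # errno.errorcode maps numeric values to their symbolic names.
    name = errno.errorcode.get(code, "unknown")
    # os.strerror returns the platform's message text for the code.
    return f"{code} = {name} ({os.strerror(code)})"


# Decode the value reported by "device_attach: mpr1 attach returned 12".
print(describe_errno(12))
```

So the failure is a memory allocation failure during driver attach and not, for example, an I/O error from the controller.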
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150314134733.GI2379>