Date: Sat, 31 Jan 2009 16:48:55 -0500
From: Dylan Alex Simon <dylan@dylex.net>
To: Christoph Mallon <christoph.mallon@gmx.de>
Cc: freebsd-current@FreeBSD.ORG
Subject: Re: SATA DMA errors on second ICH10 bus
Message-ID: <20090131214855.GA9123@datura.dylex.net>
In-Reply-To: <49844264.7000300@gmx.de>
References: <8cb6106e0901200641x4b0bda9ag31e6f059f13035a7@mail.gmail.com> <200901201829.n0KITE8V072323@lurza.secnetix.de> <20090131010855.GA7991@datura.dylex.net> <49844264.7000300@gmx.de>

> I suspect I see the same problem with some nvidia SATA controller. If
> there is high load on both channels of one controller, there are exactly
> the errors you showed.
> Your kernel does not use INVARIANTS, is this correct? Otherwise you
> should see a very specific panic caused by a KASSERT(). I analysed the
> problem a bit. You can see my findings in the thread "Question about
> panic in brelse()".
> I suspect a hardware bug plus incorrect error handling in the driver in
> FreeBSD. As a workaround, I suggest you connect each disk to a separate
> controller - if you have not more disks than controllers.

When I do turn INVARIANTS on I ultimately get a number of different failures,
depending on what sort of operation I'm doing. I think I've seen the brelse
panic you mentioned but not recently. Here's one from today doing cp on ufs:

ad0: FAILURE - load data
ad0: setting up DMA failed
g_vfs_done():ad0s1e[READ(offset=1843986432, length=65536)]error = 5
vnode_pager_getpages: I/O read error
vm_fault: pager read error, pid 819 (cp)
kernel trap 9 with interrupts disabled

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x8:0xffffffff802ae9fe
stack pointer           = 0x10:0xfffffffeb61bfae0
frame pointer           = 0x10:0xfffffffeb61bfb00
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 12 (irq14: ata0)
lock order reversal: (Giant after non-sleepable)
 1st 0xffffffff80628750 bio queue (bio queue) @ /usr/src/sys/geom/geom_io.c:68
 2nd 0xffffffff8062b8c0 Giant (Giant) @ /usr/src/sys/dev/kbdmux/kbdmux.c:1044
KDB: stack backtrace:
panic: mutex Giant not owned at /usr/src/sys/kern/tty_ttydisc.c:1127
cpuid = 0

I certainly agree that there's some problems in error handling, but I'm more
concerned about the underlying problem causing the errors.

:-Dylan