From owner-freebsd-stable@FreeBSD.ORG Mon Jul 19 03:09:36 2010
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 6F13D106566C
    for ; Mon, 19 Jul 2010 03:09:36 +0000 (UTC)
    (envelope-from mike@sentex.net)
Received: from lava.sentex.ca (pyroxene.sentex.ca [199.212.134.18])
    by mx1.freebsd.org (Postfix) with ESMTP id 4205E8FC0C
    for ; Mon, 19 Jul 2010 03:09:35 +0000 (UTC)
Received: from mdt-xp.sentex.net (simeon.sentex.ca [192.168.43.27])
    by lava.sentex.ca (8.14.4/8.14.3) with ESMTP id o6J39Y9i045639;
    Sun, 18 Jul 2010 23:09:34 -0400 (EDT)
    (envelope-from mike@sentex.net)
Message-Id: <201007190309.o6J39Y9i045639@lava.sentex.ca>
X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9
Date: Sun, 18 Jul 2010 23:09:36 -0400
To: Jeremy Chadwick
From: Mike Tancsa
In-Reply-To: <20100719025839.GA91809@icarus.home.lan>
References: <201007182108.o6IL88eG043887@lava.sentex.ca>
    <20100718211415.GA84127@icarus.home.lan>
    <201007182142.o6ILgDQW044046@lava.sentex.ca>
    <20100719023419.GA91006@icarus.home.lan>
    <20100719025839.GA91809@icarus.home.lan>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Cc: freebsd-stable@freebsd.org
Subject: Re: deadlock or bad disk ? RELENG_8
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 19 Jul 2010 03:09:36 -0000

At 10:58 PM 7/18/2010, Jeremy Chadwick wrote:
>I re-worked this out myself based on the OP's dmesg. It's confusing
>because there's literally 6 different storage controllers on a single
>machine:

It's a big storage server. Some files don't require fast or frequent
access; others do. For example, the disks on the SATA controllers are
used with ZFS for the large files that require infrequent access.

>(probe16:arcmsr0:0:16:0): inquiry data fails comparison at DV1 step

That's been a normal message for Areca controllers on AMD64 for some time.

>So one thing of interest is that the Areca and 3ware controllers are
>sharing an IRQ. If you do extensive bidirectional I/O between disks on
>the arcmsr0 and twa0 controllers at the same time (e.g. read from
>arcmsr0 which writes to twa0, and read from twa0 which writes to
>arcmsr0), do you see this problem?

It's never been an issue in the past 2 years. The same box ran RELENG_7
for some time and was updated to RELENG_8 in the past 3 months.

> vmstat -i output would help here,
>except that it's going to show the rate as a total (for both
>controllers). I don't know if a way to get more granular output.
>
>pciconf -lvc output might also help (to see if the controllers are using
>MSI or not); only interested in the arcmsr0, twa0, and ahci1 entries.

interrupt                          total       rate
irq1: atkbd0                           6          0
irq4: uart0                         1049          0
irq18: arcmsr0 twa*              5221148        151
irq19: fwohci0+                   151346          4
irq23: uhci3 ehci1                     2          0
cpu0: timer                     67544881       1962
irq256: em0                     57430641       1668
irq257: ale0                     3262365         94
irq258: ahci1                   10406081        302
cpu1: timer                     67539701       1962
cpu2: timer                     67168885       1951
cpu3: timer                     67169530       1951
Total                          345895635      10049

arcmsr0@pci0:3:14:0: class=0x010400 card=0x121017d3 chip=0x121017d3 rev=0x00 hdr=0x00
    vendor     = 'Areca Technology Corporation'
    device     = 'ARC-1210 4-Port PCIe to SATA RAID Controller'
    class      = mass storage
    subclass   = RAID
    cap 01[c0] = powerspec 2  supports D0 D1 D3  current D0
    cap 05[d0] = MSI supports 2 messages, 64 bit
    cap 07[e0] = PCI-X 64-bit supports 133MHz, 1024 burst read, 4 split transactions
siis0@pci0:8:0:0: class=0x010400 card=0x71321095 chip=0x31321095 rev=0x01 hdr=0x00
    vendor     = 'Silicon Image Inc (Was: CMD Technology Inc)'
    device     = 'PCI Express (1x) to 2 Port SATA300 (SiI 3132)'
    class      = mass storage
    subclass   = RAID
    cap 01[54] = powerspec 2  supports D0 D1 D2 D3  current D0
    cap 05[5c] = MSI supports 1 message, 64 bit
    cap 10[70] = PCI-Express 1 legacy endpoint max data 128(1024) link x1(x1)
twa0@pci0:7:0:0: class=0x010400 card=0x100413c1 chip=0x100413c1 rev=0x01 hdr=0x00
    vendor     = '3ware Inc'
    device     = 'PCI-Express SATA2 Raid Controller (9650SE Series)'
    class      = mass storage
    subclass   = RAID
    cap 01[40] = powerspec 2  supports D0 D1 D2 D3  current D0
    cap 05[50] = MSI supports 32 messages, 64 bit
    cap 10[70] = PCI-Express 1 legacy endpoint max data 128(512) link x1(x8)
ahci0@pci0:5:0:0: class=0x010601 card=0x824f1043 chip=0x2361197b rev=0x02 hdr=0x00
    vendor     = 'JMicron Technology Corp.'
    device     = 'PCI Express to SATA II and PATA Host Controller (JMB363)'
    class      = mass storage
    subclass   = SATA
    cap 01[68] = powerspec 2  supports D0 D3  current D0
    cap 10[50] = PCI-Express 1 legacy endpoint IRQ 2 max data 128(128) link x1(x1)
ahci1@pci0:0:31:2: class=0x010601 card=0x82d41043 chip=0x3a228086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '6 port SATA AHCI Controller'
    class      = mass storage
    subclass   = SATA
    cap 05[80] = MSI supports 16 messages enabled with 1 message
    cap 01[70] = powerspec 3  supports D0 D3  current D0
    cap 12[a8] = SATA Index-Data Pair
    cap 09[b0] = vendor (length 6) Intel cap 2 version 0

>--
>| Jeremy Chadwick                                   jdc@parodius.com |
>| Parodius Networking                       http://www.parodius.com/ |
>| UNIX Systems Administrator                  Mountain View, CA, USA |
>| Making life hard for others since 1977.              PGP: 4BD6C0CB |

--------------------------------------------------------------------
Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            mike@sentex.net
Providing Internet since 1994                     www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike
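
For anyone wanting to answer the two questions raised in this message (which controllers share an interrupt line, and which are actually using MSI) without reading the output by eye, here is a minimal sketch. It is not part of the original message. The function names `shared_irqs` and `msi_status` are hypothetical, and the samples are trimmed copies of the output quoted above. The script only parses text in the layout shown here; it does not call `vmstat` or `pciconf` itself. Treating a trailing `+` or `*` on a device name as a sign that the handler list was truncated is an assumption, not something the message states.

```python
import re

# Trimmed samples copied from the output quoted in this message.
VMSTAT_SAMPLE = """\
interrupt                          total       rate
irq18: arcmsr0 twa*              5221148        151
irq19: fwohci0+                   151346          4
cpu0: timer                     67544881       1962
irq256: em0                     57430641       1668
irq258: ahci1                   10406081        302
"""

PCICONF_SAMPLE = """\
arcmsr0@pci0:3:14:0: class=0x010400 card=0x121017d3 chip=0x121017d3 rev=0x00 hdr=0x00
    cap 01[c0] = powerspec 2  supports D0 D1 D3  current D0
    cap 05[d0] = MSI supports 2 messages, 64 bit
twa0@pci0:7:0:0: class=0x010400 card=0x100413c1 chip=0x100413c1 rev=0x01 hdr=0x00
    cap 01[40] = powerspec 2  supports D0 D1 D2 D3  current D0
    cap 05[50] = MSI supports 32 messages, 64 bit
ahci1@pci0:0:31:2: class=0x010601 card=0x82d41043 chip=0x3a228086 rev=0x00 hdr=0x00
    cap 05[80] = MSI supports 16 messages enabled with 1 message
"""

# "irqN: dev [dev ...] <total> <rate>" rows; cpu timer rows are skipped.
IRQ_ROW = re.compile(r"^(irq\d+):\s+(.+?)\s+(\d+)\s+(\d+)$")
# "name@pciD:B:S:F:" header rows that start each pciconf entry.
DEVICE_ROW = re.compile(r"^(\w+)@pci\d+:")
# Capability ID 05 is the MSI capability in the listing above.
MSI_ROW = re.compile(r"^cap 05\[[0-9a-f]+\] = MSI\b")


def shared_irqs(vmstat_text):
    """Map IRQ label -> device names for rows that look shared."""
    shared = {}
    for raw in vmstat_text.splitlines():
        match = IRQ_ROW.match(raw.strip())
        if not match:
            continue
        devices = match.group(2).split()
        # Assumption: a trailing + or * hints at a truncated handler list.
        truncated = devices[-1].endswith(("+", "*"))
        if len(devices) > 1 or truncated:
            shared[match.group(1)] = devices
    return shared


def msi_status(pciconf_text):
    """Map device name -> a short description of its MSI state."""
    status = {}
    device = None
    for raw in pciconf_text.splitlines():
        line = raw.strip()
        header = DEVICE_ROW.match(line)
        if header:
            device = header.group(1)
            status[device] = "no MSI capability listed"
        elif device and MSI_ROW.match(line):
            # The listing marks MSI in use with "enabled with N message(s)".
            if "enabled with" in line:
                status[device] = "MSI enabled"
            else:
                status[device] = "MSI supported, not enabled"
    return status


if __name__ == "__main__":
    for irq, devices in shared_irqs(VMSTAT_SAMPLE).items():
        print(irq, "->", " ".join(devices))
    for name, state in msi_status(PCICONF_SAMPLE).items():
        print(name, "->", state)
```

On a live system the same functions could be fed the captured text of `vmstat -i` and `pciconf -lvc` in place of the samples.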