From owner-freebsd-fs@FreeBSD.ORG Sat Jun 18 14:26:22 2011
Message-ID: <4DFCB12A.6030805@darkbsd.org>
Date: Sat, 18 Jun 2011 23:07:38 +0900
From: Stephane LAPIE
MIME-Version: 1.0
To: freebsd-hardware@freebsd.org, freebsd-drivers@freebsd.org, freebsd-fs@freebsd.org
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig8BE633BB83E59ACA09FD0D2A"
Subject: Problem with a LSILogic SAS/SATA adapter on 8.2-STABLE/ZFSv28

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig8BE633BB83E59ACA09FD0D2A
Content-Type: text/plain; charset=UTF-8

Hello list,

I have a problem with my 8.2-STABLE/ZFSv28 server.

I am currently upgrading my disks from 1.5TB Seagate drives to 2TB Seagate drives, and therefore replacing devices within ZFS. (I have activated deduplication on a few file systems, for the record)

I think this is more related to a hardware problem (flaky memory? flaky controller/driver maybe?), but I would appreciate any input.

I experienced several kernel panics, all of which seem to point at mpt0 mishandling interrupts:

www.darkbsd.org/~darksoul/kernel-panic-mpt1.txt (no target cmd ptrs)
www.darkbsd.org/~darksoul/kernel-panic-mpt2.txt (mpt_intr index == ...)
www.darkbsd.org/~darksoul/kernel-panic-mpt3.txt (NMI in kernel mode)
www.darkbsd.org/~darksoul/kernel-panic-mpt4.txt (LAN CONTEXT REPLY)
www.darkbsd.org/~darksoul/kernel-panic-mpt5.txt (LAN CONTEXT REPLY)
www.darkbsd.org/~darksoul/kernel-panic-mpt6.txt (LAN CONTEXT REPLY)
www.darkbsd.org/~darksoul/kernel-panic-mpt7.txt (LAN CONTEXT REPLY)

I would appreciate any pointers to what on earth "LAN CONTEXT REPLY" means for an LSI controller (using driver mpt(4)), as I have no idea, and the source was not really helpful.

The error message about an NMI and RAM parity error is what is scaring me the most here, and points me in the direction of flaky memory.

This is a personal machine, so I can add debug options and try stuff if it can help figure out what is going on. Also, any critical data is replicated, backed up and accounted for.

Thanks in advance for your time.

Here is a zpool list and a zpool status:

NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH    ALTROOT
prana  22.7T  17.4T  5.29T    76%  1.18x  DEGRADED  -

  pool: prana
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sat Jun 18 12:43:02 2011
        13.8T scanned out of 17.3T at 236/s, (scan is slow, no estimated time)
        899G resilvered, 79.38% done
config:

        NAME                                            STATE     READ WRITE CKSUM
        prana                                           DEGRADED     0     0     0
          raidz1-0                                      DEGRADED     0     0     0
            da3                                         OFFLINE      0     0     0
            ad14                                        ONLINE       0     0     0
            ad12                                        ONLINE       0     0     0
            da1                                         ONLINE       0     0     0
            da0                                         ONLINE       0     0     0
          raidz1-1                                      DEGRADED     0     0     0
            ad26                                        ONLINE       0     0     0
            replacing-1                                 DEGRADED     0     0     0
              da6/old                                   OFFLINE      0     0     0
              da6                                       ONLINE       0     0     0  (resilvering)
            da4                                         ONLINE       0     0     0
            da7                                         ONLINE       0     0     0
            da5                                         ONLINE       0     0     0
          raidz1-2                                      ONLINE       0     0     0
            ad28                                        ONLINE       0     0     0
            ad8                                         ONLINE       0     0     0
            ad6                                         ONLINE       0     0     0
            ad16                                        ONLINE       0     0     0
            ad18                                        ONLINE       0     0     0
        cache
          gptid/d9c047d5-c1a7-11df-b584-000e0c707d1e    ONLINE       0     0     0
          gptid/da695e56-c1a7-11df-b584-000e0c707d1e    ONLINE       0     0     0
        spares
          da8                                           AVAIL
          da9                                           AVAIL

Here is my dmesg trace:

Copyright (c) 1992-2011 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.2-STABLE #1: Thu Jun 16 23:22:47 JST 2011
    darksoul@eirei-no-za.yomi.darkbsd.org:/usr/storage/tech/eirei-no-za.yomi.darkbsd.org/usr/obj/usr/storage/tech/eirei-no-za.yomi.darkbsd.org/usr/src/sys/DARK-2011KERN amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz (2666.68-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x1067a  Family = 6  Model = 17  Stepping = 10
  Features=0xbfebfbff
  Features2=0x408e3fd
  AMD Features=0x20100800
  AMD Features2=0x1
  TSC: P-state invariant
real memory  = 8589934592 (8192 MB)
avail memory = 8254509056 (7872 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 2
 cpu3 (AP): APIC ID: 3
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 24-47 on motherboard
kbd1 at kbdmux0
ichwd module loaded
iscsi: version 2.2.4.2
cryptosoft0: on motherboard
acpi0: on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
cpu0: on acpi0
cpu1: on acpi0
cpu2: on acpi0
cpu3: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pcib1: irq 16 at device 6.0 on pci0
pci3: on pcib1
pcib2: at device 0.0 on pci3
pci4: on pcib2
em0: port 0x2000-0x203f mem 0xdf980000-0xdf99ffff,0xdf900000-0xdf93ffff irq 16 at device 4.0 on pci4
em0: [FILTER]
em0: Ethernet address: 00:0e:0c:70:7d:1e
em1: port 0x2040-0x207f mem 0xdf9a0000-0xdf9bffff,0xdf940000-0xdf97ffff irq 17 at device 4.1 on pci4
em1: [FILTER]
em1: Ethernet address: 00:0e:0c:70:7d:1f
pcib3: at device 0.2 on pci3
pci5: on pcib3
em2: port 0x1820-0x183f mem 0xdfb00000-0xdfb1ffff,0xdfb20000-0xdfb20fff irq 16 at device 25.0 on pci0
em2: Using an MSI interrupt
em2: [FILTER]
em2: Ethernet address: 00:30:48:de:84:88
uhci0: port 0x1840-0x185f irq 16 at device 26.0 on pci0
uhci0: [ITHREAD]
usbus0: on uhci0
uhci1: port 0x1860-0x187f irq 17 at device 26.1 on pci0
uhci1: [ITHREAD]
usbus1: on uhci1
uhci2: port 0x1880-0x189f irq 18 at device 26.2 on pci0
uhci2: [ITHREAD]
usbus2: on uhci2
ehci0: mem 0xdfb22800-0xdfb22bff irq 18 at device 26.7 on pci0
ehci0: [ITHREAD]
usbus3: EHCI version 1.0
usbus3: on ehci0
pcib4: irq 16 at device 28.0 on pci0
pci6: on pcib4
pcib5: at device 0.0 on pci6
pci7: on pcib5
mpt0: port 0x3000-0x30ff mem 0xdf310000-0xdf313fff,0xdf300000-0xdf30ffff irq 24 at device 1.0 on pci7
mpt0: [ITHREAD]
mpt0: MPI Version=1.5.12.0
mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 )
mpt0: 0 Active Volumes (2 Max)
mpt0: 0 Hidden Drive Members (10 Max)
atapci0: port 0x3400-0x34ff mem 0xdf200000-0xdf2fffff irq 28 at device 7.0 on pci7
atapci0: [ITHREAD]
ata2: on atapci0
ata2: [ITHREAD]
ata3: on atapci0
ata3: [ITHREAD]
ata4: on atapci0
ata4: [ITHREAD]
ata5: on atapci0
ata5: [ITHREAD]
ata6: on atapci0
ata6: [ITHREAD]
ata7: on atapci0
ata7: [ITHREAD]
ata8: on atapci0
ata8: [ITHREAD]
ata9: on atapci0
ata9: [ITHREAD]
uhci3: port 0x18a0-0x18bf irq 23 at device 29.0 on pci0
uhci3: [ITHREAD]
usbus4: on uhci3
uhci4: port 0x18c0-0x18df irq 22 at device 29.1 on pci0
uhci4: [ITHREAD]
usbus5: on uhci4
uhci5: port 0x18e0-0x18ff irq 18 at device 29.2 on pci0
uhci5: [ITHREAD]
usbus6: on uhci5
ehci1: mem 0xdfb22c00-0xdfb22fff irq 23 at device 29.7 on pci0
ehci1: [ITHREAD]
usbus7: EHCI version 1.0
usbus7: on ehci1
pcib6: at device 30.0 on pci0
pci17: on pcib6
em3: port 0x4080-0x40bf mem 0xdfa00000-0xdfa1ffff irq 20 at device 0.0 on pci17
em3: [FILTER]
em3: Ethernet address: 00:07:e9:0f:a3:80
em4: port 0x40c0-0x40ff mem 0xdfa20000-0xdfa3ffff irq 21 at device 0.1 on pci17
em4: [FILTER]
em4: Ethernet address: 00:07:e9:0f:a3:81
vgapci0: port 0x4000-0x407f mem 0xde800000-0xdeffffff,0xdfa40000-0xdfa4ffff at device 1.0 on pci17
fwohci0: mem 0xdfa54000-0xdfa547ff,0xdfa50000-0xdfa53fff irq 22 at device 3.0 on pci17
fwohci0: [ITHREAD]
fwohci0: OHCI version 1.10 (ROM=1)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:30:48:00:00:20:42:f6
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: on fwohci0
fwe0: on firewire0
if_fwe0: Fake Ethernet address: 02:30:48:20:42:f6
fwe0: Ethernet address: 02:30:48:20:42:f6
fwip0: on firewire0
fwip0: Firewire address: 00:30:48:00:00:20:42:f6 @ 0xfffe00000000, S400, maxrec 2048
dcons_crom0: on firewire0
dcons_crom0: bus_addr 0x80ab60
fwohci0: Initiate bus reset
fwohci0: fwohci_intr_core: BUS reset
fwohci0: fwohci_intr_core: node_id=0x00000000, SelfID Count=1, CYCLEMASTER mode
atapci1: port 0x4420-0x4427,0x4414-0x4417,0x4418-0x441f,0x4410-0x4413,0x4400-0x440f irq 23 at device 4.0 on pci17
atapci1: [ITHREAD]
ata10: on atapci1
ata10: [ITHREAD]
isab0: at device 31.0 on pci0
isa0: on isab0
atapci2: port 0x1c70-0x1c77,0x1c64-0x1c67,0x1c68-0x1c6f,0x1c60-0x1c63,0x1c00-0x1c1f mem 0xdfb22000-0xdfb227ff irq 17 at device 31.2 on pci0
atapci2: [ITHREAD]
atapci2: AHCI called from vendor specific driver
atapci2: AHCI v1.20 controller with 6 3Gbps ports, PM supported
ata11: on atapci2
ata11: [ITHREAD]
ata12: on atapci2
ata12: [ITHREAD]
ata13: on atapci2
ata13: [ITHREAD]
ata14: on atapci2
ata14: [ITHREAD]
ata15: on atapci2
ata15: [ITHREAD]
ata16: on atapci2
ata16: [ITHREAD]
ichsmb0: port 0x1100-0x111f mem 0xdfb23000-0xdfb230ff irq 17 at device 31.3 on pci0
ichsmb0: [ITHREAD]
smbus0: on ichsmb0
smb0: on smbus0
pci0: at device 31.6 (no driver attached)
acpi_button0: on acpi0
atrtc0: port 0x70-0x71 irq 8 on acpi0
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
uart0: console (115200,n,8,1)
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
uart1: [FILTER]
ichwd0: on isa0
ichwd0: Intel ICH9R watchdog timer (ICH9 or equivalent)
orm0: at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
coretemp0: on cpu0
est0: on cpu0
p4tcc0: on cpu0
coretemp1: on cpu1
est1: on cpu1
p4tcc1: on cpu1
coretemp2: on cpu2
est2: on cpu2
p4tcc2: on cpu2
coretemp3: on cpu3
est3: on cpu3
p4tcc3: on cpu3
ZFS filesystem version 5
ZFS storage pool version 28
Timecounters tick every 1.000 msec
firewire0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me)
firewire0: bus manager 0
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 12Mbps Full Speed USB v1.0
usbus2: 12Mbps Full Speed USB v1.0
usbus3: 480Mbps High Speed USB v2.0
usbus4: 12Mbps Full Speed USB v1.0
usbus5: 12Mbps Full Speed USB v1.0
usbus6: 12Mbps Full Speed USB v1.0
usbus7: 480Mbps High Speed USB v2.0
ugen0.1: at usbus0
uhub0: on usbus0
ugen1.1: at usbus1
uhub1: on usbus1
ugen2.1: at usbus2
uhub2: on usbus2
ugen3.1: at usbus3
uhub3: on usbus3
ugen4.1: at usbus4
uhub4: on usbus4
ugen5.1: at usbus5
uhub5: on usbus5
ugen6.1: at usbus6
uhub6: on usbus6
ugen7.1: at usbus7
uhub7: on usbus7
ad6: 1907729MB at ata3-master UDMA100 SATA 3Gb/s
ad8: 1907729MB at ata4-master UDMA100 SATA 3Gb/s
ad12: 1430799MB at ata6-master UDMA100 SATA 3Gb/s
ad14: 1907729MB at ata7-master UDMA100 SATA 3Gb/s
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
uhub2: 2 ports with 2 removable, self powered
uhub4: 2 ports with 2 removable, self powered
uhub5: 2 ports with 2 removable, self powered
uhub6: 2 ports with 2 removable, self powered
ad16: 1907729MB at ata8-master UDMA100 SATA 3Gb/s
ad18: 1907729MB at ata9-master UDMA100 SATA 3Gb/s
ata10: DMA limited to UDMA33, controller found non-ATA66 cable
ad20: 3823MB at ata10-master UDMA33
ad21: 61136MB at ata10-slave UDMA133
ad26: 1907729MB at ata13-master UDMA100 SATA 3Gb/s
ad28: 1907729MB at ata14-master UDMA100 SATA 3Gb/s
uhub3: 6 ports with 6 removable, self powered
uhub7: 6 ports with 6 removable, self powered
ugen7.2: at usbus7
umass0: on usbus7
umass0: SCSI over Bulk-Only; quirks = 0x0000
ugen3.2: at usbus3
da0 at mpt0 bus 0 scbus0 target 0 lun 0
da0: Fixed Direct Access SCSI-5 device
da0: 300.000MB/s transfers
da0: Command Queueing enabled
da0: 1430799MB (2930277168 512 byte sectors: 255H 63S/T 182401C)
da1 at mpt0 bus 0 scbus0 target 1 lun 0
da1: Fixed Direct Access SCSI-5 device
da1: 300.000MB/s transfers
da1: Command Queueing enabled
da1: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da2 at mpt0 bus 0 scbus0 target 2 lun 0
da2: Fixed Direct Access SCSI-5 device
da2: 300.000MB/s transfers
da2: Command Queueing enabled
da2: 61136MB (125206528 512 byte sectors: 255H 63S/T 7793C)
da3 at mpt0 bus 0 scbus0 target 3 lun 0
da3: Fixed Direct Access SCSI-5 device
da3: 300.000MB/s transfers
da3: Command Queueing enabled
da3: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da4 at mpt0 bus 0 scbus0 target 4 lun 0
da4: Fixed Direct Access SCSI-5 device
da4: 300.000MB/s transfers
da4: Command Queueing enabled
da4: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da5 at mpt0 bus 0 scbus0 target 5 lun 0
da5: Fixed Direct Access SCSI-5 device
da5: 300.000MB/s transfers
da5: Command Queueing enabled
da5: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da6 at mpt0 bus 0 scbus0 target 6 lun 0
da6: Fixed Direct Access SCSI-5 device
da6: 300.000MB/s transfers
da6: Command Queueing enabled
da6: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
da7 at mpt0 bus 0 scbus0 target 7 lun 0
da7: Fixed Direct Access SCSI-5 device
da7: 300.000MB/s transfers
da7: Command Queueing enabled
da7: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
SMP: AP CPU #2 Launched!
SMP: AP CPU #1 Launched!
SMP: AP CPU #3 Launched!
Root mount waiting for: usbus7
umass0:2:0:-1: Attached to scbus2
da8 at umass-sim0 bus 0 scbus2 target 0 lun 0
da8: Fixed Direct Access SCSI-4 device
da8: 40.000MB/s transfers
da8: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
ugen7.3: at usbus7
uhub8: on usbus7
Root mount waiting for: usbus7
uhub8: 4 ports with 4 removable, self powered
ugen7.4: at usbus7
umass1: on usbus7
umass1: SCSI over Bulk-Only; quirks = 0x0000
Root mount waiting for: usbus7
umass1:3:1:-1: Attached to scbus3
da9 at umass-sim1 bus 1 scbus3 target 0 lun 0
da9: Fixed Direct Access SCSI-4 device
da9: 40.000MB/s transfers
da9: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C)
ugen7.5: at usbus7
uhub9: on usbus7
Root mount waiting for: usbus7
uhub9: 4 ports with 4 removable, self powered
Trying to mount root from zfs:prana

-- 
Stephane LAPIE, EPITA SRS, Promo 2005
"Even when they have digital readouts, I can't understand them."
--MegaTokyo

--------------enig8BE633BB83E59ACA09FD0D2A
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk38sS4ACgkQ24Ql8u6TF2MdTQCfXGnImFL+4qSWHbV2SW6Qk0DT
DkcAniV5OC8yVxhigvYA/4Cpb+UP1eNk
=6Q2i
-----END PGP SIGNATURE-----

--------------enig8BE633BB83E59ACA09FD0D2A--
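
Note on the drive sizes in the dmesg output above: the sector counts printed for the da devices can be used to tell the older 1.5TB drives from the new 2TB ones. The following is only a minimal Python sketch. The helper name `sectors_to_sizes` is hypothetical and not part of any FreeBSD tool, and it assumes that the "MB" figure in dmesg means 2^20 bytes, which is consistent with the numbers shown in this message.

```python
SECTOR_BYTES = 512  # sector size printed by dmesg ("512 byte sectors")


def sectors_to_sizes(sectors):
    """Return (size in units of 2**20 bytes, size in decimal TB).

    Hypothetical helper for illustration only; the first value is
    assumed to correspond to the "MB" figure that dmesg prints.
    """
    total_bytes = sectors * SECTOR_BYTES
    return total_bytes // 2**20, total_bytes / 10**12


# Sector counts copied from the da0 and da1 lines of the dmesg output.
drives = {"da0": 2930277168, "da1": 3907029168}

for name, sectors in drives.items():
    mb, tb = sectors_to_sizes(sectors)
    print(f"{name}: {mb}MB, about {tb:.1f} TB")
```

With the two sector counts above, this reproduces the 1430799MB and 1907729MB figures from dmesg, which suggests da0 is still one of the 1.5TB drives while da1 is already a 2TB drive.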