Date: Wed, 25 Mar 2009 09:13:09 +0000 (GMT)
From: "Mark Powell"
To: kevin
Cc: freebsd-current@freebsd.org
Subject: Re: ZFS data error without reasons
Message-ID: <20090325090456.G92412@rust.salford.ac.uk>
In-Reply-To: <49BE4EC1.90207@163.com>
References: <49BD117B.2080706@163.com> <4F9C9299A10AE74E89EA580D14AA10A635E68A@royal64.emp.zapto.org> <49BE4EC1.90207@163.com>

Kevin,
  Did you fix your ZFS CRC errors? I responded to your thread, but no one got back to me.
I'm going to start another thread later. This time I re-made the zpool in 8 so that it is compatible with 7. Once the errors started showing up in 8, I moved back to 7, on the same hardware, to run the scrub and prove the problem is with 8.
  The first scrub in 7 found some errors, but of course it would if 8 had messed up the data. I removed the few unimportant bad files (all were in snapshots) and am just running the second scrub in 7 now. If this comes back with no errors, then we have stronger proof that there is something wrong in 8, seemingly quite intermittent, that randomly writes bad data.
  Cheers.

On Mon, 16 Mar 2009, kevin wrote:

> Daniel Eriksson wrote:
>> kevin wrote:
>>
>>
>>> Hi,
>>> Will any changes cause zfs data error?I find my disk data error without
>>> any reasons(shutdown or reboot normally).disk was bought
>>> yesterday.sometimes it can be fixed with a zpool scrub.but mostly zpool
>>> scrub will return more errors.Even i restore all zpool from my
>>> backup,without 5 mins,zpool status shows data error and many checksum
>>> errors.
>>>
>>
>> Is the drive connected to an "nVidia nForce MCP55 SATA300 controller"? I
>> have two machines with on-board MCP55 controllers. One of them works
>> perfectly, the other causes silent data corruption (each time I run a
>> zpool scrub it finds new checksum errors).
>>
>> If you also have an MCP55 controller then maybe this is related.
>>
>>
> My laptop is T61. RAM is also tested by memtest86+ and return no error.
> "zfs send tank/usr/home/kevin@2009-03-15-16:51:21|zfs receive backup/kevin"
> hangs system and i have to power off the machine.when the system up,i find
> file error in snapshot tank/usr/home/kevin@2009-03-15-16:51:21.when i destroy
> tank/usr/home/kevin@2009-03-15-16:51:21,then reboot system, i find more
> errors.
>
> #zpool status -v
>   pool: tank
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: scrub in progress for 0h10m, 96.10% done, 0h0m to go
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0     2
>           ad4s1d    ONLINE       0     0     4
>
> errors: Permanent errors have been detected in the following files:
>
>         /usr/bin/less
>         /usr/lib/libstdc++.so.6
>         /usr/bin/tbl
>         /usr/share/misc/termcap.db
>         /usr/bin/ssh-agent
>         /usr/local/bin/sudo
>         /usr/local/lib/libX11.so.6
>         /usr/home/kevin/memtest86+-2.11.iso
>
> when zpool scrub end.
> #zpool status -v
>   pool: tank
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: scrub completed after 0h10m with 2 errors on Mon Mar 16 21:01:12 2009
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0     2
>           ad4s1d    ONLINE       0     0     4
>
> errors: Permanent errors have been detected in the following files:
>
>         /usr/home/kevin/memtest86+-2.11.iso
>
> Should i just delete memtest86+-2.11.iso ?
>
> dmesg:
> Copyright (c) 1992-2009 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 8.0-CURRENT #0: Sun Mar 15 21:11:36 CST 2009
> root@datastream-laptop.people.163.org:/usr/obj/usr/src/sys/G8laptop
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz (2394.02-MHz K8-class
> CPU)
> Origin = "GenuineIntel" Id = 0x6fb Stepping = 11
> Features=0xbfebfbff
> Features2=0xe3bd
> AMD Features=0x20100800
> AMD Features2=0x1
> TSC: P-state invariant
> Cores per package: 2
> usable memory = 4210061312 (4015 MB)
> avail memory = 4039487488 (3852 MB)
> ACPI APIC Table:
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> cpu0 (BSP): APIC ID: 0
> cpu1 (AP): APIC ID: 1
> This module (opensolaris) contains code covered by the
> Common Development and Distribution License (CDDL)
> see http://opensolaris.org/os/licensing/opensolaris_license/
> ACPI Warning (tbfadt-0505): Optional field "Gpe1Block" has zero address or
> length: 0 102C/0 [20070320]
> ioapic0: Changing APIC ID to 1
> ioapic0 irqs 0-23 on motherboard
> kbd1 at kbdmux0
> acpi0: on motherboard
> acpi0: [ITHREAD]
> acpi_ec0: port 0x62,0x66 on acpi0
> acpi0: Power Button (fixed)
> acpi0: reservation of 0, a0000 (3) failed
> acpi0: reservation of 100000, bff00000 (3) failed
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
> acpi_hpet0: iomem 0xfed00000-0xfed003ff on acpi0
> Timecounter "HPET" frequency 14318180 Hz quality 900
> acpi_lid0: on acpi0
> acpi_button0: on acpi0
> pcib0: port 0xcf8-0xcff on acpi0
> pci0: on pcib0
> pcib1: irq 16 at device 1.0 on pci0
> pci1: on pcib1
> vgapci0: port 0x2000-0x207f mem
> 0xd2000000-0xd2ffffff,0xe0000000-0xefffffff,0xd0000000-0xd1ffffff irq 16 at
> device 0.0 on pci1
> em0: port 0x1840-0x185f mem
> 0xfe200000-0xfe21ffff,0xfe225000-0xfe225fff irq 20 at device 25.0 on pci0
> em0: Using MSI interrupt
> em0: [FILTER]
> em0: Ethernet address: 00:1c:25:1c:fb:d0
> uhci0: port 0x1860-0x187f irq 20
> at device 26.0 on pci0
> uhci0: [ITHREAD]
> uhci0: LegSup = 0x0000
> usbus0: on uhci0
> uhci1: port 0x1880-0x189f irq 21
> at device 26.1 on pci0
> uhci1: [ITHREAD]
> uhci1: LegSup = 0x0000
> usbus1: on uhci1
> ehci0: mem
> 0xfe226c00-0xfe226fff irq 22 at device 26.7 on pci0
> ehci0: [ITHREAD]
> usbus2: EHCI version 1.0
> usbus2: on ehci0
> hdac0: mem
> 0xfe220000-0xfe223fff irq 17 at device 27.0 on pci0
> hdac0: HDA Driver Revision: 20090226_0129
> hdac0: [ITHREAD]
> pcib2: irq 20 at device 28.0 on pci0
> pci2: on pcib2
> pci2: at device 0.0 (no driver attached)
> pcib3: irq 21 at device 28.1 on pci0
> pci3: on pcib3
> iwn0: mem 0xd7dfe000-0xd7dfffff irq 17 at
> device 0.0 on pci3
> iwn0: Reg Domain: MoW1, address 00:1d:e0:48:13:2f
> iwn0: [ITHREAD]
> iwn0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
> iwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
> iwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps
> 36Mbps 48Mbps 54Mbps
> iwn0: 11na MCS: 15Mbps 30Mbps 45Mbps 60Mbps 90Mbps 120Mbps 135Mbps 150Mbps
> 30Mbps 60Mbps 90Mbps 120Mbps 180Mbps 240Mbps 270Mbps 300Mbps
> iwn0: 11ng MCS: 15Mbps 30Mbps 45Mbps 60Mbps 90Mbps 120Mbps 135Mbps 150Mbps
> 30Mbps 60Mbps 90Mbps 120Mbps 180Mbps 240Mbps 270Mbps 300Mbps
> pcib4: irq 22 at device 28.2 on pci0
> pci4: on pcib4
> pcib5: irq 23 at device 28.3 on pci0
> pci5: on pcib5
> pcib6: irq 20 at device 28.4 on pci0
> pci13: on pcib6
> uhci2: port 0x18a0-0x18bf irq 16
> at device 29.0 on pci0
> uhci2: [ITHREAD]
> uhci2: LegSup = 0x0000
> usbus3: on uhci2
> uhci3: port 0x18c0-0x18df irq 17
> at device 29.1 on pci0
> uhci3: [ITHREAD]
> uhci3: LegSup = 0x0000
> usbus4: on uhci3
> uhci4: port 0x18e0-0x18ff irq 18
> at device 29.2 on pci0
> uhci4: [ITHREAD]
> uhci4: LegSup = 0x0000
> usbus5: on uhci4
> ehci1: mem
> 0xfe227000-0xfe2273ff irq 19 at device 29.7 on pci0
> ehci1: [ITHREAD]
> usbus6: EHCI version 1.0
> usbus6: on ehci1
> pcib7: at device 30.0 on pci0
> pci21: on pcib7
> cbb0: mem 0xf8100000-0xf8100fff irq 16 at device
> 0.0 on pci21
> cardbus0: on cbb0
> pccard0: <16-bit PCCard bus> on cbb0
> cbb0: [FILTER]
> isab0: at device 31.0 on pci0
> isa0: on isab0
> atapci0: port
> 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1830-0x183f at device 31.1 on pci0
> ata0: on atapci0
> ata0: [ITHREAD]
> atapci1: port
> 0x1c48-0x1c4f,0x1c1c-0x1c1f,0x1c40-0x1c47,0x1c18-0x1c1b,0x1c20-0x1c3f mem
> 0xfe226000-0xfe2267ff irq 16 at device 31.2 on pci0
> atapci1: [ITHREAD]
> atapci1: AHCI Version 01.10 controller with 3 ports PM not supported
> ata2: on atapci1
> ata2: [ITHREAD]
> ata3: on atapci1
> ata3: [ITHREAD]
> pci0: at device 31.3 (no driver attached)
> acpi_tz0: on acpi0
> acpi_tz1: on acpi0
> atrtc0: port 0x70-0x71 irq 8 on acpi0
> atkbdc0: port 0x60,0x64 irq 1 on acpi0
> atkbd0: irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> atkbd0: [ITHREAD]
> psm0: irq 12 on atkbdc0
> psm0: [GIANT-LOCKED]
> psm0: [ITHREAD]
> psm0: model Synaptics Touchpad, device ID 0
> battery0: on acpi0
> acpi_acad0: on acpi0
> acpi_ibm0: on acpi0
> cpu0: on acpi0
> coretemp0: on cpu0
> est0: on cpu0
> p4tcc0: on cpu0
> cpu1: on acpi0
> coretemp1: on cpu1
> est1: on cpu1
> p4tcc1: on cpu1
> orm0: at iomem
> 0xc0000-0xcefff,0xcf000-0xcffff,0xd0000-0xd0fff,0xe0000-0xeffff on isa0
> sc0: at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> WARNING: ZFS is considered to be an experimental feature in FreeBSD.
> Timecounters tick every 1.000 msec
> usbus0: 12Mbps Full Speed USB v1.0
> usbus1: 12Mbps Full Speed USB v1.0
> usbus2: 480Mbps High Speed USB v2.0
> usbus3: 12Mbps Full Speed USB v1.0
> usbus4: 12Mbps Full Speed USB v1.0
> usbus5: 12Mbps Full Speed USB v1.0
> usbus6: 480Mbps High Speed USB v2.0
> ZFS filesystem version 13
> ZFS storage pool version 13
> ugen0.1: at usbus0
> uhub0: on usbus0
> ugen1.1: at usbus1
> uhub1: on usbus1
> ugen2.1: at usbus2
> uhub2: on usbus2
> ugen3.1: at usbus3
> uhub3: on usbus3
> ugen4.1: at usbus4
> uhub4: on usbus4
> ugen5.1: at usbus5
> uhub5: on usbus5
> ugen6.1: at usbus6
> uhub6: on usbus6
> acd0: DVDR at ata0-master UDMA33
> ad4: 305245MB at ata2-master SATA150
> hdac0: HDA Codec #0: Analog Devices AD1984
> hdac0: HDA Codec #1: Conexant (Unknown)
> pcm0: at cad 0 nid 1 on hdac0
> pcm1: at cad 0 nid 1 on hdac0
> SMP: AP CPU #1 Launched!
> uhub0: 2 ports with 2 removable, self powered
> uhub1: 2 ports with 2 removable, self powered
> uhub3: 2 ports with 2 removable, self powered
> uhub4: 2 ports with 2 removable, self powered
> uhub5: 2 ports with 2 removable, self powered
> GEOM: ad4s1: geometry does not match label (255h,63s != 16h,63s).
> Root mount waiting for: usbus6 usbus2
> Root mount waiting for: usbus6 usbus2
> uhub2: 4 ports with 4 removable, self powered
> uhub6: 6 ports with 6 removable, self powered
> Root mount waiting for: usbus2
> Trying to mount root from ufs:/dev/ad4s1a
> ugen0.2: at usbus0
> ubt0: on usbus0
> ugen0.3: at usbus0
> wlan0: Ethernet address: 00:1d:e0:48:13:2f
> WARNING: attempt to net_add_domain(bluetooth) after domainfinalize()
> WARNING: attempt to net_add_domain(netgraph) after domainfinalize()
> wlan0: link state changed to UP
> iwn0: need multicast update callback
> iwn0: need multicast update callback
> iwn0: need multicast update callback
>
> Thanks,
> kevin
>

-- 
Mark Powell - UNIX System Administrator - The University of Salford
Information & Learning Services, Clifford Whitworth Building,
Salford University, Manchester, M5 4WT, UK.
Tel: +44 161 295 6843  Fax: +44 161 295 5888  www.pgp.com for PGP key