Date: Wed, 25 Mar 2009 09:13:09 +0000 (GMT)
From: "Mark Powell" <M.S.Powell@salford.ac.uk>
To: kevin <kevinxlinuz@163.com>
Cc: freebsd-current@freebsd.org
Subject: Re: ZFS data error without reasons
Message-ID: <20090325090456.G92412@rust.salford.ac.uk>
In-Reply-To: <49BE4EC1.90207@163.com>
References: <49BD117B.2080706@163.com> <4F9C9299A10AE74E89EA580D14AA10A635E68A@royal64.emp.zapto.org> <49BE4EC1.90207@163.com>

Kevin,

Did you fix your ZFS CRC errors? I responded to your thread, but no-one got
back to me. I'm gonna start another thread later.

This time I re-made the zpool in 8 compatible with 7. Once the errors
started showing up in 8 I moved back to 7, on the same hardware, to perform
the scrub to prove the problem is with 8.

The 1st scrub in 7 found some errors, but of course it would if 8 had
messed up the data. Removed the few unimportant bad files (all were in
snapshots). Just performing the 2nd scrub in 7 now. If this comes back with
no errors, then we have stronger proof that there is something wrong, which
seems quite intermittent, in 8 that randomly writes bad data.

Cheers.

On Mon, 16 Mar 2009, kevin wrote:

> Daniel Eriksson wrote:
>> kevin wrote:
>>
>>
>>> Hi,
>>> Will any changes cause zfs data error?I find my disk data error without
>>> any reasons(shutdown or reboot normally).disk was bought
>>> yesterday.sometimes it can be fixed with a zpool scrub.but mostly zpool
>>> scrub will return more errors.Even i restore all zpool from my
>>> backup,without 5 mins,zpool status shows data error and many checksum
>>> errors.
>>>
>>
>> Is the drive connected to an "nVidia nForce MCP55 SATA300 controller"? I
>> have two machines with on-board MCP55 controllers. One of them works
>> perfectly, the other causes silent data corruption (each time I run a
>> zpool scrub it finds new checksum errors).
>>
>> If you also have an MCP55 controller then maybe this is related.
>>
>>
> My laptop is T61. RAM is also tested by memtest86+ and return no error.
> "zfs send tank/usr/home/kevin@2009-03-15-16:51:21|zfs receive backup/kevin"
> hangs system and i have to power off the machine.when the system up,i find
> file error in snapshot tank/usr/home/kevin@2009-03-15-16:51:21.when i destroy
> tank/usr/home/kevin@2009-03-15-16:51:21,then reboot system, i find more
> errors.
>
> #zpool status -v
>   pool: tank
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: scrub in progress for 0h10m, 96.10% done, 0h0m to go
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0     2
>           ad4s1d    ONLINE       0     0     4
>
> errors: Permanent errors have been detected in the following files:
>
>         /usr/bin/less
>         /usr/lib/libstdc++.so.6
>         /usr/bin/tbl
>         /usr/share/misc/termcap.db
>         /usr/bin/ssh-agent
>         /usr/local/bin/sudo
>         /usr/local/lib/libX11.so.6
>         /usr/home/kevin/memtest86+-2.11.iso
>
> when zpool scrub end.
> #zpool status -v
>   pool: tank
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: scrub completed after 0h10m with 2 errors on Mon Mar 16 21:01:12 2009
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0     2
>           ad4s1d    ONLINE       0     0     4
>
> errors: Permanent errors have been detected in the following files:
>
>         /usr/home/kevin/memtest86+-2.11.iso
>
> Should i just delete memtest86+-2.11.iso ?
>
> dmesg:
> Copyright (c) 1992-2009 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 8.0-CURRENT #0: Sun Mar 15 21:11:36 CST 2009
> root@datastream-laptop.people.163.org:/usr/obj/usr/src/sys/G8laptop
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Core(TM)2 Duo CPU T7700 @ 2.40GHz (2394.02-MHz K8-class
> CPU)
> Origin = "GenuineIntel" Id = 0x6fb Stepping = 11
> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> Features2=0xe3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
> AMD Features=0x20100800<SYSCALL,NX,LM>
> AMD Features2=0x1<LAHF>
> TSC: P-state invariant
> Cores per package: 2
> usable memory = 4210061312 (4015 MB)
> avail memory = 4039487488 (3852 MB)
> ACPI APIC Table: <LENOVO TP-7L >
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> cpu0 (BSP): APIC ID: 0
> cpu1 (AP): APIC ID: 1
> This module (opensolaris) contains code covered by the
> Common Development and Distribution License (CDDL)
> see http://opensolaris.org/os/licensing/opensolaris_license/
> ACPI Warning (tbfadt-0505): Optional field "Gpe1Block" has zero address or
> length: 0 102C/0 [20070320]
> ioapic0: Changing APIC ID to 1
> ioapic0 <Version 2.0> irqs 0-23 on motherboard
> kbd1 at kbdmux0
> acpi0: <LENOVO TP-7L> on motherboard
> acpi0: [ITHREAD]
> acpi_ec0: <Embedded Controller: GPE 0x12, ECDT> port 0x62,0x66 on acpi0
> acpi0: Power Button (fixed)
> acpi0: reservation of 0, a0000 (3) failed
> acpi0: reservation of 100000, bff00000 (3) failed
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
> acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
> Timecounter "HPET" frequency 14318180 Hz quality 900
> acpi_lid0: <Control Method Lid Switch> on acpi0
> acpi_button0: <Sleep Button> on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
> pci1: <ACPI PCI bus> on pcib1
> vgapci0: <VGA-compatible display> port 0x2000-0x207f mem
> 0xd2000000-0xd2ffffff,0xe0000000-0xefffffff,0xd0000000-0xd1ffffff irq 16 at
> device 0.0 on pci1
> em0: <Intel(R) PRO/1000 Network Connection 6.9.6> port 0x1840-0x185f mem
> 0xfe200000-0xfe21ffff,0xfe225000-0xfe225fff irq 20 at device 25.0 on pci0
> em0: Using MSI interrupt
> em0: [FILTER]
> em0: Ethernet address: 00:1c:25:1c:fb:d0
> uhci0: <Intel 82801H (ICH8) USB controller USB-D> port 0x1860-0x187f irq 20
> at device 26.0 on pci0
> uhci0: [ITHREAD]
> uhci0: LegSup = 0x0000
> usbus0: <Intel 82801H (ICH8) USB controller USB-D> on uhci0
> uhci1: <Intel 82801H (ICH8) USB controller USB-E> port 0x1880-0x189f irq 21
> at device 26.1 on pci0
> uhci1: [ITHREAD]
> uhci1: LegSup = 0x0000
> usbus1: <Intel 82801H (ICH8) USB controller USB-E> on uhci1
> ehci0: <Intel 82801H (ICH8) USB 2.0 controller USB2-B> mem
> 0xfe226c00-0xfe226fff irq 22 at device 26.7 on pci0
> ehci0: [ITHREAD]
> usbus2: EHCI version 1.0
> usbus2: <Intel 82801H (ICH8) USB 2.0 controller USB2-B> on ehci0
> hdac0: <Intel 82801H High Definition Audio Controller> mem
> 0xfe220000-0xfe223fff irq 17 at device 27.0 on pci0
> hdac0: HDA Driver Revision: 20090226_0129
> hdac0: [ITHREAD]
> pcib2: <ACPI PCI-PCI bridge> irq 20 at device 28.0 on pci0
> pci2: <ACPI PCI bus> on pcib2
> pci2: <memory> at device 0.0 (no driver attached)
> pcib3: <ACPI PCI-PCI bridge> irq 21 at device 28.1 on pci0
> pci3: <ACPI PCI bus> on pcib3
> iwn0: <Intel(R) PRO/Wireless 4965BGN> mem 0xd7dfe000-0xd7dfffff irq 17 at
> device 0.0 on pci3
> iwn0: Reg Domain: MoW1, address 00:1d:e0:48:13:2f
> iwn0: [ITHREAD]
> iwn0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
> iwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
> iwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps
> 36Mbps 48Mbps 54Mbps
> iwn0: 11na MCS: 15Mbps 30Mbps 45Mbps 60Mbps 90Mbps 120Mbps 135Mbps 150Mbps
> 30Mbps 60Mbps 90Mbps 120Mbps 180Mbps 240Mbps 270Mbps 300Mbps
> iwn0: 11ng MCS: 15Mbps 30Mbps 45Mbps 60Mbps 90Mbps 120Mbps 135Mbps 150Mbps
> 30Mbps 60Mbps 90Mbps 120Mbps 180Mbps 240Mbps 270Mbps 300Mbps
> pcib4: <ACPI PCI-PCI bridge> irq 22 at device 28.2 on pci0
> pci4: <ACPI PCI bus> on pcib4
> pcib5: <ACPI PCI-PCI bridge> irq 23 at device 28.3 on pci0
> pci5: <ACPI PCI bus> on pcib5
> pcib6: <ACPI PCI-PCI bridge> irq 20 at device 28.4 on pci0
> pci13: <ACPI PCI bus> on pcib6
> uhci2: <Intel 82801H (ICH8) USB controller USB-A> port 0x18a0-0x18bf irq 16
> at device 29.0 on pci0
> uhci2: [ITHREAD]
> uhci2: LegSup = 0x0000
> usbus3: <Intel 82801H (ICH8) USB controller USB-A> on uhci2
> uhci3: <Intel 82801H (ICH8) USB controller USB-B> port 0x18c0-0x18df irq 17
> at device 29.1 on pci0
> uhci3: [ITHREAD]
> uhci3: LegSup = 0x0000
> usbus4: <Intel 82801H (ICH8) USB controller USB-B> on uhci3
> uhci4: <Intel 82801H (ICH8) USB controller USB-C> port 0x18e0-0x18ff irq 18
> at device 29.2 on pci0
> uhci4: [ITHREAD]
> uhci4: LegSup = 0x0000
> usbus5: <Intel 82801H (ICH8) USB controller USB-C> on uhci4
> ehci1: <Intel 82801H (ICH8) USB 2.0 controller USB2-A> mem
> 0xfe227000-0xfe2273ff irq 19 at device 29.7 on pci0
> ehci1: [ITHREAD]
> usbus6: EHCI version 1.0
> usbus6: <Intel 82801H (ICH8) USB 2.0 controller USB2-A> on ehci1
> pcib7: <ACPI PCI-PCI bridge> at device 30.0 on pci0
> pci21: <ACPI PCI bus> on pcib7
> cbb0: <RF5C476 PCI-CardBus Bridge> mem 0xf8100000-0xf8100fff irq 16 at device
> 0.0 on pci21
> cardbus0: <CardBus bus> on cbb0
> pccard0: <16-bit PCCard bus> on cbb0
> cbb0: [FILTER]
> isab0: <PCI-ISA bridge> at device 31.0 on pci0
> isa0: <ISA bus> on isab0
> atapci0: <Intel ICH8M UDMA100 controller> port
> 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1830-0x183f at device 31.1 on pci0
> ata0: <ATA channel 0> on atapci0
> ata0: [ITHREAD]
> atapci1: <Intel AHCI controller> port
> 0x1c48-0x1c4f,0x1c1c-0x1c1f,0x1c40-0x1c47,0x1c18-0x1c1b,0x1c20-0x1c3f mem
> 0xfe226000-0xfe2267ff irq 16 at device 31.2 on pci0
> atapci1: [ITHREAD]
> atapci1: AHCI Version 01.10 controller with 3 ports PM not supported
> ata2: <ATA channel 0> on atapci1
> ata2: [ITHREAD]
> ata3: <ATA channel 2> on atapci1
> ata3: [ITHREAD]
> pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
> acpi_tz0: <Thermal Zone> on acpi0
> acpi_tz1: <Thermal Zone> on acpi0
> atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
> atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> atkbd0: [ITHREAD]
> psm0: <PS/2 Mouse> irq 12 on atkbdc0
> psm0: [GIANT-LOCKED]
> psm0: [ITHREAD]
> psm0: model Synaptics Touchpad, device ID 0
> battery0: <ACPI Control Method Battery> on acpi0
> acpi_acad0: <AC Adapter> on acpi0
> acpi_ibm0: <IBM ThinkPad ACPI Extras> on acpi0
> cpu0: <ACPI CPU> on acpi0
> coretemp0: <CPU On-Die Thermal Sensors> on cpu0
> est0: <Enhanced SpeedStep Frequency Control> on cpu0
> p4tcc0: <CPU Frequency Thermal Control> on cpu0
> cpu1: <ACPI CPU> on acpi0
> coretemp1: <CPU On-Die Thermal Sensors> on cpu1
> est1: <Enhanced SpeedStep Frequency Control> on cpu1
> p4tcc1: <CPU Frequency Thermal Control> on cpu1
> orm0: <ISA Option ROMs> at iomem
> 0xc0000-0xcefff,0xcf000-0xcffff,0xd0000-0xd0fff,0xe0000-0xeffff on isa0
> sc0: <System console> at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> WARNING: ZFS is considered to be an experimental feature in FreeBSD.
> Timecounters tick every 1.000 msec
> usbus0: 12Mbps Full Speed USB v1.0
> usbus1: 12Mbps Full Speed USB v1.0
> usbus2: 480Mbps High Speed USB v2.0
> usbus3: 12Mbps Full Speed USB v1.0
> usbus4: 12Mbps Full Speed USB v1.0
> usbus5: 12Mbps Full Speed USB v1.0
> usbus6: 480Mbps High Speed USB v2.0
> ZFS filesystem version 13
> ZFS storage pool version 13
> ugen0.1: <Intel> at usbus0
> uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
> ugen1.1: <Intel> at usbus1
> uhub1: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
> ugen2.1: <Intel> at usbus2
> uhub2: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus2
> ugen3.1: <Intel> at usbus3
> uhub3: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus3
> ugen4.1: <Intel> at usbus4
> uhub4: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus4
> ugen5.1: <Intel> at usbus5
> uhub5: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus5
> ugen6.1: <Intel> at usbus6
> uhub6: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus6
> acd0: DVDR <Optiarc DVD RW AD-7910A/1.D1> at ata0-master UDMA33
> ad4: 305245MB <Seagate ST9320320AS SD03> at ata2-master SATA150
> hdac0: HDA Codec #0: Analog Devices AD1984
> hdac0: HDA Codec #1: Conexant (Unknown)
> pcm0: <HDA Analog Devices AD1984 PCM #0 Analog> at cad 0 nid 1 on hdac0
> pcm1: <HDA Analog Devices AD1984 PCM #1 Digital> at cad 0 nid 1 on hdac0
> SMP: AP CPU #1 Launched!
> uhub0: 2 ports with 2 removable, self powered
> uhub1: 2 ports with 2 removable, self powered
> uhub3: 2 ports with 2 removable, self powered
> uhub4: 2 ports with 2 removable, self powered
> uhub5: 2 ports with 2 removable, self powered
> GEOM: ad4s1: geometry does not match label (255h,63s != 16h,63s).
> Root mount waiting for: usbus6 usbus2
> Root mount waiting for: usbus6 usbus2
> uhub2: 4 ports with 4 removable, self powered
> uhub6: 6 ports with 6 removable, self powered
> Root mount waiting for: usbus2
> Trying to mount root from ufs:/dev/ad4s1a
> ugen0.2: <Broadcom Corp> at usbus0
> ubt0: <Broadcom Corp BCM2045B, class 224/1, rev 2.00/1.00, addr 2> on usbus0
> ugen0.3: <STMicroelectronics> at usbus0
> wlan0: Ethernet address: 00:1d:e0:48:13:2f
> WARNING: attempt to net_add_domain(bluetooth) after domainfinalize()
> WARNING: attempt to net_add_domain(netgraph) after domainfinalize()
> wlan0: link state changed to UP
> iwn0: need multicast update callback
> iwn0: need multicast update callback
> iwn0: need multicast update callback
>
> Thanks,
> kevin
>
>
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>
>

-- 
Mark Powell - UNIX System Administrator - The University of Salford
Information & Learning Services, Clifford Whitworth Building,
Salford University, Manchester, M5 4WT, UK.
Tel: +44 161 295 6843  Fax: +44 161 295 5888  www.pgp.com for PGP key
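
The scrub-and-compare test described in this message depends on reading the
CKSUM counters and the list of damaged files out of `zpool status -v` after
each scrub. A minimal sketch of that bookkeeping in Python follows, assuming
the output has been saved as plain text with the "> " quote prefixes removed.
The function name `parse_zpool_status` and the `SAMPLE` text are illustrative
only and are not part of any ZFS tool; the sample is the second status report
quoted in this message.

```python
def parse_zpool_status(text):
    """Return (cksum_by_device, damaged_files) from `zpool status -v` text.

    Hypothetical helper: it only understands the simple layout quoted in
    this thread (one config table, plain integer counters).
    """
    cksum = {}
    files = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("config:"):
            section = "config"
            continue
        if line.startswith("errors:"):
            section = "errors"
            continue
        if section == "config":
            parts = line.split()
            # Device rows have five columns: NAME STATE READ WRITE CKSUM.
            # The header row is skipped because "CKSUM" is not a number.
            if len(parts) == 5 and parts[4].isdigit():
                cksum[parts[0]] = int(parts[4])
        elif section == "errors" and line.startswith("/"):
            files.append(line)
    return cksum, files


SAMPLE = """\
  pool: tank
 state: ONLINE
 scrub: scrub completed after 0h10m with 2 errors on Mon Mar 16 21:01:12 2009
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     2
          ad4s1d    ONLINE       0     0     4

errors: Permanent errors have been detected in the following files:

        /usr/home/kevin/memtest86+-2.11.iso
"""

if __name__ == "__main__":
    counters, damaged = parse_zpool_status(SAMPLE)
    print(counters)
    print(damaged)
```

Saving the result after each scrub and comparing the two file lists shows
whether a later scrub turned up damage that the earlier one did not. Counters
that are not plain integers are ignored by this sketch.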