From owner-freebsd-fs@FreeBSD.ORG Thu Jul 4 18:13:14 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id EDF1E6D0 for ; Thu, 4 Jul 2013 18:13:14 +0000 (UTC) (envelope-from joh.hendriks@gmail.com) Received: from mail-ve0-x22d.google.com (mail-ve0-x22d.google.com [IPv6:2607:f8b0:400c:c01::22d]) by mx1.freebsd.org (Postfix) with ESMTP id A12C91736 for ; Thu, 4 Jul 2013 18:13:14 +0000 (UTC) Received: by mail-ve0-f173.google.com with SMTP id jw11so1211848veb.32 for ; Thu, 04 Jul 2013 11:13:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=LutsGvdAbTlsyGktjsoFAS/ccrulQbojjoD7dOm9cN4=; b=sK1aUOE3jDy2fOBcgDKqttAKfC3/MJS7Sv1AJBYKkf6TQQ/a9jLKjp8dwOVTKHJOtT eIFlenJABXHNPewHLgF7DG7jrliE5vrPeUOOazAlL7VQSxjymFPOZC2GXzzaD+88PAha O1yrWu+z2vwSBjwwuV4Tgg4+D/YQH6Waerr2yyg2T0sjaJ1nvfPUY3x/XdvXKuwhMjcn SXydP5orDXZg5s0txnp4spM5SlMCygLjn3TzTlpcTF8CDadmfE8V6HrOz4hDrwxczpnm Hd27XWjgHo9TNBDjjomFnl5clLhDoSJaf0JjLD91QeyIfvdXhznCXJduyhGhhKLw/N4P HaEA== MIME-Version: 1.0 X-Received: by 10.220.162.200 with SMTP id w8mr3911720vcx.90.1372961594193; Thu, 04 Jul 2013 11:13:14 -0700 (PDT) Received: by 10.58.227.10 with HTTP; Thu, 4 Jul 2013 11:13:14 -0700 (PDT) In-Reply-To: References: <51D42107.1050107@digsys.bg> <2EF46A8C-6908-4160-BF99-EC610B3EA771@alumni.chalmers.se> <51D437E2.4060101@digsys.bg> <20130704000405.GA75529@icarus.home.lan> Date: Thu, 4 Jul 2013 20:13:14 +0200 Message-ID: Subject: Re: Slow resilvering with mirrored ZIL From: Johan Hendriks To: mxb Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 04 Jul 2013 18:13:15 -0000 Op donderdag 4 juli 2013 schreef mxb (mxb@alumni.chalmers.se) het volgende: > > Well, I'v got a lot of errors in dmesg regarding one of four disks, which > is not currently replaced. > After resilvering dropped to 2MB/s ( :O ), it stuck. I had to reboot > system. > It came up and now continues resilvering process at 9/s (9 what??? bytes? ) > > Here comes requested info: > > dmesg(plan was to re-new disks and then lift this sys up to STABLE): > > Copyright (c) 1992-2012 The FreeBSD Project. > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 > The Regents of the University of California. All rights reserved. > FreeBSD is a registered trademark of The FreeBSD Foundation. > FreeBSD 9.1-RELEASE-p1 #2: Fri Mar 8 10:30:38 CET 2013 > root@nas.home.unixconn.com:/usr/obj/usr/src/sys/GENERIC amd64 > module_register: module pci/em already exists! > Module pci/em failed to register: 17 > module_register: module pci/lem already exists! > Module pci/lem failed to register: 17 > CPU: Intel(R) Xeon(R) CPU X5460 @ 3.16GHz (3166.74-MHz K8-class > CPU) > Origin = "GenuineIntel" Id = 0x10676 Family = 6 Model = 17 Stepping > = 6 > > Features=0xbfebfbff > > Features2=0xce3bd > AMD Features=0x20100800 > AMD Features2=0x1 > TSC: P-state invariant, performance statistics > real memory = 34359738368 (32768 MB) > avail memory = 33095909376 (31562 MB) > Event timer "LAPIC" quality 400 > ACPI APIC Table: > FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs > FreeBSD/SMP: 1 package(s) x 4 core(s) > cpu0 (BSP): APIC ID: 0 > cpu1 (AP): APIC ID: 1 > cpu2 (AP): APIC ID: 2 > cpu3 (AP): APIC ID: 3 > ioapic0 irqs 0-23 on motherboard > ioapic1 irqs 24-47 on motherboard > kbd1 at kbdmux0 > ctl: CAM Target Layer loaded > acpi0: on motherboard > acpi0: Power Button (fixed) > cpu0: on acpi0 > cpu1: on acpi0 > cpu2: on acpi0 > cpu3: on acpi0 > atrtc0: port 0x70-0x71 irq 8 on acpi0 > Event timer "RTC" frequency 32768 Hz quality 0 > attimer0: port 0x40-0x43,0x50-0x53 irq 0 on acpi0 > Timecounter "i8254" frequency 1193182 Hz quality 0 > Event timer "i8254" frequency 1193182 Hz quality 100 > Timecounter "ACPI-fast" frequency 3579545 Hz quality 900 > acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0 > pcib0: port 0xcf8-0xcff on acpi0 > pci0: on pcib0 > pcib1: at device 2.0 on pci0 > pci1: on pcib1 > pcib2: irq 16 at device 0.0 on pci1 > pci2: on pcib2 > pcib3: irq 16 at device 0.0 on pci2 > pci3: on pcib3 > pcib4: at device 0.0 on pci3 > pci4: on pcib4 > pcib5: at device 0.2 on pci3 > pci5: on pcib5 > pcib6: irq 18 at device 2.0 on pci2 > pci6: on pcib6 > em0: port 0x2000-0x201f mem > 0xdc000000-0xdc01ffff irq 18 at device 0.0 on pci6 > em0: Using an MSI interrupt > em0: Ethernet address: 00:30:48:34:46:36 > em1: port 0x2020-0x203f mem > 0xdc020000-0xdc03ffff irq 19 at device 0.1 on pci6 > em1: Using an MSI interrupt > em1: Ethernet address: 00:30:48:34:46:37 > pcib7: at device 0.3 on pci1 > pci7: on pcib7 > pcib8: at device 4.0 on pci0 > pci8: on pcib8 > pcib9: at device 0.0 on pci8 > pci9: on pcib9 > bce0: mem > 0xd8000000-0xd9ffffff irq 16 at device 4.0 on pci9 > miibus0: on bce0 > brgphy0: PHY 1 on miibus0 > brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > bce0: Ethernet address: 00:1b:78:38:2f:28 > bce0: ASIC (0x57060020); Rev (A2); Bus (PCI-X, 64-bit, 100MHz); B/C > (1.9.6); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI) > Coal (RX:6,6,18,18; TX:20,20,80,80) > pcib10: at device 0.2 on pci8 > pci10: on pcib10 > bce1: mem > 0xda000000-0xdbffffff irq 17 at device 5.0 on pci10 > miibus1: on bce1 > brgphy1: PHY 1 on miibus1 > brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > bce1: Ethernet address: 00:1b:78:38:2f:2a > bce1: ASIC (0x57060020); Rev (A2); Bus (PCI-X, 64-bit, 100MHz); B/C > (1.9.6); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI) > Coal (RX:6,6,18,18; TX:20,20,80,80) > pcib11: at device 6.0 on pci0 > pci11: on pcib11 > pci0: at device 8.0 (no driver attached) > pcib12: irq 17 at device 28.0 on pci0 > pci12: on pcib12 > uhci0: port > 0x1800-0x181f irq 17 at device 29.0 on pci0 > usbus0 on uhci0 > uhci1: port > 0x1820-0x183f irq 19 at device 29.1 on pci0 > usbus1 on uhci1 > uhci2: port > 0x1840-0x185f irq 18 at device 29.2 on pci0 > usbus2 on uhci2 > ehci0: mem 0xdc500000-0xdc5003ff irq 17 > at device 29.7 on pci0 > usbus3: EHCI version 1.0 > usbus3 on ehci0 > pcib13: at device 30.0 on pci0 > pci13: on pcib13 > vgapci0: port 0x3000-0x30ff mem > 0xd0000000-0xd7ffffff,0xdc200000-0xdc20ffff irq 18 at device 1.0 on pci13 > isab0: at device 31.0 on pci0 > isa0: on isab0 > ahci0: port > 0x1890-0x1897,0x1884-0x1887,0x1888-0x188f,0x1880-0x1883,0x1860-0x187f mem > 0xdc500400-0xdc5007ff irq 19 at device 31.2 on pci0 > ahci0: AHCI v1.10 with 6 3Gbps ports, Port Multiplier supported > ahcich0: at channel 0 on ahci0 > ahcich1: at channel 1 on ahci0 > ahcich2: at channel 2 on ahci0 > ahcich3: at channel 3 on ahci0 > ahcich4: at channel 4 on ahci0 > ahcich5: at channel 5 on ahci0 > pci0: at device 31.3 (no driver attached) > acpi_button0: on acpi0 > atkbdc0: port 0x60,0x64 irq 1 on acpi0 > atkbd0: irq 1 on atkbdc0 > kbd0 at atkbd0 > atkbd0: [GIANT-LOCKED] > uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 > uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0 > fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0 > fdc0: does not respond > device_attach: fdc0 attach returned 6 > ppc0: port 0x378-0x37f,0x778-0x77f irq 7 drq 3 on acpi0 > ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode > ppc0: FIFO with 16/16/9 bytes threshold > ppbus0: on ppc0 > plip0: on ppbus0 > lpt0: on ppbus0 > lpt0: Interrupt-driven port > ppi0: on ppbus0 > orm0: at iomem > 0xc0000-0xcafff,0xcb000-0xccfff,0xcd000-0xce7ff,0xce800-0xcffff on isa0 > sc0: at flags 0x100 on isa0 > sc0: VGA <16 virtual consoles, flags=0x300> > vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 > fdc0: No FDOUT register! > est0: on cpu0 > est: CPU supports Enhanced Speedstep, but is not recognized. > est: cpu_vendor GenuineIntel, msr 4921492106004921 > device_attach: est0 attach returned 6 > p4tcc0: on cpu0 > est1: on cpu1 > est: CPU supports Enhanced Speedstep, but is not recognized. > est: cpu_vendor GenuineIntel, msr 4921492106004921 > device_attach: est1 attach returned 6 > p4tcc1: on cpu1 > est2: on cpu2 > est: CPU supports Enhanced Speedstep, but is not recognized. > est: cpu_vendor GenuineIntel, msr 4921492106004921 > device_attach: est2 attach returned 6 > p4tcc2: on cpu2 > est3: on cpu3 > est: CPU supports Enhanced Speedstep, but is not recognized. > est: cpu_vendor GenuineIntel, msr 4921492106004921 > device_attach: est3 attach returned 6 > p4tcc3: on cpu3 > ZFS filesystem version 5 > ZFS storage pool version 28 > Timecounters tick every 1.000 msec > usbus0: 12Mbps Full Speed USB v1.0 > usbus1: 12Mbps Full Speed USB v1.0 > usbus2: 12Mbps Full Speed USB v1.0 > usbus3: 480Mbps High Speed USB v2.0 > ugen0.1: at usbus0 > uhub0: on usbus0 > ugen1.1: at usbus1 > uhub1: on usbus1 > ugen2.1: at usbus2 > uhub2: on usbus2 > ugen3.1: at usbus3 > uhub3: on usbus3 > uhub0: 2 ports with 2 removable, self powered > uhub1: 2 ports with 2 removable, self powered > uhub2: 2 ports with 2 removable, self powered > bce0: bce_pulse(): Warning: bootcode thinks driver is absent! (bc_state = > 0x00000004) > bce1: bce_pulse(): Warning: bootcode thinks driver is absent! (bc_state = > 0x00000004) > uhub3: 6 ports with 6 removable, self powered > ada0 at ahcich0 bus 0 scbus0 target 0 lun 0 > ada0: ATA-8 SATA 3.x device > ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) > ada0: Command Queueing enabled > ada0: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C) > ada0: Previously was known as ad4 > ada1 at ahcich1 bus 0 scbus1 target 0 lun 0 > ada1: ATA-8 SATA 2.x device > ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) > ada1: Command Queueing enabled > ada1: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C) > ada1: Previously was known as ad6 > ada2 at ahcich2 bus 0 scbus2 target 0 lun 0 > ada2: ATA-8 SATA 2.x device > ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) > ada2: Command Queueing enabled > ada2: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C) > ada2: Previously was known as ad8 > ada3 at ahcich3 bus 0 scbus3 target 0 lun 0 > ada3: ATA-8 SATA 3.x device > ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) > ada3: Command Queueing enabled > ada3: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C) > ada3: Previously was known as ad10 > ada4 at ahcich4 bus 0 scbus4 target 0 lun 0 > ada4: ATA-7 SATA 2.x device > ada4: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) > ada4: Command Queueing enabled > ada4: 19087MB (39091248 512 byte sectors: 16H 63S/T 16383C) > ada4: Previously was known as ad12 > ada5 at ahcich5 bus 0 scbus5 target 0 lun 0 > ada5: ATA-8 SATA 3.x device > ada5: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) > ada5: Command Queueing enabled > ada5: 57241MB (117231408 512 byte sectors: 16H 63S/T 16383C) > ada5: Previously was known as ad14 > SMP: AP CPU #3 Launched! > SMP: AP CPU #1 Launched! > SMP: AP CPU #2 Launched! > Timecounter "TSC-low" frequency 12370070 Hz quality 1000 > Root mount waiting for: usbus3 > ugen0.2: at usbus0 > ukbd0: 2.00/5.35, addr 2> on usbus0 > kbd2 at ukbd0 > Root mount waiting for: usbus3 > ugen3.2: at usbus3 > umass0: on usbus3 > umass0: SCSI over Bulk-Only; quirks = 0x4100 > umass0:7:0:-1: Attached to scbus7 > Trying to mount root from zfs:NAS/system/root [rw]... > da0 at umass-sim0 bus 0 scbus7 target 0 lun 0 > da0: Removable Direct Access SCSI-0 device > da0: 40.000MB/s transfers > da0: 7580MB (15523840 512 byte sectors: 255H 63S/T 966C) > GEOM_PART: integrity check failed (label/zfs_boot, BSD) > GEOM_PART: integrity check failed (label/zfs_boot, BSD) > bce0: Gigabit link up! > bce1: Gigabit link up! > > zpool status: > pool: NAS > state: DEGRADED > status: One or more devices is currently being resilvered. The pool will > continue to function, possibly in a degraded state. > action: Wait for the resilver to complete. > scan: resilver in progress since Tue Jul 2 07:31:29 2013 > 483G scanned out of 3.49T at 10/s, (scan is slow, no estimated > time) > 121G resilvered, 13.52% done > config: > > NAME STATE READ WRITE CKSUM > NAS DEGRADED 0 0 0 > raidz1-0 DEGRADED 0 0 0 > replacing-0 DEGRADED 0 0 0 > 1160414651350745057 UNAVAIL 0 0 0 was > /dev/ada0/old > ada0 ONLINE 0 0 0 > (resilvering) > ada3 ONLINE 0 0 0 > ada1 ONLINE 0 0 0 > ada2 ONLINE 0 0 0 > logs > mirror-1 ONLINE 0 0 0 > ada4s1 ONLINE 0 0 0 > ada5s1 ONLINE 0 0 0 > cache > ada5s2 ONLINE 0 0 0 > > errors: No known data errors > > zpool get all: > NAME PROPERTY VALUE SOURCE > NAS size 3.64T - > NAS capacity 95% - > NAS altroot - default > NAS health DEGRADED - > NAS guid 3808946822857359331 default > NAS version 28 default > NAS bootfs - default > NAS delegation on default > NAS autoreplace off default > NAS cachefile - default > NAS failmode wait default > NAS listsnapshots off default > NAS autoexpand off default > NAS dedupditto 0 default > NAS dedupratio 1.00x - > NAS free 154G - > NAS allocated 3.49T - > NAS readonly off - > NAS comment - default > NAS expandsize 0 - > > zfs get all: > NAME PROPERTY VALUE SOURCE > NAS type filesystem - > NAS creation Tue Aug 3 20:10 2010 - > NAS used 2.61T - > NAS available 72.0G - > NAS referenced > > > > > On 4 jul 2013, at 02:04, Jeremy Chadwick wrote: > > > On Wed, Jul 03, 2013 at 05:12:17PM +0200, mxb wrote: > >> > >> Not sure if new are 4k. Done nothing about that. > >> But the SECOND drive, resilvering is SLOW. Not the first one. > >> > >> As stated below. Those changes are introduced to the system. > >> ALL new driver ARE identical, except S/N of cause :) > >> > >> On 3 jul 2013, at 16:55, Steven Hartland > wrote: > >> > >>> > >>> ----- Original Message ----- From: "Daniel Kalchev" > >>> To: "mxb" > >>> Cc: > >>> Sent: Wednesday, July 03, 2013 3:40 PM > >>> Subject: Re: Slow resilvering with mirrored ZIL > >>> > >>> > >>>> On 03.07.13 16:36, mxb wrote: > >>>>> Well, then my question persists - why I get so significant drop of > speed while resilvering second drive. > >>>>> The only changes to the system are: > >>>>> > >>>>> 1. Second partition for ZIL to create a mirror > >>>>> 2. New disks are 7200rpm. old ones are 5400rpm. > >>>>> > >>> > >>> Its not something like the old disks are 512byte sectors > >>> where as the new ones are 4k? > >>> > >>> It this is the case having already replaced one disk you've > >>> killed performance as its having to do lots more work reading > >>> none 4k aligned data? > > > > Not enough hard information to diagnose. What's needed is below; you > > can XXX out system names if you want, but please do not XXX out anything > > else. > > > > - Output from: dmesg > > > > - Output from: zpool status > > > > - Output from: zpool get all > > > > - Output from: zfs get all > > > > - Output from: "gpart show -p" for every disk on the system > > > > - Output from: cat /etc/sysctl.conf > > > > - Output from: cat /boot/loader.conf > > > > - Output from: "smartctl -a" for every disk that's used by ZFS > > (please use smartmontools 6.1 or newer in this case; install > > ports/sysutils/smartmontools) > > > > I can think o Could it be that your pool is at a capacity of 95%.. In a lot of cases this slows down a pool. Maybe this is what you see. Regards Johan