Date: Tue, 20 Jun 2017 02:31:59 +0100
From: Steven Hartland <killing@multiplay.co.uk>
To: freebsd-fs@freebsd.org
Subject: Re: FreeBSD 11.1 Beta 2 ZFS performance degradation on SSDs
Message-ID: <990886ae-b7c1-8630-1cef-f6678f0b5b63@multiplay.co.uk>
In-Reply-To: <431d958c658a408d8bfd4c574a565439@DM2PR58MB013.032d.mgd.msft.net>
References: <431d958c658a408d8bfd4c574a565439@DM2PR58MB013.032d.mgd.msft.net>

On 20/06/2017 01:57, Caza, Aaron wrote:
>> vfs.zfs.min_auto_ashift is a sysctl only; it's not a tunable, so setting it in /boot/loader.conf won't have any effect.
>>
>> There's no need for it to be a tunable as it only affects vdevs when they are created, which can only be done once the system is running.
>>
> The bsdinstall install script itself set vfs.zfs.min_auto_ashift=12 in /boot/loader.conf yet, as you say, this doesn't do anything. As a user, this is a bit confusing to see it in /boot/loader.conf but do a 'sysctl -a | grep min_auto_ashift' and see 'vfs.zfs.min_auto_ashift: 9' so felt it was worth mentioning.

Absolutely, patch is in review here:
https://reviews.freebsd.org/D11278

>
>> You don't explain why you believe there is degrading performance?
> As I related in my post, with my previous FreeBSD 11-Stable setup using this same hardware, I was seeing 950MB/s after bootup. I've been posting to the freebsd-hackers list, but have moved to freebsd-fs list as this seemingly has something to do with FreeBSD+ZFS behavior and user Jov had previously cross-posted to this list for me:
> https://docs.freebsd.org/cgi/getmsg.cgi?fetch=2905+0+archive/2017/freebsd-fs/20170618.freebsd-fs
>
> I've been using FreeBSD+ZFS ever since FreeBSD 9.0, admittedly, with a different zpool layout which is essentially as follows:
>   adaXp1 - gptboot loader
>   adaXp2 - 1GB UFS partition
>   adaXp3 - UFS with UUID labeled partition hosting a GEOM ELI layer using NULL encryption to emulate 4k sectors (done before ashift was an option)
>
> So, adaXp3 would show up as something like the following:
>
>   /dev/gpt/b62feb20-554b-11e7-989b-000bab332ee8
>   /dev/gpt/b62feb20-554b-11e7-989b-000bab332ee8.eli
>
> Then, the zpool mirrored pair would be something like the following:
>
>   pool: wwbase
>  state: ONLINE
>   scan: none requested
> config:
>
>         NAME                                              STATE     READ WRITE CKSUM
>         wwbase                                            ONLINE       0     0     0
>           mirror-0                                        ONLINE       0     0     0
>             gpt/b62feb20-554b-11e7-989b-000bab332ee8.eli  ONLINE       0     0     0
>             gpt/4c596d40-554c-11e7-beb1-002590766b41.eli  ONLINE       0     0     0
>
> Using the above zpool configuration on this same hardware on FreeBSD 11-Stable, I was seeing read speeds of 950MB/s using dd (dd if=/testdb/test of=/dev/null bs=1m). However, after anywhere from 5 to 24 hours, performance would degrade down to less than 100MB/s for unknown reasons - server was essentially idle so it's a mystery to me why this occurs. I'm seeing this behavior on FreeBSD 10.3R amd64 up through FreeBSD11.0 Stable. As I wasn't making any headway in resolving this, I opted today to use the FreeBSD11.1 Beta 2 memstick image to create a basic FreeBSD 11.1 Beta 2 amd64 Auto(ZFS) installation to see if this would resolve the original issue I was having as I would be using ZFS-on-root and vfs.zfs.min_auto_ashift=12 instead of my own emulation as described above. However, instead of seeing the 950MB/s that I expected - which is what I see with my alternative emulation - I'm seeing 450MB/s. I've yet to determine if this zpool setup as done by the bsdinstall script will suffer from the original performance degradation I observed.
>
>> What is the exact dd command you're running as that can have a huge impact on performance.
> dd if=/testdb/test of=/dev/null bs=1m
>
> Note that file /testdb/test is 16GB, twice the size of RAM available in this system. The /testdb directory is a ZFS file system with recordsize=8k, chosen as ultimately it's intended to host a PostgreSQL database which uses an 8k page size.
>
> My understanding is that a ZFS mirrored pool with two drives can read from both drives at the same time hence double the speed. This is what I've actually observed ever since I first started using this in FreeBSD 9.0 with the GEOM ELI 4k sector emulation. This is actually my first time using FreeBSD's native installer's Auto(ZFS) setup with 4k sectors emulated using vfs.zfs.min_auto_ashift=12.
> As it's a ZFS mirrored pool, I still expected it to be able to read at double-speed as it does with the GEOM ELI 4k sector emulation; however, it does not.
>
> /var/run/dmesg.boot:
> Copyright (c) 1992-2017 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>         The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 11.1-BETA2 #0 r319993: Fri Jun 16 02:32:38 UTC 2017
>     root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
> FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 4.0.0)
> VT(vga): resolution 640x480
> CPU: Intel(R) Xeon(R) CPU E31240 @ 3.30GHz (3292.60-MHz K8-class CPU)
>   Origin="GenuineIntel"  Id=0x206a7  Family=0x6  Model=0x2a  Stepping=7
>   Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>   Features2=0x1dbae3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,XSAVE,OSXSAVE,AVX>
>   AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
>   AMD Features2=0x1<LAHF>
>   XSAVE Features=0x1<XSAVEOPT>
>   VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
>   TSC: P-state invariant, performance statistics
> real memory  = 8589934592 (8192 MB)
> avail memory = 8232431616 (7851 MB)
> Event timer "LAPIC" quality 600
> ACPI APIC Table: <SUPERM SMCI--MB>
> FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
> FreeBSD/SMP: 1 package(s) x 4 core(s) x 2 hardware threads
> random: unblocking device.
> ioapic0 <Version 2.0> irqs 0-23 on motherboard
> SMP: AP CPU #1 Launched!
> SMP: AP CPU #7 Launched!
> SMP: AP CPU #2 Launched!
> SMP: AP CPU #3 Launched!
> SMP: AP CPU #4 Launched!
> SMP: AP CPU #6 Launched!
> SMP: AP CPU #5 Launched!
> Timecounter "TSC-low" frequency 1646297938 Hz quality 1000
> random: entropy device external interface
> kbd1 at kbdmux0
> netmap: loaded module
> module_register_init: MOD_LOAD (vesa, 0xffffffff80f5a190, 0) error 19
> nexus0
> vtvga0: <VT VGA driver> on motherboard
> cryptosoft0: <software crypto> on motherboard
> acpi0: <SUPERM SMCI--MB> on motherboard
> acpi0: Power Button (fixed)
> cpu0: <ACPI CPU> on acpi0
> cpu1: <ACPI CPU> on acpi0
> cpu2: <ACPI CPU> on acpi0
> cpu3: <ACPI CPU> on acpi0
> cpu4: <ACPI CPU> on acpi0
> cpu5: <ACPI CPU> on acpi0
> cpu6: <ACPI CPU> on acpi0
> cpu7: <ACPI CPU> on acpi0
> attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
> Timecounter "i8254" frequency 1193182 Hz quality 0
> Event timer "i8254" frequency 1193182 Hz quality 100
> atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
> Event timer "RTC" frequency 32768 Hz quality 0
> hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
> Timecounter "HPET" frequency 14318180 Hz quality 950
> Event timer "HPET" frequency 14318180 Hz quality 550
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pcib0: _OSC returned error 0x10
> pci0: <ACPI PCI bus> on pcib0
> em0: <Intel(R) PRO/1000 Network Connection 7.6.1-k> port 0xf020-0xf03f mem 0xfba00000-0xfba1ffff,0xfba24000-0xfba24fff irq 20 at device 25.0 on pci0
> em0: Using an MSI interrupt
> em0: Ethernet address: 00:25:90:76:6b:41
> em0: netmap queues/slots: TX 1/1024, RX 1/1024
> ehci0: <Intel Cougar Point USB 2.0 controller> mem 0xfba23000-0xfba233ff irq 16 at device 26.0 on pci0
> usbus0: EHCI version 1.0
> usbus0 on ehci0
> usbus0: 480Mbps High Speed USB v2.0
> pcib1: <ACPI PCI-PCI bridge> irq 17 at device 28.0 on pci0
> pci1: <ACPI PCI bus> on pcib1
> pcib2: <ACPI PCI-PCI bridge> irq 17 at device 28.4 on pci0
> pci2: <ACPI PCI bus> on pcib2
> em1: <Intel(R) PRO/1000 Network Connection 7.6.1-k> port 0xe000-0xe01f mem 0xfb900000-0xfb91ffff,0xfb920000-0xfb923fff irq 16 at device 0.0 on pci2
> em1: Using MSIX interrupts with 3 vectors
> em1: Ethernet address: 00:25:90:76:6b:40
> em1: netmap queues/slots: TX 1/1024, RX 1/1024
> ehci1: <Intel Cougar Point USB 2.0 controller> mem 0xfba22000-0xfba223ff irq 23 at device 29.0 on pci0
> usbus1: EHCI version 1.0
> usbus1 on ehci1
> usbus1: 480Mbps High Speed USB v2.0
> pcib3: <ACPI PCI-PCI bridge> at device 30.0 on pci0
> pci3: <ACPI PCI bus> on pcib3
> vgapci0: <VGA-compatible display> mem 0xfe000000-0xfe7fffff,0xfb800000-0xfb803fff,0xfb000000-0xfb7fffff irq 23 at device 3.0 on pci3
> vgapci0: Boot video device
> isab0: <PCI-ISA bridge> at device 31.0 on pci0
> isa0: <ISA bus> on isab0
> ahci0: <Intel Cougar Point AHCI SATA controller> port 0xf070-0xf077,0xf060-0xf063,0xf050-0xf057,0xf040-0xf043,0xf000-0xf01f mem 0xfba21000-0xfba217ff irq 19 at device 31.2 on pci0
> ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported
> ahcich0: <AHCI channel> at channel 0 on ahci0
> ahcich1: <AHCI channel> at channel 1 on ahci0
> ahciem0: <AHCI enclosure management bridge> on ahci0
> acpi_button0: <Power Button> on acpi0
> atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> psm0: <PS/2 Mouse> irq 12 on atkbdc0
> psm0: [GIANT-LOCKED]
> psm0: model IntelliMouse Explorer, device ID 4
> uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
> orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff on isa0
> ppc0: cannot reserve I/O port range
> est0: <Enhanced SpeedStep Frequency Control> on cpu0
> est1: <Enhanced SpeedStep Frequency Control> on cpu1
> est2: <Enhanced SpeedStep Frequency Control> on cpu2
> est3: <Enhanced SpeedStep Frequency Control> on cpu3
> est4: <Enhanced SpeedStep Frequency Control> on cpu4
> est5: <Enhanced SpeedStep Frequency Control> on cpu5
> est6: <Enhanced SpeedStep Frequency Control> on cpu6
> est7: <Enhanced SpeedStep Frequency Control> on cpu7
> ZFS filesystem version: 5
> ZFS storage pool version: features support (5000)
> Timecounters tick every 1.000 msec
> nvme cam probe device init
> ugen0.1: <Intel EHCI root HUB> at usbus0
> ugen1.1: <Intel EHCI root HUB> at usbus1
> uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus0
> uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
> ses0 at ahciem0 bus 0 scbus2 target 0 lun 0
> ses0: <AHCI SGPIO Enclosure 1.00 0001> SEMB S-E-S 2.00 device
> ses0: SEMB SES Device
> ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
> ada0: <Samsung SSD 850 PRO 256GB EXM03B6Q> ACS-2 ATA SATA 3.x device
> ada0: Serial Number S39KNB0HB00482Y
> ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
> ada0: Command Queueing enabled
> ada0: 244198MB (500118192 512 byte sectors)
> ada0: quirks=0x3<4K,NCQ_TRIM_BROKEN>
> ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
> ada1: <Samsung SSD 850 PRO 256GB EXM03B6Q> ACS-2 ATA SATA 3.x device
> ada1: Serial Number S39KNB0HB00473Z
> ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
> ada1: Command Queueing enabled
> ada1: 244198MB (500118192 512 byte sectors)
> ada1: quirks=0x3<4K,NCQ_TRIM_BROKEN>
> Trying to mount root from zfs:zroot/ROOT/default []...
> Root mount waiting for: usbus1 usbus0
> uhub0: 2 ports with 2 removable, self powered
> uhub1: 2 ports with 2 removable, self powered
> Root mount waiting for: usbus1 usbus0
> ugen0.2: <vendor 0x8087 product 0x0024> at usbus0
> uhub2 on uhub0
> uhub2: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on usbus0
> ugen1.2: <vendor 0x8087 product 0x0024> at usbus1
> uhub3 on uhub1
> uhub3: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on usbus1
> Root mount waiting for: usbus1 usbus0
> uhub2: 6 ports with 6 removable, self powered
> uhub3: 6 ports with 6 removable, self powered
> ugen1.3: <Weatherford SPD> at usbus1
> umodem0 on uhub3
> umodem0: <Weatherford SPD, class 2/0, rev 1.10/0.01, addr 3> on usbus1
> umodem0: data interface 1, has CM over data, has break
>
>> On 19/06/2017 23:14, Caza, Aaron wrote:
>>> I've been having a problem with FreeBSD ZFS SSD performance inexplicably degrading after < 24 hours uptime as described in a separate e-mail thread. In an effort to get down to basics, I've now performed a ZFS-on-Root install of FreeBSD 11.1 Beta 2 amd64 using the default Auto(ZFS) install using the default 4k sector emulation (vfs.zfs.min_auto_ashift=12) setting (no swap, not encrypted).
>>>
>>> Firstly, the vfs.zfs.min_auto_ashift=12 is set correctly in the /boot/loader.conf file, but doesn't appear to work because when I log in and do "sysctl -a | grep min_auto_ashift" it's set to 9 and not 12 as expected. I tried setting it to vfs.zfs.min_auto_ashift="12" in /boot/loader.conf but that didn't make any difference so I finally just added it to /etc/sysctl.conf where it seems to work. So, something needs to be changed to make this functionality work correctly.
>>>
>>> Next, after reboot I was expecting somewhere in the neighborhood of 950MB/s from the ZFS mirrored zpool of 2 Samsung 850 Pro 256GB SSDs that I'm using as I was previously seeing this with my previous FreeBSD 11-Stable setup which, admittedly, is different from the way the bsdinstall script does it. However, I'm seeing half that on bootup.
>>>
>>> Performance result:
>>> Starting 'dd' test of large file...please wait
>>> 16000+0 records in
>>> 16000+0 records out
>>> 16777216000 bytes transferred in 37.407043 secs (448504207 bytes/sec)

Can you show the output from gstat -pd during this dd please.

>>>
>>> Zpool Status:
>>>   pool: zroot
>>>  state: ONLINE
>>>   scan: none requested
>>> config:
>>>
>>>         NAME        STATE     READ WRITE CKSUM
>>>         zroot       ONLINE       0     0     0
>>>           mirror-0  ONLINE       0     0     0
>>>             ada0p2  ONLINE       0     0     0
>>>             ada1p2  ONLINE       0     0     0
>>>
>>> /boot/loader.conf:
>>> kern.geom.label.disk_ident.enable="0"
>>> kern.geom.label.gptid.enable="0"
>>> vfs.zfs.min_auto_ashift=12
>>> vfs.zfs.arc_min="50M"
>>> vfs.zfs.arc_max="51M"
>>> zfs_load="YES"
>>>
>>> /etc/sysctl.conf:
>>> vfs.zfs.min_auto_ashift=12
>>>
>>>
>>> Is this the expected behavior now in FreeBSD 11.1?
>>>
>>> --
>>> Aaron
> --
> Aaron
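
For reference, the dd summary line quoted in this thread can be turned into the MB/s figures being compared. The following is only a rough sketch in Python, not part of the original exchange: the helper name parse_dd_summary is made up for illustration, and it assumes that "MB/s" in the thread means decimal megabytes (10^6 bytes) and that bs=1m means 1048576-byte blocks.

```python
import re

# Summary line copied from the dd output quoted in this thread.
DD_SUMMARY = "16777216000 bytes transferred in 37.407043 secs (448504207 bytes/sec)"

# Baseline the poster reported on the older GELI-based layout (MB/s).
EXPECTED_MB_PER_S = 950.0


def parse_dd_summary(line):
    """Return (total_bytes, seconds, reported_bytes_per_sec) from a dd summary line.

    Hypothetical helper for illustration; the pattern only covers the
    summary format shown in the quoted output.
    """
    match = re.search(
        r"(\d+) bytes transferred in ([\d.]+) secs \((\d+) bytes/sec\)", line
    )
    if match is None:
        raise ValueError("not a dd summary line: %r" % line)
    return int(match.group(1)), float(match.group(2)), int(match.group(3))


total_bytes, seconds, reported_rate = parse_dd_summary(DD_SUMMARY)

# bs=1m is taken as 1 MiB blocks, so this should agree with "16000+0 records".
records = total_bytes // (1024 * 1024)

# Assumption: "MB/s" in the thread means decimal megabytes (10**6 bytes).
rate = total_bytes / seconds
mb_per_s = rate / 1e6
fraction_of_expected = mb_per_s / EXPECTED_MB_PER_S

print("records read:        %d" % records)
print("throughput:          %.1f MB/s" % mb_per_s)
print("fraction of 950MB/s: %.2f" % fraction_of_expected)
```

Under those assumptions the computed rate lines up with the "450MB/s" and "half that" descriptions in the quoted messages. Running the same calculation on a dd summary captured while gstat is watching the disks would make the two measurements easier to compare.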