Date:      Tue, 20 Jun 2017 02:31:59 +0100
From:      Steven Hartland <killing@multiplay.co.uk>
To:        freebsd-fs@freebsd.org
Subject:   Re: FreeBSD 11.1 Beta 2 ZFS performance degradation on SSDs
Message-ID:  <990886ae-b7c1-8630-1cef-f6678f0b5b63@multiplay.co.uk>
In-Reply-To: <431d958c658a408d8bfd4c574a565439@DM2PR58MB013.032d.mgd.msft.net>
References:  <431d958c658a408d8bfd4c574a565439@DM2PR58MB013.032d.mgd.msft.net>

On 20/06/2017 01:57, Caza, Aaron wrote:
>> vfs.zfs.min_auto_ashift is a sysctl only; it's not a tunable, so setting it in /boot/loader.conf won't have any effect.
>>
>> There's no need for it to be a tunable, as it only affects vdevs when they are created, which can only be done once the system is running.
>>
> The bsdinstall install script itself sets vfs.zfs.min_auto_ashift=12 in /boot/loader.conf yet, as you say, this doesn't do anything.  As a user, it is a bit confusing to see it in /boot/loader.conf but then run 'sysctl -a | grep min_auto_ashift' and see 'vfs.zfs.min_auto_ashift: 9', so I felt it was worth mentioning.
Absolutely, patch is in review here: https://reviews.freebsd.org/D11278
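
For context on the two values being compared: ashift is the base-2 logarithm of the sector size ZFS assumes for a vdev, so the 9 reported by sysctl corresponds to 512-byte sectors and the intended 12 to 4k sectors. A minimal Python sketch of that arithmetic (the helper name is made up for illustration and is not part of any ZFS tool):

```python
def ashift_to_sector_size(ashift: int) -> int:
    """Return the sector size in bytes implied by a ZFS ashift value."""
    if ashift < 0:
        raise ValueError("ashift must be non-negative")
    # ashift is log2 of the sector size, so the size is 2**ashift.
    return 1 << ashift


# The two values that appear in this thread.
for ashift in (9, 12):
    print(f"ashift={ashift} -> {ashift_to_sector_size(ashift)} bytes")
```

Since the setting only matters at vdev creation time, what counts is the value in effect when the pool was created, not the value seen after boot.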
>
>> You don't explain why you believe there is degrading performance?
> As I related in my post, with my previous FreeBSD 11-Stable setup on this same hardware I was seeing 950MB/s after bootup.  I've been posting to the freebsd-hackers list, but have moved to the freebsd-fs list as this seemingly has something to do with FreeBSD+ZFS behavior, and user Jov had previously cross-posted to this list for me:
> https://docs.freebsd.org/cgi/getmsg.cgi?fetch=2905+0+archive/2017/freebsd-fs/20170618.freebsd-fs
>
> I've been using FreeBSD+ZFS ever since FreeBSD 9.0, admittedly with a different zpool layout, which is essentially as follows:
>      adaXp1 - gptboot loader
>      adaXp2 - 1GB UFS partition
>      adaXp3 - UFS with UUID labeled partition hosting a GEOM ELI layer using NULL encryption to emulate 4k sectors (done before ashift was an option)
>
> So, adaXp3 would show up as something like the following:
>
>    /dev/gpt/b62feb20-554b-11e7-989b-000bab332ee8
>    /dev/gpt/b62feb20-554b-11e7-989b-000bab332ee8.eli
>
> Then, the zpool mirrored pair would be something like the following:
>
>    pool: wwbase
>   state: ONLINE
>    scan: none requested
> config:
>
>          NAME                                              STATE     READ WRITE CKSUM
>          wwbase                                            ONLINE       0     0     0
>            mirror-0                                        ONLINE       0     0     0
>              gpt/b62feb20-554b-11e7-989b-000bab332ee8.eli  ONLINE       0     0     0
>              gpt/4c596d40-554c-11e7-beb1-002590766b41.eli  ONLINE       0     0     0
>
> Using the above zpool configuration on this same hardware on FreeBSD 11-Stable, I was seeing read speeds of 950MB/s using dd (dd if=/testdb/test of=/dev/null bs=1m).  However, after anywhere from 5 to 24 hours, performance would degrade to less than 100MB/s for unknown reasons - the server was essentially idle, so it's a mystery to me why this occurs.  I'm seeing this behavior on FreeBSD 10.3R amd64 up through FreeBSD 11.0 Stable.  As I wasn't making any headway in resolving this, I opted today to use the FreeBSD 11.1 Beta 2 memstick image to create a basic FreeBSD 11.1 Beta 2 amd64 Auto(ZFS) installation to see if this would resolve the original issue, as I would be using ZFS-on-root and vfs.zfs.min_auto_ashift=12 instead of my own emulation as described above.  However, instead of seeing the 950MB/s that I expected - which is what I see with my alternative emulation - I'm seeing 450MB/s.  I've yet to determine if this zpool setup as done by the bsdinstall script will suffer from the original performance degradation I observed.
>
>> What is the exact dd command you're running, as that can have a huge impact on performance?
> dd if=/testdb/test of=/dev/null bs=1m
>
> Note that file /testdb/test is 16GB, twice the size of RAM available in this system.  The /testdb directory is a ZFS file system with recordsize=8k, chosen as ultimately it's intended to host a PostgreSQL database, which uses an 8k page size.
>
> My understanding is that a ZFS mirrored pool with two drives can read from both drives at the same time, hence double the speed.  This is what I've actually observed ever since I first started using this in FreeBSD 9.0 with the GEOM ELI 4k sector emulation.  This is actually my first time using the FreeBSD native installer's Auto(ZFS) setup with 4k sectors emulated using vfs.zfs.min_auto_ashift=12.  As it's a ZFS mirrored pool, I still expected it to be able to read at double speed as it does with the GEOM ELI 4k sector emulation; however, it does not.
>
> /var/run/dmesg.boot:
> Copyright (c) 1992-2017 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 11.1-BETA2 #0 r319993: Fri Jun 16 02:32:38 UTC 2017
>      root@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
> FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 4.0.0)
> VT(vga): resolution 640x480
> CPU: Intel(R) Xeon(R) CPU E31240 @ 3.30GHz (3292.60-MHz K8-class CPU)
>    Origin="GenuineIntel"  Id=0x206a7  Family=0x6  Model=0x2a  Stepping=7
>    Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>    Features2=0x1dbae3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,XSAVE,OSXSAVE,AVX>
>    AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
>    AMD Features2=0x1<LAHF>
>    XSAVE Features=0x1<XSAVEOPT>
>    VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
>    TSC: P-state invariant, performance statistics
> real memory  = 8589934592 (8192 MB)
> avail memory = 8232431616 (7851 MB)
> Event timer "LAPIC" quality 600
> ACPI APIC Table: <SUPERM SMCI--MB>
> FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
> FreeBSD/SMP: 1 package(s) x 4 core(s) x 2 hardware threads
> random: unblocking device.
> ioapic0 <Version 2.0> irqs 0-23 on motherboard
> SMP: AP CPU #1 Launched!
> SMP: AP CPU #7 Launched!
> SMP: AP CPU #2 Launched!
> SMP: AP CPU #3 Launched!
> SMP: AP CPU #4 Launched!
> SMP: AP CPU #6 Launched!
> SMP: AP CPU #5 Launched!
> Timecounter "TSC-low" frequency 1646297938 Hz quality 1000
> random: entropy device external interface
> kbd1 at kbdmux0
> netmap: loaded module
> module_register_init: MOD_LOAD (vesa, 0xffffffff80f5a190, 0) error 19
> nexus0
> vtvga0: <VT VGA driver> on motherboard
> cryptosoft0: <software crypto> on motherboard
> acpi0: <SUPERM SMCI--MB> on motherboard
> acpi0: Power Button (fixed)
> cpu0: <ACPI CPU> on acpi0
> cpu1: <ACPI CPU> on acpi0
> cpu2: <ACPI CPU> on acpi0
> cpu3: <ACPI CPU> on acpi0
> cpu4: <ACPI CPU> on acpi0
> cpu5: <ACPI CPU> on acpi0
> cpu6: <ACPI CPU> on acpi0
> cpu7: <ACPI CPU> on acpi0
> attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
> Timecounter "i8254" frequency 1193182 Hz quality 0
> Event timer "i8254" frequency 1193182 Hz quality 100
> atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
> Event timer "RTC" frequency 32768 Hz quality 0
> hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
> Timecounter "HPET" frequency 14318180 Hz quality 950
> Event timer "HPET" frequency 14318180 Hz quality 550
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pcib0: _OSC returned error 0x10
> pci0: <ACPI PCI bus> on pcib0
> em0: <Intel(R) PRO/1000 Network Connection 7.6.1-k> port 0xf020-0xf03f mem 0xfba00000-0xfba1ffff,0xfba24000-0xfba24fff irq 20 at device 25.0 on pci0
> em0: Using an MSI interrupt
> em0: Ethernet address: 00:25:90:76:6b:41
> em0: netmap queues/slots: TX 1/1024, RX 1/1024
> ehci0: <Intel Cougar Point USB 2.0 controller> mem 0xfba23000-0xfba233ff irq 16 at device 26.0 on pci0
> usbus0: EHCI version 1.0
> usbus0 on ehci0
> usbus0: 480Mbps High Speed USB v2.0
> pcib1: <ACPI PCI-PCI bridge> irq 17 at device 28.0 on pci0
> pci1: <ACPI PCI bus> on pcib1
> pcib2: <ACPI PCI-PCI bridge> irq 17 at device 28.4 on pci0
> pci2: <ACPI PCI bus> on pcib2
> em1: <Intel(R) PRO/1000 Network Connection 7.6.1-k> port 0xe000-0xe01f mem 0xfb900000-0xfb91ffff,0xfb920000-0xfb923fff irq 16 at device 0.0 on pci2
> em1: Using MSIX interrupts with 3 vectors
> em1: Ethernet address: 00:25:90:76:6b:40
> em1: netmap queues/slots: TX 1/1024, RX 1/1024
> ehci1: <Intel Cougar Point USB 2.0 controller> mem 0xfba22000-0xfba223ff irq 23 at device 29.0 on pci0
> usbus1: EHCI version 1.0
> usbus1 on ehci1
> usbus1: 480Mbps High Speed USB v2.0
> pcib3: <ACPI PCI-PCI bridge> at device 30.0 on pci0
> pci3: <ACPI PCI bus> on pcib3
> vgapci0: <VGA-compatible display> mem 0xfe000000-0xfe7fffff,0xfb800000-0xfb803fff,0xfb000000-0xfb7fffff irq 23 at device 3.0 on pci3
> vgapci0: Boot video device
> isab0: <PCI-ISA bridge> at device 31.0 on pci0
> isa0: <ISA bus> on isab0
> ahci0: <Intel Cougar Point AHCI SATA controller> port 0xf070-0xf077,0xf060-0xf063,0xf050-0xf057,0xf040-0xf043,0xf000-0xf01f mem 0xfba21000-0xfba217ff irq 19 at device 31.2 on pci0
> ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported
> ahcich0: <AHCI channel> at channel 0 on ahci0
> ahcich1: <AHCI channel> at channel 1 on ahci0
> ahciem0: <AHCI enclosure management bridge> on ahci0
> acpi_button0: <Power Button> on acpi0
> atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> psm0: <PS/2 Mouse> irq 12 on atkbdc0
> psm0: [GIANT-LOCKED]
> psm0: model IntelliMouse Explorer, device ID 4
> uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
> orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff on isa0
> ppc0: cannot reserve I/O port range
> est0: <Enhanced SpeedStep Frequency Control> on cpu0
> est1: <Enhanced SpeedStep Frequency Control> on cpu1
> est2: <Enhanced SpeedStep Frequency Control> on cpu2
> est3: <Enhanced SpeedStep Frequency Control> on cpu3
> est4: <Enhanced SpeedStep Frequency Control> on cpu4
> est5: <Enhanced SpeedStep Frequency Control> on cpu5
> est6: <Enhanced SpeedStep Frequency Control> on cpu6
> est7: <Enhanced SpeedStep Frequency Control> on cpu7
> ZFS filesystem version: 5
> ZFS storage pool version: features support (5000)
> Timecounters tick every 1.000 msec
> nvme cam probe device init
> ugen0.1: <Intel EHCI root HUB> at usbus0
> ugen1.1: <Intel EHCI root HUB> at usbus1
> uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus0
> uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
> ses0 at ahciem0 bus 0 scbus2 target 0 lun 0
> ses0: <AHCI SGPIO Enclosure 1.00 0001> SEMB S-E-S 2.00 device
> ses0: SEMB SES Device
> ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
> ada0: <Samsung SSD 850 PRO 256GB EXM03B6Q> ACS-2 ATA SATA 3.x device
> ada0: Serial Number S39KNB0HB00482Y
> ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
> ada0: Command Queueing enabled
> ada0: 244198MB (500118192 512 byte sectors)
> ada0: quirks=0x3<4K,NCQ_TRIM_BROKEN>
> ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
> ada1: <Samsung SSD 850 PRO 256GB EXM03B6Q> ACS-2 ATA SATA 3.x device
> ada1: Serial Number S39KNB0HB00473Z
> ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
> ada1: Command Queueing enabled
> ada1: 244198MB (500118192 512 byte sectors)
> ada1: quirks=0x3<4K,NCQ_TRIM_BROKEN>
> Trying to mount root from zfs:zroot/ROOT/default []...
> Root mount waiting for: usbus1 usbus0
> uhub0: 2 ports with 2 removable, self powered
> uhub1: 2 ports with 2 removable, self powered
> Root mount waiting for: usbus1 usbus0
> ugen0.2: <vendor 0x8087 product 0x0024> at usbus0
> uhub2 on uhub0
> uhub2: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on usbus0
> ugen1.2: <vendor 0x8087 product 0x0024> at usbus1
> uhub3 on uhub1
> uhub3: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on usbus1
> Root mount waiting for: usbus1 usbus0
> uhub2: 6 ports with 6 removable, self powered
> uhub3: 6 ports with 6 removable, self powered
> ugen1.3: <Weatherford SPD> at usbus1
> umodem0 on uhub3
> umodem0: <Weatherford SPD, class 2/0, rev 1.10/0.01, addr 3> on usbus1
> umodem0: data interface 1, has CM over data, has break
>
>> On 19/06/2017 23:14, Caza, Aaron wrote:
>>> I've been having a problem with FreeBSD ZFS SSD performance inexplicably degrading after < 24 hours uptime, as described in a separate e-mail thread.  In an effort to get down to basics, I've now performed a ZFS-on-Root install of FreeBSD 11.1 Beta 2 amd64 using the default Auto(ZFS) install with the default 4k sector emulation (vfs.zfs.min_auto_ashift=12) setting (no swap, not encrypted).
>>>
>>> Firstly, vfs.zfs.min_auto_ashift=12 is set correctly in the /boot/loader.conf file, but doesn't appear to work, because when I log in and do "sysctl -a | grep min_auto_ashift" it's set to 9 and not 12 as expected.  I tried setting it to vfs.zfs.min_auto_ashift="12" in /boot/loader.conf but that didn't make any difference, so I finally just added it to /etc/sysctl.conf where it seems to work.  So, something needs to be changed to make this functionality work correctly.
>>>
>>> Next, after reboot I was expecting somewhere in the neighborhood of 950MB/s from the ZFS mirrored zpool of 2 Samsung 850 Pro 256GB SSDs that I'm using, as I was seeing this before with my previous FreeBSD 11-Stable setup which, admittedly, is different from the way the bsdinstall script does it.  However, I'm seeing half that on bootup.
>>>
>>> Performance result:
>>> Starting 'dd' test of large file...please wait
>>> 16000+0 records in
>>> 16000+0 records out
>>> 16777216000 bytes transferred in 37.407043 secs (448504207 bytes/sec)
Can you show the output from gstat -pd during this dd, please?
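
As a sanity check on the figures quoted above, the dd result can be recomputed from the byte count and elapsed time and set against the 600MB/s per-drive SATA link rate shown in dmesg and the 950MB/s reported on the older setup. A rough Python sketch; the variable names are mine, and it assumes bs=1m means 1 MiB blocks and that MB/s means 10^6 bytes per second:

```python
# Figures copied from the quoted dd output and dmesg; nothing is measured here.
RECORDS = 16000
BLOCK_SIZE = 1024 * 1024      # dd bs=1m, assumed to be 1 MiB
ELAPSED_SECS = 37.407043      # as printed by dd (rounded to microseconds)
SATA_LINK_MB_S = 600.0        # "600.000MB/s transfers" per drive in dmesg
PREVIOUS_MB_S = 950.0         # rate reported with the GEOM ELI layout

total_bytes = RECORDS * BLOCK_SIZE
rate_bytes_s = total_bytes / ELAPSED_SECS
rate_mb_s = rate_bytes_s / 1e6

print(f"total bytes: {total_bytes}")
print(f"rate: {rate_mb_s:.1f} MB/s")
print(f"fraction of previous rate: {rate_mb_s / PREVIOUS_MB_S:.0%}")
print(f"within one drive's link rate: {rate_mb_s <= SATA_LINK_MB_S}")
print(f"previous rate exceeds one link: {PREVIOUS_MB_S > SATA_LINK_MB_S}")
```

The new figure fits within what a single SATA link can carry, whereas 950MB/s can only come from reading both sides of the mirror at once. The numbers alone don't show how the reads are spread across the two disks, which is what the per-disk gstat output would show.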
>>>
>>> Zpool Status:
>>>   pool: zroot
>>>  state: ONLINE
>>>   scan: none requested
>>> config:
>>>
>>>           NAME        STATE     READ WRITE CKSUM
>>>           zroot       ONLINE       0     0     0
>>>             mirror-0  ONLINE       0     0     0
>>>               ada0p2  ONLINE       0     0     0
>>>               ada1p2  ONLINE       0     0     0
>>>
>>> /boot/loader.conf:
>>> kern.geom.label.disk_ident.enable="0"
>>> kern.geom.label.gptid.enable="0"
>>> vfs.zfs.min_auto_ashift=12
>>> vfs.zfs.arc_min="50M"
>>> vfs.zfs.arc_max="51M"
>>> zfs_load="YES"
>>>
>>> /etc/sysctl.conf:
>>> vfs.zfs.min_auto_ashift=12
>>>
>>>
>>> Is this the expected behavior now in FreeBSD 11.1?
>>>
>>> --
>>> Aaron
> --
> Aaron
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"



