Date:       Tue, 20 Jun 2017 18:50:27 +0000
From:       "Caza, Aaron" <Aaron.Caza@ca.weatherford.com>
To:         Karl Denninger <karl@denninger.net>, "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:    RE: FreeBSD 11.1 Beta 2 ZFS performance degradation on SSDs
Message-ID: <b7350cca59624e91abee6697aaf9e1b6@DM2PR58MB013.032d.mgd.msft.net>
> -----Original Message-----
> From: Karl Denninger [mailto:karl@denninger.net]
> Sent: Tuesday, June 20, 2017 11:58 AM
> To: freebsd-fs@freebsd.org
> Subject: Re: FreeBSD 11.1 Beta 2 ZFS performance degradation on SSDs
>
> On 6/20/2017 12:29, Caza, Aaron wrote:
> >> -----Original Message-----
> >> From: Karl Denninger [mailto:karl@denninger.net]
> >> Sent: Monday, June 19, 2017 7:28 PM
> >> To: freebsd-fs@freebsd.org
> >> Subject: Re: FreeBSD 11.1 Beta 2 ZFS performance degradation on SSDs
> >>
> >> Just one note below...
> >>
> >> On 6/19/2017 19:57, Caza, Aaron wrote:
> >>> Note that file /testdb/test is 16GB, twice the size of RAM available on this system.  The /testdb directory is a ZFS file system with recordsize=8k, chosen as ultimately it's intended to host a PostgreSQL database, which uses an 8k page size.
> >> Do not make this assumption blindly.  Yes, I know the docs say to set
> >> recordsize=8k, but this is something you need to benchmark against
> >> your actual working data set.
> >>
> >> MANY Postgres workloads are MUCH faster (2x or more!) if you use the
> >> default recordsize and lz4 compression -- including one I have in
> >> production and have extensively benchmarked.  The difference is NOT small...
> >> ....
> >>
> >> zroot/ticker  compressratio  1.53x                          -
> >> zroot/ticker  mounted        yes                            -
> >> zroot/ticker  quota          none                           default
> >> zroot/ticker  reservation    none                           default
> >> zroot/ticker  recordsize     128K                           default
> >> zroot/ticker  mountpoint     /usr/local/pgsql/data-ticker   local
> >> zroot/ticker  sharenfs       off                            default
> >> zroot/ticker  checksum       fletcher4                      inherited from zroot
> >> zroot/ticker  compression    lz4                            inherited from zroot
> >> zroot/ticker  atime          off                            inherited from zroot
> >>
> >> You may also want to consider setting logbias=throughput.  In some
> >> cases the improvement there can be quite material as well --
> >> depending on the insert/update traffic to the database in question.
> >>
> >> --
> >> Karl Denninger
> >> karl@denninger.net <mailto:karl@denninger.net>
> >> /The Market Ticker/
> >> /[S/MIME encrypted email preferred]/
> >
> > Thanks for the suggestions, Karl.  I'll investigate further after I resolve this performance degradation issue I'm experiencing.  I recently read another FreeBSD+ZFS+PostgreSQL user's Scale15x presentation (PostgreZFS, Sean Chittenden if I recall correctly), which also advised lz4 compression and a 16K page size rather than 8K with PostgreZFS.
> >
> > With regards to my performance woes, I was originally using PostgreSQL in my posts to freebsd-hackers@freebsd.org but started using 'dd' to remove it as a point of contention.  In attempting to resolve this issue, I tried using your patch to PR 187594 (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594).  It took a bit of effort to find a revision of FreeBSD 10-Stable to which your FreeBSD 10 patch would both apply and compile cleanly; however, it didn't resolve the issue I'm experiencing.
>
> I would not have expected my PR to impact this issue.
>
> I'm suspicious of a drive firmware interaction with your I/O pattern; SSDs
> are somewhat notorious for having that come up under certain workloads
> that involve a lot of writes.
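
(As an aside, for anyone following this thread: the tuning Karl describes above amounts to roughly the following zfs commands.  The dataset name zroot/pgdata is just a placeholder for illustration, not one of my actual datasets.)

    # keep recordsize at the 128K default rather than forcing 8k; enable lz4 and disable atime
    zfs create -o compression=lz4 -o atime=off zroot/pgdata
    # optionally bias the intent log toward throughput for heavy insert/update traffic
    zfs set logbias=throughput zroot/pgdata
    # confirm the resulting properties
    zfs get recordsize,compression,atime,logbias zroot/pgdata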

I've observed this performance degradation on 6 different hardware systems using 4 different SSD models (2x Intel 510 120GB, 2x Intel 520 120GB, 2x Intel 540 120GB, 2x Samsung 850 Pro) on FreeBSD 10.3-RELEASE, FreeBSD 10.3-RELEASE-p6, FreeBSD 10.3-RELEASE-p19, FreeBSD 10-Stable, FreeBSD 11.0-RELEASE, FreeBSD 11-Stable and now FreeBSD 11.1 Beta 2.  In this latest testing I'm not doing much in the way of writing -- only logging the output of the 'dd' command, along with 'zfs-stats -a' and 'uptime', once an hour.  It ran for ~20hrs before the performance drop kicked in, though why it happens is inexplicable, as this server isn't doing anything other than running this test hourly.

I have a FreeBSD 9.0 system using 2x Intel 520 120GB SSDs that doesn't exhibit this performance degradation, maintaining ~400MB/s speeds even after many days of uptime.  That system uses the GEOM ELI layer to provide 4k sector emulation for the mirrored zpool, as I previously described.

Interestingly, using the GEOM ELI layering, I was seeing the following:

    - FreeBSD 10.3-RELEASE : ~750MB/s when dd'ing the 16GB file
    - FreeBSD 10-Stable    : ~850MB/s when dd'ing the 16GB file
    - FreeBSD 11-Stable    : ~950MB/s when dd'ing the 16GB file

During the above testing, which was all done after a reboot, gstat would show %busy of 90-95%.  When the performance degradation hits, %busy drops to ~15%.

Switching to FreeBSD 11.1 Beta 2 with Auto (ZFS) ashift-based 4k emulation of the ZFS mirrored pool:

    - FreeBSD 11.1 Beta 2  : ~450MB/s when dd'ing the 16GB file, with gstat %busy of ~60%.  When the performance degradation hits, %busy drops to ~15%.

Now, I expected that removing the GEOM ELI layer and just using vfs.zfs.min_auto_ashift=12 to do the 4k sector emulation would provide even better performance.  It seems strange to me that it doesn't.

> --
> Karl Denninger
> karl@denninger.net <mailto:karl@denninger.net>
> /The Market Ticker

--
Aaron
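
P.S.  In case it helps anyone reproduce this, the hourly test boils down to something like the following (the dd block size shown here is illustrative, not the exact invocation from my script):

    # sequential read of the 16GB test file (twice the RAM on this box), logged once an hour
    dd if=/testdb/test of=/dev/null bs=1m
    # logged alongside each run
    zfs-stats -a
    uptime
    # watched interactively to see per-disk %busy while the read is in progress
    gstat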