Date: Sun, 11 Jun 2017 16:51:13 +0000
From: "Caza, Aaron" <Aaron.Caza@ca.weatherford.com>
To: Allan Jude <allanjude@freebsd.org>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject: Re: FreeBSD10 Stable + ZFS + PostgreSQL + SSD performance drop < 24 hours
Message-ID: <a8523e8099404bd699525f8ff7763819@DM2PR58MB013.032d.mgd.msft.net>

Thanks Allan for the suggestions.

I tried gstat -d, but deletes (d/s) don't seem to be it, as the value stays at 0 despite vfs.zfs.trim.enabled=1. This is most likely due to the "layering" I use as, for historical reasons, I have GEOM ELI set up to essentially emulate 4k sectors regardless of the underlying media. I do my own alignment and partition sizing, and have the ZFS record size set to 8k for Postgres.

In gstat, the SSDs' %busy is 90-100% on startup after reboot. Once the performance degradation hits (<24 hours later), I'm seeing %busy at ~10%.

#!/bin/sh
psql --username=test --password=supersecret -h /db -d test << EOL
\timing on
select count(*) from test;
\q
EOL

Sample run of above script after reboot (before degradation hits) (Samsung 850 Pros in ZFS mirror):

Timing is on.
  count
----------
 21568508
(1 row)

Time: 57029.262 ms

Sample run of above script after degradation (Samsung 850 Pros in ZFS mirror):

Timing is on.
  count
----------
 21568508
(1 row)

Time: 583595.239 ms

(Uptime ~1 day in this particular case.)

Any other suggestions?

Regards,
A

-----Original Message-----
From: owner-freebsd-hackers@freebsd.org [mailto:owner-freebsd-hackers@freebsd.org] On Behalf Of Allan Jude
Sent: Saturday, June 10, 2017 9:40 PM
To: freebsd-hackers@freebsd.org
Subject: [EXTERNAL] Re: FreeBSD10 Stable + ZFS + PostgreSQL + SSD performance drop < 24 hours

On 06/10/2017 12:36, Slawa Olhovchenkov wrote:
> On Sat, Jun 10, 2017 at 04:25:59PM +0000, Caza, Aaron wrote:
>
>> Gents,
>>
>> I'm experiencing an issue where iterating over a PostgreSQL table of ~21.5 million rows (select count(*)) goes from ~35 seconds to ~635 seconds on Intel 540 SSDs. This is using a FreeBSD 10 amd64 stable kernel back from Jan 2017. SSDs are basically 2 drives in a ZFS mirrored zpool. I'm using PostgreSQL 9.5.7.
>>
>> I've tried:
>>
>> * Using the FreeBSD10 amd64 stable kernel snapshot of May 25, 2017.
>>
>> * Tested on half a dozen machines with different models of SSDs:
>>
>>   o Intel 510s (120GB) in ZFS mirrored pair
>>
>>   o Intel 520s (120GB) in ZFS mirrored pair
>>
>>   o Intel 540s (120GB) in ZFS mirrored pair
>>
>>   o Samsung 850 Pros (256GB) in ZFS mirrored pair
>>
>> * Using bonnie++ to remove Postgres from the equation and performance does indeed drop.
>>
>> * Rebooting server and immediately re-running test and performance is back to original.
>>
>> * Tried using Karl Denninger's patch from PR187594 (which took some work to find a kernel that the FreeBSD10 patch would both apply and compile cleanly against).
>>
>> * Tried disabling ZFS lz4 compression.
>>
>> * Ran the same test on a FreeBSD9.0 amd64 system using PostgreSQL 9.1.3 with 2 Intel 520s in ZFS mirrored pair. System had 165 days uptime and test took ~80 seconds after which I rebooted and re-ran test and was still at ~80 seconds (older processor and memory in this system).
>>
>> I realize that there's a whole lot of info I'm not including (dmesg, zfs-stats -a, gstat, et cetera): I'm hoping some enlightened individual will be able to point me to a solution with only the above to go on.
>
> Just a random guess: can you try r307264 (I mean a regression in
> r307266)?
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>

This sounds a bit like an issue I investigated for a customer a few months ago.

Look at gstat -d (includes DELETE operations like TRIM)

If you see a lot of that happening, try: vfs.zfs.trim.enabled=0 in /boot/loader.conf and see if your issues go away.

The FreeBSD TRIM code for ZFS basically waits until the sector has been free for a while (to avoid doing a TRIM on a block we'll immediately reuse), so your benchmark will run fine for a little while, then suddenly the TRIM will kick in.

For postgres, fio, bonnie++ etc, make sure the ZFS dataset you are storing the data on / benchmarking has a recordsize that matches the workload.

If you are doing a write-only benchmark, and you see lots of reads in gstat, you know you are having to do read/modify/writes, and that is why your performance is so bad.

--
Allan Jude
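
For anyone wanting to track the before/after numbers from the test script over time, here is a minimal sketch of two /bin/sh helpers. They are not from the thread: the function names extract_ms and slowdown are hypothetical, and the sketch assumes psql prints its timing as a line of the form "Time: N ms", as in the sample runs above.

```shell
#!/bin/sh
# Hypothetical helpers (not from the thread) for comparing psql "\timing" results.

# extract_ms: read psql output on stdin and print the millisecond value
# from each line of the form "Time: N ms".
extract_ms() {
    awk '$1 == "Time:" && $3 == "ms" { print $2 }'
}

# slowdown: print how many times slower the second timing is than the first,
# rounded to one decimal place. Both arguments are in milliseconds.
slowdown() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        if (before <= 0) { print "n/a"; exit 1 }
        printf "%.1f\n", after / before
    }'
}
```

Piping the output of the test script through extract_ms gives a bare number that can be appended to a log next to the uptime. With the two Samsung 850 Pro timings quoted above, slowdown 57029.262 583595.239 prints 10.2.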