From: Allan Jude
To: freebsd-hackers@freebsd.org
Subject: Re: FreeBSD10 Stable + ZFS + PostgreSQL + SSD performance drop < 24 hours
Date: Sat, 10 Jun 2017 23:39:34 -0400
In-Reply-To: <20170610163642.GA18123@zxy.spb.ru>
References: <79528bf7a85a47079756dc508130360b@DM2PR58MB013.032d.mgd.msft.net> <20170610163642.GA18123@zxy.spb.ru>

On 06/10/2017 12:36, Slawa Olhovchenkov wrote:
> On Sat, Jun 10, 2017 at 04:25:59PM +0000, Caza, Aaron wrote:
>
>> Gents,
>>
>> I'm experiencing an issue where iterating over a PostgreSQL table of
>> ~21.5 million rows (select count(*)) goes from ~35 seconds to ~635
>> seconds on Intel 540 SSDs. This is using a FreeBSD 10 amd64 stable
>> kernel from Jan 2017. The SSDs are two drives in a ZFS mirrored zpool.
>> I'm using PostgreSQL 9.5.7.
>>
>> I've tried:
>>
>> * Using the FreeBSD10 amd64 stable kernel snapshot of May 25, 2017.
>>
>> * Testing on half a dozen machines with different models of SSDs:
>>
>>   o Intel 510s (120GB) in ZFS mirrored pair
>>   o Intel 520s (120GB) in ZFS mirrored pair
>>   o Intel 540s (120GB) in ZFS mirrored pair
>>   o Samsung 850 Pros (256GB) in ZFS mirrored pair
>>
>> * Using bonnie++ to remove Postgres from the equation; performance
>>   does indeed drop.
>>
>> * Rebooting the server and immediately re-running the test;
>>   performance is back to the original.
>>
>> * Using Karl Denninger's patch from PR187594 (which took some work to
>>   find a kernel that the FreeBSD10 patch would both apply and compile
>>   cleanly against).
>>
>> * Disabling ZFS lz4 compression.
>>
>> * Running the same test on a FreeBSD9.0 amd64 system using PostgreSQL
>>   9.1.3 with 2 Intel 520s in a ZFS mirrored pair. The system had 165
>>   days of uptime and the test took ~80 seconds, after which I rebooted,
>>   re-ran the test, and was still at ~80 seconds (older processor and
>>   memory in this system).
>>
>> I realize there's a whole lot of info I'm not including (dmesg,
>> zfs-stats -a, gstat, et cetera): I'm hoping some enlightened
>> individual will be able to point me to a solution with only the above
>> to go on.
>
> Just a random guess: can you try r307264 (I mean the regression in
> r307266)?
>

This sounds a bit like an issue I investigated for a customer a few months ago.
Look at gstat -d (which includes DELETE operations like TRIM). If you see a lot of those happening, try:

vfs.zfs.trim.enabled=0

in /boot/loader.conf and see if your issue goes away. The FreeBSD TRIM code for ZFS basically waits until a sector has been free for a while (to avoid doing a TRIM on a block we'll immediately reuse), so your benchmark will run fine for a little while, then suddenly the TRIM will kick in.

For postgres, fio, bonnie++, etc., make sure the ZFS dataset you are storing the data on / benchmarking has a recordsize that matches the workload. If you are doing a write-only benchmark and you see lots of reads in gstat, you know you are having to do read/modify/writes, and that is why your performance is so bad.

-- 
Allan Jude
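
P.S. A quick sketch of the checks above, in case it saves someone some man-page reading. The dataset name "tank/pgdata" is just an example; the loader.conf tunable only takes effect after a reboot, and recordsize only applies to newly written files:

```shell
# Watch per-disk stats while the benchmark runs; -d adds the
# delete (BIO_DELETE / TRIM) columns, so a burst of deletes
# coinciding with the slowdown points at TRIM.
gstat -d

# Disable ZFS TRIM at boot (boot-time tunable, not settable at runtime):
echo 'vfs.zfs.trim.enabled=0' >> /boot/loader.conf

# PostgreSQL does 8 KB page I/O, so an 8k recordsize avoids
# read/modify/write of the default 128k records.
# "tank/pgdata" is a hypothetical dataset name.
zfs create -o recordsize=8k tank/pgdata

# On an existing dataset, the new recordsize only affects files
# written after the change:
zfs set recordsize=8k tank/pgdata
```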