From: Allan Jude
To: freebsd-hackers@freebsd.org
Subject: Re: FreeBSD10 Stable + ZFS + PostgreSQL + SSD performance drop < 24 hours
Date: Sat, 10 Jun 2017 23:39:34 -0400
In-Reply-To: <20170610163642.GA18123@zxy.spb.ru>
References: <79528bf7a85a47079756dc508130360b@DM2PR58MB013.032d.mgd.msft.net> <20170610163642.GA18123@zxy.spb.ru>

On 06/10/2017 12:36, Slawa Olhovchenkov wrote:
> On Sat, Jun 10, 2017 at 04:25:59PM +0000, Caza, Aaron wrote:
>
>> Gents,
>>
>> I'm experiencing an issue where iterating over a PostgreSQL table of
>> ~21.5 million rows (select count(*)) goes from ~35 seconds to ~635
>> seconds on Intel 540 SSDs. This is using a FreeBSD 10 amd64 stable
>> kernel from Jan 2017. The SSDs are two drives in a ZFS mirrored zpool.
>> I'm using PostgreSQL 9.5.7.
>>
>> I've tried:
>>
>> * Using the FreeBSD10 amd64 stable kernel snapshot of May 25, 2017.
>>
>> * Testing on half a dozen machines with different models of SSDs:
>>
>>   o Intel 510s (120GB) in ZFS mirrored pair
>>   o Intel 520s (120GB) in ZFS mirrored pair
>>   o Intel 540s (120GB) in ZFS mirrored pair
>>   o Samsung 850 Pros (256GB) in ZFS mirrored pair
>>
>> * Using bonnie++ to remove Postgres from the equation; performance
>>   does indeed drop.
>>
>> * Rebooting the server and immediately re-running the test;
>>   performance is back to the original.
>>
>> * Using Karl Denninger's patch from PR187594 (which took some work to
>>   find a kernel that the FreeBSD10 patch would both apply and compile
>>   cleanly against).
>>
>> * Disabling ZFS lz4 compression.
>>
>> * Running the same test on a FreeBSD9.0 amd64 system using PostgreSQL
>>   9.1.3 with 2 Intel 520s in a ZFS mirrored pair. The system had 165
>>   days of uptime and the test took ~80 seconds, after which I rebooted,
>>   re-ran the test, and was still at ~80 seconds (older processor and
>>   memory in this system).
>>
>> I realize there's a whole lot of info I'm not including (dmesg,
>> zfs-stats -a, gstat, et cetera): I'm hoping some enlightened
>> individual will be able to point me to a solution with only the above
>> to go on.
>
> Just a random guess: can you try r307264 (I mean the regression in
> r307266)?
>

This sounds a bit like an issue I investigated for a customer a few months ago.
Look at gstat -d (which includes DELETE operations like TRIM). If you see a lot of those happening, try:

vfs.zfs.trim.enabled=0

in /boot/loader.conf and see if your issue goes away. The FreeBSD TRIM code for ZFS basically waits until a sector has been free for a while (to avoid doing a TRIM on a block we'll immediately reuse), so your benchmark will run fine for a little while, then suddenly the TRIM will kick in.

For postgres, fio, bonnie++, etc., make sure the ZFS dataset you are storing the data on / benchmarking has a recordsize that matches the workload. If you are doing a write-only benchmark and you see lots of reads in gstat, you know you are having to do read/modify/writes, and that is why your performance is so bad.

-- 
Allan Jude
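
P.S. A quick sketch of the checks above, in case it saves someone some man-page reading. The dataset name "tank/pgdata" is just an example; the loader.conf tunable only takes effect after a reboot, and recordsize only applies to newly written files:

```shell
# Watch per-disk stats while the benchmark runs; -d adds the
# delete (BIO_DELETE / TRIM) columns, so a burst of deletes
# coinciding with the slowdown points at TRIM.
gstat -d

# Disable ZFS TRIM at boot (boot-time tunable, not settable at runtime):
echo 'vfs.zfs.trim.enabled=0' >> /boot/loader.conf

# PostgreSQL does 8 KB page I/O, so an 8k recordsize avoids
# read/modify/write of the default 128k records.
# "tank/pgdata" is a hypothetical dataset name.
zfs create -o recordsize=8k tank/pgdata

# On an existing dataset, the new recordsize only affects files
# written after the change:
zfs set recordsize=8k tank/pgdata
```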