From owner-freebsd-hackers@freebsd.org Sun Jun 11 23:50:26 2017
From: Jov
Date: Mon, 12 Jun 2017 07:50:23 +0800
Subject: Re: FreeBSD10 Stable + ZFS + PostgreSQL + SSD performance drop < 24 hours
To: "Caza, Aaron"
Cc: freebsd-hackers@freebsd.org, Allan Jude
List-Id: Technical Discussions relating to FreeBSD

To rule out a filesystem problem, I would run a dd test on the pgdata
dataset after the performance drop. If read and/or write utilization can
reach 100%, or the throughput is what you expect, then I would say the
problem is not in the filesystem or the OS.

For PostgreSQL, what is your EXPLAIN ANALYZE output before and after the
performance drop?

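Something like the following would do for the dd test. It is only a rough
sketch: the function name, the scratch file name, and the sizes are my own
placeholders, not anything from this thread. Run gstat in another terminal
to watch %busy while it runs; dd prints its own throughput summary when
each pass finishes.

```shell
#!/bin/sh
# Rough sequential write/read probe for a directory on the dataset under test.
# Assumptions (not from the thread): the name dd_probe, the scratch file name,
# and the default size are illustrative placeholders.
dd_probe() {
    dir=$1
    size_mb=${2:-1024}
    scratch="$dir/dd_probe.$$"

    # Write pass, then read pass. /dev/zero compresses to almost nothing, so
    # the numbers are only meaningful with compression off on the dataset.
    # The file should also be larger than RAM, otherwise the read pass is
    # likely to be served from cache instead of the SSDs.
    dd if=/dev/zero of="$scratch" bs=1048576 count="$size_mb" &&
        dd if="$scratch" of=/dev/null bs=1048576
    status=$?

    # Always remove the scratch file, even if one of the passes failed.
    rm -f "$scratch"
    return "$status"
}

# Example (the path is hypothetical, substitute the real pgdata mountpoint):
#   dd_probe /path/to/pgdata 8192
```

Running it once right after a reboot and once after the slowdown gives two
throughput figures to compare, which is the point of the test.
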
On June 12, 2017 at 12:51 AM, "Caza, Aaron" wrote:

> Thanks Allan for the suggestions. I tried gstat -d but deletes (d/s)
> doesn't seem to be it as it stays at 0 despite vfs.zfs.trim.enabled=1.
>
> This is most likely due to the "layering" I use as, for historical
> reasons, I have GEOM ELI set up to essentially emulate 4k sectors
> regardless of the underlying media. I do my own alignment and partition
> sizing as well as have the ZFS record size set to 8k for Postgres.
>
> In gstat, the SSDs %busy is 90-100% on startup after reboot. Once the
> performance degradation hits (<24 hours later), I'm seeing %busy at ~10%.
>
> #!/bin/sh
> psql --username=test --password=supersecret -h /db -d test << EOL
> \timing on
> select count(*) from test;
> \q
> EOL
>
> Sample run of above script after reboot (before degradation hits)
> (Samsung 850 Pros in ZFS mirror):
> Timing is on.
>   count
> ----------
>  21568508
> (1 row)
>
> Time: 57029.262 ms
>
> Sample run of above script after degradation (Samsung 850 Pros in ZFS
> mirror):
> Timing is on.
>   count
> ----------
>  21568508
> (1 row)
>
> Time: 583595.239 ms
> (Uptime ~1 day in this particular case.)
>
>
> Any other suggestions?
>
> Regards,
> A
>
> -----Original Message-----
> From: owner-freebsd-hackers@freebsd.org
> [mailto:owner-freebsd-hackers@freebsd.org] On Behalf Of Allan Jude
> Sent: Saturday, June 10, 2017 9:40 PM
> To: freebsd-hackers@freebsd.org
> Subject: [EXTERNAL] Re: FreeBSD10 Stable + ZFS + PostgreSQL + SSD
> performance drop < 24 hours
>
> On 06/10/2017 12:36, Slawa Olhovchenkov wrote:
> > On Sat, Jun 10, 2017 at 04:25:59PM +0000, Caza, Aaron wrote:
> >
> >> Gents,
> >>
> >> I'm experiencing an issue where iterating over a PostgreSQL table of
> >> ~21.5 million rows (select count(*)) goes from ~35 seconds to ~635
> >> seconds on Intel 540 SSDs. This is using a FreeBSD 10 amd64 stable
> >> kernel back from Jan 2017.
> >> SSDs are basically 2 drives in a ZFS mirrored zpool. I'm using
> >> PostgreSQL 9.5.7.
> >>
> >> I've tried:
> >>
> >> * Using the FreeBSD10 amd64 stable kernel snapshot of May 25, 2017.
> >>
> >> * Tested on half a dozen machines with different models of SSDs:
> >>
> >>     o Intel 510s (120GB) in ZFS mirrored pair
> >>
> >>     o Intel 520s (120GB) in ZFS mirrored pair
> >>
> >>     o Intel 540s (120GB) in ZFS mirrored pair
> >>
> >>     o Samsung 850 Pros (256GB) in ZFS mirrored pair
> >>
> >> * Using bonnie++ to remove Postgres from the equation and
> >>   performance does indeed drop.
> >>
> >> * Rebooting server and immediately re-running test and performance
> >>   is back to original.
> >>
> >> * Tried using Karl Denninger's patch from PR187594 (which took some
> >>   work to find a kernel that the FreeBSD10 patch would both apply and
> >>   compile cleanly against).
> >>
> >> * Tried disabling ZFS lz4 compression.
> >>
> >> * Ran the same test on a FreeBSD9.0 amd64 system using PostgreSQL
> >>   9.1.3 with 2 Intel 520s in ZFS mirrored pair. System had 165 days
> >>   uptime and test took ~80 seconds after which I rebooted and re-ran
> >>   test and was still at ~80 seconds (older processor and memory in
> >>   this system).
> >>
> >> I realize that there's a whole lot of info I'm not including (dmesg,
> >> zfs-stats -a, gstat, et cetera): I'm hoping some enlightened
> >> individual will be able to point me to a solution with only the above
> >> to go on.
> >
> > Just a random guess: can you try r307264 (I mean the regression in
> > r307266)?
> > _______________________________________________
> > freebsd-hackers@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> > To unsubscribe, send any mail to
> > "freebsd-hackers-unsubscribe@freebsd.org"
> >
>
> This sounds a bit like an issue I investigated for a customer a few
> months ago.
>
> Look at gstat -d (includes DELETE operations like TRIM)
>
> If you see a lot of that happening, then try: vfs.zfs.trim.enabled=0 in
> /boot/loader.conf and see if your issues go away.
>
> The FreeBSD TRIM code for ZFS basically waits until the sector has been
> free for a while (to avoid doing a TRIM on a block we'll immediately
> reuse), so your benchmark will run fine for a little while, then suddenly
> the TRIM will kick in.
>
> For postgres, fio, bonnie++ etc, make sure the ZFS dataset you are
> storing the data on / benchmarking has a recordsize that matches the
> workload.
>
> If you are doing a write-only benchmark, and you see lots of reads in
> gstat, you know you are having to do read/modify/writes, and that is why
> your performance is so bad.
>
>
> --
> Allan Jude