From owner-freebsd-current@freebsd.org Fri Sep  7 13:34:36 2018
From: Subbsd <subbsd@gmail.com>
Date: Fri, 7 Sep 2018 16:34:23 +0300
Subject: Re: ZFS performance regression in FreeBSD 12 ALPHA3->ALPHA4
To: markj@freebsd.org
Cc: allanjude@freebsd.org, freebsd-current <freebsd-current@freebsd.org>
List-Id: Discussions about the use of FreeBSD-current

On Thu, Sep 6, 2018 at 3:28 AM Mark Johnston wrote:
>
> On Wed, Sep 05, 2018 at 11:15:03PM +0300, Subbsd wrote:
> > On Wed, Sep 5, 2018 at 5:58 PM Allan Jude wrote:
> > >
> > > On 2018-09-05 10:04, Subbsd wrote:
> > > > Hi,
> > > >
> > > > I'm seeing a huge ZFS performance loss after upgrading FreeBSD 12
> > > > to the latest revision (r338466 at the moment), related to the ARC.
> > > >
> > > > I cannot say which revision was installed before, except that
> > > > newvers.sh pointed to ALPHA3.
> > > >
> > > > Problems are observed if you try to limit the ARC. In my case:
> > > >
> > > > vfs.zfs.arc_max="128M"
> > > >
> > > > I know that this is very small. However, for two years there were
> > > > no problems with this setting.
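[For readers reproducing the report: the ARC limit above is a boot-time
tunable set in loader.conf(5). A minimal sketch of the relevant
/boot/loader.conf fragment, using the value from this thread:]

```
# /boot/loader.conf -- cap the ZFS ARC at 128 MB.
# This is the thread's deliberately tiny limit, kept here only to
# reproduce the report; most systems should use a much larger value.
vfs.zfs.arc_max="128M"
```

[The current value can be checked at runtime with
`sysctl vfs.zfs.arc_max`.]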
> > > >
> > > > When I send SIGINFO to a process which is currently working with
> > > > ZFS, I see "arc_reclaim_waiters_cv":
> > > >
> > > > e.g. when I type:
> > > >
> > > > /bin/csh
> > > >
> > > > I have time (~5 seconds) to press 'ctrl+t' several times before
> > > > csh is executed:
> > > >
> > > > load: 0.70 cmd: csh 5935 [arc_reclaim_waiters_cv] 1.41r 0.00u 0.00s 0% 3512k
> > > > load: 0.70 cmd: csh 5935 [zio->io_cv] 1.69r 0.00u 0.00s 0% 3512k
> > > > load: 0.70 cmd: csh 5935 [arc_reclaim_waiters_cv] 1.98r 0.00u 0.01s 0% 3512k
> > > > load: 0.73 cmd: csh 5935 [arc_reclaim_waiters_cv] 2.19r 0.00u 0.01s 0% 4156k
> > > >
> > > > The same story with find or any other command:
> > > >
> > > > load: 0.34 cmd: find 5993 [zio->io_cv] 0.99r 0.00u 0.00s 0% 2676k
> > > > load: 0.34 cmd: find 5993 [arc_reclaim_waiters_cv] 1.13r 0.00u 0.00s 0% 2676k
> > > > load: 0.34 cmd: find 5993 [arc_reclaim_waiters_cv] 1.25r 0.00u 0.00s 0% 2680k
> > > > load: 0.34 cmd: find 5993 [arc_reclaim_waiters_cv] 1.38r 0.00u 0.00s 0% 2684k
> > > > load: 0.34 cmd: find 5993 [arc_reclaim_waiters_cv] 1.51r 0.00u 0.00s 0% 2704k
> > > > load: 0.34 cmd: find 5993 [arc_reclaim_waiters_cv] 1.64r 0.00u 0.00s 0% 2716k
> > > > load: 0.34 cmd: find 5993 [arc_reclaim_waiters_cv] 1.78r 0.00u 0.00s 0% 2760k
> > > >
> > > > This problem goes away after increasing vfs.zfs.arc_max.
> > >
> > > Previously, ZFS was not actually able to evict enough dnodes to keep
> > > your arc_max under 128MB; it would have been much higher, based on
> > > the number of open files you had.
> > > A recent improvement from upstream ZFS (r337653 and r337660) was
> > > pulled in that fixed this, so setting an arc_max of 128MB is much
> > > more effective now. The side effect is that it is "actually doing
> > > what you asked it to do", and in this case what you are asking for
> > > is a bit silly: if you have a working set that is greater than
> > > 128MB and you ask ZFS to use less than that, it will have to
> > > constantly try to reclaim memory to stay under that very low bar.
> > >
> >
> > Thanks for the comments. Mark was right when he pointed to r338416 (
> > https://svnweb.freebsd.org/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=338416&r2=338415&pathrev=338416
> > ). Commenting out the aggsum_value calls restores normal speed,
> > regardless of the rest of the new code from upstream.
> > I would like to repeat that the speed with these two lines is not
> > just slow, but _INCREDIBLY_ slow! Probably this should be mentioned
> > in the relevant documentation for FreeBSD 12+.
>
> Could you please retest with the patch below applied, instead of
> reverting r338416?
>
> diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
> index 2bc065e12509..9b039b7d4a96 100644
> --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
> +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
> @@ -538,7 +538,7 @@ typedef struct arc_state {
>   */
>  int zfs_arc_meta_prune = 10000;
>  unsigned long zfs_arc_dnode_limit_percent = 10;
> -int zfs_arc_meta_strategy = ARC_STRATEGY_META_BALANCED;
> +int zfs_arc_meta_strategy = ARC_STRATEGY_META_ONLY;
>  int zfs_arc_meta_adjust_restarts = 4096;
>
>  /* The 6 states: */

Unfortunately, I cannot conduct sufficiently serious regression tests
right now (e.g. through benchmark tools, with statistics via pmc(3),
etc.). However, I can do a very superficial comparison, for example
with find.
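[A "surface comparison" of this kind can be scripted. A portable sketch:
it builds a small throwaway tree as a stand-in for /usr/ports, since the
metadata-heavy walk is what exercises the ARC; paths and counts here are
illustrative, not from the thread.]

```shell
#!/bin/sh
# Build a tiny directory tree (stand-in for /usr/ports) and walk it.
# On the real box you would instead run: time find /usr/ports -type f > /dev/null
dir=$(mktemp -d) || exit 1
for i in 1 2 3 4 5; do
    mkdir -p "$dir/cat$i"
    for j in 1 2 3 4; do
        : > "$dir/cat$i/port$j"       # empty file, one dnode each
    done
done
# Count the files the walk touches; the thread uses csh's time builtin
# around the same find(1) invocation.
nfiles=$(find "$dir" -type f | wc -l | tr -d ' ')
echo "files walked: $nfiles"
rm -r "$dir"
```

[Rebooting between runs, as done below, ensures both measurements start
with a cold ARC.]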
tested revision: 338520
loader.conf settings: vfs.zfs.arc_max="128M"

meta strategy ARC_STRATEGY_META_BALANCED, result:

% time find /usr/ports -type f > /dev/null
0.495u 6.912s 7:20.88 1.6% 34+172k 289594+0io 0pf+0w

meta strategy ARC_STRATEGY_META_ONLY:

% time find /usr/ports -type f > /dev/null
0.721u 7.184s 4:57.18 2.6% 34+171k 300910+0io 0pf+0w

So the difference is ~144 seconds (440.88s vs 297.18s wall clock). Both
tests were started after a full OS boot cycle.

Interestingly, despite the limit, the ARC size reported by top(1) was
constantly increasing, and at the end of the test it looked like this:

ARC: 1161M Total, 392M MFU, 118M MRU, 32K Anon, 7959K Header, 643M Other
     101M Compressed, 411M Uncompressed, 4.07:1 Ratio
Swap: 16G Total, 16G Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
 8413 root          1  20    0    11M  3024K arc_re   4   0:07   1.49% find
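[For what it's worth, converting the two wall-clock times in the time(1)
output above to seconds puts the gap at about 144 seconds; a plain
sh/awk sketch of the arithmetic:]

```shell
#!/bin/sh
# 7:20.88 (ARC_STRATEGY_META_BALANCED) vs 4:57.18 (ARC_STRATEGY_META_ONLY),
# taken from the time(1) results quoted above.
gap=$(awk 'BEGIN { printf "%.2f", (7*60 + 20.88) - (4*60 + 57.18) }')
echo "wall-clock difference: ${gap}s"
# prints: wall-clock difference: 143.70s
```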