From: Andriy Gapon
To: Brendan Gregg
Cc: freebsd-fs
Date: Sat, 15 Feb 2014 13:58:37 +0200
Subject: Re: l2arc_feed_thread cpu utlization
Message-ID: <52FF566D.3060601@FreeBSD.org>
References: <52B2D8D6.8090306@FreeBSD.org> <52FE0378.7070608@FreeBSD.org>

on 14/02/2014 22:23 Brendan Gregg said the following:
> G'Day Andriy,
>
> Thanks for the patch. If most of the data is in one list (does anyone have
> statistics to confirm such a likelihood? I know this happened a lot
> pre-list-split), then I think this means we only scan that at 1/32nd of the
> previous rate. It should solve the CPU issue, but could make warmup very slow.

Brendan,

I do not have any stats, but I think that the data should be spread more or less
evenly between the lists. I mean the 16 sub-lists for data and the 16 sub-lists
for metadata. First, a list is picked based on a hash, and that _should_ produce
a more or less even distribution. Second, if the hash function is not good
enough, then the whole list splitting is pointless.
In either case this was just a quick hack on my part.

> I think the feed algorithm needs to be rethought, although that can be done as
> future work. I'm trying to think of something simple that can be done right now
> to solve CPU usage and warmup rate.

I completely agree with you.
I do not particularly like the fact that the threshold is per sub-list in
FreeBSD. I would prefer a more "holistic" threshold.

> Let's say we keep this change, but in l2arc_write_buffers we maintain an extra
> copy of write_sz, say, list_write_sz, that is reset to zero for each list. Then,
> when we reach headroom and choose to abort, we can check list_write_sz and
> determine how fruitful the scanning has been so far. If that's greater than a
> threshold, then keep scanning, up to the full L2ARC_WRITE_SIZE for that list.
> That way, we've scanned only 1/32nd of the previous length as a test, and only
> if that is fruitful enough do we keep scanning.
>
> Again, it probably needs to be rethought, but something like that may work fine
> in the meantime.

This sounds interesting. I will think more about this.
Thanks!

> On Fri, Feb 14, 2014 at 3:52 AM, Andriy Gapon wrote:
> > on 19/12/2013 13:30 Andriy Gapon said the following:
> > >
> > > This is just a heads up, no patch yet.
> > >
> > > l2arc_feed_thread periodically wakes up, scans a certain amount of ARC
> > > buffers, and writes eligible buffers to a cache device.
> > > The number of scanned buffers is limited by a threshold on the amount of
> > > data in the buffers seen. The threshold is applied on a per buffer list basis.
> > > In upstream there are 4 relevant lists: (data, metadata) X (MFU, MRU). In
> > > FreeBSD each of the lists was subdivided into 16 lists. This was done to
> > > reduce contention on the locks that protect the lists. But as a side effect
> > > l2arc_feed_thread can scan 16 times more data (~ buffers).
> > >
> > > So, if you have a rather large ARC and L2ARC and your buffers tend to be
> > > sufficiently small, then you could observe l2arc_feed_thread burning a
> > > noticeable amount of CPU. On some of our systems I observed it using up to
> > > 40% of a single core. Scaling back the threshold by a factor of 16 makes CPU
> > > utilization go down by the same factor.
> > >
> > > I plan to commit this change to FreeBSD ZFS code.
> > > Any comments are welcome.
> >
> > Here is what I have in mind:
> > https://github.com/avg-I/freebsd/compare/wip;hc;l2arc_feed_thread_scan_rate
> >
> > The calculations in the macro look somewhat ugly, but they should be correct :-)
> >
> > --
> > Andriy Gapon
>
> --
> Brendan Gregg, Joyent http://dtrace.org/blogs/brendan

--
Andriy Gapon
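
Editorial sketch, not part of the original message: the two-stage scan suggested in the quoted text (probe a list with a reduced headroom, then keep going only if the probe was fruitful) might look roughly like the Python model below. Only list_write_sz and the 1/32 factor come from the thread. The function name scan_list, its parameters, and the (size, eligible) buffer representation are hypothetical, and the sketch reads "keep scanning up to the full size" as "raise the limit back to the previous, unscaled headroom". It ignores the overall write target and locking, so it is an illustration of the idea rather than the actual l2arc_write_buffers logic.

```python
def scan_list(buffers, full_headroom, fruitful_threshold, num_lists=32):
    """Model one pass over a single ARC sub-list.

    buffers: iterable of (size, eligible) tuples in scan order (hypothetical
             stand-in for the real buffer headers).
    Returns (list_write_sz, passed_sz): bytes selected for writing and bytes
    scanned before the pass stopped.
    """
    # Probe with 1/num_lists of the old per-list headroom first.
    limit = full_headroom // num_lists
    extended = False    # whether the limit was already raised once
    passed_sz = 0       # bytes of buffers looked at so far
    list_write_sz = 0   # bytes of eligible buffers found in this list

    for size, eligible in buffers:
        if passed_sz + size > limit:
            # Headroom reached: extend only if the probe was fruitful enough.
            if not extended and list_write_sz >= fruitful_threshold:
                limit = full_headroom
                extended = True
            if passed_sz + size > limit:
                break
        passed_sz += size
        if eligible:
            list_write_sz += size

    return list_write_sz, passed_sz


if __name__ == "__main__":
    # A list full of eligible buffers is scanned up to the full headroom.
    print(scan_list([(8, True)] * 100, full_headroom=320, fruitful_threshold=8))
    # A list with nothing eligible is abandoned after the short probe.
    print(scan_list([(8, False)] * 100, full_headroom=320, fruitful_threshold=8))
```

In this model a list that yields nothing during the probe costs only about 1/32nd of the old scan length, while a productive list is still scanned as far as before, which is the CPU versus warmup trade-off discussed in the thread.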