Date: Tue, 04 Jan 2011 19:52:04 +0100
From: Attila Nagy <bra@fsn.hu>
To: Bob Friesenhahn
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org, Artem Belevich
Subject: Re: New ZFSv28 patchset for 8-STABLE
Message-ID: <4D236C54.6050309@fsn.hu>
References: <4D0A09AF.3040005@FreeBSD.org> <4D1F7008.3050506@fsn.hu> <4D222B7B.1090902@fsn.hu>

On 01/03/2011 10:35 PM, Bob Friesenhahn wrote:
>> After four days, the L2 hit rate is still hovering around 10-20
>> percent (it was between 60-90), so I think it's clearly a regression
>> in the ZFSv28 patch...
>> The massive growth in CPU usage can also be seen very nicely...
>>
>> I've updated the graphs (the switch time can be checked on the
>> zfs-mem graph):
>> http://people.fsn.hu/~bra/freebsd/20110101-zfsv28-fbsd/
>>
>> There is a new phenomenon: the large IOPS peaks. I use this munin
>> script on a lot of machines and have never seen anything like
>> this... I'm not sure whether it's related or not.
>
> It is not so clear that there is a problem. I am not sure what you
> are using this server for, but it is wise

The IO pattern has changed radically, so for me it's a problem.

> to consider that this is the funny time when a new year starts, SPAM
> delivery goes through the roof, and employees and customers behave
> differently. You chose the worst time of the year to implement the
> change and observe behavior.

It's a free software mirror, ftp.fsn.hu, and I'm sure that the very
low hit rate and the increased CPU usage are not related to the time
when I made the switch.

> CPU use is indeed increased somewhat. A lower loading of the l2arc
> is not necessarily a problem. The l2arc is usually bandwidth limited
> compared with main store, so if bulk data cannot be cached in RAM,
> then it is best left in main store. A smarter l2arc algorithm could
> put only the data producing the expensive IOPS (the ones requiring a
> seek) in the l2arc, lessening the amount of data cached on the
> device.

That would make sense if I didn't have 100-120 IOPS on the disks
(for 7200 RPM drives that's about their maximum, and gstat tells me
the same) and an L2 hit rate as low as 10 percent. What's smarter:
having a 60-90% hit rate from the SSDs and moving the slow disk heads
less, or having a 10-20 percent hit rate and killing the disks with
random IO? If you are right, ZFS tries to be too smart and falls on
its face with this kind of workload.
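For reference, the L2 hit rate comes from ZFS's arcstats counters
(l2_hits and l2_misses). A minimal sketch of reading them on FreeBSD;
the sysctl names are the standard kstat ones, but this is an
illustration, not the actual munin plugin:

/*
 * Minimal sketch: derive the L2ARC hit rate from the cumulative
 * arcstats counters that ZFS exports on FreeBSD.  Illustration only,
 * not the munin plugin itself, which graphs the same counters.
 */
#include <sys/types.h>
#include <sys/sysctl.h>

#include <stdint.h>
#include <stdio.h>

static uint64_t
arcstat(const char *name)
{
	uint64_t val = 0;
	size_t len = sizeof(val);

	if (sysctlbyname(name, &val, &len, NULL, 0) != 0)
		perror(name);
	return (val);
}

int
main(void)
{
	uint64_t hits = arcstat("kstat.zfs.misc.arcstats.l2_hits");
	uint64_t misses = arcstat("kstat.zfs.misc.arcstats.l2_misses");

	if (hits + misses > 0)
		printf("L2ARC hit rate: %.1f%%\n",
		    100.0 * (double)hits / (double)(hits + misses));
	return (0);
}

The counters are cumulative since boot, so a grapher like munin plots
the delta between samples rather than the since-boot average.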
BTW, I've checked the v15-v28 patch for arc.c and I can't see any
L2ARC-related change there. I'm not sure whether the hypothetical
logic would live there or in a different file; I haven't read the
patch end to end.
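If such logic existed, I'd expect it to sit in the feed/eligibility
path and look something like the sketch below. Purely hypothetical, of
course; as far as I can tell, the stock check only refuses buffers
that are already on the cache device or came in via prefetch (which is
what vfs.zfs.l2arc_noprefetch controls):

/*
 * Hypothetical sketch of the "smarter" feed policy discussed above:
 * only admit buffers to the L2ARC whose reads would cost a disk seek,
 * i.e. randomly read blocks.  Every name below is made up for
 * illustration; nothing like buf_read_was_sequential() exists in
 * arc.c.
 */
typedef int boolean_t;		/* stand-in for the kernel type */
#define	B_FALSE	0
#define	B_TRUE	1

typedef struct arc_buf_hdr arc_buf_hdr_t;

/* Made-up predicates standing in for real access-pattern tracking. */
boolean_t buf_was_prefetched(const arc_buf_hdr_t *hdr);
boolean_t buf_read_was_sequential(const arc_buf_hdr_t *hdr);

boolean_t
l2arc_seek_eligible(const arc_buf_hdr_t *hdr)
{
	/*
	 * Sequential readers stream off the spinning disks at full
	 * bandwidth anyway, so caching those blocks on the SSD buys
	 * little; random reads pay a seek each time, and those are
	 * the blocks worth the L2 space.
	 */
	if (buf_was_prefetched(hdr) || buf_read_was_sequential(hdr))
		return (B_FALSE);
	return (B_TRUE);
}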