Date: Tue, 04 Jan 2011 19:52:04 +0100
From: Attila Nagy <bra@fsn.hu>
To: Bob Friesenhahn <bfriesen@simple.dallas.tx.us>
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org, Artem Belevich <fbsdlist@src.cx>
Subject: Re: New ZFSv28 patchset for 8-STABLE
Message-ID: <4D236C54.6050309@fsn.hu>
In-Reply-To: <alpine.GSO.2.01.1101031516320.4523@freddy.simplesystems.org>
References: <4D0A09AF.3040005@FreeBSD.org> <4D1F7008.3050506@fsn.hu> <AANLkTimGdnESX-wwD52Fh4wCfS4xZ-839g6Ste5Bwihu@mail.gmail.com> <4D222B7B.1090902@fsn.hu> <alpine.GSO.2.01.1101031516320.4523@freddy.simplesystems.org>

On 01/03/2011 10:35 PM, Bob Friesenhahn wrote:
>>>
>> After four days, the L2 hit rate is still hovering around 10-20
>> percent (was between 60-90), so I think it's clearly a regression in
>> the ZFSv28 patch...
>> And the massive growth in CPU usage can also very nicely be seen...
>>
>> I've updated the graphs at (switch time can be checked on the zfs-mem
>> graph):
>> http://people.fsn.hu/~bra/freebsd/20110101-zfsv28-fbsd/
>>
>> There is a new phenomenon: the large IOPS peaks. I use this munin
>> script on a lot of machines and have never seen anything like this... I'm
>> not sure whether it's related or not.
>
> It is not so clear that there is a problem. I am not sure what you
> are using this server for but it is wise

The IO pattern has changed radically, so for me it's a problem.

> to consider that this is the funny time when a new year starts, SPAM
> delivery goes through the roof, and employees and customers behave
> differently. You chose the worst time of the year to implement the
> change and observe behavior.

It's a free software mirror, ftp.fsn.hu, and I'm sure that the very low hit rate and the increased CPU usage are not related to the time when I made the switch.

>
> CPU use is indeed increased somewhat. A lower loading of the l2arc is
> not necessarily a problem. The l2arc is usually bandwidth limited
> compared with main store so if bulk data can not be cached in RAM,
> then it is best left in main store. A smarter l2arc algorithm could
> put only the data producing the expensive IOPS (the ones requiring a
> seek) in the l2arc, lessening the amount of data cached on the device.

That would make sense if I didn't have 100-120 IOPS on the disks (that's about the maximum for 7200 RPM disks, and gstat tells me the same) together with an L2 hit rate as low as 10 percent.
What's smarter? Having a 60-90 percent hit rate from the SSDs and moving the slow disk heads less, or having a 10-20 percent hit rate and killing the disks with random IO?
If you are right, ZFS tries to be too smart and falls on its face with this kind of workload.

BTW, I've checked the v15-v28 patch for arc.c, and I can't see any L2ARC-related change there. I'm not sure whether the hypothetical logic would be there or in a different file; I haven't read the patch end to end.
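
To put rough numbers on the argument above, here is a minimal back-of-the-envelope sketch. It is only an illustration: the function names, the 4.5 ms average seek time and the figure of 1000 L2 lookups per second are assumptions made up for the example, not measurements from ftp.fsn.hu. It estimates the random IOPS ceiling of one rotating disk from its RPM and seek time, and shows how many reads fall through to the pool disks at a given L2 hit rate.

```python
def l2_hit_rate(l2_hits, l2_misses):
    """Fraction of L2ARC lookups that were served from the cache device."""
    total = l2_hits + l2_misses
    return l2_hits / total if total else 0.0


def max_random_iops(rpm, avg_seek_ms):
    """Rough ceiling on random IOPS for a single rotating disk.

    Each random IO costs one average seek plus, on average, half a
    revolution of rotational latency. Transfer time is ignored.
    """
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)


def disk_reads_per_sec(l2_lookups_per_sec, hit_rate):
    """Reads that miss the L2ARC and have to be served by the pool disks."""
    return l2_lookups_per_sec * (1.0 - hit_rate)


if __name__ == "__main__":
    # Assumed seek time for a 7200 RPM disk (hypothetical value).
    print("IOPS ceiling per disk: %.0f" % max_random_iops(7200, 4.5))
    # Assumed load of 1000 L2 lookups per second (hypothetical value).
    for rate in (0.75, 0.15):
        print("hit rate %.0f%%: %.0f reads/s reach the disks"
              % (rate * 100, disk_reads_per_sec(1000, rate)))
```

With these assumed inputs a single disk tops out at a bit over a hundred random IOPS, and dropping from a 75 percent to a 15 percent hit rate sends several times more reads to the disks. The counters to feed into l2_hit_rate() would be the l2_hits and l2_misses values under kstat.zfs.misc.arcstats.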