Date:      Tue, 21 Sep 2010 11:16:07 -0500
From:      Alan Cox <alan.l.cox@gmail.com>
To:        Jeff Roberson <jroberson@jroberson.net>
Cc:        Robert Watson <rwatson@freebsd.org>, Jeff Roberson <jeff@freebsd.org>, Andre Oppermann <andre@freebsd.org>, Andriy Gapon <avg@freebsd.org>, freebsd-hackers@freebsd.org
Subject:   Re: zfs + uma
Message-ID:  <AANLkTimy=2WUcH59R5spajrKkUYQnii9SD1ZDdMymNC+@mail.gmail.com>
In-Reply-To: <alpine.BSF.2.00.1009202037260.23448@desktop>
References:  <4C93236B.4050906@freebsd.org> <4C935F56.4030903@freebsd.org> <alpine.BSF.2.00.1009181221560.86826@fledge.watson.org> <alpine.BSF.2.00.1009181135430.23448@desktop> <4C95C804.1010701@freebsd.org> <alpine.BSF.2.00.1009182225050.23448@desktop> <4C95CCDA.7010007@freebsd.org> <4C984E90.60507@freebsd.org> <alpine.BSF.2.00.1009202037260.23448@desktop>

On Tue, Sep 21, 2010 at 1:39 AM, Jeff Roberson <jroberson@jroberson.net> wrote:

> On Tue, 21 Sep 2010, Andriy Gapon wrote:
>
>> on 19/09/2010 11:42 Andriy Gapon said the following:
>>
>>> on 19/09/2010 11:27 Jeff Roberson said the following:
>>>
>>>> I don't like this because even with very large buffers you can still
>>>> have high enough turnover to require per-cpu caching.  Kip
>>>> specifically added UMA support to address this issue in zfs.  If you
>>>> have allocations which don't require per-cpu caching and are very
>>>> large, why even use UMA?
>>>
>>> Good point.
>>> Right now I am running with a 4 items/bucket limit for items larger
>>> than 32KB.
>>
>> But I also have two counter-points actually :)
>> 1. Uniformity.  E.g. you can handle all ZFS I/O buffers via the same
>> mechanism regardless of buffer size.
>> 2. (Open)Solaris has done that for a while and it seems to suit them
>> well.  Not saying that they are perfect, or the best, or an example to
>> follow, but still that means quite a bit (for me).
>
> I'm afraid there is not enough context here for me to know what 'the
> same mechanism' is or what Solaris does.  Can you elaborate?
>
> I prefer not to take the weight of specific examples too heavily when
> considering the allocator, as it must handle many cases and many types
> of systems.  I believe there are cases where you want large allocations
> to be handled by per-cpu caches, regardless of whether ZFS is one such
> case.  If ZFS does not need them, then it should simply allocate
> directly from the VM.  However, I don't want to introduce some maximum
> constraint unless it can be shown that adequate behavior is not
> generated from some more adaptable algorithm.
>

Actually, I think that there is a middle ground between "per-cpu caches" and
"directly from the VM" that we are missing.  When I've looked at the default
configuration of ZFS (without the extra UMA zones enabled), there is an
incredible amount of churn on the kmem map caused by the implementation of
uma_large_malloc() and uma_large_free() going directly to the kmem map.  Not
only are the obvious things happening, like allocating and freeing kernel
virtual addresses and underlying physical pages on every call, but also
system-wide TLB shootdowns and sometimes superpage demotions are occurring.

I have some trouble believing that the large allocations being performed by
ZFS really need per-CPU caching, but I can certainly believe that they could
benefit from not going directly to the kmem map on every uma_large_malloc()
and uma_large_free().  In other words, I think it would make a lot of sense
to have a thin layer between UMA and the kmem map that caches allocated but
unused ranges of pages.

Regards,
Alan
