From: Andriy Gapon <avg@FreeBSD.org>
Date: Tue, 30 Sep 2014 14:18:02 +0300
To: freebsd-hackers@FreeBSD.org
Cc: Gleb Smirnoff
Subject: uk_slabsize, uk_ppera, uk_ipers, uk_pages
Message-ID: <542A916A.2060703@FreeBSD.org>

I have a hard time understanding how to use uk_slabsize, uk_ppera,
uk_ipers and uk_pages to derive other useful characteristics of UMA
kegs.  This is despite the good descriptions of the fields and the
multiple examples of their usage in the code.  Unfortunately, I find
those examples to be at least inconsistent and possibly contradictory.

The first problem is quite obvious: uk_slabsize has too narrow a type.
For example, ZFS creates many zones with item sizes larger than 64KB,
so uk_slabsize obviously overflows.  I am not sure how that affects
any further calculations, but probably not in a good way.  On the
other hand, there may be no harm at all, because as far as I can see
uk_slabsize is actually used only within keg_small_init().  It is set
but not used in keg_large_init() and keg_cachespread_init(), and it
does not seem to be used after initialization.  So maybe this field
could simply be converted to a local variable in keg_small_init()?

Now the second problem.  Even the names uk_ppera (pages per
allocation) and uk_ipers (items per slab) leave some room for
ambiguity: what is the relation between the allocation and the slab?
It seems that some code assumes that the slab takes up the whole
allocation (that is, one slab per allocation), other code places
multiple slabs into a single allocation, and yet other code looks
inconsistent in this respect.

For instance:

static void
keg_drain(uma_keg_t keg)
{
...
                LIST_REMOVE(slab, us_link);
                keg->uk_pages -= keg->uk_ppera;
                keg->uk_free -= keg->uk_ipers;

A slab is freed.  There is no question about uk_free, but it is clear
that the code assumes that the slab takes up a whole allocation.
keg_alloc_slab() is symmetric with these stats.
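In other words, if the slab and the allocation really do map 1:1, then
a keg's item capacity follows directly from its page counters.  Purely
as an illustration (keg_capacity() is my own name here, nothing like
it exists in the tree), such a helper would look roughly like this:

static inline int
keg_capacity(uma_keg_t keg)
{
        /*
         * Under the one-slab-per-allocation assumption there are
         * uk_pages / uk_ppera slabs, each holding uk_ipers items.
         */
        return ((keg->uk_pages / keg->uk_ppera) * keg->uk_ipers);
}

The code below seems to want this kind of quantity (with uk_maxpages
in place of uk_pages when dealing with limits), but computes it in
different ways.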
Take uma_zone_set_max():

int
uma_zone_set_max(uma_zone_t zone, int nitems)
{
        uma_keg_t keg;

        keg = zone_first_keg(zone);
        if (keg == NULL)
                return (0);
        KEG_LOCK(keg);
        keg->uk_maxpages = (nitems / keg->uk_ipers) * keg->uk_ppera;
        if (keg->uk_maxpages * keg->uk_ipers < nitems)
                keg->uk_maxpages += keg->uk_ppera;
        nitems = keg->uk_maxpages * keg->uk_ipers;
        KEG_UNLOCK(keg);

        return (nitems);
}

The uk_maxpages calculation seems to assume that the allocation and
the slab are the same: we first calculate the number of slabs needed
to hold nitems and then multiply that number by the number of pages
per allocation.  But when we calculate the nitems to be returned, we
simply multiply uk_maxpages by uk_ipers, as if the slab size were one
page regardless of uk_ppera.

uma_zone_get_max() calculates nitems in the same way, without taking
uk_ppera into account.

In uma_print_keg(), "out" is calculated as
        (keg->uk_ipers * keg->uk_pages) - keg->uk_free
while "limit" is calculated as
        (keg->uk_maxpages / keg->uk_ppera) * keg->uk_ipers
In one case we simply multiply the number of pages by uk_ipers, but in
the other case we first divide by uk_ppera.

My personal opinion is that we should settle on the rule that the slab
and the allocation map 1:1 and fix the code that does not conform to
that.  It seems that this is how the code that allocates and frees
slabs actually works, and I do not see any good reason to support
multiple slabs per allocation.  A rough, untested sketch of what I
mean for uma_zone_set_max() is appended after my signature.

P.S.  By the way, we have some wonderful things in the UMA code that
are not used anymore (if they ever were?) and are scarcely documented.
Perhaps some of those could be removed to simplify the code:
- UMA_ZONE_CACHESPREAD
- uma_zsecond_add()

More generally, it looks like the support for multiple zones using the
same keg is quite useful.  On the other hand, the support for a zone
using multiple kegs is of questionable utility, and at present that
capability is not used.

-- 
Andriy Gapon
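Appended sketch of uma_zone_set_max() under the 1:1 rule.  This is
untested and only meant to illustrate the idea; howmany() is the usual
sys/param.h macro, and I may well be missing something:

int
uma_zone_set_max(uma_zone_t zone, int nitems)
{
        uma_keg_t keg;

        keg = zone_first_keg(zone);
        if (keg == NULL)
                return (0);
        KEG_LOCK(keg);
        /*
         * Round up to a whole number of slabs; under the 1:1 rule one
         * slab of uk_ipers items occupies exactly uk_ppera pages.
         */
        keg->uk_maxpages = howmany(nitems, keg->uk_ipers) * keg->uk_ppera;
        /*
         * Report the effective limit consistently: uk_maxpages /
         * uk_ppera slabs, uk_ipers items each, instead of multiplying
         * uk_maxpages by uk_ipers directly.
         */
        nitems = (keg->uk_maxpages / keg->uk_ppera) * keg->uk_ipers;
        KEG_UNLOCK(keg);

        return (nitems);
}

uma_zone_get_max() and the "limit" and "out" calculations in
uma_print_keg() would then use the same (uk_maxpages / uk_ppera) *
uk_ipers and (uk_pages / uk_ppera) * uk_ipers forms.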