Date: Mon, 15 Jul 2013 15:05:42 -0600
From: Chris Torek <torek@torek.net>
To: freebsd-hackers@freebsd.org
Subject: am I abusing the UMA allocator?
Message-ID: <201307152105.r6FL5g1B053031@elf.torek.net>

I have been experimenting with using the UMA (slab) allocator for
special-purpose physical address ranges. (The underlying issue is
that we need zone-like and/or mbuf-like data structures to talk to
hardware that has "special needs" in terms of which physical pages
it can in turn use. Each device has a limited memory window it can
access.)

For my purposes it's nice that the allocation function receives a
"zone" argument, even though the comment in the call says "zone is
passed for legacy reasons". However, the free function does not get
the zone argument, or anything other than a single bit -- up to 4 if
you cheat harder. This is ... less convenient (although in my case I
can use the VA being free'd, instead).

What I'm wondering is what this single bit is really for; whether
the allocation and free might be made more flexible for
special-purpose back-end allocators; and whether this is really
using things as intended.

Details:

In the allocator, there's a per-"keg" uk_allocf and uk_freef
("alloc"ation and "free" "f"unction) pointer, and you can set your
own allocation and free functions for any zone with:

    void uma_zone_set_allocf(uma_zone_t zone, uma_alloc allocf);
    void uma_zone_set_freef(uma_zone_t zone, uma_free freef);

(Aside: it seems a bit weird that you set these per *zone* but
they're stored in the *kegs*, specifically the special "first keg",
but never mind... :-) )

Each allocf is called as:

    /* arguments: uma_zone_t zone, int size, uint8_t *pflag, int wait */
    mem = allocf(zone, nbytes, &flags, wait);

where "wait" is made up of malloc flags (M_WAITOK, M_NOWAIT, M_ZERO,
M_USE_RESERVE). The "flags" argument is not initialized at this
point, so the allocation function must fill it in.
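
To make the allocf contract above concrete, here is a minimal userland
sketch, not kernel code. Everything prefixed model_ or MODEL_ is a
hypothetical stand-in invented for illustration (the real types and flag
values live in sys/vm/uma.h), and the bump-pointer "window" is only an
assumption about what a device-limited backing store might look like. The
one point it tries to show is that the flag byte arrives uninitialized and
has to be written on every successful return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for UMA_SLAB_PRIV; value chosen for the model. */
#define MODEL_SLAB_PRIV	0x08

/* Hypothetical stand-in for a zone backed by one device-visible window. */
struct model_zone {
	uint8_t	*win_base;	/* start of the window */
	size_t	 win_size;	/* total bytes in the window */
	size_t	 win_used;	/* bytes handed out so far (bump pointer) */
};

/*
 * Same argument order as the allocf call quoted above:
 * (zone, size, pflag, wait).  *pflag holds garbage on entry, so every
 * path that returns memory must store a value into it.
 */
static void *
model_window_alloc(struct model_zone *zone, int size, uint8_t *pflag, int wait)
{
	void *mem;

	(void)wait;	/* the model never sleeps, so wait flags are ignored */
	assert(zone != NULL && pflag != NULL && size > 0);

	if ((size_t)size > zone->win_size - zone->win_used)
		return (NULL);		/* window exhausted, *pflag untouched */

	mem = zone->win_base + zone->win_used;
	zone->win_used += (size_t)size;
	*pflag = MODEL_SLAB_PRIV;	/* fill in the out-parameter */
	return (mem);
}
```

Because the zone pointer is available here, the allocation side can pick
the right window directly; the asymmetry described above is that the free
side gets no such pointer.
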
The filled-in value is stored in the per-slab us_flags and
eventually passed back to each freef function:

    /* arguments: void *mem, int size, uint8_t flag */
    freef(mem, nbytes, pflag);    /* where pflag = us->us_flags */

The flags are defined in sys/vm/uma.h and are the UMA_SLAB_* flags
(BOOT, KMEM, KERNEL, "PRIV", OFFP, MALLOC). UMA_SLAB_PRIV is
described as "private". The bit is never tested though, so it seems
that a "private" allocator can set UMA_SLAB_PRIV, or not set it,
freely. It appears to be the only UMA_SLAB_* bit that has no other
defined meaning in uma_core.c or elsewhere.

(Not entirely true, there's also UMA_SLAB_OFFP which is never tested
or set, and bits 0x40 and 0x80 are unused. There's also an unused
us_pad right after that. It looks like OFFP is a leftover, with "on"
vs "off" page slab management controlled through UMA_ZONE_HASH and
also the PG_SLAB bit in the underlying "struct vm_page".)

There's also a per-keg flag spelled UMA_ZFLAG_PRIVALLOC, along with
UMA_ZONE_NOFREE. But UMA_ZFLAG_PRIVALLOC is never tested; and
UMA_ZONE_NOFREE is really per-keg, and you can't set it from outside
the UMA code.

When the system gets low on memory, it calls uma_reclaim(), which
does (simplified):

    zone_foreach(zone_drain)
      |
    zone_drain(zone)
      |
    zone_drain_wait(zone)
      |
    bucket_cache_drain()
      |
    zone_foreach_keg()
      |
    keg_drain()
      |
    test: (UMA_ZONE_NOFREE || keg->uk_freef==NULL)
      |
    if either is the case, return now, can't free

The issue here is that draining these special purpose,
special-physical-page-backed zones is not actually going to help the
system any (though freeing internal bucket data structures could
help slightly).
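
The test at the bottom of that call chain can be sketched as a small
userland model. This is only an illustration of the early-return
condition described above, not the code in uma_core.c: the model_ and
MODEL_ names, the flag value, and the free_slabs counter are all
hypothetical, and the free function is called with a NULL pointer because
the model holds no real memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for UMA_ZONE_NOFREE; value chosen for the model. */
#define MODEL_ZONE_NOFREE	0x0001

/* Same shape as the freef call quoted above: (mem, size, flag). */
typedef void (*model_free_fn)(void *mem, int size, uint8_t flag);

/* Hypothetical stand-in for the parts of a keg the test looks at. */
struct model_keg {
	uint32_t	uk_flags;	/* keg flags */
	model_free_fn	uk_freef;	/* back-end free function, may be NULL */
	int		free_slabs;	/* fully free slabs currently cached */
};

static int model_free_calls;	/* how many times the back end was asked */

/* Example back-end free function: only counts calls. */
static void
model_count_free(void *mem, int size, uint8_t flag)
{
	(void)mem;
	(void)size;
	(void)flag;
	model_free_calls++;
}

/* Returns the number of slabs handed back to the back end. */
static int
model_keg_drain(struct model_keg *keg)
{
	int i, released;

	assert(keg != NULL);

	/* Either condition means "can't free": return without doing work. */
	if ((keg->uk_flags & MODEL_ZONE_NOFREE) != 0 || keg->uk_freef == NULL)
		return (0);

	released = keg->free_slabs;
	for (i = 0; i < released; i++)
		keg->uk_freef(NULL, 0, 0);	/* no real memory in the model */
	keg->free_slabs = 0;
	return (released);
}
```

In this model a keg with uk_freef left NULL keeps its slabs across a
drain, which is the opt-out mentioned next; the cost is that the back end
never sees a free call at all.
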

Of course I can have uk_freef == NULL, but it is nice to keep some
statistics, and maybe be able to trade pages between various
special-purpose physical spaces (by doing my own zone_drain()s on
them -- the one in uma_reclaim() is not going to help the OS much as
the physical pages cannot be handed out to processes, and they "run
out" against themselves, not the VM system).

All in all, I'm now thinking that I'm abusing the slab allocator too
much here. But I wonder if perhaps some minor changes to uma_core
might make this more useable, or if this is really within the intent
of the UMA code at all.

Chris