Date: Sat, 23 Jul 2011 04:07:19 +0200
From: Davide Italiano <davide.italiano@gmail.com>
To: freebsd-hackers@freebsd.org
Subject: UMA large allocations issues
Message-ID: <CACYV=-G+Uzw=q8WXqO_D8GLbrEGqKC0H_HL1U_wQCFPH9CeypQ@mail.gmail.com>
Hi,
I'm a student, and some time ago I started investigating the performance/fragmentation issues of large allocations within the UMA allocator. Benchmarks showed that the performance problems are mainly due to the fact that every call to uma_large_malloc() results in a call to kmem_malloc(), which is quite inefficient. I have started doing some work on this; here is where I am so far.

As a first step, I tried to define larger zones and let UMA handle everything. UMA can allocate slabs of more than one page, so I defined zones of 1, 2, 4 and 8 pages, moving KMEM_ZMAX up. I tested the solution with raidtest. Here is the workload characterization:

set mediasize=`diskinfo /dev/zvol/tank/vol | awk '{print $3}'`
set sectorsize=`diskinfo /dev/zvol/tank/vol | awk '{print $2}'`
raidtest genfile -s $mediasize -S $sectorsize -n 50000

# $mediasize = 10737418240
# $sectorsize = 512

Number of READ requests: 24924
Number of WRITE requests: 25076
Number of bytes to transmit: 3305292800

raidtest test -d /dev/zvol/tank/vol -n 4   ## tested using 4 cores, 1.5 GB RAM

Results:
Number of processes: 4
Bytes per second: 10146896
Requests per second: 153

Results (4 * PAGE_SIZE):
Number of processes: 4
Bytes per second: 14793969
Requests per second: 223

Results (8 * PAGE_SIZE):
Number of processes: 4
Bytes per second: 6855779
Requests per second: 103

The outcome of these tests is that defining larger zones helps only as long as the zones do not get too big; past a certain size, performance decreases significantly.

As a second step, alc@ proposed creating a new layer that sits between UMA and the VM subsystem. This layer would manage a pool of chunks used to satisfy requests from uma_large_malloc(), avoiding the overhead of the kmem_malloc() calls. I have recently started developing a patch (not yet fully working) that implements this layer. For now I would like to concentrate on the performance problem rather than the fragmentation one, so the patch I have started writing does not address fragmentation. A rough mock-up of the pool idea is appended at the end of this message.

http://davit.altervista.org/uma_large_allocations.patch

There are still some questions I was not able to answer (for example, when it is safe to call kmem_malloc() to request memory). So, at the end of the day, I am asking for your opinion on this issue, and I am looking for a "mentor" (some kind of guidance) to continue this project. If someone is interested in helping, it would be very much appreciated.

Best,
Davide Italiano
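
P.S. To make the chunk pool idea a bit more concrete, here is a rough userland mock-up. The names, the fixed chunk size and the caching policy are made up for illustration only (this is not the code in the patch above), and mmap()/munmap() simply stand in for kmem_malloc()/kmem_free(). The point is just that a previously freed chunk can be handed back to uma_large_malloc() without going through the VM subsystem at all:

/*
 * Userland mock-up of the proposed chunk pool (hypothetical names,
 * single fixed chunk size).  mmap()/munmap() stand in for
 * kmem_malloc()/kmem_free(); the real layer would live in the kernel
 * and need locking, per-size free lists and a policy for giving
 * memory back to the VM subsystem.
 */
#include <sys/mman.h>

#include <stdio.h>
#include <unistd.h>

#define	CHUNK_PAGES	4	/* size of the chunks the pool hands out */
#define	POOL_MAX	32	/* cap on cached free chunks */

struct chunk {
	struct chunk	*next;
};

static struct chunk	*pool_head;	/* free list of cached chunks */
static int		 pool_count;
static size_t		 chunk_bytes;

static void
pool_init(void)
{

	chunk_bytes = (size_t)sysconf(_SC_PAGESIZE) * CHUNK_PAGES;
}

/*
 * Allocate a chunk: serve it from the pool if possible, otherwise fall
 * back to the "VM layer" (mmap() here, kmem_malloc() in the kernel).
 */
static void *
pool_alloc(void)
{
	struct chunk *c;

	if ((c = pool_head) != NULL) {
		pool_head = c->next;
		pool_count--;
		return (c);		/* cheap path, no call into VM */
	}
	c = mmap(NULL, chunk_bytes, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	return (c == MAP_FAILED ? NULL : c);
}

/*
 * Free a chunk: cache it in the pool if there is room, otherwise give
 * it back to the VM layer.
 */
static void
pool_free(void *p)
{
	struct chunk *c = p;

	if (pool_count < POOL_MAX) {
		c->next = pool_head;
		pool_head = c;
		pool_count++;
	} else
		munmap(p, chunk_bytes);
}

int
main(void)
{
	void *a, *b;

	pool_init();
	a = pool_alloc();		/* miss: goes down to mmap() */
	pool_free(a);			/* chunk is cached in the pool */
	b = pool_alloc();		/* hit: served from the pool */
	printf("chunk reused: %s\n", a == b ? "yes" : "no");
	pool_free(b);
	return (0);
}

In the kernel version the interesting questions are exactly the ones above: how big the chunks should be, how many of them to cache, and when it is safe to call kmem_malloc() to refill the pool.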