Date: Fri, 01 Feb 2008 08:56:34 +0200
From: Alexander Motin <mav@FreeBSD.org>
To: Julian Elischer <julian@elischer.org>
Cc: freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org
Subject: Re: Memory allocation performance
Message-ID: <47A2C2A2.5040109@FreeBSD.org>
In-Reply-To: <47A25A0D.2080508@elischer.org>
References: <47A25412.3010301@FreeBSD.org> <47A25A0D.2080508@elischer.org>

Julian Elischer wrote:
> Alexander Motin wrote:
>> Hi.
>>
>> While profiling netgraph operation on UP HEAD router I have found that
>> huge amount of time it spent on memory allocation/deallocation:
>>
>>               0.14    0.05  132119/545292      ip_forward <cycle 1> [12]
>>               0.14    0.05  133127/545292      fxp_add_rfabuf [18]
>>               0.27    0.10  266236/545292      ng_package_data [17]
>> [9]    14.1   0.56    0.21  545292             uma_zalloc_arg [9]
>>               0.17    0.00  545292/1733401     critical_exit <cycle 2> [98]
>>               0.01    0.00  275941/679675      generic_bzero [68]
>>               0.01    0.00  133127/133127      mb_ctor_pack [103]
>>
>>               0.15    0.06  133100/545266      mb_free_ext [22]
>>               0.15    0.06  133121/545266      m_freem [15]
>>               0.29    0.11  266236/545266      ng_free_item [16]
>> [8]    15.2   0.60    0.23  545266             uma_zfree_arg [8]
>>               0.17    0.00  545266/1733401     critical_exit <cycle 2> [98]
>>               0.00    0.04  133100/133100      mb_dtor_pack [57]
>>               0.00    0.00  134121/134121      mb_dtor_mbuf [111]
>>
>> I have already optimized all possible allocation calls and those that
>> left are practically unavoidable. But even after this kgmon tells that
>> 30% of CPU time consumed by memory management.
>>
>> So I have some questions:
>> 1) Is it real situation or just profiler mistake?
>> 2) If it is real then why UMA is so slow? I have tried to replace it
>> in some places with preallocated TAILQ of required memory blocks
>> protected by mutex and according to profiler I have got _much_ better
>> results. Will it be a good practice to replace relatively small UMA
>> zones with preallocated queue to avoid part of UMA calls?
>> 3) I have seen that UMA does some kind of CPU cache affinity, but does
>> it cost so much that it costs 30% CPU time on UP router?
>
> given this information, I would add an 'item cache' in ng_base.c
> (hmm do I already have one?)

That was actually my second question. As there are only 512 items by
default and they are small, I can easily preallocate them all at boot.
But is that a good approach?

Why can't UMA do just the same when I have created a zone with a
specified element size and maximum number of objects? What is the
principal difference?

-- 
Alexander Motin
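
The preallocated-queue idea discussed above can be sketched in user
space roughly as follows. This is only an illustration under stated
assumptions, not the netgraph or UMA code: the names item_cache_init,
item_cache_alloc, item_cache_free, item_cache_free_count and the
ITEM_PAYLOAD size are hypothetical, a pthread mutex stands in for a
kernel mutex, and the only number taken from the message is the default
of 512 items.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/queue.h>

#define CACHE_ITEMS  512  /* default item count mentioned in the message */
#define ITEM_PAYLOAD 64   /* hypothetical payload size, for illustration */

struct item {
	TAILQ_ENTRY(item) link;              /* free-list linkage */
	unsigned char payload[ITEM_PAYLOAD]; /* stand-in for the real item body */
};

TAILQ_HEAD(item_list, item);

/* All items are allocated statically, so nothing is allocated at run time. */
static struct item item_pool[CACHE_ITEMS];
static struct item_list free_items = TAILQ_HEAD_INITIALIZER(free_items);
static pthread_mutex_t cache_mtx = PTHREAD_MUTEX_INITIALIZER;
static size_t free_count;

/* Put every preallocated item on the free list; call once at startup. */
static void
item_cache_init(void)
{
	pthread_mutex_lock(&cache_mtx);
	TAILQ_INIT(&free_items);
	for (size_t i = 0; i < CACHE_ITEMS; i++)
		TAILQ_INSERT_TAIL(&free_items, &item_pool[i], link);
	free_count = CACHE_ITEMS;
	pthread_mutex_unlock(&cache_mtx);
}

/* Take one item off the free list; returns NULL when the cache is empty. */
static struct item *
item_cache_alloc(void)
{
	struct item *it;

	pthread_mutex_lock(&cache_mtx);
	it = TAILQ_FIRST(&free_items);
	if (it != NULL) {
		TAILQ_REMOVE(&free_items, it, link);
		free_count--;
	}
	pthread_mutex_unlock(&cache_mtx);
	return (it);
}

/* Return an item to the head of the list so the most recent one is reused. */
static void
item_cache_free(struct item *it)
{
	assert(it != NULL);
	pthread_mutex_lock(&cache_mtx);
	TAILQ_INSERT_HEAD(&free_items, it, link);
	free_count++;
	pthread_mutex_unlock(&cache_mtx);
}

/* Number of items currently available. */
static size_t
item_cache_free_count(void)
{
	size_t n;

	pthread_mutex_lock(&cache_mtx);
	n = free_count;
	pthread_mutex_unlock(&cache_mtx);
	return (n);
}
```

A caller would run item_cache_init() once, then pair each
item_cache_alloc() with an item_cache_free(). Unlike a general-purpose
allocator, this sketch has a hard limit: once all 512 items are in use,
item_cache_alloc() returns NULL and the caller has to handle that.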