Date: Wed, 16 Sep 2020 15:36:38 -0700
From: John-Mark Gurney <jmg@funkthat.com>
To: Warner Losh <imp@bsdimp.com>
Cc: Julian Grajkowski <julian.grajkowski@gmail.com>, freebsd-drivers@freebsd.org
Subject: Re: getpid() performance
Message-ID: <20200916223638.GR4213@funkthat.com>
In-Reply-To: <CANCZdfrSa7pfOWRLr9jji3ePshTOxFMYmy3=fQAonOdJO9a50g@mail.gmail.com>
References: <CAGQdsJhgAM4aHY46SaGDpXJGOdEzG6rD93WzG7RT9PWi_%2BwJrA@mail.gmail.com> <CANCZdfrSa7pfOWRLr9jji3ePshTOxFMYmy3=fQAonOdJO9a50g@mail.gmail.com>

Warner Losh wrote this message on Wed, Sep 16, 2020 at 01:24 -0600:
> On Wed, Sep 16, 2020 at 1:15 AM Julian Grajkowski <
> julian.grajkowski@gmail.com> wrote:
>
> > Hi,
> >
> > I am working on a contiguous memory allocator which frequently calls
> > getpid() in user space and I have noticed very poor performance of this
> > function call. I measured this call performance using following code:
> >
> > inline uint64_t rdtsc_start(void)
> > {
> >     uint32_t cycles_high;
> >     uint32_t cycles_low;
> >
> >     asm volatile("lfence\n\t"
> >                  "rdtscp\n\t"
> >                  "mov %%edx, %0\n\t"
> >                  "mov %%eax, %1\n\t"
> >                  : "=r" (cycles_high), "=r" (cycles_low)
> >                  : : "%rax", "%rdx", "%rcx");
> >
> >     return (((uint64_t)cycles_high << 32) | cycles_low);
> > }
> >
> >
> > inline uint64_t rdtsc_end(void)
> > {
> >     uint32_t cycles_high;
> >     uint32_t cycles_low;
> >
> >     asm volatile("rdtscp\n\t"
> >                  "mov %%edx, %0\n\t"
> >                  "mov %%eax, %1\n\t"
> >                  "lfence\n\t"
> >                  : "=r" (cycles_high), "=r" (cycles_low)
> >                  : : "%rax", "%rdx", "%rcx");
> >
> >     return (((uint64_t)cycles_high << 32) | cycles_low);
> > }
> >
> > This way I measured ~320 cycles used for getpid() on FreeBSD 12.1. For
> > comparison, in Linux (CentOS 7) this call uses ~10 cycles. I am aware that
> > this should not be compared directly. as these are different systems, but
> > such a big difference in performance is an issue for me, as getpid() is
> > called very often in my code.
> >
> > Is such a poor performance of getpid() a known problem and is it possible
> > that this might be improved in future releases?
> >
>
> glibc optimizes getpid() system call so it only calls it once and returns a
> cached value (which is in line with 10 cycles, there's no way you can
> save/restore state in 10 cycles, let alone do a dispatch). FreeBSD doesn't.

If you really need to see if your process has forked (I assume that is why
you're calling getpid so frequently), you can mmap a page and use minherit's
INHERIT_ZERO so that all the data in that page will be zeroed on fork.
You can then change your getpid check to something like:

pid_t *page_with_inherit_zero_set;

pid_t
my_getpid(void)
{
	/* First call: set up the page that the kernel zeroes on fork. */
	if (page_with_inherit_zero_set == NULL)
		allocate_page_and_set_inherit_zero();

	/* Zero means "not cached yet" or "we are a freshly forked child". */
	if (*page_with_inherit_zero_set == 0)
		*page_with_inherit_zero_set = getpid();

	return *page_with_inherit_zero_set;
}

and you should see similar improvements.  Though this might allow you to
move this logic to a better place in your code.

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
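
For completeness, here is one way the allocate_page_and_set_inherit_zero()
helper mentioned above could look. This is only a sketch under a few
assumptions, not tested on every release: it maps a single anonymous page
and marks it with minherit(2) INHERIT_ZERO where that is available. The
madvise(MADV_WIPEONFORK) branch for Linux, the cache_unavailable flag, and
the child_sees_own_pid() demo helper are additions made here for
illustration and are not part of the original suggestion. If the page cannot
be set up, the code simply falls back to calling getpid() every time. It is
not thread-safe as written.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE	/* glibc: expose MAP_ANON and MADV_WIPEONFORK */
#endif
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <stddef.h>
#include <unistd.h>

static pid_t *page_with_inherit_zero_set;
static int cache_unavailable;	/* hypothetical flag: setup failed once */

/*
 * Map one anonymous page and ask the kernel to hand fork() children a
 * zero-filled copy of it. Returns 0 on success, -1 on failure.
 */
static int
allocate_page_and_set_inherit_zero(void)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	void *p;
	int rc;

	if (pagesz <= 0)
		return (-1);
	p = mmap(NULL, (size_t)pagesz, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	if (p == MAP_FAILED)
		return (-1);
#if defined(INHERIT_ZERO)
	rc = minherit(p, (size_t)pagesz, INHERIT_ZERO);		/* BSD */
#elif defined(MADV_WIPEONFORK)
	rc = madvise(p, (size_t)pagesz, MADV_WIPEONFORK);	/* Linux (assumption) */
#else
	rc = -1;	/* no zero-on-fork mechanism known here */
#endif
	if (rc != 0) {
		munmap(p, (size_t)pagesz);
		return (-1);
	}
	page_with_inherit_zero_set = p;
	return (0);
}

pid_t
my_getpid(void)
{
	if (cache_unavailable)
		return (getpid());
	if (page_with_inherit_zero_set == NULL &&
	    allocate_page_and_set_inherit_zero() != 0) {
		cache_unavailable = 1;	/* do not retry on every call */
		return (getpid());
	}
	/* Zero means first use, or a child whose page was just wiped. */
	if (*page_with_inherit_zero_set == 0)
		*page_with_inherit_zero_set = getpid();
	return (*page_with_inherit_zero_set);
}

/* Demo helper: returns 1 if a forked child sees its own PID, -1 on error. */
int
child_sees_own_pid(void)
{
	pid_t child;
	int status;

	(void)my_getpid();	/* prime the cache in the parent */
	child = fork();
	if (child == -1)
		return (-1);
	if (child == 0)
		_exit(my_getpid() == getpid() ? 0 : 1);
	if (waitpid(child, &status, 0) != child)
		return (-1);
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0);
}
```

The hot path is then a pointer load and a compare, with the real system call
made only once per process. Calling child_sees_own_pid() is a quick way to
check that the page really is wiped across fork on the system at hand.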