Date:      Wed, 16 Sep 2020 15:36:38 -0700
From:      John-Mark Gurney <jmg@funkthat.com>
To:        Warner Losh <imp@bsdimp.com>
Cc:        Julian Grajkowski <julian.grajkowski@gmail.com>, freebsd-drivers@freebsd.org
Subject:   Re: getpid() performance
Message-ID:  <20200916223638.GR4213@funkthat.com>
In-Reply-To: <CANCZdfrSa7pfOWRLr9jji3ePshTOxFMYmy3=fQAonOdJO9a50g@mail.gmail.com>
References:  <CAGQdsJhgAM4aHY46SaGDpXJGOdEzG6rD93WzG7RT9PWi_%2BwJrA@mail.gmail.com> <CANCZdfrSa7pfOWRLr9jji3ePshTOxFMYmy3=fQAonOdJO9a50g@mail.gmail.com>

Warner Losh wrote this message on Wed, Sep 16, 2020 at 01:24 -0600:
> On Wed, Sep 16, 2020 at 1:15 AM Julian Grajkowski <
> julian.grajkowski@gmail.com> wrote:
> 
> > Hi,
> >
> > I am working on a contiguous memory allocator which frequently calls
> > getpid() in user space and I have noticed very poor performance of this
> > function call. I measured the performance of this call using the following code:
> >
> > inline uint64_t rdtsc_start(void)
> > {
> >     uint32_t cycles_high;
> >     uint32_t cycles_low;
> >
> >     asm volatile("lfence\n\t"
> >                  "rdtscp\n\t"
> >                  "mov %%edx, %0\n\t"
> >                  "mov %%eax, %1\n\t"
> >                  : "=r" (cycles_high), "=r" (cycles_low)
> >                  : : "%rax", "%rdx", "%rcx");
> >
> >     return (((uint64_t)cycles_high << 32) | cycles_low);
> > }
> >
> >
> > inline uint64_t rdtsc_end(void)
> > {
> >     uint32_t cycles_high;
> >     uint32_t cycles_low;
> >
> >     asm volatile("rdtscp\n\t"
> >                  "mov %%edx, %0\n\t"
> >                  "mov %%eax, %1\n\t"
> >                  "lfence\n\t"
> >                  : "=r" (cycles_high), "=r" (cycles_low)
> >                  : : "%rax", "%rdx", "%rcx");
> >
> >     return (((uint64_t)cycles_high << 32) | cycles_low);
> > }
> >
> > This way I measured ~320 cycles used for getpid() on FreeBSD 12.1. For
> > comparison, in Linux (CentOS 7) this call uses ~10 cycles. I am aware that
> > this should not be compared directly, as these are different systems, but
> > such a big difference in performance is an issue for me, as getpid() is
> > called very often in my code.
> >
> > Is such poor performance of getpid() a known problem and is it possible
> > that this might be improved in future releases?
> >
> 
> glibc optimizes getpid() system call so it only calls it once and returns a
> cached value (which is in line with 10 cycles, there's no way you can
> save/restore state in 10 cycles, let alone do a dispatch). FreeBSD doesn't.

If you really need to see whether your process has forked (I assume that is
why you're calling getpid() so frequently), you can mmap a page and use
minherit()'s INHERIT_ZERO so that all the data in that page is zeroed on
fork.  You can then change your getpid() check to something like:

pid_t *page_with_inherit_zero_set;

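pid_t
my_getpid(void)
{
	/* First call: set up the page that is zeroed in the child on fork. */
	if (page_with_inherit_zero_set == NULL)
		allocate_page_and_set_inherit_zero();

	/* Zero means no pid cached yet, or we are a freshly forked child. */
	if (*page_with_inherit_zero_set == 0)
		*page_with_inherit_zero_set = getpid();

	return *page_with_inherit_zero_set;
}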

and you should see similar improvements.

Though this might allow you to move this logic to a better place in
your code.

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."