Date: Wed, 16 Sep 2020 01:24:34 -0600
From: Warner Losh <imp@bsdimp.com>
To: Julian Grajkowski <julian.grajkowski@gmail.com>
Cc: freebsd-drivers@freebsd.org
Subject: Re: getpid() performance
Message-ID: <CANCZdfrSa7pfOWRLr9jji3ePshTOxFMYmy3=fQAonOdJO9a50g@mail.gmail.com>
In-Reply-To: <CAGQdsJhgAM4aHY46SaGDpXJGOdEzG6rD93WzG7RT9PWi_%2BwJrA@mail.gmail.com>
References: <CAGQdsJhgAM4aHY46SaGDpXJGOdEzG6rD93WzG7RT9PWi_%2BwJrA@mail.gmail.com>
On Wed, Sep 16, 2020 at 1:15 AM Julian Grajkowski <julian.grajkowski@gmail.com> wrote:

> Hi,
>
> I am working on a contiguous memory allocator which frequently calls
> getpid() in user space and I have noticed very poor performance of this
> function call. I measured this call's performance using the following code:
>
> inline uint64_t rdtsc_start(void)
> {
>     uint32_t cycles_high;
>     uint32_t cycles_low;
>
>     asm volatile("lfence\n\t"
>                  "rdtscp\n\t"
>                  "mov %%edx, %0\n\t"
>                  "mov %%eax, %1\n\t"
>                  : "=r" (cycles_high), "=r" (cycles_low)
>                  : : "%rax", "%rdx", "%rcx");
>
>     return (((uint64_t)cycles_high << 32) | cycles_low);
> }
>
>
> inline uint64_t rdtsc_end(void)
> {
>     uint32_t cycles_high;
>     uint32_t cycles_low;
>
>     asm volatile("rdtscp\n\t"
>                  "mov %%edx, %0\n\t"
>                  "mov %%eax, %1\n\t"
>                  "lfence\n\t"
>                  : "=r" (cycles_high), "=r" (cycles_low)
>                  : : "%rax", "%rdx", "%rcx");
>
>     return (((uint64_t)cycles_high << 32) | cycles_low);
> }
>
> This way I measured ~320 cycles used for getpid() on FreeBSD 12.1. For
> comparison, in Linux (CentOS 7) this call uses ~10 cycles. I am aware that
> this should not be compared directly, as these are different systems, but
> such a big difference in performance is an issue for me, as getpid() is
> called very often in my code.
>
> Is such poor performance of getpid() a known problem and is it possible
> that this might be improved in future releases?
>

glibc optimizes the getpid() system call so that it only makes the call once
and then returns a cached value (which is in line with 10 cycles: there's no
way you can save/restore state in 10 cycles, let alone do a dispatch).
FreeBSD doesn't.

Warner

> Measurements were done on the same machine with the following setup:
>
> CPU: Intel(R) Atom(TM) CPU C3958 @ 2.00GHz (2000.06-MHz K8-class CPU)
> FreeBSD/SMP: Multiprocessor System Detected: 16 CPUs
>
> 8GB RAM (2x4GB):
>   Type: DDR4
>   Type Detail: Synchronous Unbuffered (Unregistered)
>   Speed: 2400 MT/s
>
> Thank you very much in advance for any help.
>
> Kind regards,
> Julian
> _______________________________________________
> freebsd-drivers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-drivers
> To unsubscribe, send any mail to "freebsd-drivers-unsubscribe@freebsd.org"
>
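
For code that calls getpid() on a hot path, the caching described in the reply can be approximated in the application itself. The following is only a minimal sketch under stated assumptions: the names my_getpid, init_pid_cache and refresh_pid_in_child are hypothetical (they are not part of libc), and the cache is refreshed only across fork() via pthread_atfork(), so a child created with vfork() or rfork() could still see a stale value.

```c
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper names; none of these are provided by libc. */
static pid_t cached_pid;
static pthread_once_t pid_once = PTHREAD_ONCE_INIT;

/* Runs in the child after fork(), where the parent's cached PID is stale. */
static void
refresh_pid_in_child(void)
{
	cached_pid = getpid();
}

/* One-time setup: fetch the PID and register the fork handler. */
static void
init_pid_cache(void)
{
	cached_pid = getpid();
	(void)pthread_atfork(NULL, NULL, refresh_pid_in_child);
}

/* Returns the cached PID; the real system call is made only on first use. */
pid_t
my_getpid(void)
{
	(void)pthread_once(&pid_once, init_pid_cache);
	return (cached_pid);
}
```

After the first call, my_getpid() costs a pthread_once() check plus a load rather than a kernel entry. Building with -pthread is assumed here, so that the real pthread_atfork() implementation is linked in and the fork handler is actually registered.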