Date: Fri, 28 Jan 2011 17:10:30 +0100
From: Ivan Voras <ivoras@freebsd.org>
To: John Baldwin <jhb@freebsd.org>
Cc: freebsd-hackers@freebsd.org
Subject: Re: Namecache lock contention?
Message-ID: <AANLkTimyFXopbVvJuTYH0Ck2Z4ze5s8F_nb1KFn00FnG@mail.gmail.com>
In-Reply-To: <201101281015.36218.jhb@freebsd.org>
References: <ihuhav$qso$1@dough.gmane.org> <201101281015.36218.jhb@freebsd.org>

On 28 January 2011 16:15, John Baldwin <jhb@freebsd.org> wrote:
> On Friday, January 28, 2011 8:46:07 am Ivan Voras wrote:
>> I have this situation on a PHP server:
>>
>> 36623 www         1  76    0   237M 30600K *Name   6   0:14 47.27% php-cgi
>> 36638 www         1  76    0   237M 30600K *Name   3   0:14 46.97% php-cgi
>> 36628 www         1 105    0   237M 30600K *Name   2   0:14 46.88% php-cgi
>> 36627 www         1 105    0   237M 30600K *Name   0   0:14 46.78% php-cgi
>> 36639 www         1 105    0   237M 30600K *Name   5   0:14 46.58% php-cgi
>> 36643 www         1 105    0   237M 30600K *Name   7   0:14 46.39% php-cgi
>> 36629 www         1  76    0   237M 30600K *Name   1   0:14 46.39% php-cgi
>> 36642 www         1 105    0   237M 30600K *Name   2   0:14 46.39% php-cgi
>> 36626 www         1 105    0   237M 30600K *Name   5   0:14 46.19% php-cgi
>> 36654 www         1 105    0   237M 30600K *Name   7   0:13 46.19% php-cgi
>> 36645 www         1 105    0   237M 30600K *Name   1   0:14 45.75% php-cgi
>> 36625 www         1 105    0   237M 30600K *Name   0   0:14 45.56% php-cgi
>> 36624 www         1 105    0   237M 30600K *Name   6   0:14 45.56% php-cgi
>> 36630 www         1  76    0   237M 30600K *Name   7   0:14 45.17% php-cgi
>> 36631 www         1 105    0   237M 30600K RUN     4   0:14 45.17% php-cgi
>> 36636 www         1 105    0   237M 30600K *Name   3   0:14 44.87% php-cgi
>>
>> It looks like periodically most or all of the php-cgi processes are
>> blocked in "*Name" for long enough that "top" notices, then continue,
>> probably in a "thundering herd" way. From grepping inside /sys the most
>> likely suspect seems to be something in the namecache, but I can't find
>> exactly a symbol named "Name" or string beginning with "Name" that would
>> be connected to a lock.
>
> In vfs_cache.c:
>
> static struct rwlock cache_lock;
> RW_SYSINIT(vfscache, &cache_lock, "Name Cache");

You're right, I misread it as SYSCTL at a glance.

> What are the php scripts doing?  Do they all try to create and delete files at
> the same time (or do renames)?

Right again - they do simultaneously create session files and on rare
occasions (1%) delete them. These are "sharded" into a two-level
directory structure by single letter (/storage/a/b/file, i.e. 32^2
directories); dirhash is large enough.

During all this, the web server did around 60 PHP pages per second so
it doesn't look to me like there should be such noticeable contention
(i.e. at most, there are 60 files/s created and on average 60/100
deletes).

The file system is on softupdates, there's only light IO. Typical vmstat is:

 procs      memory      page                    disks     faults         cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr da0 da1   in     sy     cs us sy id
17 0 0   8730M  1240M     3   0   0   0   206   0   1   0 1948 266928  15079 65 34  1
19 0 0   8730M  1240M     0   0   0   0   290   0   1  24 1835 260618  15132 63 35  2
 7 0 0   8730M  1239M     0   0   0   0   200   0   0   0 1822 260783  14851 63 35  2
16 0 0   8730M  1239M     0   0   0   0   199   0 788   0 2744 259902  20465 61 37  2
16 0 0   8730M  1239M     0   0   0   0   210   0   0   0 1755 265081  17564 61 37  2

(8 cores; around 35% sys load across them - I'm trying to find out why).