Date:      Mon, 8 May 2006 02:52:08 -0400
From:      Kris Kennaway <kris@obsecurity.org>
To:        Kris Kennaway <kris@obsecurity.org>
Cc:        performance@FreeBSD.org, Robert Watson <rwatson@FreeBSD.org>, current@FreeBSD.org
Subject:   Re: Fine-grained locking for POSIX local sockets (UNIX domain sockets)
Message-ID:  <20060508065207.GA20386@xor.obsecurity.org>
In-Reply-To: <20060507230430.GA6872@xor.obsecurity.org>
References:  <20060506150622.C17611@fledge.watson.org> <20060506221908.GB51268@xor.obsecurity.org> <20060507210426.GA4422@xor.obsecurity.org> <20060507214153.GA5275@xor.obsecurity.org> <20060507230430.GA6872@xor.obsecurity.org>

OK, David's patch fixes the umtx thundering herd (and seems to give a
4-6% boost).  I also fixed a thundering herd in FILEDESC_UNLOCK (which
was also waking up 2-7 CPUs at once about 30% of the time) by doing
s/wakeup/wakeup_one/.  This did not seem to affect performance on this
test, though.
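
As a rough userland analogue of the wakeup -> wakeup_one change, the
sketch below posts work items to a pool of threads and wakes either
every waiter (pthread_cond_broadcast) or a single one
(pthread_cond_signal), counting how often a thread returns from the
wait.  This is not the kernel code: struct herd, worker() and
run_herd() are hypothetical names, and POSIX condition variables only
stand in for the kernel sleep channels.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_WORKERS 16

/* Hypothetical shared state; not taken from the FreeBSD sources. */
struct herd {
    pthread_mutex_t mtx;
    pthread_cond_t  cv;
    int  pending;   /* items posted but not yet taken */
    int  consumed;  /* items taken by workers */
    int  wakeups;   /* returns from pthread_cond_wait() */
    bool done;      /* producer has finished posting */
};

static void *
worker(void *arg)
{
    struct herd *h = arg;

    pthread_mutex_lock(&h->mtx);
    for (;;) {
        /* Sleep until there is work or the producer is finished. */
        while (h->pending == 0 && !h->done) {
            pthread_cond_wait(&h->cv, &h->mtx);
            h->wakeups++;
        }
        if (h->pending == 0)
            break;              /* done and nothing left to take */
        h->pending--;
        h->consumed++;
    }
    pthread_mutex_unlock(&h->mtx);
    return NULL;
}

/*
 * Post `items` work items to `nworkers` threads.  With wake_all set,
 * every post wakes all waiters (the herd); otherwise it wakes one.
 * Returns the number of items consumed; *wakeups_out gets the number
 * of wait returns, which varies from run to run.
 */
static int
run_herd(int nworkers, int items, bool wake_all, int *wakeups_out)
{
    struct herd h = { .pending = 0, .consumed = 0, .wakeups = 0,
                      .done = false };
    pthread_t tids[MAX_WORKERS];

    assert(nworkers > 0 && nworkers <= MAX_WORKERS);
    pthread_mutex_init(&h.mtx, NULL);
    pthread_cond_init(&h.cv, NULL);

    for (int i = 0; i < nworkers; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, &h);
        assert(rc == 0);
        (void)rc;
    }

    for (int i = 0; i < items; i++) {
        pthread_mutex_lock(&h.mtx);
        h.pending++;
        if (wake_all)
            pthread_cond_broadcast(&h.cv);  /* like wakeup() */
        else
            pthread_cond_signal(&h.cv);     /* like wakeup_one() */
        pthread_mutex_unlock(&h.mtx);
    }

    /* Shutdown has to reach every waiter, so broadcast here. */
    pthread_mutex_lock(&h.mtx);
    h.done = true;
    pthread_cond_broadcast(&h.cv);
    pthread_mutex_unlock(&h.mtx);

    for (int i = 0; i < nworkers; i++)
        pthread_join(tids[i], NULL);

    pthread_cond_destroy(&h.cv);
    pthread_mutex_destroy(&h.mtx);
    if (wakeups_out != NULL)
        *wakeups_out = h.wakeups;
    return h.consumed;
}
```

Calling run_herd(4, 1000, true, &w) and run_herd(4, 1000, false, &w)
and comparing w gives a rough picture of the wasted wakeups; both
calls should consume all 1000 items either way.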

It seems to me that a more useful way to sort the mutex profiling list
is by the ratio of contention to total acquisitions.  Here is the list
re-sorted by cnt_hold/count, keeping only the top 40 values of count
and the mutexes with nonzero contention:

Before:
   max      total      count    avg   cnt_hold  cnt_lock   ratio name
   275     507115     166457      3        907       1348   .005 kern/vfs_bio.c:357 (needsbuffer lock)
   310     487209     166460      2       1158        645   .006 kern/vfs_bio.c:315 (needsbuffer lock)
  1084    3860336     166507     23       1241       1377   .007 kern/vfs_bio.c:1445 (buf queue lock)
  1667   35018604     320038    109       3877          0   .012 kern/uipc_usrreq.c:696 (unp_mtx)
   379    2143505     635740      3      10736      37083   .016 kern/sys_socket.c:176 (so_snd)
  1503    4311935     502656      8       8664       9312   .017 kern/kern_lock.c:163 (lockbuilder mtxpool)
   875    3495175     166487     20       3394       4272   .020 kern/vfs_bio.c:2424 (vnode interlock)
  2084  121390320    2880081     42      67339      79525   .023 kern/uipc_usrreq.c:581 (so_snd)
   909    1809346     165769     10       4454       9597   .026 kern/vfs_vnops.c:796 (vnode interlock)
   277     518716     166442      3       5034       5172   .030 kern/vfs_bio.c:1464 (vnode interlock)
  1565   10515648     282278     37      15760      10821   .055 kern/subr_sleepqueue.c:374 (process lock)
   492    2500241     634835      3      54003      62520   .085 kern/kern_sig.c:1002 (process lock)
   569     335913      30022     11       3262       2176   .108 kern/kern_sx.c:245 (lockbuilder mtxpool)
  1378   27840143     320038     86      42183       1453   .131 kern/uipc_usrreq.c:705 (so_rcv)
   300    1011100     320045      3      52423      30742   .163 kern/uipc_socket.c:1101 (so_snd)
   437   10472850    3200213      3     576918     615361   .180 kern/kern_resource.c:1172 (sleep mtxpool)
  2052   46242974     320039    144      80690      80729   .252 kern/uipc_usrreq.c:617 (unp_global_mtx)
   546   48160602    3683470     13    1488801     696814   .404 kern/kern_descrip.c:1988 (filedesc structure)
   395   13842967    3683470      3    1568927     685295   .425 kern/kern_descrip.c:1967 (filedesc structure)
   644   16700212     635731     26     606615     278511   .954 kern/kern_descrip.c:420 (filedesc structure)
   384    2863741     635774      4     654035     280340  1.028 kern/kern_descrip.c:368 (filedesc structure)
   604   22164433    2721994      8    5564709    2225496  2.044 kern/kern_synch.c:220 (process lock)

After:
   max      total      count    avg   cnt_hold  cnt_lock   ratio name
   168     467413     166364      2       1025       2655   .006 kern/vfs_bio.c:357 (needsbuffer lock)
   264     453972     166364      2       1688         44   .010 kern/vfs_bio.c:315 (needsbuffer lock)
   240    2011519     640106      3      12032      48460   .018 kern/sys_socket.c:176 (so_snd)
   425    5394174     514469     10      12838      15343   .024 kern/kern_lock.c:163 (lockbuilder mtxpool)
   514    5127131     166383     30       4417       5666   .026 kern/vfs_bio.c:1445 (buf queue lock)
   261     199860      38442      5       1405        475   .036 kern/kern_sx.c:245 (lockbuilder mtxpool)
   707  174604101    2880083     60     119723      84566   .041 kern/uipc_usrreq.c:581 (so_snd)
   126     520485     166351      3       7850       8574   .047 kern/vfs_bio.c:1464 (vnode interlock)
   364    1850567     165607     11       8077      22156   .048 kern/vfs_vnops.c:796 (vnode interlock)
   499    3233479     166432     19       9258       8468   .055 kern/vfs_bio.c:2424 (vnode interlock)
   754   42181810     320038    131      21236          0   .066 kern/uipc_usrreq.c:696 (unp_mtx)
   462   21081419    3685605      5     316514     243585   .085 kern/kern_descrip.c:1988 (filedesc structure)
   577   12178436     321182     37      28585      21082   .088 kern/subr_sleepqueue.c:374 (process lock)
   221    2410704     640387      3      75056      77553   .117 kern/kern_sig.c:1002 (process lock)
   309   12026860    3685605      3     468707     331121   .127 kern/kern_descrip.c:1967 (filedesc structure)
   299     973885     320046      3      60629      72506   .189 kern/uipc_socket.c:1101 (so_snd)
   471    6132557     640097      9     125478      98778   .196 kern/kern_descrip.c:420 (filedesc structure)
   737   33114067     320038    103      85243          1   .266 kern/uipc_usrreq.c:705 (so_rcv)
   454    5866777     878113      6     240669     364921   .274 kern/kern_synch.c:220 (process lock)
   365    2308060     640133      3     183152     142569   .286 kern/kern_descrip.c:368 (filedesc structure)
   220   10297249    3200211      3    1117448    1175412   .349 kern/kern_resource.c:1172 (sleep mtxpool)
   947   57806295     320040    180     132456     109179   .413 kern/uipc_usrreq.c:617 (unp_global_mtx)
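
For reference, the ratio column in both tables is cnt_hold/count.  A
minimal sketch of the filtering and sorting described above could look
like this, assuming the profiling rows have already been parsed into
an array.  struct mtx_row and the function names are hypothetical, and
"nonzero contention" is taken here to mean cnt_hold > 0.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory form of one mutex profiling row. */
struct mtx_row {
    unsigned long count;     /* total acquisitions */
    unsigned long cnt_hold;  /* contention counter used as numerator */
    const char   *name;      /* file:line (lock name) */
};

/* cnt_hold / count, the "ratio" column. */
static double
contention_ratio(const struct mtx_row *r)
{
    return r->count != 0 ? (double)r->cnt_hold / (double)r->count : 0.0;
}

static int
by_count_desc(const void *a, const void *b)
{
    const struct mtx_row *x = a, *y = b;

    return (x->count < y->count) - (x->count > y->count);
}

static int
by_ratio_asc(const void *a, const void *b)
{
    double rx = contention_ratio(a), ry = contention_ratio(b);

    return (rx > ry) - (rx < ry);
}

/*
 * Keep the top_n rows by count, drop rows with cnt_hold == 0, and sort
 * what is left by ascending ratio.  Rows are reordered in place; the
 * return value is the number of rows kept at the front of the array.
 */
static size_t
rank_by_contention(struct mtx_row *rows, size_t n, size_t top_n)
{
    size_t kept = 0;

    if (n == 0)
        return 0;
    assert(rows != NULL);

    qsort(rows, n, sizeof(*rows), by_count_desc);
    if (n > top_n)
        n = top_n;
    for (size_t i = 0; i < n; i++)
        if (rows[i].cnt_hold > 0)
            rows[kept++] = rows[i];
    qsort(rows, kept, sizeof(*rows), by_ratio_asc);
    return kept;
}
```

With top_n set to 40 this mirrors the selection used for the tables,
with the most contended locks sorted to the end.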

filedesc contention is down by a factor of 3-4, with a corresponding
reduction in the average hold time.  The process lock contention
coming from the signal delivery wakeup has also gone way down for some
reason.

unp contention has risen a bit.  The other big increase is in sleep
mtxpool contention, which roughly doubled (ratio .180 to .349 at
kern/kern_resource.c:1172):

/*
 * Change the total socket buffer size a user has used.
 */
int
chgsbsize(uip, hiwat, to, max)
        struct  uidinfo *uip;
        u_int  *hiwat;
        u_int   to;
        rlim_t  max;
{
        rlim_t new;

        UIDINFO_LOCK(uip);

So the next question is how can that be optimized?

Kris