Date: Sat, 29 Jun 2013 17:48:59 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Alexander Motin <mav@FreeBSD.org>
Cc: Adrian Chadd <adrian@freebsd.org>, hackers@freebsd.org
Subject: Re: b_freelist TAILQ/SLIST
Message-ID: <20130629144859.GB91021@kib.kiev.ua>
In-Reply-To: <51CE8763.2090406@FreeBSD.org>
References: <51CCAE14.6040504@FreeBSD.org> <20130628065732.GL91021@kib.kiev.ua> <51CE0AF7.6090906@FreeBSD.org> <20130629023532.GW91021@kib.kiev.ua> <51CE8763.2090406@FreeBSD.org>
On Sat, Jun 29, 2013 at 10:06:11AM +0300, Alexander Motin wrote:
> I understand that lock attempt will steal cache line from lock owner.
> What I don't very understand is why avoiding it helps performance in
> this case. Indeed, having mutex on own cache line will not let other
> cores to steal also bswlist, but it also means that bswlist should be
> prefetched separately (and profiling shows resource stalls there). Or in
> this case separate speculative prefetch will be better then forced one
> which could be stolen? Is there cases when it is not, or the only reason
> to not pad all global mutexes is only saving memory?

I can speculate that this is a case where speculative execution helps.
If the mutex and the list head are on different cache lines, the CPU can
speculatively read the head, and then prove that executing the read
before the lock acquisition does not break the ordering rules (because
the lock protects the head, another core cannot modify the head if the
lock acquisition was successful).

I think this is very similar to the reason why locked instructions used
as barriers are faster than the explicit barriers: the CPU can still
execute speculatively past the lock prefix if the ordering is provably
consistent.

Please see the Intel IA32 architecture optimization manual, section
8.4.5, for the recommendations (but not much explanation).

Yes, I think putting all locks on dedicated cache lines is a waste;
only hot locks need this.
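
To make the layout being discussed concrete, here is a minimal userspace
sketch, not the kernel code. It assumes a 64-byte cache line
(CACHE_LINE below is my assumption, not a value taken from this thread)
and uses a C11 atomic_flag spinlock as a stand-in for the real mutex.
The names struct buf, struct hot_list, hot_push and hot_pop are
hypothetical. The only point is that the lock and the list head it
protects land on different cache lines.

```c
#include <assert.h>     /* static_assert (C11) */
#include <stdalign.h>   /* alignas */
#include <stdatomic.h>  /* atomic_flag */
#include <stddef.h>     /* offsetof, NULL */

/* Assumption: 64-byte cache lines, as on common x86 parts. */
#define CACHE_LINE 64

/* Hypothetical list element, standing in for a buffer header. */
struct buf {
	struct buf *next;
};

/*
 * Hot lock and the list head it protects, each on its own cache line.
 * A contended lock attempt then bounces only the lock's line, and the
 * head can be fetched (possibly speculatively) independently of it.
 */
struct hot_list {
	alignas(CACHE_LINE) atomic_flag lock;   /* stand-in for the mutex */
	alignas(CACHE_LINE) struct buf *head;   /* protected by lock */
};

/* Compile-time check that the two fields do not share a line. */
static_assert(offsetof(struct hot_list, lock) / CACHE_LINE !=
    offsetof(struct hot_list, head) / CACHE_LINE,
    "lock and head share a cache line");

static void
hot_lock(struct hot_list *l)
{
	/* Spin until the flag was previously clear; acquire ordering. */
	while (atomic_flag_test_and_set_explicit(&l->lock,
	    memory_order_acquire))
		;
}

static void
hot_unlock(struct hot_list *l)
{
	atomic_flag_clear_explicit(&l->lock, memory_order_release);
}

/* Push b onto the front of the list under the lock. */
static void
hot_push(struct hot_list *l, struct buf *b)
{
	hot_lock(l);
	b->next = l->head;
	l->head = b;
	hot_unlock(l);
}

/* Pop the first element under the lock; returns NULL if empty. */
static struct buf *
hot_pop(struct hot_list *l)
{
	struct buf *b;

	hot_lock(l);
	b = l->head;
	if (b != NULL)
		l->head = b->next;
	hot_unlock(l);
	return (b);
}
```

The cost is visible in sizeof(struct hot_list): two full cache lines
for one flag and one pointer, which is why this only seems worth doing
for locks that are actually hot.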