Date: Tue, 10 Apr 2007 14:43:04 -0400
From: Kris Kennaway <kris@obsecurity.org>
To: Mark Kirkwood <markir@paradise.net.nz>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>, performance@FreeBSD.org, current@FreeBSD.org, Kris Kennaway <kris@obsecurity.org>
Subject: Re: Anyone interested in improving postgresql scaling?
Message-ID: <20070410184304.GB44123@xor.obsecurity.org>
In-Reply-To: <461B69C0.4060707@paradise.net.nz>
References: <20070226002234.GA80974@xor.obsecurity.org> <461B69C0.4060707@paradise.net.nz>
On Tue, Apr 10, 2007 at 10:41:04PM +1200, Mark Kirkwood wrote:
> Kris Kennaway wrote:
> > If so, then your task is the following:
> >
> > Make SYSV semaphores less dumb about process wakeups.  Currently,
> > whenever the semaphore state changes, all processes sleeping on the
> > semaphore are woken, even if we have only released enough resources
> > for one waiting process to claim, i.e. there is a thundering-herd
> > wakeup situation which destroys performance at high loads.  Fixing
> > this will involve replacing the wakeup() calls with appropriate
> > amounts of wakeup_one().
>
> I'm forwarding this to the pgsql-hackers list so that folks more
> qualified than I can comment, but as I understand it, the way postgres
> implements locking, each process has its *own* semaphore that it waits
> on - and who is waiting for what is controlled by an in-memory (shared)
> hash of lock structs (access to these is controlled via
> platform-dependent spinlock code).  So a given semaphore state change
> should only involve one process wakeup.

I have not studied the exact code path, but there are indeed multiple
wakeups happening from the semaphore code (as many as the number of
active postgresql processes).  It is easy to instrument
sleepq_broadcast() and log the wakeups as they happen.

Anyway, mux@ fixed this some time ago, and it indeed helped scaling for
traffic over a local domain socket (particularly at higher loads), but I
saw some anomalous results when using loopback TCP traffic.  I think
this is unrelated: in that situation TCP is highly contended, and fixing
one bottleneck in a highly contended workload can make it perform worse,
because the old bottleneck was effectively serializing things a bit and
damping the non-linear behaviour.  I am still investigating, so the
patch has not yet been committed.

Kris
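A minimal sketch of the wakeup() vs. wakeup_one() change described in the
quoted task.  Only wakeup() and wakeup_one() are the real kernel
primitives; the surrounding resource counter and function names are made
up for illustration and are not the actual sys/kern/sysv_sem.c code:

    /* Illustrative only; not FreeBSD source. */
    #include <sys/param.h>
    #include <sys/systm.h>           /* wakeup(), wakeup_one() */

    struct my_sema {
        int count;                   /* resources currently available */
    };

    /*
     * Old pattern: every release broadcasts to all sleepers, so N
     * waiting processes wake up, N-1 of them find nothing to claim and
     * go straight back to sleep: the thundering herd.
     */
    static void
    release_broadcast(struct my_sema *s, int nreleased)
    {
        s->count += nreleased;
        wakeup(s);                   /* wakes every thread sleeping on s */
    }

    /*
     * Proposed pattern: wake only as many sleepers as there are
     * resources to hand out.
     */
    static void
    release_one_per_unit(struct my_sema *s, int nreleased)
    {
        int i;

        s->count += nreleased;
        for (i = 0; i < nreleased; i++)
            wakeup_one(s);           /* at most one sleeper per unit */
    }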
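And a small userland sketch of the per-process semaphore scheme Mark
describes: each backend blocks on its own SysV semaphore and is posted
individually, so a single state change should concern a single waiter.
The function names here are made up for illustration and are not
PostgreSQL's actual wrappers:

    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>
    #include <errno.h>

    /* Block until another process posts our semaphore. */
    static void
    my_sema_wait(int semid, unsigned short semnum)
    {
        struct sembuf op = { .sem_num = semnum, .sem_op = -1, .sem_flg = 0 };

        while (semop(semid, &op, 1) < 0 && errno == EINTR)
            ;                        /* retry if interrupted by a signal */
    }

    /* Wake the one process waiting on the given semaphore. */
    static void
    my_sema_post(int semid, unsigned short semnum)
    {
        struct sembuf op = { .sem_num = semnum, .sem_op = 1, .sem_flg = 0 };

        (void)semop(semid, &op, 1);  /* real code would check for errors */
    }

From userland each release targets exactly one semaphore, so how many
processes actually wake up comes down to what the kernel's SysV semaphore
code does with its sleepers, which is what Kris's sleepq_broadcast()
instrumentation is observing.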