Date: Sat, 8 Jan 2011 15:39:59 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Joerg Sonnenberger <joerg@britannica.bec.de>
Cc: freebsd-hackers@freebsd.org
Subject: Re: MONITOR/MWAIT question
Message-ID: <201101082339.p08Ndx5X015401@apollo.backplane.com>
References: <201012182224.oBIMOYfl067212@apollo.backplane.com> <20101219070515.GB2433@britannica.bec.de>

:On Sat, Dec 18, 2010 at 02:24:34PM -0800, Matthew Dillon wrote:
:> Does anyone know if an IRET cancels/triggers a MONITOR event?
:
:AMD's Architecture Programmer's Manual explicitly contains:
:
:Events that cause an exit from the monitor event pending state include:
:...
:- Any far control transfer that occurs between the MONITOR and the
:MWAIT.
:
:Joerg

Yah. The Intel documentation listed specific instructions and said something about a 'far call' but wasn't generic enough. My AMD manuals are too old, I'm getting a new set. The AMD manual using the 'any far control transfer' terminology implies that IRET is also covered.

Another interesting question came up, and that is whether a write on the same cpu that MONITOR was run on (without a far control transfer) can trigger a later MWAIT. i.e. MONITOR addr, INCL addr, MWAIT addr, all on the same cpu, so that the MWAIT would then effectively be a NOP. The MONITOR/MWAIT stuff apparently ties into the cpu's cache management architecture, and a local write to a cache line which is already exclusive might not count, so I'm not sure if that case is covered. I can't find a definitive answer so at some point I'll actually code something up and test it. It isn't a case which current uses trigger but I don't like question marks.

Right now it looks like MONITOR/MWAIT works quite nicely with a pseudo-FIFO reservation model for handling cpu contention. Basically you have a windex and a rindex. You reserve a 'spot' using XADD on the windex and then resolve the cpu<->cpu contention with MONITOR/CMP/MWAIT's on rindex. Only the owner of the rindex (when rindex matches the reserved index, which is exactly one cpu out of the N contending cpus) can increment rindex. That way only *ONE* cpu at a time is trying to get the spin lock against the current lock holder instead of all the cpus contending with each other to try to get the spin lock from the current lock holder.
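
For illustration only, the windex/rindex reservation described above might be sketched in user-space C11 roughly as follows. This is one reading of the description, not the actual kernel code: the struct, the function names and the `locked` word are hypothetical, and because MONITOR/MWAIT are normally privileged instructions, a plain re-read loop stands in for the MONITOR/CMP/MWAIT sequence on rindex.

```c
#include <stdatomic.h>

/* Hypothetical layout; only the names windex and rindex come from the message. */
struct fifo_lock {
    atomic_uint windex;  /* next spot to hand out (XADD target) */
    atomic_uint rindex;  /* spot currently allowed to contend for the lock */
    atomic_uint locked;  /* the underlying spin lock word (assumed) */
};

/* Placeholder for the wait step. Kernel code would MONITOR the address,
 * re-check the value, then MWAIT; this sketch simply returns and re-reads. */
static inline void wait_hint(void)
{
}

static void fifo_lock_acquire(struct fifo_lock *l)
{
    /* Reserve a spot; on x86 an atomic fetch-add is typically a LOCK XADD. */
    unsigned my = atomic_fetch_add_explicit(&l->windex, 1,
                                            memory_order_relaxed);

    /* Wait until our reservation reaches the head of the FIFO. */
    while (atomic_load_explicit(&l->rindex, memory_order_acquire) != my)
        wait_hint();

    /* We are now the single cpu contending with the current holder. */
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire))
        wait_hint();

    /* As owner of rindex, advance it so the next waiter becomes the
     * single contender. */
    atomic_fetch_add_explicit(&l->rindex, 1, memory_order_release);
}

static void fifo_lock_release(struct fifo_lock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

A zero-initialized `struct fifo_lock` starts out unlocked with an empty queue. Unsigned wraparound of the two indices is harmless here because they are only ever compared for equality.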

Exponential backoff seems to fail horribly once you get over 8 cpus or so, but the pseudo-FIFO methodology seems to work well up to the maximum I've been able to test on (48 cpus).

-Matt
Matthew Dillon <dillon@backplane.com>