Date: Mon, 25 Sep 2000 03:02:13 +0000 (GMT) From: Terry Lambert <tlambert@primenet.com> To: grog@wantadilla.lemis.com (Greg Lehey) Cc: cp@bsdi.com (Chuck Paterson), archie@whistle.com (Archie Cobbs), brian@awfulhak.org (Brian Somers), joerg@cs.waikato.ac.nz (Joerg Micheel), mjacob@feral.com (Matthew Jacob), frank@exit.com (Frank Mayhar), jhb@pike.osd.bsdi.com (John Baldwin), markm@FreeBSD.ORG (Mark Murray), FreeBSD-arch@FreeBSD.ORG Subject: Re: Mutexes and semaphores (was: cvs commit: src/sys/conf files src/sys/sys random.h src/sys/dev/randomdev hash.c hash.h harvest Message-ID: <200009250302.UAA04620@usr05.primenet.com> In-Reply-To: <20000924154216.D512@wantadilla.lemis.com> from "Greg Lehey" at Sep 24, 2000 03:42:16 PM
next in thread | previous in thread | raw e-mail | index | archive | help
> >> A MUTEX is just a sepaphore whose initial count is 1. > >> > >> ?? > > > > In general this might be true, but in specific it isn't. > > As you know, I used to say exactly the same thing as Archie, but I've > realized that this implied count of 1 causes a couple of important > differences. I'm still working on a clearer definition, but what I've > seen so far is: > > 1. Because "mutexes" (I really hate this term; I wish I could find a > better one) only have an implied count of one, they can also have > the concept of an owner, which we use. > > 2. Because the mutex has an owner, only the owner can release it. > > 3. The mutex can also be "recursive" (it's really iterative, I > suppose): the owner can take it several times. The only reason > for this appears to be sloppy coding, but in the short term I > think we're agreed that we can't dispose of that. > > One thing that I don't think is important is the duration of > ownership. We currently use mutexes for short periods of time, which > is why we have the spin version. Actually, that's crucial, since it defines the conflict domain; you can acquire a heavy lock after contending with a spin lock for the right to acquire the heavy lock. In most cases, where the heavy lock is held a short time, you won't have any contention, and thus can quickly grant the resource. In the case of a long held resource, the contention domain is such that the resource is probably contended, and has waiters outstanding; this means that the release case (and thus the acquisition case) must be much heavier weight. I recommend: <http://www.cl.cam.ac.uk/Research/SRG/netos/pegasus/reports/TR361.txt> <http://www.wam.umd.edu/whats_new/workshop3.0/common-tools/numerical_comp_guide/ncg_glossary.doc.html> This second is a rather good glossary, which is duplicated in many places on the net. > At Tandem, we only used semaphores, but they always had a count of 1, > so they were effectively very close to our mutexes. They didn't allow > recursion, which is the Right Thing in a system designed from the > ground up, but they also didn't have owners. One of the most frequent > complicated problems we had were system hangs (deadlocks), and we > frequently couldn't figure out who had done what and why. Having > owners is a great debug aid. I think that we need to be very clear on one thing: you can recurse on a semaphore, but a true mutex will not permit recursion; it is a light weight object, and has very little content. It lacks a recurse count, and many other attributes of semaphores. Microsoft actually got this right in Windows, surprisingly. When you attempt to get a mutex you already hold, you are shooting yourself in the foot; it means that you didn't track the resource sufficiently. Usually this occurs when a mutex is acquired at one level, and released at another, or worse, when it is acquired in the wrong place (e.g. a subroutine called several times from a higher level routine, which should be acquiring the mutex instead). Disallowing recursion, mutex ownership is therefore implicit by virtue of the holder of the mutex holding it. In the case of a starvation or deadly embrace deadlock, one need only get a stack trace of processes currently in the kernel to determine where the problem lives; however, an owner would make this rather automatic, and could aid debugging, as you say. I do have a problem with this approach, however, since it makes it much more likely that people will be sloppy, and then wait for deadlocks to be reported, rather than thinking through their code and ensuring that deadlocks are not possible in the first place. The idea that fixing deadlocks in released code, rather than releasing only code without deadlocks, is an acceptable approach needs to be discouraged. > If we can expect that the mutex will, on average, be freed in less > time than it would take to schedule a new process, spin locks can be a > better alternative. Otherwise we wouldn't need them at all. > > Anyway, this doesn't directly relate to semaphores. We have the basic > issue of atomicity, which in general can be handled without spin > locks, and that would apply to semaphores just as much as to mutexes. The advantage to a test-and-set spin prior to acquisition of a mutex is that the mutex can be acquired without taking a cache synchronization hit between processors, which would otherwise be necessary. Some cache synchronization events will inevitably occur, but they will be much less frequent. The mutexes themselves can be in non-cached pages, to accomplish this. Terry Lambert terry@lambert.org --- Any opinions in this posting are my own and not those of my present or previous employers. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200009250302.UAA04620>