Date: Mon, 28 Jun 1999 05:03:54 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: dillon@apollo.backplane.com (Matthew Dillon)
Cc: freebsd-smp@FreeBSD.ORG
Subject: Re: high-efficiency SMP locks - submission for review
Message-ID: <199906280503.WAA26806@usr04.primenet.com>
In-Reply-To: <199906272024.NAA15634@apollo.backplane.com> from "Matthew Dillon" at Jun 27, 99 01:24:01 pm
> I would like to know what the SMP gurus think of this code. I have
> not posted the complete patch, but have included the meat of the code.
> I am interested in comments relating to the implementation & performance
> aspects of the code. I would also appreciate it if an inline assembly
> expert could look at my inline assembly. I have tested it well and
> I believe it to be correct, but I don't write much inline assembly so...
>
> This code is designed to implement extremely low overhead SMP-compatible
> locks. It will probably be used instead of lockmgr() locks for the
> buffer cache, and I hope to use it as a basis to replace lockmgr() locks
> in other modules later on. The same structure can be used to implement
> both spin locks and normal blocking locks, and can handle shared,
> exclusive, and recursive-exclusive lock types. The critical path has
> been optimized down to 20 instructions or so (compared to the hundreds
> in lockmgr()).
>
> As many of you may already know, lockmgr() locks are extremely expensive
> and require interrupt synchronization for their internal simplelocks to
> operate properly in a mixed process/interrupt environment. The result is
> a horrendous amount of overhead for very little gain. I am attempting
> to come up with a general locking module that can eventually replace
> lockmgr() locks.
The lockmgr() locks are ill suited to use for anything other than
advisory/mandatory file locking, IMO. Any move away from them is a
good thing. However, I have some comments on the implementation.
1) SMP locking should be done utilizing intention modes. I
had a nice discussion with Mike Smith and Julian over the
proposed locking mechanism for non-intention mode multiple
reader, single writer locks for the buffer cache. This
boiled down to:
A) Use of non-intention mode locks has a serious
serialization penalty.
B) Use of locks, where the write queue has a shared
intention exclusive (SIX) lock implicit to the
enqueueing of synchronization point data (e.g. a
soft dependency on a directory entry block with
one backed out operation) _significantly_ reduces
the serialization overhead.
C) There's really no reason that the blocks that are
not locked exclusive on the list can't be updated
by user processes. This resolves the Ziff-Davis
Labs benchmark issue, more cleanly than the way
that Mike stated Kirk proposed to solve the problem.
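As a concrete reference for the intention modes discussed above, here is a
sketch of the classical compatibility matrix (IS, IX, S, SIX, X). The mode
names and table come from the standard database-locking literature, not from
the posted patch; the function name is mine:

```c
#include <assert.h>

/* Classical intention lock modes; illustrative only. */
enum iml_mode { IML_IS, IML_IX, IML_S, IML_SIX, IML_X, IML_NMODES };

/* compat[held][requested]: 1 if the requested mode can coexist with a held mode. */
static const char compat[IML_NMODES][IML_NMODES] = {
        /*           IS  IX   S  SIX   X */
        /* IS  */  {  1,  1,  1,  1,  0 },
        /* IX  */  {  1,  1,  0,  0,  0 },
        /* S   */  {  1,  0,  1,  0,  0 },
        /* SIX */  {  1,  0,  0,  0,  0 },
        /* X   */  {  0,  0,  0,  0,  0 },
};

int
iml_compatible(enum iml_mode held, enum iml_mode requested)
{
        return compat[held][requested];
}
```

Note that SIX coexists only with IS readers: this is exactly the property
that lets enqueued synchronization-point data stay writable by its owner
while non-exclusive blocks remain readable.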
2) In general, as regards SMP locking, I think your lock
structure is too abbreviated:
> struct qlock {
> volatile int count; /* lock refs: neg=exclusive, pos=shared */
> int waiting; /* process(s) are waiting on lock */
> struct proc *holder; /* holder of an exclusive lock */
> };
In particular, the use of a single wait count means that you
will not have an ordered list for the equivalent of a "wake
one". This is a real problem, since it allows a deadly
embrace deadlock to occur when two kernel contexts each hold
one lock and want to acquire the other context's lock.
I believe that resolving this as an EWOULDBLOCK, and then
backtracking the stack and any partially complete operations
is prohibitive.
The use of a count, rather than a relationship between
the lock holder(s) and the things being locked, is
similarly problematic.
Finally, there is insufficient mechanism to avoid starvation
under competition, where a writer blocks indefinitely as
multiple readers pass the resource between each other.
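To make the starvation point concrete, here is a minimal sketch of an
ordered wait list with writer preference; the structure and names are
hypothetical, not from the posted patch:

```c
#include <stddef.h>

/* Hypothetical ordered waiter record. */
struct waiter {
        struct waiter *next;
        int exclusive;                  /* nonzero for a waiting writer */
};

struct ordered_lock {
        int count;                      /* neg = exclusive, pos = shared, as in the patch */
        struct waiter *head;            /* FIFO wait list replacing the bare `waiting` count */
};

/*
 * A reader may enter only if the lock is not held exclusively and no
 * writer is queued ahead of it.  Queueing readers behind a waiting
 * writer is what prevents readers from passing the resource among
 * themselves indefinitely.
 */
int
reader_may_enter(const struct ordered_lock *lk)
{
        const struct waiter *w;

        if (lk->count < 0)
                return 0;               /* exclusively held */
        for (w = lk->head; w != NULL; w = w->next)
                if (w->exclusive)
                        return 0;       /* writer already waiting: go behind it */
        return 1;
}
```

An ordered list also gives you "wake one" for free: wake the head entry,
and wake subsequent shared entries only while no exclusive entry intervenes.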
I believe the following is the minimal set of structures
required to resolve the blocking operation inheritance and
deadlock detection:
/*
 * Minimal structures for blocking-operation inheritance and
 * deadlock detection.
 */
struct lockable;
typedef struct lockable LKA;
struct lockingentity;
typedef struct lockingentity LKE;
struct locklistentry;
typedef struct locklistentry LKLE;
struct lockentitylistentry;
typedef struct lockentitylistentry LKEE;

struct lockentitylistentry {
        LKEE *next;             /* next list entry */
        LKEE *inherit;          /* who inherits this dependency */
        LKE *entity;            /* entity waiting on us */
};

struct locklistentry {
        LKLE *next;             /* next lock in list */
        LKA *held;              /* a held lock */
};

struct lockingentity {
        LKLE *holding;          /* locks that entity is holding */
        LKA *waiton;            /* lock that entity is waiting on */
};

struct lockable {
        LKEE *holds;            /* the locking entities */
        LKEE *waits;            /* entities waiting for the lock */
};
This presumes a model where each context which wishes to
acquire a lock is considered a lockingentity (this works
equally well for kernel threads vs. processes vs. async
call gates, vs. pseudo async call gates ala lazy thread
context creation for kernel sleeps ala BSDI).
Each object that can be locked is considered a lockable,
and the relationship between a locking entity and a
lockable is a locklistentry.
The lockentitylistentry is used to inherit blocked,
pending operations to the root of the lock tree (graph).
This allows near instantaneous deadlock detection, and
the inheritance list combined with the graph represents
the transitive closure path between any lock you propose
and the locks which already exist.
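As a minimal sketch of the deadlock check this enables, using simplified
single-holder versions of the structures above (the names and the
single-holder restriction are mine, for illustration):

```c
#include <stddef.h>

/* Simplified wait-for graph; field names are illustrative. */
struct entity;

struct lockable_s {
        struct entity *holder;          /* single holder, for brevity */
};

struct entity {
        struct lockable_s *waiton;      /* lock this entity is blocked on, or NULL */
};

/*
 * Before letting `e` block on `target`, walk the holder -> waiton edges.
 * If the walk returns to `e`, granting the wait would close a cycle
 * (a deadly embrace).  Assumes prior checks have kept the graph
 * cycle-free, so the walk always terminates.
 */
int
would_deadlock(const struct entity *e, const struct lockable_s *target)
{
        const struct entity *h = target->holder;

        while (h != NULL) {
                if (h == e)
                        return 1;       /* we would transitively wait on ourselves */
                h = (h->waiton != NULL) ? h->waiton->holder : NULL;
        }
        return 0;
}
```

Since the check is a single pointer walk along edges that already exist,
detection cost is proportional to the depth of the wait chain, which is
what makes it near instantaneous in practice.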
Finally, one could envision a hierarchical relationship
between lockingentities, e.g. multiple threads within a
process, so as to avoid self-deadlock. This really
depends on your threads implementation; obviously, a user
space call conversion mechanism backed by multiple kernel
threads, and implemented wholly using asynchronous kernel
entries (e.g. an async call gate) would be immune to the
requirement. Other implementations would need something
like:
/*
* iml.c
*
* An intention mode based lock manager
*
* This software uses graph theory. A graph is a mathematical object
* that can accurately model a problem formulated in terms of objects
* and the relationships between them. Lock management is one such
* problem. I recommend:
*
* Algorithms in C++
* Robert Sedgewick
* Addison-Wesley Publishing Company
* ISBN 0-201-51059-6
*
* The Art of Computer Programming
* _Volume 1: Fundamental Algorithms_
* Donald Knuth
* Addison-Wesley Publishing Company
* ISBN 0-201-03809-9
*
* UNIX Systems for Modern Architectures
* _Symmetric Multiprocessing and Caching for Kernel Programmers_
* Curt Schimmel
* Addison-Wesley Publishing Company
* ISBN 0-201-63338-8
*
*/
struct lockable;
typedef struct lockable IML_L;
struct lockentity;
typedef struct lockentity IML_E;
struct locklock;
typedef struct locklock IML_LOCK;
/*
* A node that can have a lock applied against it
*/
struct lockable {
        IML_L *parent;          /* our parent lockable */
        IML_L *sibling;         /* sibling lockable(s), if any */
        IML_L *child;           /* child lockable(s), if any */
        IML_LOCK *locks;        /* locks that are held */
        IML_LOCK *waits;        /* locks that are waiting */
        unsigned long nesting_lvl;      /* depth in the lockable hierarchy */
};
/*
* A lock entity node for holding locks and entity relationships
*/
struct lockentity {
        IML_E *sibling;         /* entities with a relationship to us */
        IML_LOCK *locks;        /* the list of locks we hold */
};
/*
* A lock instance
*/
struct locklock {
        IML_E *entity;          /* the entity that applied the lock */
        IML_L *lockable;        /* where the lock was applied */
        IML_LOCK *enextlock;    /* the next lock by the entity */
        IML_LOCK *lnextlock;    /* the next lock on the lockable */
        unsigned long mode;     /* the mode of this lock */
};
Note: This doesn't deal with the inheritance issue for waiters,
which would use a structure similar to the structure from the
previous example.
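As a sketch of how a lock instance would be threaded onto both lists via
enextlock and lnextlock (the helper name is hypothetical, and compatibility
checking and blocking are elided; hierarchy fields are dropped for brevity):

```c
#include <stdlib.h>

/* Cut-down versions of the structures above, for a standalone sketch. */
typedef struct locklock IML_LOCK;
typedef struct lockentity IML_E;
typedef struct lockable_n IML_L;

struct lockentity { IML_LOCK *locks; };
struct lockable_n { IML_LOCK *locks; IML_LOCK *waits; };
struct locklock {
        IML_E *entity;          /* the entity that applied the lock */
        IML_L *lockable;        /* where the lock was applied */
        IML_LOCK *enextlock;    /* the next lock by the entity */
        IML_LOCK *lnextlock;    /* the next lock on the lockable */
        unsigned long mode;     /* the mode of this lock */
};

/*
 * Hypothetical helper: allocate a lock instance and push it onto both
 * the entity's lock list and the lockable's held list, so either side
 * of the relationship can be walked from either end.
 */
IML_LOCK *
iml_apply(IML_E *e, IML_L *l, unsigned long mode)
{
        IML_LOCK *lk = malloc(sizeof(*lk));

        if (lk == NULL)
                return NULL;
        lk->entity = e;
        lk->lockable = l;
        lk->mode = mode;
        lk->enextlock = e->locks;       /* push onto entity's lock list */
        e->locks = lk;
        lk->lnextlock = l->locks;       /* push onto lockable's held list */
        l->locks = lk;
        return lk;
}
```

Release is the mirror image: unlink from both lists, then re-examine the
lockable's waits list for waiters whose modes are now compatible.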
Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
