Date: Fri, 23 Nov 2007 12:28:33 +0100 From: Max Laier <max@love2party.net> To: freebsd-arch@freebsd.org Cc: Stephan Uphoff <ups@freebsd.org>, Attilio Rao <attilio@freebsd.org>, Alfred Perlstein <alfred@freebsd.org> Subject: Re: rwlocks, correctness over speed. Message-ID: <200711231228.41382.max@love2party.net> In-Reply-To: <20071123092415.GP44563@elvis.mu.org> References: <20071121222319.GX44563@elvis.mu.org> <47469328.8020404@freebsd.org> <20071123092415.GP44563@elvis.mu.org>
next in thread | previous in thread | raw e-mail | index | archive | help
--nextPart1382413.3MpHy4ZRLo Content-Type: multipart/mixed; boundary="Boundary-01=_klrRHoFB8DwWuBG" Content-Transfer-Encoding: 7bit Content-Disposition: inline --Boundary-01=_klrRHoFB8DwWuBG Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline On Friday 23 November 2007, Alfred Perlstein wrote: > * Stephan Uphoff <ups@freebsd.org> [071123 00:46] wrote: > > Alfred Perlstein wrote: > > >* Max Laier <max@love2party.net> [071122 14:40] wrote: > > >>On Thursday 22 November 2007, Attilio Rao wrote: > > >>>2007/11/22, Max Laier <max@love2party.net>: > > >>>>rwlocks are already used in places that do recursive reads. The > > >>>> one place I'm certain about is pfil(9) and we need a proper > > >>>> sollution for this. Before rwlocks were used, I had a handrolled > > >>>> locking that supported both read/write semantics and starvation > > >>>> avoidance - at the cost of failing to allow futher read access > > >>>> when a writer asked for access. This however, was quite > > >>>> application specific and not the most efficient implementation > > >>>> either. > > >>> > > >>>I'm not a pfil(9) expert, but for what I've heard, rmlocks should > > >>> be really good for it, shouldn't them? > > >>> > > >>>The concept is that if we want to maintain fast paths for rwlock > > >>> we cannot do too much tricks there. And you can really deadlock > > >>> if you allow recursion on readers... > > >> > > >>How about adding rwlock_try_rlock() which would do the following: > > >> 1) Only variant to allow[1] read recursion and ... > > >> 2) ... only if no outstanding write requests > > >> 3) Let the caller deal with failure > > >> > > >>This can be implemented statically, so no overhead in the fast > > >> path. The caller is in the best position to decide if it is > > >> recursing or not - could keep that info on the stack - and can > > >> either fail[2] or do a normal rwlock_rlock() which would wait for > > >> the writer to enter and exit. > > >> > > >>[2] In most situation where you use read locks you can fail or roll > > >> back carefully as you didn't change anything - obviously. In pfil > > >> - for instance - we just dropped the packet if there was a writer > > >> waiting. > > >> > > >>[1] "allow" in terms of WITNESS - if that can be done. > > > > > >The problem is that there is no tracking in the common case (without > > >additional overhead), so you can't know if you're recursing, unless > > >you track it yourself. > > > > > >-Alfred > > > > I talked with Attilio about that on IRC. > > Most common cases of writer starvation (but not all) could be solved > > by keeping a per thread count of shared acquired rwlocks. > > If a rwlock is currently locked as shared/read AND a thread is > > blocked on it to lock it exclusively/write - then new shared/read > > locks will only be granted to thread that already has a shared lock. > > (per thread shared counter is non zero) > > > > To be honest I am a bit twitchy about a lock without priority > > propagation - especially since in FreeBSD threads run with user > > priority in kernel > > space and can get preempted. > > > > Stephan > > That's an interesting hack, I guess it could be done. > > I would still like to disallow recursion. > > Can we come to a concensus on that? As long as we properly take care of all the fallout before doing so, I'm=20 all for it. Attached is a diff to switch pfil over to rmlocks which works for me. =20 I'll post it to -net and do the switch in a few days unless I hear=20 otherwise. =2D-=20 /"\ Best regards, | mlaier@freebsd.org \ / Max Laier | ICQ #67774661 X http://pf4freebsd.love2party.net/ | mlaier@EFnet / \ ASCII Ribbon Campaign | Against HTML Mail and News --Boundary-01=_klrRHoFB8DwWuBG Content-Type: text/x-diff; charset="iso-8859-1"; name="pfil.rmlock.diff" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="pfil.rmlock.diff" Index: pfil.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D RCS file: /home/ncvs/src/sys/net/pfil.c,v retrieving revision 1.14 diff -u -r1.14 pfil.c =2D-- pfil.c 2 Feb 2006 03:13:15 -0000 1.14 +++ pfil.c 23 Nov 2007 09:06:12 -0000 @@ -34,7 +34,7 @@ #include <sys/errno.h> #include <sys/lock.h> #include <sys/malloc.h> =2D#include <sys/rwlock.h> +#include <sys/rmlock.h> #include <sys/socket.h> #include <sys/socketvar.h> #include <sys/systm.h> @@ -66,11 +66,12 @@ pfil_run_hooks(struct pfil_head *ph, struct mbuf **mp, struct ifnet *ifp, int dir, struct inpcb *inp) { + struct rm_priotracker rmpt; struct packet_filter_hook *pfh; struct mbuf *m =3D *mp; int rv =3D 0; =20 =2D PFIL_RLOCK(ph); + PFIL_RLOCK(ph, &rmpt); KASSERT(ph->ph_nhooks >=3D 0, ("Pfil hook count dropped < 0")); for (pfh =3D pfil_hook_get(dir, ph); pfh !=3D NULL; pfh =3D TAILQ_NEXT(pfh, pfil_link)) { @@ -80,7 +81,7 @@ break; } } =2D PFIL_RUNLOCK(ph); + PFIL_RUNLOCK(ph, &rmpt); =09 *mp =3D m; return (rv); @@ -104,7 +105,7 @@ } PFIL_LIST_UNLOCK(); =20 =2D rw_init(&ph->ph_mtx, "PFil hook read/write mutex"); + PFIL_LOCK_INIT(ph); PFIL_WLOCK(ph); ph->ph_nhooks =3D 0; =20 @@ -143,7 +144,7 @@ free(pfh, M_IFADDR); TAILQ_FOREACH_SAFE(pfh, &ph->ph_out, pfil_link, pfnext) free(pfh, M_IFADDR); =2D rw_destroy(&ph->ph_mtx); + PFIL_LOCK_DESTROY(ph); =09 return (0); } Index: pfil.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D RCS file: /home/ncvs/src/sys/net/pfil.h,v retrieving revision 1.16 diff -u -r1.16 pfil.h =2D-- pfil.h 8 Jun 2007 12:43:25 -0000 1.16 +++ pfil.h 23 Nov 2007 10:07:52 -0000 @@ -37,7 +37,7 @@ #include <sys/_lock.h> #include <sys/_mutex.h> #include <sys/lock.h> =2D#include <sys/rwlock.h> +#include <sys/rmlock.h> =20 struct mbuf; struct ifnet; @@ -69,7 +69,7 @@ pfil_list_t ph_out; int ph_type; int ph_nhooks; =2D struct rwlock ph_mtx; + struct rmlock ph_lock; union { u_long phu_val; void *phu_ptr; @@ -93,10 +93,13 @@ struct pfil_head *pfil_head_get(int, u_long); =20 #define PFIL_HOOKED(p) ((p)->ph_nhooks > 0) =2D#define PFIL_RLOCK(p) rw_rlock(&(p)->ph_mtx) =2D#define PFIL_WLOCK(p) rw_wlock(&(p)->ph_mtx) =2D#define PFIL_RUNLOCK(p) rw_runlock(&(p)->ph_mtx) =2D#define PFIL_WUNLOCK(p) rw_wunlock(&(p)->ph_mtx) +#define PFIL_LOCK_INIT(p) \ + rm_init(&(p)->ph_lock, "PFil hook read/write mutex", LO_RECURSABLE) +#define PFIL_LOCK_DESTROY(p) rm_destroy(&(p)->ph_lock) +#define PFIL_RLOCK(p, t) rm_rlock(&(p)->ph_lock, (t)) +#define PFIL_WLOCK(p) rm_wlock(&(p)->ph_lock) +#define PFIL_RUNLOCK(p, t) rm_runlock(&(p)->ph_lock, (t)) +#define PFIL_WUNLOCK(p) rm_wunlock(&(p)->ph_lock) #define PFIL_LIST_LOCK() mtx_lock(&pfil_global_lock) #define PFIL_LIST_UNLOCK() mtx_unlock(&pfil_global_lock) =20 --Boundary-01=_klrRHoFB8DwWuBG-- --nextPart1382413.3MpHy4ZRLo Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQBHRrlpXyyEoT62BG0RAqC0AKCBuMmC7YmsW331yCGrGzdoucy+QQCfZHi6 pVviPYRmNUns7i6x0jUpMzU= =9OYH -----END PGP SIGNATURE----- --nextPart1382413.3MpHy4ZRLo--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200711231228.41382.max>