Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 23 Nov 2007 12:28:33 +0100
From:      Max Laier <max@love2party.net>
To:        freebsd-arch@freebsd.org
Cc:        Stephan Uphoff <ups@freebsd.org>, Attilio Rao <attilio@freebsd.org>, Alfred Perlstein <alfred@freebsd.org>
Subject:   Re: rwlocks, correctness over speed.
Message-ID:  <200711231228.41382.max@love2party.net>
In-Reply-To: <20071123092415.GP44563@elvis.mu.org>
References:  <20071121222319.GX44563@elvis.mu.org> <47469328.8020404@freebsd.org> <20071123092415.GP44563@elvis.mu.org>

next in thread | previous in thread | raw e-mail | index | archive | help
--nextPart1382413.3MpHy4ZRLo
Content-Type: multipart/mixed;
  boundary="Boundary-01=_klrRHoFB8DwWuBG"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

--Boundary-01=_klrRHoFB8DwWuBG
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

On Friday 23 November 2007, Alfred Perlstein wrote:
> * Stephan Uphoff <ups@freebsd.org> [071123 00:46] wrote:
> > Alfred Perlstein wrote:
> > >* Max Laier <max@love2party.net> [071122 14:40] wrote:
> > >>On Thursday 22 November 2007, Attilio Rao wrote:
> > >>>2007/11/22, Max Laier <max@love2party.net>:
> > >>>>rwlocks are already used in places that do recursive reads.  The
> > >>>> one place I'm certain about is pfil(9) and we need a proper
> > >>>> sollution for this. Before rwlocks were used, I had a handrolled
> > >>>> locking that supported both read/write semantics and starvation
> > >>>> avoidance - at the cost of failing to allow futher read access
> > >>>> when a writer asked for access.  This however, was quite
> > >>>> application specific and not the most efficient implementation
> > >>>> either.
> > >>>
> > >>>I'm not a pfil(9) expert, but for what I've heard, rmlocks should
> > >>> be really good for it, shouldn't them?
> > >>>
> > >>>The concept is that if we want to maintain fast paths for rwlock
> > >>> we cannot do too much tricks there. And you can really deadlock
> > >>> if you allow recursion on readers...
> > >>
> > >>How about adding rwlock_try_rlock() which would do the following:
> > >> 1) Only variant to allow[1] read recursion and ...
> > >> 2) ... only if no outstanding write requests
> > >> 3) Let the caller deal with failure
> > >>
> > >>This can be implemented statically, so no overhead in the fast
> > >> path.  The caller is in the best position to decide if it is
> > >> recursing or not - could keep that info on the stack - and can
> > >> either fail[2] or do a normal rwlock_rlock() which would wait for
> > >> the writer to enter and exit.
> > >>
> > >>[2] In most situation where you use read locks you can fail or roll
> > >> back carefully as you didn't change anything - obviously.  In pfil
> > >> - for instance - we just dropped the packet if there was a writer
> > >> waiting.
> > >>
> > >>[1] "allow" in terms of WITNESS - if that can be done.
> > >
> > >The problem is that there is no tracking in the common case (without
> > >additional overhead), so you can't know if you're recursing, unless
> > >you track it yourself.
> > >
> > >-Alfred
> >
> > I talked with Attilio about that on  IRC.
> > Most common cases of writer starvation (but not all) could be solved
> > by keeping a per thread count of shared acquired rwlocks.
> > If a rwlock is currently locked as shared/read AND a thread is
> > blocked on it to lock it exclusively/write - then new shared/read
> > locks will only be granted to thread that already has a shared lock.
> > (per thread shared counter is non zero)
> >
> > To be honest I am a bit twitchy about a lock without priority
> > propagation - especially since in FreeBSD threads run with user
> > priority in kernel
> > space and can get preempted.
> >
> > Stephan
>
> That's an interesting hack, I guess it could be done.
>
> I would still like to disallow recursion.
>
> Can we come to a concensus on that?

As long as we properly take care of all the fallout before doing so, I'm=20
all for it.

Attached is a diff to switch pfil over to rmlocks which works for me. =20
I'll post it to -net and do the switch in a few days unless I hear=20
otherwise.

=2D-=20
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News

--Boundary-01=_klrRHoFB8DwWuBG
Content-Type: text/x-diff;
  charset="iso-8859-1";
  name="pfil.rmlock.diff"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="pfil.rmlock.diff"

Index: pfil.c
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /home/ncvs/src/sys/net/pfil.c,v
retrieving revision 1.14
diff -u -r1.14 pfil.c
=2D-- pfil.c	2 Feb 2006 03:13:15 -0000	1.14
+++ pfil.c	23 Nov 2007 09:06:12 -0000
@@ -34,7 +34,7 @@
 #include <sys/errno.h>
 #include <sys/lock.h>
 #include <sys/malloc.h>
=2D#include <sys/rwlock.h>
+#include <sys/rmlock.h>
 #include <sys/socket.h>
 #include <sys/socketvar.h>
 #include <sys/systm.h>
@@ -66,11 +66,12 @@
 pfil_run_hooks(struct pfil_head *ph, struct mbuf **mp, struct ifnet *ifp,
     int dir, struct inpcb *inp)
 {
+	struct rm_priotracker rmpt;
 	struct packet_filter_hook *pfh;
 	struct mbuf *m =3D *mp;
 	int rv =3D 0;
=20
=2D	PFIL_RLOCK(ph);
+	PFIL_RLOCK(ph, &rmpt);
 	KASSERT(ph->ph_nhooks >=3D 0, ("Pfil hook count dropped < 0"));
 	for (pfh =3D pfil_hook_get(dir, ph); pfh !=3D NULL;
 	     pfh =3D TAILQ_NEXT(pfh, pfil_link)) {
@@ -80,7 +81,7 @@
 				break;
 		}
 	}
=2D	PFIL_RUNLOCK(ph);
+	PFIL_RUNLOCK(ph, &rmpt);
 =09
 	*mp =3D m;
 	return (rv);
@@ -104,7 +105,7 @@
 		}
 	PFIL_LIST_UNLOCK();
=20
=2D	rw_init(&ph->ph_mtx, "PFil hook read/write mutex");
+	PFIL_LOCK_INIT(ph);
 	PFIL_WLOCK(ph);
 	ph->ph_nhooks =3D 0;
=20
@@ -143,7 +144,7 @@
 		free(pfh, M_IFADDR);
 	TAILQ_FOREACH_SAFE(pfh, &ph->ph_out, pfil_link, pfnext)
 		free(pfh, M_IFADDR);
=2D	rw_destroy(&ph->ph_mtx);
+	PFIL_LOCK_DESTROY(ph);
 =09
 	return (0);
 }
Index: pfil.h
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
RCS file: /home/ncvs/src/sys/net/pfil.h,v
retrieving revision 1.16
diff -u -r1.16 pfil.h
=2D-- pfil.h	8 Jun 2007 12:43:25 -0000	1.16
+++ pfil.h	23 Nov 2007 10:07:52 -0000
@@ -37,7 +37,7 @@
 #include <sys/_lock.h>
 #include <sys/_mutex.h>
 #include <sys/lock.h>
=2D#include <sys/rwlock.h>
+#include <sys/rmlock.h>
=20
 struct mbuf;
 struct ifnet;
@@ -69,7 +69,7 @@
 	pfil_list_t	ph_out;
 	int		ph_type;
 	int		ph_nhooks;
=2D	struct rwlock	ph_mtx;
+	struct rmlock	ph_lock;
 	union {
 		u_long		phu_val;
 		void		*phu_ptr;
@@ -93,10 +93,13 @@
 struct pfil_head *pfil_head_get(int, u_long);
=20
 #define	PFIL_HOOKED(p) ((p)->ph_nhooks > 0)
=2D#define PFIL_RLOCK(p) rw_rlock(&(p)->ph_mtx)
=2D#define PFIL_WLOCK(p) rw_wlock(&(p)->ph_mtx)
=2D#define PFIL_RUNLOCK(p) rw_runlock(&(p)->ph_mtx)
=2D#define PFIL_WUNLOCK(p) rw_wunlock(&(p)->ph_mtx)
+#define	PFIL_LOCK_INIT(p) \
+    rm_init(&(p)->ph_lock, "PFil hook read/write mutex", LO_RECURSABLE)
+#define	PFIL_LOCK_DESTROY(p) rm_destroy(&(p)->ph_lock)
+#define PFIL_RLOCK(p, t) rm_rlock(&(p)->ph_lock, (t))
+#define PFIL_WLOCK(p) rm_wlock(&(p)->ph_lock)
+#define PFIL_RUNLOCK(p, t) rm_runlock(&(p)->ph_lock, (t))
+#define PFIL_WUNLOCK(p) rm_wunlock(&(p)->ph_lock)
 #define PFIL_LIST_LOCK() mtx_lock(&pfil_global_lock)
 #define PFIL_LIST_UNLOCK() mtx_unlock(&pfil_global_lock)
=20

--Boundary-01=_klrRHoFB8DwWuBG--

--nextPart1382413.3MpHy4ZRLo
Content-Type: application/pgp-signature; name=signature.asc 
Content-Description: This is a digitally signed message part.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQBHRrlpXyyEoT62BG0RAqC0AKCBuMmC7YmsW331yCGrGzdoucy+QQCfZHi6
pVviPYRmNUns7i6x0jUpMzU=
=9OYH
-----END PGP SIGNATURE-----

--nextPart1382413.3MpHy4ZRLo--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200711231228.41382.max>