From owner-freebsd-current@FreeBSD.ORG  Tue Jun  3 13:06:15 2003
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 3100237B401; Tue,  3 Jun 2003 13:06:15 -0700 (PDT)
Received: from mx0.freebsd-services.com (survey.codeburst.net
	[195.149.39.161])	by mx1.FreeBSD.org (Postfix) with ESMTP
	id C3E1D43F3F; Tue,  3 Jun 2003 13:06:11 -0700 (PDT)
	(envelope-from paul@freebsd-services.com)
Received: from [192.168.7.2] (freebsd.gotadsl.co.uk [81.6.249.198])
	by mx0.freebsd-services.com (Postfix) with ESMTP
	id D736A1B212; Tue,  3 Jun 2003 21:06:09 +0100 (BST)
From: Paul Richards <paul@freebsd-services.com>
To: "M. Warner Losh" <imp@bsdimp.com>
In-Reply-To: <20030603.111956.46316212.imp@bsdimp.com>
References: <20030603134717.GD35187@survey.codeburst.net>
	 <20030603.085659.13314075.imp@bsdimp.com>
	 <20030603155157.GH35187@survey.codeburst.net>
	 <20030603.111956.46316212.imp@bsdimp.com>
Content-Type: text/plain
Organization: 
Message-Id: <1054584260.85972.26.camel@cf.freebsd-services.com>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.2.4 
Date: 02 Jun 2003 21:04:22 +0100
Content-Transfer-Encoding: 7bit
cc: hmp@freebsd.org
cc: current@freebsd.org
cc: des@freebsd.org
Subject: Re: VFS: C99 sparse format for struct vfsops
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 03 Jun 2003 20:06:15 -0000

On Tue, 2003-06-03 at 18:19, M. Warner Losh wrote:
> In message: <20030603155157.GH35187@survey.codeburst.net>
>             Paul Richards <paul@freebsd-services.com> writes:
> : I'm not sure that kobj actually needs to be MP safe if the kobj
> : struct is always embedded in a structure at a higher level i.e.
> : a use of kobj in say the device driver code will not interact with
> : it's use by say the linker code so the locking can be done on device
> : driver or linker level structs.
> 
> Lock lock data, not code.

Agreed, that's what I said :-) However, I was wondering if locking the
structure that the kobj was embedded in was enough to protect the kobj
itself. As you point out below, that's not the case.

> The kobj dispatch tables are shared between all instances of the
> object.  So if you have two instances of the driver, the driver locks
> protect softc for that driver.  However, both softc's reference the
> kobj structures, directly or indirectly, that are not MP safe.  If
> instance 1 takes out a lock, that doesn't prevent instance 2 from
> getting screwed by the following code:
> 
> #define KOBJOPLOOKUP(OPS,OP) do {					\
> 	kobjop_desc_t _desc = &OP##_##desc;				\
> 	kobj_method_t *_ce =						\
> 	    &OPS->cache[_desc->id & (KOBJ_CACHE_SIZE-1)];		\
> 	if (_ce->desc != _desc) {					\
> 		KOBJOPMISS;						\
> [1]		kobj_lookup_method(OPS->cls->methods, _ce, _desc);	\
> 	} else {							\
> 		KOBJOPHIT;						\
> 	}								\
> [2]	_m = _ce->func;							\
> } while(0)
> 
> Notice how OPS has a cache array.  If instance 1 is calling the same
> ID as instance 2, modulo KOBJ_CACHE_SIZE, then a collision occurs and
> a race happens betwen [1] and [2]:
> 
> thread 1		  thread 2
> 
> _ce = ...
> 			_ce = ...
> HIT
> 			MISS
> 			set's *_ce via kobj_lookup_method
> _m gets set
> _m called
> 			_m set
> 			_m called
> 
> 
> Notice how thread 1's _m gets set based on the results of the kobj
> lookup, and we have a race, even if thread1 and thread2 took out their
> driver instance locks.

That means we have to lock the dispatch table before every method is
looked up and hold it until the method returns (the alternative would be
to free it inside the method once it had been called but that'd be a
right mess).

Actually, in practice it doesn't matter, though that's an accident of
the kobjs we currently create. The reason being that the hashing
algorithm is really simple and just lops off some high bits of the
method id so unless we have more than 255 methods they always have an
unique slot. There's no mechanism to change the actual function
associated with the method at the moment. Even if there was, it might
not matter as long as the number of methods was still below 255, because
what you'd actually be doing is, say, changing which attach function to
call when you were indirecting through the attach method entry and by
the time thread 1 gets to actually call the method it might even be
considered correct for it to call the newly mapped one.

However, I take your point and we either have to implement locking of
the method table or redesign the caching mechanism to guarantee that
there's no race. I don't think anything other than _ce is a problem (is
there anything else?).

If we ensured that the cache entry couldn't have collisions (perhaps by
not having a hash and instead indexing a unique position in an array
using the method id) then only the latter race would exist (changing the
function associated with a method) and that might be acceptable
behaviour.

The tradeoff with using an index into an array is that there'd be a
heavy penalty for growing the array if an extra method didn't fit, but
that would be exceptionally rare and with our present usage we'd never
have that happen.

Incidentally, if we implement a lock on the method table we'd
effectively have code locks :-)

-- 
Tis a wise thing to know what is wanted, wiser still to know when
it has been achieved and wisest of all to know when it is unachievable
for then striving is folly. [Magician]