From owner-freebsd-current  Fri Feb 22 22:35:42 2002
Date: Fri, 22 Feb 2002 22:35:37 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <200202230635.g1N6ZbZ34600@apollo.backplane.com>
To: John Baldwin <jhb@FreeBSD.ORG>
Cc: current@FreeBSD.ORG
Subject: Re: RE: First (easy) td_ucred patch
References:  <XFMail.020223010839.jhb@FreeBSD.org>

:>     as found in getgroups().  Some of these changes, for example return()ing
:>     in the middle of a procedure, are highly dependent on the removal of
:>     Giant.  goto's are questionable but replacing them with return()s in
:>     the middle of a procedure isn't too hot an idea either.
:
:It's how the code is going to look in its final rendition.  It also restores
:the code closer to its 4.x flow, which makes a diff showing what SMPng
:actually changed easier to read.

    This makes no sense whatsoever.  We aren't trying to make the code look
    like 4.x.  We are trying to make it MP safe.
    
:- kern.giant.proc as you would have it now is far too broad.  Most of the proc
:  locking currently in the tree and in my work tree is not safe yet.  This is
:  because certain fields are only locked in certain places.  For a field to be
:  safe outside of Giant, it needs to be locked everywhere.  You include both
:  code that contains partially locked fields and fully locked fields under the
:  same sysctl.  This means I can't actually turn the sysctl on to do the
:  testing safely, so I might as well just leave Giant in there rather than
:  bother with a useless sysctl.  One solution might be to split the sysctl up.
:  Well then, how many are we going to have, one sysctl for each field in proc?
:  This won't scale in my opinion.  It may be useful for covering fields that
:  aren't fully locked, but for stuff that is done, I don't think you need it.
:- How many different locking systems are syscalls like read() going to call into?
:  Are we going to eventually need to check 8, 10, or 16 sysctl's?  Trying to
:  keep all that straight will be a major pain.  This is another reason I don't
:  think they scale well.
:- We eventually have to go and remove all this stuff anyways.
:- As another note specific to td_ucred: there is no other lock that you are
:  "covering up" for.  It is a private per-thread pointer to a read-only
:  structure.  I can see needing to turn Giant back on around a lock done
:  wrong, but there is no lock in this instance.
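
    The td_ucred point above can be illustrated with a small user-space
    sketch.  It is not kernel code: the structures, the pthread mutex
    standing in for the proc lock, and every function name are
    assumptions made for illustration, and the idea that the private
    pointer is refreshed at syscall entry is also an assumption, not
    something stated in this thread.  It only shows why reading a
    private pointer to a read-only structure needs no lock, while the
    shared per-process pointer does.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Credential: apart from cr_ref, fields never change after creation. */
struct ucred {
	int cr_ref;	/* reference count; protected by p_lock in this sketch */
	int cr_uid;
};

struct proc {
	pthread_mutex_t p_lock;	/* stand-in for the proc lock (assumption) */
	struct ucred *p_ucred;	/* shared by all threads; needs p_lock */
};

struct thread {
	struct proc *td_proc;
	struct ucred *td_ucred;	/* private to this thread; read without a lock */
};

/* Drop one reference.  The caller must hold p_lock. */
void
cred_drop(struct ucred *cr)
{
	if (cr != NULL && --cr->cr_ref == 0)
		free(cr);
}

/* Publish a new credential for the process (hypothetical setuid-like path). */
void
proc_set_uid(struct proc *p, int uid)
{
	struct ucred *cr = malloc(sizeof(*cr));

	assert(cr != NULL);
	cr->cr_ref = 1;		/* the reference held by the process */
	cr->cr_uid = uid;
	pthread_mutex_lock(&p->p_lock);
	cred_drop(p->p_ucred);
	p->p_ucred = cr;
	pthread_mutex_unlock(&p->p_lock);
}

/* "Syscall entry": take a private reference to the current credential. */
void
thread_refresh_cred(struct thread *td)
{
	struct proc *p = td->td_proc;

	pthread_mutex_lock(&p->p_lock);
	if (td->td_ucred != p->p_ucred) {
		cred_drop(td->td_ucred);
		td->td_ucred = p->p_ucred;
		td->td_ucred->cr_ref++;
	}
	pthread_mutex_unlock(&p->p_lock);
}

/* "Syscall body": no lock taken; td_ucred is private and read-only. */
int
thread_uid(const struct thread *td)
{
	return (td->td_ucred->cr_uid);
}
```

    A thread that calls thread_refresh_cred() and then thread_uid()
    keeps seeing the same uid until its next refresh, even if another
    thread calls proc_set_uid() in between.  proc_set_uid() must have
    been called at least once before the first refresh.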

    I don't think you quite understand the purpose of instrumenting Giant.
    You are synthesizing problems where none exist, and you are making 
    assumptions that are simply not true.  You are exaggerating the issues
    to ridiculous extremes.

    You seem hell-bent on taking parts of PROC out from under Giant.  Well,
    where's the documentation?  In your head?  How the hell are other people
    supposed to be able to work on the system when the only person who knows
    what is safe and what is not is you?  One of the things Giant 
    instrumentation gives us is the ability to show people, very clearly in
    the code, what we believe to be safe, what is in beta, and what we
    believe not to be safe, simply by changing the giant globals in
    kern_mutex.c.  It
    allows other developers to see, very clearly, *EXACTLY* where we are
    in the Giant pushdown work.
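
    As a rough illustration of what instrumenting Giant means, here is
    a user-space sketch.  It is not the actual patch: the pthread mutex
    standing in for Giant, the kern_giant_proc variable (modeled on the
    kern.giant.proc sysctl mentioned above), and the giant_enter(),
    giant_exit() and sketch_getgroups() names are all hypothetical.
    The knob decides at run time whether a code path still runs under
    Giant, and the call sites mark in the code which paths are governed
    by which knob.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-in for the kernel's Giant lock (assumption). */
static pthread_mutex_t giant = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical knob modeled on kern.giant.proc.
 * Nonzero: keep Giant held around proc code.  Zero: run without it.
 */
int kern_giant_proc = 1;

/*
 * Acquire Giant only if the knob asks for it.  The decision is returned
 * so that the matching giant_exit() acts on the same value even if the
 * knob is changed while the caller is running.
 */
bool
giant_enter(const int *knob)
{
	bool held;

	assert(knob != NULL);
	held = (*knob != 0);
	if (held)
		pthread_mutex_lock(&giant);
	return (held);
}

/* Release Giant if, and only if, giant_enter() acquired it. */
void
giant_exit(bool held)
{
	if (held)
		pthread_mutex_unlock(&giant);
}

/*
 * Hypothetical syscall body.  Returns 1 if it ran under Giant and 0 if
 * it ran without it, so the effect of the knob is observable.
 */
int
sketch_getgroups(void)
{
	bool held = giant_enter(&kern_giant_proc);

	/* ... work on proc fields would go here ... */
	giant_exit(held);
	return (held ? 1 : 0);
}
```

    With kern_giant_proc set to 1 the routine runs under the stand-in
    Giant; setting it to 0 runs the same code without it, which is how
    a path could be switched between "believed safe" and "not yet safe"
    without editing each call site.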

    The way you are doing it nobody will know what the hell is going on 
    except you!

:We need people to test stuff as it comes out from under Giant so we can find

    The instrumented Giant does not in any way prevent this from occurring.
    It does not in any way prevent you from finding bugs.

:Maybe if you want SMPng to take 5 times as long...
:

    I think it's going to take 5 times as long if you make mass commits from
    P4 with no way to at least partially turn off the MP functionality you
    added and we wind up with dozens of impossible-to-find bugs a year down 
    the line.  I think it is going to take 5 times as long if there is no
    clear documentation or indication in the code showing developers what
    is safe, what is under test, and what is not safe.

    What I am doing is trying to prevent that from happening.  You are 
    treating each subsystem separately and assuming that bugs will be
    found on a per-subsystem basis, regardless of the complexity of the 
    interactions between subsystems.  Well, I know better.  When you have 
    a dozen subsystems interacting in an MP system and you have a race
    or MP-related bug, good fucking luck finding it!

    How long do you want it to take to get a stable 5.x release?  Because
    the way you are going it isn't going to happen until we hit 5.3 or so.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

:John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
:"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
