From owner-freebsd-hackers  Fri Jun  7 22:31:18 1996
Return-Path: owner-hackers
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.5/8.7.3) id WAA10147
          for hackers-outgoing; Fri, 7 Jun 1996 22:31:18 -0700 (PDT)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
          by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id WAA10105;
          Fri, 7 Jun 1996 22:31:06 -0700 (PDT)
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id WAA05423; Fri, 7 Jun 1996 22:25:01 -0700
From: Terry Lambert <terry@lambert.org>
Message-Id: <199606080525.WAA05423@phaeton.artisoft.com>
Subject: Re: The -stable problem: my view
To: nate@sri.MT.net (Nate Williams)
Date: Fri, 7 Jun 1996 22:25:01 -0700 (MST)
Cc: michaelh@cet.co.jp, nate@sri.MT.net, terry@lambert.org,
        hackers@FreeBSD.org, freebsd-stable@FreeBSD.org,
        FreeBSD-current@FreeBSD.org
In-Reply-To: <199606080407.WAA02519@rocky.sri.MT.net> from "Nate Williams" at Jun 7, 96 10:07:40 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hackers@FreeBSD.org
X-Loop: FreeBSD.org
Precedence: bulk

> > > > Terry proposes a set of tools to help enforce the policy of always having
> >                                     ^^^^^^
> > I said help not guarantee.  The tools would help resolve reads while
> > commits are being done.  Multiple reader/single writer locks are a cheap
> > effective way to do this.
> 
> They wouldn't enforce or even help the policy.  Multiple reader/single
> writer locks don't solve any significant problem we've faced.  Why do
> something that limits the ability of developers to commit changes when
> the problem it fixes happens .001% of the time?
> 
> It's like making a loop that gets called once at initialization time 50%
> faster while you leave the sorting algorithm which takes up 95% of CPU
> time alone.  It doesn't buy you anything but a warm fuzzy feeling.

This is *not* an issue of "optimizing the boot code".

This *is* an issue of removing the potential for developer checkin
conflict, so that the only margin for error is that of the developer
who disobeys protocol.

It also *clearly* identifies the violator, and avoids needless rounds
of finger-pointing, investigation, damage-control, and repair.

Whoever gets the write lock before you, it's *your* responsibility to
make sure *your* changes don't conflict with *his/her* changes when
you get the lock.

I also believe you are neglecting the fact that the CVS repository
is broken up into multiple collections, and that the lock is not
global to the system, it's global to the collection.  This is far
less likely to cause "inter-developer conflicts for write lock
acquisition" than if it were all in one collection.
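
As a minimal sketch of the per-collection, multiple reader/single
writer idea described above: the class names, the collection names
("src-sys", "src-bin") and the user names are all hypothetical, and
this is only in-memory bookkeeping, not how CVS itself locks anything.

```python
class CollectionLock:
    """Multiple reader / single writer bookkeeping for one collection."""

    def __init__(self):
        self.readers = set()   # users currently reading
        self.writer = None     # user currently writing, if any

    def acquire_read(self, user):
        # Readers are admitted unless a writer holds the collection.
        if self.writer is not None:
            return False
        self.readers.add(user)
        return True

    def acquire_write(self, user):
        # A writer needs the collection to itself.
        if self.writer is not None or self.readers:
            return False
        self.writer = user
        return True

    def release(self, user):
        self.readers.discard(user)
        if self.writer == user:
            self.writer = None


class Repository:
    """One independent lock per collection, not one global lock."""

    def __init__(self, collections):
        self.locks = {name: CollectionLock() for name in collections}

    def lock_for(self, collection):
        return self.locks[collection]
```

With this shape, a writer in one collection blocks readers and
writers of that collection only; a second developer committing to a
different collection is unaffected.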

The majority use of the tree is going to be of type "reader", not
of type "writer".  The programs that deal with CVS tree mirroring
for the SUP and CTM servers, and local checkouts, will be the
majority usage.

You don't need the writer lock unless you are writing, and another
developer doesn't need the writer lock unless they are committing
code in the same area (in which case, it's a damn good thing you
are not both going at it at once).

The net result is that the claim of "merge cascade failure" is no
longer a valid excuse for an unbuildable tree.  If Jim-Bob makes
the tree unbuildable, it's obvious that Jim-Bob is a protocol
violator.  If he does this a lot, then there should probably be
a policy enforcement decision by "the grantors of tree access"
to prevent future offenses.

The intended effect is a buildable tree and identifiable culprits
in the case of a non-buildable tree.

If Jim-Bob and John-Boy make changes in the same area simultaneously,
and the tree does not build, there is currently no way to assign
blame.  Because of this, people play "fast and loose" with the tree,
hoping that it will be too difficult to track the transgressor.

If Jim-Bob has to assert that John-Boy can't write the tree for
him to be able to, he will think twice before writing to the
tree.  Hopefully, part of this "think" will include building his
checked-out portion of the tree before checking it in, which is
what the policy says he should do anyway.


What you seem to be claiming as "limiting the ability of the
developer to commit changes", is really "limiting the ability
of the developer to commit changes in violation of protocol".


To test your "conflict inconvenience" theory, I suggest you
implement reader/writer locks with no teeth, which write "CONFLICT
WITH LOCK BY USER XXX", with a time stamp, to a log.  Also log
the "in" and "out" times.

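A lock "with no teeth" of this kind might look roughly like the
sketch below.  It never refuses anyone; it only records who was
already in when someone else arrived.  The class name, the tuple
layout of the log entries and the user names are assumptions made
for illustration, not an existing tool.

```python
import time


class ToothlessLock:
    """Advisory reader/writer lock: logs conflicts, blocks nobody."""

    def __init__(self, clock=time.time):
        self.holders = {}   # user -> "read" or "write"
        self.log = []       # (stamp, event, user, mode, other, other_mode)
        self.clock = clock

    def enter(self, user, mode):
        now = self.clock()
        for other, other_mode in self.holders.items():
            # Two readers never conflict; anything involving a writer does.
            if other != user and "write" in (mode, other_mode):
                self.log.append((now, "conflict", user, mode, other, other_mode))
        self.holders[user] = mode
        self.log.append((now, "in", user, mode, None, None))

    def leave(self, user):
        mode = self.holders.pop(user)
        self.log.append((self.clock(), "out", user, mode, None, None))


def format_entry(entry):
    """Render one log entry as a text line."""
    stamp, event, user, mode, other, other_mode = entry
    if event == "conflict":
        return "%s %s (%s): CONFLICT WITH LOCK BY USER %s (%s)" % (
            stamp, user, mode, other, other_mode)
    return "%s %s (%s): %s" % (stamp, user, mode, event)
```

Because nothing is ever denied, running this costs developers
nothing, and the log shows what would have happened had the locks
been enforced.
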
Then we can examine the conflicts that arise in real usage, and
determine:

1)	How often the conflict is a writer wanting to write when
	a reader was actively reading (meaning the writer was
	allowed by the lack of teeth, and the reader's data has
	been potentially corrupted into unbuildability).

2)	How often the reader whose data was potentially trashed
	was SUP or CTM (meaning we greatly multiplied the problem
	in #1).

3)	How often a reader came in while a write was active (meaning
	the reader has made a snapshot of a potentially inconsistent
	tree that was avoidably corrupt by nature of allowing readers
	while there are writers active).

4)	How often the reader whose data was potentially trashed
	was SUP or CTM (meaning we greatly multiplied the problem
	in #3).

5)	How often one writer came in while another writer was in,
	and how many of those writes affected header files that
	affected the other's work, or potentially conflicted in
	actual source files, by file.

6)	Using "in" and "out", how often and what kind of delays
	occurred as a result of the locks.

7)	Count of total delay (delay for readers is negative, while
	delay for writers is positive, because of the nature of
	writer corruption of reader data).
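
The counts in items 1 through 5 could be tallied from such a log
along these lines.  This is a sketch under stated assumptions: each
conflict record is taken to be (arriving user, arriving mode, holding
user, holding mode), and "sup" and "ctm" are hypothetical account
names standing in for the SUP and CTM mirroring processes.  The
delay figures in items 6 and 7 are not covered here.

```python
from collections import Counter

MIRRORS = {"sup", "ctm"}  # hypothetical account names for SUP and CTM


def classify(arriver, arriver_mode, holder, holder_mode):
    """Return the list of categories one conflict record falls into."""
    if arriver_mode == "write" and holder_mode == "write":
        return ["writer_over_writer"]          # item 5
    if arriver_mode == "write" and holder_mode == "read":
        kind, reader = "writer_over_reader", holder    # item 1
    elif arriver_mode == "read" and holder_mode == "write":
        kind, reader = "reader_during_write", arriver  # item 3
    else:
        return []                              # reader/reader: no conflict
    kinds = [kind]
    if reader in MIRRORS:
        kinds.append(kind + "_mirror")         # items 2 and 4
    return kinds


def tally(conflicts):
    """Count how often each category occurs across all records."""
    counts = Counter()
    for record in conflicts:
        counts.update(classify(*record))
    return counts
```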


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.