Date: Fri, 12 Apr 2002 16:52:43 -0700
From: Terry Lambert
To: Joe Halpin
Cc: dan@langille.org, chat@freebsd.org
Subject: Re: setting up daily builds

Joe Halpin wrote:
> How do you go about identifying the guilty parties? For example, if a
> subsystem that other code depends on breaks, that would probably cause
> failures in dependent subsystems. Would the owners of the dependent
> subsystems get email as well in that case?

Because you have two source trees -- "the last one that worked" and
"the current one" -- and they are both tagged, you can cvs diff the
tags in order to get the differences.

Normally, what you do is copy the working tree to a new location and
cvs update it, logging the updated files.  The differences in the
revisions are enough to identify who made the modifications (1.173 was
alfred, 1.174 was phk, etc.).  You back out the modifications in
reverse chronological order, until the code works again.

If you want to get more sophisticated, and you have a fully populated
build tree with .depend files, you can segment the changes to identify
the change that caused the problem.

> Also, if a subsystem fails because of an error in a header file exported
> by some other subsystem (which didn't fail to build), will the right
> developer get the email?

Yes.  Because the header file will show up in the list of deltas.  The
reporting is based on the deltas and *any* build failure, not on the
failing file itself.  The fundamental assumption is that you start with
a tree with zero build failures, so you can always back things out to
that state, no matter what.

By basing it on deltas, blame for the problem goes to the authors of
the deltas (e.g. the people who changed the interfaces on everyone
else), rather than to the consumers of the interface.  Or, to put it
another way: "code does not rot: it takes an intentional modification
to break working code".

Normally, there is a human being who has organizational responsibility
for backing out/fixing the minimal change set.  This works well in an
environment where you have a couple of lead developers who can really
spew code, but aren't very careful about it.  You are better off
curbing your lead developers, rather than ingraining their bad work
habits by ensuring that there is someone to clean up after them.
Living in your own mess does wonders for bad work habits.  In the long
run, it's better to have a team player than a star, unless what you are
doing is work no one else can do (you can't throw away your critical
path resources just because they are inconvenient).
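To make the tag-diff and back-out steps above concrete, they come down
to something like the following (the tag names, file, and revision
numbers here are strictly for illustration):

    # Diff the last-known-good tag against the current tag to see every
    # delta that went in between the two builds.
    cvs -q rdiff -u -r LAST_GOOD -r CURRENT_BUILD src > deltas.diff

    # Or: copy the last working tree, update it, and log which files
    # changed; "cvs log" on a changed file shows the revisions and
    # committers that came in after the good build.
    cp -Rp build.good build.new
    cd build.new && cvs -q update -d -P 2>&1 | tee ../update.log
    cvs log sys/kern/vfs_subr.c

    # Back out the newest suspect revision first (reverse chronological
    # order), rebuild, and repeat until the tree builds again.
    cvs update -j 1.174 -j 1.173 sys/kern/vfs_subr.c
    make buildworld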
> I'm very interested in how you deal with things like this.

It's really a tools problem.  There aren't really a lot of tools which
enforce good work habits on people, and those that do are considered
onerous by developers.  So unless it's a real requirement for
participation in the process (i.e. the developer gets something out of
it), the tools chosen will be the ones that "bother" the developer
least with "inconveniences", such as "not breaking the other
developers' ability to use the code".

Personally, with any significant number of developers, I like to use
GID protection on the CVS tree, and *not* put the developers in the
group.  Then I add sgid wrappers, which add the verbs "lock", "unlock",
and "force".  You can't check in without a lock, and you have to build
successfully before you release the lock.  Releasing the lock results
in a session log message being input by the lock releaser, if any
commits occurred.

CVSup works in this environment because it operates on a "snapshot".
This works by locking, copying the repository, and unlocking.  The copy
is guaranteed to be a consistent and buildable snapshot of the
repository.  You can achieve the same effect with moving tags, if you
lack space, or if you want to minimize the lockdown time (for a locally
controlled project, 3AM is pretty idle; for a distributed project, you
pretty much have to eat the lock).  When people CVSup, they do it
against the snapshot of the repository.  You can wire synchronization
into the unlock and the CVSup daemon, but I never bothered with the
Modula-3 code that would have taken.

This doesn't prevent people from sticking in bogus code and unlocking
without testing it, but you can lay the blame squarely at the correct
feet, in all cases, since they can't claim the failure was the result
of a simultaneous update with another developer.  You can also back out
the changes, in all cases, until they are made functional.

Basically, unless your repository is buildable nightly, you can't
guarantee completion of nightly snapshots.  If you can't guarantee
nightly snapshots, then you can't do consistent and automated
regression testing.  If you can't do consistent and automated
regression testing, then you can't measure project progress, especially
for maintenance cycles on already released products, but also for "next
revision" products.

If you want to look at it that way, you could consider it to be the
first steps towards a requirements tracking process:

1) Customer requirements
2) Use cases
3) Specification
4) Deviations from fulfillment of requirements
   i)   Failure to implement
   ii)  Defect reports from customers
5) Regression testing
   i)   Implementation verification
   ii)  Defect resolution verification
   iii) Specification compliance verification
6) Ability to consistently and reproducibly build something to be tested
7) Self consistency of the source tree

It all flows downhill from the goal of meeting the customer
requirements.  If you want to gear your development processes to the
model even further (e.g. by writing test cases for unit testing to
ensure specification compliance before writing a line of code for the
product itself, etc.), then you can carry this through to ensure that
there isn't a line of code written without that line of code needing to
be written to fulfill a customer requirement of some kind.
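Strung together, the nightly lock/snapshot/build loop is only about a
page of shell.  A sketch of what I mean follows; every path, tag, and
helper name in it is invented for illustration, and "cvslock" and
"cvsunlock" stand in for the sgid lock/unlock wrappers described above
(they are not standard tools):

    #!/bin/sh
    # Hypothetical nightly snapshot-and-build driver.

    REPO=/home/ncvs                 # master repository
    SNAP=/snapshot/ncvs             # copy that cvsupd serves
    LOG=/var/log/nightly.$(date +%Y%m%d)

    cvslock "$REPO"                          # quiesce commits
    rm -rf "$SNAP" && cp -Rp "$REPO" "$SNAP" # consistent, buildable copy
    cvsunlock "$REPO"                        # let commits resume

    # Check out from the snapshot and try to build it.
    cd /build || exit 1
    cvs -d "$SNAP" -q checkout src > "$LOG" 2>&1
    if ! (cd src && make buildworld) >> "$LOG" 2>&1; then
        # The build broke: mail the log to the authors of the deltas
        # since the last good tag (finding them is the tag-diff step
        # from earlier; "build-breakers" is a placeholder alias).
        mail -s "nightly build failed" build-breakers < "$LOG"
        exit 1
    fi

    # The build worked: move the "last known good" tag forward.
    cvs -d "$REPO" -q rtag -F LAST_GOOD src

The script itself isn't the point; the point is that once something
like it runs unattended every night, the "self consistency of the
source tree" item at the bottom of the list stops depending on
developer goodwill.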
You really want to build the knowledge and the quality of the product
into the process, rather than into key people who might get hit by a
bus, or go to work for your competitor for a salary hike you are
unwilling/unable to match.

It's like building a product to reduce technical support costs:

1) Ensure that each error message is unique
2) Ensure that each message can only result from a single condition,
   so that heuristics are not required to differentiate root causes
   (as much as possible)
3) Include obvious keywords/keyphrases in the error reports --
   preferably set off in the message by bold or brackets, etc., so
   that they get reported to the representative
4) Build an associative database of keyword/keyphrase and solution
   pairs (e.g. dbVista from Raima Corp.), and display the matches in
   match count/frequency-of-correctness order
5) Hire minimum wage monkeys for bottom tier support: you have built
   your knowledge into the system, rather than building it into key
   people
6) Feed #4 back into the maintenance engineering process (if a problem
   can be identified that way, then it can be proactively fixed,
   and/or the message can simply tell the user what the administrator
   needs to do, if fixing it requires a privileged operation)
7) Distribute your database to OEMs/VARs/VADs/etc., so they can look
   smart to their customers
8) Translate your database into other languages when you translate
   your product into other languages, so front line support in Japan
   can be done in Japan

...though most companies stupidly consider technical support to be a
"profit center" these days, and so don't spend a lot of time trying to
eliminate it (I say "stupidly" because this is how WordPerfect lost out
to Microsoft Word).

When you think about it, the rules you should follow in any given
situation are pretty obvious.

-- Terry
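P.S.: The lookup in #4 doesn't have to start out as anything fancier
than a flat file and a little shell.  The sketch below (the file format
and every name in it are invented, purely to show the idea) ranks
canned solutions by how many of their keywords appear in the pasted-in
error report, which is the match-count ordering I mean:

    #!/bin/sh
    # Hypothetical keyword/solution lookup.  solutions.txt holds one
    # entry per line: comma-separated keywords, a tab, then the
    # solution text.  Usage: ./lookup.sh report.txt
    REPORT="$1"
    TAB="$(printf '\t')"

    while IFS="$TAB" read -r keywords solution; do
        count=0
        for kw in $(echo "$keywords" | tr ',' ' '); do
            grep -qi -- "$kw" "$REPORT" && count=$((count + 1))
        done
        [ "$count" -gt 0 ] && printf '%d\t%s\n' "$count" "$solution"
    done < solutions.txt | sort -rn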