Date: Fri, 12 Apr 2002 16:52:43 -0700
From: Terry Lambert
To: Joe Halpin
Cc: dan@langille.org, chat@freebsd.org
Subject: Re: setting up daily builds

Joe Halpin wrote:
> How do you go about identifying the guilty parties? For example, if a
> subsystem that other code depends on breaks, that would probably cause
> failures in dependent subsystems. Would the owners of the dependent
> subsystems get email as well in that case?

Because you have two source trees -- "the last one that worked" and
"the current one" -- and they are both tagged, you can cvs diff the
tags in order to get the differences.

Normally, what you do is copy the working tree to a new location and
cvs update it, logging the updated files.  The differences in the
revisions are enough to identify who made the modifications (1.173 was
alfred, 1.174 was phk, etc.).  You back out the modifications in
reverse chronological order, until the code works again.

If you want to get more sophisticated, and you have a fully populated
build tree with .depend files, you can segment the changes to identify
the change that caused the problem.

> Also, if a subsystem fails because of an error in a header file exported
> by some other subsystem (which didn't fail to build), will the right
> developer get the email?

Yes.  Because the header file will show up in the list of deltas.  The
reporting is based on the deltas and *any* build failure, not on the
failing file itself.  The fundamental assumption is that you start with
a tree with zero build failures, so you can always back things out to
that state, no matter what.

By basing it on deltas, blame for the problem goes to the authors of
the deltas (e.g. the people who changed the interfaces on everyone
else), rather than to the consumers of the interface.  Or, to put it
another way: "code does not rot: it takes an intentional modification
to break working code".

Normally, there is a human being who has organizational responsibility
for backing out/fixing the minimal change set.  This works well in an
environment where you have a couple of lead developers who can really
spew code, but aren't very careful about it.  You are better off
curbing your lead developers, rather than ingraining their bad work
habits by ensuring that there is someone to clean up after them.
Living in your own mess does wonders for bad work habits.  In the long
run, it's better to have a team player than a star, unless what you are
doing is work no one else can do (you can't throw away your critical
path resources just because they are inconvenient).
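To make the tag-diff and back-out steps above concrete, they come down
to something like the following (the tag names, file, and revision
numbers here are strictly for illustration):

    # Diff the last-known-good tag against the current tag to see every
    # delta that went in between the two builds.
    cvs -q rdiff -u -r LAST_GOOD -r CURRENT_BUILD src > deltas.diff

    # Or: copy the last working tree, update it, and log which files
    # changed; "cvs log" on a changed file shows the revisions and
    # committers that came in after the good build.
    cp -Rp build.good build.new
    cd build.new && cvs -q update -d -P 2>&1 | tee ../update.log
    cvs log sys/kern/vfs_subr.c

    # Back out the newest suspect revision first (reverse chronological
    # order), rebuild, and repeat until the tree builds again.
    cvs update -j 1.174 -j 1.173 sys/kern/vfs_subr.c
    make buildworld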
> I'm very interested in how you deal with things like this.

It's really a tools problem.  There aren't really a lot of tools which
enforce good work habits on people, and those that do are considered
onerous by developers.  So unless it's a real requirement for
participation in the process (i.e. the developer gets something out of
it), the tools chosen will be the ones that "bother" the developer
least with "inconveniences", such as "not breaking the other
developers' ability to use the code".

Personally, with any significant number of developers, I like to use
GID protection on the CVS tree, and *not* put the developers in the
group.  Then I add sgid wrappers, which add the verbs "lock", "unlock",
and "force".  You can't check in without a lock, and you have to build
successfully before you release the lock.  Releasing the lock results
in a session log message being input by the lock releaser, if any
commits occurred.

CVSup works in this environment because it operates on a "snapshot".
This works by locking, copying the repository, and unlocking.  The copy
is guaranteed to be a consistent and buildable snapshot of the
repository.  You can achieve the same effect with moving tags, if you
lack space, or if you want to minimize the lockdown time (for a locally
controlled project, 3AM is pretty idle; for a distributed project, you
pretty much have to eat the lock).  When people CVSup, they do it
against the snapshot of the repository.  You can wire synchronization
into the unlock and the CVSup daemon, but I never bothered with the
Modula-3 code that would have taken.

This doesn't prevent people from sticking in bogus code and unlocking
without testing it, but you can lay the blame squarely at the correct
feet, in all cases, since they can't claim the failure was the result
of a simultaneous update with another developer.  You can also back out
the changes, in all cases, until they are made functional.

Basically, unless your repository is buildable nightly, you can't
guarantee completion of nightly snapshots.  If you can't guarantee
nightly snapshots, then you can't do consistent and automated
regression testing.  If you can't do consistent and automated
regression testing, then you can't measure project progress, especially
for maintenance cycles on already released products, but also for "next
revision" products.

If you want to look at it that way, you could consider it to be the
first steps towards a requirements tracking process:

1) Customer requirements
2) Use cases
3) Specification
4) Deviations from fulfillment of requirements
   i)   Failure to implement
   ii)  Defect reports from customers
5) Regression testing
   i)   Implementation verification
   ii)  Defect resolution verification
   iii) Specification compliance verification
6) Ability to consistently and reproducibly build something to be tested
7) Self consistency of the source tree

It all flows downhill from the goal of meeting the customer
requirements.  If you want to gear your development processes to the
model even further (e.g. by writing test cases for unit testing to
ensure specification compliance before writing a line of code for the
product itself, etc.), then you can carry this through to ensure that
there isn't a line of code written without that line of code needing to
be written to fulfill a customer requirement of some kind.
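Strung together, the nightly lock/snapshot/build loop is only about a
page of shell.  A sketch of what I mean follows; every path, tag, and
helper name in it is invented for illustration, and "cvslock" and
"cvsunlock" stand in for the sgid lock/unlock wrappers described above
(they are not standard tools):

    #!/bin/sh
    # Hypothetical nightly snapshot-and-build driver.

    REPO=/home/ncvs                 # master repository
    SNAP=/snapshot/ncvs             # copy that cvsupd serves
    LOG=/var/log/nightly.$(date +%Y%m%d)

    cvslock "$REPO"                          # quiesce commits
    rm -rf "$SNAP" && cp -Rp "$REPO" "$SNAP" # consistent, buildable copy
    cvsunlock "$REPO"                        # let commits resume

    # Check out from the snapshot and try to build it.
    cd /build || exit 1
    cvs -d "$SNAP" -q checkout src > "$LOG" 2>&1
    if ! (cd src && make buildworld) >> "$LOG" 2>&1; then
        # The build broke: mail the log to the authors of the deltas
        # since the last good tag (finding them is the tag-diff step
        # from earlier; "build-breakers" is a placeholder alias).
        mail -s "nightly build failed" build-breakers < "$LOG"
        exit 1
    fi

    # The build worked: move the "last known good" tag forward.
    cvs -d "$REPO" -q rtag -F LAST_GOOD src

The script itself isn't the point; the point is that once something
like it runs unattended every night, the "self consistency of the
source tree" item at the bottom of the list stops depending on
developer goodwill.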
You really want to build the knowledge and the quality of the product
into the process, rather than into key people who might get hit by a
bus, or go to work for your competitor for a salary hike you are
unwilling/unable to match.

It's like building a product to reduce technical support costs:

1) Ensure that each error message is unique
2) Ensure that each message can only result from a single condition,
   so that heuristics are not required to differentiate root causes
   (as much as possible)
3) Include obvious keywords/keyphrases in the error reports --
   preferably set off in the message by bold or brackets, etc., so
   that they get reported to the representative
4) Build an associative database of keyword/keyphrase and solution
   pairs (e.g. dbVista from Raima Corp.), and display the matches in
   match count/frequency-of-correctness order
5) Hire minimum wage monkeys for bottom tier support: you have built
   your knowledge into the system, rather than building it into key
   people
6) Feed #4 back into the maintenance engineering process (if a problem
   can be identified that way, then it can be proactively fixed,
   and/or the message can simply tell the user what the administrator
   needs to do, if fixing it requires a privileged operation)
7) Distribute your database to OEMs/VARs/VADs/etc., so they can look
   smart to their customers
8) Translate your database into other languages when you translate
   your product into other languages, so front line support in Japan
   can be done in Japan

...though most companies stupidly consider technical support to be a
"profit center" these days, and so don't spend a lot of time trying to
eliminate it (I say "stupidly" because this is how WordPerfect lost out
to Microsoft Word).

When you think about it, the rules you should follow in any given
situation are pretty obvious.

-- Terry
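P.S.: The lookup in #4 doesn't have to start out as anything fancier
than a flat file and a little shell.  The sketch below (the file format
and every name in it are invented, purely to show the idea) ranks
canned solutions by how many of their keywords appear in the pasted-in
error report, which is the match-count ordering I mean:

    #!/bin/sh
    # Hypothetical keyword/solution lookup.  solutions.txt holds one
    # entry per line: comma-separated keywords, a tab, then the
    # solution text.  Usage: ./lookup.sh report.txt
    REPORT="$1"
    TAB="$(printf '\t')"

    while IFS="$TAB" read -r keywords solution; do
        count=0
        for kw in $(echo "$keywords" | tr ',' ' '); do
            grep -qi -- "$kw" "$REPORT" && count=$((count + 1))
        done
        [ "$count" -gt 0 ] && printf '%d\t%s\n' "$count" "$solution"
    done < solutions.txt | sort -rn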