From owner-freebsd-arch@FreeBSD.ORG  Wed Jun  4 00:59:21 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id BAEC837B401
	for <arch@FreeBSD.ORG>; Wed,  4 Jun 2003 00:59:21 -0700 (PDT)
Received: from HAL9000.homeunix.com (ip232.bella-vista.sfo.interquest.net
	[66.199.86.232])
	by mx1.FreeBSD.org (Postfix) with ESMTP id F403043F75
	for <arch@FreeBSD.ORG>; Wed,  4 Jun 2003 00:59:20 -0700 (PDT)
	(envelope-from das@FreeBSD.ORG)
Received: from HAL9000.homeunix.com (localhost [127.0.0.1])
	by HAL9000.homeunix.com (8.12.9/8.12.5) with ESMTP id h547xIqG008545;
	Wed, 4 Jun 2003 00:59:18 -0700 (PDT)
	(envelope-from das@FreeBSD.ORG)
Received: (from das@localhost)
	by HAL9000.homeunix.com (8.12.9/8.12.5/Submit) id h547xIjk008544;
	Wed, 4 Jun 2003 00:59:18 -0700 (PDT)
	(envelope-from das@FreeBSD.ORG)
Date: Wed, 4 Jun 2003 00:59:18 -0700
From: David Schultz <das@FreeBSD.ORG>
To: Terry Lambert <tlambert2@mindspring.com>
Message-ID: <20030604075918.GA8419@HAL9000.homeunix.com>
Mail-Followup-To: Terry Lambert <tlambert2@mindspring.com>,
	Peter Jeremy <peterjeremy@optushome.com.au>, arch@freebsd.org,
	Matthew Dillon <dillon@apollo.backplane.com>
References: <20030602171942.GA87863@roark.gnf.org>
	<xzpznl02nry.fsf@flood.ping.uio.no>
	<20030603080456.GA57773@cirb503493.alcatel.com.au>
	<3EDD7CFA.4795FB99@mindspring.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <3EDD7CFA.4795FB99@mindspring.com>
cc: Matthew Dillon <dillon@apollo.backplane.com>
cc: arch@FreeBSD.ORG
Subject: Re: Making a dynamically-linked root
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 04 Jun 2003 07:59:22 -0000

On Tue, Jun 03, 2003, Terry Lambert wrote:
> The main problem we ran into with doing this on the InterJet
> was that some services started later would finish starting
> before earlier services on which they were dependent.

You can solve this problem by enforcing the rule that when a
service forks off a daemon process, the parent does not exit
until the child is ready to accept requests.  I think Oracle
and PostgreSQL work like this.  Alternatively, as with named,
you can perform all necessary initialization and opening of
sockets before forking at all.
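
A minimal sketch of the first rule, assuming a POSIX system and
using a pipe as the readiness signal.  The names start_and_wait,
init and serve are hypothetical, and a real daemon would also
detach from its terminal (setsid and so on), which is left out:

```python
import os


def start_and_wait(init, serve):
    """Fork a child; return its pid only once the child reports readiness."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: finish initialization, then tell the parent we are ready.
        os.close(read_fd)
        status = 1
        try:
            init()
            os.write(write_fd, b"R")
            os.close(write_fd)
            serve()
            status = 0
        finally:
            # Never fall back into the parent's code path.
            os._exit(status)
    # Parent: block until the ready byte arrives, or EOF if the child died.
    os.close(write_fd)
    ready = os.read(read_fd, 1)
    os.close(read_fd)
    if ready != b"R":
        os.waitpid(pid, 0)
        raise RuntimeError("child exited before becoming ready")
    return pid
```

Because the parent only returns after the read completes, a
startup script that waits for the parent also waits for the
daemon to be ready.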

There may be some services that don't offer this level of sanity,
but these services can probably be fixed without too much effort.
In the worst case, you will need a wrapper to poll the daemon
until it is running normally.
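
Such a wrapper might look roughly like the sketch below.  It
assumes the daemon listens on a TCP port and treats "accepts a
connection" as a stand-in for "running normally", which will not
be true for every service.  The function name and its default
timeout and interval are made up for illustration:

```python
import socket
import time


def wait_until_listening(host, port, timeout=30.0, interval=0.2):
    """Poll host:port until a TCP connect succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect is taken to mean the daemon is up.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not accepting connections yet; back off and retry.
            time.sleep(interval)
    return False
```

The startup script would launch the daemon, call this, and only
then report the service as started.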

Once all of your services' startup scripts can make this
guarantee, writing a program to do parallel boot is easy.  You
continuously try to start as many ``exposed'' nodes in your
dependency graph at a time as you can (up to some concurrency
limit), where an exposed node is a node whose ancestors have all
finished starting.  (Wasn't this mentioned in NetBSD's original
rcNG proposal?)
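
A rough sketch of that loop, using a thread pool for the
concurrency limit.  It assumes start(name) blocks until the
service is ready (the guarantee above) and that deps maps each
service to the set of services it directly depends on.  The
names parallel_boot, deps, start and max_parallel are
hypothetical:

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def parallel_boot(deps, start, max_parallel=4):
    """Start every service in deps; return names in completion order."""
    # Copy the sets so the caller's graph is left untouched.
    remaining = {name: set(d) for name, d in deps.items()}
    running = {}  # future -> service name
    order = []
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        while remaining or running:
            # Exposed nodes: everything they depend on has finished.
            exposed = [n for n, d in remaining.items() if not d]
            for name in exposed:
                if len(running) >= max_parallel:
                    break
                del remaining[name]
                running[pool.submit(start, name)] = name
            if not running:
                # Nothing running and nothing exposed: we are stuck.
                raise ValueError("cycle or unknown dependency among: %s"
                                 % sorted(remaining))
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for fut in done:
                name = running.pop(fut)
                fut.result()  # re-raise if the service failed to start
                order.append(name)
                for d in remaining.values():
                    d.discard(name)
    return order
```

Checking only direct dependencies is enough here, because a
service cannot finish before its own dependencies have finished.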

> On top of that, the
> dependencies tend to be both hard and soft, e.g. it's possible
> to continue to offer a degraded service, rather than failing
> outright, if some dependent services aren't there (e.g. you
> can log by IP address if DNS isn't up to provide reverse
> name mappings to look pretty in your logs, etc.).

Distinctions like this are particularly important if you would
like to make use of the information about dependencies between
processes for more than just parallel boot.  Consider how you
could get your server to automatically recover if named dies
versus if your database dies.
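
One way to picture the difference, as a sketch only: label each
dependency "hard" or "soft", and when a service dies, restart
whatever hard-depends on it (transitively) while soft dependents
keep running in degraded mode.  The recovery_plan function, the
labels and the service names below are all assumptions made for
illustration:

```python
def recovery_plan(deps, died):
    """Return (services to restart, services left running degraded).

    deps maps service -> {dependency: "hard" or "soft"}.
    """
    restart = []
    degraded = []
    seen = {died}
    stack = [died]
    while stack:
        cur = stack.pop()
        for svc, needs in deps.items():
            kind = needs.get(cur)
            if kind == "hard" and svc not in seen:
                # Cannot work without cur: restart it and follow its users.
                seen.add(svc)
                restart.append(svc)
                stack.append(svc)
            elif kind == "soft" and svc not in seen and svc not in degraded:
                degraded.append(svc)
    # A service that must be restarted is not merely degraded.
    degraded = [s for s in degraded if s not in seen]
    return sorted(restart), sorted(degraded)


deps = {
    "named": {},
    "database": {},
    "httpd": {"named": "soft", "database": "hard"},
    "reports": {"httpd": "hard"},
}
print(recovery_plan(deps, "named"))     # ([], ['httpd'])
print(recovery_plan(deps, "database"))  # (['httpd', 'reports'], [])
```

If named dies, only named needs restarting and httpd carries on
logging IP addresses.  If the database dies, everything that
hard-depends on it has to be brought back up after it.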