From owner-freebsd-rc@FreeBSD.ORG Mon Oct 22 11:07:14 2007
Return-Path:
Delivered-To: freebsd-rc@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 11E9516A418 for ; Mon, 22 Oct 2007 11:07:14 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id F342913C480 for ; Mon, 22 Oct 2007 11:07:13 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.1/8.14.1) with ESMTP id l9MB7DIx080075 for ; Mon, 22 Oct 2007 11:07:13 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost) by freefall.freebsd.org (8.14.1/8.14.1/Submit) id l9MB7DoW080071 for freebsd-rc@FreeBSD.org; Mon, 22 Oct 2007 11:07:13 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 22 Oct 2007 11:07:13 GMT
Message-Id: <200710221107.l9MB7DoW080071@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster
To: freebsd-rc@FreeBSD.org
Cc:
Subject: Current problem reports assigned to freebsd-rc@FreeBSD.org
X-BeenThere: freebsd-rc@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussion related to /etc/rc.d design and implementation."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 22 Oct 2007 11:07:14 -0000

Current FreeBSD problem reports

Critical problems

Serious problems

S Tracker     Resp. Description
--------------------------------------------------------------------------------
o conf/98758  rc    [jail] [patch] Templatize 'jail_fstab' in /etc/rc.d/ja
o conf/98846  rc    [jail] [patch] Templatize 'jail_rootdir' in /etc/rc.d/
o conf/105689 rc    syslogd starts too late at boot
o conf/107155 rc    [ppp] /etc/rc.d/ppp-user does not bring up pppoe at bo
o conf/107364 rc    pf fails to start on bootup after system update from F

5 problems total.

Non-critical problems

S Tracker     Resp. Description
--------------------------------------------------------------------------------
o conf/45226  rc    [patch] Fix for rc.network, ppp-user annoyance
o conf/48870  rc    [PATCH] rc.network: allow to cancel interface status d
o conf/58939  rc    [patch] dumb little hack for /etc/rc.firewall{,6}
o conf/73677  rc    [patch] add support for powernow states to power_profi
o conf/74817  rc    [patch] network.subr: fixed automatic configuration of
o conf/77663  rc    Suggestion: add /etc/rc.d/addnetswap after addcritremo
o conf/79196  rc    [PATCH] configurable dummynet loading from /etc/rc.co
o kern/81006  rc    ipnat not working with tunnel interfaces on startup
o conf/85363  rc    syntax error in /etc/rc.d/devfs
o conf/85819  rc    [patch] script allowing multiuser mode in spite of fsc
o conf/88913  rc    [patch] wrapper support for rc.subr
o conf/89061  rc    [patch] IPv6 6to4 auto-configuration enhancement
o conf/89870  rc    [patch] feature request to make netif verbose rc.conf
o conf/92523  rc    [patch] allow rc scripts to kill process after a timeo
o conf/93815  rc    [patch] Adds in the ability to save ipfw rules to rc.d
o conf/95162  rc    [patch] Missing feature in rc.subr
o conf/96343  rc    [patch] rc.d order change to start inet6 before pf
o conf/99444  rc    [patch] Enhancement: rc.subr could easily support star
o conf/99721  rc    [patch] /etc/rc.initdiskless problem copy dotfile in s
o conf/102700 rc    [geli] [patch] Add encrypted /tmp support to GELI/GBDE
o conf/102913 rc    [jail] [patch] /etc/rc.d/named killall in jailed OS
o conf/103486 rc    [rc.d] [jail] [patch] rc.d/jail: mount fstab after dev
o conf/103489 rc    [rc.d] [jail] [patch] named_chroot_autoupdate doesn't
o conf/104549 rc    [patch] rc.d/nfsd needs special _find_processes functi
o conf/105145 rc    [PATCH] add redial function to rc.d/ppp
o conf/105568 rc    [patch] Add more flexibility to rc.conf, to choose "_e
o conf/106009 rc    [patch] Fix pppoed startup script to process multiply
o conf/109562 rc    [rc.d] [patch] Make rc.d/devfs usable from command-lin
o conf/109980 rc    /etc/rc.d/netif restart doesn't destroy cloned_interfa
o conf/114119 rc    [jail] [patch] /etc/rc.d/jail improvements for network
o conf/116177 rc    rc.d/mdconfig2 script fail at -CURRENT

31 problems total.

From owner-freebsd-rc@FreeBSD.ORG Tue Oct 23 05:13:53 2007
Return-Path:
Delivered-To: freebsd-rc@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 12C0316A420 for ; Tue, 23 Oct 2007 05:13:53 +0000 (UTC) (envelope-from John.Marshall@riverwillow.com.au)
Received: from mail2.riverwillow.net.au (ns2.riverwillow.net.au [203.58.93.41]) by mx1.freebsd.org (Postfix) with ESMTP id 8770413C481 for ; Tue, 23 Oct 2007 05:13:52 +0000 (UTC) (envelope-from John.Marshall@riverwillow.com.au)
Received: from rwmail.mby.riverwillow.net.au (rwsrv06.rw2.riverwillow.net.au [172.25.25.16]) by mail2.riverwillow.net.au (8.14.1/8.14.1) with ESMTP id l9N4wNlg014870 for ; Tue, 23 Oct 2007 14:58:23 +1000 (AEST)
Received: from [172.25.25.68] ([172.25.25.68] RDNS failed) by rwmail.mby.riverwillow.net.au with Microsoft SMTPSVC(6.0.3790.3959); Tue, 23 Oct 2007 14:58:23 +1000
Message-ID: <471D7F68.8070308@riverwillow.com.au>
Date: Tue, 23 Oct 2007 14:58:16 +1000
From: John Marshall
Organization: Riverwillow Pty Ltd
User-Agent: Thunderbird 2.0.0.6 (Windows/20070728)
MIME-Version: 1.0
To: freebsd-rc@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-OriginalArrivalTime: 23 Oct 2007
    04:58:23.0556 (UTC) FILETIME=[56E61840:01C81531]
Subject: How to debug rc hangs?
X-BeenThere: freebsd-rc@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussion related to /etc/rc.d design and implementation."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 23 Oct 2007 05:13:53 -0000

A few weeks ago I upgraded a server from 6.2-RELEASE to 7.0-PRERELEASE. I
re-built all ports on the new platform. During startup, the following
servers fail to start:

- mountd
- squid
- cvsupd

The system boots happily and rc does its thing up until "Starting mountd"
and sulks until I intervene with ctrl-C; at which point everything continues
until "Starting squid" (interrupt) and then "Starting cvsupd" (interrupt).

After startup has completed, the following work just fine:

- /etc/rc.d/mountd start
- /usr/local/etc/rc.d/squid start
- /usr/local/etc/rc.d/cvsupd start

I held out until 7.0-BETA1 and upgraded to that but still have the same
symptoms.

I have tried setting rc_debug="YES" in rc.conf but that doesn't show me
any more than I already know (e.g. last line before mountd hang is:
"/etc/rc: DEBUG: run_rc_command: doit: /usr/sbin/mountd -l").

So, are there any clever configuration switches to aid with this
debugging? Otherwise I'll just plough through manually editing rc.conf
"foo_enable" settings until I discover the PROVIDE/REQUIRE loop (or
something else).

Thank you.
-- John Marshall From owner-freebsd-rc@FreeBSD.ORG Tue Oct 23 12:05:11 2007 Return-Path: Delivered-To: freebsd-rc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 86EE216A46C for ; Tue, 23 Oct 2007 12:05:11 +0000 (UTC) (envelope-from mmakonnen@gmail.com) Received: from py-out-1112.google.com (py-out-1112.google.com [64.233.166.183]) by mx1.freebsd.org (Postfix) with ESMTP id 4212A13C4BF for ; Tue, 23 Oct 2007 12:05:11 +0000 (UTC) (envelope-from mmakonnen@gmail.com) Received: by py-out-1112.google.com with SMTP id u77so3255305pyb for ; Tue, 23 Oct 2007 05:05:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references:x-google-sender-auth; bh=/6ZXZyD/bT3KMEdezLRsDx+tIYj2+Bc2FK4kmmj+Ugw=; b=LoUxnjnVQ4V3WubwulmjO4A3fwJWdwtLrwGrHoE9g3GB3nFxjyLKskEGDbSIHOwNDC1vM73ZY+ANZ0sZpGJ9SnneLPUhA8yn7m0xf/dVuVAvA7mXW71CRfLWPFHo/FG5HJhBkzOHYhF++bBx5ao6aGVtUqrqEEGC3TQaBFyX0kg= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references:x-google-sender-auth; b=NT3tfwL0befgLJKZElD1PC3XaZCCMLBGaETr3I1kTSh5EYX6GM2IObLAcCh/kGQo9NJ8g2e9NKanhrvCeRJ6rrF2oVfIngK6fyJLIyDjmIdJgvuG0p1IAb+CpA6wffmy3qrqs68Nt5ma8CLqi/ECqW6vyY8eZgXbxKArjgrAAB8= Received: by 10.65.53.3 with SMTP id f3mr12038598qbk.1193141102517; Tue, 23 Oct 2007 05:05:02 -0700 (PDT) Received: by 10.65.239.20 with HTTP; Tue, 23 Oct 2007 05:05:02 -0700 (PDT) Message-ID: <584bfc3f0710230505i29e8f19aofc4e66d0aee7b7c1@mail.gmail.com> Date: Tue, 23 Oct 2007 15:05:02 +0300 From: "Mike Telahun Makonnen" Sender: mmakonnen@gmail.com To: "John Marshall" In-Reply-To: <471D7F68.8070308@riverwillow.com.au> 
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <471D7F68.8070308@riverwillow.com.au>
X-Google-Sender-Auth: 93d0a7843de0aaf9
Cc: freebsd-rc@freebsd.org
Subject: Re: How to debug rc hangs?
X-BeenThere: freebsd-rc@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussion related to /etc/rc.d design and implementation."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 23 Oct 2007 12:05:11 -0000

On 10/23/07, John Marshall wrote:
>
> I have tried setting rc_debug="YES" in rc.conf but that doesn't show me
> any more than I already know (e.g. last line before mountd hang is:
> "/etc/rc: DEBUG: run_rc_command: doit: /usr/sbin/mountd -l"

It seems to me that if it's getting this far, that the problem probably is
not in rc.d. The next thing it does after that debug message is eval the $doit
line you saw, so either the eval command is misbehaving or the problem
is with the daemon and not rc.d. What does Ctrl-T say when it hangs? Also,
I noticed all three programs you listed are network daemons. My guess is
they are not actually hung, they only *appear* to hang because they're waiting
on some sort of network resource (DNS maybe?).

Cheers,
Mike.
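
The point about eval can be tried out in isolation. What follows is only a minimal sketch of that step, not the rc.subr source: the function name run_doit is hypothetical and "sleep 1" stands in for a daemon that is slow to return. It shows that the calling script stays blocked after the debug line until the eval'd command comes back, which is why a daemon waiting on the network looks like an rc hang.

```shell
#!/bin/sh
# Hypothetical stand-in for the step described above; this is NOT the
# rc.subr implementation, only an illustration of "print debug, then eval".
run_doit() {
    doit=$1
    echo "DEBUG: run_rc_command: doit: $doit" >&2
    # eval runs the command in the foreground: nothing after this line
    # executes until the command itself returns.
    eval "$doit"
}

start=$(date +%s)
run_doit "sleep 1"      # "sleep 1" plays the part of a slow-starting daemon
end=$(date +%s)
echo "eval returned after $((end - start)) second(s)"
```

If the command never returns, the script never gets past the eval, so the last thing on the console is the debug line, as in the report above.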
From owner-freebsd-rc@FreeBSD.ORG Tue Oct 23 14:06:26 2007 Return-Path: Delivered-To: freebsd-rc@FreeBSD.Org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BA6B216A417 for ; Tue, 23 Oct 2007 14:06:26 +0000 (UTC) (envelope-from John.Marshall@riverwillow.com.au) Received: from mail1.riverwillow.net.au (ns1.riverwillow.net.au [203.58.93.40]) by mx1.freebsd.org (Postfix) with ESMTP id 3A97413C48E for ; Tue, 23 Oct 2007 14:06:25 +0000 (UTC) (envelope-from John.Marshall@riverwillow.com.au) Received: from rwmail.mby.riverwillow.net.au (rwsrv06.rw2.riverwillow.net.au [172.25.25.16]) by mail1.riverwillow.net.au (8.14.1/8.14.1) with ESMTP id l9NE6CcT093549; Wed, 24 Oct 2007 00:06:12 +1000 (AEST) Received: from [172.25.25.68] ([172.25.25.68] RDNS failed) by rwmail.mby.riverwillow.net.au with Microsoft SMTPSVC(6.0.3790.3959); Wed, 24 Oct 2007 00:06:12 +1000 Message-ID: <471DFFD0.8020701@riverwillow.com.au> Date: Wed, 24 Oct 2007 00:06:08 +1000 From: John Marshall Organization: Riverwillow Pty Ltd User-Agent: Thunderbird 2.0.0.6 (Windows/20070728) MIME-Version: 1.0 To: Mike Telahun Makonnen References: <471D7F68.8070308@riverwillow.com.au> <584bfc3f0710230505i29e8f19aofc4e66d0aee7b7c1@mail.gmail.com> In-Reply-To: <584bfc3f0710230505i29e8f19aofc4e66d0aee7b7c1@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 23 Oct 2007 14:06:12.0157 (UTC) FILETIME=[DE1C96D0:01C8157D] Cc: "freebsd-rc@FreeBSD.Org" Subject: Re: How to debug rc hangs? X-BeenThere: freebsd-rc@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussion related to /etc/rc.d design and implementation." 
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 23 Oct 2007 14:06:26 -0000

Mike Telahun Makonnen wrote:
> On 10/23/07, John Marshall wrote:
>> I have tried setting rc_debug="YES" in rc.conf but that doesn't show me
>> any more than I already know (e.g. last line before mountd hang is:
>> "/etc/rc: DEBUG: run_rc_command: doit: /usr/sbin/mountd -l"
>
> It seems to me that if it's getting this far, that the problem probably is
> not in rc.d. The next thing it does after that debug message is eval the $doit
> line you saw, so either the eval command is misbehaving or the problem
> is with the daemon and not rc.d. What does Ctrl-T say when it hangs? Also,
> I noticed all three programs you listed are network daemons. My guess is
> they are not actually hung, they only *appear* to hang because they're waiting
> on some sort of network resource (DNS maybe?).
>

Thanks Mike,

The ctrl-T tip is the kind of information I'm looking for. My primary
reason for posting is to find out what tools/switches/hooks are available
to help troubleshoot this kind of problem, rather than asking somebody
else to solve it.

Having said that, ctrl-T shows:
load: 0.74 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1428k
load: 0.25 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1432k
load: 0.12 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1432k

...which lends weight to my suspicion that a pre-requisite resource is not
yet available - and, perhaps, hasn't yet started due to a circular
dependency? As I hinted, my plan is to drill down into the PROVIDE/REQUIRE
labyrinth and work by trial and error (with a reboot in between each
error). I'm happy to do that but I'm hoping that I might be able to use
this situation to learn of more elegant ways to diagnose the problem.

...and to reiterate, this is on 7.0-BETA1 (built Saturday morning) and all
this was working without any intervention on 6.2-RELEASE.
-- John Marshall From owner-freebsd-rc@FreeBSD.ORG Tue Oct 23 15:59:34 2007 Return-Path: Delivered-To: freebsd-rc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CAFF716A41B; Tue, 23 Oct 2007 15:59:34 +0000 (UTC) (envelope-from brooks@lor.one-eyed-alien.net) Received: from lor.one-eyed-alien.net (cl-162.ewr-01.us.sixxs.net [IPv6:2001:4830:1200:a1::2]) by mx1.freebsd.org (Postfix) with ESMTP id 7475D13C4AA; Tue, 23 Oct 2007 15:59:34 +0000 (UTC) (envelope-from brooks@lor.one-eyed-alien.net) Received: from lor.one-eyed-alien.net (localhost [127.0.0.1]) by lor.one-eyed-alien.net (8.13.8/8.13.8) with ESMTP id l9NFxWkc037521; Tue, 23 Oct 2007 10:59:33 -0500 (CDT) (envelope-from brooks@lor.one-eyed-alien.net) Received: (from brooks@localhost) by lor.one-eyed-alien.net (8.13.8/8.13.8/Submit) id l9NFxWfJ037520; Tue, 23 Oct 2007 10:59:32 -0500 (CDT) (envelope-from brooks) Date: Tue, 23 Oct 2007 10:59:32 -0500 From: Brooks Davis To: John Marshall Message-ID: <20071023155932.GA37204@lor.one-eyed-alien.net> References: <471D7F68.8070308@riverwillow.com.au> <584bfc3f0710230505i29e8f19aofc4e66d0aee7b7c1@mail.gmail.com> <471DFFD0.8020701@riverwillow.com.au> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="7AUc2qLy4jB3hD7Z" Content-Disposition: inline In-Reply-To: <471DFFD0.8020701@riverwillow.com.au> User-Agent: Mutt/1.5.15 (2007-04-06) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.0 (lor.one-eyed-alien.net [127.0.0.1]); Tue, 23 Oct 2007 10:59:33 -0500 (CDT) Cc: Mike Telahun Makonnen , "freebsd-rc@FreeBSD.Org" Subject: Re: How to debug rc hangs? X-BeenThere: freebsd-rc@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussion related to /etc/rc.d design and implementation." 
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 23 Oct 2007 15:59:34 -0000

--7AUc2qLy4jB3hD7Z
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Wed, Oct 24, 2007 at 12:06:08AM +1000, John Marshall wrote:
> Mike Telahun Makonnen wrote:
> > On 10/23/07, John Marshall wrote:
> >> I have tried setting rc_debug="YES" in rc.conf but that doesn't show me
> >> any more than I already know (e.g. last line before mountd hang is:
> >> "/etc/rc: DEBUG: run_rc_command: doit: /usr/sbin/mountd -l"
> > It seems to me that if it's getting this far, that the problem probably is
> > not in rc.d. The next thing it does after that debug message is eval the
> > $doit
> > line you saw, so either the eval command is misbehaving or the problem
> > is with the daemon and not rc.d. What does Ctrl-T say when it hangs? Also,
> > I noticed all three programs you listed are network daemons. My guess is
> > they are not actually hung, they only *appear* to hang because they're
> > waiting
> > on some sort of network resource (DNS maybe?).
>
> Thanks Mike,
>
> The ctrl-T tip is the kind of information I'm looking for. My primary reason
> for posting is to find out what tools/switches/hooks are available to help
> troubleshoot this kind of problem, rather than asking somebody else to solve
> it.
>
> Having said that, ctrl-T shows:
> load: 0.74 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1428k
> load: 0.25 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1432k
> load: 0.12 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1432k
>
> ...which lends weight to my suspicion that a pre-requisite resource is not
> yet available - and, perhaps, hasn't yet started due to a circular
> dependency? As I hinted, my plan is to drill down into the PROVIDE/REQUIRE
> labyrinth and work by trial and error (with a reboot in between each error).
> I'm happy to do that but I'm hoping that I might be able to use this
> situation to learn of more elegant ways to diagnose the problem.
>
> ...and to reiterate, this is on 7.0-BETA1 (built Saturday morning) and all
> this was working without any intervention on 6.2-RELEASE.

When I see processes stalled on nanslp at boot it's usually when my network is
messed up in some way. I think it's stuck in the resolver trying to look things
up.

-- Brooks

--7AUc2qLy4jB3hD7Z
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (FreeBSD)

iD8DBQFHHhpkXY6L6fI4GtQRAkDPAJ4tGbFNBkxilOfUaiEFmzmdEEVdkgCdF4Bi
qMFgQjmSxT8hTTNCykN77tk=
=S2Vr
-----END PGP SIGNATURE-----

--7AUc2qLy4jB3hD7Z--

From owner-freebsd-rc@FreeBSD.ORG Tue Oct 23 16:57:22 2007
Return-Path:
Delivered-To: freebsd-rc@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EFAB116A417 for ; Tue, 23 Oct 2007 16:57:22 +0000 (UTC) (envelope-from John.Marshall@riverwillow.com.au)
Received: from mail2.riverwillow.net.au (ns2.riverwillow.net.au [203.58.93.41]) by mx1.freebsd.org (Postfix) with ESMTP id 89BC513C49D for ; Tue, 23 Oct 2007 16:57:21 +0000 (UTC) (envelope-from John.Marshall@riverwillow.com.au)
Received: from rwmail.mby.riverwillow.net.au (rwsrv06.rw2.riverwillow.net.au [172.25.25.16]) by mail2.riverwillow.net.au (8.14.1/8.14.1) with ESMTP id l9NGvC2s038304; Wed, 24 Oct 2007 02:57:12 +1000 (AEST)
Received: from [172.25.25.68] ([172.25.25.68] RDNS failed) by rwmail.mby.riverwillow.net.au with Microsoft SMTPSVC(6.0.3790.3959); Wed, 24 Oct 2007 02:57:12 +1000
Message-ID: <471E27E5.4030609@riverwillow.com.au>
Date: Wed, 24 Oct 2007 02:57:09 +1000
From: John Marshall
Organization: Riverwillow Pty Ltd
User-Agent:
    Thunderbird 2.0.0.6 (Windows/20070728)
MIME-Version: 1.0
To: Brooks Davis
References: <471D7F68.8070308@riverwillow.com.au> <584bfc3f0710230505i29e8f19aofc4e66d0aee7b7c1@mail.gmail.com> <471DFFD0.8020701@riverwillow.com.au> <20071023155932.GA37204@lor.one-eyed-alien.net>
In-Reply-To: <20071023155932.GA37204@lor.one-eyed-alien.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-OriginalArrivalTime: 23 Oct 2007 16:57:12.0263 (UTC) FILETIME=[C19C8570:01C81595]
Cc: "Mike Telahun Makonnen ; freebsd-rc@FreeBSD.Org"
Subject: Re: How to debug rc hangs?
X-BeenThere: freebsd-rc@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussion related to /etc/rc.d design and implementation."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 23 Oct 2007 16:57:23 -0000

Brooks Davis wrote:
> On Wed, Oct 24, 2007 at 12:06:08AM +1000, John Marshall wrote:
>> Mike Telahun Makonnen wrote:
>>> On 10/23/07, John Marshall wrote:
>>>> I have tried setting rc_debug="YES" in rc.conf but that doesn't show me
>>>> any more than I already know (e.g. last line before mountd hang is:
>>>> "/etc/rc: DEBUG: run_rc_command: doit: /usr/sbin/mountd -l"
>>> It seems to me that if it's getting this far, that the problem probably is
>>> not in rc.d. The next thing it does after that debug message is eval the
>>> $doit
>>> line you saw, so either the eval command is misbehaving or the problem
>>> is with the daemon and not rc.d. What does Ctrl-T say when it hangs? Also,
>>> I noticed all three programs you listed are network daemons. My guess is
>>> they are not actually hung, they only *appear* to hang because they're
>>> waiting
>>> on some sort of network resource (DNS maybe?).
>> Thanks Mike,
>>
>> The ctrl-T tip is the kind of information I'm looking for. My primary reason
>> for posting is to find out what tools/switches/hooks are available to help
>> troubleshoot this kind of problem, rather than asking somebody else to solve
>> it.
>>
>> Having said that, ctrl-T shows:
>> load: 0.74 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1428k
>> load: 0.25 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1432k
>> load: 0.12 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1432k
>>
>> ...which lends weight to my suspicion that a pre-requisite resource is not
>> yet available - and, perhaps, hasn't yet started due to a circular
>> dependency? As I hinted, my plan is to drill down into the PROVIDE/REQUIRE
>> labyrinth and work by trial and error (with a reboot in between each error).
>> I'm happy to do that but I'm hoping that I might be able to use this
>> situation to learn of more elegant ways to diagnose the problem.
>>
>> ...and to reiterate, this is on 7.0-BETA1 (built Saturday morning) and all
>> this was working without any intervention on 6.2-RELEASE.
>
> When I see processes stalled on nanslp at boot it's usually when my network is
> messed up in some way. I think it's stuck in the resolver trying to look things
> up.

[blush] I actually fixed this 12 months ago on 6.n and forgot all about
it. I let the 7.0 mergemaster overwrite the rc.d/ypset because I didn't
think I had touched it.

Here is the fix. All happy now - but not much the wiser as to rc
troubleshooting techniques.

-----------------------------------------------
--- /usr/src/etc/rc.d/ypset 2007-10-12 12:38:42.000000000 +1000
+++ /etc/rc.d/ypset 2007-10-24 02:31:32.000000000 +1000
@@ -5,6 +5,7 @@

 # PROVIDE: ypset
 # REQUIRE: ypbind
+# BEFORE: mountd

 . /etc/rc.subr
-----------------------------------------------

Thank you for your help.
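
When hunting for an ordering problem like the one fixed above, it can help to see the dependency keywords of several scripts side by side. The helper below is a minimal sketch under stated assumptions: rc_deps is a hypothetical name and not part of rc.subr, and it only matches keyword comments written exactly in the "# KEYWORD:" form that appears in the diff above. The demo runs on a throwaway copy of the patched ypset header rather than on a live /etc/rc.d.

```shell
#!/bin/sh
# Hypothetical helper (not part of rc.subr): print the PROVIDE/REQUIRE/
# BEFORE/KEYWORD comment lines of the given scripts, each prefixed with
# the script's basename.
rc_deps() {
    for script in "$@"; do
        grep -E '^# (PROVIDE|REQUIRE|BEFORE|KEYWORD):' "$script" |
            while IFS= read -r line; do
                # Strip the leading "# " and prefix the file name.
                printf '%s: %s\n' "${script##*/}" "${line#\# }"
            done
    done
}

# Demo on a temporary copy of the patched ypset header from the diff above.
demo_dir=$(mktemp -d)
cat > "$demo_dir/ypset" <<'EOF'
#!/bin/sh
#
# PROVIDE: ypset
# REQUIRE: ypbind
# BEFORE: mountd

. /etc/rc.subr
EOF
rc_deps "$demo_dir/ypset"
rm -rf "$demo_dir"
```

On a real system the same function could be pointed at /etc/rc.d/* and /usr/local/etc/rc.d/* and the output searched for the service that hangs.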
-- John Marshall From owner-freebsd-rc@FreeBSD.ORG Tue Oct 23 17:49:54 2007 Return-Path: Delivered-To: freebsd-rc@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 60E9A16A418 for ; Tue, 23 Oct 2007 17:49:54 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: from mail2.fluidhosting.com (mx21.fluidhosting.com [204.14.89.4]) by mx1.freebsd.org (Postfix) with SMTP id 24D4713C480 for ; Tue, 23 Oct 2007 17:49:53 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: (qmail 19214 invoked by uid 399); 23 Oct 2007 17:49:43 -0000 Received: from localhost (HELO slave.dougb.net) (dougb@dougbarton.us@127.0.0.1) by localhost with ESMTP; 23 Oct 2007 17:49:43 -0000 X-Originating-IP: 127.0.0.1 Date: Tue, 23 Oct 2007 10:49:40 -0700 (PDT) From: Doug Barton To: John Marshall In-Reply-To: <471E27E5.4030609@riverwillow.com.au> Message-ID: References: <471D7F68.8070308@riverwillow.com.au> <584bfc3f0710230505i29e8f19aofc4e66d0aee7b7c1@mail.gmail.com> <471DFFD0.8020701@riverwillow.com.au> <20071023155932.GA37204@lor.one-eyed-alien.net> <471E27E5.4030609@riverwillow.com.au> X-message-flag: Outlook -- Not just for spreading viruses anymore! X-OpenPGP-Key-ID: 0xD5B2F0FB Organization: http://www.FreeBSD.org/ MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="0-1398324669-1193161782=:14145" Cc: Brooks Davis , "Mike Telahun Makonnen ; freebsd-rc@FreeBSD.Org" Subject: Re: How to debug rc hangs? X-BeenThere: freebsd-rc@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussion related to /etc/rc.d design and implementation." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Oct 2007 17:49:54 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. 
--0-1398324669-1193161782=:14145
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

On Wed, 24 Oct 2007, John Marshall wrote:

> [blush] I actually fixed this 12 months ago on 6.n and forgot all about it. I
> let the 7.0 mergemaster overwrite the rc.d/ypset because I didn't think I had
> touched it.
>
> Here is the fix. All happy now - but not much the wiser as to rc
> troubleshooting techniques.
>
> -----------------------------------------------
> --- /usr/src/etc/rc.d/ypset 2007-10-12 12:38:42.000000000 +1000
> +++ /etc/rc.d/ypset 2007-10-24 02:31:32.000000000 +1000
> @@ -5,6 +5,7 @@
>
> # PROVIDE: ypset
> # REQUIRE: ypbind
> +# BEFORE: mountd
>
> . /etc/rc.subr

John,

Thanks for digging into this yourself, always nice when someone answers
their own questions. :)

In regards to your proposed solution, I would like to explore this with
you a little. In general it would be preferred to add REQUIRE: ypset in
mountd instead of using BEFORE. This makes debugging easier down the road.

The other thing I'm interested in is the relationship between this change
and the other yp* stuff. I've attached a patch that I use to debug the
rcorder stuff. Could you copy your /etc/rc to a handy directory, apply the
patch, and then run 'sh rc'? If you could do that before and after
applying the change you suggested (using REQUIRE instead of BEFORE if you
don't mind), I'd be interested to see whether anything else changes besides
the ypset/mountd relationship.

My concern is that in my rcorder (basically stock RELENG_6 with some
ports) ypxfrd, ypupdated, and ypset are all running in that order, and
late in the overall order. I'm concerned that if we move ypset earlier
(and mountd is run a lot earlier than those 3) it will have other side
effects. I also noticed that ypserv and ypbind are run much earlier than
even mountd, and since I don't use NIS I don't really know what those side
effects might be (if any).
I may be a little overcautious here, but experience has taught us that even seemingly small changes can have large and unanticipated consequences. Doug -- This .signature sanitized for your protection --0-1398324669-1193161782=:14145 Content-Type: TEXT/PLAIN; charset=US-ASCII; name=rc-debug.patch Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: Content-Disposition: attachment; filename=rc-debug.patch LS0tIC9ldGMvcmMJMjAwNy0wNy0wNiAyMjoyNjowOS4wMDAwMDAwMDAgLTA3 MDANCisrKyByYwkyMDA3LTEwLTIzIDEwOjM1OjAzLjAwMDAwMDAwMCAtMDcw MA0KQEAgLTg1LDggKzQ3LDEyIEBADQogIw0KIGZpbGVzPWByY29yZGVyICR7 c2tpcH0gL2V0Yy9yYy5kLyogMj4vZGV2L251bGxgDQogDQorIyBYWFgNCity bSAtZiByYy5lYXJseSogcmMubGF0ZQ0KKw0KIGZvciBfcmNfZWxlbSBpbiAk e2ZpbGVzfTsgZG8NCi0JcnVuX3JjX3NjcmlwdCAke19yY19lbGVtfSAke19i b290fQ0KKwkjcnVuX3JjX3NjcmlwdCAke19yY19lbGVtfSAke19ib290fQ0K KwllY2hvICRfcmNfZWxlbSA+PiByYy5lYXJseTENCiANCiAJY2FzZSAiJF9y Y19lbGVtIiBpbg0KIAkqLyR7ZWFybHlfbGF0ZV9kaXZpZGVyfSkJYnJlYWsg OzsNCkBAIC0xMDcsMTYgKzczLDIyIEBADQogX3NraXBfZWFybHk9MQ0KIGZv ciBfcmNfZWxlbSBpbiAke2ZpbGVzfTsgZG8NCiAJY2FzZSAiJF9za2lwX2Vh cmx5IiBpbg0KLQkxKQljYXNlICIkX3JjX2VsZW0iIGluDQorCTEpDQorCQll Y2hvICRfcmNfZWxlbSA+PiByYy5lYXJseTINCisJCWNhc2UgIiRfcmNfZWxl bSIgaW4NCiAJCSovJHtlYXJseV9sYXRlX2RpdmlkZXJ9KQlfc2tpcF9lYXJs eT0wIDs7DQogCQllc2FjDQogCQljb250aW51ZQ0KIAkJOzsNCiAJZXNhYw0K IA0KLQlydW5fcmNfc2NyaXB0ICR7X3JjX2VsZW19ICR7X2Jvb3R9DQorCWVj aG8gJF9yY19lbGVtID4+IHJjLmxhdGUNCisNCisJI3J1bl9yY19zY3JpcHQg JHtfcmNfZWxlbX0gJHtfYm9vdH0NCiBkb25lDQogDQorZGlmZiAtdSByYy5l YXJseSoNCisNCiBlY2hvICcnDQogZGF0ZQ0KIGV4aXQgMA0K --0-1398324669-1193161782=:14145-- From owner-freebsd-rc@FreeBSD.ORG Wed Oct 24 12:17:49 2007 Return-Path: Delivered-To: freebsd-rc@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7DC0A16A420; Wed, 24 Oct 2007 12:17:49 +0000 (UTC) (envelope-from John.Marshall@riverwillow.com.au) Received: from mail2.riverwillow.net.au (ns2.riverwillow.net.au 
    [203.58.93.41]) by mx1.freebsd.org (Postfix) with ESMTP id D682013C4B3; Wed, 24 Oct 2007 12:17:48 +0000 (UTC) (envelope-from John.Marshall@riverwillow.com.au)
Received: from rwmail.mby.riverwillow.net.au (rwsrv06.rw2.riverwillow.net.au [172.25.25.16]) by mail2.riverwillow.net.au (8.14.1/8.14.1) with ESMTP id l9OCHYVI077609; Wed, 24 Oct 2007 22:17:34 +1000 (AEST)
Received: from [172.25.25.68] ([172.25.25.68] RDNS failed) by rwmail.mby.riverwillow.net.au with Microsoft SMTPSVC(6.0.3790.3959); Wed, 24 Oct 2007 22:17:33 +1000
Message-ID: <471F37CA.1080005@riverwillow.com.au>
Date: Wed, 24 Oct 2007 22:17:14 +1000
From: John Marshall
Organization: Riverwillow Pty Ltd
User-Agent: Thunderbird 2.0.0.6 (Windows/20070728)
MIME-Version: 1.0
To: Doug Barton
References: <471D7F68.8070308@riverwillow.com.au> <584bfc3f0710230505i29e8f19aofc4e66d0aee7b7c1@mail.gmail.com> <471DFFD0.8020701@riverwillow.com.au> <20071023155932.GA37204@lor.one-eyed-alien.net> <471E27E5.4030609@riverwillow.com.au>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-OriginalArrivalTime: 24 Oct 2007 12:17:33.0782 (UTC) FILETIME=[DB453F60:01C81637]
Cc: "Brooks Davis ; Mike Telahun Makonnen ; freebsd-rc@FreeBSD.Org"
Subject: Re: How to debug rc hangs?
X-BeenThere: freebsd-rc@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussion related to /etc/rc.d design and implementation."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 24 Oct 2007 12:17:49 -0000

Doug Barton wrote:
> On Wed, 24 Oct 2007, John Marshall wrote:
>> Here is the fix. All happy now - but not much the wiser as to rc
>> troubleshooting techniques.
>>
>> -----------------------------------------------
>> --- /usr/src/etc/rc.d/ypset 2007-10-12 12:38:42.000000000 +1000
>> +++ /etc/rc.d/ypset 2007-10-24 02:31:32.000000000 +1000
>> @@ -5,6 +5,7 @@
>>
>> # PROVIDE: ypset
>> # REQUIRE: ypbind
>> +# BEFORE: mountd
>>
>> . /etc/rc.subr
>
> In regards to your proposed solution, I would like to explore this with
> you a little. In general it would be preferred to add REQUIRE: ypset in
> mountd instead of using BEFORE. This makes debugging easier down the road.

Thanks Doug. No problem. I used your /etc/rc debug patch to produce
rc.late files for a few different scenarios and include diffs from those
below.

> My concern is that in my rcorder (basically stock RELENG_6 with some
> ports) ypxfrd, ypupdated, and ypset are all running in that order, and
> late in the overall order. I'm concerned that if we move ypset earlier
> (and mountd is run a lot earlier than those 3) it will have other side
> effects. I also noticed that ypserv and ypbind are run much earlier than even
> mountd, and since I don't use NIS I don't really know what those side
> effects might be (if any).

It looks to me like the only reason why ypset runs so late is that there
are no REQUIRE/BEFORE constraints requiring it to run earlier - and I
think there should be.

---- ypset(8) ----
NAME
     ypset -- tell ypbind(8) which YP server process to use
------------------

The sole purpose of ypset is to get ypbind working. Anything which has a
dependency on the NIS client (ypbind) is going to stall if ypbind is
stalled waiting to find a NIS server.

I have no NIS experience beyond this particular network, but my guess is
that most systems wouldn't be relying on ypset to get ypbind functioning,
so most people wouldn't hit this problem. If ypset is unused, who cares
where it sits in the rc order? If it is used, then it really needs to kick
in very soon after ypbind and before anything that requires ypbind.

Having delved a bit further into this, I think that mountd was probably
actually waiting on nfsserver, and that nfsserver has the implicit
dependency on ypbind. So, I added ypset to the REQUIRE list in
rc.d/nfsserver - and it works.
So, my revised proposed solution (which works fine for me) is:

-----------------
--- /usr/src/etc/rc.d/nfsserver 2007-10-12 12:38:42.000000000 +1000
+++ /etc/rc.d/nfsserver 2007-10-24 21:01:37.000000000 +1000
@@ -4,7 +4,7 @@
 #
 
 # PROVIDE: nfsserver
-# REQUIRE: NETWORKING mountcritremote
+# REQUIRE: NETWORKING mountcritremote ypset
 # KEYWORD: nojail
 
 . /etc/rc.subr
-----------------

(If there was some way of tying ypbind and ypset together, I think that
would be a better solution.)

...and the diffs resulting from various experiments with your debug patch
applied to /etc/rc (thanks!). In each case, the only thing that moves is
ypset - and the first two scenarios yield identical results.

-----------------
--- rc.late.standard 2007-10-24 08:54:59.000000000 +1000
+++ rc.late.ypset_before_mountd 2007-10-24 08:53:59.000000000 +1000
@@ -65,6 +65,7 @@
 /etc/rc.d/kadmind
 /etc/rc.d/keyserv
 /etc/rc.d/kpasswdd
+/etc/rc.d/ypset
 /etc/rc.d/quota
 /etc/rc.d/nfsserver
 /etc/rc.d/mountd
@@ -115,7 +116,6 @@
 /usr/local/etc/rc.d/apache22
 /etc/rc.d/ypxfrd
 /etc/rc.d/ypupdated
-/etc/rc.d/ypset
 /etc/rc.d/watchdogd
 /etc/rc.d/syscons
 /etc/rc.d/sshd
-----------------

-----------------
--- rc.late.standard 2007-10-24 08:54:59.000000000 +1000
+++ rc.late.mountd_requires_ypset 2007-10-24 08:55:44.000000000 +1000
@@ -65,6 +65,7 @@
 /etc/rc.d/kadmind
 /etc/rc.d/keyserv
 /etc/rc.d/kpasswdd
+/etc/rc.d/ypset
 /etc/rc.d/quota
 /etc/rc.d/nfsserver
 /etc/rc.d/mountd
@@ -115,7 +116,6 @@
 /usr/local/etc/rc.d/apache22
 /etc/rc.d/ypxfrd
 /etc/rc.d/ypupdated
-/etc/rc.d/ypset
 /etc/rc.d/watchdogd
 /etc/rc.d/syscons
 /etc/rc.d/sshd
-----------------

-----------------
--- rc.late.standard 2007-10-24 08:54:59.000000000 +1000
+++ rc.late.nfsserver_requires_ypset 2007-10-24 21:06:23.000000000 +1000
@@ -66,6 +66,7 @@
 /etc/rc.d/keyserv
 /etc/rc.d/kpasswdd
 /etc/rc.d/quota
+/etc/rc.d/ypset
 /etc/rc.d/nfsserver
 /etc/rc.d/mountd
 /etc/rc.d/nfsd
@@ -115,7 +116,6 @@
 /usr/local/etc/rc.d/apache22
 /etc/rc.d/ypxfrd
 /etc/rc.d/ypupdated
-/etc/rc.d/ypset
 /etc/rc.d/watchdogd
 /etc/rc.d/syscons
 /etc/rc.d/sshd
-----------------

-- 
John Marshall

From owner-freebsd-rc@FreeBSD.ORG Wed Oct 24 21:48:13 2007
Return-Path:
Delivered-To: freebsd-rc@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2414916A420 for ; Wed, 24 Oct 2007 21:48:13 +0000 (UTC) (envelope-from dougb@FreeBSD.org)
Received: from mail2.fluidhosting.com (mx21.fluidhosting.com [204.14.89.4]) by mx1.freebsd.org (Postfix) with SMTP id AFB2213C4A5 for ; Wed, 24 Oct 2007 21:48:12 +0000 (UTC) (envelope-from dougb@FreeBSD.org)
Received: (qmail 20054 invoked by uid 399); 24 Oct 2007 21:41:25 -0000
Received: from localhost (HELO slave.dougb.net) (dougb@dougbarton.us@127.0.0.1) by localhost with ESMTP; 24 Oct 2007 21:41:25 -0000
X-Originating-IP: 127.0.0.1
Date: Wed, 24 Oct 2007 14:41:23 -0700 (PDT)
From: Doug Barton
To: freebsd-rc@freebsd.org
Message-ID:
X-message-flag: Outlook -- Not just for spreading viruses anymore!
X-OpenPGP-Key-ID: 0xD5B2F0FB
Organization: http://www.FreeBSD.org/
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII
Subject: Please MFC rc.d work to RELENG_6
X-BeenThere: freebsd-rc@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussion related to /etc/rc.d design and implementation."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 24 Oct 2007 21:48:13 -0000

Howdy,

It's that time again. :) I took a quick look at the diff between HEAD and
RELENG_6 in rc.subr and rc.d/, and it's pretty big. (Of course some of it
isn't relevant, but a lot of it is.) I'm guilty here too, I just
committed a small fix that I did in June.

Please take a look at your commits in /etc/ between now and the
6.2-RELEASE and update anything you think is appropriate. There is a lot
of low-hanging fruit here, so it would be nice to clean this up.
Doug

-- 
This .signature sanitized for your protection

From owner-freebsd-rc@FreeBSD.ORG Wed Oct 24 22:30:20 2007
Return-Path:
Delivered-To: freebsd-rc@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8864416A4C6 for ; Wed, 24 Oct 2007 22:30:20 +0000 (UTC) (envelope-from dougb@FreeBSD.org)
Received: from mail2.fluidhosting.com (mx21.fluidhosting.com [204.14.89.4]) by mx1.freebsd.org (Postfix) with SMTP id 361FA13C4B5 for ; Wed, 24 Oct 2007 22:30:20 +0000 (UTC) (envelope-from dougb@FreeBSD.org)
Received: (qmail 16778 invoked by uid 399); 24 Oct 2007 22:30:05 -0000
Received: from localhost (HELO slave.dougb.net) (dougb@dougbarton.us@127.0.0.1) by localhost with ESMTP; 24 Oct 2007 22:30:05 -0000
X-Originating-IP: 127.0.0.1
Date: Wed, 24 Oct 2007 15:30:02 -0700 (PDT)
From: Doug Barton
To: John Marshall
In-Reply-To: <471F37CA.1080005@riverwillow.com.au>
Message-ID:
References: <471D7F68.8070308@riverwillow.com.au> <584bfc3f0710230505i29e8f19aofc4e66d0aee7b7c1@mail.gmail.com> <471DFFD0.8020701@riverwillow.com.au> <20071023155932.GA37204@lor.one-eyed-alien.net> <471E27E5.4030609@riverwillow.com.au> <471F37CA.1080005@riverwillow.com.au>
X-message-flag: Outlook -- Not just for spreading viruses anymore!
X-OpenPGP-Key-ID: 0xD5B2F0FB
Organization: http://www.FreeBSD.org/
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: "Brooks Davis ; Mike Telahun Makonnen ; freebsd-rc@FreeBSD.Org"
Subject: Re: How to debug rc hangs?
X-BeenThere: freebsd-rc@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussion related to /etc/rc.d design and implementation."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 24 Oct 2007 22:30:20 -0000

John,

Thanks for digging into this, a few comments in line.
On Wed, 24 Oct 2007, John Marshall wrote:

> It looks to me like the only reason why ypset runs so late is that there are
> no REQUIRE/BEFORE constraints requiring it to run earlier - and I think there
> should be.

Ok, fair enough.

> ---- ypset(8) ----
> NAME
>      ypset -- tell ypbind(8) which YP server process to use
> ------------------
>
> The sole purpose of ypset is to get ypbind working. Anything which has a
> dependency on the NIS client (ypbind) is going to stall if ypbind is stalled
> waiting to find a NIS server. I have no NIS experience beyond this particular
> network, but my guess is that most systems wouldn't be relying on ypset to
> get ypbind functioning, so most people wouldn't hit this problem. If ypset is
> unused, who cares where it sits in the rc order? If it is used, then it
> really needs to kick in very soon after ypbind and before anything that
> requires ypbind.

I think I understand that bit, thanks.

> Having delved a bit further into this, I think that mountd was probably
> actually waiting on nfsserver, and that nfsserver has the implicit dependency
> on ypbind. So, I added ypset to the REQUIRE list in rc.d/nfsserver - and it
> works.

More good detective work! I have another idea though ...

> (If there was some way of tying ypbind and ypset together, I think that would
> be a better solution.)

Well there are a couple ways of doing that. The least painful would be to
change any REQUIRE lines that mention ypbind to mention ypset instead.
ypset already has a REQUIRE for ypbind, so that would put them in the
right order, and closer together. Since I know next to nothing about NIS
though, I am hesitant to do that.

> ...and the diffs resulting from various experiments with your debug patch
> applied to /etc/rc (thanks!). In each case, the only thing that moves is
> ypset - and the first two scenarios yield identical results.

That's good news, I did notice something interesting though.
> -----------------
> --- rc.late.standard 2007-10-24 08:54:59.000000000 +1000
> +++ rc.late.ypset_before_mountd 2007-10-24 08:53:59.000000000 +1000
> @@ -65,6 +65,7 @@
>  /etc/rc.d/kadmind
>  /etc/rc.d/keyserv
>  /etc/rc.d/kpasswdd
> +/etc/rc.d/ypset
>  /etc/rc.d/quota

Here it comes before quota

> -----------------
> --- rc.late.standard 2007-10-24 08:54:59.000000000 +1000
> +++ rc.late.nfsserver_requires_ypset 2007-10-24 21:06:23.000000000 +1000
> @@ -66,6 +66,7 @@
>  /etc/rc.d/keyserv
>  /etc/rc.d/kpasswdd
>  /etc/rc.d/quota
> +/etc/rc.d/ypset

Here it comes after.

Now mountd has:
# REQUIRE: NETWORKING nfsserver rpcbind quota

and there is a comment in quota that says:
must be after ypbind if using NIS

So I'm thinking that adding REQUIRE: ypset to quota might be the way to
go. Can you try that fix instead and see if it works for you?

If you have time, I would also be interested to know if there are any
changes if you take the current CVS code and just s/ypbind/ypset/ for all
the scripts that currently REQUIRE ypbind. I'm not convinced that is the
right answer yet, I'm just interested in what's going to happen.
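
The s/ypbind/ypset/ experiment described above can be sketched roughly as follows. This is only an illustration, assuming a scratch copy of the rc.d directory rather than the live one; the helper names list_requirers and swap_require and the path /tmp/rc.d.test are made up for this example and are not part of rc.subr or the base system. Note that ypset itself has to keep its REQUIRE on ypbind, so it is skipped.

```sh
#!/bin/sh
# Sketch only. list_requirers and swap_require are hypothetical helper
# names, not part of rc.subr; run them against a scratch copy of rc.d.

# Print the files in directory $1 whose "# REQUIRE:" line names provider $2.
list_requirers() {
    grep -l -E "^# REQUIRE:.*[[:space:]]$2([[:space:]]|\$)" "$1"/* 2>/dev/null
}

# In file $1, replace provider $2 with provider $3 on REQUIRE lines only.
# The result is written back with cat so the file keeps its mode bits.
swap_require() {
    sed -E "/^# REQUIRE:/ s/([[:space:]])$2([[:space:]]|\$)/\\1$3\\2/" "$1" > "$1.new" &&
        cat "$1.new" > "$1" &&
        rm -f "$1.new"
}

# Example (assumed scratch path /tmp/rc.d.test, copied from /etc/rc.d):
#   for f in $(list_requirers /tmp/rc.d.test ypbind); do
#       case $f in */ypset) continue ;; esac   # ypset must keep REQUIRE: ypbind
#       swap_require "$f" ypbind ypset
#   done
```

Restricting the substitution to REQUIRE lines keeps PROVIDE lines and the script bodies untouched, so only the ordering constraints change.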
Doug

-- 
This .signature sanitized for your protection

From owner-freebsd-rc@FreeBSD.ORG Thu Oct 25 02:34:26 2007
Return-Path:
Delivered-To: freebsd-rc@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8C6EA16A420; Thu, 25 Oct 2007 02:34:26 +0000 (UTC) (envelope-from John.Marshall@riverwillow.com.au)
Received: from mail2.riverwillow.net.au (ns2.riverwillow.net.au [203.58.93.41]) by mx1.freebsd.org (Postfix) with ESMTP id CA4B113C4A3; Thu, 25 Oct 2007 02:34:25 +0000 (UTC) (envelope-from John.Marshall@riverwillow.com.au)
Received: from rwmail.mby.riverwillow.net.au (rwsrv06.rw2.riverwillow.net.au [172.25.25.16]) by mail2.riverwillow.net.au (8.14.1/8.14.1) with ESMTP id l9P2Y5Rj016152; Thu, 25 Oct 2007 12:34:05 +1000 (AEST)
Received: from [172.25.25.68] ([172.25.25.68] RDNS failed) by rwmail.mby.riverwillow.net.au with Microsoft SMTPSVC(6.0.3790.3959); Thu, 25 Oct 2007 12:34:05 +1000
Message-ID: <47200095.6050204@riverwillow.com.au>
Date: Thu, 25 Oct 2007 12:33:57 +1000
From: John Marshall
Organization: Riverwillow Pty Ltd
User-Agent: Thunderbird 2.0.0.6 (Windows/20070728)
MIME-Version: 1.0
To: Doug Barton
References: <471D7F68.8070308@riverwillow.com.au> <584bfc3f0710230505i29e8f19aofc4e66d0aee7b7c1@mail.gmail.com> <471DFFD0.8020701@riverwillow.com.au> <20071023155932.GA37204@lor.one-eyed-alien.net> <471E27E5.4030609@riverwillow.com.au> <471F37CA.1080005@riverwillow.com.au>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-OriginalArrivalTime: 25 Oct 2007 02:34:05.0310 (UTC) FILETIME=[830219E0:01C816AF]
Cc: "Brooks Davis ; Mike Telahun Makonnen ; freebsd-rc@FreeBSD.Org"
Subject: Re: How to debug rc hangs?
X-BeenThere: freebsd-rc@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussion related to /etc/rc.d design and implementation."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 25 Oct 2007 02:34:26 -0000

Doug Barton wrote:
> On Wed, 24 Oct 2007, John Marshall wrote:
>> (If there was some way of tying ypbind and ypset together, I think
>> that would be a better solution.)
>
> Well there are a couple ways of doing that. The least painful would be
> to change any REQUIRE lines that mention ypbind to mention ypset
> instead. ypset already has a REQUIRE for ypbind, so that would put them
> in the right order, and closer together. Since I know next to nothing
> about NIS though, I am hesitant to do that.

Apart from ypset, the only scripts which REQUIRE ypbind are:

- amd
- keyserv
- yppasswdd

I edited those three files and substituted "ypset" for "ypbind" in the
REQUIRE line (which should be OK because ypset REQUIREs ypbind) and ran
your modified rc script. Again, the only thing that moves in the rc order
(compared to out-of-the-box 7.0-BETA1) is ypset - but it moves to EXACTLY
the right place. I've expanded the diff context in order to show where
this sits in relation to nfsserver, mountd and friends.
------------------
--- rc.late.standard 2007-10-24 08:54:59.000000000 +1000
+++ rc.late.subs_require_ypset 2007-10-25 10:45:28.000000000 +1000
@@ -37,40 +37,41 @@
 /etc/rc.d/NETWORKING
 /etc/rc.d/mountcritremote
 /etc/rc.d/accounting
 /etc/rc.d/ldconfig
 /etc/rc.d/devfs
 /etc/rc.d/ipmon
 /etc/rc.d/mdconfig2
 /etc/rc.d/newsyslog
 /etc/rc.d/syslogd
 /etc/rc.d/savecore
 /etc/rc.d/archdep
 /etc/rc.d/abi
 /etc/rc.d/SERVERS
 /etc/rc.d/named
 /etc/rc.d/ntpdate
 /etc/rc.d/rpcbind
 /etc/rc.d/nfsclient
 /etc/rc.d/nisdomain
 /etc/rc.d/ypserv
 /etc/rc.d/ypbind
+/etc/rc.d/ypset
 /etc/rc.d/amd
 /etc/rc.d/atm3
 /etc/rc.d/auditd
 /etc/rc.d/tmp
 /etc/rc.d/cleartmp
 /etc/rc.d/dmesg
 /etc/rc.d/ipxrouted
 /etc/rc.d/kerberos
 /etc/rc.d/kadmind
 /etc/rc.d/keyserv
 /etc/rc.d/kpasswdd
 /etc/rc.d/quota
 /etc/rc.d/nfsserver
 /etc/rc.d/mountd
 /etc/rc.d/nfsd
 /etc/rc.d/statd
 /etc/rc.d/lockd
 /etc/rc.d/pppoed
 /etc/rc.d/pwcheck
 /etc/rc.d/virecover
@@ -98,41 +99,40 @@
 /usr/local/etc/rc.d/samba
 /etc/rc.d/LOGIN
 /usr/local/etc/rc.d/squid
 /usr/local/etc/rc.d/spamass-milter
 /usr/local/etc/rc.d/sa-spamd
 /usr/local/etc/rc.d/rsyncd_ext
 /usr/local/etc/rc.d/rsyncd
 /usr/local/etc/rc.d/milter-greylist.sh
 /usr/local/etc/rc.d/mailman
 /usr/local/etc/rc.d/icecast2
 /usr/local/etc/rc.d/ices0_recent
 /usr/local/etc/rc.d/ices0_mark
 /usr/local/etc/rc.d/ices0_attr
 /usr/local/etc/rc.d/ices0_acts
 /usr/local/etc/rc.d/ices0
 /usr/local/etc/rc.d/htcacheclean
 /usr/local/etc/rc.d/cvsupd
 /usr/local/etc/rc.d/apache22
 /etc/rc.d/ypxfrd
 /etc/rc.d/ypupdated
-/etc/rc.d/ypset
 /etc/rc.d/watchdogd
 /etc/rc.d/syscons
 /etc/rc.d/sshd
 /etc/rc.d/sendmail
 /etc/rc.d/cron
 /etc/rc.d/jail
 /etc/rc.d/localpkg
 /etc/rc.d/securelevel
 /etc/rc.d/othermta
 /etc/rc.d/msgs
 /etc/rc.d/moused
 /etc/rc.d/mixer
 /etc/rc.d/kernel
 /etc/rc.d/inetd
 /etc/rc.d/idmapd
 /etc/rc.d/hostapd
 /etc/rc.d/geli2
 /etc/rc.d/ftpd
 /etc/rc.d/ftp-proxy
 /etc/rc.d/bsnmpd
------------------

> Now mountd has:
> # REQUIRE: NETWORKING nfsserver rpcbind quota
>
> and there is a comment in quota that says:
> must be after ypbind if using NIS
>
> So I'm thinking that adding REQUIRE: ypset to quota might be the way to
> go. Can you try that fix instead and see if it works for you?

The absence of "ypbind" from the REQUIRE line in the quota script must be
an oversight, given the script's ypbind requirement comment, and ought to
be addressed. Also, as you note, the fact that mountd REQUIREs quota
means that fixing the quota dependency will kill two birds with the one
stone.

So, I added "ypset" to the REQUIRE line in quota and re-ran your script.
The result was identical to that posted above.

Rebooting with the following modifications resulted in a flawless boot:

- Replace ypbind with ypset in REQUIRE lines for:
  - amd
  - keyserv
  - yppasswdd
- Add ypset to REQUIRE line for quota

These modifications make perfect sense to me but I would value the
opinion of rc/NFS/NIS gurus before filing a PR - or is filing a PR the
proper way to bring something like this under review?

Thank you for your helpful, friendly tuition in rc order analysis. I hope
this might prove useful to others.
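
For anyone wanting to repeat this kind of rc order analysis, the comparison can be sketched as below. This is a rough illustration, assuming two listings with one script path per line (such as the rc.late.* files in this thread, or output captured from rcorder(8)); the function names position_of, runs_before and order_moves are invented for the example and are not part of the debug patch.

```sh
#!/bin/sh
# Sketch only. These helpers are hypothetical; they expect order listings
# with one absolute script path per line.

# Print the 1-based position of script path $1 in order listing $2.
position_of() {
    grep -n -x -F "$1" "$2" | cut -d: -f1
}

# Succeed if script $1 appears before script $2 in order listing $3.
runs_before() {
    a=$(position_of "$1" "$3")
    b=$(position_of "$2" "$3")
    [ -n "$a" ] && [ -n "$b" ] && [ "$a" -lt "$b" ]
}

# Print only the entries that moved between listings $1 and $2:
# "-" marks the old position, "+" the new one. The pattern requires a
# leading "/" so the "---" and "+++" diff header lines are not matched.
order_moves() {
    diff -u "$1" "$2" | grep -E '^[+-]/'
}

# Example, using file names from this thread:
#   order_moves rc.late.standard rc.late.subs_require_ypset
#   runs_before /etc/rc.d/ypset /etc/rc.d/quota rc.late.subs_require_ypset
```

Checking runs_before for each pair of interest (ypbind before ypset, ypset before quota) gives a quick pass/fail answer without reading the whole diff.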
-- 
John Marshall

From owner-freebsd-rc@FreeBSD.ORG Thu Oct 25 04:29:03 2007
Return-Path:
Delivered-To: freebsd-rc@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C4B8516A41A for ; Thu, 25 Oct 2007 04:29:03 +0000 (UTC) (envelope-from dougb@FreeBSD.org)
Received: from mail2.fluidhosting.com (mx21.fluidhosting.com [204.14.89.4]) by mx1.freebsd.org (Postfix) with SMTP id 510EB13C494 for ; Thu, 25 Oct 2007 04:29:03 +0000 (UTC) (envelope-from dougb@FreeBSD.org)
Received: (qmail 24293 invoked by uid 399); 25 Oct 2007 04:29:02 -0000
Received: from localhost (HELO slave.dougb.net) (dougb@dougbarton.us@127.0.0.1) by localhost with ESMTP; 25 Oct 2007 04:29:02 -0000
X-Originating-IP: 127.0.0.1
Date: Wed, 24 Oct 2007 21:28:59 -0700 (PDT)
From: Doug Barton
To: John Marshall
In-Reply-To: <47200095.6050204@riverwillow.com.au>
Message-ID:
References: <471D7F68.8070308@riverwillow.com.au> <584bfc3f0710230505i29e8f19aofc4e66d0aee7b7c1@mail.gmail.com> <471DFFD0.8020701@riverwillow.com.au> <20071023155932.GA37204@lor.one-eyed-alien.net> <471E27E5.4030609@riverwillow.com.au> <471F37CA.1080005@riverwillow.com.au> <47200095.6050204@riverwillow.com.au>
X-message-flag: Outlook -- Not just for spreading viruses anymore!
X-OpenPGP-Key-ID: 0xD5B2F0FB
Organization: http://www.FreeBSD.org/
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: "Brooks Davis ; Mike Telahun Makonnen ; freebsd-rc@FreeBSD.Org"
Subject: Re: How to debug rc hangs?
X-BeenThere: freebsd-rc@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussion related to /etc/rc.d design and implementation."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 25 Oct 2007 04:29:03 -0000

This is great work John, thanks!
On Thu, 25 Oct 2007, John Marshall wrote:

> These modifications make perfect sense to me but I would value the opinion of
> rc/NFS/NIS gurus before filing a PR - or is filing a PR the proper way to
> bring something like this under review?

Yeah, it is. My suggestion is to wait 24 hours for someone on this list
more knowledgeable than I (which is just about anyone) to speak up, and
if no one does then please file the PR, and forward me a copy of the
notice after the number is assigned. I'll then poke around for some
NIS-aware folks to review further before we commit it.

BTW, since this is a non-trivial change, and since I'm not familiar with
NIS, I'm not anticipating getting this in before the 7.0 or 6.3
-RELEASEs. I would rather get some review for it now, and put it in HEAD
for a while. Then I'll MFC it after the releases.

> Thank you for your helpful, friendly tuition in rc order analysis. I hope
> this might prove useful to others.

As do I.

Doug

-- 
This .signature sanitized for your protection