From owner-freebsd-rc@FreeBSD.ORG Tue Oct 23 16:57:22 2007
Message-ID: <471E27E5.4030609@riverwillow.com.au>
Date: Wed, 24 Oct 2007 02:57:09 +1000
From: John Marshall
Organization: Riverwillow Pty Ltd
To: Brooks Davis
Cc: Mike Telahun Makonnen, freebsd-rc@FreeBSD.Org
References: <471D7F68.8070308@riverwillow.com.au> <584bfc3f0710230505i29e8f19aofc4e66d0aee7b7c1@mail.gmail.com> <471DFFD0.8020701@riverwillow.com.au> <20071023155932.GA37204@lor.one-eyed-alien.net>
In-Reply-To: <20071023155932.GA37204@lor.one-eyed-alien.net>
Subject: Re: How to debug rc hangs?
List-Id: "Discussion related to /etc/rc.d design and implementation."

Brooks Davis wrote:
> On Wed, Oct 24, 2007 at 12:06:08AM +1000, John Marshall wrote:
>> Mike Telahun Makonnen wrote:
>>> On 10/23/07, John Marshall wrote:
>>>> I have tried setting rc_debug="YES" in rc.conf but that doesn't show me
>>>> any more than I already know (e.g. last line before mountd hang is:
>>>> "/etc/rc: DEBUG: run_rc_command: doit: /usr/sbin/mountd -l")
>>> It seems to me that if it's getting this far, the problem probably is
>>> not in rc.d. The next thing it does after that debug message is eval the
>>> $doit line you saw, so either the eval command is misbehaving or the
>>> problem is with the daemon and not rc.d. What does Ctrl-T say when it
>>> hangs? Also, I noticed all three programs you listed are network
>>> daemons. My guess is they are not actually hung; they only *appear* to
>>> hang because they're waiting on some sort of network resource (DNS
>>> maybe?).
>> Thanks Mike,
>>
>> The Ctrl-T tip is the kind of information I'm looking for. My primary reason
>> for posting is to find out what tools/switches/hooks are available to help
>> troubleshoot this kind of problem, rather than asking somebody else to solve
>> it.
>>
>> Having said that, Ctrl-T shows:
>> load: 0.74 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1428k
>> load: 0.25 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1432k
>> load: 0.12 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1432k
>>
>> ...which lends weight to my suspicion that a pre-requisite resource is not
>> yet available - and, perhaps, hasn't yet started due to a circular
>> dependency? As I hinted, my plan is to drill down into the PROVIDE/REQUIRE
>> labyrinth and work by trial and error (with a reboot in between each error).
>> I'm happy to do that but I'm hoping that I might be able to use this
>> situation to learn of more elegant ways to diagnose the problem.
>>
>> ...and to reiterate, this is on 7.0-BETA1 (built Saturday morning) and all
>> this was working without any intervention on 6.2-RELEASE.
>
> When I see processes stalled on nanslp at boot it's usually when my network is
> messed up in some way. I think it's stuck in the resolver trying to look things
> up.

[blush] I actually fixed this 12 months ago on 6.n and forgot all about it.
I let the 7.0 mergemaster overwrite the rc.d/ypset because I didn't think I
had touched it.

Here is the fix. All happy now - but not much the wiser as to rc
troubleshooting techniques.

-----------------------------------------------
--- /usr/src/etc/rc.d/ypset 2007-10-12 12:38:42.000000000 +1000
+++ /etc/rc.d/ypset 2007-10-24 02:31:32.000000000 +1000
@@ -5,6 +5,7 @@
 # PROVIDE: ypset
 # REQUIRE: ypbind
+# BEFORE: mountd
 . /etc/rc.subr
-----------------------------------------------

Thank you for your help.

-- 
John Marshall
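
For anyone retracing this: the patch works because rcorder(8) reads the
PROVIDE/REQUIRE/BEFORE comment lines to decide boot order, so a quick way to
see what a script declares is to grep for them. A minimal sketch follows,
assuming a throwaway copy of the patched ypset header. The helper name
rc_keywords and the temporary file are illustrative only and not part of the
original fix; on a live system you would point it at /etc/rc.d/ypset instead.

```shell
#!/bin/sh
# Hypothetical helper (not part of rc.subr): print the ordering keyword
# lines that an rc.d script declares.
rc_keywords() {
    grep -E '^# (PROVIDE|REQUIRE|BEFORE|KEYWORD):' "$1"
}

# Throwaway copy of the patched header, using only the lines shown in the
# diff above. This is an assumption for illustration, not the full script.
tmp=$(mktemp /tmp/ypset.XXXXXX)
cat > "$tmp" <<'EOF'
#!/bin/sh
# PROVIDE: ypset
# REQUIRE: ypbind
# BEFORE: mountd
. /etc/rc.subr
EOF

# Lists the PROVIDE, REQUIRE and BEFORE lines, one per line.
rc_keywords "$tmp"
rm -f "$tmp"
```

Running the same grep over every file in /etc/rc.d gives a flat view of the
dependency declarations without a reboot between each guess.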