Date:      Wed, 24 Oct 2007 02:57:09 +1000
From:      John Marshall <John.Marshall@riverwillow.com.au>
To:        Brooks Davis <brooks@freebsd.org>
Cc:        "Mike Telahun Makonnen <mtm@freebsd.org>; freebsd-rc@FreeBSD.Org" <freebsd-rc@freebsd.org>
Subject:   Re: How to debug rc hangs?
Message-ID:  <471E27E5.4030609@riverwillow.com.au>
In-Reply-To: <20071023155932.GA37204@lor.one-eyed-alien.net>
References:  <471D7F68.8070308@riverwillow.com.au> <584bfc3f0710230505i29e8f19aofc4e66d0aee7b7c1@mail.gmail.com> <471DFFD0.8020701@riverwillow.com.au> <20071023155932.GA37204@lor.one-eyed-alien.net>

Brooks Davis wrote:
> On Wed, Oct 24, 2007 at 12:06:08AM +1000, John Marshall wrote:
>>  Mike Telahun Makonnen wrote:
>>> On 10/23/07, John Marshall <John.Marshall@riverwillow.com.au> wrote:
>>>> I have tried setting rc_debug="YES" in rc.conf but that doesn't show me
>>>> any more than I already know (e.g. last line before mountd hang is:
>>>> "/etc/rc: DEBUG: run_rc_command: doit: /usr/sbin/mountd -l").
>>> It seems to me that if it's getting this far, the problem probably is
>>> not in rc.d. The next thing it does after that debug message is eval the
>>> $doit line you saw, so either the eval command is misbehaving or the problem
>>> is with the daemon and not rc.d. What does Ctrl-T say when it hangs? Also,
>>> I noticed all three programs you listed are network daemons. My guess is
>>> they are not actually hung, they only *appear* to hang because they're
>>> waiting on some sort of network resource (DNS maybe?).
>>  Thanks Mike,
>>
>>  The ctrl-T tip is the kind of information I'm looking for. My primary reason 
>>  for posting is to find out what tools/switches/hooks are available to help 
>>  troubleshoot this kind of problem, rather than asking somebody else to solve 
>>  it.
>>
>>  Having said that, ctrl-T shows:
>>   load: 0.74 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1428k
>>   load: 0.25 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1432k
>>   load: 0.12 cmd: mountd 576 [nanslp] 0.00u 0.00s 0% 1432k
>>
>>  ...which lends weight to my suspicion that a pre-requisite resource is not 
>>  yet available - and, perhaps, hasn't yet started due to a circular 
>>  dependency? As I hinted, my plan is to drill down into the PROVIDE/REQUIRE 
>>  labyrinth and work by trial and error (with a reboot in between each error). 
>>  I'm happy to do that but I'm hoping that I might be able to use this 
>>  situation to learn of more elegant ways to diagnose the problem.
>>
>>  ...and to reiterate, this is on 7.0-BETA1 (built Saturday morning) and all 
>>  this was working without any intervention on 6.2-RELEASE.
> 
> When I see processes stalled on nanslp at boot it's usually when my network is
> messed up in some way.  I think it's stuck in the resolver trying to look things
> up.

[blush] I actually fixed this 12 months ago on 6.n and forgot all about 
it. I let the 7.0 mergemaster overwrite the rc.d/ypset because I didn't 
think I had touched it.

Here is the fix. All happy now - but not much the wiser as to rc 
troubleshooting techniques.

-----------------------------------------------
--- /usr/src/etc/rc.d/ypset     2007-10-12 12:38:42.000000000 +1000
+++ /etc/rc.d/ypset  2007-10-24 02:31:32.000000000 +1000
@@ -5,6 +5,7 @@
 
 # PROVIDE: ypset
 # REQUIRE: ypbind
+# BEFORE:  mountd
 
 . /etc/rc.subr

-----------------------------------------------
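
For anyone wanting to see what the added BEFORE line does to the start order without rebooting, here is a minimal sketch. It is not rcorder(8) itself: it only approximates the ordering by turning PROVIDE/REQUIRE/BEFORE keywords into pairs for tsort(1). The function name rc_edges and the three stub scripts are hypothetical and carry only the keywords relevant here (the real rc.d scripts have more dependencies).

```sh
#!/bin/sh
# Sketch only: approximate rc ordering from PROVIDE/REQUIRE/BEFORE keywords
# using tsort(1). rc_edges is a hypothetical helper, not part of rc.subr.

rc_edges() {
    # For every script in directory $1, print "earlier later" pairs.
    for f in "$1"/*; do
        # First name on the PROVIDE line (empty if the file has none).
        p=$(sed -n 's/^# PROVIDE:[[:space:]]*//p' "$f" | awk '{print $1; exit}')
        [ -n "$p" ] || continue
        # An identical pair tells tsort the item exists without ordering it.
        echo "$p $p"
        # REQUIRE: the named service must come before this script.
        for r in $(sed -n 's/^# REQUIRE:[[:space:]]*//p' "$f"); do
            echo "$r $p"
        done
        # BEFORE: this script must come before the named service.
        for b in $(sed -n 's/^# BEFORE:[[:space:]]*//p' "$f"); do
            echo "$p $b"
        done
    done
}

# Hypothetical stubs holding only the keywords from the patch above.
d=$(mktemp -d)
printf '# PROVIDE: ypbind\n' > "$d/ypbind"
printf '# PROVIDE: ypset\n# REQUIRE: ypbind\n# BEFORE:  mountd\n' > "$d/ypset"
printf '# PROVIDE: mountd\n' > "$d/mountd"

rc_edges "$d" | tsort
rm -rf "$d"
```

With the BEFORE line present the only possible order is ypbind, ypset, mountd. Without it, nothing forces ypset to run ahead of mountd. On a live system, rcorder(8) run over /etc/rc.d/* gives the real order, and tsort(1) reports a loop on stderr if the keywords form a cycle.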

Thank you for your help.


-- 
John Marshall


