Date:      Wed, 28 Dec 2011 15:42:42 +0100
From:      Damien Fleuriot <ml@my.gd>
To:        Sergey Kandaurov <pluknet@gmail.com>,  Jeremy Chadwick <freebsd@jdc.parodius.com>
Cc:        "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject:   Re: [SOLVED] stuck /etc/rc autoboot processes
Message-ID:  <4EFB2AE2.909@my.gd>
In-Reply-To: <4EFB294F.703@my.gd>
References:  <4EFA129C.2090407@my.gd> <CAE-mSO%2BrMGn2sqc-PhWBe0=023uPqDK=kyWSx2o86nzCgM%2BuOQ@mail.gmail.com> <4EFAF945.3050509@my.gd> <CAE-mSOLBSq_OhPtQyjPsLk8uEa90jTyA889TJ-BC3P=RGEW0WA@mail.gmail.com> <4EFB294F.703@my.gd>


On 12/28/11 3:35 PM, Damien Fleuriot wrote:
> 
> 
> On 12/28/11 12:37 PM, Sergey Kandaurov wrote:
>> On 28 December 2011 15:11, Damien Fleuriot <ml@my.gd> wrote:
>>>
>>>
>>> On 12/28/11 11:50 AM, Sergey Kandaurov wrote:
>>>> On 27 December 2011 22:46, Damien Fleuriot <ml@my.gd> wrote:
>>>>> Hello list,
>>>>>
>>>>>
>>>>>
>>>>> Yesterday and today, I've been busy either patching boxes for the BIND
>>>>> advisory that we received on the 23rd (when they were running 8.1 or
>>>>> 8.2-RELEASE), or upgrading them (when running 8.0-RELEASE).
>>>>>
>>>>>
>>>>> Today I've come across 2 boxes running 8.2-STABLE and of course, the
>>>>> BIND patch wouldn't apply correctly.
>>>>>
>>>>> I've decided to cvsup them to 8.2-RELEASE and "upgrade" them to it.
>>>>>
>>>>>
>>>>> I've gone through the following steps:
>>>>> - make buildworld
>>>>> - make buildkernel
>>>>> - make installkernel
>>>>> - nextboot -k with my new kernel, to ensure it worked fine
>>>>> - rebooted again with the new kernel, this time correctly installed as
>>>>> /boot/kernel
>>>>> - installed the world
>>>>> - ran mergemaster -FiPU
>>>>> - rebuilt ports
>>>>>
>>>>>
>>>>> Now, I'm facing this odd situation where, just after booting, I get this
>>>>> on the 2 boxes:
>>>>>
>>>>>
>>>>> root         22  0.0  0.0  8256  1876  v0  Is+   7:32PM   0:00.03 sh
>>>>> /etc/rc autoboot
>>>>> root       1250  0.0  0.0 18000  2576  v0  I+    7:32PM   0:00.04
>>>>> /usr/local/sbin/rsyslogd -a /var/run/log -a /var/named/var/run/log -i
>>>>> /var/run/syslog.pid -f /usr/local/etc/rsyslog.conf
>>>>> root       1790  0.0  0.0  8256  1952  v0  I+    7:32PM   0:00.00 sh
>>>>> /etc/rc autoboot
>>>>> root       1793  0.0  0.0  8256  1952  v0  I+    7:32PM   0:00.00 sh
>>>>> /etc/rc autoboot
>>>>>
>>>>>
>>>>> Does anybody have an idea why I get these stuck "sh /etc/rc autoboot"
>>>>> processes ?
>>>>>
>>>>> Any pointers as to where I should look ?
>>>>
>>>> Check whether the box has working name resolution during boot.
>>>> This is one of the main reasons it may get stuck in the /etc/rc phase.
>>>> When on the physical console, type ^T. Usually it will give you
>>>> the name of the offending process.
>>>>
>>>> You posted output from ps aux. It would be nice if you posted
>>>> ps auxl as well, so that the values of the MWCHAN ps keyword are
>>>> also visible, which can provide additional debugging info.
>>>>
>>>
>>>
>>> Find below the info:
>>>
>>>
>>> # ps aufx
>>> http://pastebin.com/iLy0Hs8s
>>>
>>> # ps aufxl
>>> http://pastebin.com/3meFWvRH
>>>
>>> # dmesg.boot
>>> http://pastebin.com/rFEsPfD5
>>>
>>> Again, the box gets stuck at "Local package initialization:" from
>>> /etc/rc.d/localpkg
>>>
>>>
>>> I then run the following:
>>> # sh -x /etc/rc.d/localpkg
>>>
>>>
>>> A snip from the end of the script's output (stuck) yields:
>>> + logger 'localpkg: DEBUG: run_rc_command: doit: pkg_start '
>>> + echo 'localpkg: DEBUG: run_rc_command: doit: pkg_start '
>>> localpkg: DEBUG: run_rc_command: doit: pkg_start
>>> + eval 'pkg_start '
>>> + pkg_start
>>> + local initdone
>>> + initdone=''
>>> + find_local_scripts_old
>>> + zlist=''
>>> + slist=''
>>> + [ -d /usr/local/etc/rc.d ]
>>> + grep '^# PROVIDE:' '/usr/local/etc/rc.d/[0-9]*.sh'
>>> + zlist=' /usr/local/etc/rc.d/[0-9]*.sh'
>>> + grep '^# PROVIDE:' /usr/local/etc/rc.d/relayd_check.sh
>>> + slist=' /usr/local/etc/rc.d/relayd_check.sh'
>>> + [ -z '' -a -f '/usr/local/etc/rc.d/[0-9]*.sh' ]
>>> + [ -x '/usr/local/etc/rc.d/[0-9]*.sh' ]
>>> + [ -f '/usr/local/etc/rc.d/[0-9]*.sh' -o -L
>>> '/usr/local/etc/rc.d/[0-9]*.sh' ]
>>> + [ -z '' -a -f /usr/local/etc/rc.d/relayd_check.sh ]
>>> + echo -n 'Local package initialization:'
>>> Local package initialization:+ initdone=yes
>>> + [ -x /usr/local/etc/rc.d/relayd_check.sh ]
>>> + set -T
>>> + trap 'exit 1' 2
>>> + /usr/local/etc/rc.d/relayd_check.sh start
>>>
>>>
>>> relayd_check.sh is a custom script that I wrote to monitor relayd for
>>> crashes, log them to /var/log/ and restart the process.
>>>
>>> This script does *not* contain any "PROVIDE / REQUIRE / KEYWORD"
>>> initialization info and I'm beginning to think this may be the problem.
>>>
>>> I shall try further, thanks for all the pointers so far :)
>>
>> Yeah, I was just about to point you at the sleeping relayd_check.sh
>> when I finished reading your mail :-). Ok, glad I could help.
>>
>> btw,
>> rcorder(8) can help you work out the start order for your script.
>> Just use:  rcorder /etc/rc.d/* /usr/local/etc/rc.d/*
>>
> 
> 
> Well, that was definitely relayd_check.sh (which I have renamed to
> relayd_check, dropping the .sh extension for consistency).
> 
> I've also taken the opportunity to run "yes | make delete-old" from /usr/src
> 
> I'm definitely not running make delete-old-libs though, I fear that
> might break things.


For anyone reading the archive who runs into the same problem: a stuck
"sh /etc/rc autoboot" or /etc/rc.d/localpkg process, or a boot that
hangs at "Local package initialization:".

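The problem was caused by a defective (meh) user-made script in
/usr/local/etc/rc.d/ that lacked rcorder(8) tags. Going by the sh -x
trace above, localpkg treated it as an old-style script and ran it in
the foreground, where it never returned. The tags it needed look like
this: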

# PROVIDE: relayd_check
# REQUIRE: NETWORKING relayd
# KEYWORD: shutdown


Adding these solved the problem.
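
To spot other scripts with the same defect, something along these lines
may help. It is only a minimal sketch: list_untagged is a name I made up
(it is not part of the base system), and it assumes the scripts sit
directly in the directory you pass, typically /usr/local/etc/rc.d.

```shell
#!/bin/sh
# Print every regular file in the given directory that has no
# "# PROVIDE:" line, i.e. a script rcorder(8) cannot place.
# list_untagged is a hypothetical helper name, not a system command.
list_untagged() {
    dir=$1
    for f in "$dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty dir.
        [ -f "$f" ] || continue
        # grep -q exits non-zero when no PROVIDE line is present.
        grep -q '^# PROVIDE:' "$f" || echo "$f"
    done
}
```

Run it as "list_untagged /usr/local/etc/rc.d"; any path it prints is a
candidate for the tags shown above.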


Ty Sergey and Jeremy for your time.


