Message-ID: <4EFB2AE2.909@my.gd>
Date: Wed, 28 Dec 2011 15:42:42 +0100
From: Damien Fleuriot
To: Sergey Kandaurov, Jeremy Chadwick
Cc: "freebsd-stable@freebsd.org"
Subject: Re: [SOLVED] stuck /etc/rc autoboot processes
References: <4EFA129C.2090407@my.gd> <4EFAF945.3050509@my.gd> <4EFB294F.703@my.gd>
In-Reply-To: <4EFB294F.703@my.gd>
List-Id: Production branch of FreeBSD source code

On 12/28/11 3:35 PM, Damien Fleuriot wrote:
>
> On 12/28/11 12:37 PM, Sergey Kandaurov wrote:
>> On 28 December 2011 15:11, Damien Fleuriot wrote:
>>>
>>> On 12/28/11 11:50 AM, Sergey Kandaurov wrote:
>>>> On 27 December 2011 22:46, Damien Fleuriot wrote:
>>>>> Hello list,
>>>>>
>>>>> Yesterday and today, I've been busy either patching boxes for the BIND
>>>>> advisory that we received on the 23rd (when they were running 8.1 or
>>>>> 8.2-RELEASE), or upgrading them (when running 8.0-RELEASE).
>>>>>
>>>>> Today I've come across 2 boxes running 8.2-STABLE and of course, the
>>>>> BIND patch wouldn't apply correctly.
>>>>>
>>>>> I've decided to cvsup them to 8.2-RELEASE and "upgrade" them to it.
>>>>>
>>>>> I've gone through the following steps:
>>>>> - make buildworld
>>>>> - make buildkernel
>>>>> - make installkernel
>>>>> - nextboot -k my new kernel, to ensure it worked fine
>>>>> - rebooted again with the new kernel, this time correctly installed as
>>>>> /boot/kernel
>>>>> - installed the world
>>>>> - run mergemaster -FiPU
>>>>> - rebuild ports
>>>>>
>>>>> Now, I'm facing this odd situation where, just after booting, I get this
>>>>> on the 2 boxes:
>>>>>
>>>>> root 22 0.0 0.0 8256 1876 v0 Is+ 7:32PM 0:00.03 sh
>>>>> /etc/rc autoboot
>>>>> root 1250 0.0 0.0 18000 2576 v0 I+ 7:32PM 0:00.04
>>>>> /usr/local/sbin/rsyslogd -a /var/run/log -a /var/named/var/run/log -i
>>>>> /var/run/syslog.pid -f /usr/local/etc/rsyslog.conf
>>>>> root 1790 0.0 0.0 8256 1952 v0 I+ 7:32PM 0:00.00 sh
>>>>> /etc/rc autoboot
>>>>> root 1793 0.0 0.0 8256 1952 v0 I+ 7:32PM 0:00.00 sh
>>>>> /etc/rc autoboot
>>>>>
>>>>> Does anybody have an idea why I get these stuck "sh /etc/rc autoboot"
>>>>> processes ?
>>>>>
>>>>> Any pointers as to where I should look ?
>>>>
>>>> Check if the box has a working resolving during boot.
>>>> This is a main reason why it may stuck in /etc/rc phase.
>>>> When on physical console, type ^T. Usually it will get you
>>>> the name of offending process.
>>>>
>>>> You posted output from ps aux. It would be nice if you post
>>>> ps auxl, so values of MWCHAN ps keyword will be also seen,
>>>> which can add an additional debugging info.
>>>>
>>>
>>> Find below the info:
>>>
>>> # ps aufx
>>> http://pastebin.com/iLy0Hs8s
>>>
>>> # ps aufxl
>>> http://pastebin.com/3meFWvRH
>>>
>>> # dmesg.boot
>>> http://pastebin.com/rFEsPfD5
>>>
>>> Again, the box gets stuck at "Local package initialization:" from
>>> /etc/rc.d/localpkg
>>>
>>> I then run the following:
>>> # sh -x /etc/rc.d/localpkg
>>>
>>> A snip from the end of the script's output (stuck) yields:
>>> + logger 'localpkg: DEBUG: run_rc_command: doit: pkg_start '
>>> + echo 'localpkg: DEBUG: run_rc_command: doit: pkg_start '
>>> localpkg: DEBUG: run_rc_command: doit: pkg_start
>>> + eval 'pkg_start '
>>> + pkg_start
>>> + local initdone
>>> + initdone=''
>>> + find_local_scripts_old
>>> + zlist=''
>>> + slist=''
>>> + [ -d /usr/local/etc/rc.d ]
>>> + grep '^# PROVIDE:' '/usr/local/etc/rc.d/[0-9]*.sh'
>>> + zlist=' /usr/local/etc/rc.d/[0-9]*.sh'
>>> + grep '^# PROVIDE:' /usr/local/etc/rc.d/relayd_check.sh
>>> + slist=' /usr/local/etc/rc.d/relayd_check.sh'
>>> + [ -z '' -a -f '/usr/local/etc/rc.d/[0-9]*.sh' ]
>>> + [ -x '/usr/local/etc/rc.d/[0-9]*.sh' ]
>>> + [ -f '/usr/local/etc/rc.d/[0-9]*.sh' -o -L
>>> '/usr/local/etc/rc.d/[0-9]*.sh' ]
>>> + [ -z '' -a -f /usr/local/etc/rc.d/relayd_check.sh ]
>>> + echo -n 'Local package initialization:'
>>> Local package initialization:+ initdone=yes
>>> + [ -x /usr/local/etc/rc.d/relayd_check.sh ]
>>> + set -T
>>> + trap 'exit 1' 2
>>> + /usr/local/etc/rc.d/relayd_check.sh start
>>>
>>> relayd_check.sh is a custom script that I wrote to monitor relayd for
>>> crashes, log them to /var/log/ and restart the process.
>>>
>>> This script does *not* contain any "PROVIDE / REQUIRE / KEYWORD"
>>> initialization info and I'm beginning to think this may be the problem.
>>>
>>> I shall try further, thanks for all the pointers so far :)
>>
>> Yeah, just thought to point you out at sleeping relayd_check.sh,
>> when I finished to read your mail :-). Ok, I was glad to help you.
>>
>> btw,
>> rcorder(8) can help you to clarify a starting order for your process.
>> Just use: rcorder /etc/rc.d/* /usr/local/etc/rc.d/*
>>
>
> Well, that was definitely check_relayd.sh (which I have moved to
> check_relayd , to skip the .sh extension for consistency).
>
> I've also taken the opportunity to run "yes | make delete-old" from /usr/src
>
> I'm definitely not running make delete-old-libs though, I fear that
> might break things.

For anyone reading the archive who runs into the same problem, i.e. a
stuck "sh /etc/rc autoboot" or /etc/rc.d/localpkg, or a boot that hangs
at "Local package initialization:":

The problem was caused by a defective (meh) user-made script in
/usr/local/etc/rc.d/ that was missing its rcorder tags, for example:

# PROVIDE: relayd_check
# REQUIRE: NETWORKING relayd
# KEYWORD: shutdown

Adding these solved the problem.

Thanks, Sergey and Jeremy, for your time.
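
To check whether other scripts in /usr/local/etc/rc.d/ have the same defect, something like the following may help. It is only a minimal sketch: the function name missing_provide is made up for illustration, and it assumes that a missing "# PROVIDE:" line is a good enough sign of a tag-less script, since that is the pattern the localpkg trace greps for. It does not check the REQUIRE or KEYWORD lines.

```shell
#!/bin/sh
# Hypothetical helper: print every regular file in an rc.d directory
# that has no "# PROVIDE:" line. The directory defaults to the one
# discussed in this thread.
missing_provide() {
    dir="${1:-/usr/local/etc/rc.d}"
    for f in "$dir"/*; do
        # Skip anything that is not a regular file (this also covers
        # the case where the glob matched nothing).
        [ -f "$f" ] || continue
        # Same pattern that /etc/rc.d/localpkg greps for in the trace.
        grep -q '^# PROVIDE:' "$f" || echo "$f"
    done
}
```

Calling missing_provide with no argument scans /usr/local/etc/rc.d; any path it prints is a candidate for getting the three tag lines added.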