Date:      Tue, 31 Jul 2007 18:00:10 GMT
From:      Bruce Evans <brde@optusnet.com.au>
To:        freebsd-i386@FreeBSD.org
Subject:   Re: i386/115054: NTP errors out on startup but restart of NTP fixes problem
Message-ID:  <200707311800.l6VI0AKN070144@freefall.freebsd.org>

The following reply was made to PR i386/115054; it has been noted by GNATS.

From: Bruce Evans <brde@optusnet.com.au>
To: "Chauncey N. Menefee" <cmenefee@prism-grp.com>
Cc: freebsd-gnats-submit@freebsd.org, freebsd-i386@freebsd.org
Subject: Re: i386/115054: NTP errors out on startup but restart of NTP fixes
 problem
Date: Tue, 31 Jul 2007 08:26:21 +1000 (EST)

 On Mon, 30 Jul 2007, Chauncey N. Menefee wrote:
 
 >> From what we've been able to gather the NTP daemon is starting up before the network card and errors out. Restarting the NTP service afterwards clears up the problem.
 
 Several versions of FreeBSD have annoying behaviour for network
 startup, involving the network not actually being up when ifconfig
 returns and subsequent different mishandling of this by various
 utilities.  I use the workaround of a couple of pings in rc.d/netif
 (ping -c2 -t2 $ntpdhost or ping -c1 -t1 $ntpdhost) so that ping times
 out instead of more important services.  This usually works for ntpd
 startup, but not for nfs startup.  Nfs doesn't fail, but makes you
 wish it would, by failing at first and then only retrying after about
 30 or 60 seconds, so that booting takes too long.
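
 A minimal sketch of the ping workaround as a bounded retry loop.  The
 helper name wait_for_net is hypothetical, not something in rc.d/netif;
 the probe command is passed in, so the same loop works with the ping
 invocations quoted above.  The loop has no sleep of its own and relies
 on the probe's timeout (ping -t1 here) to pace it.

```shell
#!/bin/sh
# Hypothetical helper: run a probe command until it succeeds or the
# number of tries is used up.  The probe's own timeout paces the loop.
# usage: wait_for_net TRIES PROBE [ARGS...]
wait_for_net() {
    tries=$1
    shift
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Discard the probe's output; only its exit status matters here.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}

# Example, using the FreeBSD ping flags from the message
# (ntpdhost is assumed to be set by the caller):
#   wait_for_net 3 ping -c1 -t1 "$ntpdhost"
```

 The return status tells the caller whether the host ever answered, so
 a slow link costs a few short timeouts instead of one long one.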
 
 This problem seems to get worse with each release of FreeBSD and/or
 with newer NICs.  I never noticed it with fxp or even ed or rl NICs.
 Now it is barely noticeable with fxp and very noticeable with sk, bge and em
 NICs.  For bge, "ifconfig up" after "ifconfig down" takes 2 seconds
 to return, but the network still isn't quite back up at that point,
 as shown by "route get $ntpdhost" taking another 5+ seconds to return
 and the route cloning not even being quite complete when it returns:
 
 
 Under FreeBSD-~5.2:
 
 %%%
 Script started on Tue Jul 31 07:58:24 2007
 ttyv1:root@besplex:/tmp> route get delplex; ifconfig bge0 down; time ifconfig bge0 up; time route get delplex; time route get delplex
     route to: delplex
 destination: delplex
    interface: bge0
        flags: <UP,HOST,DONE,LLINFO,WASCLONED>
   recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
         0         0         0         0         0         0      1500      1052
          1.90 real         0.00 user         1.90 sys
     route to: delplex
 destination: 192.168.2.0
         mask: 255.255.255.0
    interface: bge0
        flags: <UP,DONE,CLONING>
   recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
         0         0         0         0         0         0      1500        -7
          5.25 real         0.00 user         0.00 sys
     route to: delplex
 destination: delplex
    interface: bge0
        flags: <UP,HOST,DONE,LLINFO,WASCLONED>
   recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
         0         0         0         0         0         0      1500      1200
          0.00 real         0.00 user         0.00 sys
 ttyv1:root@besplex:/tmp> exit
 
 Script done on Tue Jul 31 07:58:56 2007
 %%%
 
 Maybe I should be using "route get $ntpdhost; route get $nfshost ..."
 instead of the pings, since route(8) apparently waits long enough,
 while waiting for the minimal amount of time is harder to program with
 ping (ping -c1 $ntpdhost takes 11+ seconds where "route get $ntpdhost"
 takes only 5+, and then it is unclear if ping waited long enough since
 it loses the packet anyway; I avoid this 11+ second wait using -t1 or
 -t2, but the 1-2 second timeout is apparently not long enough).
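
 A rough sketch of the "route get" variant, assuming the flags line in
 the transcript is a usable signal: the first "route get" after
 "ifconfig up" showed only the network route (<UP,DONE,CLONING>), and
 the next one showed the cloned host route with HOST set.  The function
 names and the ROUTE_CMD override are hypothetical, and whether HOST is
 a sufficient readiness test on every release is untested.

```shell
#!/bin/sh
# Print the comma-separated flag list from "route get" output on stdin.
route_flags() {
    sed -n 's/^ *flags: <\(.*\)>.*$/\1/p'
}

# Succeed if the "route get" output on stdin describes a host route.
has_host_route() {
    case ",$(route_flags)," in
        *,HOST,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: poll "route get HOST" until a host route shows up.
# ROUTE_CMD is an assumed override so the loop can be tried without route(8).
# usage: wait_for_host_route HOST TRIES
wait_for_host_route() {
    host=$1
    tries=$2
    while [ "$tries" -gt 0 ]; do
        if ${ROUTE_CMD:-route} get "$host" 2>/dev/null | has_host_route; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Example (ntpdhost is assumed to be set by the caller):
#   wait_for_host_route "$ntpdhost" 10
```

 Checking the flags instead of only the exit status matters because, in
 the transcript, the first "route get" returned successfully while still
 reporting the cloning network route.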
 
 At boot time, the initial ifconfig seems to involve too much link
 flapping.  At least for bge in -current on a different machine booted
 to single-user mode so that I can look at the initial state, the
 interface is already up (but unused), with the message about this being
 printed a couple of seconds after reaching the shell prompt (actually
 in the middle of "ifconfig <no options>").  Then the initial ifconfig
 causes the link to go down and up.
 
 The behaviour of -current is quite different for the above commands
 -- both "ifconfig up" and "route get" return before the link is actually
 up; they return in < 0.01 seconds, but the link still takes about 2
 seconds to come back according to the "link state changed" message.
 This is probably why I'm using the ping hack with a constant timeout --
 I had forgotten some details and want to use the same rc.d/netif on all
 machines.  Another difference in -current is that the second "route get"
 doesn't show the cloning completed.  That might be only because I had
 to test on an inactive machine since bringing bge0 down breaks normal
 operation.
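
 Since both commands return early in -current, one alternative to a
 constant timeout is to poll the interface itself.  The sketch below
 assumes that ifconfig prints a "status: active" line once the link has
 carrier; that line is not shown anywhere in this message, and the
 function names and the IFCONFIG_CMD override are hypothetical.

```shell
#!/bin/sh
# Succeed if ifconfig output on stdin reports an active link.
# Assumption: the media status line reads "status: active".
link_is_active() {
    grep -q '^[[:space:]]*status: active$'
}

# Hypothetical helper: poll once per second until the link is active
# or SECS polls have been made.
# usage: wait_for_link IFNAME SECS
wait_for_link() {
    ifname=$1
    secs=$2
    while [ "$secs" -gt 0 ]; do
        if ${IFCONFIG_CMD:-ifconfig} "$ifname" 2>/dev/null | link_is_active; then
            return 0
        fi
        secs=$((secs - 1))
        sleep 1
    done
    return 1
}

# Example: wait_for_link bge0 5
```

 This only shows that the link has come up, not that routes to a given
 host are usable, so it complements a reachability check and does not
 replace one.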
 
 Bruce


