Date: Fri, 18 Feb 2011 04:47:04 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: John Baldwin <jhb@freebsd.org>
Cc: Olaf Seibert <O.Seibert@cs.ru.nl>, net@freebsd.org, freebsd-stable@freebsd.org, Jeremy Chadwick <freebsd@jdc.parodius.com>, Steven Hartland <killing@multiplay.co.uk>
Subject: Re: mountd has resolving problems
Message-ID: <20110218043432.S3233@besplex.bde.org>
In-Reply-To: <201102171158.24636.jhb@freebsd.org>
References: <20100909131017.GO4404@twoquid.cs.ru.nl> <20100909140529.GB76889@icarus.home.lan> <FD94648144304764A7A8E4589DC33EE6@multiplay.co.uk> <201102171158.24636.jhb@freebsd.org>

On Thu, 17 Feb 2011, John Baldwin wrote:

> On Thursday, February 17, 2011 7:18:28 am Steven Hartland wrote:
>> This has become a issue for us in 8.x as well.
>>
>> I'm pretty sure in pre 8.x these nfs mounts would simply background but
>> recently machines are now failing to boot. It seems that failure to
>> lookup nfs mount point hosts now causes this fatal error :(
>>
>> We've just tried Jeremy's netwait script and it works perfectly so either
>> this or something similar needs to get pushed into base.
>>
>> For reference the reason we need a delay here is our core Cisco router
>> takes a while to bring the port up properly on boot.
>>
>> Thanks for sharing the script Jeremy :)
>
> I use a similar hack that waits up to 30 seconds for the default gateway to be
> pingable. I think it is at least partly related to the new ARP code that now
> drops packets in IP output if the link is down.

I use hackish ping -t <timeout much smaller than 30 seconds since even
2 seconds is annoying>s and traceroutes in /etc/rc.d/netif. Don't know
if it is the same problem. It affects mainly nfs and ntpdate/ntpd to
local systems here. Even with all-static routes.

> This can be very problematic
> during boot since some interfaces take a few seconds to negotiate link but
> the end result of the new check in IP output is that the attempt to send the
> packet fails with an error causing gethostbyname() and getaddrinfo() to fail
> completely without doing any retries. In 7 the packet would either sit in the

Also after down/up to change something. If you try to use the network
before it is back then you have to wait much longer before it is really
back. This is a relatively minor problem since down/up is not needed
routinely.

> descriptor ring until link was up, or it would be dropped, but it would
> silently fail, so the resolver in libc would just retry in 30 seconds or so at
> which time it would work fine.
>
> Waiting for the default route to be pingable actually fixed a few other
> problems for us on 7 though as well (often ntpdate would not work on boot and
> now it works reliably, etc.) so we went with that route.

I thought I first saw the problem a little earlier, and it affected bge
more than fxp. Maybe the latter is correct and the problem is smaller
with fxp just because it is ready sooner.

Bruce
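
For illustration only, the "wait up to 30 seconds for the default gateway to be pingable" workaround discussed above could look roughly like the sketch below. It is not Jeremy's netwait script and not the hack either poster actually runs; the names wait_for_host and ping_once, the 1-second interval, and the injectable probe/clock/sleep parameters are all assumptions made for this example. ping_once assumes FreeBSD's ping(8), where -t is an overall timeout in seconds (on Linux -t means TTL, so the flag would need changing there).

```python
import subprocess
import time


def ping_once(host, timeout_s=1):
    """Return True if a single ICMP echo to host is answered.

    Assumes FreeBSD ping(8): -c 1 sends one packet, -t N exits after N seconds.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-t", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_for_host(host, max_wait_s=30, interval_s=1,
                  probe=ping_once, clock=time.monotonic, sleep=time.sleep):
    """Probe host repeatedly until it answers or max_wait_s has elapsed.

    Returns True as soon as one probe succeeds, False once the deadline passes.
    probe, clock and sleep are parameters only so the loop can be exercised
    without a real network.
    """
    deadline = clock() + max_wait_s
    while True:
        if probe(host):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A boot-time caller would run something like wait_for_host("192.0.2.1") (a placeholder gateway address) before starting nfs mounts or ntpdate, and carry on regardless once the deadline passes, so that a dead gateway delays boot by at most the chosen timeout.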
Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20110218043432.S3233>
