From owner-freebsd-net@FreeBSD.ORG Thu Feb 17 17:00:10 2011
From: John Baldwin
To: freebsd-stable@freebsd.org
Cc: Olaf Seibert, Steven Hartland, Jeremy Chadwick, net@freebsd.org
Date: Thu, 17 Feb 2011 11:58:24 -0500
Subject: Re: mountd has resolving problems
Message-Id: <201102171158.24636.jhb@freebsd.org>
References: <20100909131017.GO4404@twoquid.cs.ru.nl> <20100909140529.GB76889@icarus.home.lan>
List-Id: Networking and TCP/IP with FreeBSD

On Thursday,
February 17, 2011 7:18:28 am Steven Hartland wrote:
> This has become an issue for us in 8.x as well.
>
> I'm pretty sure in pre 8.x these nfs mounts would simply background but
> recently machines are now failing to boot. It seems that failure to
> lookup nfs mount point hosts now causes this fatal error :(
>
> We've just tried Jeremy's netwait script and it works perfectly so either
> this or something similar needs to get pushed into base.
>
> For reference the reason we need a delay here is our core Cisco router
> takes a while to bring the port up properly on boot.
>
> Thanks for sharing the script Jeremy :)

I use a similar hack that waits up to 30 seconds for the default gateway
to be pingable. I think the failure is at least partly related to the new
ARP code, which now drops packets in IP output if the link is down. This
can be very problematic during boot, since some interfaces take a few
seconds to negotiate link. The end result of the new check in IP output is
that the attempt to send the packet fails with an error, causing
gethostbyname() and getaddrinfo() to fail completely without doing any
retries. In 7 the packet would either sit in the descriptor ring until
link was up, or it would be dropped, but it would fail silently, so the
resolver in libc would just retry in 30 seconds or so, at which time it
would work fine.

Waiting for the default route to be pingable also fixed a few other
problems for us on 7 (often ntpdate would not work on boot and now it
works reliably, etc.), so we went with that route.

> Regards
> Steve
>
> ----- Original Message -----
> From: "Jeremy Chadwick"
> To: "Olaf Seibert"
> Cc:
> Sent: Thursday, September 09, 2010 2:05 PM
> Subject: Re: mountd has resolving problems
>
>
> > On Thu, Sep 09, 2010 at 03:10:17PM +0200, Olaf Seibert wrote:
> >> I just upgraded a box from 8.0 to 8.1, and already when rebooting with
> >> the new kernel (i.e. before installing new userland), I got the
> >> following problem.
> >>
> >> Of course many of the messages scrolled off screen, but some were
> >> preserved in the syslog.
> >>
> >> Sep 9 14:26:51 fourquid mountd[839]: can't get address info for host XYZ
> >> Sep 9 14:26:51 fourquid mountd[839]: bad host XYZ in netgroup vbgroup, skipping
> >>
> >> Mountd was run and wanted to determine which hosts to export to.
> >> However, it could not resolve any of them. So, that suggests some
> >> network issue.
> >>
> >> However, I use a static IP address (no DHCP) and static info in
> >> /etc/resolv.conf, using one of the university's name servers. So
> >> resolving should always be available.
> >>
> >> Running /etc/rc.d/mountd restart so far always solved the export
> >> problem.
> >>
> >> I have also seen (presumably similar) issues with mounting NFS file
> >> systems, but that was deemed so fatal that the boot was aborted. A mount
> >> ``by hand'' of the affected file system also worked.
> >>
> >> Any ideas? Maybe with the new kernel the network interface is a bit
> >> slower in coming up, and not fully working by the time /etc/rc.d/mountd
> >> runs? In fact, I now notice this sequence of messages in
> >> /var/log/messages:
> >>
> >> Sep 9 14:26:51 fourquid mountd[839]: bad host XYZ in netgroup vbgroup, skipping
> >> Sep 9 14:26:51 fourquid mountd[839]: bad exports list line /xxxxxx
> >> Sep 9 14:26:54 fourquid kernel: fuse4bsd: version 0.3.9-pre1, FUSE ABI 7.8
> >> Sep 9 14:26:54 fourquid init: /bin/sh on /etc/rc terminated abnormally, going to single user mode
> >> Sep 9 14:26:55 fourquid kernel: nfe0: link state changed to UP
> >>
> >> so here the network interface takes a full 4 more seconds to come up,
> >> after it was already needed.
> >>
> >> I can try to put a 10 sec delay somewhere, but there should be a better
> >> solution...
> >
> > The problem is that the network isn't "truly" up and available by the
> > time mountd runs, and therefore DNS resolution doesn't work.
> > Please use my netwait script to solve this problem:
> >
> > http://jdc.parodius.com/freebsd/netwait
> >
> > Place it in /usr/local/etc/rc.d, make sure it's chmod'd to 755,
> > then enable use of it by using /etc/rc.conf variables like so:
> >
> > netwait_enable="yes"
> > netwait_ip="4.2.2.1 4.2.2.2"
> > netwait_if="nfe0"
> >
> > For what the variables do, please see the script comments.
> >
> > --
> > | Jeremy Chadwick jdc@parodius.com |
> > | Parodius Networking http://www.parodius.com/ |
> > | UNIX Systems Administrator Mountain View, CA, USA |
> > | Making life hard for others since 1977. PGP: 4BD6C0CB |
> >
> > _______________________________________________
> > freebsd-stable@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
> >
>
> ================================================
> This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it.
>
> In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337
> or return the E.mail to postmaster@multiplay.co.uk.
>

--
John Baldwin
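
P.S. For illustration only, the "wait up to 30 seconds for the default
gateway to be pingable" hack mentioned above could look roughly like the
sketch below. It is neither the netwait script from this thread nor the
script actually in use here. The function names wait_for and
default_gateway are made up for the example, and the parsing assumes the
"gateway:" line printed by FreeBSD's `route -n get default`.

```sh
#!/bin/sh
# Hypothetical sketch: retry a command once per second until it succeeds
# or the timeout (in seconds) expires.
# usage: wait_for TIMEOUT COMMAND [ARGS...]
wait_for() {
    wf_timeout=$1
    shift
    wf_elapsed=0
    while [ "$wf_elapsed" -lt "$wf_timeout" ]; do
        # Stop as soon as the probe command succeeds.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        sleep 1
        wf_elapsed=$((wf_elapsed + 1))
    done
    # The probe never succeeded within the timeout.
    return 1
}

# Assumption: FreeBSD's `route -n get default` prints a "gateway:" line.
default_gateway() {
    route -n get default 2>/dev/null | awk '/gateway:/ { print $2 }'
}

# Example use early in boot (left commented out so that sourcing this
# file has no side effects):
# gw=$(default_gateway)
# [ -n "$gw" ] && wait_for 30 ping -c 1 "$gw"
```

The loop is written around an arbitrary probe command so that the same
helper could also wait on something other than ping, e.g. a link-state
check, without changing the retry logic.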