From: Ian FREISLICH
To: Robert Watson
Cc: current@freebsd.org
Date: Thu, 16 Nov 2006 13:06:31 +0200
Subject: Re: Panic during boot (in_arpinput).
In-Reply-To: Message from Robert Watson of "Wed, 15 Nov 2006 09:03:28 GMT." <20061115085427.R79655@fledge.watson.org>

Robert Watson wrote:
>
> On Wed, 15 Nov 2006, Ian FREISLICH wrote:
>
> > Robert Watson wrote:
> >> On Wed, 15 Nov 2006, Ian FREISLICH wrote:
> >>
> >>> Ian FREISLICH wrote:
> >>>>
> >>>> I have 2 servers each with 255 vlan interfaces and carp interfaces in
> >>>> each vlan. During the boot up while it's configuring the interfaces, it
> >>>> reliably panics.
> >>>> It boots fine if no network cables are plugged in (and
> >>>> in the test environment on a quiet lan).
> >>>>
> >>>> It's an SMP machine. My guess (from the panic message below) is that an
> >>>> arp query arrives on an interface it's in the middle of creating or
> >>>> something like that (highly unsophisticated debugging conjecture).
> >>>>
> >>>> In the mean time I'm going to try a UP kernel and see if that masks the
> >>>> problem.
> >>>
> >>> FWIW, a UP kernel has the same problem.
> >>
> >> What happens if you disable PREEMPTION on UP and try the same thing again?
> >
> > Same thing.
> >
> > If I don't assign the carp interfaces a vhid and pass at boot time, it boots
> > up OK, but I need the carp interfaces. I can arrange serial console access.
> > I have a similar system from ~"Tue Aug 29 09:47:50 SAST 2006" that works,
> > but I suspect it may suffer the same problem. I'm about to test this.
>
> This suggests that it is not the race I was worried it was, which is really
> good news :-). This makes me suspect a CARP-specific bug as opposed to the
> wider issue of under-synchronization of the address lists.

I'm not sure that ia_hash from netinet/in_var.h:

	LIST_ENTRY(in_ifaddr) ia_hash;	/* entry in bucket of inet addresses */

is being locked properly. Or somehow junk data is making its way into
the list.

I've narrowed it down to configuration errors in /etc/rc.conf:
bringing vlans or CARP interfaces up with some bogus data (due to
cut and paste errors), for example:

ifconfig_vlan2001="vlandev em0 vlan 2001"
...
ifconfig_vlan2001="vhid 1 pass 4f5116b5b66a5bcc3c096c72eeabb7bd"

or

ifconfig_carp166_name="vlan166_vrrp"
ifconfig_carp167_name="vlan167_vrrp"
ifconfig_vlan166_vrrp="vhid 161 advskew 0 pass 0f2c9a93eea6f38fabb3acb1c31488c6"ifconfig_vlan167_vrrp="vhid 161 advskew 0 pass 0f2c9a93eea6f38fabb3acb1c31488c6"

Note that the 3rd and "4th" lines (depending on your terminal size) are
a single line.
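
Both kinds of cut and paste error above can be caught mechanically
before a reboot. What follows is only a rough sketch, not anything from
the base system: the function name check_rc_conf and the regular
expression are made up for illustration, and it assumes every rc.conf
assignment has the form name="value". It flags a variable that is
assigned more than once and a line that contains more than one
assignment.

```python
import re
from collections import defaultdict

# Heuristic (an assumption of this sketch): an assignment starts with name="
ASSIGN_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="')


def check_rc_conf(text):
    """Return sorted (line_number, message) warnings for rc.conf-style text."""
    warnings = []
    seen = defaultdict(list)  # variable name -> line numbers where it is assigned

    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        names = ASSIGN_RE.findall(line)
        if len(names) > 1:
            # Two assignments run together, e.g. a lost newline after a paste
            warnings.append(
                (lineno, "several assignments on one line: " + ", ".join(names))
            )
        for name in names:
            seen[name].append(lineno)

    for name, linenos in seen.items():
        if len(linenos) > 1:
            # A later assignment silently overrides the earlier one
            where = ", ".join(str(n) for n in linenos)
            warnings.append((linenos[-1], name + " assigned on lines " + where))

    return sorted(warnings)


if __name__ == "__main__":
    sample = (
        'ifconfig_vlan2001="vlandev em0 vlan 2001"\n'
        'ifconfig_vlan2001="vhid 1 pass 4f5116b5b66a5bcc3c096c72eeabb7bd"\n'
    )
    for lineno, message in check_rc_conf(sample):
        print("line %d: %s" % (lineno, message))
```

It is purely textual and knows nothing about sh quoting rules, so treat
any warning as a hint to go and look at that line rather than as proof
of a mistake.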

It seems that when CARP interfaces are in play (even without addresses,
so their parent IP interfaces aren't in promiscuous mode), if an
ifconfig at boot time produces an error:

ifconfig: SIOCGVH: Invalid argument
ifconfig: ioctl (SIOCAIFADDR): Can't assign requested address

then later, while it's printing the ifconfig data for the 255 vlans on
the console, the system panics as previously described if its ethernet
receives an arp request at a critical time.

Ian

--
Ian Freislich