From: Robert Watson
Date: Thu, 21 Jul 2005 14:59:23 +0100 (BST)
To: Benjamin Lutz
Cc: stable@freebsd.org
Subject: Re: jails bring down network interface
Message-ID: <20050721145408.P97888@fledge.watson.org>
In-Reply-To: <42DFA43C.7010601@datacomm.ch>

On Thu, 21 Jul 2005, Benjamin Lutz wrote:

> While tracking an issue with a jail I run, the interface to which the
> jail aliases it's IP to suddenly became unresponsive.
>
> My script starts the jail, then runs ifconfig alias. After starting and
> stopping the jail about 20 times, the interface basically froze.
> Ifconfig reported it as up and running, but it would no longer pass any
> packets. Bringing it down then back up made it work again.
>
> Since this is a production machine, I'm afraid I can't give more
> specific details, I don't wish to run into the problem again.
>
> I'm running FreeBSD 5.4-p5, the interface in question is a VIA VT6105
> Rhine III using the vr(4) driver.

Should this occur again, the starting point is to determine whether it's
"sending" that's broken, "receiving" that's broken, or both. I would
investigate by:

- Using ping on the local system to ping a remote host, and checking with
  tcpdump on the remote host whether it receives the ping packets.

- Using ping on another host to ping the local host, and checking whether
  tcpdump on the local host sees the ping packets.

Ideally you'll do this using a pair of machines that already have each
other in the ARP cache; otherwise you'll need to look for ARP requests on
the local area network instead of ICMP echo requests. Beware switches and
routers that hide traffic from third parties (hence the suggestion to run
tcpdump on those two machines themselves).

Also, it would be good to know whether the if_vr interface receives
interrupts when it's "wedged" -- you can check this using "vmstat -i" or
"systat -vmstat 1" and watching the interrupt count for the interface. I
prefer systat to vmstat, FYI. If you sit there and ping for a while, do you
start getting ENOBUFS back from the interface?

Finally, the dmesg probe output would be helpful.

Thanks,

Robert N M Watson
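
The interrupt check described above can also be done mechanically by
comparing two "vmstat -i" snapshots taken a few seconds apart while pinging.
The following is only a minimal Python sketch. It assumes each data row of
"vmstat -i" has the shape "name total rate" (for example "irq11: vr0 ..."),
and the sample text, the IRQ number and the counts are made up for
illustration; the function names are hypothetical.

```python
def parse_vmstat_i(text):
    """Map each interrupt source name to its cumulative count.

    Assumes rows shaped like "irq11: vr0   482113   14" (name, total, rate).
    """
    counts = {}
    for line in text.splitlines():
        parts = line.rsplit(None, 2)  # split off the last two columns
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            counts[parts[0].strip()] = int(parts[1])
    return counts


def device_delta(before, after, device):
    """Interrupts counted for `device` between two snapshots, or None if absent."""
    def find(counts):
        for name, total in counts.items():
            # The source name looks like "irq11: vr0"; match on the device word.
            if device in name.split():
                return total
        return None

    b, a = find(before), find(after)
    if a is None or b is None:
        return None
    return a - b


# Illustrative (made-up) snapshots; on a live system these would be the
# captured output of two "vmstat -i" runs.
BEFORE = """interrupt                          total       rate
irq1: atkbd0                         310          0
irq11: vr0                        482113         14
Total                             482423         14
"""
AFTER = """interrupt                          total       rate
irq1: atkbd0                         312          0
irq11: vr0                        482113         14
Total                             482425         14
"""

delta = device_delta(parse_vmstat_i(BEFORE), parse_vmstat_i(AFTER), "vr0")
if delta == 0:
    print("vr0: no new interrupts between snapshots")
else:
    print("vr0 interrupt delta:", delta)
```

A delta of zero while traffic is being sent at the interface would suggest
that it is no longer receiving interrupts; a growing count would point the
investigation elsewhere.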