From owner-freebsd-stable@FreeBSD.ORG Wed Sep 27 09:45:12 2006
Return-Path:
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 4B69B16A415
    for ; Wed, 27 Sep 2006 09:45:12 +0000 (UTC)
    (envelope-from hausen@punkt.de)
Received: from kagate.punkt.de (kagate.punkt.de [217.29.33.131])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 5099B43D49
    for ; Wed, 27 Sep 2006 09:45:10 +0000 (GMT)
    (envelope-from hausen@punkt.de)
Received: from hugo10.ka.punkt.de (hugo10.ka.punkt.de [10.0.0.110])
    by kagate1.punkt.de with ESMTP id k8R9j9CT020821
    for ; Wed, 27 Sep 2006 11:45:09 +0200 (CEST)
Received: from hugo10.ka.punkt.de (localhost [127.0.0.1])
    by hugo10.ka.punkt.de (8.12.10/8.12.10) with ESMTP id k8R9j9a9077816;
    Wed, 27 Sep 2006 11:45:09 +0200 (CEST)
    (envelope-from ry93@hugo10.ka.punkt.de)
Received: (from ry93@localhost)
    by hugo10.ka.punkt.de (8.12.10/8.12.10/Submit) id k8R9j96v077815;
    Wed, 27 Sep 2006 11:45:09 +0200 (CEST)
    (envelope-from ry93)
Date: Wed, 27 Sep 2006 11:45:09 +0200
From: "Patrick M. Hausen"
To: Scott Long
Message-ID: <20060927094509.GB75104@hugo10.ka.punkt.de>
References: <451A1375.5080202@gneto.com> <20060927071538.GF22229@e-Gitt.NET>
    <451A4189.5020906@samsco.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <451A4189.5020906@samsco.org>
User-Agent: Mutt/1.5.10i
Cc: freebsd-stable@freebsd.org, Oliver Brandmueller
Subject: Re: 6.2 SHOWSTOPPER - em completely unusable on 6.2
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 27 Sep 2006 09:45:12 -0000

Hello!

> Well, the best I can say at the moment is, "Wow."
> =-( I guess the
> thing to do here is to figure out if the problem lies with the em
> interrupt handler not getting run, or the taskqueue not getting run.

I helped Pyun with some debugging by providing ssh access to a
machine showing the (seemingly) same problem. At first he thought
the interrupt handler of the em driver was the culprit, but we
applied quite a few patches and tested afterwards, and it seems
the driver is not the cause.

Other people on -stable have occasionally complained about very
similar-looking problems with bge and other drivers. My guess is,
though I'm not a kernel developer, just an experienced admin, that
em stands out as problematic just by coincidence. Certain onboard
network components tend to come with certain chipsets and certain
architectures.

So, Pyun suggested it was a problem with the taskqueue that was
introduced some time between 6.0 and 6.1.

With my system (Tyan GT20 B5161G20) the problem shows up when there
is heavy disk and CPU activity, like "make buildworld". I made sure
that the em interface doesn't share an interrupt with the SATA
controller.

When the problem occurs, I get the well-known "watchdog timeout"
messages, and then the system's network activity over that interface
freezes completely for a couple of minutes. Usually the system
recovers after a while without a reboot or other measures.

What I can do: give ssh access to a system showing this behaviour,
including a network connection to another box, so one can transfer
large amounts of data over a private LAN. I used FTP of a big
sparse file.

Prerequisite: a fixed IP address for the machine that the developer
wishes to use to connect to my system.

HTH,
Patrick
-- 
punkt.de GmbH         Internet - Dienstleistungen - Beratung
Vorholzstr. 25        Tel. 0721 9109 -0 Fax: -100
76137 Karlsruhe       http://punkt.de