Date: Mon, 21 Aug 2006 21:52:02 +0200
From: "Patrick M. Hausen"
To: "Patrick M. Hausen"
Cc: freebsd-stable@freebsd.org
Subject: Re: ICH7 SATA and em interrupt sharing
Message-ID: <20060821195202.GA57333@hugo10.ka.punkt.de>
In-Reply-To: <20060821150744.GN45736@hugo10.ka.punkt.de>

And yet more testing ...

I rebuilt my kernel without USB devices and made sure atapci1
doesn't share an interrupt with anything:

    pcib1:   16
    pcib2:   20
    em0:     16
    em1:     17
    fxp0:    16
    atapci1: 19
    atkbdc0:  1
    atkbd0:   1
    sio0:     4
    sio1:     3
    ppc0:     7

Side note: on this particular box I had to leave the USB devices
enabled in the BIOS setup, otherwise em0 would end up on the same
interrupt as atapci1 |-)

Then I ran make buildworld and in parallel started to transfer
a large file via FTP (done by fetching a sparse file of 10 GB),
maxing out our 100 Mbit/s LAN.

*boom* - or so I thought ;-)

The ssh session was stuck, the system did not respond to ICMP echo.
OK, wait until tomorrow morning to reset it ...

... just gave it one more ping an hour later, and the machine was
alive again! It did not panic/reboot, the buildworld was still running
and the file transfer was still transferring the file.

In /var/log/messages I found:

Aug 21 21:37:08 tomcat kernel: em0: Missing Tx completion interrupt!
Aug 21 21:39:55 tomcat kernel: em0: Missing Tx completion interrupt!
Aug 21 21:40:29 tomcat kernel: em0: Missing Tx completion interrupt!

Seems like for some reason the network card blocked for a couple
of minutes, then resumed.

This was all with debug.mpsafenet set to 1. Now I'm running the same
stress test with debug.mpsafenet set to 0 and I haven't seen any
problem/hang at all.

Wait a minute ... now as I'm typing this message, ssh to the box
hangs again. Damn. I think I'll try the fxp interface for production
use and disable the onboard Gigabit NICs.

Now the ssh session is responding again while the file transfer
reports "Connection reset by peer". Dmesg shows:

em0: Missing Tx completion interrupt!
em0: Missing Tx completion interrupt!
em0: Missing Tx completion interrupt!
em0: Missing Tx completion interrupt!
em0: Missing Tx completion interrupt!
em0: Missing Tx completion interrupt!

I'm still not able to really reproduce the SATA problem others
are reporting, except by forcing em0 to share its interrupt with the
SATA controller. This can easily be avoided - at least with our
hardware.

Regards,

Patrick M. Hausen
Leiter Netzwerke und Sicherheit
-- 
punkt.de GmbH         Internet - Dienstleistungen - Beratung
Vorholzstr. 25        Tel. 0721 9109 -0 Fax: -100
76137 Karlsruhe       http://punkt.de