From: Gerrit Kühn
Date: Thu, 29 May 2008 20:45:43 +0200
To: freebsd-stable@FreeBSD.ORG
Cc: Oliver Fromme
Subject: Re: broken re(4)
Message-Id: <20080529204543.d4aa927e.gerrit@pmp.uni-hannover.de>
In-Reply-To: <200805291652.m4TGqt2o060679@lurza.secnetix.de>
References: <20080529171351.a3dd5111.gerrit@pmp.uni-hannover.de> <200805291652.m4TGqt2o060679@lurza.secnetix.de>
Organization: Albert-Einstein-Institut (MPI für Gravitationsphysik & IGP Universität Hannover)

On Thu, 29 May 2008 18:52:55 +0200 (CEST) Oliver
Fromme wrote about Re: broken re(4):

OF> In that case I would suspect that the one piece of hardware
OF> that is misbehaving is broken and needs to be replaced.

I agree. I just do not know yet which part is broken.

OF> > The only hardware thing that is different in this system from the
OF> > others is an additional SATA-controller. Can there be conflicts with
OF> > this card which are triggering the problems?

OF> I think it's unlikely. Do they share interrupts? (The
OF> output of "vmstat -i" will tell you.)

protoserve# vmstat -i
interrupt                          total       rate
irq0: clk                       31564049       1000
irq7: ppbus0 ppc0                      1          0
irq8: rtc                        4038754        127
irq9: uhci0 uhci1+                     2          0
irq10: re0 re1+                  2401340         76
irq11: atapci0+++                 655498         20
irq14: ata0                        11167          0
Total                           38670811       1225

Just the two NICs on the same IRQ. A system that is working fine looks
like this:

firefly1# vmstat -i
interrupt                          total       rate
irq0: clk                     2614761182       1000
irq1: atkbd0                         902          0
irq7: ppbus0 ppc0                      1          0
irq8: rtc                      334559120        127
irq10: re0 re1+                 24354774          9
irq11: atapci0++++                 70905          0
irq14: ata0                       800110          0
Total                         2974546994       1138

OF> In theory it could also be a power supply problem. I
OF> assume that you use rather small (thus possibly weak)
OF> power supplies for your ITX machines. Maybe the SATA
OF> controller in that problematic machine drives the power
OF> supply to its limit, and the re(4) interfaces suffer.
OF> You could check whether removing the SATA controller
OF> improves things. Or try to connect a stronger power
OF> supply if you have one available.

I have Travla C146/C147 chassis for these machines and use the power
supply that comes with them. However, the ultimate test for the
controller-related theories is simply to remove the controller. I will
try this tomorrow (the systems are at work, and I am at home now - can't
unplug a controller via ssh :-).

OF> - Do you see any non-zero numbers in the collision or
OF> error columns of "netstat -i"?
No:

protoserve# netstat -i
Name    Mtu   Network       Address              Ipkts Ierrs    Opkts Oerrs  Coll
re0    1500                 00:30:18:af:19:6a   131032     0   271757     0     0
re0    1500   10.117.0.0    protoserve           80442     -   271722     -     -
re1    1500                 00:30:18:af:19:6b  1474484     0  1114542     0     0
re1    1500   192.168.0.0   192.168.2.1        1471156     -  1114457     -     -
plip0  1500                                          0     0        0     0     0
lo0   16384                                          0     0        0     0     0
lo0   16384   fe80:4::1     fe80:4::1                0     -        0     -     -
lo0   16384   localhost     ::1                      0     -        0     -     -
lo0   16384   your-net      localhost                0     -        0     -     -

OF> - Are you sure the interfaces don't have the same MAC
OF> addresses (it's unlikely, but it doesn't hurt to check
OF> in the ifconfig output).

Yes:

protoserve# ifconfig
re0: flags=8843 metric 0 mtu 1500
        options=399b
        ether 00:30:18:af:19:6a
        inet 10.117.15.1 netmask 0xffff0000 broadcast 10.117.255.255
        media: Ethernet autoselect (1000baseTX )
        status: active
re1: flags=8843 metric 0 mtu 1500
        options=399b
        ether 00:30:18:af:19:6b
        inet 192.168.2.1 netmask 0xffff0000 broadcast 192.168.255.255
        media: Ethernet autoselect (1000baseTX )
        status: active
plip0: flags=108810 metric 0 mtu 1500
lo0: flags=8049 metric 0 mtu 16384
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
        inet6 ::1 prefixlen 128
        inet 127.0.0.1 netmask 0xff000000

OF> - Are you sure that media and duplex settings are
OF> correct on both sides (i.e. PC and switch)?

The systems are all on the same switch (I also swapped the switch during
the tests, with no change), and all devices show a 1 Gbit/s link.

OF> - Have you tried replacing cables, switch ports, or the
OF> whole switch?

Yes, all of that.

OF> - Have you tried to disable hardware support features
OF> of the driver? In 7-stable re(4) supports quite a lot
OF> of hardware features. See "ifconfig -m". You could
OF> check whether disabling RXCSUM, TXCSUM and/or TSO4
OF> makes a difference.

Another good idea, thanks. I will try that tomorrow, too.

cu
  Gerrit
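
The shared-interrupt check against the "vmstat -i" output above can also be done mechanically. The following is only a minimal sketch: the function name shared_irqs is hypothetical (it is not part of FreeBSD), and it assumes that a trailing "+" on a device name means further handlers on that interrupt whose names did not fit into the column.

```shell
# Hypothetical helper, not part of FreeBSD: read `vmstat -i` output on
# stdin and print every IRQ line on which more than one device appears.
shared_irqs() {
    awk '/^irq[0-9]+:/ {
        # Fields: "irqN:", one or more device names, total, rate.
        ndev = NF - 3
        # Assumption: a trailing "+" on the last device name marks
        # additional handlers whose names were cut off.
        truncated = ($(NF - 2) ~ /[+]$/)
        if (ndev > 1 || truncated) print
    }'
}
```

It would be used as "vmstat -i | shared_irqs" on each machine, so that the lists from a misbehaving and a working system can be compared directly.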