Date: Thu, 29 May 2008 20:45:43 +0200
From: Gerrit Kühn <gerrit@pmp.uni-hannover.de>
To: freebsd-stable@FreeBSD.ORG
Cc: Oliver Fromme <olli@lurza.secnetix.de>
Subject: Re: broken re(4)
Message-ID: <20080529204543.d4aa927e.gerrit@pmp.uni-hannover.de>
In-Reply-To: <200805291652.m4TGqt2o060679@lurza.secnetix.de>
References: <20080529171351.a3dd5111.gerrit@pmp.uni-hannover.de> <200805291652.m4TGqt2o060679@lurza.secnetix.de>

On Thu, 29 May 2008 18:52:55 +0200 (CEST) Oliver Fromme
<olli@lurza.secnetix.de> wrote about Re: broken re(4):

OF> In that case I would suspect that the one piece of hardware
OF> that is misbehaving is broken and needs to be replaced.

I agree. I just do not know yet which part is broken.

OF> > The only hardware thing that is different in this system from the
OF> > others is an additional SATA-controller. Can there be conflicts with
OF> > this card which are triggering the problems?

OF> I think it's unlikely. Do they share interrupts? (The
OF> output of "vmstat -i" will tell you.)

protoserve# vmstat -i
interrupt                          total       rate
irq0: clk                       31564049       1000
irq7: ppbus0 ppc0                      1          0
irq8: rtc                        4038754        127
irq9: uhci0 uhci1+                     2          0
irq10: re0 re1+                  2401340         76
irq11: atapci0+++                 655498         20
irq14: ata0                        11167          0
Total                           38670811       1225

Just the two NICs on the same IRQ. A system that is working fine looks
like this:

firefly1# vmstat -i
interrupt                          total       rate
irq0: clk                     2614761182       1000
irq1: atkbd0                         902          0
irq7: ppbus0 ppc0                      1          0
irq8: rtc                      334559120        127
irq10: re0 re1+                 24354774          9
irq11: atapci0++++                 70905          0
irq14: ata0                       800110          0
Total                         2974546994       1138

OF> In theory it could also be a power supply problem. I
OF> assume that you use rather small (thus possibly weak)
OF> power supplies for your ITX machines. Maybe the SATA
OF> controller in that problematic machine drives the power
OF> supply to its limit, and the re(4) interfaces suffer.
OF> You could check whether removing the SATA controller
OF> improves things. Or try to connect a stronger power
OF> supply if you have one available.

I have Travla C146/C147 chassis for these machines and use the power
supply that comes with them. However, the ultimate test for anything
related to the controller is simply to remove it. I will try this
tomorrow (the systems are at work, and I am at home now - can't unplug
a controller via ssh :-).

OF> - Do you see any non-zero numbers in the collision or
OF> error columns of "netstat -i"?

No:

protoserve# netstat -i
Name    Mtu Network       Address              Ipkts Ierrs    Opkts Oerrs  Coll
re0    1500 <Link#1>      00:30:18:af:19:6a   131032     0   271757     0     0
re0    1500 10.117.0.0    protoserve           80442     -   271722     -     -
re1    1500 <Link#2>      00:30:18:af:19:6b  1474484     0  1114542     0     0
re1    1500 192.168.0.0   192.168.2.1        1471156     -  1114457     -     -
plip0  1500 <Link#3>                               0     0        0     0     0
lo0   16384 <Link#4>                               0     0        0     0     0
lo0   16384 fe80:4::1     fe80:4::1                0     -        0     -     -
lo0   16384 localhost     ::1                      0     -        0     -     -
lo0   16384 your-net      localhost                0     -        0     -     -

OF> - Are you sure the interfaces don't have the same MAC
OF> addresses (it's unlikely, but it doesn't hurt to check
OF> in the ifconfig output).

Yes:

protoserve# ifconfig
re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=399b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_UCAST,WOL_MCAST,WOL_MAGIC>
        ether 00:30:18:af:19:6a
        inet 10.117.15.1 netmask 0xffff0000 broadcast 10.117.255.255
        media: Ethernet autoselect (1000baseTX <full-duplex>)
        status: active
re1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=399b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_UCAST,WOL_MCAST,WOL_MAGIC>
        ether 00:30:18:af:19:6b
        inet 192.168.2.1 netmask 0xffff0000 broadcast 192.168.255.255
        media: Ethernet autoselect (1000baseTX <full-duplex>)
        status: active
plip0: flags=108810<POINTOPOINT,SIMPLEX,MULTICAST,NEEDSGIANT> metric 0 mtu 1500
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
        inet6 ::1 prefixlen 128
        inet 127.0.0.1 netmask 0xff000000

OF> - Are you sure that media and duplex settings are
OF> correct on both sides (i.e. PC and switch)?

The systems are all on the same switch (I also swapped the switch
during the tests, which made no difference), and all devices show a
1 Gbit/s link.

OF> - Have you tried replacing cables, switch ports, or the
OF> whole switch?

Yes, all of that.

OF> - Have you tried to disable hardware support features
OF> of the driver? In 7-stable re(4) supports quite a lot
OF> of hardware features. See "ifconfig -m". You could
OF> check whether disabling RXCSUM, TXCSUM and/or TSO4
OF> makes a difference.

Another good idea, thanks. I will try that tomorrow, too.


cu
  Gerrit
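
The shared-interrupt comparison above can also be scripted, which helps when several hosts need checking. What follows is only a minimal Python sketch: the names `shared_irqs`, `LINE_RE` and `SAMPLE` are made up for illustration, the sample text is the protoserve output quoted in this message, and the sketch assumes that each trailing "+" in the "vmstat -i" name column stands for one further handler whose name did not fit.

```python
import re

# One data line of "vmstat -i": "irqN: <handler names> <total> <rate>".
LINE_RE = re.compile(r"^(irq\d+):\s+(.+?)\s+(\d+)\s+(\d+)\s*$")


def shared_irqs(vmstat_output):
    """Return {irq: [handler names]} for every IRQ with more than one handler.

    Assumption: each trailing '+' marks one additional handler whose
    name was truncated; such handlers are reported as "(unnamed)".
    """
    shared = {}
    for line in vmstat_output.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skips the header line and the "Total" line
        irq = match.group(1)
        names = match.group(2).split()
        hidden = sum(name.count("+") for name in names)
        devices = [name.rstrip("+") for name in names]
        if len(devices) + hidden > 1:
            shared[irq] = devices + ["(unnamed)"] * hidden
    return shared


# Output of "vmstat -i" on protoserve, as quoted in this message.
SAMPLE = """\
interrupt total rate
irq0: clk 31564049 1000
irq7: ppbus0 ppc0 1 0
irq8: rtc 4038754 127
irq9: uhci0 uhci1+ 2 0
irq10: re0 re1+ 2401340 76
irq11: atapci0+++ 655498 20
irq14: ata0 11167 0
Total 38670811 1225
"""

if __name__ == "__main__":
    for irq, devices in shared_irqs(SAMPLE).items():
        print(irq, ", ".join(devices))
```

Running the same function over the output from firefly1 and comparing the two results shows at a glance whether the extra SATA controller ended up on the NICs' interrupt line.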