Date:      27 Dec 2002 18:51:53 +0000
From:      Stacey Roberts <stacey@vickiandstacey.com>
To:        Gerard Samuel <gsam@trini0.org>
Cc:        FreeBSD Questions <questions@FreeBSD.ORG>
Subject:   Re: Network timeouts???
Message-ID:  <1041015112.68500.144.camel@localhost>
In-Reply-To: <3E0C9D2E.3000704@trini0.org>
References:  <3E0C72A9.9000302@trini0.org> <1041004206.68500.116.camel@localhost>  <3E0C91F0.3000102@trini0.org> <1041012776.68500.128.camel@localhost>  <3E0C9D2E.3000704@trini0.org>

On Fri, 2002-12-27 at 18:34, Gerard Samuel wrote:
> Stacey Roberts wrote:
> 
> >On Fri, 2002-12-27 at 17:46, Gerard Samuel wrote:
> >  
> >
> >>Stacey Roberts wrote:
> >>
> >>    
> >>
> >>>On Fri, 2002-12-27 at 15:32, Gerard Samuel wrote:
> >>> 
> >>>
> >>>      
> >>>
> >>>>I'm not really sure when the problem began, but I believe it was after
> >>>>upgrading to 4.7-RELEASE-p2.
> >>>>Two of the boxes running 4.7-RELEASE-p2, both with Intel
> >>>>Pro 10/100B/100+ Ethernet cards,
> >>>>are getting numerous timeouts in the logs.
> >>>>
> >>>>fxp0: device timeout
> >>>>
> >>>>When connecting to these boxes, the connections are sluggish, to the 
> >>>>point where I can type faster than the command line can display.
> >>>>All boxes are connected on a 100Mb network via an SMC EZ-Switch SMC 
> >>>>6308TX switch.
> >>>>The only thing that has changed in months, is software versions.
> >>>>The problem seems sporadic.  Can't seem to find out how or what is 
> >>>>causing the problem.
> >>>>
> >>>>Is/was there a problem with the fxp driver, or can someone direct me as 
> >>>>to how one goes about debugging this problem?
> >>>>
> >>>>Thanks for any info you may provide...
> >>>>   
> >>>>

<snipped>

> >>>
> >>hivemind# netstat -in
> >>Name  Mtu   Network       Address            Ipkts Ierrs    Opkts Oerrs  Coll
> >>fxp0  1500  <Link#1>    00:80:29:12:90:b9   366170 27094   426767    31    10
> >>fxp0  1500  192.168.0     192.168.0.2       372504     -   432856     -     -
> >>lo0   16384 <Link#2>                          9454     0     9454     0     0
> >>lo0   16384 127           127.0.0.1           3179     -     3179     -     -
> >>
> >>    
> >>
> >
> >Hi Gerard,
> >   See here that nearly all of the errors are Ierrs (27094
> >occurrences, against 31 Oerrs).
> >
> >This is an indication that fxp0 is collecting stats on late / undetected
> >collisions.
> >
> >Please look at the stats for fxp0 with the command "ifconfig fxp0" and
> >place the output here. It would appear that fxp0 is *not* in full
> >duplex mode.
> >
> >There is also something else to note - interfaces operating at full
> >duplex don't actually perform collision detection for their own
> >respective operation (I am open to suggestions otherwise on this), but
> >they may well be capable of collecting collision stats for other hosts
> >on the subnet. As such, you might want to check the interfaces of other
> >hosts with which this box is networked.
> >
> >Get back to the list with what data you are able to extract from fxp0
> >and other hosts as the case may be.
> >
> >Regards,
> >
> >Stacey
> >
> Ok, I see the number of errors.  Maybe the cables went bad.  Come to think 
> of it, the room that the computers are in recently got repainted, and I 
> had disconnected everything.  Maybe something happened then???  I'll make 
> some new cables tonight, and see how it goes....
> 
> Here are the stats off the two boxes ->
> 
> {gsam@gatekeeper}-{~} > netstat -in                      [27 Dec 1:22pm]
> Name  Mtu   Network       Address            Ipkts Ierrs    Opkts Oerrs  Coll
> fxp0  1500  <Link#1>    00:80:29:12:9c:20    74003 31125    83139   117     3
> fxp0  1500  192.168.0     192.168.0.1         5910     -     1114     -     -
> ed0   1500  <Link#2>    00:00:c0:29:52:48   401134     0    68811     0  3161
> ed0   1500  68.39.128/21  68.39.132.244       2759     -      403     -     -
> lo0   16384 <Link#3>                           766     0      766     0     0
> lo0   16384 127           127.0.0.1              4     -        4     -     -
> 
> 

Yes, this box has much the same level of Ierrs as the previous one and,
also noteworthy, very few Oerrs (117 here) - just like the previous box.

> {gsam@gatekeeper}-{~} > ifconfig fxp0                    [27 Dec 1:22pm]
> fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>         inet 192.168.0.1 netmask 0xffffff00 broadcast 192.168.0.255
>         ether 00:80:29:12:9c:20
>         media: Ethernet autoselect (100baseTX <full-duplex>)
>         status: active
> 
> hivemind# netstat -in
> Name  Mtu   Network       Address            Ipkts Ierrs    Opkts Oerrs  Coll
> fxp0  1500  <Link#1>    00:80:29:12:90:b9   370620 27557   430514    32    10
> fxp0  1500  192.168.0     192.168.0.2       377045     -   436691     -     -
> lo0   16384 <Link#2>                          9594     0     9594     0     0
> lo0   16384 127           127.0.0.1           3229     -     3229     -     -
> 
> hivemind# ifconfig fxp0
> fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>         inet 192.168.0.2 netmask 0xffffff00 broadcast 192.168.0.255
>         ether 00:80:29:12:90:b9
>         media: Ethernet autoselect (100baseTX <full-duplex>)
>         status: active
> 

So both NICs believe themselves to be running okay, as shown by their
respective Oerrs counts being close to zero (117 and 32). Translation:
the NICs are fine, as they are able to send nearly all packets okay.
It's only in receiving data that they're returning errors on the link.
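
A rough way to put numbers on this is to turn the counters into
percentages. The Python sketch below is only an illustration: the
figures are copied by hand from the <Link#1> rows quoted above, the
helper names error_pct and counter_delta are made up for this example,
and treating Ierrs as a fraction of Ipkts is an assumption (whether
Ipkts already includes the errored frames depends on the driver).

```python
# Counters copied by hand from the <Link#1> rows of the netstat -in
# output quoted in this message: (Ipkts, Ierrs, Opkts, Oerrs).
SAMPLES = {
    "gatekeeper": (74003, 31125, 83139, 117),
    "hivemind, first sample": (366170, 27094, 426767, 31),
    "hivemind, second sample": (370620, 27557, 430514, 32),
}


def error_pct(errs, pkts):
    """Return errors as a percentage of the packet counter."""
    return 100.0 * errs / pkts if pkts else 0.0


def counter_delta(before, after):
    """Return the element-wise change (after - before) between two samples."""
    return tuple(b - a for a, b in zip(before, after))


for host, (ipkts, ierrs, opkts, oerrs) in SAMPLES.items():
    print(f"{host}: Ierrs {error_pct(ierrs, ipkts):.2f}% of Ipkts, "
          f"Oerrs {error_pct(oerrs, opkts):.2f}% of Opkts")

# How far the hivemind counters moved between the two samples.
d_ipkts, d_ierrs, d_opkts, d_oerrs = counter_delta(
    SAMPLES["hivemind, first sample"], SAMPLES["hivemind, second sample"])
print(f"hivemind between samples: +{d_ipkts} Ipkts, +{d_ierrs} Ierrs, "
      f"+{d_opkts} Opkts, +{d_oerrs} Oerrs")
```

On these figures the input-error percentage dwarfs the output-error
percentage on both hosts, and the hivemind Ierrs counter was still
climbing between the two samples.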

I'd suggest that you have a look at the switch itself as well. I know
that this is a desktop switch (read: non-enterprise) that differs from
most others in its class in that it's got *no* cooling fans (hence that
bump on the top). As such, depending on how hard you hammer this device,
there is a chance of it overheating. This, of course, also depends on
where it's actually located.

It does have some link / speed / duplex indicator LEDs on the front
that can be useful. As well as checking the cabling into the switch, try
swapping the cables to other available ports (except the uplink
partner!) and see if this makes any difference.

I remember one of the guys on my team at work got the Linksys peer to
this switch and rubbished it within 3 weeks due to the switch locking up
under sustained (4+ hours) heavy load transfers at his section's LAN
point.

Let us know how you get on, and what new information you might have for
us.

Regards,

Stacey


> >
> >  
> >
> >>>netstat -s
> >>>
> >>>      
> >>>
> >>It's too long.  Don't want to offend anyone with a long debug output....
> >>
> >>    
> >>
> >>>netstat -m
> >>>
> >>>      
> >>>
> >>hivemind# netstat -m
> >>67/896/6144 mbufs in use (current/peak/max):
> >>        66 mbufs allocated to data
> >>        1 mbufs allocated to packet headers
> >>64/600/1536 mbuf clusters in use (current/peak/max)
> >>1424 Kbytes allocated to network (30% of mb_map in use)
> >>0 requests for memory denied
> >>0 requests for memory delayed
> >>0 calls to protocol drain routines
> >>
> >>    
> >>
> >>>At the least, you could try "bouncing" (ifconfig down / ifconfig up) the
> >>>interfaces if the situation degrades dramatically.
> >>>
> >>>      
> >>>
> >>True, but the thing is these boxes don't have keyboards hooked up to 
> >>them, so when they go down,
> >>I have to wait to see if they come up, or I kill the power if I'm impatient.
> >>I just moved the switch away from the box it's next to, in case it needs more 
> >>ventilation, so I'll see how it goes now...
> >>
> >>    
> >>
> >>>Hope this helps.
> >>>
> >>>Stacey
> >>>
> >>> 
> >>>
> >>>      
> >>>
-- 
Stacey Roberts
B.Sc (HONS) Computer Science

Web: www.vickiandstacey.com



To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message



