Date: Mon, 27 Jun 2016 08:50:56 -0400
From: Ernie Luzar <luzar722@gmail.com>
To: Janos Dohanics <web@3dresearch.com>
Cc: FreeBSD Questions <freebsd-questions@freebsd.org>
Subject: Re: LAN slow or dead, intermittently
Message-ID: <57712130.2050603@gmail.com>
In-Reply-To: <20160624112659.a9fd454b8d05166befb5876d@3dresearch.com>
References: <20160624112659.a9fd454b8d05166befb5876d@3dresearch.com>

Janos Dohanics wrote:
> Hello List,
>
> Please help me figure out what makes my LAN intermittently slow or just
> about dead.
>
> The LAN consists of a pfSense router (m1n1wall), a Netgear GS724T
> switch, a recently installed FreeBSD 10.3 machine, several Windows 7 Pro
> machines, androids and iPhones, and a Brother printer, altogether
> between a dozen and 2 dozen networked devices.
>
> There are no local servers on the network, so as far as I can tell,
> most traffic to and from the local nodes is with the internet
>
> Desktops have wired connections (100 MB or 1 GB NICs), but the phones
> and most laptops are connected by WiFi.
>
> WiFi is provided by a Linksys E1500 configured to work only as a WiFi
> AP.
>
> There is also a Linksys RE4000W WiFi extender on the network.
>
> The FreeBSD machine, the printer, the switch, the E1500 and RE4000W
> WiFis have static IP addresses. Most of the Windows machines have
> reserved DHCP addresses, the rest are unreserved DHCP. pfSense is
> providing the DHCP server.
>
> I started to investigate the problem using mtr(8) which runs every 10
> minutes. Several times in my testing, the average RTT between the
> FreeBSD machine (10.10.11.252) and the router's LAN interface
> (10.10.11.1) was hundreds of milliseconds. Also, several times, 1 out
> of the 10 packets is lost, but whenever this packet loss occurs, RTTs
> are mostly 0.1 or 0.2 ms, but always less than 1 ms.
>
> Pinging various hosts on the LAN at times is in the 10s of milliseconds
> or higher.
>
> Using my FreeBSD laptop and the FreeBSD machine, I tested the LAN with
> netperf(1) which showed over 80 Mbit/s in good times but also less than
> 1 Mbit/s at other times.
>
> During off-hours, I have disconnected and then reconnected computers
> one by one, but could not identify any as the culprit. Replaced the
> switch and patch cables - the problem is still there... intermittently.
>
> None of the Windows computers seems to have any malware which might
> flood the network. I looked at pftop, and traffic seems to be legit -
> but how could I see all LAN traffic and possibly correlate it with the
> slowdown? Could this be caused by a broken networking hardware? How
> would I identify that?
>
> What is the intelligent way to track down this problem? Please advise.
>

I also had performance problems with 10.3 that did not happen with 10.2
and older releases. When the LAN went dead I had to reboot the host
system to get things working again because users were on my back. I
never let the condition persist long enough to see whether it would
resolve itself.

My first solution was to go back to 10.2, and everything was fine. One
evening I swapped the host's 10.2 hard drive for the 10.3 hard drive so
I could test some more. Just by luck I checked the date and time by
issuing the "date" command. The date was correct but the time was off by
-2 hours. I manually set the correct time using the "date" command and
let 10.3 run as production.

Within 5 days the LAN was having performance problems again. I checked
the host time and it was off by -30 minutes. I replaced the host
motherboard battery with a new one and manually set the correct time
again. Things ran OK for about 2 weeks, then it happened again. This
time the clock was off by -2 minutes, so I enabled the base ntpd time
daemon by adding this to rc.conf:

    ntpd_enable="YES"
    ntpd_sync_on_start="YES"

Since then 10.3 has been running OK (2 months now).

I think something in the network stack code changed between 10.2 and
10.3 that made it sensitive to how far the host's clock is out of sync
with the other LAN nodes.

I would say that checking the time on your host and on all the machines
on the LAN would be a good place to start looking for your problem.

Good luck
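
P.S. If you would rather log the clock error than eyeball "date" on each
box, something like the following might do. It is only a rough sketch,
not something I have run on this network: it sends one SNTP client
request and estimates how far the local clock is from the server's. The
function names are made up for illustration, and it assumes the server
you pass in actually answers NTP on UDP port 123.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_DELTA = 2208988800


def ntp_to_unix(seconds, fraction):
    """Convert a 64-bit NTP timestamp (two 32-bit words) to Unix seconds."""
    return seconds - NTP_EPOCH_DELTA + fraction / 2**32


def clock_offset(t1, t2, t3, t4):
    """Standard NTP offset estimate.

    t1 = request sent (local clock), t2 = request received (server clock),
    t3 = reply sent (server clock), t4 = reply received (local clock).
    A positive result means the local clock is behind the server.
    """
    return ((t2 - t1) + (t3 - t4)) / 2.0


def query_offset(server, timeout=2.0):
    """Send one SNTP client packet and return the estimated offset in seconds."""
    # First byte 0x23: leap indicator 0, version 4, mode 3 (client).
    packet = b"\x23" + 47 * b"\0"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(packet, (server, 123))
        data, _ = sock.recvfrom(512)
        t4 = time.time()
    # The 48-byte header is twelve 32-bit words; words 8-9 hold the
    # receive timestamp and words 10-11 the transmit timestamp.
    words = struct.unpack("!12I", data[:48])
    t2 = ntp_to_unix(words[8], words[9])
    t3 = ntp_to_unix(words[10], words[11])
    return clock_offset(t1, t2, t3, t4)
```

Calling query_offset("pool.ntp.org") from cron every few minutes and
writing the result next to your mtr numbers would show whether the
slowdowns line up with clock drift. Whether they do on your LAN is
exactly the thing to find out; I only know what I saw on mine.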