Date: Mon, 27 Jun 2016 08:50:56 -0400
From: Ernie Luzar
To: Janos Dohanics
CC: FreeBSD Questions
Subject: Re: LAN slow or dead, intermittently
Message-ID: <57712130.2050603@gmail.com>
In-Reply-To: <20160624112659.a9fd454b8d05166befb5876d@3dresearch.com>

Janos Dohanics wrote:
> Hello List,
>
> Please help me figure out what makes my LAN intermittently slow or just
> about dead.
>
> The LAN consists of a pfSense router (m1n1wall), a Netgear GS724T
> switch, a recently installed FreeBSD 10.3 machine, several Windows 7 Pro
> machines, androids and iPhones, and a Brother printer, altogether
> between a dozen and 2 dozen networked devices.
>
> There are no local servers on the network, so as far as I can tell,
> most traffic to and from the local nodes is with the internet
>
> Desktops have wired connections (100 MB or 1 GB NICs), but the phones
> and most laptops are connected by WiFi.
>
> WiFi is provided by a Linksys E1500 configured to work only as a WiFi
> AP.
>
> There is also a Linksys RE4000W WiFi extender on the network.
>
> The FreeBSD machine, the printer, the switch, the E1500 and RE4000W
> WiFis have static IP addresses. Most of the Windows machines have
> reserved DHCP addresses, the rest are unreserved DHCP. pfSense is
> providing the DHCP server.
>
> I started to investigate the problem using mtr(8) which runs every 10
> minutes. Several times in my testing, the average RTT between the
> FreeBSD machine (10.10.11.252) and the router's LAN interface
> (10.10.11.1) was hundreds of milliseconds. Also, several times, 1 out
> of the 10 packets is lost, but whenever this packet loss occurs, RTTs
> are mostly 0.1 or 0.2 ms, but always less than 1 ms.
>
> Pinging various hosts on the LAN at times is in the 10s of milliseconds
> or higher.
>
> Using my FreeBSD laptop and the FreeBSD machine, I tested the LAN with
> netperf(1) which showed over 80 Mbit/s in good times but also less than
> 1 Mbit/s at other times.
>
> During off-hours, I have disconnected and then reconnected computers
> one by one, but could not identify any as the culprit. Replaced the
> switch and patch cables - the problem is still there... intermittently.
>
> None of the Windows computers seems to have any malware which might
> flood the network. I looked at pftop, and traffic seems to be legit -
> but how could I see all LAN traffic and possibly correlate it with the
> slowdown? Could this be caused by a broken networking hardware? How
> would I identify that?
>
> What is the intelligent way to track down this problem? Please advise.
>

I also had performance problems with 10.3 that did not happen with 10.2
or older releases. When the LAN went dead I had to reboot the host
system to get things working again, because users were on my back. I
never let the condition persist long enough to see whether it would
resolve itself. My first fix was to go back to 10.2, and everything was
fine.

One evening I swapped the host's 10.2 hard drive for the 10.3 hard
drive so I could test some more. Just by luck I checked the date and
time by issuing the "date" command. The date was correct but the time
was off by -2 hours. I manually set the correct time using the "date"
command and put 10.3 into production. Within 5 days the LAN was having
performance problems again. I checked the host time and it was off by
-30 minutes. I replaced the host's motherboard battery with a new one
and manually set the correct time again. Things ran OK for about 2
weeks before it happened again. This time the clock was off by -2
minutes, so I enabled the base system's ntpd time daemon by adding this
to rc.conf:

    ntpd_enable="YES"
    ntpd_sync_on_start="YES"

Since then 10.3 has been running OK [2 months now].

I think something in the network stack code changed between 10.2 and
10.3 that made communication between the LAN nodes and the host depend
on how far apart their clocks are. I would say that checking the time
on your host and on all the machines on the LAN would be a good place
to start looking for your problem.

Good luck
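
P.S. For anyone who wants to automate the clock check suggested above,
here is a rough sketch, not something I ran as part of the story. The
function names clock_offset and offset_status are made up for
illustration, and the 60 second default limit is an arbitrary
assumption. The commented-out loop assumes the other hosts are
reachable over ssh and have a "date" that understands +%s, which will
not be true of most Windows machines.

```shell
#!/bin/sh
# Hypothetical helpers for spotting clock drift between hosts.

# Print the signed offset in seconds: remote epoch minus local epoch.
# Usage: clock_offset <local_epoch> <remote_epoch>
clock_offset() {
    echo $(( $2 - $1 ))
}

# Print DRIFT if the magnitude of the offset exceeds the limit, else OK.
# Usage: offset_status <offset_seconds> [limit_seconds]
# The default limit of 60 seconds is an assumption, not a known threshold.
offset_status() {
    off=$1
    limit=${2:-60}
    if [ "$off" -lt 0 ]; then
        off=$(( 0 - off ))
    fi
    if [ "$off" -gt "$limit" ]; then
        echo DRIFT
    else
        echo OK
    fi
}

# Example loop. host1 and host2 are placeholder names.
# for h in host1 host2; do
#     remote=$(ssh "$h" date +%s) || continue
#     off=$(clock_offset "$(date +%s)" "$remote")
#     echo "$h offset=${off}s $(offset_status "$off")"
# done
```

The ssh round trip adds a little delay of its own, so a check like this
is only good for catching drift of many seconds or more, which is the
range I was seeing.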