Date: Fri, 10 Aug 2012 08:38:34 +0100
From: Karl Pielorz
To: freebsd-hackers@FreeBSD.org
Subject: FreeBSD 9.0-R em0 issues?

Hi,

I've got a SuperMicro X8DTL-IF based server (with Intel L5630), 6GB of RAM
and two onboard Intel NICs. AFAIK this is running the stock FreeBSD 9.0-R
GENERIC kernel.

em0: port 0xdc00-0xdc1f mem 0xfbce0000-0xfbcfffff,0xfbcdc000-0xfbcdffff irq 16 at device 0.0 on pci6
em0: Using MSIX interrupts with 3 vectors
em0: Ethernet address: 00:25:90:31:82:46
em0: link state changed to UP
em1: port 0xec00-0xec1f mem 0xfbde0000-0xfbdfffff,0xfbddc000-0xfbddffff irq 17 at device 0.0 on pci7
em1: Using MSIX interrupts with 3 vectors
em1: Ethernet address: 00:25:90:31:82:47

em0 is the only one in use, and it 'freezes' every now and again.
Symptoms are no traffic in, or out - but pretty 'insane' figures from
'netstat -i', e.g.

Name  Mtu   Network  Address            Ipkts      Ierrs           Idrop  Opkts      Oerrs          Coll
em0   1500           00:25:90:31:82:46  610815304  22999549864725  0      518403896  6571299961350  3285649980675

The machine's ARP cache expires at the time, and tcpdump shows no data at
all on that interface. The switch port this is connected to disagrees with
the errors (it has none logged currently for that port).

The machine is a lightly loaded MySQL host. Considering the above was taken
'seconds' after the NIC stopped, I can't see how it could have logged
billions of legitimate errors in that small a time frame.

Doing an 'ifconfig em0 down' and 'ifconfig em0 up' makes no difference when
it's hung. Rebooting the machine fixes the problem 'for a while'. Once
rebooted, no Oerrs, Coll errors or anything else are shown by 'netstat -i'.

Any suggestions what this could be? Or what I can do to diagnose further?
Nothing is logged on the console, or in /var/log/messages.

When it failed last time, I did 'sysctl dev.em.0.debug=1' which netted:

Interface is RUNNING and INACTIVE
em0: hw tdh = -1, hw tdt = -1
em0: hw rdh = -1, hw rdt = -1
em0: Tx Queue Status = 1
em0: Tx descriptors avail = 986
em0: Tx Descriptors avail failure = 0
em0: RX discarded packets = 0
em0: RX Next to Check = 844
em0: RX Next to Refresh = 843

Should I be concerned about the '-1's?

Thanks,

-Karl
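
One quick sanity check on the 'netstat -i' figures above, offered as a sketch rather than a diagnosis: the three error counters can be tested for being exact multiples of 0xFFFFFFFF, the value a 32-bit register read returns when every bit is set (which is also what a '-1' in the debug output would look like unsigned). The helper name all_ones_multiple is made up for illustration, and the idea that the counters are accumulated from all-ones register reads is an assumption, not something confirmed here.

```python
ALL_ONES = 0xFFFFFFFF  # a 32-bit read with every bit set; prints as -1 when signed

# Error counters copied from the 'netstat -i' line in this message.
counters = {
    "Ierrs": 22999549864725,
    "Oerrs": 6571299961350,
    "Coll": 3285649980675,
}


def all_ones_multiple(value):
    """Return value / 0xFFFFFFFF if it divides exactly, otherwise None.

    Hypothetical helper, for illustration only.
    """
    quotient, remainder = divmod(value, ALL_ONES)
    return quotient if remainder == 0 else None


multiples = {name: all_ones_multiple(v) for name, v in counters.items()}

for name, factor in multiples.items():
    if factor is None:
        print(f"{name}: not a multiple of 0xFFFFFFFF")
    else:
        print(f"{name}: {factor} x 0xFFFFFFFF")
```

All three divide exactly (Coll by 765, Oerrs by twice that, Ierrs by seven times that), which would fit counters being bumped by 0xFFFFFFFF on every statistics update rather than by real errors. That reading is only a guess from the arithmetic.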