From: Jack Vogel
To: Jeremy Chadwick
Cc: freebsd-stable@freebsd.org
Date: Thu, 12 Nov 2009 12:52:22 -0800
Subject: Re: 82573 xfers pause, no watchdog timeouts, DCGDIS ineffective (7.2-R)
Message-ID: <2a41acea0911121252y81f365fo2982e43e3efdba4d@mail.gmail.com>
In-Reply-To: <20091112204736.GA29095@icarus.home.lan>
References: <4AFC63B0.5020707@alaska.net> <20091112204736.GA29095@icarus.home.lan>
List-Id: Production branch of FreeBSD source code

It is critically important to run the latest BIOS on these systems, so
maybe that's the difference between you two.

I am going to be putting out a new em driver to CURRENT soon; trying that
might be an option as well. This sounds like a hang, and a management/OS
race in the driver is a possibility.

Jack

On Thu, Nov 12, 2009 at 12:47 PM, Jeremy Chadwick wrote:

> On Thu, Nov 12, 2009 at 10:36:16AM -0900, Royce Williams wrote:
> > We have servers with dual 82573 NICs that work well during low-throughput
> > activity, but during high-volume activity, they pause shortly after
> > transfers start and do not recover. Other sessions to the system are not
> > affected.
>
> Please define "low-throughput" and "high-volume" if you could; it might
> help folks determine where the threshold is for problems.
>
> > These systems are being repurposed, jumping from 6.3 to 7.2. The same
> > system and its kin do not exhibit the symptom under 6.3-RELEASE-p13. The
> > symptoms appear under freebsd-updated 7.2-RELEASE GENERIC kernel with no
> > tuning.
> >
> > Previously, we've been using DCGDIS.EXE (from Jack Vogel) for this
> > symptom. The first system to be repurposed accepts DCGDIS with 'Updated'
> > and subsequent 'update not needed', with no relief.
> >
> > Notably, there are no watchdog timeout errors - unlike our various
> > Supermicro models still running FreeBSD 6.x. All of our other 7.x
> > Supermicro flavors had already received the flash update and haven't
> > shown the symptom.
> >
> > Details follow.
> >
> > Kernel:
> >
> > rand# uname -a
> > FreeBSD rand.acsalaska.net 7.2-RELEASE-p4 FreeBSD 7.2-RELEASE-p4 #0: Fri Oct 2 12:21:39 UTC 2009 root@i386-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC i386
> >
> > sysctls:
> >
> > rand# sysctl dev.em
> > dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 6.9.6
> > dev.em.0.%driver: em
> > dev.em.0.%location: slot=0 function=0
> > dev.em.0.%pnpinfo: vendor=0x8086 device=0x108c subvendor=0x15d9 subdevice=0x108c class=0x020000
> > dev.em.0.%parent: pci13
> > dev.em.0.debug: -1
> > dev.em.0.stats: -1
> > dev.em.0.rx_int_delay: 0
> > dev.em.0.tx_int_delay: 66
> > dev.em.0.rx_abs_int_delay: 66
> > dev.em.0.tx_abs_int_delay: 66
> > dev.em.0.rx_processing_limit: 100
> > dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 6.9.6
> > dev.em.1.%driver: em
> > dev.em.1.%location: slot=0 function=0
> > dev.em.1.%pnpinfo: vendor=0x8086 device=0x108c subvendor=0x15d9 subdevice=0x108c class=0x020000
> > dev.em.1.%parent: pci14
> > dev.em.1.debug: -1
> > dev.em.1.stats: -1
> > dev.em.1.rx_int_delay: 0
> > dev.em.1.tx_int_delay: 66
> > dev.em.1.rx_abs_int_delay: 66
> > dev.em.1.tx_abs_int_delay: 66
> > dev.em.1.rx_processing_limit: 100
> >
> > kenv:
> >
> > rand# kenv | grep smbios | egrep -v 'socket|serial|uuid|tag|0123456789'
> > smbios.bios.reldate="03/05/2008"
> > smbios.bios.vendor="Phoenix Technologies LTD"
> > smbios.bios.version="6.00"
> > smbios.chassis.maker="Supermicro"
> > smbios.planar.maker="Supermicro"
> > smbios.planar.product="PDSMi "
> > smbios.planar.version="PCB Version"
> > smbios.system.maker="Supermicro"
> > smbios.system.product="PDSMi"
> >
> >
> > The system is not yet production, so I can invasively abuse it if needed.
> > The other systems are in production under 6.3-RELEASE-p13 and can also be
> > inspected.
> >
> > Any pointers appreciated.
> >
> > Royce
>
> For what it's worth as a comparison base:
>
> We use the following Supermicro SuperServers, and can confirm that no
> such issues occur for us using RELENG_6 nor RELENG_7 on the following
> hardware:
>
> Supermicro SuperServer 5015B-MTB - amd64 - Intel 82573V + Intel 82573L
> Supermicro SuperServer 5015M-T+B - amd64 - Intel 82573V + Intel 82573L
> Supermicro SuperServer 5015M-T+B - amd64 - Intel 82573V + Intel 82573L
> Supermicro SuperServer 5015M-T+B - i386 - Intel 82573V + Intel 82573L
> Supermicro SuperServer 5015M-T+B - i386 - Intel 82573V + Intel 82573L
>
> The 5015B-MTB system presently runs RELENG_8 -- no issues there either.
>
> Relevant server configuration and network setup details:
>
> - All machines use pf(4).
> - All emX devices are configured for autoneg.
> - All emX devices use RXCSUM, TXCSUM, and TSO4.
> - We do not use polling.
> - All machines use both NICs simultaneously at all times.
> - All machines connected to an HP ProCurve 2626 switch (100mbit,
>   full-duplex ports, all autoneg).
> - We do not use Jumbo frames.
> - No add-in cards (PCI, PCI-X, nor PCIe) are used in the systems.
> - All of the systems had DCGDIS.EXE run on them; no EEPROM settings
>   were changed, indicating the from-the-Intel-factory MANC register
>   in question was set properly.
>
> Relevant throughput details per box:
>
> - em0 pushes ~600-1000kbit/sec at all times.
> - em1 pushes ~100-200kbit/sec at all times.
> - During nightly maintenance (backups), em1 pushes ~2-3mbit/sec
>   for a variable amount of time.
> - For a full level 0 backup (which I've done numerous times), em1
>   pushes 60-70mbit/sec without issues.
>
> I've compared your sysctl dev.em output to that of our 5015M-T+B systems
> (which use the PDSMi+, not the PDSMi, but whatever), and ours is 100%
> identical.
>
> All of our 5015M-T+B systems are using BIOS 1.3, and the 5015B-MTB
> system is using BIOS 1.30.
>
> If you'd like, I can provide the exact BIOS settings we use on the
> machines in question; they do deviate from the factory defaults a slight
> bit, but none of the adjustments are "tweaks" for performance or
> otherwise (just disabling things which we don't use, etc.).
>
> --
> | Jeremy Chadwick                                    jdc@parodius.com |
> | Parodius Networking                        http://www.parodius.com/ |
> | UNIX Systems Administrator                   Mountain View, CA, USA |
> | Making life hard for others since 1977.               PGP: 4BD6C0CB |
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>
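
The sysctl comparison described in the quoted message can be repeated
mechanically. What follows is only a sketch in Python, assuming the output
of "sysctl dev.em" from each machine has been saved as text. The function
names parse_sysctl and diff_sysctl are made up for illustration, and the
second host's values in the demo are hypothetical, not taken from this
thread.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines (as printed by `sysctl dev.em`) into a dict."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep:  # ignore anything that is not a 'name: value' line
            settings[name.strip()] = value.strip()
    return settings


def diff_sysctl(ours, theirs):
    """Return {name: (our_value, their_value)} for every OID that differs.

    An OID present on only one side shows up with None for the other side.
    """
    a, b = parse_sysctl(ours), parse_sysctl(theirs)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    # Subset of the dev.em.0 values posted in the thread.
    host_a = (
        "dev.em.0.rx_int_delay: 0\n"
        "dev.em.0.tx_int_delay: 66\n"
        "dev.em.0.rx_processing_limit: 100\n"
    )
    # HYPOTHETICAL second host, invented here to show a mismatch.
    host_b = (
        "dev.em.0.rx_int_delay: 0\n"
        "dev.em.0.tx_int_delay: 64\n"
        "dev.em.0.rx_processing_limit: 100\n"
    )
    for name, (ours, theirs) in diff_sysctl(host_a, host_b).items():
        print(f"{name}: {ours} != {theirs}")
```

An empty result means the two outputs match, as was reported for the PDSMi
and PDSMi+ boards above. Per-host fields such as %parent will naturally
differ between machines and may need to be filtered out before comparing.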