From: Jason Wolfe
To: Arnaud Lacombe
Cc: freebsd-net@freebsd.org
Date: Fri, 7 Oct 2011 13:46:36 -0700
Subject: Re: Intel 82574L interface wedging on em 7.1.9/7.2.3 when MSIX enabled

On Fri, Oct 7, 2011 at 12:24 PM, Arnaud Lacombe wrote:
> Hi,
>
> On Fri, Oct 7, 2011 at 2:57 PM, Jason Wolfe wrote:
> > Jack,
> >
> > Entirely possible there are multiple moving pieces here, the only bit I know
> > for certain is it's related to
> > the different operation when running with MSI
> > vs MSI-X. Here is also my loader.conf for reference. I'm currently running
> > the modular congestion control stuff with cubic in use, but these issues
> > predate those changes also. Just to give you a scope of it though, it was
> > somewhat 'rare' for them to wedge. Out of a pool of ~2000 servers running
> > with the 82574L doing ~800Mb/s average, there were ~220 reports in a week.
> > So with some fuzzy math to put it in the same terms you were talking in, a
> > server in particular would hang about once every 9 weeks.
> >
> Just two questions off the top of my head:
>
> Are the failing servers evenly distributed, or are the same ones always
> failing?
>
> Did you collect the uptime and the kernel msgbuf of the server when
> the issue triggered?
>
> Thanks,
>  - Arnaud
>

Arnaud,

The failures were pretty random, though a handful of servers did fail a
couple of times. It didn't seem attributable to a certain batch or
physical location. The uptime was not collected, but most were in the
ballpark of 30-90 days. I was tailing /var/log/messages, but no, I
didn't save kern.msgbuf.

I've added both of these to the collections. I've also pulled a couple
of servers that failed more than once and will be re-enabling MSI-X on
them later today.

Jason