From: Jack Vogel
To: Arnaud Lacombe
Cc: freebsd-net@freebsd.org
Date: Thu, 31 Mar 2011 14:57:56 -0700
Subject: Re: em(4) hang [Was: Re: igb(4) won't start with "igb0: Could not setup receive structures"]

So, what is the evidence that the driver is stuck here? I see that
next_to_check != next_to_refresh, which is why the local timer won't
schedule anything.

Oh, and I also realized there is a problem with local_timer anyway: it
will run rxeof, but that won't help if you can't enter the loop, so I
need to add some code at the top to call em_refresh_mbufs() when in this
state.

On this interrupt cause that you are focused upon: although it's there in
the design, I have talked with some of our most seasoned developers on
both the Windows and Linux side of the house, and NO one has ever used
this 'feature', because (and I'm quoting here) "there's no good use case
for it". Meaning, there's always some simpler way of handling the issue.

When you use MSIX you can't read causes, btw. If you configured it, it
would mean you'd just get into the regular RX handler, same as always, so
why some special bother with this cause? On non-MSIX hardware there is
just no particular reason to worry about the cause either; we can just
handle the RX situation in the interrupt handler.

Jack

On Thu, Mar 31, 2011 at 2:09 PM, Arnaud Lacombe wrote:
> Hi Jack,
>
> On Thu, Mar 31, 2011 at 9:51 AM, Arnaud Lacombe wrote:
> > [...]
> > I'll remove part of the changes I made to keep only `rx_forced_refill'
> > and the associated sysctl, re-run the tests and come back with correct
> > value, hopefully in a few hours.
> >
> Here it is:
>
> # sysctl dev.em.0.%desc
> dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.2.2
>
> # sysctl dev.em.0.mac_stats.missed_packets
> dev.em.0.mac_stats.missed_packets: 917428
>
> # sysctl dev.em.0.debug=1
> dev.em.0.debug: I-1nterface is RUNNING and INACTIVE
> em0: hw tdh = 975, hw tdt = 975
> em0: hw rdh = 884, hw rdt = 885
> em0: Tx Queue Status = 0
> em0: TX descriptors avail = 1024
> em0: Tx Descriptors avail failure = 0
> em0: RX discarded packets = 0
> em0: RX Next to Check = 884
> em0: RX Next to Refresh = 885
> -> -1
>
> So the taskqueue cannot be scheduled to run and the driver is stuck.
>
> On Wed, Mar 30, 2011 at 2:22 PM, Jack Vogel wrote:
> >> Read the code in HEAD, em_local_timer() has a test of ALL the rx
> >> queues and will schedule a task that refreshes mbufs if they are
> >> empty. This has exactly the same effect as checking for some
> >> interrupt cause, a cause that is not available when using MSIX on
> >> 82574, but this approach works for everything.
> >>
> Can you please point me to a reference datasheet (or errata), provided
> by Intel, about the RX Overrun interrupt not being available with
> MSI-X on the 82574?
>
> Currently, I only have access to [0], which states the following:
>
> 7.4 Interrupts
> 7.4.2 MSI-X Mode
> [...]
> The following configuration and parameters are involved:
> • The IVAR.INT_Alloc[4:0] entries map two Tx queues, two Rx queues and
>   other events to 5 interrupt vectors
> • The ICR[24:20] bits reflect specific interrupt causes
> • Five MSI-X interrupt vectors are provided (calculated based on four
>   vectors for queues and one vector for other causes). The requested
>   number of vectors is loaded from the MSI_X_N fields in the EEPROM
>   into the PCIe MSI-X capability structure of the function.
>
> 10.2.4.1 Interrupt Cause Read Register - ICR (0x000C0; RC/WC)
> [...]
>
> about bit 24:
>
> Other Interrupt. Indicates one of the following interrupts was set:
> • Link Status Change.
> • Receiver Overrun.
> • MDIO Access Complete.
> • Small Receive Packet Detected.
> • Receive ACK Frame Detected.
> • Manageability Event Detected.
>
> Thanks in advance,
>  - Arnaud
>
> [0]: ftp://download.intel.com/design/network/datashts/82574.pdf
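
The timer-side test discussed in this message can be modeled in a few lines. This is only a minimal sketch in Python, not driver code: `RxRingModel` and `timer_would_schedule` are hypothetical names, and the equality test is taken from the description in the thread (the timer acts only when next_to_check equals next_to_refresh), not from the em(4) source. It plugs in the values from the debug output above.

```python
from dataclasses import dataclass


@dataclass
class RxRingModel:
    """Hypothetical stand-in for per-queue RX state; not the driver's struct."""
    next_to_check: int    # next descriptor the RX cleanup loop inspects
    next_to_refresh: int  # next descriptor waiting for a fresh mbuf


def timer_would_schedule(ring: RxRingModel) -> bool:
    # Timer-side test as described in the thread: the refresh task is
    # scheduled only when the two indices are equal.
    return ring.next_to_check == ring.next_to_refresh


# Values reported by "sysctl dev.em.0.debug=1" in the quoted message.
ring = RxRingModel(next_to_check=884, next_to_refresh=885)
print(timer_would_schedule(ring))  # False
```

With the reported indices (884 and 885) the test is false, which matches the statement that the local timer won't schedule anything in this state.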