From: Juli Mallett
Date: Thu, 15 Mar 2012 00:39:30 -0700
To: freebsd-net@freebsd.org
Subject: MSI-X + em(4) = Refresh mbufs: hdr dmamap load failure - 22
List-Id: Networking and TCP/IP with FreeBSD

All,

On both stable/9 and trunk I see that with one of either the 82571EB
or 82574L I am flooded with messages in the form of:

	Refresh mbufs: hdr dmamap load failure - 22

If I disable msix, then the messages go away. I am not sure why msix
vs. non-msix would matter in this case, unless in the msix case
spurious interrupts are causing em_rxeof to be called without any
packets available. If that happens, then perhaps
e1000_rx_unrefreshed() is called when no buffers have been processed
and then em_refresh_mbufs wrongly refreshes the whole ring?

This seems like it would be a problem because the
bus_dmamap_load_mbuf_sg code is called unconditionally, even when a
new mbuf isn't being allocated. In that case, the mapping already
exists. Wouldn't it be necessary to unload and then reload the mbuf?
So either it's a bug that em_refresh_mbufs is being called at all, or
it's naively reusing mbufs in a way that actually guarantees an error,
right?

Also, in the case where it frees, only m_free is called; is there
never a case where that should be an m_freem? I can imagine some, but
they are likely impossible with the receive path of the driver. (I
don't know for sure because the receive path and the mbuf refresh code
keep changing and I've been unable to keep up.)

I don't know which part it is, of course, because I don't know what
port it's coming from. Like three other printfs in the driver where
which device is being used matters tremendously, it uses a bare printf
and not a device_printf.

I could modify the driver, but for now disabling msix is easier than
continuing to load new kernels to try to debug the problem. Is anyone
else seeing this? Has anyone further investigated the problem? Is
there a patch floating around and I just haven't found the right
search terms?

Thanks in advance,
Juli.

PS: Yes, I know this is kind of a crappy bug report, sorry. I've had a limited amount of time to investigate so far, and don't want to delay reporting it until I am able to get more time with the problematic hardware.
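
For anyone trying to follow the hypothesis above, here is a minimal user-space sketch of it in C. It is a toy model, not the em(4) driver: `toy_rx_slot`, `toy_dmamap_load`, `toy_dmamap_unload`, `toy_refresh` and `toy_run` are hypothetical names, and the rule that loading an already loaded map fails with EINVAL (errno 22) is only the assumption from the message, not something confirmed about bus_dmamap_load_mbuf_sg.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define RING_SIZE 8

/* Toy stand-in for one receive slot; not the driver's real structure. */
struct toy_rx_slot {
	bool has_mbuf;   /* a buffer is already attached to this slot */
	bool map_loaded; /* the slot's DMA map currently holds a mapping */
};

/*
 * Hypothetical load routine. ASSUMPTION (from the message, unverified):
 * loading a map that is already loaded fails with EINVAL.
 */
static int
toy_dmamap_load(struct toy_rx_slot *s)
{
	if (s->map_loaded)
		return (EINVAL);
	s->map_loaded = true;
	return (0);
}

static void
toy_dmamap_unload(struct toy_rx_slot *s)
{
	s->map_loaded = false;
}

/*
 * Refresh every slot in the ring and return the number of load failures.
 * With unload_first == false the load is unconditional, as described in
 * the message; with unload_first == true a reused buffer is unloaded
 * before it is loaded again.
 */
static int
toy_refresh(struct toy_rx_slot *ring, int n, bool unload_first)
{
	int failures = 0;

	for (int i = 0; i < n; i++) {
		if (!ring[i].has_mbuf)
			ring[i].has_mbuf = true;	/* "allocate" a new buffer */
		else if (unload_first)
			toy_dmamap_unload(&ring[i]);	/* reused buffer */
		if (toy_dmamap_load(&ring[i]) != 0)
			failures++;
	}
	return (failures);
}

/* Fill the ring once, then model a spurious refresh of the whole ring. */
static int
toy_run(bool unload_first)
{
	struct toy_rx_slot ring[RING_SIZE] = { { false, false } };
	int initial = toy_refresh(ring, RING_SIZE, unload_first);

	assert(initial == 0);	/* first fill always loads empty maps */
	return (toy_refresh(ring, RING_SIZE, unload_first));
}
```

Under that assumption, `toy_run(false)` reports one failure per slot on the second pass, while `toy_run(true)` reports none. This only shows that the two explanations in the message (a refresh that should not happen, or a reload without an unload) would each be enough to produce a flood of errors; it says nothing about which one the real driver hits.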