From owner-freebsd-net@FreeBSD.ORG  Thu Mar 15 07:39:51 2012
MIME-Version: 1.0
Sender: juli@clockworksquid.com
Received: by 10.180.96.231 with HTTP; Thu, 15 Mar 2012 00:39:30 -0700 (PDT)
From: Juli Mallett <jmallett@FreeBSD.org>
Date: Thu, 15 Mar 2012 00:39:30 -0700
X-Google-Sender-Auth: boPVFp2BDv2pp0XGSiaFsIy1PXk
Message-ID: <CACVs6=9rTNAjEEdy7sBNEWPtoTdkx7eifZisQF5JTESAorQeJQ@mail.gmail.com>
To: freebsd-net@freebsd.org
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-Gm-Message-State: ALoCoQm46UI3AYdHme8J0+dhs5aVMMOWoEGlmXCnklZqKNSvfKWLu20ynkeg/Rgl6hyqwAI9hcVm
Subject: MSI-X + em(4) = Refresh mbufs: hdr dmamap load failure - 22

All,

On both stable/9 and trunk, with either an 82571EB or an 82574L, I am
flooded with messages of the form:

Refresh mbufs: hdr dmamap load failure - 22

If I disable msix, then the messages go away.  I am not sure why msix
vs. non-msix would matter in this case, unless in the msix case
spurious interrupts cause em_rxeof to be called without any packets
available.  If that happens, then perhaps e1000_rx_unrefreshed() is
called when no buffers have been processed and em_refresh_mbufs then
wrongly refreshes the whole ring?
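
As a rough illustration of the hypothesis above, here is a userland
model of how an "unrefreshed" count could report almost the whole ring
when nothing has been processed.  The struct, the function name
ring_unrefreshed, and the index arithmetic are assumptions made for
this sketch; they are not copied from the driver and may not match
what e1000_rx_unrefreshed() actually does.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the driver's RX ring state. */
struct ring_model {
	uint16_t num_desc;        /* number of descriptors in the ring */
	uint16_t next_to_check;   /* next descriptor the RX path will examine */
	uint16_t next_to_refresh; /* next descriptor the refresh path will refill */
};

/*
 * Assumed arithmetic: count the slots between the refresh index and
 * the check index, wrapping around the end of the ring.
 */
static uint16_t
ring_unrefreshed(const struct ring_model *r)
{
	if (r->next_to_check > r->next_to_refresh)
		return (r->next_to_check - r->next_to_refresh - 1);
	return ((r->num_desc + r->next_to_check) - r->next_to_refresh - 1);
}
```

Under this model, a 1024-entry ring with both indices equal yields
1023, so a call made with no packets processed would look like a
request to refresh nearly every slot.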

This seems like it would be a problem because the
bus_dmamap_load_mbuf_sg code is called unconditionally, even when a
new mbuf isn't being allocated.  In that case, the mapping already
exists.  Wouldn't it be necessary to unload and then reload the mbuf?
So either it's a bug that em_refresh_mbufs is being called at all, or
it's naively reusing mbufs in a way that actually guarantees an error,
right?  Also, in the case where it frees, only m_free is called; is
there never a case where that should be an m_freem?  I can imagine
some, but they are likely impossible with the receive path of the
driver.  (I don't know for sure because the receive path and the mbuf
refresh code keep changing and I've been unable to keep up.)
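
The unload-then-reload question above can be sketched with a userland
mock.  Everything here is hypothetical: mock_dmamap and its functions
are stand-ins, not the bus_dma(9) API, and the rule that loading an
already-loaded map fails with EINVAL (22) is an assumption chosen to
mirror the error number in the message, not verified behavior.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland stand-in for a DMA map; not the bus_dma(9) type. */
struct mock_dmamap {
	bool loaded;
	const void *buf;
};

/* Assumption: loading a map that is already loaded fails with EINVAL. */
static int
mock_dmamap_load(struct mock_dmamap *map, const void *buf)
{
	if (map->loaded)
		return (EINVAL);
	map->buf = buf;
	map->loaded = true;
	return (0);
}

static void
mock_dmamap_unload(struct mock_dmamap *map)
{
	map->buf = NULL;
	map->loaded = false;
}

/* Refresh one slot: keep the buffer, but drop the old mapping first. */
static int
mock_refresh_slot(struct mock_dmamap *map, const void *buf)
{
	if (map->loaded)
		mock_dmamap_unload(map);
	return (mock_dmamap_load(map, buf));
}
```

In this model a second mock_dmamap_load() on the same map returns 22,
while mock_refresh_slot() succeeds because it unloads before loading
again.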

I don't know which part it is, of course, because I don't know what
port it's coming from.  Like three other printfs in the driver, where
knowing which device is involved matters tremendously, this message
uses a bare printf and not a device_printf.  I could modify the
driver, but for now disabling msix is easier than continuing to load
new kernels to try to debug the problem.
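
To show what a device-prefixed message buys, here is a userland sketch
of the "name plus unit" prefix that device_printf(9) adds in the
kernel.  The helper dev_snprintf and its parameters are hypothetical
and exist only for this illustration; the driver itself would simply
call device_printf with its device_t.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "<name><unit>: <message>" into buf.
 * Returns the total length that was (or would have been) written,
 * or a negative value on a formatting error.
 */
static int
dev_snprintf(char *buf, size_t len, const char *name, int unit,
    const char *fmt, ...)
{
	va_list ap;
	int n, m;

	n = snprintf(buf, len, "%s%d: ", name, unit);
	if (n < 0 || (size_t)n >= len)
		return (n);	/* error, or the prefix alone filled the buffer */
	va_start(ap, fmt);
	m = vsnprintf(buf + n, len - (size_t)n, fmt, ap);
	va_end(ap);
	return (m < 0 ? m : n + m);
}
```

Called with the name "em", unit 1, and the message text from this
report, the helper produces a line beginning with "em1: ", which would
immediately answer which port is affected.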

Is anyone else seeing this?  Has anyone further investigated the
problem?  Is there a patch floating around and I just haven't found
the right search terms?

Thanks in advance,
Juli.

PS: Yes, I know this is kind of a crappy bug report, sorry.  I've had
a limited amount of time to investigate so far, and don't want to
delay reporting it until I am able to get more time with the
problematic hardware.