From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 12:21:00 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 3E716106564A
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 12:21:00 +0000 (UTC)
	(envelope-from freebsd-net@m.gmane.org)
Received: from ciao.gmane.org (main.gmane.org [80.91.229.2])
	by mx1.freebsd.org (Postfix) with ESMTP id B21B78FC0A
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 12:20:59 +0000 (UTC)
	(envelope-from freebsd-net@m.gmane.org)
Received: from list by ciao.gmane.org with local (Exim 4.43)
	id 1LqRLK-0004Sk-9x
	for freebsd-net@freebsd.org; Sun, 05 Apr 2009 12:20:54 +0000
Received: from 93-141-3-137.adsl.net.t-com.hr ([93.141.3.137])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-net@freebsd.org>; Sun, 05 Apr 2009 12:20:54 +0000
Received: from ivoras by 93-141-3-137.adsl.net.t-com.hr with local (Gmexim 0.1
	(Debian)) id 1AlnuQ-0007hv-00
	for <freebsd-net@freebsd.org>; Sun, 05 Apr 2009 12:20:54 +0000
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-net@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Sun, 05 Apr 2009 14:20:25 +0200
Lines: 72
Message-ID: <gra7mq$ei8$1@ger.gmane.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature";
	boundary="------------enig9D3AA7C6A7FB08F179C61F87"
X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: 93-141-3-137.adsl.net.t-com.hr
User-Agent: Thunderbird 2.0.0.21 (Windows/20090302)
X-Enigmail-Version: 0.95.7
Sender: news <news@ger.gmane.org>
Subject: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 12:21:00 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig9D3AA7C6A7FB08F179C61F87
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi,

I'm developing an application that needs a high rate of small TCP
transactions on multi-core systems, and I'm hitting a limit where a
kernel task, usually swi:net (though it depends on the driver), saturates
one CPU at a certain transaction rate and blocks any further performance
increase, even though the other cores are 100% idle.

So I've got an idea and tested it out, but it fails in an unexpected
way. I'm not very familiar with the network code so I'm probably missing
something obvious. The idea was to locate where the packet processing
takes place and offload packets to several new kernel threads. I see
this can happen in several places - netisr, ip_input and tcp_input, and
I chose netisr because I thought maybe it would also help other uses
(routing?). Here's a patch against CURRENT:

http://people.freebsd.org/~ivoras/diffs/mpip.patch

It's fairly simple - starts a configurable number of threads in
start_netisr(), assigns circular queues to each, and modifies what I
think are entry points for packets in the non-netisr.direct case. I also
try to have TCP and UDP traffic from the same host+port processed by the
same thread. It has some rough edges but I think this is enough to test
the idea. I know that there are several people officially working in
this area and I'm not an expert in it so think of it as a weekend hack
for learning purposes :)
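
For illustration only (this is not code from the patch), the idea of
keeping traffic from the same host+port on the same thread can be
sketched as a hash over a flow key. The struct, the function name and
the mixing constants below are all hypothetical, and the sketch assumes
IPv4 and user-space C rather than kernel code:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key: remote host and port of a TCP/UDP packet. */
struct flow_key {
	uint32_t src_addr;	/* IPv4 source address */
	uint16_t src_port;	/* TCP/UDP source port */
};

/*
 * Map a flow to one of nthreads worker queues.  Packets with the same
 * key always map to the same worker, which keeps per-flow ordering.
 * The mixing steps are an arbitrary illustrative choice.
 */
static unsigned int
flow_to_thread(const struct flow_key *k, unsigned int nthreads)
{
	uint32_t h;

	assert(nthreads > 0);
	h = k->src_addr ^ ((uint32_t)k->src_port << 16) ^ k->src_port;
	h ^= h >> 16;
	h *= 0x45d9f3bU;
	h ^= h >> 16;
	return (h % nthreads);
}
```

The mapping is deterministic, so a given host+port pair never migrates
between workers; how evenly flows spread depends on the traffic mix.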

These parameters are needed in loader.conf to test it:

net.isr.direct=0
net.isr.mtdispatch_n_threads=2

I expected things like contention in the upper layers (TCP) to prevent
any performance improvement, but I can't explain what I'm getting
here. While testing the application on a plain kernel, I get approx.
100,000 - 120,000 packets/s per direction (by looking at "netstat 1")
and a similar number of transactions/s in the application. With the
patch I get up to 250,000 packets/s in netstat (3 mtdispatch threads),
but for some weird reason the actual number of transactions processed by
the application drops to less than 1,000 at the beginning (for about 30
seconds), then jumps to close to 100,000 transactions/s, with netstat
also showing a drop to this number of packets. In the first phase, the new
threads (netd0..3) are using almost 100% CPU time; in the second phase I
can't see where the CPU time is going (using top).

I thought this has something to do with NIC moderation (em) but can't
really explain it. The bad performance part (not the jump) is also
visible over the loopback interface.

Any ideas?


--------------enig9D3AA7C6A7FB08F179C61F87
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAknYogoACgkQldnAQVacBcg0rwCeK5aaPe2Al0xFoelvU1IyJXup
9DQAmwRr/BgW8/Q/sBkNmlrJqtJtmvci
=KeAh
-----END PGP SIGNATURE-----

--------------enig9D3AA7C6A7FB08F179C61F87--


From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 13:21:23 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 701AE1065670;
	Sun,  5 Apr 2009 13:21:23 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 4527A8FC12;
	Sun,  5 Apr 2009 13:21:23 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id DAD7946B3B;
	Sun,  5 Apr 2009 09:21:22 -0400 (EDT)
Date: Sun, 5 Apr 2009 14:21:22 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Ivan Voras <ivoras@freebsd.org>
In-Reply-To: <gra7mq$ei8$1@ger.gmane.org>
Message-ID: <alpine.BSF.2.00.0904051414270.12639@fledge.watson.org>
References: <gra7mq$ei8$1@ger.gmane.org>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 13:21:23 -0000


On Sun, 5 Apr 2009, Ivan Voras wrote:

> I'm developing an application that needs a high rate of small TCP 
> transactions on multi-core systems, and I'm hitting a limit where a kernel 
> task, usually swi:net (but it depends on the driver) hits 100% of a CPU at 
> some transactions/s rate and blocks further performance increase even though 
> other cores are 100% idle.

You can find a similar, if possibly more mature, implementation here:

   //depot/projects/rwatson/netisr2/...

I haven't updated it in about six months since I've been waiting for the 
RSS-based flowid support in HEAD to mature.  One of the fundamental problems 
with hashing packets to distribute work is that it involves taking cache 
misses on packet headers, not just once, but twice, which often is one of the 
largest costs in processing packets.  Most modern, interesting 
high-performance network cards can already take the hash in hardware, and you 
want to use that hash to place work where possible.
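
As a rough sketch of that preference (not code from netisr2), work
placement can use a hardware-supplied hash when one is present and fall
back to a software hash, which has to read the packet headers, only
otherwise. The struct below is a hypothetical stand-in, not the real
struct mbuf, and the field and function names are assumptions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-packet metadata; not struct mbuf. */
struct pkt {
	bool     has_flowid;	/* true if the NIC supplied a hash */
	uint32_t flowid;	/* hardware-computed flow hash */
	uint32_t src_addr;	/* header fields, read only on fallback */
	uint16_t src_port;
};

/* Fallback: hashing in software means touching the packet headers. */
static uint32_t
software_hash(const struct pkt *p)
{
	uint32_t h = p->src_addr ^ p->src_port;

	h ^= h >> 16;
	h *= 0x45d9f3bU;
	h ^= h >> 16;
	return (h);
}

/* Choose a CPU, preferring the hash the hardware already computed. */
static unsigned int
pick_cpu(const struct pkt *p, unsigned int ncpus)
{
	uint32_t h;

	assert(ncpus > 0);
	h = p->has_flowid ? p->flowid : software_hash(p);
	return (h % ncpus);
}
```

When the flowid is present, the header fields are never read on the
dispatch path, which is the point of taking the hash in hardware.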

In 8.x, you shouldn't be experiencing high lock contention for the TCP receipt
path when doing bulk transfers, as we use read locking for the tcbinfo lock in
most cases.  In fact, you can get fairly decent scalability even in 7.x
because the regular packet processing path for TCP uses mutual exclusion only
briefly.  However, the current approach does dirty a lot of cache lines,
especially locks and stats, and does not scale well (in 8.x, or at all in 7.x)
if you have lots of short connections.  Also, be aware that if you're
outputting to a single interface or queue, there's a *lot* of lock contention
in the device driver.  Kip Macy has patches to support multiple output queues
on cxgb, which should facilitate support for other drivers too, and the
plan is to get that into 8.0 as well.

The patch above doesn't know about the mbuf packet header flowid yet, but it's
trivial to teach it about that.  I have plans to get back to the netisr2 code
before we finalize 8.0, but have some other stuff in the queue first.  We are
briefly in a period where the number of input queues roughly matches the
number of CPU cores; it's not entirely clear, but we may soon be back in a
situation where CPU core count exceeds queue count, in which case doing
software work placement will continue to be important.  Right now, as long as
your high-performance card supports multiple input queues, we already do
pretty effective work placement by virtue of RSS and multiple ithreads.

Robert N M Watson
Computer Laboratory
University of Cambridge

>
> So I've got an idea and tested it out, but it fails in an unexpected
> way. I'm not very familiar with the network code so I'm probably missing
> something obvious. The idea was to locate where the packet processing
> takes place and offload packets to several new kernel threads. I see
> this can happen in several places - netisr, ip_input and tcp_input, and
> I chose netisr because I thought maybe it would also help other uses
> (routing?). Here's a patch against CURRENT:
>
> http://people.freebsd.org/~ivoras/diffs/mpip.patch
>
> It's fairly simple - starts a configurable number of threads in
> start_netisr(), assigns circular queues to each, and modifies what I
> think are entry points for packets in the non-netisr.direct case. I also
> try to have TCP and UDP traffic from the same host+port processed by the
> same thread. It has some rough edges but I think this is enough to test
> the idea. I know that there are several people officially working in
> this area and I'm not an expert in it so think of it as a weekend hack
> for learning purposes :)
>
> These parameters are needed in loader.conf to test it:
>
> net.isr.direct=0
> net.isr.mtdispatch_n_threads=2
>
> I expected things like the contention in upper layers (TCP) leading to
> not improving performance one bit, but I can't explain what I'm getting
> here. While testing the application on a plain kernel, I get approx.
> 100,000 - 120,000 packets/s per direction (by looking at "netstat 1")
> and a similar number of transactions/s in the application. With the
> patch I get up to 250,000 packets/s in netstat (3 mtdispatch threads),
> but for some weird reason the actual number of transactions processed by
> the application drops to less than 1,000 at the beginning (~~ 30
> seconds), then jumps to close to 100,000 transactions/s, with netstat
> also showing a drop to this number of packets. In the first phase, the new
> threads (netd0..3) are using CPU time almost 100%, in the second phase I
> can't see where the CPU time is going (using top).
>
> I thought this has something to do with NIC moderation (em) but can't
> really explain it. The bad performance part (not the jump) is also
> visible over the loopback interface.
>
> Any ideas?
>
>

From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 13:24:02 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id F3E7D106566C;
	Sun,  5 Apr 2009 13:24:01 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id D0BCA8FC12;
	Sun,  5 Apr 2009 13:24:01 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id 764A246B0C;
	Sun,  5 Apr 2009 09:24:01 -0400 (EDT)
Date: Sun, 5 Apr 2009 14:24:01 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Ivan Voras <ivoras@freebsd.org>
In-Reply-To: <gra7mq$ei8$1@ger.gmane.org>
Message-ID: <alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>
References: <gra7mq$ei8$1@ger.gmane.org>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 13:24:02 -0000


On Sun, 5 Apr 2009, Ivan Voras wrote:

> I thought this has something to do with NIC moderation (em) but can't 
> really explain it. The bad performance part (not the jump) is also visible 
> over the loopback interface.

FYI, if you want high performance, you really want a card supporting multiple 
input queues -- igb, cxgb, mxge, etc.  if_em-only cards are fundamentally less 
scalable in an SMP environment because they require input or output to occur 
only from one CPU at a time.

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 13:35:11 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 36170106564A
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 13:35:11 +0000 (UTC)
	(envelope-from freebsd-net@m.gmane.org)
Received: from ciao.gmane.org (main.gmane.org [80.91.229.2])
	by mx1.freebsd.org (Postfix) with ESMTP id D913A8FC13
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 13:35:10 +0000 (UTC)
	(envelope-from freebsd-net@m.gmane.org)
Received: from list by ciao.gmane.org with local (Exim 4.43)
	id 1LqSV5-0007C9-Ij
	for freebsd-net@freebsd.org; Sun, 05 Apr 2009 13:35:03 +0000
Received: from 93-141-3-137.adsl.net.t-com.hr ([93.141.3.137])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-net@freebsd.org>; Sun, 05 Apr 2009 13:35:03 +0000
Received: from ivoras by 93-141-3-137.adsl.net.t-com.hr with local (Gmexim 0.1
	(Debian)) id 1AlnuQ-0007hv-00
	for <freebsd-net@freebsd.org>; Sun, 05 Apr 2009 13:35:03 +0000
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-net@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Sun, 05 Apr 2009 15:34:26 +0200
Lines: 38
Message-ID: <grac1s$p56$1@ger.gmane.org>
References: <gra7mq$ei8$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature";
	boundary="------------enigE100CC5D7B9CFB0A0C63756C"
X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: 93-141-3-137.adsl.net.t-com.hr
User-Agent: Thunderbird 2.0.0.21 (Windows/20090302)
In-Reply-To: <alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>
X-Enigmail-Version: 0.95.7
Sender: news <news@ger.gmane.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 13:35:11 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enigE100CC5D7B9CFB0A0C63756C
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Robert Watson wrote:
>
> On Sun, 5 Apr 2009, Ivan Voras wrote:
>
>> I thought this has something to do with NIC moderation (em) but
>> can't really explain it. The bad performance part (not the jump) is
>> also visible over the loopback interface.
>
> FYI, if you want high performance, you really want a card supporting
> multiple input queues -- igb, cxgb, mxge, etc.  if_em-only cards are
> fundamentally less scalable in an SMP environment because they require
> input or output to occur only from one CPU at a time.

Makes sense, but on the other hand - I see people are routing at least
250,000 packets per second per direction with these cards, so they
probably aren't the bottleneck (pro/1000 pt on pci-e).


--------------enigE100CC5D7B9CFB0A0C63756C
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAknYs2sACgkQldnAQVacBcgOzACguAsTzdt9DZStuslyOHAti/9J
9noAoPDt1v9OHmV2gx/eYD7cRClVnDMJ
=UEzZ
-----END PGP SIGNATURE-----

--------------enigE100CC5D7B9CFB0A0C63756C--


From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 13:54:20 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 427A6106567F;
	Sun,  5 Apr 2009 13:54:20 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 018AB8FC12;
	Sun,  5 Apr 2009 13:54:20 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id AB86846B89;
	Sun,  5 Apr 2009 09:54:19 -0400 (EDT)
Date: Sun, 5 Apr 2009 14:54:19 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Ivan Voras <ivoras@freebsd.org>
In-Reply-To: <grac1s$p56$1@ger.gmane.org>
Message-ID: <alpine.BSF.2.00.0904051440460.12639@fledge.watson.org>
References: <gra7mq$ei8$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>
	<grac1s$p56$1@ger.gmane.org>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 13:54:20 -0000


On Sun, 5 Apr 2009, Ivan Voras wrote:

>>> I thought this has something to do with NIC moderation (em) but can't 
>>> really explain it. The bad performance part (not the jump) is also visible 
>>> over the loopback interface.
>>
>> FYI, if you want high performance, you really want a card supporting 
>> multiple input queues -- igb, cxgb, mxge, etc.  if_em-only cards are 
>> fundamentally less scalable in an SMP environment because they require 
>> input or output to occur only from one CPU at a time.
>
> Makes sense, but on the other hand - I see people are routing at least 
> 250,000 packets per second per direction with these cards, so they probably 
> aren't the bottleneck (pro/1000 pt on pci-e).

The argument is not that they are slower (although they probably are a bit 
slower), rather that they introduce serialization bottlenecks by requiring 
synchronization between CPUs in order to distribute the work.  Certainly some 
of the scalability issues in the stack are not a result of that, but a good 
number are.

Historically, we've had a number of bottlenecks in, say, the bulk data receive 
and send paths, such as:

- Initial receipt and processing of packets on a single CPU as a result of a
   single input queue from the hardware.  Addressed by using multiple input
   queue hardware with appropriately configured drivers (generally the default
   is to use multiple input queues in 7.x and 8.x for supporting hardware).

- Cache line contention on stats data structures in drivers and various levels
   of the network stack due to bouncing around exclusive ownership of the cache
   line.  ifnet introduces at least a few, but I think most of the interesting
   ones are at the IP and TCP layers for receipt.

- Global locks protecting connection lists, all rwlocks as of 7.1, but not
   necessarily always used read-only for packet processing.  For UDP we do a
   very good job at avoiding write locks, but for TCP in 7.x we still use a
   global write lock, if briefly, for every packet.  There's a change in 8.x to
   use a global read lock for most packets, especially steady state packets,
   but I didn't merge it for 7.2 because it's not well-benchmarked.  Assuming I
   get positive feedback from more people, I will merge it before 7.3.

- If the user application is multi-threaded and receiving from many threads at
   once, we see contention on the file descriptor table lock.  This was
   markedly improved by the file descriptor table locking rewrite in 7.0, but
   we're continuing to look for ways to mitigate this.  A lockless approach
   would be really nice...
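
To illustrate the stats item in the list above, here is a minimal
sketch of per-CPU counters padded to a cache line, so that an increment
on one CPU does not take exclusive ownership of a line another CPU is
writing.  This is not how the stack's stats are actually implemented;
the 64-byte line size, MAXCPU value and all names are assumptions, and
a real version would also need each writer to stay on its own CPU:

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE	64	/* assumed cache line size in bytes */
#define MAXCPU		8	/* hypothetical CPU count for the sketch */

/* One counter per CPU, padded so that no two share a cache line. */
struct pcpu_counter {
	uint64_t value;
	char     pad[CACHE_LINE - sizeof(uint64_t)];
};

struct stat_counter {
	struct pcpu_counter cpu[MAXCPU];
};

static void
stat_init(struct stat_counter *s)
{
	for (unsigned int i = 0; i < MAXCPU; i++)
		s->cpu[i].value = 0;
}

/* Writers touch only the line belonging to their own CPU. */
static void
stat_inc(struct stat_counter *s, unsigned int cpu)
{
	assert(cpu < MAXCPU);
	s->cpu[cpu].value++;
}

/* Readers pay for the summation; reads are assumed to be rare. */
static uint64_t
stat_read(const struct stat_counter *s)
{
	uint64_t sum = 0;

	for (unsigned int i = 0; i < MAXCPU; i++)
		sum += s->cpu[i].value;
	return (sum);
}
```

The trade-off is more memory per counter and a slower read in exchange
for writes that stay local to each CPU.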

On the transmit path, the bottlenecks are similar but different:

- Neither 7.x nor 8.x supports multiple transmit queues as shipped; Kip has
   patches for both that add it for cxgb.  Maintaining ordering here, and
   ideally affinity to the appropriate associated input queue, is important.
   As the patches aren't in the tree yet, or for single-queue drivers,
   contention on the device driver send path and queues can be significant,
   especially for device drivers where the send and receive path are protected
   by the same lock (bge!).

- Stats at various levels in the stack still dirty cache lines.

- We don't acquire, in the common case, any global connection list locks
   during transmit.

- Routing table locks may be an issue.  Kip has patches against 8.x to
   re-introduce inpcb route as well as link layer flow caching.  These are in
   my review queue currently...  In 8.x the global radix tree lock is a
   read-write lock and we use read-locking where possible, but in 7.x it's
   still a mutex.  This probably isn't an MFCable change.

Another change coming in 8.x is increased use of read-mostly locks, rmlocks, 
which avoid writes to shared cache lines for read-acquire, but have a more 
expensive write-acquire.  We're already using this in a few spots, including 
for firewall registration, but need to use it in more.

With a fast CPU, introducing more cores may not necessarily speed up
processing, and might often slow it down, even if all bottlenecks are
eliminated.  Fundamentally, if you have the CPU capacity to do the work on one
CPU, then moving the work to other CPUs is an overhead best avoided,
especially if the device itself forces serialization by having a single
input queue and a single output queue.  However, if we reasonably assume that
core speeds will plateau over time while CPU density increases, software work
placement becomes more important.  And with multi-queue devices, avoiding
writes to common cache lines from multiple CPUs is increasingly possible.

We have a 32-thread MIPS embedded eval board in the Netperf cluster now, which 
we'll begin using for 10gbps testing fairly soon, I hope.  One of its 
properties is that individual threads are decidedly non-zippy compared to, 
say, a 10gbps interface running at line-rate, so it will allow us to explore 
these issues more effectively than we could before.

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 17:25:42 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7BBDC1065733
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 17:25:42 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63901.mail.re1.yahoo.com (web63901.mail.re1.yahoo.com
	[69.147.97.116]) by mx1.freebsd.org (Postfix) with SMTP id 32DC58FC2C
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 17:25:41 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 32282 invoked by uid 60001); 5 Apr 2009 17:25:41 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1238952341; bh=UAZAyL9X+LmTWizZqCNoHVnJcsaEyDhDFBi8LHWr3k8=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=sW2xGljViae/BOIsxjTFYt0a+U3xW8TEo8NNQykZtJM6xXB+xXbA0T8RiNyPfNIlnwsootQbb5s7MKyzkACMd8Q6WFIskIChdedVbkEG1/989Nxf6UvAz/2iwNGMPRPrl0zQyVzYDiNchEK2tPsOPGA0+NiZtFJQTU3/lyPYOmc=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=VXYivRlYRKfb7BXOhi6NBbJvYrQJJlDONsVcX7/B2MFSS1GEQ/PuzwdkKkbAFO0vaKdNi/3Bj95KnAGD0A49hOHBGLoR/+yJhFjRWhO4VhO18ZVi36eD5e1HwcnHbkQHfaaIFpw4RRj3Ys9q6neQbNTJcEKDQx01fVE0itf1n5A=;
Message-ID: <285323.31546.qm@web63901.mail.re1.yahoo.com>
X-YMail-OSG: oBYwuYkVM1lxH_acsk6HjQP98EbR6ehygHnFTfDWr4FVxpITviaKt_UanaK6piXveRnvPR9GqmV5Bizijz4P9KkAG9EGJ67O8l496N4FDMpLfGUVVsUDGSzVA0jBn0jelaXCb9I1am6asd80T076a5HAH0ZVmtySdP2txdo21JnwgFkpUdhuMmSmQ3yaPYcEuTVFR8a42h8hNfaL7ZB8xRK6dNGLLAjBwctaCATqAIjATMndTK.pSuIe1z81DCsFNU9euN9j2Q0wuPY3t7XNGl5nk.QnqqMoYdA4a.FhPoLceHrDrj2MEO0Ba4SA
Received: from [98.242.222.229] by web63901.mail.re1.yahoo.com via HTTP;
	Sun, 05 Apr 2009 10:25:41 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Sun, 5 Apr 2009 10:25:41 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: Ivan Voras <ivoras@freebsd.org>, Robert Watson <rwatson@FreeBSD.org>
In-Reply-To: <alpine.BSF.2.00.0904051440460.12639@fledge.watson.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-net@freebsd.org
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 17:25:44 -0000


--- On Sun, 4/5/09, Robert Watson <rwatson@FreeBSD.org> wrote:

> From: Robert Watson <rwatson@FreeBSD.org>
> Subject: Re: Advice on a multithreaded netisr  patch?
> To: "Ivan Voras" <ivoras@freebsd.org>
> Cc: freebsd-net@freebsd.org
> Date: Sunday, April 5, 2009, 9:54 AM
> On Sun, 5 Apr 2009, Ivan Voras wrote:
> 
> >>> I thought this has something to do with NIC moderation (em) but can't
> >>> really explain it. The bad performance part (not the jump) is also
> >>> visible over the loopback interface.
> >>
> >> FYI, if you want high performance, you really want a card supporting
> >> multiple input queues -- igb, cxgb, mxge, etc.  if_em-only cards are
> >> fundamentally less scalable in an SMP environment because they require
> >> input or output to occur only from one CPU at a time.
> >
> > Makes sense, but on the other hand - I see people are routing at least
> > 250,000 packets per second per direction with these cards, so they
> > probably aren't the bottleneck (pro/1000 pt on pci-e).
>
> The argument is not that they are slower (although they probably are a bit
> slower), rather that they introduce serialization bottlenecks by requiring
> synchronization between CPUs in order to distribute the work.  Certainly some
> of the scalability issues in the stack are not a result of that, but a good
> number are.
>
> Historically, we've had a number of bottlenecks in, say, the bulk data
> receive and send paths, such as:
>
> - Initial receipt and processing of packets on a single CPU as a result of a
>   single input queue from the hardware.  Addressed by using multiple input
>   queue hardware with appropriately configured drivers (generally the default
>   is to use multiple input queues in 7.x and 8.x for supporting hardware).
>
> - Cache line contention on stats data structures in drivers and various
>   levels of the network stack due to bouncing around exclusive ownership of
>   the cache line.  ifnet introduces at least a few, but I think most of the
>   interesting ones are at the IP and TCP layers for receipt.
>
> - Global locks protecting connection lists, all rwlocks as of 7.1, but not
>   necessarily always used read-only for packet processing.  For UDP we do a
>   very good job at avoiding write locks, but for TCP in 7.x we still use a
>   global write lock, if briefly, for every packet.  There's a change in 8.x
>   to use a global read lock for most packets, especially steady state
>   packets, but I didn't merge it for 7.2 because it's not well-benchmarked.
>   Assuming I get positive feedback from more people, I will merge it before
>   7.3.
>
> - If the user application is multi-threaded and receiving from many threads
>   at once, we see contention on the file descriptor table lock.  This was
>   markedly improved by the file descriptor table locking rewrite in 7.0, but
>   we're continuing to look for ways to mitigate this.  A lockless approach
>   would be really nice...
>
> On the transmit path, the bottlenecks are similar but different:
>
> - Neither 7.x nor 8.x supports multiple transmit queues as shipped; Kip has
>   patches for both that add it for cxgb.  Maintaining ordering here, and
>   ideally affinity to the appropriate associated input queue, is important.
>   As the patches aren't in the tree yet, or for single-queue drivers,
>   contention on the device driver send path and queues can be significant,
>   especially for device drivers where the send and receive path are
>   protected by the same lock (bge!).


I'm curious as to your assertion that hardware transmit queues are a
big win. You're really just loading a transmit ring well ahead of
actual transmission; there's no need to force a "start" for each
packet queued. You then have more overhead managing the multiple
queues: more memory used, more CPU cache needed, more interrupts
(perhaps), and the overhead of generating the flowid. It seems to me
that a more efficient method of transmitting, such as offloading the
transmit workload to a kernel task, would be more effective than using
multiple transmit queues. All the source thread has to do is queue
the packet and get out.

As an aside, why is Kip doing development on a Chelsio card rather
than a more mainstream product such as Intel or Broadcom that would
generate more widespread interest?

Barney


From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 17:27:25 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0F2C41065691
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 17:27:25 +0000 (UTC)
	(envelope-from bms@incunabulum.net)
Received: from out2.smtp.messagingengine.com (out2.smtp.messagingengine.com
	[66.111.4.26]) by mx1.freebsd.org (Postfix) with ESMTP id D86F58FC0A
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 17:27:24 +0000 (UTC)
	(envelope-from bms@incunabulum.net)
Received: from compute2.internal (compute2.internal [10.202.2.42])
	by out1.messagingengine.com (Postfix) with ESMTP id 19135311DBC;
	Sun,  5 Apr 2009 13:27:24 -0400 (EDT)
Received: from heartbeat1.messagingengine.com ([10.202.2.160])
	by compute2.internal (MEProxy); Sun, 05 Apr 2009 13:27:24 -0400
X-Sasl-enc: 2IloMwDbXowInNaEkcNP5WFSi17ro/YyVDin5YuvzBKy 1238952443
Received: from anglepoise.lon.incunabulum.net
	(82-35-112-254.cable.ubr07.dals.blueyonder.co.uk [82.35.112.254])
	by mail.messagingengine.com (Postfix) with ESMTPSA id EDE222DED8;
	Sun,  5 Apr 2009 13:27:22 -0400 (EDT)
Message-ID: <49D8E9F8.7090800@incunabulum.net>
Date: Sun, 05 Apr 2009 18:27:20 +0100
From: Bruce Simpson <bms@incunabulum.net>
User-Agent: Thunderbird 2.0.0.21 (X11/20090321)
MIME-Version: 1.0
To: Upakul Barkakaty <upakul@gmail.com>
References: <bb58ac4d0904020043m63a10be8kcbed87f8073471b4@mail.gmail.com>
In-Reply-To: <bb58ac4d0904020043m63a10be8kcbed87f8073471b4@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org
Subject: Re: Multicast routing
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 17:27:25 -0000

Upakul Barkakaty wrote:
> Hi all,
>
> I was trying to setup a multicast tunneling setup with freebsd, with the
> mrouted utility. However, my multicast router doesnt seem to be forwarding
> those multicast packets.
>
> It would really be helpful if someone could help me with the setup or the
> mrouted.conf file contents.
>
> Thanks in anticipation.
>
>   

Please try the mcast-tools port to confirm that multicast forwarding works.
There are tools in that port which will allow you to run basic UDP stream
tests as well as install static entries in the forwarding cache.

The most likely culprit is a network interface which does not support
ALLMULTI.

Also, DVMRP has been dead for years, so avoid mrouted; try a PIM
implementation instead, e.g. XORP or pimsd.

thanks
BMS

From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 17:29:45 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5877510656C0
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 17:29:45 +0000 (UTC)
	(envelope-from freebsd-net@m.gmane.org)
Received: from ciao.gmane.org (main.gmane.org [80.91.229.2])
	by mx1.freebsd.org (Postfix) with ESMTP id C95E58FC18
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 17:29:44 +0000 (UTC)
	(envelope-from freebsd-net@m.gmane.org)
Received: from list by ciao.gmane.org with local (Exim 4.43)
	id 1LqWAA-0007pK-FU
	for freebsd-net@freebsd.org; Sun, 05 Apr 2009 17:29:42 +0000
Received: from 93-141-3-137.adsl.net.t-com.hr ([93.141.3.137])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-net@freebsd.org>; Sun, 05 Apr 2009 17:29:42 +0000
Received: from ivoras by 93-141-3-137.adsl.net.t-com.hr with local (Gmexim 0.1
	(Debian)) id 1AlnuQ-0007hv-00
	for <freebsd-net@freebsd.org>; Sun, 05 Apr 2009 17:29:42 +0000
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-net@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Sun, 05 Apr 2009 19:29:10 +0200
Lines: 74
Message-ID: <grappq$tsg$1@ger.gmane.org>
References: <gra7mq$ei8$1@ger.gmane.org>	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>	<grac1s$p56$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051440460.12639@fledge.watson.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature";
	boundary="------------enig1FAE96532F0E09824EF6C434"
X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: 93-141-3-137.adsl.net.t-com.hr
User-Agent: Thunderbird 2.0.0.21 (Windows/20090302)
In-Reply-To: <alpine.BSF.2.00.0904051440460.12639@fledge.watson.org>
X-Enigmail-Version: 0.95.7
Sender: news <news@ger.gmane.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 17:29:46 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig1FAE96532F0E09824EF6C434
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Robert Watson wrote:
>
> On Sun, 5 Apr 2009, Ivan Voras wrote:
>
>>>> I thought this has something to deal with NIC moderation (em) but
>>>> can't really explain it. The bad performance part (not the jump) is
>>>> also visible over the loopback interface.
>>>
>>> FYI, if you want high performance, you really want a card supporting
>>> multiple input queues -- igb, cxgb, mxge, etc.  if_em-only cards are
>>> fundamentally less scalable in an SMP environment because they
>>> require input or output to occur only from one CPU at a time.
>>
>> Makes sense, but on the other hand - I see people are routing at least
>> 250,000 packets per seconds per direction with these cards, so they
>> probably aren't the bottleneck (pro/1000 pt on pci-e).
>
> The argument is not that they are slower (although they probably are a
> bit slower), rather that they introduce serialization bottlenecks by
> requiring synchronization between CPUs in order to distribute the work.
> Certainly some of the scalability issues in the stack are not a result
> of that, but a good number are.

I'd like to understand more. If (in netisr) I have an mbuf with headers,
has this data already been transferred from the card or is it magically
"not here yet"?

In the first case, the packet reception code path is not changed until
the packet is queued on a thread, on which it's handled in the future
(or is the influence of "other" data like timers and internal TCP
reassembly buffers so large?). In the second case, why?

> Historically, we've had a number of bottlenecks in, say, the bulk data
> receive and send paths, such as:
>
> - Initial receipt and processing of packets on a single CPU as a result
>   of a single input queue from the hardware.  Addressed by using
>   multiple input queue hardware with appropriately configured drivers
>   (generally the default is to use multiple input queues in 7.x and 8.x
>   for supporting hardware).

As the card and the OS can already process many packets per second for
something as complex as routing (http://www.tancsa.com/blast.html),
and TCP chokes swi:net at 100% of a core, isn't this an indication that
there's certainly more room for improvement even with single-queue,
old-fashioned NICs?


--------------enig1FAE96532F0E09824EF6C434
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAknY6mYACgkQldnAQVacBcjOfwCeOKtS8skAua5SW8DwMiFIdozi
TFMAn0LkN2TD0wVJ9tkz9rnP6x3BSRjR
=8O6z
-----END PGP SIGNATURE-----

--------------enig1FAE96532F0E09824EF6C434--


From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 17:40:19 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1EF4E1065691;
	Sun,  5 Apr 2009 17:40:19 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id ED0138FC1C;
	Sun,  5 Apr 2009 17:40:18 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id A409646B8F;
	Sun,  5 Apr 2009 13:40:18 -0400 (EDT)
Date: Sun, 5 Apr 2009 18:40:18 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Barney Cordoba <barney_cordoba@yahoo.com>
In-Reply-To: <285323.31546.qm@web63901.mail.re1.yahoo.com>
Message-ID: <alpine.BSF.2.00.0904051837500.12639@fledge.watson.org>
References: <285323.31546.qm@web63901.mail.re1.yahoo.com>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 17:40:20 -0000

On Sun, 5 Apr 2009, Barney Cordoba wrote:

> I'm curious as to your assertion that hardware transmit queues are a big 
> win. You're really just loading a transmit ring well ahead of actual 
> transmission; there's no need to force a "start" for each packet queued. You 
> then have more overhead managing the multiple queues; more memory used, 
> more cpu cache needed, more interrupts (perhaps), overhead generating the 
> flowid. It seems to me that a more efficient method of transmitting, such as 
> offloading the transmit workload to a kernel task, would be more effective 
> than using multiple transmit queues. All the source thread has to do is 
> queue the packet and get out.

When using multiple cores, we've observed significant contention on the
transmit-side locks protecting a single output queue; when multiple queues are
used, that contention is avoided.  The lock only covers the queue, but the
overhead of taking a single high-contention lock twice for every packet
(enqueue, later dequeue) is significant at high pps and with many cores.
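
As a rough illustration of the difference, here is a minimal userspace
sketch using pthread mutexes. It is not the actual driver code: NTXQ,
struct pkt, struct txq and the txq_* helpers are hypothetical names made
up for this example. Each queue gets its own lock, and the queue is picked
from the flow id, so threads sending on different flows rarely contend on
the same lock while packets of one flow stay in order.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NTXQ 4			/* hypothetical number of transmit queues */

struct pkt {
	struct pkt *next;
	uint32_t flowid;	/* hypothetical per-flow hash */
};

struct txq {
	pthread_mutex_t lock;	/* protects only this queue */
	struct pkt *head;
	struct pkt *tail;
};

static struct txq txqs[NTXQ];

/* Initialize every queue and its private lock. */
static void
txq_init(void)
{
	for (size_t i = 0; i < NTXQ; i++) {
		int error = pthread_mutex_init(&txqs[i].lock, NULL);
		assert(error == 0);
		txqs[i].head = txqs[i].tail = NULL;
	}
}

/* Same flow id always maps to the same queue, preserving per-flow order. */
static struct txq *
txq_select(uint32_t flowid)
{
	return (&txqs[flowid % NTXQ]);
}

/* First lock acquisition per packet: append to the tail of its queue. */
static void
txq_enqueue(struct pkt *p)
{
	struct txq *q = txq_select(p->flowid);

	p->next = NULL;
	pthread_mutex_lock(&q->lock);
	if (q->tail != NULL)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
	pthread_mutex_unlock(&q->lock);
}

/* Second lock acquisition per packet: remove from the head, or NULL. */
static struct pkt *
txq_dequeue(struct txq *q)
{
	struct pkt *p;

	pthread_mutex_lock(&q->lock);
	p = q->head;
	if (p != NULL) {
		q->head = p->next;
		if (q->head == NULL)
			q->tail = NULL;
	}
	pthread_mutex_unlock(&q->lock);
	return (p);
}
```

With NTXQ set to 1, every sender and the dequeuing side serialize on one
mutex, which is the single-queue case described above.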

> As an aside, why is Kip doing development on a Chelsio card rather than a 
> more mainstream product such as Intel or Broadcom that would generate more 
> widespread interest?

Because they paid him to write their driver?  :-)

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 21:24:21 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A34541065670;
	Sun,  5 Apr 2009 21:24:21 +0000 (UTC) (envelope-from oberman@es.net)
Received: from mailgw.es.net (mail1.es.net [IPv6:2001:400:201:1::2])
	by mx1.freebsd.org (Postfix) with ESMTP id 8D0558FC18;
	Sun,  5 Apr 2009 21:24:21 +0000 (UTC) (envelope-from oberman@es.net)
Received: from ptavv.es.net (ptavv.es.net [IPv6:2001:400:910::29])
	by mailgw.es.net (8.14.3/8.14.3) with ESMTP id n35LOKvF001485
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT);
	Sun, 5 Apr 2009 14:24:20 -0700
Received: from ptavv.es.net (ptavv.es.net [127.0.0.1])
	by ptavv.es.net (Tachyon Server) with ESMTP id 31A311CC50;
	Sun,  5 Apr 2009 14:24:20 -0700 (PDT)
To: barney_cordoba@yahoo.com
In-reply-to: Your message of "Sun, 05 Apr 2009 10:25:41 PDT."
	<285323.31546.qm@web63901.mail.re1.yahoo.com> 
Date: Sun, 05 Apr 2009 14:24:20 -0700
From: "Kevin Oberman" <oberman@es.net>
Message-Id: <20090405212420.31A311CC50@ptavv.es.net>
Cc: freebsd-net@freebsd.org, Robert Watson <rwatson@FreeBSD.org>,
	Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr patch? 
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 21:24:22 -0000

> Date: Sun, 5 Apr 2009 10:25:41 -0700 (PDT)
> From: Barney Cordoba <barney_cordoba@yahoo.com>
> Sender: owner-freebsd-net@freebsd.org
> 
> 
> As an aside, why is Kip doing development on a Chelsio card rather
> than a more mainstream product such as Intel or Broadcom that would
> generate more widespread interest?

Because Chelsio pays him better than the makers of the "more mainstream"
products. And, at 10GE, Chelsio and Myricom seem to have stronger
products than others. (Just my opinion and not that of The US Dept. of
Energy, The University of California, or Lawrence Berkeley National
Labs.)

I just hope Kip's legal problems are resolved soon. FreeBSD really needs
him.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman@es.net			Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751

From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 21:32:37 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A1F261065674
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 21:32:37 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63906.mail.re1.yahoo.com (web63906.mail.re1.yahoo.com
	[69.147.97.121]) by mx1.freebsd.org (Postfix) with SMTP id 464DD8FC13
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 21:32:37 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 73900 invoked by uid 60001); 5 Apr 2009 21:32:36 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1238967156; bh=RuMlGk4vsuJPubhuisTxnMKgvyShTshtnqEpj5W2r54=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=P/llS3UpA5Qb0S9UwIZWfjjaBaR6H2Ce94p7tzLdIHmNa2+i4Rq67I8A3fBIMWxM/EwGB05WPjYnSy1rkkGTivYP7gVILqwwJ+jWh6V2MS3Q+60WbIiQK6+VRR6cfaZYSFB/LbkxGN4iQZ/5oTFZ9saFEYrmQ9IP3qYN70RBCLQ=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=W+IN4VS63VntDErPfLsiv3lRYwsXs6QIR96cKc2yu9mq7uEefnN3DwLKVLhPmY88U57O2ahijH02ogfl13Kw2ZzvrtTTO8XrRuk0M5u0KFum1VgFIXq2g+hHT5tdBm1yzOiA6x/f0Ls7RIzJtNbn3hKw+GPqNN8Ug/mI/paZyFc=;
Message-ID: <496315.72401.qm@web63906.mail.re1.yahoo.com>
X-YMail-OSG: AzBgvPAVM1na3Ij0KuHddDM6Qqwd.9SDzBv9y2iStBH2xMLkJ97NXEk4eDkEmaoaAXpAdncwOex8Tz4U2BRrxf6Gn7IFykAOFniSkGAuv7iD9HKbinhKX1GP0lQi.iS2Ku.JIO84z9yqySd0355.QdtB7B72rF2bz58PgZ.qvKVUNP3Rn3aV6uQPPS.fjRfIWOn8eb_VUQ7u3ozAcUKHNysHDX_PP2gXnP.7tx0RVEarLCmZqObbU7NA.gEmZ5wPIx.8SEvhCi6mzTpqg9j4TPNpHyBjOzoqppKmwEwpUbh77NfpEue7YLPqwsQrHfEGlNw0bDLrA8GUQRkUBw--
Received: from [98.242.222.229] by web63906.mail.re1.yahoo.com via HTTP;
	Sun, 05 Apr 2009 14:32:36 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Sun, 5 Apr 2009 14:32:36 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: Kevin Oberman <oberman@es.net>
In-Reply-To: <20090405212420.31A311CC50@ptavv.es.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-net@freebsd.org, Robert Watson <rwatson@FreeBSD.org>,
	Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 21:32:37 -0000


--- On Sun, 4/5/09, Kevin Oberman <oberman@es.net> wrote:

> From: Kevin Oberman <oberman@es.net>
> Subject: Re: Advice on a multithreaded netisr patch?
> To: barney_cordoba@yahoo.com
> Cc: "Ivan Voras" <ivoras@freebsd.org>, "Robert Watson" <rwatson@FreeBSD.org>, freebsd-net@freebsd.org
> Date: Sunday, April 5, 2009, 5:24 PM
> > Date: Sun, 5 Apr 2009 10:25:41 -0700 (PDT)
> > From: Barney Cordoba <barney_cordoba@yahoo.com>
> > Sender: owner-freebsd-net@freebsd.org
> > 
> > 
> > As an aside, why is Kip doing development on a Chelsio card rather
> > than a more mainstream product such as Intel or Broadcom that would
> > generate more widespread interest?
> 
> Because Chelsio pays him better than the makers of the "more mainstream"
> products. And, at 10GE, Chelsio and Myricom seem to have stronger
> products than others. (Just my opinion and not that of The US Dept. of
> Energy, The University of California, or Lawrence Berkeley National
> Labs.)

Sadly, that's the small-picture view that has plagued FreeBSD for
the longest time. The bigger picture is that big OEMs aren't going
to use Chelsio cards, and big OEMs running FreeBSD instead of Linux
mean more testers, more hardware, more code give-backs and more
money for the project.

You don't really know how good or bad Intel or Broadcom is because
you don't have good drivers for the cards. Unfortunately Intel does
things ass-backwards, by putting out crap "sample" drivers that make
their cards look like garbage. Maybe they are garbage, but you'd think
they'd be a bit smarter. They can certainly afford more than Chelsio.

Barney


From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 21:37:27 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5D07A106566C
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 21:37:27 +0000 (UTC)
	(envelope-from sthaug@nethelp.no)
Received: from bizet.nethelp.no (bizet.nethelp.no [195.1.209.33])
	by mx1.freebsd.org (Postfix) with SMTP id 60FB78FC17
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 21:37:26 +0000 (UTC)
	(envelope-from sthaug@nethelp.no)
Received: (qmail 30249 invoked from network); 5 Apr 2009 21:10:44 -0000
Received: from bizet.nethelp.no (HELO localhost) (195.1.209.33)
	by bizet.nethelp.no with SMTP; 5 Apr 2009 21:10:44 -0000
Date: Sun, 05 Apr 2009 23:10:44 +0200 (CEST)
Message-Id: <20090405.231044.74688369.sthaug@nethelp.no>
To: freebsd-net@freebsd.org
From: sthaug@nethelp.no
X-Mailer: Mew version 3.3 on Emacs 21.3 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Subject: IPv6 window scaling factor always 1 on initial SYN
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 21:37:28 -0000

On 7-STABLE, with kern.ipc.maxsockbuf=2621440, both sides set a window
scaling factor of 6 (i.e. SYN wscale 6, SYN-ACK wscale 6) using IPv4.

With the same value of kern.ipc.maxsockbuf, using IPv6, the side which
sends the initial SYN sets a window scaling factor of only 1, while
the other side sets a scaling factor of 6 in the SYN-ACK. This will
obviously limit throughput in many cases.
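
To put a number on that limit: the TCP window field is 16 bits, so the
largest window a receiver can advertise is 65535 << wscale bytes, and a
sender cannot have more than one window in flight per round trip. A
minimal sketch of that bound follows; the function names and the 50 ms
RTT used in the note below are assumptions for illustration, not values
taken from the trace.

```c
#include <assert.h>
#include <stdint.h>

#define TCP_MAXWIN 65535	/* largest value of the 16-bit window field */

/* Largest receive window, in bytes, that can be advertised with wscale. */
static uint64_t
max_window_bytes(unsigned wscale)
{
	return ((uint64_t)TCP_MAXWIN << wscale);
}

/*
 * Upper bound on throughput in bits per second, assuming at most one
 * full window is delivered per round trip of rtt_ms milliseconds.
 */
static uint64_t
max_throughput_bps(unsigned wscale, unsigned rtt_ms)
{
	assert(rtt_ms > 0);
	return (max_window_bytes(wscale) * 8 * 1000 / rtt_ms);
}
```

For a hypothetical 50 ms RTT, max_throughput_bps(1, 50) gives 20971200
bit/s (about 21 Mbit/s), while max_throughput_bps(6, 50) gives 671078400
bit/s.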

In both cases net.inet.tcp.rfc1323=1.

Anybody know why IPv6 behaves differently here?

tcpdump example:

22:20:37.282415 IP 193.75.4.50.53981 > 193.75.110.66.5555: S 1580765626:1580765626(0) win 65535 <mss 1460,nop,wscale 6,sackOK,timestamp 661320721 0>
22:20:37.282442 IP 193.75.110.66.5555 > 193.75.4.50.53981: S 1408884711:1408884711(0) ack 1580765627 win 65535 <mss 1460,nop,wscale 6,sackOK,timestamp 1581013561 661320721>

22:21:49.749586 IP6 2001:8c0:9a00:1::2.53983 > 2001:8c0:8500:1::2.5555: S 565631163:565631163(0) win 65535 <mss 1440,nop,wscale 1,sackOK,timestamp 661393190 0>
22:21:49.749633 IP6 2001:8c0:8500:1::2.5555 > 2001:8c0:9a00:1::2.53983: S 627173961:627173961(0) ack 565631164 win 65535 <mss 1440,nop,wscale 6,sackOK,timestamp 8

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 21:50:07 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 4E4411065670
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 21:50:07 +0000 (UTC)
	(envelope-from bzeeb-lists@lists.zabbadoz.net)
Received: from mail.cksoft.de (mail.cksoft.de [195.88.108.3])
	by mx1.freebsd.org (Postfix) with ESMTP id 084C98FC1C
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 21:50:06 +0000 (UTC)
	(envelope-from bzeeb-lists@lists.zabbadoz.net)
Received: from localhost (amavis.fra.cksoft.de [192.168.74.71])
	by mail.cksoft.de (Postfix) with ESMTP id A09CD41C6FC;
	Sun,  5 Apr 2009 23:50:05 +0200 (CEST)
X-Virus-Scanned: amavisd-new at cksoft.de
Received: from mail.cksoft.de ([195.88.108.3])
	by localhost (amavis.fra.cksoft.de [192.168.74.71]) (amavisd-new,
	port 10024)
	with ESMTP id vVdtwUB0c+OZ; Sun,  5 Apr 2009 23:50:05 +0200 (CEST)
Received: by mail.cksoft.de (Postfix, from userid 66)
	id 3FFE941C6F2; Sun,  5 Apr 2009 23:50:05 +0200 (CEST)
Received: from maildrop.int.zabbadoz.net (maildrop.int.zabbadoz.net
	[10.111.66.10])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mail.int.zabbadoz.net (Postfix) with ESMTP id 8844D4448E6;
	Sun,  5 Apr 2009 21:49:50 +0000 (UTC)
Date: Sun, 5 Apr 2009 21:49:50 +0000 (UTC)
From: "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>
X-X-Sender: bz@maildrop.int.zabbadoz.net
To: sthaug@nethelp.no
In-Reply-To: <20090405.231044.74688369.sthaug@nethelp.no>
Message-ID: <20090405214757.E15361@maildrop.int.zabbadoz.net>
References: <20090405.231044.74688369.sthaug@nethelp.no>
X-OpenPGP-Key: 0x14003F198FEFA3E77207EE8D2B58B8F83CCF1842
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org
Subject: Re: IPv6 window scaling factor always 1 on initial SYN
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 21:50:07 -0000

On Sun, 5 Apr 2009, sthaug@nethelp.no wrote:

> On 7-STABLE, with kern.ipc.maxsockbuf=2621440, both sides set a window
> scaling factor of 6 (i.e. SYN wscale 6, SYN-ACK wscale 6) using IPv4.
>
> With the same value of kern.ipc.maxsockbuf, using IPv6, the side which
> sends the initial SYN sets a window scaling factor of only 1, while
> the other side sets a scaling factor of 6 in the SYN-ACK. This will
> obviously limit throughput in many cases.
>
> In both cases net.inet.tcp.rfc1323=1.
>
> Anybody know why IPv6 behaves differently here?
>
> tcpdump example:
>
> 22:20:37.282415 IP 193.75.4.50.53981 > 193.75.110.66.5555: S 1580765626:1580765626(0) win 65535 <mss 1460,nop,wscale 6,sackOK,timestamp 661320721 0>
> 22:20:37.282442 IP 193.75.110.66.5555 > 193.75.4.50.53981: S 1408884711:1408884711(0) ack 1580765627 win 65535 <mss 1460,nop,wscale 6,sackOK,timestamp 1581013561 661320721>
>
> 22:21:49.749586 IP6 2001:8c0:9a00:1::2.53983 > 2001:8c0:8500:1::2.5555: S 565631163:565631163(0) win 65535 <mss 1440,nop,wscale 1,sackOK,timestamp 661393190 0>
> 22:21:49.749633 IP6 2001:8c0:8500:1::2.5555 > 2001:8c0:9a00:1::2.53983: S 627173961:627173961(0) ack 565631164 win 65535 <mss 1440,nop,wscale 6,sackOK,timestamp 8

I think the answer to that is in sys/netinet/tcp_usrreq.c, in the
functions tcp_connect:

        /*
         * Compute window scaling to request:
         * Scale to fit into sweet spot.  See tcp_syncache.c.
         * XXX: This should move to tcp_output().
         */
        while (tp->request_r_scale < TCP_MAX_WINSHIFT &&
            (TCP_MAXWIN << tp->request_r_scale) < sb_max)
                                                  ^^^^^^
                tp->request_r_scale++;


and tcp6_connect:

        /* Compute window scaling to request.  */
        while (tp->request_r_scale < TCP_MAX_WINSHIFT &&
            (TCP_MAXWIN << tp->request_r_scale) < so->so_rcv.sb_hiwat)
                                                  ^^^^^^^^^^^^^^^^^^^
                tp->request_r_scale++;


I'll have to check why they are un-equal...
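
For what it's worth, the two loops can be replayed in userspace to see
that they account for the observed values. This is only a sketch:
request_r_scale() is a hypothetical helper wrapping the loop quoted
above, TCP_MAXWIN and TCP_MAX_WINSHIFT are copied from netinet/tcp.h, and
the 65536 used for so_rcv.sb_hiwat is an assumption (the usual
net.inet.tcp.recvspace default), not something taken from the report.

```c
#include <assert.h>
#include <stdint.h>

#define TCP_MAXWIN		65535	/* from netinet/tcp.h */
#define TCP_MAX_WINSHIFT	14	/* from netinet/tcp.h */

/*
 * Hypothetical helper: smallest shift such that TCP_MAXWIN << shift
 * is no longer below limit, capped at TCP_MAX_WINSHIFT.
 */
static unsigned
request_r_scale(uint64_t limit)
{
	unsigned scale = 0;

	while (scale < TCP_MAX_WINSHIFT &&
	    ((uint64_t)TCP_MAXWIN << scale) < limit)
		scale++;
	return (scale);
}
```

Assuming sb_max tracks kern.ipc.maxsockbuf, request_r_scale(2621440)
returns 6, matching the IPv4 SYN, and request_r_scale(65536) returns 1,
matching the IPv6 SYN.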

/bz

-- 
Bjoern A. Zeeb                      The greatest risk is not taking one.

From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 22:05:07 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A78561065702
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 22:05:07 +0000 (UTC)
	(envelope-from bzeeb-lists@lists.zabbadoz.net)
Received: from mail.cksoft.de (mail.cksoft.de [195.88.108.3])
	by mx1.freebsd.org (Postfix) with ESMTP id 3687E8FC18
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 22:05:07 +0000 (UTC)
	(envelope-from bzeeb-lists@lists.zabbadoz.net)
Received: from localhost (amavis.fra.cksoft.de [192.168.74.71])
	by mail.cksoft.de (Postfix) with ESMTP id 6021F41C75E;
	Mon,  6 Apr 2009 00:05:06 +0200 (CEST)
X-Virus-Scanned: amavisd-new at cksoft.de
Received: from mail.cksoft.de ([195.88.108.3])
	by localhost (amavis.fra.cksoft.de [192.168.74.71]) (amavisd-new,
	port 10024)
	with ESMTP id G0fuCDoPZrvH; Mon,  6 Apr 2009 00:05:05 +0200 (CEST)
Received: by mail.cksoft.de (Postfix, from userid 66)
	id E9F2241C75D; Mon,  6 Apr 2009 00:05:05 +0200 (CEST)
Received: from maildrop.int.zabbadoz.net (maildrop.int.zabbadoz.net
	[10.111.66.10])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mail.int.zabbadoz.net (Postfix) with ESMTP id 6EEAE4448E6;
	Sun,  5 Apr 2009 22:02:04 +0000 (UTC)
Date: Sun, 5 Apr 2009 22:02:04 +0000 (UTC)
From: "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>
X-X-Sender: bz@maildrop.int.zabbadoz.net
To: sthaug@nethelp.no
In-Reply-To: <20090405214757.E15361@maildrop.int.zabbadoz.net>
Message-ID: <20090405215842.C15361@maildrop.int.zabbadoz.net>
References: <20090405.231044.74688369.sthaug@nethelp.no>
	<20090405214757.E15361@maildrop.int.zabbadoz.net>
X-OpenPGP-Key: 0x14003F198FEFA3E77207EE8D2B58B8F83CCF1842
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org
Subject: Re: IPv6 window scaling factor always 1 on initial SYN
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 22:05:07 -0000

On Sun, 5 Apr 2009, Bjoern A. Zeeb wrote:

> On Sun, 5 Apr 2009, sthaug@nethelp.no wrote:
>
>> On 7-STABLE, with kern.ipc.maxsockbuf=2621440, both sides set a window
>> scaling factor of 6 (i.e. SYN wscale 6, SYN-ACK wscale 6) using IPv4.
>> 
>> With the same value of kern.ipc.maxsockbuf, using IPv6, the side which
>> sends the initial SYN sets a window scaling factor of only 1, while
>> the other side sets a scaling factor of 6 in the SYN-ACK. This will
>> obviously limit throughput in many cases.
>> 
>> In both cases net.inet.tcp.rfc1323=1.
>> 
>> Anybody know why IPv6 behaves differently here?
>> 
>> tcpdump example:
>> 
>> 22:20:37.282415 IP 193.75.4.50.53981 > 193.75.110.66.5555: S 
>> 1580765626:1580765626(0) win 65535 <mss 1460,nop,wscale 6,sackOK,timestamp 
>> 661320721 0>
>> 22:20:37.282442 IP 193.75.110.66.5555 > 193.75.4.50.53981: S 
>> 1408884711:1408884711(0) ack 1580765627 win 65535 <mss 1460,nop,wscale 
>> 6,sackOK,timestamp 1581013561 661320721>
>> 
>> 22:21:49.749586 IP6 2001:8c0:9a00:1::2.53983 > 2001:8c0:8500:1::2.5555: S 
>> 565631163:565631163(0) win 65535 <mss 1440,nop,wscale 1,sackOK,timestamp 
>> 661393190 0>
>> 22:21:49.749633 IP6 2001:8c0:8500:1::2.5555 > 2001:8c0:9a00:1::2.53983: S 
>> 627173961:627173961(0) ack 565631164 win 65535 <mss 1440,nop,wscale 
>> 6,sackOK,timestamp 8
>
> I think the answer to that is in sys/netinet/tcp_usrreq.c in the
> functions:
> tcp_connect
>
>         /*
>          * Compute window scaling to request:
>          * Scale to fit into sweet spot.  See tcp_syncache.c.
>          * XXX: This should move to tcp_output().
>          */
>         while (tp->request_r_scale < TCP_MAX_WINSHIFT &&
>             (TCP_MAXWIN << tp->request_r_scale) < sb_max)
>
>                                                    ^^^^^^
>
>                 tp->request_r_scale++;
>
>
> and tcp6_connect
>
>         /* Compute window scaling to request.  */
>         while (tp->request_r_scale < TCP_MAX_WINSHIFT &&
>             (TCP_MAXWIN << tp->request_r_scale) < so->so_rcv.sb_hiwat)
>
>                                                    ^^^^^^^^^^^^^^^^^^^
>
>                 tp->request_r_scale++;
>
>
> I'll have to check why they are un-equal...


Ok, both versions had:	< so->so_rcv.sb_hiwat)

http://svn.freebsd.org/viewvc/base?view=revision&revision=166403

changed it for IPv4 the first time,

http://svn.freebsd.org/viewvc/base?view=revision&revision=172795

changed it a second time for IPv4.

No one changed the IPv6 version.

The syncache already seems to do it for both v4/v6 (common code).

Can you try changing it to < sb_max) for IPv6 as well and see if
things work (better) for you?
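
To see why the two limits give such different results, the loop can be run in isolation. What follows is a minimal userland sketch, not kernel code: TCP_MAXWIN (65535) and TCP_MAX_WINSHIFT (14) mirror the constants in netinet/tcp.h, compute_wscale() is a hypothetical helper, and 65536 is only an assumed default for so_rcv.sb_hiwat, so check your own net.inet.tcp.recvspace.

```c
#define TCP_MAXWIN        65535UL  /* largest unscaled window */
#define TCP_MAX_WINSHIFT  14       /* largest allowed window shift */

/*
 * Hypothetical helper: the same loop as in tcp_connect(), with the
 * upper limit (sb_max or so_rcv.sb_hiwat) passed in by the caller.
 */
static int
compute_wscale(unsigned long limit)
{
	int scale = 0;

	while (scale < TCP_MAX_WINSHIFT && (TCP_MAXWIN << scale) < limit)
		scale++;
	return (scale);
}
```

With limit = 2621440 (the kern.ipc.maxsockbuf value from the report) the loop stops at 6; with an assumed sb_hiwat of 65536 it stops at 1. Both match the wscale values in the tcpdump output.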

/bz

-- 
Bjoern A. Zeeb                      The greatest risk is not taking one.

From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 22:17:59 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B365E106566C;
	Sun,  5 Apr 2009 22:17:59 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 586E28FC13;
	Sun,  5 Apr 2009 22:17:59 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id E5E7E46B0C;
	Sun,  5 Apr 2009 18:17:58 -0400 (EDT)
Date: Sun, 5 Apr 2009 23:17:58 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Ivan Voras <ivoras@freebsd.org>
In-Reply-To: <grappq$tsg$1@ger.gmane.org>
Message-ID: <alpine.BSF.2.00.0904052243250.34905@fledge.watson.org>
References: <gra7mq$ei8$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>
	<grac1s$p56$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051440460.12639@fledge.watson.org>
	<grappq$tsg$1@ger.gmane.org>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 22:18:00 -0000


On Sun, 5 Apr 2009, Ivan Voras wrote:

>> The argument is not that they are slower (although they probably are a bit 
>> slower), rather that they introduce serialization bottlenecks by requiring 
>> synchronization between CPUs in order to distribute the work. Certainly 
>> some of the scalability issues in the stack are not a result of that, but a 
>> good number are.
>
> I'd like to understand more. If (in netisr) I have an mbuf with headers, is 
> this data already transferred from the card or is it magically "not here 
> yet"?

A lot depends on the details of the card and driver.  The driver will take 
cache misses on the descriptor ring entry, if it's not already in cache, and 
the link layer will take a cache miss on the front of the ethernet frame in 
the cluster pointed to by the mbuf header as part of its demux.  What happens 
next depends on your dispatch model and cache line size.  Let's make a few 
simplifying assumptions that are mostly true:

- The driver associates a single cluster with each receive ring entry for each
   packet to be stored in, and the cluster is cacheline-aligned.  No header
   splitting is enabled.

- Standard ethernet encapsulation of IP is used, without additional VLAN
   headers or other encapsulation, etc.  There are no IP options.

- We don't need to validate any checksums because the hardware has done it for
   us, so no need to take cache misses on data that doesn't matter until we
   reach higher layers.

In the device driver/ithread code, we'll now take some cache misses, unless 
we're quite lucky:

(1) The descriptor ring entry
(2) The mbuf packet header
(3) The first cache line in the cluster

This is sufficient to figure out what protocol we're going to dispatch to, and 
depending on dispatch model, we now either enqueue the packet for delivery to 
a netisr, or we directly dispatch the handler for IP.

If the packet is processed on the current CPU and we're direct dispatching, or 
if we've dispatched to a netisr on the same CPU and we're quite lucky, the 
mbuf packet header and front of the cluster will be in the cache.

However, what happens next depends on the cache fetch and line size.  If 
things happen in 32-byte cache lines or smaller, we cache miss on the end of 
the IP header, because the last two bytes of the destination IP address start 
at offset 32 into the cluster.  If we have 64-byte fetching and line size, 
things go better because both the full IP and TCP headers should be in that 
first cache line.
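
The offsets behind that argument can be checked with a little arithmetic. The sketch below is a userland illustration only. It assumes the frame starts at offset 0 of a cache-line-aligned cluster (no 2-byte alignment padding), plain Ethernet, and IPv4 and TCP headers without options. The macro and function names are hypothetical, not taken from the kernel.

```c
#include <stddef.h>

#define ETHER_HDR_LEN  14  /* dst MAC + src MAC + ethertype, no VLAN tag */
#define IP_HDR_LEN     20  /* IPv4 header without options */
#define TCP_HDR_LEN    20  /* TCP header without options */
#define IP_DST_OFF     16  /* destination address offset inside the IPv4 header */

/* Index of the cache line that holds byte `off` of the cluster. */
static size_t
cache_line_of(size_t off, size_t line_size)
{
	return (off / line_size);
}

/* Cluster offset of the last byte of the IPv4 destination address. */
static size_t
ip_dst_last_byte(void)
{
	return (ETHER_HDR_LEN + IP_DST_OFF + 4 - 1);
}

/* Cluster offset of the last byte of the option-less TCP header. */
static size_t
tcp_hdr_last_byte(void)
{
	return (ETHER_HDR_LEN + IP_HDR_LEN + TCP_HDR_LEN - 1);
}
```

With 32-byte lines the destination address occupies offsets 30..33 and so straddles lines 0 and 1; with 64-byte lines the TCP header ends at offset 53, inside line 0. TCP options (timestamps, for example) lengthen the TCP header and can push it past offset 63.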

One big advantage to direct dispatch is that it maximizes the chances that we 
don't blow out the low-level CPU caches between link-layer and IP-layer 
processing, meaning that we might actually get through all the IP and TCP 
headers without a cache miss on a 64-byte line size.  If we netisr dispatch to 
another CPU without a shared cache, or we netisr dispatch to the current CPU 
but there's a scheduling delay, other packets queued first, etc, we'll take a 
number of the same cache misses over again as things get pulled into the right 
cache.

This presents a strong cache motivation to keep a packet "on" a CPU and even 
in the same thread once you've started processing it.  If you have to enqueue, 
you take locks, take a context switch, deal with the fact that LRU on cache 
lines isn't going to like your queue depth, and potentially pay a number of 
additional cache misses on the same data.  There are also some other good 
reasons to use direct dispatch, such as avoiding doing work on packets that 
will later be dropped if the netisr queue overflows.

This is why we direct dispatch by default, and why this is quite a good 
strategy for multiple input queue network cards, where it also buys us 
parallelism.

Note that if the flow RSS hash is in the same cache line as the rest of the 
receive descriptor ring entry, you may be able to avoid the cache miss on the 
cluster and simply redirect it to another CPU's netisr without ever reading 
packet data, which avoids at least one and possibly two cache misses, but also 
means that you have to run the link layer in the remote netisr, rather than 
locally in the ithread.

> In the first case, the packet reception code path is not changed until it's 
> queued on a thread, on which it's handled in the future (or is the influence 
> of "other" data like timers and internal TCP reassembly buffers so large?). 
> In the second case, why?

The good news about TCP reassembly is that we don't have to look at the data, 
only mbuf headers and reassembly buffer entries, so with any luck we've 
avoided actually taking a cache miss on the data.  If things go well, we can 
avoid looking at anything but mbuf and packet headers until the socket copies 
out, but I'm not sure how well we do that in practice.

> As the card and the OS can already process many packets per second for 
> something as complex as routing (http://www.tancsa.com/blast.html), and 
> TCP chokes swi:net at 100% of a core, isn't this an indication that 
> there's room for improvement even with old-fashioned single-queue NICs?

Maybe.  It depends on the relative costs of local processing vs redistributing 
the work, which involves schedulers, IPIs, additional cache misses, lock 
contention, and so on.  This means there's a period where it can't possibly be 
a win, and then at some point it's a win as long as the stack scales.  This is 
essentially the usual trade-off in using threads and parallelism: does the 
benefit of multiple parallel execution units make up for the overheads of 
synchronization and data migration?

There are some previous e-mail threads where people have observed that for 
some workloads, switching to netisr wins over direct dispatch.  For example, 
if you have a number of cores and are doing firewall processing, offloading 
work to the netisr from the input ithread may improve performance.  However, 
this appears not to be the common case for end-host workloads on the hardware 
we mostly target, and this is increasingly true as multiple input queues come 
into play, as the card itself will allow us to use multiple CPUs without any 
interactions between the CPUs.

This isn't to say that work redistribution using a netisr-like scheme isn't a 
good idea: in a world where CPU threads are weak compared to the wire 
workflow, and there's cache locality across threads on the same core, or NUMA 
is present, there may be a potential for a big win when available work 
significantly exceeds what a single CPU thread/core can handle.  In that case, 
we want to place the work as close as possible to take advantage of shared 
caches or the memory being local to the CPU thread/core doing the deferred 
work.

FYI, the localhost case is a bit weird -- I think we have some scheduling 
issues that are causing loopback netisr stuff to be pessimally scheduled. 
Here are some suggestions for things to try and see if they help, though:

- Comment out all ifnet, IP, and TCP global statistics in your local stack --
   especially look for things like tcpstat.whatever++;

- Use cpuset to pin ithreads, the netisr, and whatever else, to specific cores
   so that they don't migrate, and if your system uses HTT, experiment with
   pinning the ithread and the netisr on different threads on the same core, or
   at least, different cores on the same die.

- Experiment with using just the source IP, the source + destination IP, and
   both IPs plus TCP ports in your hash.

- If your card supports RSS, pass the flowid up the stack in the mbuf packet
   header flowid field, and use that instead of the hash for work placement.

- If you're doing pure PPS tests with UDP (or the like), and your test can
   tolerate disordering, try hashing based on the mbuf header address or
   something else that will distribute the work but not take a cache miss.

- If you have a flowid or the above disordered condition applies, try shifting
   the link layer dispatch to the netisr, rather than doing the demux in the
   ithread, as that will avoid cache misses in the ithread and do all the demux
   in the netisr.
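
The hashing experiments above can be sketched roughly as follows. This is a userland illustration under stated assumptions: struct flow_key, mix32() and flow_to_thread() are hypothetical names, the FNV-1a mixing is just one convenient choice and not what the stack uses, and nthreads stands for the number of netisr worker threads in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key; any consistent byte order will do. */
struct flow_key {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
};

/* Which fields take part in the hash. */
enum hash_fields { HASH_SRC, HASH_SRC_DST, HASH_4TUPLE };

/* Mix one 32-bit word into the hash, a byte at a time (FNV-1a). */
static uint32_t
mix32(uint32_t h, uint32_t v)
{
	int i;

	for (i = 0; i < 4; i++) {
		h ^= (v >> (8 * i)) & 0xff;
		h *= 16777619u;		/* FNV prime */
	}
	return (h);
}

/* Map a flow to a worker thread index in [0, nthreads). */
static unsigned
flow_to_thread(const struct flow_key *k, enum hash_fields which,
    unsigned nthreads)
{
	uint32_t h = 2166136261u;	/* FNV offset basis */

	assert(nthreads > 0);
	h = mix32(h, k->src_ip);
	if (which >= HASH_SRC_DST)
		h = mix32(h, k->dst_ip);
	if (which == HASH_4TUPLE)
		h = mix32(h, ((uint32_t)k->src_port << 16) | k->dst_port);
	return (h % nthreads);
}
```

All packets of one flow land on the same thread, which preserves ordering within the flow. If the card supplies a flow ID, the modulo step could be applied to that value directly, skipping the reads of packet data.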

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-net@FreeBSD.ORG  Sun Apr  5 22:48:36 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 14104106566B
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 22:48:36 +0000 (UTC)
	(envelope-from freebsd-net@m.gmane.org)
Received: from ciao.gmane.org (main.gmane.org [80.91.229.2])
	by mx1.freebsd.org (Postfix) with ESMTP id 823EE8FC17
	for <freebsd-net@freebsd.org>; Sun,  5 Apr 2009 22:48:35 +0000 (UTC)
	(envelope-from freebsd-net@m.gmane.org)
Received: from list by ciao.gmane.org with local (Exim 4.43)
	id 1Lqb8j-0006og-1H
	for freebsd-net@freebsd.org; Sun, 05 Apr 2009 22:48:33 +0000
Received: from 93-141-3-137.adsl.net.t-com.hr ([93.141.3.137])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-net@freebsd.org>; Sun, 05 Apr 2009 22:48:33 +0000
Received: from ivoras by 93-141-3-137.adsl.net.t-com.hr with local (Gmexim 0.1
	(Debian)) id 1AlnuQ-0007hv-00
	for <freebsd-net@freebsd.org>; Sun, 05 Apr 2009 22:48:33 +0000
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-net@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Mon, 06 Apr 2009 00:47:49 +0200
Lines: 111
Message-ID: <grbcfg$poe$1@ger.gmane.org>
References: <gra7mq$ei8$1@ger.gmane.org>	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>	<grac1s$p56$1@ger.gmane.org>	<alpine.BSF.2.00.0904051440460.12639@fledge.watson.org>	<grappq$tsg$1@ger.gmane.org>
	<alpine.BSF.2.00.0904052243250.34905@fledge.watson.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature";
	boundary="------------enig078FFC936793C9EB67C0FB65"
X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: 93-141-3-137.adsl.net.t-com.hr
User-Agent: Thunderbird 2.0.0.21 (Windows/20090302)
In-Reply-To: <alpine.BSF.2.00.0904052243250.34905@fledge.watson.org>
X-Enigmail-Version: 0.95.7
Sender: news <news@ger.gmane.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 05 Apr 2009 22:48:36 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig078FFC936793C9EB67C0FB65
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Thanks for the ideas, I will try some of them. But I'd also like some
more clarifications:

Robert Watson wrote:
> On Sun, 5 Apr 2009, Ivan Voras wrote:

>> I'd like to understand more. If (in netisr) I have an mbuf with
>> headers, is this data already transferred from the card or is it
>> magically "not here yet"?
>
> A lot depends on the details of the card and driver.  The driver will
> take cache misses on the descriptor ring entry, if it's not already in
> cache, and the link layer will take a cache miss on the front of the
> ethernet frame in the cluster pointed to by the mbuf header as part of
> its demux.  What happens next depends on your dispatch model and cache
> line size.  Let's make a few simplifying assumptions that are mostly
> true:

So, an mbuf can reference data not yet copied from the NIC hardware? I'm
specifically trying to understand what m_pullup() does.
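
For reference, m_pullup(m, len) does not wait for data from the card. It rearranges an mbuf chain so that the first len bytes are contiguous in the first mbuf, and it returns NULL (after freeing the chain) if that cannot be done. The toy below models only that rearrangement in userland: struct toy_mbuf, TOY_MLEN and toy_pullup() are hypothetical names, and unlike the real routine it reports failure with -1 and leaves the chain allocated.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_MLEN 64	/* hypothetical per-buffer capacity */

/* Toy stand-in for an mbuf: a link, a length, and inline storage. */
struct toy_mbuf {
	struct toy_mbuf *next;
	size_t len;
	unsigned char data[TOY_MLEN];
};

/*
 * Make the first `want` bytes of the chain contiguous in `head`.
 * Returns 0 on success, -1 if the chain is too short or `want`
 * does not fit into a single buffer.
 */
static int
toy_pullup(struct toy_mbuf *head, size_t want)
{
	if (want > TOY_MLEN)
		return (-1);
	while (head->len < want) {
		struct toy_mbuf *n = head->next;
		size_t take;

		if (n == NULL)
			return (-1);
		take = want - head->len;
		if (take > n->len)
			take = n->len;
		/* Copy bytes to the end of head, then close the gap in n. */
		memcpy(head->data + head->len, n->data, take);
		head->len += take;
		memmove(n->data, n->data + take, n->len - take);
		n->len -= take;
		if (n->len == 0) {	/* drained: unlink and free it */
			head->next = n->next;
			free(n);
		}
	}
	return (0);
}
```

After a successful call, a protocol header of `want` bytes can be read through a single pointer into head->data. That is the reason input routines call m_pullup() before casting the data pointer to a header structure.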

>> As the card and the OS can already process many packets per second for
>> something as complex as routing
>> (http://www.tancsa.com/blast.html), and TCP chokes swi:net at 100% of
>> a core, isn't this an indication that there's room for improvement
>> even with old-fashioned single-queue NICs?
>
> Maybe.  It depends on the relative costs of local processing vs
> redistributing the work, which involves schedulers, IPIs, additional
> cache misses, lock contention, and so on.  This means there's a period
> where it can't possibly be a win, and then at some point it's a win as
> long as the stack scales.  This is essentially the usual trade-off in
> using threads and parallelism: does the benefit of multiple parallel
> execution units make up for the overheads of synchronization and data
> migration?

Do you have any idea at all why I'm seeing the weird difference between
netstat packets per second (250,000) and my application's TCP
performance (< 1,000 pps)? Summary: each packet is guaranteed to be a
whole message causing a transaction in the application - without the
changes I see pps almost identical to tps. Even if the source of netstat
statistics somehow manages to count packets multiple times (I don't see
how that can happen), no relation can explain a difference this huge. It
almost looks like something in the upper layers is discarding packets
(also not likely: TCP timeouts would occur and the application wouldn't
be able to push 250,000 pps) - but what? Where to look?

> FYI, the localhost case is a bit weird -- I think we have some
> scheduling issues that are causing loopback netisr stuff to be
> pessimally scheduled. Here are some suggestions for things to try and
> see if they help, though:
>
> - Comment out all ifnet, IP, and TCP global statistics in your local
>   stack -- especially look for things like tcpstat.whatever++;

You mean for the general code? I purposely don't lock my statistics
variables because I'm not that interested in exact numbers (orders of
magnitude are relevant). As far as I understand, unlocked "x++" should
be trivially fast in this case?
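
On the cost of an unlocked x++: the increment itself is cheap, but when several CPUs write the same variable, the cache line holding it has to move between their caches on every update, which is one likely reason the global counters were singled out. A common way around this is one counter per CPU, each on its own cache line, summed only when read. The sketch below is a userland illustration; CACHE_LINE, MAX_CPUS and the function names are assumptions, not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE  64	/* assumed cache line size */
#define MAX_CPUS    8	/* hypothetical upper bound */

/*
 * One counter per CPU, padded to a full cache line so that the
 * value fields are 64 bytes apart and never share a line.
 */
struct pcpu_counter {
	uint64_t value;
	char pad[CACHE_LINE - sizeof(uint64_t)];
};

static struct pcpu_counter pkt_count[MAX_CPUS];

/* Called on the packet path: touches only the local CPU's line. */
static void
count_packet(unsigned cpu)
{
	assert(cpu < MAX_CPUS);
	pkt_count[cpu].value++;
}

/* Called rarely (e.g. when statistics are read): sums all slots. */
static uint64_t
total_packets(void)
{
	uint64_t sum = 0;
	unsigned i;

	for (i = 0; i < MAX_CPUS; i++)
		sum += pkt_count[i].value;
	return (sum);
}
```

The total is only approximate while packets are in flight, which fits the "orders of magnitude" requirement.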

> - Use cpuset to pin ithreads, the netisr, and whatever else, to specific
>   cores so that they don't migrate, and if your system uses HTT, experiment
>   with pinning the ithread and the netisr on different threads on the same
>   core, or at least, different cores on the same die.

I'm using em hardware; I still think there's a possibility I'm fighting
the driver in some cases but this has priority #2.

> - Experiment with using just the source IP, the source + destination IP,
>   and both IPs plus TCP ports in your hash.

Ok. Currently I'm using ip1+ip2+port1+port2.

> - If your card supports RSS, pass the flowid up the stack in the mbuf
>   packet header flowid field, and use that instead of the hash for work
>   placement.

Don't know about em. Don't really want to touch it if I don't have to :)


--------------enig078FFC936793C9EB67C0FB65
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAknZNRwACgkQldnAQVacBcj7hQCfRE35c+nkAhCYp4+neW2Da6xk
kNsAnRxRXOoJR0udvActmaO+azYDeXhn
=aVa7
-----END PGP SIGNATURE-----

--------------enig078FFC936793C9EB67C0FB65--


From owner-freebsd-net@FreeBSD.ORG  Mon Apr  6 06:24:52 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6F5BD106566C;
	Mon,  6 Apr 2009 06:24:52 +0000 (UTC)
	(envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 452998FC12;
	Mon,  6 Apr 2009 06:24:52 +0000 (UTC)
	(envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (linimon@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n366OqIN045367;
	Mon, 6 Apr 2009 06:24:52 GMT
	(envelope-from linimon@freefall.freebsd.org)
Received: (from linimon@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n366Oq76045363;
	Mon, 6 Apr 2009 06:24:52 GMT (envelope-from linimon)
Date: Mon, 6 Apr 2009 06:24:52 GMT
Message-Id: <200904060624.n366Oq76045363@freefall.freebsd.org>
To: linimon@FreeBSD.org, freebsd-i386@FreeBSD.org, freebsd-net@FreeBSD.org
From: linimon@FreeBSD.org
Cc: 
Subject: Re: kern/133218: [carp] [hang] use of carp(4) causes system to
	freeze
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Apr 2009 06:24:52 -0000

Synopsis: [carp] [hang] use of carp(4) causes system to freeze

Responsible-Changed-From-To: freebsd-i386->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Mon Apr 6 06:24:37 UTC 2009
Responsible-Changed-Why: 
This does not sound i386-specific.

http://www.freebsd.org/cgi/query-pr.cgi?pr=133218

From owner-freebsd-net@FreeBSD.ORG  Mon Apr  6 10:10:03 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 597981065675
	for <freebsd-net@hub.freebsd.org>; Mon,  6 Apr 2009 10:10:03 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 27F3D8FC15
	for <freebsd-net@hub.freebsd.org>; Mon,  6 Apr 2009 10:10:03 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n36AA3HF076020
	for <freebsd-net@freefall.freebsd.org>; Mon, 6 Apr 2009 10:10:03 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n36AA3ZX076019;
	Mon, 6 Apr 2009 10:10:03 GMT (envelope-from gnats)
Date: Mon, 6 Apr 2009 10:10:03 GMT
Message-Id: <200904061010.n36AA3ZX076019@freefall.freebsd.org>
To: freebsd-net@FreeBSD.org
From: dfilter@FreeBSD.ORG (dfilter service)
Cc: 
Subject: Re: bin/131365: commit references a PR
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: dfilter service <dfilter@FreeBSD.ORG>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Apr 2009 10:10:03 -0000

The following reply was made to PR bin/131365; it has been noted by GNATS.

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: bin/131365: commit references a PR
Date: Mon,  6 Apr 2009 10:09:37 +0000 (UTC)

 Author: rrs
 Date: Mon Apr  6 10:09:20 2009
 New Revision: 190758
 URL: http://svn.freebsd.org/changeset/base/190758
 
 Log:
   Class based addressing went out in the early 90's. Basically
   if an entry is not route add -net xxx/bits then we should use
   the addr (xxx) to establish the number of bits by looking at
   the first non-zero bit. So if we enter
   route add -net 10.1.1.0 10.1.3.5
   this is the same as doing
   route add -net 10.1.1.0/24
   Since the 8th bit (zero counting) is set to 1 we set bits
   to 32-8.
   
   Users can of course still use the /x to change this behavior
   or in cases where the network is in the trailing part
   of the address, a "netmask" argument can be supplied to
   override what is established from the interpretation of the
   address itself. e.g:
   
   route add -net 10.1.1.8 -netmask 0xff00ffff
   
   should override and place the proper CIDR mask in place.
   
   PR:		131365
   MFC after:	1 week
 
 Modified:
   head/sbin/route/route.c
 
 Modified: head/sbin/route/route.c
 ==============================================================================
 --- head/sbin/route/route.c	Mon Apr  6 07:13:26 2009	(r190757)
 +++ head/sbin/route/route.c	Mon Apr  6 10:09:20 2009	(r190758)
 @@ -713,7 +713,7 @@ newroute(argc, argv)
  #ifdef INET6
  		if (af == AF_INET6) {
  			rtm_addrs &= ~RTA_NETMASK;
 -			memset((void *)&so_mask, 0, sizeof(so_mask));
 +				memset((void *)&so_mask, 0, sizeof(so_mask));
  		}
  #endif 
  	}
 @@ -803,21 +803,22 @@ inet_makenetandmask(net, sin, bits)
  		addr = net << IN_CLASSC_NSHIFT;
  	else
  		addr = net;
 -
 -	if (bits != 0)
 -		mask = 0xffffffff << (32 - bits);
 -	else if (net == 0)
 -		mask = 0;
 -	else if (IN_CLASSA(addr))
 -		mask = IN_CLASSA_NET;
 -	else if (IN_CLASSB(addr))
 -		mask = IN_CLASSB_NET;
 -	else if (IN_CLASSC(addr))
 -		mask = IN_CLASSC_NET;
 -	else if (IN_MULTICAST(addr))
 -		mask = IN_CLASSD_NET;
 -	else
 -		mask = 0xffffffff;
 +	/*
 +	 * If no /xx was specified we must cacluate the 
 +	 * CIDR address.
 +	 */
 +	if ((bits == 0)  && (addr != 0)) {
 +		int i, j;
 +		for(i=0,j=1; i<32; i++)  {
 +			if (addr & j) {
 +				break;
 +			}
 +			j <<= 1;
 +		}
 +		/* i holds the first non zero bit */
 +		bits = 32 - i;	
 +	}
 +	mask = 0xffffffff << (32 - bits);
  
  	sin->sin_addr.s_addr = htonl(addr);
  	sin = &so_mask.sin;
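
The mask derivation described in the log can be tried outside route(8) with a small standalone function. This is a sketch modelled on the new logic, not the committed code: cidr_mask() is a hypothetical name, addr is taken in host byte order, and the explicit check for bits == 0 is an addition made here because shifting a 32-bit value by 32 is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: bits is the /xx prefix length, or 0 if none
 * was given, in which case it is derived from the lowest set bit
 * of the address.
 */
static uint32_t
cidr_mask(uint32_t addr, int bits)
{
	assert(bits >= 0 && bits <= 32);
	if (bits == 0 && addr != 0) {
		int i = 0;

		/* Find the lowest-order set bit; addr != 0 guarantees one. */
		while ((addr & ((uint32_t)1 << i)) == 0)
			i++;
		bits = 32 - i;
	}
	/* Guard added in this sketch: avoid a shift by 32. */
	if (bits == 0)
		return (0);
	return ((uint32_t)0xffffffff << (32 - bits));
}
```

For 10.1.1.0 (0x0a010100) with no prefix length the lowest set bit is bit 8, so the result is 0xffffff00, the /24 from the log message.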
 

From owner-freebsd-net@FreeBSD.ORG  Mon Apr  6 10:20:01 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C71291065677
	for <freebsd-net@freebsd.org>; Mon,  6 Apr 2009 10:20:01 +0000 (UTC)
	(envelope-from sthaug@nethelp.no)
Received: from bizet.nethelp.no (bizet.nethelp.no [195.1.209.33])
	by mx1.freebsd.org (Postfix) with SMTP id 121A28FC1F
	for <freebsd-net@freebsd.org>; Mon,  6 Apr 2009 10:20:00 +0000 (UTC)
	(envelope-from sthaug@nethelp.no)
Received: (qmail 18860 invoked from network); 6 Apr 2009 10:19:59 -0000
Received: from bizet.nethelp.no (HELO localhost) (195.1.209.33)
	by bizet.nethelp.no with SMTP; 6 Apr 2009 10:19:59 -0000
Date: Mon, 06 Apr 2009 12:19:59 +0200 (CEST)
Message-Id: <20090406.121959.74751582.sthaug@nethelp.no>
To: bzeeb-lists@lists.zabbadoz.net
From: sthaug@nethelp.no
In-Reply-To: <20090405215842.C15361@maildrop.int.zabbadoz.net>
References: <20090405.231044.74688369.sthaug@nethelp.no>
	<20090405214757.E15361@maildrop.int.zabbadoz.net>
	<20090405215842.C15361@maildrop.int.zabbadoz.net>
X-Mailer: Mew version 3.3 on Emacs 21.3 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org
Subject: Re: IPv6 window scaling factor always 1 on initial SYN
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Apr 2009 10:20:02 -0000

> Ok, both versions had:	< so->so_rcv.sb_hiwat)
> 
> http://svn.freebsd.org/viewvc/base?view=revision&revision=166403
> 
> changed it for IPv4 the first time,
> 
> http://svn.freebsd.org/viewvc/base?view=revision&revision=172795
> 
> changed it a second time for IPv4.
> 
> No one changed the IPv6 version.
> 
> The syncache already seems to do it for both v4/v6 (common code).
> 
> Can you try changing it to < sb_max) for IPv6 as well and see if
> things work (better) for you?

I changed it, and that worked like a dream. Now I get basically the
same throughput with IPv4 and IPv6. There are of course still issues
like lots of IPv6 tunnels that add extra latency - but that's not the
fault of FreeBSD.

Anyway, thanks for your work. Below is a context diff (against 7-STABLE
cvsupped last night). Do we need a PR to get this into FreeBSD?

Steinar Haug, Nethelp consulting, sthaug@nethelp.no
----------------------------------------------------------------------
*** tcp_usrreq.c.orig	Sun Apr  5 22:51:49 2009
--- tcp_usrreq.c	Mon Apr  6 11:15:11 2009
***************
*** 1153,1159 ****
  
  	/* Compute window scaling to request.  */
  	while (tp->request_r_scale < TCP_MAX_WINSHIFT &&
! 	    (TCP_MAXWIN << tp->request_r_scale) < so->so_rcv.sb_hiwat)
  		tp->request_r_scale++;
  
  	soisconnecting(so);
--- 1153,1159 ----
  
  	/* Compute window scaling to request.  */
  	while (tp->request_r_scale < TCP_MAX_WINSHIFT &&
! 	    (TCP_MAXWIN << tp->request_r_scale) < sb_max)
  		tp->request_r_scale++;
  
  	soisconnecting(so);

From owner-freebsd-net@FreeBSD.ORG  Mon Apr  6 10:37:21 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 68C071065688
	for <freebsd-net@freebsd.org>; Mon,  6 Apr 2009 10:37:21 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63904.mail.re1.yahoo.com (web63904.mail.re1.yahoo.com
	[69.147.97.119]) by mx1.freebsd.org (Postfix) with SMTP id 19F058FC12
	for <freebsd-net@freebsd.org>; Mon,  6 Apr 2009 10:37:20 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 63616 invoked by uid 60001); 6 Apr 2009 10:37:20 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1239014240; bh=t61mFWB4CKSnqZioDLU3wCAaq5MB6KLbBAVNmJIodN4=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=V+ssyr9A+GmlFRtJH87ZPK+lJ6W75tWSVjxHFFC0Zw9v+IoBZgrW+qS2slpin3MnIN6T1LdmhhqMpqelS83898pAwbdc7mP5NVZYZGDRf8QtSh43Yqd72qeJp6qH13e5gVA2xDgBrP3GRlwwixVWEqLUxxmnYFGlHQayA5FJ9wU=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=MpvEkaOxsTkqvTmwJMVUF5KFEStUFo5Xm6bqNTf2RiApQKbUXo86kxvN3HwpjwYGutJrktu2FxiXev1eAFpTH0JqBZ+X/WSo8sy96hXM6y63yOZLeNdnv1bUGKvmYFs5jiB8vuaAYQrOo4fJpRRFv7T3cOZf9qbPhsjOER9F24I=;
Message-ID: <86599.63596.qm@web63904.mail.re1.yahoo.com>
X-YMail-OSG: mhj7Sc4VM1ni_V9G4FD5czP8DVwFTjlyLkQXi6Sl90nc.DvtqIlSdYlmUwbULgIrqwe878qSX0_NqvBqwhSnGbuqFQ0l225Od5zXMPi6iYVjWpXdSVOSC7jkjf5BdIVbevvasq6p1F9MvqTWbgyVXj2zpja3sPmQJoffDoz1kC30WTlr9Dzo52Aqas1MmBB8kIs3A_Tb.hgEYqYWYSAvx_W3CKyQq4v.VIPWpvarG3DzW.SndAUNIzFh2NjDj_ZFrHcta_qNGwC8nDDf4i0MNuobBRRysoT.NY1L0H4Br9eNjT4C8QgESAc5UHAZ
Received: from [98.242.222.229] by web63904.mail.re1.yahoo.com via HTTP;
	Mon, 06 Apr 2009 03:37:19 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Mon, 6 Apr 2009 03:37:19 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: Ivan Voras <ivoras@freebsd.org>, Robert Watson <rwatson@FreeBSD.org>
In-Reply-To: <alpine.BSF.2.00.0904052243250.34905@fledge.watson.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-net@freebsd.org
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Apr 2009 10:37:21 -0000


--- On Sun, 4/5/09, Robert Watson <rwatson@FreeBSD.org> wrote:

> From: Robert Watson <rwatson@FreeBSD.org>
> Subject: Re: Advice on a multithreaded netisr  patch?
> To: "Ivan Voras" <ivoras@freebsd.org>
> Cc: freebsd-net@freebsd.org
> Date: Sunday, April 5, 2009, 6:17 PM
> On Sun, 5 Apr 2009, Ivan Voras wrote:
> 
> >> The argument is not that they are slower (although they probably are a
> >> bit slower), rather that they introduce serialization bottlenecks by
> >> requiring synchronization between CPUs in order to distribute the work.
> >> Certainly some of the scalability issues in the stack are not a result
> >> of that, but a good number are.
> > 
> > I'd like to understand more. If (in netisr) I have an mbuf with headers,
> > is this data already transferred from the card or is it magically "not
> > here yet"?
> 
> A lot depends on the details of the card and driver.  The
> driver will take cache misses on the descriptor ring entry,
> if it's not already in cache, and the link layer will
> take a cache miss on the front of the ethernet frame in the
> cluster pointed to by the mbuf header as part of its demux. 
> What happens next depends on your dispatch model and cache
> line size.  Let's make a few simplifying assumptions
> that are mostly true:
> 
> - The driver associates a single cluster with each receive ring entry for
>   each packet to be stored in, and the cluster is cacheline-aligned.  No
>   header splitting is enabled.
> 
> - Standard ethernet encapsulation of IP is used, without additional VLAN
>   headers or other encapsulation, etc.  There are no IP options.
> 
> - We don't need to validate any checksums because the hardware has done it
>   for us, so no need to take cache misses on data that doesn't matter until
>   we reach higher layers.
> 
> In the device driver/ithread code, we'll now proceed to
> take some cache misses, unless we're quite lucky:
> 
> (1) The descriptor ring entry
> (2) The mbuf packet header
> (3) The first cache line in the cluster
> 
> This is sufficient to figure out what protocol we're
> going to dispatch to, and depending on dispatch model, we
> now either enqueue the packet for delivery to a netisr, or
> we directly dispatch the handler for IP.
> 
> If the packet is processed on the current CPU and we're
> direct dispatching, or if we've dispatched to a netisr
> on the same CPU and we're quite lucky, the mbuf packet
> header and front of the cluster will be in the cache.
> 
> However, what happens next depends on the cache fetch and
> line size.  If things happen in 32-byte cache lines or
> smaller, we cache miss on the end of the IP header, because
> the last two bytes of the destination IP address start at
> offset 32 into the cluster.  If we have 64-byte fetching and
> line size, things go better because both the full IP and TCP
> headers should be in that first cache line.
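The offsets in the paragraph above can be checked with a little arithmetic.
What follows is only a sketch under the simplifying assumptions already
listed (plain Ethernet framing, IPv4 without options, a cacheline-aligned
cluster); it additionally assumes a 20-byte TCP header without options.
The helper names are made up for illustration and are not kernel
interfaces.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Header sizes under the stated assumptions: Ethernet II framing,
 * IPv4 without options, TCP without options (the last one is an
 * extra assumption made for this sketch).
 */
enum {
	ETHER_HDR_BYTES = 14,
	IPV4_HDR_BYTES  = 20,
	TCP_HDR_BYTES   = 20,
	IPV4_DST_OFFSET = 16	/* destination address offset inside the IPv4 header */
};

/* Index of the cache line holding byte `offset` of a line-aligned buffer. */
static size_t
cache_line_of(size_t offset, size_t line_size)
{
	assert(line_size > 0);
	return (offset / line_size);
}

/* Number of cache lines touched when reading bytes [0, len). */
static size_t
lines_touched(size_t len, size_t line_size)
{
	assert(line_size > 0);
	return ((len + line_size - 1) / line_size);
}

/* Offset of the first byte of the IPv4 destination address in the cluster. */
static size_t
ip_dst_first_byte(void)
{
	return (ETHER_HDR_BYTES + IPV4_DST_OFFSET);
}

/* Offset of the last byte of the 4-byte IPv4 destination address. */
static size_t
ip_dst_last_byte(void)
{
	return (ip_dst_first_byte() + 3);
}

/* Combined length of the Ethernet, IPv4 and TCP headers. */
static size_t
all_headers_len(void)
{
	return (ETHER_HDR_BYTES + IPV4_HDR_BYTES + TCP_HDR_BYTES);
}
```

With 32-byte lines the destination address straddles lines 0 and 1, so
reading it costs a second miss; with 64-byte lines lines_touched() on
all_headers_len() reports a single line.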
> 
> One big advantage to direct dispatch is that it maximizes
> the chances that we don't blow out the low-level CPU
> caches between link-layer and IP-layer processing, meaning
> that we might actually get through all the IP and TCP
> headers without a cache miss on a 64-byte line size.  If we
> netisr dispatch to another CPU without a shared cache, or we
> netisr dispatch to the current CPU but there's a
> scheduling delay, other packets queued first, etc, we'll
> take a number of the same cache misses over again as things
> get pulled into the right cache.
> 
> This presents a strong cache motivation to keep a packet
> "on" a CPU and even in the same thread once
> you've started processing it.  If you have to enqueue,
> you take locks, take a context switch, deal with the fact
> that LRU on cache lines isn't going to like your queue
> depth, and potentially pay a number of additional cache
> misses on the same data.  There are also some other good
> reasons to use direct dispatch, such as avoiding doing work
> on packets that will later be dropped if the netisr queue
> overflows.
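To make the overflow point concrete, here is a toy bounded FIFO with a
drop counter. It is a user-space sketch only: the structure, the names
and the queue limit of 4 are invented for illustration and are not the
kernel's ifqueue code. Any work done on a packet before pq_enqueue()
returns 0 is wasted, which is the cost direct dispatch avoids.

```c
#include <assert.h>
#include <stddef.h>

#define QLEN_MAX 4	/* hypothetical queue limit, kept small for illustration */

struct pkt_queue {
	int		slots[QLEN_MAX];	/* packet ids stand in for mbuf pointers */
	size_t		head;			/* index of the oldest entry */
	size_t		len;			/* entries currently queued */
	unsigned long	drops;			/* packets discarded because the queue was full */
};

static void
pq_init(struct pkt_queue *q)
{
	q->head = 0;
	q->len = 0;
	q->drops = 0;
}

/* Returns 1 if the packet was queued, 0 if it was dropped. */
static int
pq_enqueue(struct pkt_queue *q, int pkt)
{
	if (q->len == QLEN_MAX) {
		q->drops++;		/* upstream work on this packet is lost */
		return (0);
	}
	q->slots[(q->head + q->len) % QLEN_MAX] = pkt;
	q->len++;
	return (1);
}

/* Returns 1 and stores the oldest packet in *pkt, or 0 if the queue is empty. */
static int
pq_dequeue(struct pkt_queue *q, int *pkt)
{
	if (q->len == 0)
		return (0);
	*pkt = q->slots[q->head];
	q->head = (q->head + 1) % QLEN_MAX;
	q->len--;
	return (1);
}
```

Feeding six packets into an empty queue without draining it queues the
first four and counts two drops.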
> 
> This is why we direct dispatch by default, and why this is
> quite a good strategy for multiple input queue network
> cards, where it also buys us parallelism.
> 
> Note that if the flow RSS hash is in the same cache line as
> the rest of the receive descriptor ring entry, you may be
> able to avoid the cache miss on the cluster and simply
> redirect it to another CPU's netisr without ever reading
> packet data, which avoids at least one and possibly two
> cache misses, but also means that you have to run the link
> layer in the remote netisr, rather than locally in the
> ithread.
> 
> > In the first case, the packet reception code path is not changed until
> > it's queued on a thread, on which it's handled in the future (or is the
> > influence of "other" data like timers and internal TCP reassembly buffers
> > so large?). In the second case, why?
> 
> The good news about TCP reassembly is that we don't
> have to look at the data, only mbuf headers and reassembly
> buffer entries, so with any luck we've avoided actually
> taking a cache miss on the data.  If things go well, we can
> avoid looking at anything but mbuf and packet headers until
> the socket copies out, but I'm not sure how well we do
> that in practice.
> 
> > As the card and the OS can already process many packets per second for
> > something fairly complex such as routing
> > (http://www.tancsa.com/blast.html), and TCP chokes swi:net at 100% of a
> > core, isn't this an indication that there's certainly more room for
> > improvement even with single-queue old-fashioned NICs?
> 
> Maybe.  It depends on the relative costs of local
> processing vs redistributing the work, which involves
> schedulers, IPIs, additional cache misses, lock contention,
> and so on.  This means there's a period where it
> can't possibly be a win, and then at some point it's
> a win as long as the stack scales.  This is essentially the
> usual trade-off in using threads and parallelism: does the
> benefit of multiple parallel execution units make up for the
> overheads of synchronization and data migration?
> 
> There are some previous e-mail threads where people have
> observed that for some workloads, switching to netisr wins
> over direct dispatch.  For example, if you have a number of
> cores and are doing firewall processing, offloading work to
> the netisr from the input ithread may improve performance. 
> However, this appears not to be the common case for end-host
> workloads on the hardware we mostly target, and this is
> increasingly true as multiple input queues come into play,
> as the card itself will allow us to use multiple CPUs
> without any interactions between the CPUs.
> 
> This isn't to say that work redistribution using a
> netisr-like scheme isn't a good idea: in a world where
> CPU threads are weak compared to the wire workflow, and
> there's cache locality across threads on the same core,
> or NUMA is present, there may be a potential for a big win
> when available work significantly exceeds what a single CPU
> thread/core can handle.  In that case, we want to place the
> work as close as possible to take advantage of shared caches
> or the memory being local to the CPU thread/core doing the
> deferred work.
> 
> FYI, the localhost case is a bit weird -- I think we have
> some scheduling issues that are causing loopback netisr
> stuff to be pessimally scheduled. Here are some suggestions
> for things to try and see if they help, though:
> 
> - Comment out all ifnet, IP, and TCP global statistics in your local stack --
>   especially look for things like tcpstat.whatever++.
> 
> - Use cpuset to pin ithreads, the netisr, and whatever else, to specific
>   cores so that they don't migrate, and if your system uses HTT, experiment
>   with pinning the ithread and the netisr on different threads on the same
>   core, or at least, different cores on the same die.
> 
> - Experiment with using just the source IP, the source + destination IP, and
>   both IPs plus TCP ports in your hash.
> 
> - If your card supports RSS, pass the flowid up the stack in the mbuf packet
>   header flowid field, and use that instead of the hash for work placement.
> 
> - If you're doing pure PPS tests with UDP (or the like), and your test can
>   tolerate disordering, try hashing based on the mbuf header address or
>   something else that will distribute the work but not take a cache miss.
> 
> - If you have a flowid or the above disordered condition applies, try
>   shifting the link layer dispatch to the netisr, rather than doing the demux
>   in the ithread, as that will avoid cache misses in the ithread and do all
>   the demux in the netisr.
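
The hash and flowid suggestions in the list above could be prototyped
along these lines. This is a sketch, not the patch under discussion:
struct flow_key, the hash_mode names and pick_cpu() are hypothetical,
and the mixer is simply the MurmurHash3 32-bit finalizer, chosen here
because it is short; any reasonable hash would do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow descriptor; all fields in host byte order. */
struct flow_key {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
};

/* The three hash inputs suggested above. */
enum hash_mode { HASH_SRC_IP, HASH_IP_PAIR, HASH_4TUPLE };

/* 32-bit bit mixer (MurmurHash3 finalizer constants). */
static uint32_t
mix32(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6bU;
	h ^= h >> 13;
	h *= 0xc2b2ae35U;
	h ^= h >> 16;
	return (h);
}

/* Hash only the fields selected by `mode`. */
static uint32_t
flow_hash(const struct flow_key *k, enum hash_mode mode)
{
	uint32_t h = mix32(k->src_ip);

	if (mode == HASH_SRC_IP)
		return (h);
	h = mix32(h ^ k->dst_ip);
	if (mode == HASH_IP_PAIR)
		return (h);
	return (mix32(h ^ (((uint32_t)k->src_port << 16) | k->dst_port)));
}

/*
 * Choose a CPU for the packet.  If the card supplied a flowid (RSS),
 * use it directly and skip reading packet headers; otherwise fall
 * back to the software hash.
 */
static unsigned
pick_cpu(const struct flow_key *k, enum hash_mode mode, int have_flowid,
    uint32_t flowid, unsigned ncpus)
{
	uint32_t h;

	assert(ncpus > 0);
	h = have_flowid ? flowid : flow_hash(k, mode);
	return (h % ncpus);
}
```

With HASH_SRC_IP every connection from one host lands on the same CPU,
which keeps ordering but may balance poorly; HASH_4TUPLE spreads
individual connections at the cost of reading the TCP ports.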
> 
> Robert N M Watson
> Computer Laboratory
> University of Cambridge

Is there a way to give a kernel thread exclusive use of a core? I know you
can pin a kernel thread with sched_bind(), but is there a way to keep
other threads from using the core? On an 8 core system it almost seems
that the randomness of more cores is a negative in some situations.

Also, I've noticed that calling sched_bind() during bootup is a bad thing
in that it locks the system. I'm not certain but I suspect it's the
thread_lock that is the culprit. Is there a clean way to determine that
it's safe to lock curthread and do a cpu bind?

Barney


From owner-freebsd-net@FreeBSD.ORG  Mon Apr  6 11:06:58 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A3B441065689
	for <freebsd-net@FreeBSD.org>; Mon,  6 Apr 2009 11:06:58 +0000 (UTC)
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 8F7B28FC32
	for <freebsd-net@FreeBSD.org>; Mon,  6 Apr 2009 11:06:58 +0000 (UTC)
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n36B6wK8061947
	for <freebsd-net@FreeBSD.org>; Mon, 6 Apr 2009 11:06:58 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n36B6wW0061943
	for freebsd-net@FreeBSD.org; Mon, 6 Apr 2009 11:06:58 GMT
	(envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 6 Apr 2009 11:06:58 GMT
Message-Id: <200904061106.n36B6wW0061943@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
	owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster <bugmaster@FreeBSD.org>
To: freebsd-net@FreeBSD.org
Cc: 
Subject: Current problem reports assigned to freebsd-net@FreeBSD.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Apr 2009 11:07:00 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.


S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/133235  net        [netinet] [patch] Process SIOCDLIFADDR command incorre
o kern/133218  net        [carp] [hang] use of carp(4) causes system to freeze
o kern/133060  net        [ipsec] [pfsync] [panic] Kernel panic with ipsec + pfs
o kern/132991  net        [bge] if_bge low performance problem
o kern/132984  net        [netgraph] swi1: net 100% cpu usage
f bin/132911   net        ip6fw(8): argument type of fill_icmptypes is wrong and
o kern/132889  net        [ndis] [panic] NDIS kernel crash on load BCM4321 AGN d
o kern/132885  net        [wlan] 802.1x broken after SVN rev 189592
o conf/132851  net        [fib] [patch] allow to setup fib for service running f
o bin/132798   net        [patch] ggatec(8): ggated/ggatec connection slowdown p
o kern/132734  net        [ifmib] [panic] panic in net/if_mib.c
o kern/132722  net        [ath] Wifi ath0 associates fine with AP, but DHCP or I
o kern/132715  net        [lagg] [panic] Panic when creating vlan's on lagg inte
o kern/132705  net        [libwrap] [patch] libwrap - infinite loop if hosts.all
o kern/132672  net        [ndis] [panic] ndis with rt2860.sys causes kernel pani
o kern/132669  net        [xl] 3c905-TX send DUP! in reply on ping (sometime)
o kern/132625  net        [iwn] iwn drivers don't support setting country
o kern/132554  net        [ipl] There is no ippool start script/ipfilter magic t
o kern/132354  net        [nat] Getting some packages to ipnat(8) causes crash
o kern/132285  net        [carp] alias gives incorrect hash in dmesg
o kern/132277  net        [crypto] [ipsec] poor performance using cryptodevice f
o conf/132179  net        [patch] /etc/network.subr: ipv6 rtsol on incorrect wla
o kern/132107  net        [carp] carp(4) advskew setting ignored when carp IP us
o kern/131781  net        [ndis] ndis keeps dropping the link
o kern/131776  net        [wi] driver fails to init
o kern/131753  net        [altq] [panic] kernel panic in hfsc_dequeue
o bin/131567   net        [socket] [patch] Update for regression/sockets/unix_cm
o kern/131549  net        ifconfig(8) can't clear 'monitor' mode on the wireless
o kern/131536  net        [netinet] [patch] kernel does allow manipulation of su
o bin/131365   net        route(8): route add changes interpretation of network 
o kern/131310  net        [netgraph] [panic] 7.1 panics with mpd netgraph interf
o kern/131162  net        [ath] Atheros driver bugginess and kernel crashes
o kern/131153  net        [iwi] iwi doesn't see a wireless network
f kern/131087  net        [ipw] [panic] ipw / iwi - no sent/received packets; iw
f kern/130820  net        [ndis] wpa_supplicant(8) returns 'no space on device'
o kern/130628  net        [nfs] NFS / rpc.lockd deadlock on 7.1-R
o conf/130555  net        [rc.d] [patch] No good way to set ipfilter variables a
o kern/130525  net        [ndis] [panic] 64 bit ar5008 ndisgen-erated driver cau
o kern/130311  net        [wlan_xauth] [panic] hostapd restart causing kernel pa
o bin/130159   net        [patch] ppp(8) fails to correctly set routes
o kern/130109  net        [ipfw] Can not set fib for packets originated from loc
f kern/130059  net        [panic] Leaking 50k mbufs/hour
o kern/129750  net        [ath] Atheros AR5006 exits on "cannot map register spa
f kern/129719  net        [nfs] [panic] Panic during shutdown, tcp_ctloutput: in
o kern/129580  net        [ndis] Netgear WG311v3 (ndis) causes kenel trap at boo
o kern/129517  net        [ipsec] [panic] double fault / stack overflow
o kern/129508  net        [carp] [panic] Kernel panic with EtherIP (may be relat
o kern/129352  net        [xl] [patch] xl0 watchdog timeout
o kern/129219  net        [ppp] Kernel panic when using kernel mode ppp
o kern/129197  net        [panic] 7.0 IP stack related panic
o kern/129135  net        [vge] vge driver on a VIA mini-ITX not working
o bin/128954   net        ifconfig(8) deletes valid routes
o kern/128917  net        [wpi] [panic] if_wpi and wpa+tkip causing kernel panic
o kern/128884  net        [msk] if_msk page fault while in kernel mode
o kern/128840  net        [igb] page fault under load with igb/LRO
o bin/128602   net        [an] wpa_supplicant(8) crashes with an(4)
o kern/128598  net        [bluetooth] WARNING: attempt to net_add_domain(bluetoo
o kern/128448  net        [nfs] 6.4-RC1 Boot Fails if NFS Hostname cannot be res
o conf/128334  net        [request] use wpa_cli in the "WPA DHCP" situation
o bin/128295   net        [patch] ifconfig(8) does not print TOE4 or TOE6 capabi
o bin/128001   net        wpa_supplicant(8), wlan(4), and wi(4) issues
o kern/127928  net        [tcp] [patch] TCP bandwidth gets squeezed every time t
o kern/127834  net        [ixgbe] [patch] wrong error counting
o kern/127826  net        [iwi] iwi0 driver has reduced performance and connecti
o kern/127815  net        [gif] [patch] if_gif does not set vlan attributes from
o kern/127724  net        [rtalloc] rtfree: 0xc5a8f870 has 1 refs
f bin/127719   net        [arp] arp: Segmentation fault (core dumped)
s kern/127587  net        [bge] [request] if_bge(4) doesn't support BCM576X fami
f kern/127528  net        [icmp]: icmp socket receives icmp replies not owned by
o bin/127192   net        routed(8) removes the secondary alias IP of interface 
f kern/127145  net        [wi]: prism (wi) driver crash at bigger traffic
o kern/127102  net        [wpi] Intel 3945ABG low throughput
o kern/127057  net        [udp] Unable to send UDP packet via IPv6 socket to IPv
o kern/127050  net        [carp] ipv6 does not work on carp interfaces [regressi
o kern/126945  net        [carp] CARP interface destruction with ifconfig destro
o kern/126924  net        [an] [patch] printf -> device_printf and simplify prob
o kern/126895  net        [patch] [ral] Add antenna selection (marked as TBD)
o kern/126874  net        [vlan]: Zebra problem if ifconfig vlanX destroy
o bin/126822   net        wpa_supplicant(8): WPA PSK does not work in adhoc mode
o kern/126714  net        [carp] CARP interface renaming makes system no longer 
o kern/126695  net        rtfree messages and network disruption upon use of if_
o kern/126688  net        [ixgbe] [patch] 1.4.7 ixgbe driver panic with 4GB and 
o kern/126475  net        [ath] [panic] ath pcmcia card inevitably panics under 
o kern/126339  net        [ipw] ipw driver drops the connection
o kern/126214  net        [ath] txpower problem with Atheros wifi card
o kern/126075  net        [inet] [patch] internet control accesses beyond end of
o bin/125922   net        [patch] Deadlock in arp(8)
o kern/125920  net        [arp] Kernel Routing Table loses Ethernet Link status 
o kern/125845  net        [netinet] [patch] tcp_lro_rx() should make use of hard
o kern/125816  net        [carp] [if_bridge] carp stuck in init when using bridg
f kern/125502  net        [ral] ifconfig ral0 scan produces no output unless in 
o kern/125258  net        [socket] socket's SO_REUSEADDR option does not work
o kern/125239  net        [gre] kernel crash when using gre
f kern/125195  net        [fxp] fxp(4) driver failed to initialize device Intel 
o kern/124904  net        [fxp] EEPROM corruption with Compaq NC3163 NIC
o kern/124767  net        [iwi] Wireless connection using iwi0 driver (Intel 220
o kern/124753  net        [ieee80211] net80211 discards power-save queue packets
o kern/124341  net        [ral] promiscuous mode for wireless device ral0 looses
o kern/124160  net        [libc] connect(2) function loops indefinitely
o kern/124127  net        [msk] watchdog timeout (missed Tx interrupts) -- recov
o kern/124021  net        [ip6] [panic] page fault in nd6_output()
o kern/123968  net        [rum] [panic] rum driver causes kernel panic with WPA.
p kern/123961  net        [vr] [patch] Allow vr interface to handle vlans
o kern/123892  net        [tap] [patch] No buffer space available
o kern/123890  net        [ppp] [panic] crash & reboot on work with PPP low-spee
o kern/123858  net        [stf] [patch] stf not usable behind a NAT
o kern/123796  net        [ipf] FreeBSD 6.1+VPN+ipnat+ipf: port mapping does not
o bin/123633   net        ifconfig(8) doesn't set inet and ether address in one 
f kern/123617  net        [tcp] breaking connection when client downloading file
o kern/123603  net        [tcp] tcp_do_segment and Received duplicate SYN
o kern/123559  net        [iwi] iwi periodically disassociates/associates [regre
o bin/123465   net        [ip6] route(8): route add -inet6 <ipv6_addr> -interfac
o kern/123463  net        [ipsec] [panic] repeatable crash related to ipsec-tool
o kern/123429  net        [nfe] [hang] "ifconfig nfe up" causes a hard system lo
o kern/123347  net        [bge] bge1: watchdog timeout -- linkstate changed to D
o conf/123330  net        [nsswitch.conf] Enabling samba wins in nsswitch.conf c
o kern/123256  net        [wpi] panic: blockable sleep lock with wpi(4)
f kern/123172  net        [bce] Watchdog timeout problems with if_bce
o kern/123160  net        [ip] Panic and reboot at sysctl kern.polling.enable=0
o kern/122989  net        [swi] [panic] 6.3 kernel panic in swi1: net
o kern/122954  net        [lagg] IPv6 EUI64 incorrectly chosen for lagg devices
o kern/122928  net        [em] interface watchdog timeouts and stops receiving p
f kern/122839  net        [multicast] FreeBSD 7 multicast routing problem
p kern/122794  net        [lagg] Kernel panic after brings lagg(8) up if NICs ar
o kern/122780  net        [lagg] tcpdump on lagg interface during high pps wedge
o kern/122772  net        [em] em0 taskq panic, tcp reassembly bug causes radix 
o kern/122743  net        [mbuf] [panic] vm_page_unwire: invalid wire count: 0
o kern/122697  net        [ath] Atheros card is not well supported
o kern/122685  net        It is not visible passing packets in tcpdump(1)
o kern/122551  net        [bge] Broadcom 5715S no carrier on HP BL460c blade usi
o kern/122319  net        [wi] imposible to enable ad-hoc demo mode with Orinoco
o kern/122290  net        [netgraph] [panic] Netgraph related "kmem_map too smal
f kern/122252  net        [ipmi] [bge] IPMI problem with BCM5704 (does not work 
o kern/122195  net        [ed] Alignment problems in if_ed
o kern/122058  net        [em] [panic] Panic on em1: taskq
o kern/122033  net        [ral] [lor] Lock order reversal in ral0 at bootup [reg
o kern/121983  net        [fxp] fxp0 MBUF and PAE
o bin/121895   net        [patch] rtsol(8)/rtsold(8) doesn't handle managed netw
o kern/121872  net        [wpi] driver fails to attach on a fujitsu-siemens s711
s kern/121774  net        [swi] [panic] 6.3 kernel panic in swi1: net
o kern/121706  net        [netinet] [patch] "rtfree: 0xc4383870 has 1 refs" emit
o kern/121624  net        [em] [regression] Intel em WOL fails after upgrade to 
o kern/121555  net        [panic] Fatal trap 12: current process = 12 (swi1: net
o kern/121443  net        [gif] [lor] icmp6_input/nd6_lookup
o kern/121437  net        [vlan] Routing to layer-2 address does not work on VLA
o bin/121359   net        [patch] ppp(8): fix local stack overflow in ppp
o kern/121298  net        [em] [panic] Fatal trap 12: page fault while in kernel
o kern/121257  net        [tcp] TSO + natd  -> slow outgoing tcp traffic
o kern/121181  net        [panic] Fatal trap 3: breakpoint instruction fault whi
o kern/121080  net        [bge] IPv6 NUD problem on multi address config on bge0
o kern/120966  net        [rum] kernel panic with if_rum and WPA encryption
p docs/120945  net        [patch] ip6(4) man page lacks documentation for TCLASS
o kern/120566  net        [request]: ifconfig(8) make order of arguments more fr
o kern/120304  net        [netgraph] [patch] netgraph source assumes 32-bit time
o kern/120266  net        [udp] [panic] gnugk causes kernel panic when closing U
o kern/120232  net        [nfe] [patch] Bring in nfe(4) to RELENG_6
o kern/120130  net        [carp] [panic] carp causes kernel panics in any conste
o bin/120060   net        routed(8) deletes link-level routes in the presence of
o kern/119945  net        [rum] [panic] rum device in hostap mode, cause kernel 
o kern/119791  net        [nfs] UDP NFS mount of aliased IP addresses from a Sol
o kern/119617  net        [nfs] nfs error on wpa network when reseting/shutdown
f kern/119516  net        [ip6] [panic] _mtx_lock_sleep: recursed on non-recursi
o kern/119432  net        [arp] route add -host <host> -iface <nic> causes arp e
o kern/119225  net        [wi] 7.0-RC1 no carrier with Prism 2.5 wifi card [regr
a bin/118987   net        ifconfig(8): ifconfig -l (address_family) does not wor
o sparc/118932 net        [panic] 7.0-BETA4/sparc-64 kernel panic in rip_output
a kern/118879  net        [bge] [patch] bge has checksum problems on the 5703 ch
o kern/118727  net        [netgraph] [patch] [request] add new ng_pf module
s kern/117717  net        [panic] Kernel panic with Bittorrent client.
o kern/117448  net        [carp] 6.2 kernel crash [regression]
o kern/117423  net        [vlan] Duplicate IP on different interfaces
o bin/117339   net        [patch] route(8): loading routing management commands 
o kern/117271  net        [tap] OpenVPN TAP uses 99% CPU on releng_6 when if_tap
o kern/117043  net        [em] Intel PWLA8492MT Dual-Port Network adapter EEPROM
o kern/116837  net        [tun] [panic] [patch] ifconfig tunX destroy: panic
o kern/116747  net        [ndis] FreeBSD 7.0-CURRENT crash with Dell TrueMobile 
o bin/116643   net        [patch] [request] fstat(1): add INET/INET6 socket deta
o kern/116328  net        [bge]: Solid hang with bge interface
o kern/116185  net        [iwi] if_iwi driver leads system to reboot
o kern/115239  net        [ipnat] panic with 'kmem_map too small' using ipnat
o kern/115019  net        [netgraph] ng_ether upper hook packet flow stops on ad
o kern/115002  net        [wi] if_wi timeout. failed allocation (busy bit). ifco
o kern/114915  net        [patch] [pcn] pcn (sys/pci/if_pcn.c) ethernet driver f
f kern/114899  net        [bge] bge0: watchdog timeout -- resetting
o kern/114839  net        [fxp] fxp looses ability to speak with traffic
o kern/113895  net        [xl] xl0 fails on 6.2-RELEASE but worked fine on 5.5-R
o kern/112722  net        [ipsec] [udp] IP v4 udp fragmented packet reject
o kern/112686  net        [patm] patm driver freezes System (FreeBSD 6.2-p4) i38
o kern/112570  net        [bge] packet loss with bge driver on BCM5704 chipset
o bin/112557   net        [patch] ppp(8) lock file should not use symlink name
o kern/112528  net        [nfs] NFS over TCP under load hangs with "impossible p
o kern/111457  net        [ral] ral(4) freeze
o kern/110140  net        [ipw] ipw fails under load
o kern/109733  net        [bge] bge link state issues [regression]
o kern/109470  net        [wi] Orinoco Classic Gold PC Card Can't Channel Hop
o kern/109308  net        [pppd] [panic] Multiple panics kernel ppp suspected [r
o kern/109251  net        [re] [patch] if_re cardbus card won't attach
o bin/108895   net        pppd(8): PPPoE dead connections on 6.2 [regression]
o kern/108542  net        [bce] Huge network latencies with 6.2-RELEASE / STABLE
o kern/107944  net        [wi] [patch] Forget to unlock mutex-locks
o kern/107850  net        [bce] bce driver link negotiation is faulty
o conf/107035  net        [patch] bridge(8): bridge interface given in rc.conf n
o kern/106438  net        [ipf] ipfilter: keep state does not seem to allow repl
o kern/106316  net        [dummynet] dummynet with multipass ipfw drops packets 
o kern/106243  net        [nve] double fault panic in if_nve.c on high loads
o kern/105945  net        Address can disappear from network interface
s kern/105943  net        Network stack may modify read-only mbuf chain copies
o bin/105925   net        problems with ifconfig(8) and vlan(4) [regression]
o kern/105348  net        [ath] ath device stopps TX
o kern/104851  net        [inet6] [patch] On link routes not configured when usi
o kern/104751  net        [netgraph] kernel panic, when getting info about my tr
o kern/104485  net        [bge] Broadcom BCM5704C: Intermittent on newer chip ve
o kern/103191  net        Unpredictable reboot
o kern/103135  net        [ipsec] ipsec with ipfw divert (not NAT) encodes a pac
o conf/102502  net        [netgraph] [patch] ifconfig name does't rename netgrap
o kern/102035  net        [plip] plip networking disables parallel port printing
o kern/101948  net        [ipf] [panic] Kernel Panic Trap No 12 Page Fault - cau
o kern/100709  net        [libc] getaddrinfo(3) should return TTL info
o kern/100519  net        [netisr] suggestion to fix suboptimal network polling
o kern/98978   net        [ipf] [patch] ipfilter drops OOW packets under 6.1-Rel
o kern/98597   net        [inet6] Bug in FreeBSD 6.1 IPv6 link-local DAD procedu
o bin/98218    net        wpa_supplicant(8) blacklist not working
f bin/97392    net        ppp(8) hangs instead terminating
o kern/97306   net        [netgraph] NG_L2TP locks after connection with failed 
f kern/96268   net        [socket] TCP socket performance drops by 3000% if pack
o kern/96030   net        [bfe] [patch] Install hangs with Broadcomm 440x NIC in
o kern/95519   net        [ral] ral0 could not map mbuf
o kern/95288   net        [pppd] [tty] [panic] if_ppp panic in sys/kern/tty_subr
o kern/95277   net        [netinet] [patch] IP Encapsulation mask_match() return
o kern/95267   net        packet drops periodically appear
s kern/94863   net        [bge] [patch] hack to get bge(4) working on IBM e326m
o kern/94162   net        [bge] 6.x kenel stale with bge(4)
o kern/93886   net        [ath] Atheros/D-Link DWL-G650 long delay to associate 
f kern/93378   net        [tcp] Slow data transfer in Postfix and Cyrus IMAP (wo
o kern/93019   net        [ppp] ppp and tunX problems: no traffic after restarti
o kern/92880   net        [libc] [patch] almost rewritten inet_network(3) functi
f kern/92552   net        A serious bug in most network drivers from 5.X to 6.X 
s kern/92279   net        [dc] Core faults everytime I reboot, possible NIC issu
o kern/92090   net        [bge] bge0: watchdog timeout -- resetting
o kern/91859   net        [ndis] if_ndis does not work with Asus WL-138
s kern/91777   net        [ipf] [patch] wrong behaviour with skip rule inside an
o kern/91594   net        [em] FreeBSD > 5.4 w/ACPI fails to detect Intel Pro/10
o kern/91364   net        [ral] [wep] WF-511 RT2500 Card PCI and WEP
o kern/91311   net        [aue] aue interface hanging
o kern/90890   net        [vr] Problems with network: vr0: tx shutdown timeout
s kern/90086   net        [hang] 5.4p8 on supermicro P8SCT hangs during boot if 
f kern/88082   net        [ath] [panic] cts protection for ath0 causes panic
o kern/87521   net        [ipf] [panic] using ipfilter "auth" keyword leads to k
o kern/87506   net        [vr] [patch] Fix alias support on vr interfaces
o kern/87194   net        [fxp] fxp(4) promiscuous mode seems to corrupt hw-csum
s kern/86920   net        [ndis] ifconfig: SIOCS80211: Invalid argument [regress
o kern/86103   net        [ipf] Illegal NAT Traversal in IPFilter
o kern/85780   net        'panic: bogus refcnt 0' in routing/ipv6
o bin/85445    net        ifconfig(8): deprecated keyword to ifconfig inoperativ
o kern/85266   net        [xe] [patch] xe(4) driver does not recognise Xircom XE
o kern/84202   net        [ed] [patch] Holtek HT80232 PCI NIC recognition on Fre
o bin/82975    net        route change does not parse classfull network as given
o kern/82497   net        [vge] vge(4) on AMD64 only works when loaded late, not
f kern/81644   net        [vge] vge(4) does not work properly when loaded as a K
s kern/81147   net        [net] [patch] em0 reinitialization while adding aliase
o kern/80853   net        [ed] [patch] add support for Compex RL2000/ISA in PnP 
o kern/79895   net        [ipf] 5.4-RC2 breaks ipfilter NAT when using netgraph 
f kern/79262   net        [dc] Adaptec ANA-6922 not fully supported
o bin/79228    net        [patch] extend arp(8) to be able to create blackhole r
o kern/78090   net        [ipf] ipf filtering on bridged packets doesn't work if
p kern/77913   net        [wi] [patch] Add the APDL-325 WLAN pccard to wi(4)
o kern/77341   net        [ip6] problems with IPV6 implementation
o kern/77273   net        [ipf] ipfilter breaks ipv6 statefull filtering on 5.3
s kern/77195   net        [ipf] [patch] ipfilter ioctl SIOCGNATL does not match 
o kern/75873   net        Usability problem with non-RFC-compliant IP spoof prot
s kern/75407   net        [an] an(4): no carrier after short time
f kern/73538   net        [bge] problem with the Broadcom BCM5788 Gigabit Ethern
o kern/71469   net        default route to internet magically disappears with mu
o kern/70904   net        [ipf] ipfilter ipnat problem with h323 proxy support
o kern/64556   net        [sis] if_sis short cable fix problems with NetGear FA3
s kern/60293   net        [patch] FreeBSD arp poison patch
o kern/54383   net        [nfs] [patch] NFS root configurations without dynamic 
f i386/45773   net        [bge] Softboot causes autoconf failure on Broadcom 570
s bin/41647    net        ifconfig(8) doesn't accept lladdr along with inet addr
s kern/39937   net        ipstealth issue
a kern/38554   net        [patch] changing interface ipaddress doesn't seem to w
o kern/35442   net        [sis] [patch] Problem transmitting runts in if_sis dri
o kern/34665   net        [ipf] [hang] ipfilter rcmd proxy "hangs".
o kern/31647   net        [libc] socket calls can return undocumented EINVAL
o kern/30186   net        [libc] getaddrinfo(3) does not handle incorrect servna
o kern/27474   net        [ipf] [ppp] Interactive use of user PPP and ipfilter c
o conf/23063   net        [arp] [patch] for static ARP tables in rc.network

287 problems total.


From owner-freebsd-net@FreeBSD.ORG  Mon Apr  6 11:59:11 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2EE9110656D1;
	Mon,  6 Apr 2009 11:59:11 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id E28268FC17;
	Mon,  6 Apr 2009 11:59:10 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id 84C6346B82;
	Mon,  6 Apr 2009 07:59:10 -0400 (EDT)
Date: Mon, 6 Apr 2009 12:59:10 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Ivan Voras <ivoras@freebsd.org>
In-Reply-To: <grbcfg$poe$1@ger.gmane.org>
Message-ID: <alpine.BSF.2.00.0904061238250.34905@fledge.watson.org>
References: <gra7mq$ei8$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>
	<grac1s$p56$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051440460.12639@fledge.watson.org>
	<grappq$tsg$1@ger.gmane.org>
	<alpine.BSF.2.00.0904052243250.34905@fledge.watson.org>
	<grbcfg$poe$1@ger.gmane.org>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Apr 2009 11:59:12 -0000

On Mon, 6 Apr 2009, Ivan Voras wrote:

>>> I'd like to understand more. If (in netisr) I have a mbuf with headers, is 
>>> this data already transferred from the card or is it magically "not here 
>>> yet"?
>>
>> A lot depends on the details of the card and driver.  The driver will take 
>> cache misses on the descriptor ring entry, if it's not already in cache, 
>> and the link layer will take a cache miss on the front of the ethernet 
>> frame in the cluster pointed to by the mbuf header as part of its demux. 
>> What happens next depends on your dispatch model and cache line size. 
>> Let's make a few simplifying assumptions that are mostly true:
>
> So, a mbuf can reference data not yet copied from the NIC hardware? I'm 
> specifically trying to understand what m_pullup() does.

I think we're talking slightly at cross purposes.  There are two transfers of 
interest:

(1) DMA of the packet data to main memory from the NIC
(2) Servicing of CPU cache misses to access data in main memory

By the time you receive an interrupt, the DMA is complete, so once you believe 
a packet referenced by the descriptor ring is done, you don't have to wait for 
DMA.  However, the packet data is in main memory rather than your CPU cache, 
so you'll need to take a cache miss in order to retrieve it.  You don't want 
to prefetch before you know the packet data is there, or you may prefetch 
stale data from the previous packet sent or received from the cluster.
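
A minimal sketch of that ordering (check for completion first, prefetch 
second), assuming a hypothetical descriptor layout and "done" bit; struct 
rx_desc, RXD_DONE and rx_scan() are made up for illustration and are not 
taken from any real driver.  __builtin_prefetch() is a GCC/Clang builtin.  
A real driver would also need the appropriate DMA sync and memory barriers, 
which are omitted here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RXD_DONE 0x1u	/* hypothetical "descriptor done" status bit */

struct rx_desc {	/* hypothetical descriptor layout */
	volatile uint32_t status;	/* written by the NIC when DMA completes */
	const uint8_t *buf;		/* packet data in main memory */
};

/*
 * Walk the ring from 'head', stopping at the first descriptor the NIC
 * has not finished.  Returns the number of completed packets seen.
 */
static size_t
rx_scan(struct rx_desc *ring, size_t nslots, size_t head)
{
	size_t done = 0;

	assert(nslots > 0);
	while (done < nslots) {
		struct rx_desc *d = &ring[(head + done) % nslots];

		if ((d->status & RXD_DONE) == 0)
			break;		/* DMA not complete: do not touch buf */
		__builtin_prefetch(d->buf);	/* data is known to be there */
		d->status = 0;		/* hand the slot back */
		done++;
	}
	return (done);
}
```

Prefetching only after the status check is the point: a prefetch issued 
before it could pull in whatever the previous user of the buffer left there.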

m_pullup() has to do with mbuf chain memory contiguity during packet 
processing.  The usual usage is something along the following lines:

 	struct whatever *w;

 	m = m_pullup(m, sizeof(*w));
 	if (m == NULL)
 		return;
 	w = mtod(m, struct whatever *);

m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are 
stored contiguously, so that w, obtained by casting m's data pointer, points 
at a complete structure we can use to interpret packet data.  In the common 
case in the receive path, m_pullup() should be a no-op, since almost all 
drivers receive data in a single cluster.
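
To make the contiguity idea concrete, here is a userspace toy model, not the 
kernel implementation: struct seg, SEG_CAP and toy_pullup() are hypothetical 
names, and a "chain" is just a linked list of small fixed-size buffers.  
Unlike the real m_pullup(), which frees the chain on failure, this sketch 
leaves the (partly rearranged) chain alone and simply returns NULL:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEG_CAP 64	/* toy per-segment capacity */

struct seg {		/* toy stand-in for an mbuf */
	struct seg *next;
	size_t len;	/* valid bytes at the start of data[] */
	unsigned char data[SEG_CAP];
};

/* Make the first n bytes of the chain contiguous in the first segment. */
static struct seg *
toy_pullup(struct seg *m, size_t n)
{
	struct seg *s;

	if (m == NULL || n > SEG_CAP)
		return (NULL);
	assert(m->len <= SEG_CAP);
	if (m->len >= n)
		return (m);	/* common case: nothing to do */

	for (s = m->next; s != NULL && m->len < n; s = s->next) {
		size_t take = n - m->len;

		if (take > s->len)
			take = s->len;
		/* Copy bytes to the head, then close the gap behind them. */
		memcpy(m->data + m->len, s->data, take);
		m->len += take;
		memmove(s->data, s->data + take, s->len - take);
		s->len -= take;
	}
	return (m->len >= n ? m : NULL);
}
```

The early return is the no-op case described above; the copy loop only runs 
when a header straddles two segments.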

However, there are cases where it has real work to do, such as loopback 
traffic with unusual encapsulation: a call to M_PREPEND() inserts a new mbuf 
on the front of the chain, which is later m_defrag()'d, leaving a higher 
level header crossing an mbuf boundary or the like.

This issue is almost entirely independent from things like the cache line miss 
issue, unless you hit the uncommon case of having to do work in m_pullup(), in 
which case life sucks.

It would be useful to use DTrace to profile a number of the m_foo() functions 
that can do real work, to make sure we're not hitting them in normal 
workloads, btw.

>>> As the card and the OS can already process many packets per second for
>>> something fairly complex as routing
>>> (http://www.tancsa.com/blast.html), and TCP chokes swi:net at 100% of
>>> a core, isn't this indication there's certainly more space for
>>> improvement even with a single-queue old-fashioned NICs?
>>
>> Maybe.  It depends on the relative costs of local processing vs
>> redistributing the work, which involves schedulers, IPIs, additional
>> cache misses, lock contention, and so on.  This means there's a period
>> where it can't possibly be a win, and then at some point it's a win as
>> long as the stack scales.  This is essentially the usual trade-off in
>> using threads and parallelism: does the benefit of multiple parallel
>> execution units make up for the overheads of synchronization and data
>> migration?
>
> Do you have any idea at all why I'm seeing the weird difference of netstat 
> packets per second (250,000) and my application's TCP performance (< 1,000 
> pps)? Summary: each packet is guaranteed to be a whole message causing a 
> transaction in the application - without the changes I see pps almost 
> identical to tps. Even if the source of netstat statistics somehow manages 
> to count packets multiple times (I don't see how that can happen), no 
> relation can describe differences this huge. It almost looks like something 
> in the upper layers is discarding packets (also not likely: TCP timeouts 
> would occur and the application wouldn't be able to push 250,000 pps) - but 
> what? Where to look?

Is this for the loopback workload?  If so, remember that there may be some 
other things going on:

- Every packet is processed at least two times: once when sent, and then again
   when it's received.

- A TCP segment will need to be ACK'd, so if you're sending data in chunks in
   one direction, the ACKs will not be piggy-backed on existing data transfers,
   and instead be sent independently, hitting the network stack two more times.

- Remember that TCP works to expand its window, and then maintains the highest
   performance it can by bumping up against the top of available bandwidth
   continuously.  This involves detecting buffer limits by generating packets
   that can't be sent, adding to the packet count.  With loopback traffic, the
   drop point occurs when you exceed the size of the netisr's queue for IP, so
   you might try bumping that from the default to something much larger.
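
As a rough back-of-the-envelope for the first two points, assuming one 
segment per message and a separate pure ACK for every segment (an upper 
bound for those two effects only; drops and retransmissions are not 
modelled, and loopback_traversals() is just an illustrative name):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate of network stack traversals on loopback for 'msgs'
 * one-segment messages, assuming every segment gets its own pure ACK.
 */
static uint64_t
loopback_traversals(uint64_t msgs)
{
	uint64_t data = 2 * msgs;	/* each segment: once sent, once received */
	uint64_t acks = 2 * msgs;	/* each standalone ACK: sent + received */

	return (data + acks);
}
```

Under these assumptions 1,000 messages per second would show up as about 
4,000 traversals per second, i.e. a factor of four at most.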

And nothing beats using tcpdump -- have you tried tcpdumping the loopback to 
see what is actually being sent?  If not, that's always educational -- perhaps 
something weird is going on with delayed ACKs, etc.

> You mean for the general code? I purposely don't lock my statistics 
> variables because I'm not that interested in exact numbers (orders of 
> magnitude are relevant). As far as I understand, unlocked "x++" should be 
> trivially fast in this case?

No.  x++ is massively slow if executed in parallel across many cores on a 
variable in a single cache line.  See my recent commit to kern_tc.c for an 
example: the updating of trivial statistics for the kernel time calls reduced 
30m syscalls/second to 3m syscalls/second due to heavy contention on the cache 
line holding the statistic.  One of my goals for 8.0 is to fix this problem 
for IP and TCP layers, and ideally also ifnet but we'll see.  We should be 
maintaining those stats per-CPU and then aggregating to report them to 
userspace.  This is what we already do for a number of system stats -- UMA and 
kernel malloc, syscall and trap counters, etc.
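
A userspace sketch of the per-CPU idea, under stated assumptions: a 64-byte 
cache line, a fixed hypothetical CPU count, and a caller that already knows 
which CPU it is running on and cannot migrate mid-update.  The names 
(pcpu_stat, rx_packets, stat_inc, stat_read) are invented for illustration 
and are not the kernel's interfaces:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64	/* assumed line size; hardware-dependent */
#define MAXCPU_TOY 8	/* hypothetical CPU count for the sketch */

struct pcpu_stat {
	/* Alignment pads each slot out to its own cache line. */
	_Alignas(CACHE_LINE) uint64_t count;
};

static struct pcpu_stat rx_packets[MAXCPU_TOY];

/* Called only by the thread running on 'cpu': no shared line, no atomics. */
static void
stat_inc(unsigned cpu)
{
	assert(cpu < MAXCPU_TOY);
	rx_packets[cpu].count++;
}

/* Reader side: sum the slots.  The total may be slightly stale. */
static uint64_t
stat_read(void)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < MAXCPU_TOY; i++)
		sum += rx_packets[i].count;
	return (sum);
}
```

The padding matters as much as the array: a plain uint64_t array would put 
eight counters in one line and bring the contention straight back.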

>> - Use cpuset to pin ithreads, the netisr, and whatever else, to specific
>>   cores so that they don't migrate, and if your system uses HTT, experiment
>>   with pinning the ithread and the netisr on different threads on the same
>>   core, or at least, different cores on the same die.
>
> I'm using em hardware; I still think there's a possibility I'm fighting the 
> driver in some cases but this has priority #2.

Have you tried LOCK_PROFILING?  It would quickly tell you if driver locks were 
a source of significant contention.  It works quite well...

>> - If your card supports RSS, pass the flowid up the stack in the mbuf
>>   packet header flowid field, and use that instead of the hash for work
>>   placement.
>
> Don't know about em. Don't really want to touch it if I don't have to :)

if_em doesn't support it, but if_igb does.  If this saves you a minimum of one 
and possibly two cache misses per packet, it could be a huge performance 
improvement.
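
A toy illustration of that placement decision, assuming a made-up packet 
descriptor (struct toy_pkt is not struct mbuf, and sw_hash() is an arbitrary 
mixing function, not the one the stack uses).  In this sketch the header 
fields sit in the same struct, so the cache-miss saving is only conceptual; 
in the real stack the headers live in the cluster and reading them is what 
costs the miss:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_pkt {	/* hypothetical, not struct mbuf */
	bool     has_flowid;	/* NIC supplied a flow ID (e.g. via RSS) */
	uint32_t flowid;
	uint32_t saddr, daddr;	/* header fields, read only as a fallback */
	uint16_t sport, dport;
};

/* Software fallback: has to read the packet headers. */
static uint32_t
sw_hash(const struct toy_pkt *p)
{
	uint32_t h = p->saddr ^ p->daddr ^
	    (((uint32_t)p->sport << 16) | p->dport);

	h ^= h >> 16;
	return (h * 0x9E3779B1u);
}

/* Prefer the hardware flow ID; hash in software only when it is absent. */
static unsigned
pick_worker(const struct toy_pkt *p, unsigned nworkers)
{
	uint32_t key;

	assert(nworkers > 0);
	key = p->has_flowid ? p->flowid : sw_hash(p);
	return (key % nworkers);
}
```

Either way, packets of one flow map to the same worker, which keeps 
per-connection ordering intact.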

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-net@FreeBSD.ORG  Mon Apr  6 12:09:10 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1F6291065745;
	Mon,  6 Apr 2009 12:09:10 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id E62D08FC1A;
	Mon,  6 Apr 2009 12:09:09 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id 96DAC46B90;
	Mon,  6 Apr 2009 08:09:09 -0400 (EDT)
Date: Mon, 6 Apr 2009 13:09:09 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Barney Cordoba <barney_cordoba@yahoo.com>
In-Reply-To: <86599.63596.qm@web63904.mail.re1.yahoo.com>
Message-ID: <alpine.BSF.2.00.0904061300160.34905@fledge.watson.org>
References: <86599.63596.qm@web63904.mail.re1.yahoo.com>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Apr 2009 12:09:15 -0000


On Mon, 6 Apr 2009, Barney Cordoba wrote:

> Is there a way to give a kernel thread exclusive use of a core? I know you 
> can pin a kernel thread with sched_bind(), but is there a way to keep other 
> threads from using the core? On an 8 core system it almost seems that the 
> randomness of more cores is a negative in some situations.
>
> Also, I've noticed that calling sched_bind() during bootup is a bad thing in 
> that it locks the system. I'm not certain but I suspect it's the thread_lock 
> that is the culprit. Is there a clean way to determine that it's safe to lock 
> curthread and do a cpu bind?

There isn't an interface to cleanly express "Use CPUs 4-7 for only network 
processing".  You can configure the system this way using the cpuset command 
(including directing the low-level interrupts to specific CPUs in 8.x), but if 
we think this is going to be a frequently desired policy, a bit more 
abstraction will be required.

I'm not familiar with the problem you're seeing with sched_bind() -- I'm using 
it from within some of my code without a problem, and that's fairly early in 
the boot.  A number of deadlocks are possible if one isn't very careful early 
in the boot though, so I might look specifically for some of those: if you 
migrate a thread to a CPU that isn't yet started, it won't be able to run 
until the CPU has started.  This means it's important not to migrate threads 
that might lead to priority inversion-like deadlocks:

- Be careful not to migrate threads that hold locks the system requires to get
   to the point where multiple CPUs run.
- Be careful not to migrate threads that will signal a resource being
   available, such as a device driver, required to get to the point where
   multiple CPUs run.
- Be careful not to migrate the main boot thread.

Could you be running into one of those cases?  Usually they're fairly easy to 
diagnose using DDB, if you can get into it, because you can see what the main 
boot thread is waiting for, and reason about what's holding it.  Are you able 
to get into DDB when this occurs?  (Perhaps using an NMI?)

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-net@FreeBSD.ORG  Mon Apr  6 12:35:57 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6DBCB10656F6
	for <freebsd-net@freebsd.org>; Mon,  6 Apr 2009 12:35:57 +0000 (UTC)
	(envelope-from freebsd-net@m.gmane.org)
Received: from ciao.gmane.org (main.gmane.org [80.91.229.2])
	by mx1.freebsd.org (Postfix) with ESMTP id C82BE8FC0A
	for <freebsd-net@freebsd.org>; Mon,  6 Apr 2009 12:35:56 +0000 (UTC)
	(envelope-from freebsd-net@m.gmane.org)
Received: from list by ciao.gmane.org with local (Exim 4.43)
	id 1Lqo3P-0003Qo-Cg
	for freebsd-net@freebsd.org; Mon, 06 Apr 2009 12:35:55 +0000
Received: from lara.cc.fer.hr ([161.53.72.113])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-net@freebsd.org>; Mon, 06 Apr 2009 12:35:55 +0000
Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-net@freebsd.org>; Mon, 06 Apr 2009 12:35:55 +0000
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-net@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Mon, 06 Apr 2009 14:35:33 +0200
Lines: 168
Message-ID: <grcsus$9vh$1@ger.gmane.org>
References: <gra7mq$ei8$1@ger.gmane.org>	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>	<grac1s$p56$1@ger.gmane.org>	<alpine.BSF.2.00.0904051440460.12639@fledge.watson.org>	<grappq$tsg$1@ger.gmane.org>	<alpine.BSF.2.00.0904052243250.34905@fledge.watson.org>	<grbcfg$poe$1@ger.gmane.org>
	<alpine.BSF.2.00.0904061238250.34905@fledge.watson.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature";
	boundary="------------enig2259B8C6FCD2C8A9C92854A6"
X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr
User-Agent: Thunderbird 2.0.0.21 (X11/20090318)
In-Reply-To: <alpine.BSF.2.00.0904061238250.34905@fledge.watson.org>
X-Enigmail-Version: 0.95.0
Sender: news <news@ger.gmane.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Apr 2009 12:35:57 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig2259B8C6FCD2C8A9C92854A6
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Robert Watson wrote:
> On Mon, 6 Apr 2009, Ivan Voras wrote:

>> So, a mbuf can reference data not yet copied from the NIC hardware?
>> I'm specifically trying to understand what m_pullup() does.
>
> I think we're talking slightly at cross purposes.  There are two
> transfers of interest:
>
> (1) DMA of the packet data to main memory from the NIC
> (2) Servicing of CPU cache misses to access data in main memory
>
> By the time you receive an interrupt, the DMA is complete, so once you

OK, this was what was confusing me - for a moment I thought you meant
it's not so.

> believe a packet referenced by the descriptor ring is done, you don't
> have to wait for DMA.  However, the packet data is in main memory rather
> than your CPU cache, so you'll need to take a cache miss in order to
> retrieve it.  You don't want to prefetch before you know the packet data
> is there, or you may prefetch stale data from the previous packet sent
> or received from the cluster.
>
> m_pullup() has to do with mbuf chain memory contiguity during packet
> processing.  The usual usage is something along the following lines:
>
>     struct whatever *w;
>
>     m = m_pullup(m, sizeof(*w));
>     if (m == NULL)
>         return;
>     w = mtod(m, struct whatever *);
>
> m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are
> contiguously stored so that the cast of w to m's data will point at a

So, m_pullup() can resize / realloc() the mbuf? (not that it matters for
this purpose)

> Is this for the loopback workload?  If so, remember that there may be
> some other things going on:

Both loopback and physical.

> - Every packet is processed at least two times: once when sent, and then
>   again when it's received.
>
> - A TCP segment will need to be ACK'd, so if you're sending data in
>   chunks in one direction, the ACKs will not be piggy-backed on existing
>   data transfers, and instead be sent independently, hitting the network
>   stack two more times.

No combination of these can make an accounting difference between 1,000
and 250,000 pps. I must be hitting something very bad here.

> - Remember that TCP works to expand its window, and then maintains the
>   highest performance it can by bumping up against the top of available
>   bandwidth continuously.  This involves detecting buffer limits by
>   generating packets that can't be sent, adding to the packet count.  With
>   loopback traffic, the drop point occurs when you exceed the size of the
>   netisr's queue for IP, so you might try bumping that from the default to
>   something much larger.

My messages are approx. 100 +/- 10 bytes. No practical way they will
even span multiple mbufs. TCP_NODELAY is on.

> No.  x++ is massively slow if executed in parallel across many cores on
> a variable in a single cache line.  See my recent commit to kern_tc.c
> for an example: the updating of trivial statistics for the kernel time
> calls reduced 30m syscalls/second to 3m syscalls/second due to heavy
> contention on the cache line holding the statistic.  One of my goals for

I don't get it:
http://svn.freebsd.org/viewvc/base/stable/7/sys/kern/kern_tc.c?r1=189891&r2=189890&pathrev=189891

you replaced x++ with no-ops if TC_COUNTER is defined? Aren't the
timecounters actually needed somewhere?

> 8.0 is to fix this problem for IP and TCP layers, and ideally also ifnet
> but we'll see.  We should be maintaining those stats per-CPU and then
> aggregating to report them to userspace.  This is what we already do for
> a number of system stats -- UMA and kernel malloc, syscall and trap
> counters, etc.

How magic is this? Is it just a matter of declaring mystatarray[NCPU]
and updating mystat[current_cpu] or (probably), the spacing between
array elements should be magically fixed so two elements don't share a
cache line?

>>> - Use cpuset to pin ithreads, the netisr, and whatever else, to specific
>>>   cores so that they don't migrate, and if your system uses HTT,
>>>   experiment with pinning the ithread and the netisr on different threads
>>>   on the same core, or at least, different cores on the same die.
>>
>> I'm using em hardware; I still think there's a possibility I'm
>> fighting the driver in some cases but this has priority #2.
>
> Have you tried LOCK_PROFILING?  It would quickly tell you if driver
> locks were a source of significant contention.  It works quite well...

I don't think I'm fighting against locking artifacts, it looks more like
some kind of overly smart hardware thing, like interrupt moderation (but
not exactly interrupt moderation since the number of IRQs/s remains
approx. the same).

>>> - If your card supports RSS, pass the flowid up the stack in the mbuf
>>>   packet header flowid field, and use that instead of the hash for work
>>>   placement.
>>
>> Don't know about em. Don't really want to touch it if I don't have to :)
>
> if_em doesn't support it, but if_igb does.  If this saves you a minimum
> of one and possibly two cache misses per packet, it could be a huge
> performance improvement.

If I had the funds to upgrade hardware, I wouldn't be so interested in
solving it in software :)


--------------enig2259B8C6FCD2C8A9C92854A6
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFJ2fccldnAQVacBcgRAnUsAKDvLaUuooKGdMVtT+qJDLQXFNQ/CQCeJvP3
2Xzrk5yV4QbhBpmg5XvCqPk=
=0776
-----END PGP SIGNATURE-----

--------------enig2259B8C6FCD2C8A9C92854A6--


From owner-freebsd-net@FreeBSD.ORG  Mon Apr  6 13:41:01 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B9D6910656CA
	for <freebsd-net@freebsd.org>; Mon,  6 Apr 2009 13:41:01 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63906.mail.re1.yahoo.com (web63906.mail.re1.yahoo.com
	[69.147.97.121]) by mx1.freebsd.org (Postfix) with SMTP id 771C68FC23
	for <freebsd-net@freebsd.org>; Mon,  6 Apr 2009 13:41:01 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 42589 invoked by uid 60001); 6 Apr 2009 13:41:00 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1239025260; bh=4zCf0qunp1yvuBrFzFbUJrL2OUSB327jY2S0yHu6Otk=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=o6EkvyJfY851396ia3yZsEMP+sbop7DFQjpD7UwXkDRA2PrTTnPHlzKm+avcWWMPW2AUiWuSdr87dD3xM9T9q2OrahshD7btZDk9zX1FT+BuDENx0D5e/oB/TuQg1D12/ZkoP4ahJ24Fh1nBQVlFr8sQ+bgNlE3XUIZtxJo1m4o=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=pGj/cgoD2ixdz3Dw6zSWaHXfhVVJiNeF740dr+0uXNlcpXggQjEbnH5K56KOCeFammkAQ4N4aGfQDf29sN5w7rnWSmY5I237KmE3FM0n8DV/ouyOPRio29hY0FDhWLZVIa1RT+Kjti7dHKI9OuQVdal6rqu8kO3ZecSI0lCYJoA=;
Message-ID: <812958.41771.qm@web63906.mail.re1.yahoo.com>
X-YMail-OSG: F3zTFRsVM1kGN4FvWZRmCWc7y9S5oeB4I4iqDv.KXpzSQtqyjerSwCOvJYslesZUyl_iwQ7LIir507cIyIMo4p8DXw57bV3x_fxwOQMb3C5zh273_JfLvLSTOnoHbmjqrofeM9uFM4cbNfPJ7o7KhsSUhXqSYnqIi_wDLLxXQevZVu5Y0Qc73QsK9t348kXfBlicKrxISmpJu6Msn.SvgDyzFZKZzxyA.Yeg3gEpiuP8waiZMmVyA3K45EL4KIE461W7ieLCjD20bMDg0uuu8sf7CDHOvIu5H3UMGTN5FTJAafiYbBgtwXecRHa5fM9ceguL8Qd4Hw4o_JeU3rxvPzhI
Received: from [98.242.222.229] by web63906.mail.re1.yahoo.com via HTTP;
	Mon, 06 Apr 2009 06:41:00 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Mon, 6 Apr 2009 06:41:00 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: Robert Watson <rwatson@FreeBSD.org>
In-Reply-To: <alpine.BSF.2.00.0904061300160.34905@fledge.watson.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Apr 2009 13:41:02 -0000


--- On Mon, 4/6/09, Robert Watson <rwatson@FreeBSD.org> wrote:

> From: Robert Watson <rwatson@FreeBSD.org>
> Subject: Re: Advice on a multithreaded netisr  patch?
> To: "Barney Cordoba" <barney_cordoba@yahoo.com>
> Cc: freebsd-net@freebsd.org, "Ivan Voras" <ivoras@freebsd.org>
> Date: Monday, April 6, 2009, 8:09 AM
> On Mon, 6 Apr 2009, Barney Cordoba wrote:
> 
> > Is there a way to give a kernel thread exclusive use of a core? I know
> > you can pin a kernel thread with sched_bind(), but is there a way to
> > keep other threads from using the core? On an 8 core system it almost
> > seems that the randomness of more cores is a negative in some situations.
> >
> > Also, I've noticed that calling sched_bind() during bootup is a bad
> > thing in that it locks the system. I'm not certain but I suspect it's the
> > thread_lock that is the culprit. Is there a clean way to determine that
> > it's safe to lock curthread and do a cpu bind?
>
> There isn't an interface to cleanly express "Use CPUs 4-7 for only
> network processing".  You can configure the system this way using the
> cpuset command (including directing the low-level interrupts to specific
> CPUs in 8.x), but if we think this is going to be a frequently desired
> policy, a bit more abstraction will be required.
>
> I'm not familiar with the problem you're seeing with sched_bind() -- I'm
> using it from within some of my code without a problem, and that's fairly
> early in the boot.  A number of deadlocks are possible if one isn't very
> careful early in the boot though, so I might look specifically for some
> of those: if you migrate a thread to a CPU that isn't yet started, it
> won't be able to run until the CPU has started.  This means it's
> important not to migrate threads that might lead to priority
> inversion-like deadlocks:
>
> - Be careful not to migrate threads that hold locks the system requires
>   to get to the point where multiple CPUs run.
> - Be careful not to migrate threads that will signal a resource being
>   available, such as a device driver, required to get to the point where
>   multiple CPUs run.
> - Be careful not to migrate the main boot thread.
>
> Could you be running into one of those cases?  Usually they're fairly
> easy to diagnose using DDB, if you can get into it, because you can see
> what the main boot thread is waiting for, and reason about what's holding
> it.  Are you able to get into DDB when this occurs?  (Perhaps using an
> NMI?)

Yes, the CPUs are launched quite late, so that must be it. I guess
mp_ncpus is set before they are launched. Is there a way to determine
that a specific core has been launched?

Regarding using cpuset, John B indicated that you couldn't allocate
"sets" for kernel threads; and that sched_bind() was the only function
available. So that brings 2 questions:

1) How do you get the thread ID for a process from user space to use with
cpuset? I don't see that ps displays it.

2) Can cpu sets be manipulated / setup from within the kernel?

Barney


From owner-freebsd-net@FreeBSD.ORG  Mon Apr  6 15:53:16 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1AB4A10656C9
	for <freebsd-net@freebsd.org>; Mon,  6 Apr 2009 15:53:16 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63901.mail.re1.yahoo.com (web63901.mail.re1.yahoo.com
	[69.147.97.116]) by mx1.freebsd.org (Postfix) with SMTP id CB03A8FC26
	for <freebsd-net@freebsd.org>; Mon,  6 Apr 2009 15:53:15 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 14138 invoked by uid 60001); 6 Apr 2009 15:53:15 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1239033195; bh=gDgdn1sMiDqPruOXKxdyfa82DifFjVhKrTAeafiac9o=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:In-Reply-To:MIME-Version:Content-Type;
	b=DKGl5uE9hIxwkyZAWSmtuW7Vsntdf9dYgzExlcHpCkOeYJX3FwFa49qFv7sXTgbvBYLp7BCGsrMA6xHLHJ5nHdRBm6GicrigrxshpUfh1+icmSOSTocR9Dp/87QA43H/IkpdmbmB3sCEbNnD9RvTrijm1n70BYj1P/83pfQb2Tg=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:In-Reply-To:MIME-Version:Content-Type;
	b=UcsWoib3cVP6AeH8bvvrb5s9VIFhexcPlEJCkIjwmUT2W2haQhDaHI3h2VHGtNrwjdGRMNfLs+PgtYF9bKPzXJPAlMSZAOc4xXU+9OwRX0mOHL4T6R8jAp+6atDunaLfb1Jhn9ZxxvXfLwwrlj+KSjjZESS/3kBZ26C17XyEvJw=;
Message-ID: <146595.14120.qm@web63901.mail.re1.yahoo.com>
X-YMail-OSG: xbSwlN4VM1kSMrb0rqUukMU11tQeLCL6tOeXS9FIt60ECdeHwRrLz9BhiqCiToQ4zRE9lRnJ1JmBWSAc5oVg1MNPsvfULKET44QZ6.LP638jipnrsBBZfTRuqsn8CgouY0qrRzLvI7WazFnPyhYnlNkQKgZIwJtJSz15OosTq9JZNqWhrwISKa1HylO0ll5NU6topvZUcbBZ0b9jhXMMvCaM4F8oTG.5F7.VbrUDW17v2pCl.mgBqnVqbVHgDSsGq3w2Cd5dW9_UtSTQBpw4.Q4ddEjPLIEC4HLeK1LHSYCAG3bYbTuXacybrcc5
Received: from [98.242.222.229] by web63901.mail.re1.yahoo.com via HTTP;
	Mon, 06 Apr 2009 08:53:14 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Mon, 6 Apr 2009 08:53:14 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
In-Reply-To: <grcsus$9vh$1@ger.gmane.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: 
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Apr 2009 15:53:17 -0000


--- On Mon, 4/6/09, Ivan Voras <ivoras@freebsd.org> wrote:

> From: Ivan Voras <ivoras@freebsd.org>
> Subject: Re: Advice on a multithreaded netisr  patch?
> To: freebsd-net@freebsd.org
> Date: Monday, April 6, 2009, 8:35 AM
> Robert Watson wrote:
> > On Mon, 6 Apr 2009, Ivan Voras wrote:
> 
> >> So, a mbuf can reference data not yet copied from the NIC hardware?
> >> I'm specifically trying to understand what m_pullup() does.
> > 
> > I think we're talking slightly at cross purposes.  There are two
> > transfers of interest:
> > 
> > (1) DMA of the packet data to main memory from the NIC
> > (2) Servicing of CPU cache misses to access data in main memory
> > 
> > By the time you receive an interrupt, the DMA is complete, so once you
> 
> OK, this was what was confusing me - for a moment I thought you meant
> it's not so.
> 
> > believe a packet referenced by the descriptor ring is done, you don't
> > have to wait for DMA.  However, the packet data is in main memory rather
> > than your CPU cache, so you'll need to take a cache miss in order to
> > retrieve it.  You don't want to prefetch before you know the packet data
> > is there, or you may prefetch stale data from the previous packet sent
> > or received from the cluster.
> > 
> > m_pullup() has to do with mbuf chain memory contiguity during packet
> > processing.  The usual usage is something along the following lines:
> > 
> >     struct whatever *w;
> > 
> >     m = m_pullup(m, sizeof(*w));
> >     if (m == NULL)
> >         return;
> >     w = mtod(m, struct whatever *);
> >
> > m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are
> > contiguously stored so that the cast of w to m's data will point at a
> 
> So, m_pullup() can resize / realloc() the mbuf? (not that it matters for
> this purpose)
> 
> > Is this for the loopback workload?  If so, remember that there may be
> > some other things going on:
> 
> Both loopback and physical.
> 
> > - Every packet is processed at least two times: once when sent, and then
> >   again when it's received.
> > 
> > - A TCP segment will need to be ACK'd, so if you're sending data in
> >   chunks in one direction, the ACKs will not be piggy-backed on existing
> >   data transfers, and instead be sent independently, hitting the network
> >   stack two more times.
> 
> No combination of these can make an accounting difference between 1,000
> and 250,000 pps. I must be hitting something very bad here.
> 
> > - Remember that TCP works to expand its window, and then maintains the
> >   highest performance it can by bumping up against the top of available
> >   bandwidth continuously.  This involves detecting buffer limits by
> >   generating packets that can't be sent, adding to the packet count.
> >   With loopback traffic, the drop point occurs when you exceed the size
> >   of the netisr's queue for IP, so you might try bumping that from the
> >   default to something much larger.
> 
> My messages are approx. 100 +/- 10 bytes. No practical way they will
> even span multiple mbufs. TCP_NODELAY is on.
> 
> > No.  x++ is massively slow if executed in parallel across many cores on
> > a variable in a single cache line.  See my recent commit to kern_tc.c
> > for an example: the updating of trivial statistics for the kernel time
> > calls reduced 30m syscalls/second to 3m syscalls/second due to heavy
> > contention on the cache line holding the statistic.  One of my goals for
> 
> I don't get it:
> http://svn.freebsd.org/viewvc/base/stable/7/sys/kern/kern_tc.c?r1=189891&r2=189890&pathrev=189891
> 
> you replaced x++ with no-ops if TC_COUNTER is defined? Aren't the
> timecounters actually needed somewhere?
> 
> > 8.0 is to fix this problem for IP and TCP layers, and ideally also ifnet
> > but we'll see.  We should be maintaining those stats per-CPU and then
> > aggregating to report them to userspace.  This is what we already do for
> > a number of system stats -- UMA and kernel malloc, syscall and trap
> > counters, etc.
> 
> How magic is this? Is it just a matter of declaring mystatarray[NCPU]
> and updating mystat[current_cpu] or (probably), the spacing between
> array elements should be magically fixed so two elements don't share a
> cache line?
> 
> >>> - Use cpuset to pin ithreads, the netisr, and whatever else, to
> >>>   specific cores so that they don't migrate, and if your system uses
> >>>   HTT, experiment with pinning the ithread and the netisr on different
> >>>   threads on the same core, or at least, different cores on the same
> >>>   die.
> >>
> >> I'm using em hardware; I still think there's a possibility I'm
> >> fighting the driver in some cases but this has priority #2.
> > 
> > Have you tried LOCK_PROFILING?  It would quickly tell you if driver
> > locks were a source of significant contention.  It works quite well...
> 
> I don't think I'm fighting against locking artifacts, it looks more like
> some kind of overly smart hardware thing, like interrupt moderation (but
> not exactly interrupt moderation since the number of IRQs/s remains
> approx. the same).
> 
> >>> - If your card supports RSS, pass the flowid up the stack in the mbuf
> >>>   packet header flowid field, and use that instead of the hash for work
> >>>   placement.
> >>
> >> Don't know about em. Don't really want to touch it if I don't have to :)
> > 
> > if_em doesn't support it, but if_igb does.  If this saves you a minimum
> > of one and possibly two cache misses per packet, it could be a huge
> > performance improvement.
>

There is no advantage to using if_igb. While the cards support more
features, the driver in FreeBSD really barely functions. There's also no
multiqueue support. Don't waste your money on a card.

Barney


From owner-freebsd-net@FreeBSD.ORG  Mon Apr  6 17:12:02 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id AACAA1065753
	for <freebsd-net@freebsd.org>; Mon,  6 Apr 2009 17:12:02 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63905.mail.re1.yahoo.com (web63905.mail.re1.yahoo.com
	[69.147.97.120]) by mx1.freebsd.org (Postfix) with SMTP id 6740E8FC0A
	for <freebsd-net@freebsd.org>; Mon,  6 Apr 2009 17:12:01 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 1357 invoked by uid 60001); 6 Apr 2009 17:12:00 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1239037920; bh=cfU4QV+bcKd8/UWEEgkNe1vUNSIILgiOGsnTVvu8Z9I=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:In-Reply-To:MIME-Version:Content-Type;
	b=FSsmhayWiGC4nTAAuecVIK47n6cxnrWtlE+QhD8BJYkb7Ejm+d8Krfnc9Z4d1OUcGts8eka9Yn4Ypv4UlvBYY1yYVhPTK6VwpfPYBFz9pHDr/wfRm1AMuN1jKHs7LjJPlK9FAhvjoYw2iOFyJblz2BDcsjBljDIJIC+2wMNpxKI=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:In-Reply-To:MIME-Version:Content-Type;
	b=H1Qi2+JOsNvX/A8BxKLfrJu+3RN6rPJ0gy5qlCbBSFulVknA2hKRpyGFKFpCUWGIMzfGuoxX9GPeLKmOyk5uwRKxCycG1/JDIqjD5ts8Uoyrcu9vekGb6gPKArtQW1jO5+2eZjwV/TD34AS2wYD0iQzclMJ0/feVTRtRTWfk/pM=;
Message-ID: <723620.1225.qm@web63905.mail.re1.yahoo.com>
X-YMail-OSG: n6fw9ssVM1ljhifbfH2ngWKkCBmgRpHow5qhyT6pg6SNwDNl1wCQnupFJCsUsDS32dCaIL7H4.zBTTiDMGbO6kJpVYKpHXek7WuAGk3qiPcbwExColYde0uBUSc6NqAMo08sx.2rLXPbziqMnzjCo_1N6kNuJilvChKjHi39DWx87SD.K0JZ1nJAKAmxcUHJUU7sJhnmjHvgo9mE8bb77J0XUBQdzRMLnX3UaLu7tmyqWBEYbGVFE.ZfKdttFwTLD9EB0qTISEVl8UvpNeph9YLJ.Cc0Q3S3oOdj4DF5SZ2XCJn.84ZydkLfPm8B6lNNHe5R2aCqMVy1BXWOnV1umm3F7JRXZiRGi5YIkYhrM2A-
Received: from [98.242.222.229] by web63905.mail.re1.yahoo.com via HTTP;
	Mon, 06 Apr 2009 10:12:00 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Mon, 6 Apr 2009 10:12:00 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: freebsd-net@freebsd.org
In-Reply-To: <grcsus$9vh$1@ger.gmane.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Apr 2009 17:12:03 -0000


--- On Mon, 4/6/09, Ivan Voras <ivoras@freebsd.org> wrote:

> From: Ivan Voras <ivoras@freebsd.org>
> Subject: Re: Advice on a multithreaded netisr  patch?
> To: freebsd-net@freebsd.org
> Date: Monday, April 6, 2009, 8:35 AM
> Robert Watson wrote:
> > On Mon, 6 Apr 2009, Ivan Voras wrote:
> 
> >> So, a mbuf can reference data not yet copied from the NIC hardware?
> >> I'm specifically trying to understand what m_pullup() does.
> > 
> > I think we're talking slightly at cross purposes.  There are two
> > transfers of interest:
> > 
> > (1) DMA of the packet data to main memory from the NIC
> > (2) Servicing of CPU cache misses to access data in main memory
> > 
> > By the time you receive an interrupt, the DMA is complete, so once you
> 
> OK, this was what was confusing me - for a moment I thought you meant
> it's not so.
> 
> > believe a packet referenced by the descriptor ring is done, you don't
> > have to wait for DMA.  However, the packet data is in main memory rather
> > than your CPU cache, so you'll need to take a cache miss in order to
> > retrieve it.  You don't want to prefetch before you know the packet data
> > is there, or you may prefetch stale data from the previous packet sent
> > or received from the cluster.
> > 
> > m_pullup() has to do with mbuf chain memory contiguity during packet
> > processing.  The usual usage is something along the following lines:
> > 
> >     struct whatever *w;
> > 
> >     m = m_pullup(m, sizeof(*w));
> >     if (m == NULL)
> >         return;
> >     w = mtod(m, struct whatever *);
> >
> > m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are
> > contiguously stored so that the cast of w to m's data will point at a
> 
> So, m_pullup() can resize / realloc() the mbuf? (not that it matters for
> this purpose)
> 
> > Is this for the loopback workload?  If so, remember that there may be
> > some other things going on:
> 
> Both loopback and physical.
> 
> > - Every packet is processed at least two times: once when sent, and then
> >   again when it's received.
> > 
> > - A TCP segment will need to be ACK'd, so if you're sending data in
> >   chunks in one direction, the ACKs will not be piggy-backed on existing
> >   data transfers, and instead be sent independently, hitting the network
> >   stack two more times.
> 
> No combination of these can make an accounting difference between 1,000
> and 250,000 pps. I must be hitting something very bad here.
> 
> > - Remember that TCP works to expand its window, and then maintains the
> >   highest performance it can by bumping up against the top of available
> >   bandwidth continuously.  This involves detecting buffer limits by
> >   generating packets that can't be sent, adding to the packet count.
> >   With loopback traffic, the drop point occurs when you exceed the size
> >   of the netisr's queue for IP, so you might try bumping that from the
> >   default to something much larger.
> 
> My messages are approx. 100 +/- 10 bytes. No practical way they will
> even span multiple mbufs. TCP_NODELAY is on.
> 
> > No.  x++ is massively slow if executed in parallel across many cores on
> > a variable in a single cache line.  See my recent commit to kern_tc.c
> > for an example: the updating of trivial statistics for the kernel time
> > calls reduced 30m syscalls/second to 3m syscalls/second due to heavy
> > contention on the cache line holding the statistic.  One of my goals for
> 
> I don't get it:
> http://svn.freebsd.org/viewvc/base/stable/7/sys/kern/kern_tc.c?r1=189891&r2=189890&pathrev=189891
> 
> you replaced x++ with no-ops if TC_COUNTER is defined? Aren't the
> timecounters actually needed somewhere?
> 
> > 8.0 is to fix this problem for IP and TCP layers, and ideally also ifnet
> > but we'll see.  We should be maintaining those stats per-CPU and then
> > aggregating to report them to userspace.  This is what we already do for
> > a number of system stats -- UMA and kernel malloc, syscall and trap
> > counters, etc.
> 
> How magic is this? Is it just a matter of declaring mystatarray[NCPU]
> and updating mystat[current_cpu] or (probably), the spacing between
> array elements should be magically fixed so two elements don't share a
> cache line?
> 
> >>> - Use cpuset to pin ithreads, the netisr, and whatever else, to
> >>>   specific cores so that they don't migrate, and if your system uses
> >>>   HTT, experiment with pinning the ithread and the netisr on different
> >>>   threads on the same core, or at least, different cores on the same
> >>>   die.
> >>
> >> I'm using em hardware; I still think there's a possibility I'm
> >> fighting the driver in some cases but this has priority #2.
> > 
> > Have you tried LOCK_PROFILING?  It would quickly tell you if driver
> > locks were a source of significant contention.  It works quite well...

I enabled lock profiling in my kernel and the system panics on 
lock_init for one of my drivers. Are you aware of any issues
that would be specific to lock profiling being enabled?

Barney


From owner-freebsd-net@FreeBSD.ORG  Mon Apr  6 17:24:12 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 712301065774
	for <freebsd-net@freebsd.org>; Mon,  6 Apr 2009 17:24:12 +0000 (UTC)
	(envelope-from bz@FreeBSD.org)
Received: from mail.cksoft.de (mail.cksoft.de [195.88.108.3])
	by mx1.freebsd.org (Postfix) with ESMTP id 27F968FC15
	for <freebsd-net@freebsd.org>; Mon,  6 Apr 2009 17:24:12 +0000 (UTC)
	(envelope-from bz@FreeBSD.org)
Received: from localhost (amavis.fra.cksoft.de [192.168.74.71])
	by mail.cksoft.de (Postfix) with ESMTP id C064A41C712;
	Mon,  6 Apr 2009 19:05:05 +0200 (CEST)
X-Virus-Scanned: amavisd-new at cksoft.de
Received: from mail.cksoft.de ([195.88.108.3])
	by localhost (amavis.fra.cksoft.de [192.168.74.71]) (amavisd-new,
	port 10024)
	with ESMTP id q1-xDRngDDxK; Mon,  6 Apr 2009 19:05:05 +0200 (CEST)
Received: by mail.cksoft.de (Postfix, from userid 66)
	id 6379A41C70A; Mon,  6 Apr 2009 19:05:05 +0200 (CEST)
Received: from maildrop.int.zabbadoz.net (maildrop.int.zabbadoz.net
	[10.111.66.10])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mail.int.zabbadoz.net (Postfix) with ESMTP id 42C604448E6;
	Mon,  6 Apr 2009 17:01:22 +0000 (UTC)
Date: Mon, 6 Apr 2009 17:01:22 +0000 (UTC)
From: "Bjoern A. Zeeb" <bz@FreeBSD.org>
X-X-Sender: bz@maildrop.int.zabbadoz.net
To: sthaug@nethelp.no
In-Reply-To: <20090406.121959.74751582.sthaug@nethelp.no>
Message-ID: <20090406165933.C15361@maildrop.int.zabbadoz.net>
References: <20090405.231044.74688369.sthaug@nethelp.no>
	<20090405214757.E15361@maildrop.int.zabbadoz.net>
	<20090405215842.C15361@maildrop.int.zabbadoz.net>
	<20090406.121959.74751582.sthaug@nethelp.no>
X-OpenPGP-Key: 0x14003F198FEFA3E77207EE8D2B58B8F83CCF1842
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org
Subject: Re: IPv6 window scaling factor always 1 on initial SYN
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Apr 2009 17:24:13 -0000

On Mon, 6 Apr 2009, sthaug@nethelp.no wrote:

>> Ok, both versions had:	< so->so_rcv.sb_hiwat)
>>
>> http://svn.freebsd.org/viewvc/base?view=revision&revision=166403
>>
>> changed it for IPv4 the first time,
>>
>> http://svn.freebsd.org/viewvc/base?view=revision&revision=172795
>>
>> changed it a second time for IPv4.
>>
>> Noone changed the IPv6 version.
>>
>> The syncache already seems to do it for both v4/v6 (common code).
>>
>> Can you try changing it to < sb_max) for IPv6 as well and see if
>> things work (better) for you?
>
> I changed it, and that worked like a dream. Now I get basically the
> same throughput with IPv4 and IPv6.

That sounds great! :-)
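
For anyone following along, the effect of the change can be sketched in a
few lines of userspace C.  This is only an approximation of the loop that
picks the window scale requested on the initial SYN, not the kernel code
itself: request_r_scale() and its "limit" parameter are names made up for
this example, and the 65536 and 262144 figures in the usage note are
assumed default values for the socket buffer high-water mark and sb_max.

```c
#include <stdint.h>

#define TCP_MAXWIN       65535  /* largest window without scaling (RFC 1323) */
#define TCP_MAX_WINSHIFT 14     /* largest window shift allowed (RFC 1323) */

/*
 * Hypothetical helper: return the smallest shift such that
 * (TCP_MAXWIN << shift) is no longer below "limit", capped at
 * TCP_MAX_WINSHIFT.  "limit" stands in for either so_rcv.sb_hiwat
 * (old IPv6 behaviour) or sb_max (behaviour after the change).
 */
static int
request_r_scale(uint64_t limit)
{
    int scale = 0;

    while (scale < TCP_MAX_WINSHIFT &&
        ((uint64_t)TCP_MAXWIN << scale) < limit)
        scale++;
    return (scale);
}
```

With a limit of 65536 this yields a shift of 1, which would match the
behaviour in the subject line; with a limit of 262144 it yields 3.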


>  There are of course still issues
> like lots of IPv6 tunnels that add extra latency - but that's not the
> fault of FreeBSD.

> Anyway, thanks for your work. Below is a context diff (against 7-STABLE
> cvsupped last night). Do we need a PR to get this into FreeBSD?

No, not even the context diff would have been needed;-)  I'll commit
it as soon as I find a few quiet minutes and a src tree;-)

/bz

-- 
Bjoern A. Zeeb                      The greatest risk is not taking one.

From owner-freebsd-net@FreeBSD.ORG  Mon Apr  6 18:52:17 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 755E81065690;
	Mon,  6 Apr 2009 18:52:17 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 3B90D8FC15;
	Mon,  6 Apr 2009 18:52:17 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id CDD9F46B9B;
	Mon,  6 Apr 2009 14:52:16 -0400 (EDT)
Date: Mon, 6 Apr 2009 19:52:16 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Ivan Voras <ivoras@freebsd.org>
In-Reply-To: <grcsus$9vh$1@ger.gmane.org>
Message-ID: <alpine.BSF.2.00.0904061934240.18619@fledge.watson.org>
References: <gra7mq$ei8$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>
	<grac1s$p56$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051440460.12639@fledge.watson.org>
	<grappq$tsg$1@ger.gmane.org>
	<alpine.BSF.2.00.0904052243250.34905@fledge.watson.org>
	<grbcfg$poe$1@ger.gmane.org>
	<alpine.BSF.2.00.0904061238250.34905@fledge.watson.org>
	<grcsus$9vh$1@ger.gmane.org>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Apr 2009 18:52:17 -0000

On Mon, 6 Apr 2009, Ivan Voras wrote:

>> I think we're talking slightly at cross purposes.  There are two
>> transfers of interest:
>>
>> (1) DMA of the packet data to main memory from the NIC
>> (2) Servicing of CPU cache misses to access data in main memory
>>
>> By the time you receive an interrupt, the DMA is complete, so once you
>
> OK, this was what was confusing me - for a moment I thought you meant it's 
> not so.

It's a polite lie that we will choose to believe for the purposes of 
simplification.  And probably true for all our drivers in practice right now.

>>     m = m_pullup(m, sizeof(*w));
>>     if (m == NULL)
>>         return;
>>     w = mtod(m, struct whatever *);
>>
>> m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are 
>> contiguously stored so that the cast of w to m's data will point at a
>
> So, m_pullup() can resize / realloc() the mbuf? (not that it matters for 
> this purpose)

Yes -- if it can't meet the contiguity requirements using the current mbuf 
chain, it may reallocate and return a new head to the chain (hence m being 
reassigned).  If that reallocation fails, it may return NULL.  Once you've 
called m_pullup(), existing pointers into the chain's data will be invalid, so 
if you've already called mtod() on it, you need to call it again.
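
A rough userspace illustration of that behaviour follows.  It is a toy
model, not the kernel's implementation: struct toy_mbuf, toy_freem() and
toy_pullup() are made-up names, and the 64-byte segment size is an
arbitrary assumption.  It shows why the caller must use the returned head
and re-derive any data pointers afterwards.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for struct mbuf: a chain of small data segments. */
struct toy_mbuf {
    struct toy_mbuf *next;
    size_t len;                 /* bytes of valid data in this segment */
    unsigned char data[64];     /* assumed segment size for the sketch */
};

/* Free a whole chain. */
static void
toy_freem(struct toy_mbuf *m)
{
    while (m != NULL) {
        struct toy_mbuf *next = m->next;
        free(m);
        m = next;
    }
}

/*
 * Toy analogue of m_pullup(): make the first "len" bytes contiguous.
 * Returns the (possibly new) head; on failure frees the chain and
 * returns NULL.
 */
static struct toy_mbuf *
toy_pullup(struct toy_mbuf *m, size_t len)
{
    struct toy_mbuf *n;

    if (m != NULL && m->len >= len)
        return (m);             /* already contiguous, head unchanged */
    if (len > sizeof(m->data) || (n = calloc(1, sizeof(*n))) == NULL) {
        toy_freem(m);
        return (NULL);
    }
    /* Move bytes from the front of the old chain into the new head. */
    while (m != NULL && n->len < len) {
        size_t take = len - n->len;

        if (take > m->len)
            take = m->len;
        memcpy(n->data + n->len, m->data, take);
        n->len += take;
        m->len -= take;
        memmove(m->data, m->data + take, m->len);
        if (m->len == 0) {      /* segment drained: drop it */
            struct toy_mbuf *next = m->next;
            free(m);
            m = next;
        }
    }
    n->next = m;
    if (n->len < len) {         /* chain was shorter than requested */
        toy_freem(n);
        return (NULL);
    }
    return (n);
}
```

Any pointer taken into the old head before the call may refer to freed
memory afterwards, which is the reason for calling mtod() again.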

>> - A TCP segment will need to be ACK'd, so if you're sending data in
>> chunks in
>>   one direction, the ACKs will not be piggy-backed on existing data
>> tranfers,
>>   and instead be sent independently, hitting the network stack two more
>> times.
>
> No combination of these can make an accounting difference between 1,000 and 
> 250,000 pps. I must be hitting something very bad here.

Yes, you definitely want to run tcpdump to see what's going on here.

>> - Remember that TCP works to expand its window, and then maintains the
>> highest
>>   performance it can by bumping up against the top of available bandwidth
>>   continuously.  This involves detecting buffer limits by generating
>> packets
>>   that can't be sent, adding to the packet count.  With loopback
>> traffic, the
>>   drop point occurs when you exceed the size of the netisr's queue for
>> IP, so
>>   you might try bumping that from the default to something much larger.
>
> My messages are approx. 100 +/- 10 bytes. No practical way they will even 
> span multiple mbufs. TCP_NODELAY is on.

Remember that TCP_NODELAY just disables Nagle, it doesn't disable delayed 
ACKs.
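
For reference, a minimal sketch of what setting that option does and does
not cover; disable_nagle() is a name made up for this example.  The
setsockopt() call only turns off Nagle on the sending side.  Delayed ACKs
are a separate receiver-side setting (on FreeBSD, as far as I know, the
net.inet.tcp.delayed_ack sysctl), which this call does not touch.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/*
 * Hypothetical helper: disable Nagle's algorithm on a TCP socket.
 * Returns 0 on success and -1 on error, as setsockopt() does.
 * This has no effect on the peer's delayed-ACK behaviour.
 */
static int
disable_nagle(int fd)
{
    int on = 1;

    return (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)));
}
```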

>> No.  x++ is massively slow if executed in parallel across many cores on a 
>> variable in a single cache line.  See my recent commit to kern_tc.c for an 
>> example: the updating of trivial statistics for the kernel time calls 
>> reduced 30m syscalls/second to 3m syscalls/second due to heavy contention 
>> on the cache line holding the statistic.  One of my goals for
>
> I don't get it: 
> http://svn.freebsd.org/viewvc/base/stable/7/sys/kern/kern_tc.c?r1=189891&r2=189890&pathrev=189891
>
> you replaced x++ with no-ops if TC_COUNTER is defined? Aren't the 
> timecounters actually needed somewhere?

These are statistics, not the time counters themselves.  Turning off the 
statistics led to an order-of-magnitude performance improvement by virtue of 
not thrashing cache lines.

>> 8.0 is to fix this problem for IP and TCP layers, and ideally also ifnet 
>> but we'll see.  We should be maintaining those stats per-CPU and then 
>> aggregating to report them to userspace.  This is what we already do for a 
>> number of system stats -- UMA and kernel malloc, syscall and trap counters, 
>> etc.
>
> How magic is this? Is it just a matter of declaring mystatarray[NCPU] and 
> updating mystat[current_cpu] or (probably), the spacing between array 
> elements should be magically fixed so two elements don't share a cache line?

The array needs to be appropriately spaced so that cache lines aren't 
potentially thrashed.  One way to do that is to tag elements with a cache-line 
sized __aligned attribute.  Another way is to stick them on the tail of our 
existing per-cpu structure, which is what we do for things like trap counts, 
using PCPU_INC().  Notice that this is very slightly lazy and subject to a 
very narrow race if the current thread decides to migrate, but that happens 
only very infrequently in practice.
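
The first option can be sketched in plain C11 as below.  This is a
userspace approximation under stated assumptions, not kernel code: the
64-byte CACHE_LINE_SIZE, the MAXCPU_SKETCH count and the mystat_*() names
are all invented for the example, and the caller passes in the CPU index
where the kernel would use the current CPU.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64  /* assumed; the real value is CPU-specific */
#define MAXCPU_SKETCH   8   /* hypothetical CPU count for this sketch */

/* One counter per CPU, each padded out to its own cache line. */
struct percpu_stat {
    _Alignas(CACHE_LINE_SIZE) uint64_t count;
};

static struct percpu_stat mystat[MAXCPU_SKETCH];

/* Hot path: touch only the slot (and cache line) of the given CPU. */
static void
mystat_inc(unsigned cpu)
{
    mystat[cpu % MAXCPU_SKETCH].count++;
}

/* Cold path: sum every slot when reporting the statistic. */
static uint64_t
mystat_total(void)
{
    uint64_t total = 0;

    for (size_t i = 0; i < MAXCPU_SKETCH; i++)
        total += mystat[i].count;
    return (total);
}
```

Because of the alignment, sizeof(struct percpu_stat) comes out as a full
cache line, so neighbouring array elements never share one.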

>>> I'm using em hardware; I still think there's a possibility I'm fighting 
>>> the driver in some cases but this has priority #2.
>>
>> Have you tried LOCK_PROFILING?  It would quickly tell you if driver locks 
>> were a source of significant contention.  It works quite well...
>
> I don't think I'm fighting against locking artifacts, it looks more like 
> some kind of overly smart hardware thing, like interrupt moderation (but not 
> exactly interrupt moderation since the number of IRQs/s remains approx. the 
> same).

Ideally what you'll do next is run tcpdump on a machine not acting as part of 
the test, and see what's happening on the wire.

>> if_em doesn't support it, but if_igb does.  If this saves you a minimum of 
>> one and possibly two cache misses per packet, it could be a huge 
>> performance improvement.
>
> If I had the funds to upgrade hardware, I wouldn't be so interested in 
> solving it in software :)

Sure, but what I'm saying is: some problems are inherent to the hardware 
design of what you're using.  We can work around them, but at the end of the 
day, some parts of the problem just require new hardware.  Let's see how far 
we can get without that.

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 00:06:06 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 4CA4210656D4
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 00:06:06 +0000 (UTC)
	(envelope-from wahjava@gmail.com)
Received: from mail-gx0-f176.google.com (mail-gx0-f176.google.com
	[209.85.217.176])
	by mx1.freebsd.org (Postfix) with ESMTP id D40A48FC13
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 00:06:05 +0000 (UTC)
	(envelope-from wahjava@gmail.com)
Received: by gxk24 with SMTP id 24so7094334gxk.19
	for <freebsd-net@freebsd.org>; Mon, 06 Apr 2009 17:06:05 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:received:sender:received
	:x-spam-checker-version:x-spam-level:x-spam-status:received:from:to
	:subject:organization:x-face:x-uptime:x-url:x-openpgp-id
	:x-openpgp-fingerprint:x-os:x-mailer:x-mail-morse:x-attribution:date
	:message-id:user-agent:face:mime-version:content-type;
	bh=sTV3OFl1/jXzyLpZeBm1WA8SxXQlV94oL7NYrNQQLeU=;
	b=x+2VwiQmv0rrBCmfmXQYkWt6/MdGcoouel3Q/A2AbVb/rkmVZQX/ExzareHLzwLYBS
	k79MnWnJ40mglLL0K8CzrTGHAPwJSNRWuw+Mhq02T9QDE2hLBy71eWTlqnPGo8pKGMb/
	I7gEXk8VSlLPSt2UQwHEVr0+k3DLvnSI9g4Zg=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=sender:x-spam-checker-version:x-spam-level:x-spam-status:from:to
	:subject:organization:x-face:x-uptime:x-url:x-openpgp-id
	:x-openpgp-fingerprint:x-os:x-mailer:x-mail-morse:x-attribution:date
	:message-id:user-agent:face:mime-version:content-type;
	b=nOCWxQlLRNyaY55hJzL0vA7LrRYs8OaDPG8/mYFNvqaTK7VHXLCAUAyeD5nmbTQ8Qh
	gj4L0H0w+ACcc5xl/ADOEbhe6rqh23q+JJF6z7aei1cpsxuPMb0r89f04CEHa92y2LEG
	FVje16Z50BMl+YJu5MR5a4XzlAp6sCUyId6d0=
Received: by 10.90.86.9 with SMTP id j9mr2531991agb.113.1239061067541;
	Mon, 06 Apr 2009 16:37:47 -0700 (PDT)
Received: from chateau.d.lf ([122.161.221.68])
	by mx.google.com with ESMTPS id 36sm7180798aga.13.2009.04.06.16.37.26
	(version=TLSv1/SSLv3 cipher=RC4-MD5);
	Mon, 06 Apr 2009 16:37:28 -0700 (PDT)
Sender: Ashish SHUKLA <wahjava@gmail.com>
Received: by chateau.d.lf (Postfix, from userid 99)
	id D552EB635D; Tue,  7 Apr 2009 05:08:04 +0530 (IST)
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chateau.d.lf
X-Spam-Level: 
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00,NO_RELAYS
	autolearn=ham version=3.2.5
Received: from chateau.d.lf (chateau.d.lf [IPv6:::1])
	by chateau.d.lf (Postfix) with ESMTP id B0F2AB6359
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 05:08:01 +0530 (IST)
From: wahjava.ml@gmail.com (Ashish SHUKLA)
To: freebsd-net@freebsd.org
Organization: alt.religion.emacs
X-Face: )vGQ9yK7Y$Flebu1C>(B\gYBm)[$zfKM+p&TT[[JWl6:]S>cc$%-z7-`46Zf0B*syL.C]oCq[upTG~zuS0.$"_%)|Q@$hA=9{3l{%u^h3jJ^Zl;
	t7
X-Uptime: 04:37:37 up 13:51,  4 users,  load average: 0.33, 0.25, 0.13
X-URL: http://wahjava.wordpress.com/
X-OpenPGP-ID: 762E5E74
X-OpenPGP-Fingerprint: 1E00 4679 77E4 F8EE 2E4B 56F2 1F2F 8410 762E 5E74
X-OS: GNU/Linux on Linux 2.6.28-ARCH kernel on x86_64 architecture
X-Mailer: Gnus v5.13
X-Mail-Morse: .-- .- .... .--- .- ...- .- .--.-. --. -- .- .. .-.. .-.-.- -.-.
	--- --
X-Attribution: =?utf-8?B?4KSG4KS24KWA4KS3?=
Date: Tue, 07 Apr 2009 05:07:57 +0530
Message-ID: <87y6ud5p62.fsf@chateau.d.lf>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.92 (x86_64-unknown-linux-gnu)
Face: iVBORw0KGgoAAAANSUhEUgAAADAAAAAwBAMAAAClLOS0AAAAJ1BMVEWpqal/f39tbW1jY2Md
	HR2goKCenp6UlJROTk7////9/f35+fnT09ORJdieAAACVklEQVQ4jXXUP2vbQBQA8AvUTkgz5OzY
	Z0iGWhpS6BSrkECn0mvx0MEJ6AjtYrfoBCVDlD8naJYmNlRfwZq8+mkKlIZaGpJSYmP7Q/XkJDrJ
	Td8i/H68u3vHPaPufwLdf32AMA4A6GcAgvAamY1pOJiDIFqicTwLswDhfr3uxfFtkAY/GFHPMwzD
	8zpnACmIOnE6js7rQb+v4NJrG9od0C+QgpHMy5jBewV+UDSMWiw1Y4fWfyV7+NGFzDsYa3pth9LJ
	Q4XvXxFHcJRvHOmygn5NAEabnDcQQguarnfoiwSCJ99jmKKcphsZONmWsDK9Ro7cvZOCtQdg8nje
	egLhc2LNlkLmsezzTFUUy5w18ocox/f0LaLgJy0zO75zk+9pp85GAj36xjqhdI0y3tq2m4dqqcWX
	zQWBTz8L1irvolXV4J+3q7eCDgVnttjNq6X8H+9KOZsuNk1uCzx8pSp+E9HImfJOTLdcGqo+YKnG
	EIovizkEn48V7BO+ch2DXcD4ENSpWiU+q8hjjbgTBZCXnZtyj0Ws4Q1Q0B2WXFtYZo65Bbyeeldw
	RS6qFueM80LlLA29YlVwGRYvFD+kwI/0O+A2PlpOP9GwslUVciHuYGechuBTp922YiDZCrghTknm
	XSyOM+D3aoRZlo0Jb42zY7DN4p2x4AeZ+QAYutx1sHwTHzMT5cMNduQ9yW3GczN4KZ86kb0c9O8T
	yXDeFqpl2fryPEAYGXIlezAPXYh2NgVr/gvdoHIuDwuPwOhcWE8f8mmICq41eATkn8x0kuRTIKcB
	wE9+/QUtiiAnYcaN7wAAAABJRU5ErkJggg==
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-=";
	micalg=pgp-sha1; protocol="application/pgp-signature"
Subject: getaddrinfo() unable to resolve IPv6 addresses
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 00:06:06 -0000

--=-=-=
Content-Transfer-Encoding: quoted-printable

Hi everyone,

I'm running FreeBSD 8.0-CURRENT and am having problems with libc's
getaddrinfo() function. It seems unable to resolve addresses for the
SOCK_RAW socket type and the ICMPv6 protocol.

#v+
abbe [~] monte-cristo% uname -a
FreeBSD monte-cristo.france 8.0-CURRENT FreeBSD 8.0-CURRENT #4: Thu Mar 26 03:18:32 IST 2009     root@monte-cristo.france:/usr/obj/usr/src/sys/GENERIC  amd64
abbe [~] monte-cristo% ping6 -n ipv6.google.com
ping6: Invalid value for hints
abbe [~] monte-cristo% telnet ipv6.google.com 80
Trying 2001:4860:c003::68...
Connected to ipv6.l.google.com.
Escape character is '^]'.
#v-

Should I file a PR ?
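
For anyone wanting to reproduce this outside ping6, here is a minimal
sketch of the kind of lookup that seems to fail.  It assumes ping6 asks
getaddrinfo() for AF_INET6 / SOCK_RAW / IPPROTO_ICMPV6 hints;
resolve_icmp6() is a made-up helper name, not something from libc or
ping6.  On an affected libc it would presumably print an error like the
"Invalid value for hints" shown above.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/*
 * Resolve `host` the way an ICMPv6 tool might: IPv6 only, raw socket,
 * ICMPv6 protocol.  Returns the getaddrinfo() result code (0 on success).
 * The hint values are an assumption about what ping6 requests.
 */
static int
resolve_icmp6(const char *host, int flags)
{
	struct addrinfo hints, *res = NULL;
	int error;

	memset(&hints, 0, sizeof(hints));
	hints.ai_flags = flags;
	hints.ai_family = AF_INET6;
	hints.ai_socktype = SOCK_RAW;
	hints.ai_protocol = IPPROTO_ICMPV6;

	/* No service name: a raw socket has no port to look up. */
	error = getaddrinfo(host, NULL, &hints, &res);
	if (error != 0) {
		fprintf(stderr, "getaddrinfo(%s): %s\n", host,
		    gai_strerror(error));
		return (error);
	}
	freeaddrinfo(res);
	return (0);
}
```

Calling resolve_icmp6("ipv6.google.com", 0) from a small main() should
return 0 on a libc without the problem; passing AI_NUMERICHOST with a
literal such as "::1" takes DNS out of the picture.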

TiA
-- 
Ashish SHUKLA

--=-=-=
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.11 (GNU/Linux)

iEYEARECAAYFAknaklkACgkQHy+EEHYuXnSu1ACg2MfrwqAb/w6M0VrBqIyyE8JP
qHwAn1XvvdEOp+MGovWfXFJc4hRwlLqu
=lWCD
-----END PGP SIGNATURE-----
--=-=-=--

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 02:48:42 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B9C421065840
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 02:48:42 +0000 (UTC)
	(envelope-from ume@mahoroba.org)
Received: from asuka.mahoroba.org (unknown [IPv6:2001:2f0:104:8010::1])
	by mx1.freebsd.org (Postfix) with ESMTP id 7243E8FC13
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 02:48:42 +0000 (UTC)
	(envelope-from ume@mahoroba.org)
Received: from ameno.mahoroba.org
	(IDENT:MAcVWWSsCq+jNgyMzEhX/rHMZDkharVcRZn2EgHiFH+a/sPBlMoixdzpkserym1b@ameno.mahoroba.org
	[IPv6:2001:2f0:104:8010:20a:79ff:fe69:ee6b])
	(user=ume mech=CRAM-MD5 bits=0)
	by asuka.mahoroba.org (8.14.3/8.14.3) with ESMTP/inet6 id
	n372mQIn044920
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Tue, 7 Apr 2009 11:48:26 +0900 (JST) (envelope-from ume@mahoroba.org)
Date: Tue, 07 Apr 2009 11:48:26 +0900
Message-ID: <ygeeiw5uqkl.wl%ume@mahoroba.org>
From: Hajimu UMEMOTO <ume@freebsd.org>
To: wahjava.ml@gmail.com (Ashish SHUKLA)
In-Reply-To: <87y6ud5p62.fsf@chateau.d.lf>
References: <87y6ud5p62.fsf@chateau.d.lf>
User-Agent: xcite1.58> Wanderlust/2.14.0 (Africa) SEMI/1.14.6 (Maruoka)
	FLIM/1.14.8 (=?ISO-8859-4?Q?Shij=F2?=) APEL/10.7 Emacs/22.3
	(i386-portbld-freebsd7.1) MULE/5.0 (SAKAKI)
X-Operating-System: FreeBSD 7.1-RELEASE-p2
X-PGP-Key: http://www.imasy.or.jp/~ume/publickey.asc
X-PGP-Fingerprint: 1F00 0B9E 2164 70FC 6DC5  BF5F 04E9 F086 BF90 71FE
Organization: Internet Mutual Aid Society, YOKOHAMA
MIME-Version: 1.0 (generated by SEMI 1.14.6 - "Maruoka")
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1
	(asuka.mahoroba.org [IPv6:2001:2f0:104:8010::1]);
	Tue, 07 Apr 2009 11:48:26 +0900 (JST)
X-Virus-Scanned: by amavisd-new
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham
	version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on asuka.mahoroba.org
Cc: freebsd-net@freebsd.org
Subject: Re: getaddrinfo() unable to resolve IPv6 addresses
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 02:48:43 -0000

Hi,

>>>>> On Tue, 07 Apr 2009 05:07:57 +0530
>>>>> Ashish SHUKLA <wahjava.ml@gmail.com> said:

आशीष> I'm running FreeBSD 8.0-CURRENT and is having problems with the libc's
आशीष> getaddrinfo() function. It seems it is not able to resolve addresses for
आशीष> SOCK_RAW socket type and ICMPv6 protocol.

आशीष> #v+
आशीष> abbe [~] monte-cristo% uname -a
आशीष> FreeBSD monte-cristo.france 8.0-CURRENT FreeBSD 8.0-CURRENT #4: Thu Mar 26 03:18:32 IST 2009     root@monte-cristo.france:/usr/obj/usr/src/sys/GENERIC  amd64
आशीष> abbe [~] monte-cristo% ping6 -n ipv6.google.com
आशीष> ping6: Invalid value for hints
आशीष> abbe [~] monte-cristo% telnet ipv6.google.com 80
आशीष> Trying 2001:4860:c003::68...
आशीष> Connected to ipv6.l.google.com.
आशीष> Escape character is '^]'.
आशीष> #v-

आशीष> Should I file a PR ?

No, I believe it has already been fixed.  Please re-cvsup and try again.

Sincerely,

--
Hajimu UMEMOTO @ Internet Mutual Aid Society Yokohama, Japan
ume@mahoroba.org  ume@{,jp.}FreeBSD.org
http://www.imasy.org/~ume/

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 05:09:39 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 10907106570E;
	Tue,  7 Apr 2009 05:09:38 +0000 (UTC)
	(envelope-from sepherosa@gmail.com)
Received: from yx-out-2324.google.com (yx-out-2324.google.com [74.125.44.29])
	by mx1.freebsd.org (Postfix) with ESMTP id 7B10B8FC14;
	Tue,  7 Apr 2009 05:09:38 +0000 (UTC)
	(envelope-from sepherosa@gmail.com)
Received: by yx-out-2324.google.com with SMTP id 8so1575935yxm.13
	for <multiple recipients>; Mon, 06 Apr 2009 22:09:37 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:in-reply-to:references
	:date:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	bh=A4kqYJ2JkKb+2Pq4o4vVzOREP9MtdzibFkqGr/2JBlA=;
	b=dHX3oWVMhi1s1z3OKd0KRNMY3fQij8ku6ARPsupmc/ym0OEHWJS4NPoVNExLz9QxJm
	nGEI1Yv+feIXmW21nR7H1lZ56wpKQdM/ETdmg1USxg83NIJGEWEq/IF6zZvmf5DFMpj6
	0m6mHAEhO/SJ61S+3VPdEIlN1vH9qIjjJ7glA=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type:content-transfer-encoding;
	b=SIKzhFZ0O8zD2VASu3xmmy4VFUvACCjkUIpK4IW0zmoVTsV1Jf6IsIJN72pRP9hWaF
	IGnXzo0h3T9nj4r9Cn7lzSM2KQIPbJvkICHFJneE5xPsWHZIFrafflivFpSPHyAFzYQQ
	xUuQS3n/rNS50qJssn5cIR6h7R52Ne6+qe9f4=
MIME-Version: 1.0
Received: by 10.151.103.11 with SMTP id f11mr9723503ybm.235.1239080977933; 
	Mon, 06 Apr 2009 22:09:37 -0700 (PDT)
In-Reply-To: <alpine.BSF.2.00.0904061238250.34905@fledge.watson.org>
References: <gra7mq$ei8$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>
	<grac1s$p56$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051440460.12639@fledge.watson.org>
	<grappq$tsg$1@ger.gmane.org>
	<alpine.BSF.2.00.0904052243250.34905@fledge.watson.org>
	<grbcfg$poe$1@ger.gmane.org>
	<alpine.BSF.2.00.0904061238250.34905@fledge.watson.org>
Date: Tue, 7 Apr 2009 13:09:37 +0800
Message-ID: <ea7b9c170904062209tda44636tb9a18755ec0c5bb3@mail.gmail.com>
From: Sepherosa Ziehau <sepherosa@gmail.com>
To: Robert Watson <rwatson@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 05:09:39 -0000

On Mon, Apr 6, 2009 at 7:59 PM, Robert Watson <rwatson@freebsd.org> wrote:
>
> m_pullup() has to do with mbuf chain memory contiguity during packet
> processing.  The usual usage is something along the following lines:
>
>        struct whatever *w;
>
>        m = m_pullup(m, sizeof(*w));
>        if (m == NULL)
>                return;
>        w = mtod(m, struct whatever *);
>
> m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are
> contiguously stored so that the cast of w to m's data will point at a
> complete structure we can use to interpret packet data.  In the common case
> in the receipt path, m_pullup() should be a no-op, since almost all drivers
> receive data in a single cluster.
>
> However, there are cases where it might not happen, such as loopback traffic
> where unusual encapsulation is used, leading to a call to M_PREPEND() that
> inserts a new mbuf on the front of the chain, which is later m_defrag()'d
> leading to a higher level header crossing a boundary or the like.
>
> This issue is almost entirely independent from things like the cache line
> miss issue, unless you hit the uncommon case of having to do work in
> m_pullup(), in which case life sucks.
>
> It would be useful to use DTrace to profile a number of the workfull m_foo()
> functions to make sure we're not hitting them in normal workloads, btw.

I highly doubt m_pullup will have any real effect on the RX path, given
how most drivers allocate the mbufs for the RX ring (all RX mbufs should
be mclusters).

Best Regards,
sephe

-- 
Live Free or Die

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 05:21:37 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 186EA10657F7;
	Tue,  7 Apr 2009 05:21:37 +0000 (UTC)
	(envelope-from sepherosa@gmail.com)
Received: from yx-out-2324.google.com (yx-out-2324.google.com [74.125.44.30])
	by mx1.freebsd.org (Postfix) with ESMTP id B32F18FC19;
	Tue,  7 Apr 2009 05:21:36 +0000 (UTC)
	(envelope-from sepherosa@gmail.com)
Received: by yx-out-2324.google.com with SMTP id 8so1577479yxm.13
	for <multiple recipients>; Mon, 06 Apr 2009 22:21:36 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:in-reply-to:references
	:date:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	bh=wNuwinYxBCvF8vYa0V0jL+P0uXVMNgUixSJCu7C1eL8=;
	b=LLwMiV196OGQr8kLV3CyczHi+eQp5RaNgYiLHN1opHX9MYQwoPIQJAWWYuO0wQ7zFH
	1CGa52u1NQyvq+020Bcx5Azhif18v7Okcmec3u2/tPLM7cwFvYooSX3EqNw8ZRqF7j6H
	1y2bcQwXJMBKKSgnybH0IddEQWS4jPwSPWADQ=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type:content-transfer-encoding;
	b=LNZ8hJVwnO/8/OloW/VpJi06Sb1rKEm5qzDeO2RyF2dLh8/Lz/5T4kFfrmqwFmCXDR
	IwyU1lftic+NrK5PS+NmfjGWI5so6pIzcL/0MmY+Ao0Dy58apnZ73yq1qwNxjjgXeNWA
	Dej6pRJdtRIBj4Si3iVCX8il2TogsyxnNXq20=
MIME-Version: 1.0
Received: by 10.151.108.3 with SMTP id k3mr9806426ybm.103.1239080268891; Mon, 
	06 Apr 2009 21:57:48 -0700 (PDT)
In-Reply-To: <grac1s$p56$1@ger.gmane.org>
References: <gra7mq$ei8$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>
	<grac1s$p56$1@ger.gmane.org>
Date: Tue, 7 Apr 2009 12:57:48 +0800
Message-ID: <ea7b9c170904062157u1c457f27md565f9a95a51a705@mail.gmail.com>
From: Sepherosa Ziehau <sepherosa@gmail.com>
To: Ivan Voras <ivoras@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org
Subject: Re: Advice on a multithreaded netisr patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 05:21:37 -0000

On Sun, Apr 5, 2009 at 9:34 PM, Ivan Voras <ivoras@freebsd.org> wrote:
> Robert Watson wrote:
>>
>> On Sun, 5 Apr 2009, Ivan Voras wrote:
>>
>>> I thought this has something to deal with NIC moderation (em) but
>>> can't really explain it. The bad performance part (not the jump) is
>>> also visible over the loopback interface.
>>
>> FYI, if you want high performance, you really want a card supporting
>> multiple input queues -- igb, cxgb, mxge, etc.  if_em-only cards are

PCI-E em(4) supports 2 RX queues.  82571/82572 also support 2 TX queues.
I have not tested multiple TX queues, but em(4) with multiple RX queues
works well in DragonFly (tested with 82573 and 82571).

>> fundamentally less scalable in an SMP environment because they require
>> input or output to occur only from one CPU at a time.
>
> Makes sense, but on the other hand - I see people are routing at least
> 250,000 packets per seconds per direction with these cards, so they
> probably aren't the bottleneck (pro/1000 pt on pci-e).

That should be some variant of the 82571EB.

Best Regards,
sephe

-- 
Live Free or Die

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 06:35:21 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id EF868106570C
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 06:35:21 +0000 (UTC)
	(envelope-from julian@elischer.org)
Received: from outY.internet-mail-service.net (outy.internet-mail-service.net
	[216.240.47.248])
	by mx1.freebsd.org (Postfix) with ESMTP id CC40F8FC13
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 06:35:21 +0000 (UTC)
	(envelope-from julian@elischer.org)
Received: from idiom.com (mx0.idiom.com [216.240.32.160])
	by out.internet-mail-service.net (Postfix) with ESMTP id AAB9BB98A2;
	Mon,  6 Apr 2009 23:35:22 -0700 (PDT)
X-Client-Authorized: MaGic Cook1e
Received: from julian-mac.elischer.org (home.elischer.org [216.240.48.38])
	by idiom.com (Postfix) with ESMTP id 8F5482D6097;
	Mon,  6 Apr 2009 23:35:17 -0700 (PDT)
Message-ID: <49DAF447.5020407@elischer.org>
Date: Mon, 06 Apr 2009 23:35:51 -0700
From: Julian Elischer <julian@elischer.org>
User-Agent: Thunderbird 2.0.0.21 (Macintosh/20090302)
MIME-Version: 1.0
To: Sepherosa Ziehau <sepherosa@gmail.com>
References: <gra7mq$ei8$1@ger.gmane.org>	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>	<grac1s$p56$1@ger.gmane.org>	<alpine.BSF.2.00.0904051440460.12639@fledge.watson.org>	<grappq$tsg$1@ger.gmane.org>	<alpine.BSF.2.00.0904052243250.34905@fledge.watson.org>	<grbcfg$poe$1@ger.gmane.org>	<alpine.BSF.2.00.0904061238250.34905@fledge.watson.org>
	<ea7b9c170904062209tda44636tb9a18755ec0c5bb3@mail.gmail.com>
In-Reply-To: <ea7b9c170904062209tda44636tb9a18755ec0c5bb3@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, Robert Watson <rwatson@freebsd.org>,
	Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 06:35:22 -0000

Sepherosa Ziehau wrote:
> On Mon, Apr 6, 2009 at 7:59 PM, Robert Watson <rwatson@freebsd.org> wrote:
>> m_pullup() has to do with mbuf chain memory contiguity during packet
>> processing.  The usual usage is something along the following lines:
>>
>>        struct whatever *w;
>>
>>        m = m_pullup(m, sizeof(*w));
>>        if (m == NULL)
>>                return;
>>        w = mtod(m, struct whatever *);

While this is true, m_pullup() ALWAYS does some work, so in fact you
want to always wrap it in a test to see if it is really needed.

from memory it is something like:

  if (m->m_len < headerlen && (m = m_pullup(m, headerlen)) == NULL) {
        log(LOG_WARNING,
            "nglmi: m_pullup failed for %d bytes\n", headerlen);
        return (0);
  }
  header = mtod(m, struct header *);
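
To make the effect of the m_len test concrete, here is a user-space toy,
not kernel code.  struct toy_mbuf, toy_alloc(), toy_pullup() and
toy_get_header() are all invented for illustration and only loosely
imitate the real mbuf API.  The counter shows that the guarded form never
enters the pullup routine when the header is already contiguous in the
first buffer.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a kernel mbuf: a chain of small data buffers. */
struct toy_mbuf {
	struct toy_mbuf	*m_next;
	size_t		 m_len;		/* valid bytes in m_data */
	unsigned char	 m_data[64];
};

static unsigned long toy_pullup_calls;	/* how often pullup was entered */

/* Allocate one buffer holding `len` bytes of `data`, linked to `next`. */
static struct toy_mbuf *
toy_alloc(const void *data, size_t len, struct toy_mbuf *next)
{
	struct toy_mbuf *m = calloc(1, sizeof(*m));

	if (m == NULL || len > sizeof(m->m_data)) {
		free(m);
		return (NULL);
	}
	memcpy(m->m_data, data, len);
	m->m_len = len;
	m->m_next = next;
	return (m);
}

/*
 * Make the first `len` bytes of the chain contiguous in the head buffer
 * by copying them forward from the following buffers.  On failure the
 * whole chain is freed and NULL is returned (assumed to resemble how the
 * real m_pullup() behaves on failure).
 */
static struct toy_mbuf *
toy_pullup(struct toy_mbuf *m, size_t len)
{
	struct toy_mbuf *n;
	size_t want;

	toy_pullup_calls++;
	if (len > sizeof(m->m_data))
		goto bad;
	while (m->m_len < len) {
		n = m->m_next;
		if (n == NULL)
			goto bad;		/* chain is too short */
		want = len - m->m_len;
		if (want > n->m_len)
			want = n->m_len;
		memcpy(m->m_data + m->m_len, n->m_data, want);
		m->m_len += want;
		memmove(n->m_data, n->m_data + want, n->m_len - want);
		n->m_len -= want;
		if (n->m_len == 0) {		/* drop drained buffer */
			m->m_next = n->m_next;
			free(n);
		}
	}
	return (m);
bad:
	while (m != NULL) {
		n = m->m_next;
		free(m);
		m = n;
	}
	return (NULL);
}

/* The guarded form: only call pullup when the head buffer is too short. */
static struct toy_mbuf *
toy_get_header(struct toy_mbuf *m, size_t headerlen)
{
	if (m->m_len < headerlen && (m = toy_pullup(m, headerlen)) == NULL)
		return (NULL);
	return (m);
}
```

With a two-buffer chain "ABCD" -> "EFGH", asking for a 4-byte header
leaves toy_pullup_calls at 0, while asking for 6 bytes bumps it to 1 and
leaves "ABCDEF" in the head buffer and "GH" in the second one.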


>>
>> m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are
>> contiguously stored so that the cast of w to m's data will point at a
>> complete structure we can use to interpret packet data.  In the common case
>> in the receipt path, m_pullup() should be a no-op, since almost all drivers
>> receive data in a single cluster.
>>
>> However, there are cases where it might not happen, such as loopback traffic
>> where unusual encapsulation is used, leading to a call to M_PREPEND() that
>> inserts a new mbuf on the front of the chain, which is later m_defrag()'d
>> leading to a higher level header crossing a boundary or the like.
>>
>> This issue is almost entirely independent from things like the cache line
>> miss issue, unless you hit the uncommon case of having to do work in
>> m_pullup(), in which case life sucks.
>>
>> It would be useful to use DTrace to profile a number of the workfull m_foo()
>> functions to make sure we're not hitting them in normal workloads, btw.
> 
> I highly suspect m_pullup will take any real effect on RX path, given
> how most of drivers allocate the mbuf for RX ring (all RX mbufs should
> be mclusters).
> 
> Best Regards,
> sephe
> 


From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 07:00:52 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 55622106581F;
	Tue,  7 Apr 2009 07:00:52 +0000 (UTC)
	(envelope-from sepherosa@gmail.com)
Received: from mail-gx0-f176.google.com (mail-gx0-f176.google.com
	[209.85.217.176])
	by mx1.freebsd.org (Postfix) with ESMTP id CAF858FC13;
	Tue,  7 Apr 2009 07:00:51 +0000 (UTC)
	(envelope-from sepherosa@gmail.com)
Received: by gxk24 with SMTP id 24so7446213gxk.19
	for <multiple recipients>; Tue, 07 Apr 2009 00:00:51 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:in-reply-to:references
	:date:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	bh=Uq6e0IgI82zDQM/iCh+suWU892M7g8LPmbJ+oLjqCMM=;
	b=OxCcCCH/AtenbNIfpCEFft2a16ssCCr7klcrMKrhFNc7ZCnCFeDe/VxZH4l1MNdJR/
	lQwpNDB+1gKmZV7m68XCQB3oDabsax5i3YEqEqq5gq55wYzACgpJXP2cqhRfnaxZKPpj
	5XNHWQAjuEGPZCgjuZaR3Ys8BPccp7vYboFVw=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type:content-transfer-encoding;
	b=QZ+v05HtZO0erYdrD6Y08tP0YdVxEFlAHnZThfUjxDPb9974qaVZ9DsFr52IIhyv51
	uZ6EIvWeyRPpA1N24D7c+Ixrd96xthuX2OhoqPk3i5nYtBjYGwD87Sf0TN/HzrmxupMx
	PvA969iHYXM9Dut4Q3FeWGDJeY7HVqE40yZt8=
MIME-Version: 1.0
Received: by 10.150.136.12 with SMTP id j12mr8598338ybd.149.1239087651212; 
	Tue, 07 Apr 2009 00:00:51 -0700 (PDT)
In-Reply-To: <49DAF447.5020407@elischer.org>
References: <gra7mq$ei8$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>
	<grac1s$p56$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051440460.12639@fledge.watson.org>
	<grappq$tsg$1@ger.gmane.org>
	<alpine.BSF.2.00.0904052243250.34905@fledge.watson.org>
	<grbcfg$poe$1@ger.gmane.org>
	<alpine.BSF.2.00.0904061238250.34905@fledge.watson.org>
	<ea7b9c170904062209tda44636tb9a18755ec0c5bb3@mail.gmail.com>
	<49DAF447.5020407@elischer.org>
Date: Tue, 7 Apr 2009 15:00:51 +0800
Message-ID: <ea7b9c170904070000xd3033fejfc5b249e800dff8b@mail.gmail.com>
From: Sepherosa Ziehau <sepherosa@gmail.com>
To: Julian Elischer <julian@elischer.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, Robert Watson <rwatson@freebsd.org>,
	Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 07:00:52 -0000

On Tue, Apr 7, 2009 at 2:35 PM, Julian Elischer <julian@elischer.org> wrote:
> Sepherosa Ziehau wrote:
>>
>> On Mon, Apr 6, 2009 at 7:59 PM, Robert Watson <rwatson@freebsd.org> wrote:
>>>
>>> m_pullup() has to do with mbuf chain memory contiguity during packet
>>> processing.  The usual usage is something along the following lines:
>>>
>>>       struct whatever *w;
>>>
>>>       m = m_pullup(m, sizeof(*w));
>>>       if (m == NULL)
>>>               return;
>>>       w = mtod(m, struct whatever *);
>
> while this is true, m_pullup ALWAYS does things so in fact you
> want to always put it in a test to see if it is really needed..

This probably will not be much of a problem on the RX path: drivers always
have to set m->m_len, so m->m_len is probably still in cache.

>
> from memory it is something like:
>
>  if (m->m_len < headerlen && (m = m_pullup(m, headerlen)) == NULL) {
>       log(LOG_WARNING,
>          "nglmi: m_pullup failed for %d bytes\n", headerlen);
>             return (0);
>  }
>  header = mtod(m, struct header *);
>
>
>>>
>>> m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are
>>> contiguously stored so that the cast of w to m's data will point at a
>>> complete structure we can use to interpret packet data.  In the common
>>> case
>>> in the receipt path, m_pullup() should be a no-op, since almost all
>>> drivers
>>> receive data in a single cluster.
>>>
>>> However, there are cases where it might not happen, such as loopback
>>> traffic
>>> where unusual encapsulation is used, leading to a call to M_PREPEND()
>>> that
>>> inserts a new mbuf on the front of the chain, which is later m_defrag()'d
>>> leading to a higher level header crossing a boundary or the like.
>>>
>>> This issue is almost entirely independent from things like the cache line
>>> miss issue, unless you hit the uncommon case of having to do work in
>>> m_pullup(), in which case life sucks.
>>>
>>> It would be useful to use DTrace to profile a number of the workfull
>>> m_foo()
>>> functions to make sure we're not hitting them in normal workloads, btw.
>>
>> I highly suspect m_pullup will take any real effect on RX path, given
>> how most of drivers allocate the mbuf for RX ring (all RX mbufs should
>> be mclusters).
>>
>> Best Regards,
>> sephe
>>
>


-- 
Live Free or Die

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 09:24:45 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8ED5E1065672;
	Tue,  7 Apr 2009 09:24:45 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 64BA38FC0C;
	Tue,  7 Apr 2009 09:24:45 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id E6CBE46B9D;
	Tue,  7 Apr 2009 05:24:44 -0400 (EDT)
Date: Tue, 7 Apr 2009 10:24:44 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Sepherosa Ziehau <sepherosa@gmail.com>
In-Reply-To: <ea7b9c170904062209tda44636tb9a18755ec0c5bb3@mail.gmail.com>
Message-ID: <alpine.BSF.2.00.0904071024070.45341@fledge.watson.org>
References: <gra7mq$ei8$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>
	<grac1s$p56$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051440460.12639@fledge.watson.org>
	<grappq$tsg$1@ger.gmane.org>
	<alpine.BSF.2.00.0904052243250.34905@fledge.watson.org>
	<grbcfg$poe$1@ger.gmane.org>
	<alpine.BSF.2.00.0904061238250.34905@fledge.watson.org>
	<ea7b9c170904062209tda44636tb9a18755ec0c5bb3@mail.gmail.com>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 09:24:45 -0000

On Tue, 7 Apr 2009, Sepherosa Ziehau wrote:

>> This issue is almost entirely independent from things like the cache line 
>> miss issue, unless you hit the uncommon case of having to do work in 
>> m_pullup(), in which case life sucks.
>>
>> It would be useful to use DTrace to profile a number of the workfull 
>> m_foo() functions to make sure we're not hitting them in normal workloads, 
>> btw.
>
> I highly suspect m_pullup will take any real effect on RX path, given how 
> most of drivers allocate the mbuf for RX ring (all RX mbufs should be 
> mclusters).

Agreed, but it's good to be sure one is right about these things. :-)

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 09:26:32 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 38F64106568C;
	Tue,  7 Apr 2009 09:26:32 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 0C6CD8FC08;
	Tue,  7 Apr 2009 09:26:32 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id ADFAE46B91;
	Tue,  7 Apr 2009 05:26:31 -0400 (EDT)
Date: Tue, 7 Apr 2009 10:26:31 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Julian Elischer <julian@elischer.org>
In-Reply-To: <49DAF447.5020407@elischer.org>
Message-ID: <alpine.BSF.2.00.0904071025450.45341@fledge.watson.org>
References: <gra7mq$ei8$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>
	<grac1s$p56$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051440460.12639@fledge.watson.org>
	<grappq$tsg$1@ger.gmane.org>
	<alpine.BSF.2.00.0904052243250.34905@fledge.watson.org>
	<grbcfg$poe$1@ger.gmane.org>
	<alpine.BSF.2.00.0904061238250.34905@fledge.watson.org>
	<ea7b9c170904062209tda44636tb9a18755ec0c5bb3@mail.gmail.com>
	<49DAF447.5020407@elischer.org>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Sepherosa Ziehau <sepherosa@gmail.com>, freebsd-net@freebsd.org,
	Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 09:26:32 -0000


On Mon, 6 Apr 2009, Julian Elischer wrote:

> While this is true, m_pullup ALWAYS does some work, so in fact you always 
> want to wrap it in a test to see whether it is really needed.

Then m_pullup() should be fixed?  Keeping the expression of the pullup short 
makes the network code a lot more compact, which is a significant benefit.
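
For anyone following along, here is a rough userland sketch of the guard 
being discussed: check the head buffer's length first and only call the 
pullup routine when the header is not already contiguous.  toy_mbuf, 
toy_pullup(), toy_ensure() and slow_path_calls are hypothetical names made up 
for illustration; this is not kernel code and makes no claim about how the 
real m_pullup() behaves internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an mbuf; NOT the kernel's struct mbuf. */
struct toy_mbuf {
	struct toy_mbuf	*m_next;	/* next buffer in the chain */
	size_t		 m_len;		/* valid bytes in m_data */
	unsigned char	 m_data[64];
};

static int slow_path_calls;	/* counts entries into toy_pullup() */

static void
toy_freem(struct toy_mbuf *m)
{
	struct toy_mbuf *n;

	while (m != NULL) {
		n = m->m_next;
		free(m);
		m = n;
	}
}

/*
 * Make the first len bytes of the chain contiguous in the head buffer.
 * On failure the chain is freed and NULL is returned, mirroring the
 * documented m_pullup(9) contract.
 */
static struct toy_mbuf *
toy_pullup(struct toy_mbuf *m, size_t len)
{
	struct toy_mbuf *n;
	size_t take;

	assert(len <= sizeof(m->m_data));
	slow_path_calls++;
	while (m->m_len < len && (n = m->m_next) != NULL) {
		take = len - m->m_len;
		if (take > n->m_len)
			take = n->m_len;
		memcpy(m->m_data + m->m_len, n->m_data, take);
		m->m_len += take;
		memmove(n->m_data, n->m_data + take, n->m_len - take);
		n->m_len -= take;
		if (n->m_len == 0) {	/* drained: unlink and free */
			m->m_next = n->m_next;
			free(n);
		}
	}
	if (m->m_len < len) {
		toy_freem(m);
		return (NULL);
	}
	return (m);
}

/* The guard: skip the call when the head already holds len bytes. */
static struct toy_mbuf *
toy_ensure(struct toy_mbuf *m, size_t len)
{
	if (m->m_len >= len)
		return (m);
	return (toy_pullup(m, len));
}
```

With a helper like toy_ensure() the call site stays a single line, which is 
the compactness argument, while the common single-buffer case never enters 
the slow path.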

Robert N M Watson
Computer Laboratory
University of Cambridge

>
> from memory it is something like:
>
> if (m->m_len < headerlen && (m = m_pullup(m, headerlen)) == NULL) {
>       log(LOG_WARNING,
>           "nglmi: m_pullup failed for %d bytes\n", headerlen);
>       return (0);
> }
> header = mtod(m, struct header *);
>
>
>>> 
>>> m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data are
>>> contiguously stored so that the cast of w to m's data will point at a
>>> complete structure we can use to interpret packet data.  In the common 
>>> case
>>> in the receipt path, m_pullup() should be a no-op, since almost all 
>>> drivers
>>> receive data in a single cluster.
>>> 
>>> However, there are cases where it might not happen, such as loopback 
>>> traffic
>>> where unusual encapsulation is used, leading to a call to M_PREPEND() that
>>> inserts a new mbuf on the front of the chain, which is later m_defrag()'d
>>> leading to a higher level header crossing a boundary or the like.
>>> 
>>> This issue is almost entirely independent from things like the cache line
>>> miss issue, unless you hit the uncommon case of having to do work in
>>> m_pullup(), in which case life sucks.
>>> 
>>> It would be useful to use DTrace to profile a number of the workfull 
>>> m_foo()
>>> functions to make sure we're not hitting them in normal workloads, btw.
>> 
>> I highly doubt m_pullup will have any real effect on the RX path, given
>> how most drivers allocate the mbufs for the RX ring (all RX mbufs should
>> be mclusters).
>> 
>> Best Regards,
>> sephe
>> 
>
>

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 12:11:50 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D89491065825
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 12:11:50 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63906.mail.re1.yahoo.com (web63906.mail.re1.yahoo.com
	[69.147.97.121]) by mx1.freebsd.org (Postfix) with SMTP id 6F4388FC08
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 12:11:50 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 35966 invoked by uid 60001); 7 Apr 2009 12:11:50 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1239106309; bh=QQnklHzORTgtnBxvApx/35hxpzLUqRD4UPae3d/Tc6o=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=sA9hP0o38ntWa01awAfsC1A0EU4AhmzDXdHBzKJNkM+v0iBkoJJWjLRRXpVHUY70EAh3ejaubn65yKxKualyMGe4+7C6mIBf/N6vTEpF6ELUBxtA/RcDMNK247y1NK2y8hqpa8tSoIPL12bFNInF88M7Q4+51rb6uhAVDRNQrqk=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=rxa+BBk7r+B5uH+R5s9NJmh6yCCTuqY7SY4bL7UsDomTUfU+8bGNjKOHlOdVPDOPuOFyl+OCMyGAyTQhg6o0xAIOUWn/bQmcQCKQxr+G33rPGDcgxBuwnhJQ/OIsHozlipo+vhY24iuguupEdfp5WPU8xjGo71OVwAdYQqQZp2I=;
Message-ID: <952316.35609.qm@web63906.mail.re1.yahoo.com>
X-YMail-OSG: qPdgWekVM1kVWVo3zshllwJm0FXN5.Y29LgUBsOxAwqH91p0lPj_QYvwafSOggJrXuZs4mGJKkRfpIMTkov9eaboET89cPiWAUEsy.P_3NPlCI0v3EjL2Cwt_v9WYpZ7Yizu7N2d6zcx4qaQN_xGtxp8YmcXDrQmNXMnYs4pioh.lkk41Q3NTdmgIX0bdhKtknCFrLlmK_Qe6XHyVu4AF84tmxCRtGKQjASKvdW2OqVoVXpQ6XLIV1kHOLaK4SxrmpWDedZsvwC0dSX4eFoRzOkpKqgTffiZH5R82G2tbg3v.jWNNNnc7jcRjm1ZU5JTRyxPeG3EfkANJX9yRyhi1RTF
Received: from [98.242.222.229] by web63906.mail.re1.yahoo.com via HTTP;
	Tue, 07 Apr 2009 05:11:49 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Tue, 7 Apr 2009 05:11:49 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: Ivan Voras <ivoras@freebsd.org>, Robert Watson <rwatson@FreeBSD.org>
In-Reply-To: <alpine.BSF.2.00.0904061238250.34905@fledge.watson.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-net@freebsd.org
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 12:12:06 -0000


--- On Mon, 4/6/09, Robert Watson <rwatson@FreeBSD.org> wrote:

> From: Robert Watson <rwatson@FreeBSD.org>
> Subject: Re: Advice on a multithreaded netisr  patch?
> To: "Ivan Voras" <ivoras@freebsd.org>
> Cc: freebsd-net@freebsd.org
> Date: Monday, April 6, 2009, 7:59 AM
> On Mon, 6 Apr 2009, Ivan Voras wrote:
> 
> >>> I'd like to understand more. If (in
> netisr) I have a mbuf with headers, is this data already
> transferred from the card or is it magically "not here
> yet"?
> >> 
> >> A lot depends on the details of the card and
> driver.  The driver will take cache misses on the descriptor
> ring entry, if it's not already in cache, and the link
> layer will take a cache miss on the front of the ethernet
> frame in the cluster pointed to by the mbuf header as part
> of its demux. What happens next depends on your dispatch
> model and cache line size. Let's make a few simplifying
> assumptions that are mostly true:
> > 
> > So, a mbuf can reference data not yet copied from the
> NIC hardware? I'm specifically trying to understand what
> m_pullup() does.
> 
> I think we're talking slightly at cross purposes. 
> There are two transfers of interest:
> 
> (1) DMA of the packet data to main memory from the NIC
> (2) Servicing of CPU cache misses to access data in main
> memory
> 
> By the time you receive an interrupt, the DMA is complete,
> so once you believe a packet referenced by the descriptor
> ring is done, you don't have to wait for DMA.  However,
> the packet data is in main memory rather than your CPU
> cache, so you'll need to take a cache miss in order to
> retrieve it.  You don't want to prefetch before you know
> the packet data is there, or you may prefetch stale data
> from the previous packet sent or received from the cluster.
> 
> m_pullup() has to do with mbuf chain memory contiguity
> during packet processing.  The usual usage is something
> along the following lines:
> 
> 	struct whatever *w;
> 
> 	m = m_pullup(m, sizeof(*w));
> 	if (m == NULL)
> 		return;
> 	w = mtod(m, struct whatever *);
> 
> m_pullup() here ensures that the first sizeof(*w) bytes of
> mbuf data are contiguously stored so that the cast of w to
> m's data will point at a complete structure we can use
> to interpret packet data.  In the common case in the receipt
> path, m_pullup() should be a no-op, since almost all drivers
> receive data in a single cluster.
> 
> However, there are cases where it might not happen, such as
> loopback traffic where unusual encapsulation is used,
> leading to a call to M_PREPEND() that inserts a new mbuf on
> the front of the chain, which is later m_defrag()'d
> leading to a higher level header crossing a boundary or the
> like.
> 
> This issue is almost entirely independent from things like
> the cache line miss issue, unless you hit the uncommon case
> of having to do work in m_pullup(), in which case life
> sucks.
> 
> It would be useful to use DTrace to profile a number of the
> workfull m_foo() functions to make sure we're not
> hitting them in normal workloads, btw.
> 
> >>> As the card and the OS can already process
> many packets per second for
> >>> something fairly complex as routing
> >>> (http://www.tancsa.com/blast.html), and TCP
> chokes swi:net at 100% of
> >>> a core, isn't this indication there's
> certainly more space for
> >>> improvement even with a single-queue
> old-fashioned NICs?
> >> 
> >> Maybe.  It depends on the relative costs of local
> processing vs
> >> redistributing the work, which involves
> schedulers, IPIs, additional
> >> cache misses, lock contention, and so on.  This
> means there's a period
> >> where it can't possibly be a win, and then at
> some point it's a win as
> >> long as the stack scales.  This is essentially the
> usual trade-off in
> >> using threads and parallelism: does the benefit of
> multiple parallel
> >> execution units make up for the overheads of
> synchronization and data
> >> migration?
> > 
> > Do you have any idea at all why I'm seeing the
> weird difference of netstat packets per second (250,000) and
> my application's TCP performance (< 1,000 pps)?
> Summary: each packet is guaranteed to be a whole message
> causing a transaction in the application - without the
> changes I see pps almost identical to tps. Even if the
> source of netstat statistics somehow manages to count
> packets multiple times (I don't see how that can happen),
> no relation can describe differences this huge. It almost
> looks like something in the upper layers is discarding
> packets (also not likely: TCP timeouts would occur and the
> application wouldn't be able to push 250,000 pps) - but
> what? Where to look?
> 
> Is this for the loopback workload?  If so, remember that
> there may be some other things going on:
> 
> - Every packet is processed at least two times: once when
> sent, and then again
>   when it's received.
> 
> - A TCP segment will need to be ACK'd, so if you're
> sending data in chunks in
>   one direction, the ACKs will not be piggy-backed on
> existing data transfers,
>   and instead be sent independently, hitting the network
> stack two more times.
> 
> - Remember that TCP works to expand its window, and then
> maintains the highest
>   performance it can by bumping up against the top of
> available bandwidth
>   continuously.  This involves detecting buffer limits by
> generating packets
>   that can't be sent, adding to the packet count.  With
> loopback traffic, the
>   drop point occurs when you exceed the size of the
> netisr's queue for IP, so
>   you might try bumping that from the default to something
> much larger.
> 
> And nothing beats using tcpdump -- have you tried
> tcpdumping the loopback to see what is actually being sent? 
> If not, that's always educational -- perhaps something
> weird is going on with delayed ACKs, etc.
> 
> > You mean for the general code? I purposely don't
> lock my statistics variables because I'm not that
> interested in exact numbers (orders of magnitude are
> relevant). As far as I understand, unlocked "x++"
> should be trivially fast in this case?
> 
> No.  x++ is massively slow if executed in parallel across
> many cores on a variable in a single cache line.  See my
> recent commit to kern_tc.c for an example: the updating of
> trivial statistics for the kernel time calls reduced 30m
> syscalls/second to 3m syscalls/second due to heavy
> contention on the cache line holding the statistic.  One of
> my goals for 8.0 is to fix this problem for IP and TCP
> layers, and ideally also ifnet but we'll see.  We should
> be maintaining those stats per-CPU and then aggregating to
> report them to userspace.  This is what we already do for a
> number of system stats -- UMA and kernel malloc, syscall and
> trap counters, etc.
> 
> >> - Use cpuset to pin ithreads, the netisr, and
> whatever else, to specific
> >> cores
> >>   so that they don't migrate, and if your
> system uses HTT, experiment with
> >>   pinning the ithread and the netisr on different
> threads on the same
> >> core, or
> >>   at least, different cores on the same die.
> > 
> > I'm using em hardware; I still think there's a
> possibility I'm fighting the driver in some cases but
> this has priority #2.
> 
> Have you tried LOCK_PROFILING?  It would quickly tell you
> if driver locks were a source of significant contention.  It
> works quite well...

When I enabled LOCK_PROFILING my side modules, such as if_igb, 
stopped working. It seems that the ifnet structure or something 
changed with that option enabled. Is there a way to sync this without
having to integrate everything into a specific kernel build?

Barney


From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 12:54:26 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 432621065740;
	Tue,  7 Apr 2009 12:54:26 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 1D1778FC24;
	Tue,  7 Apr 2009 12:54:26 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id C830846BC8;
	Tue,  7 Apr 2009 08:54:25 -0400 (EDT)
Date: Tue, 7 Apr 2009 13:54:25 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Sepherosa Ziehau <sepherosa@gmail.com>
In-Reply-To: <ea7b9c170904062157u1c457f27md565f9a95a51a705@mail.gmail.com>
Message-ID: <alpine.BSF.2.00.0904071350520.45341@fledge.watson.org>
References: <gra7mq$ei8$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>
	<grac1s$p56$1@ger.gmane.org>
	<ea7b9c170904062157u1c457f27md565f9a95a51a705@mail.gmail.com>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 12:54:27 -0000


On Tue, 7 Apr 2009, Sepherosa Ziehau wrote:

> On Sun, Apr 5, 2009 at 9:34 PM, Ivan Voras <ivoras@freebsd.org> wrote:
>> Robert Watson wrote:
>>>
>>> On Sun, 5 Apr 2009, Ivan Voras wrote:
>>>
>>>> I thought this has something to do with NIC moderation (em) but
>>>> can't really explain it. The bad performance part (not the jump) is
>>>> also visible over the loopback interface.
>>>
>>> FYI, if you want high performance, you really want a card supporting 
>>> multiple input queues -- igb, cxgb, mxge, etc.  if_em-only cards are
>
> PCI-E em(4) supports 2 RX queues.  82571/82572 support 2 TX queues. I have 
> not tested multi-TX queues, but em(4) multi-RX queues work well in dfly 
> (tested with 82573 and 82571)

You may not have seen, but in FreeBSD 7.x and higher, we have a new if_igb 
driver to support more recent Intel gigabit devices, which now probes a few of 
the devices historically associated with if_em.  For example, on one of the 
boxes I use:

igb0: <Intel(R) PRO/1000 Network Connection version - 1.3.0> port 
0x3000-0x301f mem 
0xd8220000-0xd823ffff,0xd8200000-0xd821ffff,0xd8280000-0xd8283fff irq 32 at 
device 0.0 on pci8
igb0: Using MSIX interrupts with 3 vectors
igb0: [ITHREAD]
igb0: [ITHREAD]
igb0: [ITHREAD]
igb0: Ethernet address: 00:30:48:d2:ca:c2
igb1: <Intel(R) PRO/1000 Network Connection version - 1.3.0> port 
0x3020-0x303f mem 
0xd8260000-0xd827ffff,0xd8240000-0xd825ffff,0xd8284000-0xd8287fff irq 46 at 
device 0.1 on pci8
igb1: Using MSIX interrupts with 3 vectors
igb1: [ITHREAD]
igb1: [ITHREAD]
igb1: [ITHREAD]
igb1: Ethernet address: 00:30:48:d2:ca:c3
igb0: RX LRO Initialized
igb1: RX LRO Initialized

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 12:56:02 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id C945A10656D4;
	Tue,  7 Apr 2009 12:56:02 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id A27BF8FC22;
	Tue,  7 Apr 2009 12:56:02 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id 3ED3F46B9D;
	Tue,  7 Apr 2009 08:56:02 -0400 (EDT)
Date: Tue, 7 Apr 2009 13:56:02 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Barney Cordoba <barney_cordoba@yahoo.com>
In-Reply-To: <952316.35609.qm@web63906.mail.re1.yahoo.com>
Message-ID: <alpine.BSF.2.00.0904071354521.45341@fledge.watson.org>
References: <952316.35609.qm@web63906.mail.re1.yahoo.com>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 12:56:04 -0000


On Tue, 7 Apr 2009, Barney Cordoba wrote:

>> Have you tried LOCK_PROFILING?  It would quickly tell you if driver locks 
>> were a source of significant contention.  It works quite well...
>
> When I enabled LOCK_PROFILING my side modules, such as if_igb, stopped 
> working. It seems that the ifnet structure or something changed with that 
> option enabled. Is there a way to sync this without having to integrate 
> everything into a specific kernel build?

LOCK_PROFILING changes the size of lock-related data structures, so it 
requires both the kernel and the full set of modules to be rebuilt with the 
option.
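
As a rough illustration of why a mismatched module misbehaves, the sketch 
below models a lock that grows when a profiling option is compiled in, 
embedded in front of another field.  All struct and field names here 
(toy_lock_plain, toy_lock_prof, toy_ifnet_*, lo_prof_*) are hypothetical and 
are not the real kernel definitions; the only point is that the offset of 
every field after the lock moves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lock as compiled WITHOUT the profiling option. */
struct toy_lock_plain {
	const char	*lo_name;
	unsigned int	 lo_flags;
};

/* The same lock compiled WITH profiling: extra bookkeeping fields. */
struct toy_lock_prof {
	const char	*lo_name;
	unsigned int	 lo_flags;
	unsigned long	 lo_prof_hold;
	unsigned long	 lo_prof_wait;
};

/* An object that embeds the lock ahead of an ordinary field. */
struct toy_ifnet_plain {
	struct toy_lock_plain	if_lock;
	int			if_mtu;
};

struct toy_ifnet_prof {
	struct toy_lock_prof	if_lock;
	int			if_mtu;
};

/*
 * Read an int at a byte offset inside an object.  A module built with the
 * "plain" layout effectively does this with the plain offset, even when
 * the kernel handed it an object laid out with the "prof" layout.
 */
static int
toy_read_int_at(const void *obj, size_t off, size_t objsize)
{
	int v;

	assert(off + sizeof(v) <= objsize);
	memcpy(&v, (const unsigned char *)obj + off, sizeof(v));
	return (v);
}
```

Reading if_mtu from a profiling-layout object with the plain-layout offset 
lands inside the lock's profiling fields instead, which is the kind of silent 
breakage that rebuilding kernel and modules together avoids.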

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 13:57:48 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 3BC811065675;
	Tue,  7 Apr 2009 13:57:48 +0000 (UTC)
	(envelope-from sepherosa@gmail.com)
Received: from yx-out-2324.google.com (yx-out-2324.google.com [74.125.44.28])
	by mx1.freebsd.org (Postfix) with ESMTP id C46988FC0A;
	Tue,  7 Apr 2009 13:57:47 +0000 (UTC)
	(envelope-from sepherosa@gmail.com)
Received: by yx-out-2324.google.com with SMTP id 8so1658717yxm.13
	for <multiple recipients>; Tue, 07 Apr 2009 06:57:47 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:in-reply-to:references
	:date:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	bh=NRRxqRHX4tZnM+d5sybh74uSqZk7VEBGdro5w3v3oUM=;
	b=Y3xAq0Fz4k4FQxZO/DWcWfVJkSG+lKTCfSifAGsMHuPKq11Qo5LFRu1d9O1k2Pg0Pw
	IuX2B2iX2PMm33/GC9gQ69XYbh+gQJqXs+zS3YOPJKtW5xG3fGYR7OaQxKRPyYCZIbAp
	DYl8C6ZEUTFr26u2A3DP6zPPWZFAdyLLbeA80=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type:content-transfer-encoding;
	b=SWj8DIjSSxrbAt4j8gCxDx80o2TUMI3W0a0vRo57UW4Pwn4aSEdnaPAwWgNVwWmFQH
	+Mgkj+lksBwj3ZRxmzrDYzi6jptZSCiV+auJ1Clt/41JBuT646S5hAoUBhxTbCmAmZ9r
	0TjN1IK+DYgnwAR8Kub8Ev096YhJFCA1uQmJk=
MIME-Version: 1.0
Received: by 10.150.133.18 with SMTP id g18mr390437ybd.181.1239112667344; Tue, 
	07 Apr 2009 06:57:47 -0700 (PDT)
In-Reply-To: <alpine.BSF.2.00.0904071350520.45341@fledge.watson.org>
References: <gra7mq$ei8$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>
	<grac1s$p56$1@ger.gmane.org>
	<ea7b9c170904062157u1c457f27md565f9a95a51a705@mail.gmail.com>
	<alpine.BSF.2.00.0904071350520.45341@fledge.watson.org>
Date: Tue, 7 Apr 2009 21:57:47 +0800
Message-ID: <ea7b9c170904070657y1670fc80qd67dc1fa9cde2ff6@mail.gmail.com>
From: Sepherosa Ziehau <sepherosa@gmail.com>
To: Robert Watson <rwatson@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 13:57:48 -0000

On Tue, Apr 7, 2009 at 8:54 PM, Robert Watson <rwatson@freebsd.org> wrote:
>
> On Tue, 7 Apr 2009, Sepherosa Ziehau wrote:
>
>> On Sun, Apr 5, 2009 at 9:34 PM, Ivan Voras <ivoras@freebsd.org> wrote:
>>>
>>> Robert Watson wrote:
>>>>
>>>> On Sun, 5 Apr 2009, Ivan Voras wrote:
>>>>
>>>>> I thought this has something to do with NIC moderation (em) but
>>>>> can't really explain it. The bad performance part (not the jump) is
>>>>> also visible over the loopback interface.
>>>>
>>>> FYI, if you want high performance, you really want a card supporting
>>>> multiple input queues -- igb, cxgb, mxge, etc.  if_em-only cards are
>>
>> PCI-E em(4) supports 2 RX queues.  82571/82572 support 2 TX queues. I have
>> not tested multi-TX queues, but em(4) multi-RX queues work well in dfly
>> (tested with 82573 and 82571)
>
> You may not have seen, but in FreeBSD 7.x and higher, we have a new if_igb
> driver to support more recent Intel gigabit devices, which now probes a few
> of the devices historically associated with if_em.  For example, on one of
> the boxes I use:

If I understand the code correctly, it only takes 82575 and 82576; I
don't have the hardware, else I would have already added dfly support
(with multi rx queues at least, it seems 82576 supports 16 RX queues
:)

8257{1/2/3} are still taken by em(4) in FreeBSD.  In dfly, I simply
forked em(4) (named emx) to create a special version for pci-e
devices, for which Intel has published a developers' manual.  I added
multi-rxqueue support to it (multi-txqueue support is planned) and
cleaned up the TX/RX path.  IMHO, 82571 is too widely used to be
ignored.

Best Regards,
sephe

-- 
Live Free or Die

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 14:32:10 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id EF1BA10658CF
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 14:32:10 +0000 (UTC)
	(envelope-from ivoras@gmail.com)
Received: from mail-ew0-f171.google.com (mail-ew0-f171.google.com
	[209.85.219.171])
	by mx1.freebsd.org (Postfix) with ESMTP id 77D6F8FC18
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 14:32:10 +0000 (UTC)
	(envelope-from ivoras@gmail.com)
Received: by ewy19 with SMTP id 19so2320023ewy.43
	for <freebsd-net@freebsd.org>; Tue, 07 Apr 2009 07:32:09 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:sender:received:in-reply-to
	:references:from:date:x-google-sender-auth:message-id:subject:to:cc
	:content-type:content-transfer-encoding;
	bh=br+xAcGSEX1Epu8s1Rab+sL2rCJPJRd5KlaLH24qisI=;
	b=Tu/vsKjW19N70QXauqijB8U5G8W4UP1aY9GWqS3rdJQlz2/G65ZX/1j4sDqN8BiYT9
	+c99pvPe/g2H9LtN2+SaZb+HiNRbBaWN93PWxrva4ynS2KeKPEG0ydg4YzbSs2hWRRFm
	4QVAPWSN4BmxcLdcMSdV+XNmP/8zAKeKdMgOg=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:sender:in-reply-to:references:from:date
	:x-google-sender-auth:message-id:subject:to:cc:content-type
	:content-transfer-encoding;
	b=omaOyWPf9llhkg2UsUgiWKVkdTW7w9cMX000JntNqKhmnv/up6kJc4/ha4PcC2KUKX
	wxcu12bvs+r+jJRGNC6l8agMcrIBIlhF9fxruh6LuNyaLr1sDhybpD9RZQbVnx7XDn1p
	enOYbPaR6oGm/aJ7cFFHhiuPyIDh9GoFgskSg=
MIME-Version: 1.0
Sender: ivoras@gmail.com
Received: by 10.210.66.13 with SMTP id o13mr4122619eba.46.1239112867191; Tue, 
	07 Apr 2009 07:01:07 -0700 (PDT)
In-Reply-To: <ea7b9c170904070657y1670fc80qd67dc1fa9cde2ff6@mail.gmail.com>
References: <gra7mq$ei8$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org> 
	<grac1s$p56$1@ger.gmane.org>
	<ea7b9c170904062157u1c457f27md565f9a95a51a705@mail.gmail.com> 
	<alpine.BSF.2.00.0904071350520.45341@fledge.watson.org>
	<ea7b9c170904070657y1670fc80qd67dc1fa9cde2ff6@mail.gmail.com>
From: Ivan Voras <ivoras@freebsd.org>
Date: Tue, 7 Apr 2009 16:00:52 +0200
X-Google-Sender-Auth: a33dba50616821f1
Message-ID: <9bbcef730904070700x6f38e83dka1fdc06c48c14111@mail.gmail.com>
To: Sepherosa Ziehau <sepherosa@gmail.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-net@freebsd.org, Robert Watson <rwatson@freebsd.org>
Subject: Re: Advice on a multithreaded netisr patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 14:32:12 -0000

2009/4/7 Sepherosa Ziehau <sepherosa@gmail.com>:

> IMHO, 82571 is too widely used to be
> ignored.

+1 :)

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 14:45:07 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6996010656BC
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 14:45:07 +0000 (UTC)
	(envelope-from bz@FreeBSD.org)
Received: from mail.cksoft.de (mail.cksoft.de [195.88.108.3])
	by mx1.freebsd.org (Postfix) with ESMTP id 20DB48FC16
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 14:45:06 +0000 (UTC)
	(envelope-from bz@FreeBSD.org)
Received: from localhost (amavis.fra.cksoft.de [192.168.74.71])
	by mail.cksoft.de (Postfix) with ESMTP id BE18F41C757;
	Tue,  7 Apr 2009 16:45:05 +0200 (CEST)
X-Virus-Scanned: amavisd-new at cksoft.de
Received: from mail.cksoft.de ([195.88.108.3])
	by localhost (amavis.fra.cksoft.de [192.168.74.71]) (amavisd-new,
	port 10024)
	with ESMTP id wGUda5G0FqVq; Tue,  7 Apr 2009 16:45:05 +0200 (CEST)
Received: by mail.cksoft.de (Postfix, from userid 66)
	id 69ABE41C730; Tue,  7 Apr 2009 16:45:05 +0200 (CEST)
Received: from maildrop.int.zabbadoz.net (maildrop.int.zabbadoz.net
	[10.111.66.10])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mail.int.zabbadoz.net (Postfix) with ESMTP id 0FFBF4448E6;
	Tue,  7 Apr 2009 14:44:07 +0000 (UTC)
Date: Tue, 7 Apr 2009 14:44:07 +0000 (UTC)
From: "Bjoern A. Zeeb" <bz@FreeBSD.org>
X-X-Sender: bz@maildrop.int.zabbadoz.net
To: sthaug@nethelp.no
In-Reply-To: <20090406.121959.74751582.sthaug@nethelp.no>
Message-ID: <20090407144311.F15361@maildrop.int.zabbadoz.net>
References: <20090405.231044.74688369.sthaug@nethelp.no>
	<20090405214757.E15361@maildrop.int.zabbadoz.net>
	<20090405215842.C15361@maildrop.int.zabbadoz.net>
	<20090406.121959.74751582.sthaug@nethelp.no>
X-OpenPGP-Key: 0x14003F198FEFA3E77207EE8D2B58B8F83CCF1842
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org
Subject: Re: IPv6 window scaling factor always 1 on initial SYN
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 14:45:08 -0000

On Mon, 6 Apr 2009, sthaug@nethelp.no wrote:

>> Can you try changing it to < sb_max) for IPv6 as well and see if
>> things work (better) for you?
>
> I changed it, and that worked like a dream. Now I get basically the
> same throughput with IPv4 and IPv6. There are of course still issues
> like lots of IPv6 tunnels that add extra latency - but that's not the
> fault of FreeBSD.
>
> Anyway, thanks for your work. Below is a context diff (against 7-STABLE
> cvsupped last night). Do we need a PR to get this into FreeBSD?

It's in HEAD now as of SVN r190800.
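
For the archives, a small sketch of why the comparison limit matters when 
picking the window scale to request in the SYN.  This assumes the selection 
is a loop that raises the shift until the scaled maximum window reaches the 
limit; the committed diff is not reproduced here, and toy_request_scale() 
and the TOY_* constants are hypothetical names.  The buffer sizes mentioned 
afterwards are example values only.

```c
#include <assert.h>

#define	TOY_TCP_MAXWIN		65535	/* largest unscaled 16-bit window */
#define	TOY_TCP_MAX_WINSHIFT	14	/* largest shift RFC 1323 permits */

/*
 * Return the smallest shift for which the scaled maximum window is at
 * least `limit' bytes, capped at the protocol maximum.
 */
static int
toy_request_scale(unsigned long limit)
{
	int scale = 0;

	while (scale < TOY_TCP_MAX_WINSHIFT &&
	    ((unsigned long)TOY_TCP_MAXWIN << scale) < limit)
		scale++;
	assert(scale >= 0 && scale <= TOY_TCP_MAX_WINSHIFT);
	return (scale);
}
```

With a limit of 65536 bytes (an example per-socket buffer size) this yields 
a shift of 1, whereas a limit of 262144 bytes (an example system-wide 
maximum) yields 3, so the choice of limit alone can explain a scale that 
never rises above 1.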

-- 
Bjoern A. Zeeb                      The greatest risk is not taking one.

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 14:57:11 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 4A8AB10656CC
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 14:57:11 +0000 (UTC)
	(envelope-from sthaug@nethelp.no)
Received: from bizet.nethelp.no (bizet.nethelp.no [195.1.209.33])
	by mx1.freebsd.org (Postfix) with SMTP id 831A68FC13
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 14:57:10 +0000 (UTC)
	(envelope-from sthaug@nethelp.no)
Received: (qmail 25651 invoked from network); 7 Apr 2009 14:57:08 -0000
Received: from bizet.nethelp.no (HELO localhost) (195.1.209.33)
	by bizet.nethelp.no with SMTP; 7 Apr 2009 14:57:08 -0000
Date: Tue, 07 Apr 2009 16:57:08 +0200 (CEST)
Message-Id: <20090407.165708.74744827.sthaug@nethelp.no>
To: bz@FreeBSD.org
From: sthaug@nethelp.no
In-Reply-To: <20090407144311.F15361@maildrop.int.zabbadoz.net>
References: <20090405215842.C15361@maildrop.int.zabbadoz.net>
	<20090406.121959.74751582.sthaug@nethelp.no>
	<20090407144311.F15361@maildrop.int.zabbadoz.net>
X-Mailer: Mew version 3.3 on Emacs 21.3 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org
Subject: Re: IPv6 window scaling factor always 1 on initial SYN
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 14:57:13 -0000

> > I changed it, and that worked like a dream. Now I get basically the
> > same throughput with IPv4 and IPv6. There are of course still issues
> > like lots of IPv6 tunnels that add extra latency - but that's not the
> > fault of FreeBSD.
> >
> > Anyway, thanks for your work. Below is a context diff (against 7-STABLE
> > cvsupped last night). Do we need a PR to get this into FreeBSD?
> 
> It's in HEAD now as of SVN r190800.

Excellent news, thank you! And presumably we'll get an MFC after a
suitable settling time?

Steinar Haug, Nethelp consulting, sthaug@nethelp.no

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 16:47:41 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E97791065692
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 16:47:41 +0000 (UTC)
	(envelope-from julian@elischer.org)
Received: from outH.internet-mail-service.net (outh.internet-mail-service.net
	[216.240.47.231])
	by mx1.freebsd.org (Postfix) with ESMTP id C7B0A8FC22
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 16:47:41 +0000 (UTC)
	(envelope-from julian@elischer.org)
Received: from idiom.com (mx0.idiom.com [216.240.32.160])
	by out.internet-mail-service.net (Postfix) with ESMTP id A5F10B98E3;
	Tue,  7 Apr 2009 09:47:41 -0700 (PDT)
X-Client-Authorized: MaGic Cook1e
X-Client-Authorized: MaGic Cook1e
X-Client-Authorized: MaGic Cook1e
X-Client-Authorized: MaGic Cook1e
Received: from julian-mac.elischer.org (home.elischer.org [216.240.48.38])
	by idiom.com (Postfix) with ESMTP id 5EE282D60E1;
	Tue,  7 Apr 2009 09:47:37 -0700 (PDT)
Message-ID: <49DB83CB.9070707@elischer.org>
Date: Tue, 07 Apr 2009 09:48:11 -0700
From: Julian Elischer <julian@elischer.org>
User-Agent: Thunderbird 2.0.0.21 (Macintosh/20090302)
MIME-Version: 1.0
To: barney_cordoba@yahoo.com
References: <952316.35609.qm@web63906.mail.re1.yahoo.com>
In-Reply-To: <952316.35609.qm@web63906.mail.re1.yahoo.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, Robert Watson <rwatson@FreeBSD.org>,
	Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 16:47:42 -0000

Barney Cordoba wrote:
> 
> 
> 
> --- On Mon, 4/6/09, Robert Watson <rwatson@FreeBSD.org> wrote:
> 
>> From: Robert Watson <rwatson@FreeBSD.org>
>> Subject: Re: Advice on a multithreaded netisr  patch?
>> To: "Ivan Voras" <ivoras@freebsd.org>
>> Cc: freebsd-net@freebsd.org
>> Date: Monday, April 6, 2009, 7:59 AM
>> On Mon, 6 Apr 2009, Ivan Voras wrote:
>>
>>>>> I'd like to understand more. If (in
>> netisr) I have a mbuf with headers, is this data already
>> transferred from the card or is it magically "not here
>> yet"?
>>>> A lot depends on the details of the card and
>> driver.  The driver will take cache misses on the descriptor
>> ring entry, if it's not already in cache, and the link
>> layer will take a cache miss on the front of the ethernet
>> frame in the cluster pointed to by the mbuf header as part
>> of its demux. What happens next depends on your dispatch
>> model and cache line size. Let's make a few simplifying
>> assumptions that are mostly true:
>>> So, a mbuf can reference data not yet copied from the
>> NIC hardware? I'm specifically trying to understand what
>> m_pullup() does.
>>
>> I think we're talking slightly at cross purposes. 
>> There are two transfers of interest:
>>
>> (1) DMA of the packet data to main memory from the NIC
>> (2) Servicing of CPU cache misses to access data in main
>> memory
>>
>> By the time you receive an interrupt, the DMA is complete,
>> so once you believe a packet referenced by the descriptor
>> ring is done, you don't have to wait for DMA.  However,
>> the packet data is in main memory rather than your CPU
>> cache, so you'll need to take a cache miss in order to
>> retrieve it.  You don't want to prefetch before you know
>> the packet data is there, or you may prefetch stale data
>> from the previous packet sent or received from the cluster.
>>
>> m_pullup() has to do with mbuf chain memory contiguity
>> during packet processing.  The usual usage is something
>> along the following lines:
>>
>> 	struct whatever *w;
>>
>> 	m = m_pullup(m, sizeof(*w));
>> 	if (m == NULL)
>> 		return;
>> 	w = mtod(m, struct whatever *);
>>
>> m_pullup() here ensures that the first sizeof(*w) bytes of
>> mbuf data are contiguously stored so that the cast of w to
>> m's data will point at a complete structure we can use
>> to interpret packet data.  In the common case in the receipt
>> path, m_pullup() should be a no-op, since almost all drivers
>> receive data in a single cluster.
>>
>> However, there are cases where it might not happen, such as
>> loopback traffic where unusual encapsulation is used,
>> leading to a call to M_PREPEND() that inserts a new mbuf on
>> the front of the chain, which is later m_defrag()'d
>> leading to a higher level header crossing a boundary or the
>> like.
>>
>> This issue is almost entirely independent from things like
>> the cache line miss issue, unless you hit the uncommon case
>> of having to do work in m_pullup(), in which case life
>> sucks.
>>
>> It would be useful to use DTrace to profile a number of the
>> workfull m_foo() functions to make sure we're not
>> hitting them in normal workloads, btw.
>>
>>>>> As the card and the OS can already process
>> many packets per second for
>>>>> something fairly complex as routing
>>>>> (http://www.tancsa.com/blast.html), and TCP
>> chokes swi:net at 100% of
>>>>> a core, isn't this indication there's
>> certainly more space for
>>>>> improvement even with a single-queue
>> old-fashioned NICs?
>>>> Maybe.  It depends on the relative costs of local
>> processing vs
>>>> redistributing the work, which involves
>> schedulers, IPIs, additional
>>>> cache misses, lock contention, and so on.  This
>> means there's a period
>>>> where it can't possibly be a win, and then at
>> some point it's a win as
>>>> long as the stack scales.  This is essentially the
>> usual trade-off in
>>>> using threads and parallelism: does the benefit of
>> multiple parallel
>>>> execution units make up for the overheads of
>> synchronization and data
>>>> migration?
>>> Do you have any idea at all why I'm seeing the
>> weird difference of netstat packets per second (250,000) and
>> my application's TCP performance (< 1,000 pps)?
>> Summary: each packet is guaranteed to be a whole message
>> causing a transaction in the application - without the
>> changes I see pps almost identical to tps. Even if the
>> source of netstat statistics somehow manages to count
>> packets multiple times (I don't see how that can happen),
>> no relation can describe differences this huge. It almost
>> looks like something in the upper layers is discarding
>> packets (also not likely: TCP timeouts would occur and the
>> application wouldn't be able to push 250,000 pps) - but
>> what? Where to look?
>>
>> Is this for the loopback workload?  If so, remember that
>> there may be some other things going on:
>>
>> - Every packet is processed at least two times: once when
>> sent, and then again
>>   when it's received.
>>
>> - A TCP segment will need to be ACK'd, so if you're
>> sending data in chunks in
>>   one direction, the ACKs will not be piggy-backed on
>> existing data transfers,
>>   and instead be sent independently, hitting the network
>> stack two more times.
>>
>> - Remember that TCP works to expand its window, and then
>> maintains the highest
>>   performance it can by bumping up against the top of
>> available bandwidth
>>   continuously.  This involves detecting buffer limits by
>> generating packets
>>   that can't be sent, adding to the packet count.  With
>> loopback traffic, the
>>   drop point occurs when you exceed the size of the
>> netisr's queue for IP, so
>>   you might try bumping that from the default to something
>> much larger.
>>
>> And nothing beats using tcpdump -- have you tried
>> tcpdumping the loopback to see what is actually being sent? 
>> If not, that's always educational -- perhaps something
>> weird is going on with delayed ACKs, etc.
>>
>>> You mean for the general code? I purposely don't
>> lock my statistics variables because I'm not that
>> interested in exact numbers (orders of magnitude are
>> relevant). As far as I understand, unlocked "x++"
>> should be trivially fast in this case?
>>
>> No.  x++ is massively slow if executed in parallel across
>> many cores on a variable in a single cache line.  See my
>> recent commit to kern_tc.c for an example: the updating of
>> trivial statistics for the kernel time calls reduced 30m
>> syscalls/second to 3m syscalls/second due to heavy
>> contention on the cache line holding the statistic.  One of
>> my goals for 8.0 is to fix this problem for IP and TCP
>> layers, and ideally also ifnet but we'll see.  We should
>> be maintaining those stats per-CPU and then aggregating to
>> report them to userspace.  This is what we already do for a
>> number of system stats -- UMA and kernel malloc, syscall and
>> trap counters, etc.
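
A minimal userspace sketch of the per-CPU idea described above, not the
kernel's implementation. NCPU, CACHE_LINE and the pcpu_counter_* names
are made up for illustration, and a real kernel would take the CPU index
from the scheduler rather than as an argument.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NCPU       4    /* hypothetical fixed CPU count for this sketch */
#define CACHE_LINE 64   /* assumed cache line size in bytes */

/* One slot per CPU; the alignment pads each slot to a full cache line. */
struct pcpu_slot {
	_Alignas(CACHE_LINE) uint64_t count;
};

struct pcpu_counter {
	struct pcpu_slot slot[NCPU];
};

void
pcpu_counter_init(struct pcpu_counter *c)
{
	memset(c, 0, sizeof(*c));
}

/* Update path: a CPU writes only its own slot, so no line is shared. */
void
pcpu_counter_add(struct pcpu_counter *c, unsigned cpu, uint64_t n)
{
	assert(cpu < NCPU);
	c->slot[cpu].count += n;
}

/*
 * Reporting path: sum all slots.  The total is only approximate if
 * other CPUs are still updating, which is fine for statistics.
 */
uint64_t
pcpu_counter_fetch(const struct pcpu_counter *c)
{
	uint64_t total = 0;

	for (unsigned i = 0; i < NCPU; i++)
		total += c->slot[i].count;
	return total;
}
```

The cost moves from every increment to the (rare) read, which has to
walk all the slots.
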
>>
>>>> - Use cpuset to pin ithreads, the netisr, and
>> whatever else, to specific
>>>> cores
>>>>   so that they don't migrate, and if your
>> system uses HTT, experiment with
>>>>   pinning the ithread and the netisr on different
>> threads on the same
>>>> core, or
>>>>   at least, different cores on the same die.
>>> I'm using em hardware; I still think there's a
>> possibility I'm fighting the driver in some cases but
>> this has priority #2.
>>
>> Have you tried LOCK_PROFILING?  It would quickly tell you
>> if driver locks were a source of significant contention.  It
>> works quite well...
> 
> When I enabled LOCK_PROFILING my side modules, such as if_igb,
> stopped working. It seems that the ifnet structure or something 
> changed with that option enabled. Is there a way to sync this without
> having to integrate everything into a specific kernel build?
> 

No, I don't think there is any other way.

Last time I checked, the mutex structure changed size, which meant that
almost everything else that included a mutex changed size too.
That may not be true now, but I haven't checked.
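
To illustrate the kind of breakage, here is a small sketch with made-up
structures. toy_mtx_* and toy_softc_* are hypothetical and are not the
kernel's struct mtx or struct ifnet. It only shows how a field after an
embedded lock moves when the lock grows.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock as seen by code built WITHOUT the profiling option. */
struct toy_mtx_plain {
	volatile unsigned long lock_word;
};

/* The same lock as seen by code built WITH it: extra bookkeeping. */
struct toy_mtx_prof {
	volatile unsigned long lock_word;
	unsigned long acquire_count;	/* hypothetical profiling fields */
	unsigned long wait_time;
};

/* A structure embedding the lock, as a driver softc might. */
struct toy_softc_plain {
	struct toy_mtx_plain mtx;
	int flags;
};

struct toy_softc_prof {
	struct toy_mtx_prof mtx;
	int flags;
};

/* Where a module built without the option expects to find 'flags'. */
size_t
flags_offset_plain(void)
{
	return offsetof(struct toy_softc_plain, flags);
}

/* Where a kernel built with the option actually puts it. */
size_t
flags_offset_prof(void)
{
	return offsetof(struct toy_softc_prof, flags);
}
```

A module compiled against the first layout and loaded into a kernel
using the second would read and write 'flags' at the wrong offset.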

> Barney
> 
> 
>       
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"


From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 20:12:04 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id F0E48106566B
	for <net@freebsd.org>; Tue,  7 Apr 2009 20:12:04 +0000 (UTC)
	(envelope-from wollman@hergotha.csail.mit.edu)
Received: from hergotha.csail.mit.edu (hergotha.csail.mit.edu [66.92.79.170])
	by mx1.freebsd.org (Postfix) with ESMTP id A13A38FC14
	for <net@freebsd.org>; Tue,  7 Apr 2009 20:12:04 +0000 (UTC)
	(envelope-from wollman@hergotha.csail.mit.edu)
Received: from hergotha.csail.mit.edu (localhost [127.0.0.1])
	by hergotha.csail.mit.edu (8.14.2/8.14.2) with ESMTP id n37KC3Wb050335; 
	Tue, 7 Apr 2009 16:12:03 -0400 (EDT)
	(envelope-from wollman@hergotha.csail.mit.edu)
Received: (from wollman@localhost)
	by hergotha.csail.mit.edu (8.14.2/8.13.8/Submit) id n37KC3lA050334;
	Tue, 7 Apr 2009 16:12:03 -0400 (EDT) (envelope-from wollman)
Date: Tue, 7 Apr 2009 16:12:03 -0400 (EDT)
From: Garrett Wollman <wollman@hergotha.csail.mit.edu>
Message-Id: <200904072012.n37KC3lA050334@hergotha.csail.mit.edu>
To: rwatson@freebsd.org
X-Newsgroups: mit.lcs.mail.freebsd-net
In-Reply-To: <alpine.BSF.2.00.0904061238250.34905@fledge.watson.org>
References: <gra7mq$ei8$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051422280.12639@fledge.watson.org>
	<grac1s$p56$1@ger.gmane.org>
	<alpine.BSF.2.00.0904051440460.12639@fledge.watson.org>
	<grappq$tsg$1@ger.gmane.org>
	<alpine.BSF.2.00.0904052243250.34905@fledge.watson.org>
	<grbcfg$poe$1@ger.gmane.org>
Organization: None
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.0
	(hergotha.csail.mit.edu [127.0.0.1]);
	Tue, 07 Apr 2009 16:12:03 -0400 (EDT)
X-Spam-Status: No, score=-1.4 required=5.0 tests=ALL_TRUSTED
	autolearn=disabled version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
	hergotha.csail.mit.edu
Cc: net@freebsd.org
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 20:12:05 -0000

In article <alpine.BSF.2.00.0904061238250.34905@fledge.watson.org>,
Robert Watson writes:

>m_pullup() has to do with mbuf chain memory contiguity during packet 
>processing.

Historically, m_pullup() also had one other extremely important
function: to make sure that the header data you were about to modify
was not stored in a (possibly shared) cluster.  Thus, in the input
path for a typical driver which puts the whole packet into a cluster,
the very first m_pullup() would allocate a new plain mbuf, carefully
align the data pointer to allow for both prepending more headers and
pulling more header data out, and copy the requested data into the
internal buffer of the mbuf.
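
A toy userspace model of that historical behavior might look like the
following. This is a sketch under stated assumptions, not the kernel
routine: struct toy_mbuf, TOY_MLEN and toy_pullup() are invented for
illustration, and unlike the real m_pullup() it leaves the chain
untouched on failure and never frees emptied mbufs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MLEN 64	/* size of the internal buffer (hypothetical) */

struct toy_mbuf {
	struct toy_mbuf	*next;
	unsigned char	*data;		/* start of valid bytes */
	size_t		 len;		/* valid bytes in this mbuf */
	int		 external;	/* 1: data is in a (shared) cluster */
	unsigned char	 buf[TOY_MLEN];	/* internal storage */
};

/*
 * Return a chain whose first mbuf holds at least 'len' contiguous
 * bytes in its own internal buffer, or NULL if that is not possible.
 */
struct toy_mbuf *
toy_pullup(struct toy_mbuf *m, size_t len)
{
	struct toy_mbuf *n, *src;
	size_t need, take;

	if (len > TOY_MLEN)
		return NULL;
	if (!m->external && m->len >= len)
		return m;		/* already private and contiguous */

	/* Make sure the chain really holds 'len' bytes before copying. */
	need = len;
	for (src = m; src != NULL && need > 0; src = src->next)
		need -= src->len < need ? src->len : need;
	if (need > 0)
		return NULL;

	n = calloc(1, sizeof(*n));
	if (n == NULL)
		return NULL;
	/* Center the data: room to prepend and to pull up more later. */
	n->data = n->buf + (TOY_MLEN - len) / 2;

	/* Copy the header bytes out of the old chain into the new mbuf. */
	need = len;
	for (src = m; need > 0; src = src->next) {
		take = src->len < need ? src->len : need;
		memcpy(n->data + n->len, src->data, take);
		n->len += take;
		src->data += take;
		src->len -= take;
		need -= take;
	}
	n->next = m;
	return n;
}
```

After the call, the header bytes can be modified through the new first
mbuf without touching the cluster they were copied from.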

-GAWollman

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 21:48:59 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2CF17106566B
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 21:48:59 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63904.mail.re1.yahoo.com (web63904.mail.re1.yahoo.com
	[69.147.97.119]) by mx1.freebsd.org (Postfix) with SMTP id DB67D8FC1D
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 21:48:58 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 8939 invoked by uid 60001); 7 Apr 2009 21:48:58 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1239140938; bh=edKaXXlheTRCU7a0Hb5pxmKQ2ERAT4MbHjCQEYi7ygs=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=ap2rM8h0M51xEXthOsTANWjnGiPoD+kjw5015F2W3Ns736JanDrpJS0v64p8zJzzfouEuK/eUnyenN3fpupw8zy9OxgnXuwiEhit4lIPLJKcxZnJJ7ITdeL2zEZNzP0aVncU18G1sehJWG+YP15YLpm0SVXp/PM99R/NclQ2uPQ=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=qJbvV67MgqUsc9v7u8Pep/lKk+M4r90MEg4PXMGBHzYdoU273V0Bw3D8mHj0ZWCtDUcYBLrT2AYp8uwz43y7mHk+UGk8fAIJCUzZzJ7QcDyKJwWP6a9u6szqRub9u8spnVUn/MYBxxlvKaQmMLrRD4gplm/06vWOP9ezfajw5fQ=;
Message-ID: <409843.2186.qm@web63904.mail.re1.yahoo.com>
X-YMail-OSG: dL.jcakVM1k6TmJf3INAATWN1iNL8FVjrttDiroG4bbTPvxKraJAX2hujlMVzNKZxUNKOvi5YSFsmzlHn2QcmQOkIQTU9x_cUPDxLm8RZ.WDuCO9cT65a.4wt5jqbecm.YmLJgT8jA8BZJeqpwLNVmxrcd42e1FXbJrki1TrMc.opHujVQPEHUp1HHDj0Jux1oMbIPmI1qUQj.objoYLcwQb6Ve06xS7JPLjZyVA0tX3WmLeZSmQuK3DfGffl1t2iuouKo8rLuqabS3lIF3yDnfpTftWHtT71oHHY8XCEQw6IzrU3s13nA5YGJAR
Received: from [98.242.222.229] by web63904.mail.re1.yahoo.com via HTTP;
	Tue, 07 Apr 2009 14:48:58 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Tue, 7 Apr 2009 14:48:58 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: Robert Watson <rwatson@freebsd.org>, Sepherosa Ziehau <sepherosa@gmail.com>
In-Reply-To: <ea7b9c170904070657y1670fc80qd67dc1fa9cde2ff6@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 21:48:59 -0000


--- On Tue, 4/7/09, Sepherosa Ziehau <sepherosa@gmail.com> wrote:

> From: Sepherosa Ziehau <sepherosa@gmail.com>
> Subject: Re: Advice on a multithreaded netisr patch?
> To: "Robert Watson" <rwatson@freebsd.org>
> Cc: freebsd-net@freebsd.org, "Ivan Voras" <ivoras@freebsd.org>
> Date: Tuesday, April 7, 2009, 9:57 AM
> On Tue, Apr 7, 2009 at 8:54 PM, Robert Watson
> <rwatson@freebsd.org> wrote:
> >
> > On Tue, 7 Apr 2009, Sepherosa Ziehau wrote:
> >
> >> On Sun, Apr 5, 2009 at 9:34 PM, Ivan Voras
> <ivoras@freebsd.org> wrote:
> >>>
> >>> Robert Watson wrote:
> >>>>
> >>>> On Sun, 5 Apr 2009, Ivan Voras wrote:
> >>>>
> >>>>> I thought this has something to do
> with NIC moderation (em) but
> >>>>> can't really explain it. The bad
> performance part (not the jump) is
> >>>>> also visible over the loopback
> interface.
> >>>>
> >>>> FYI, if you want high performance, you
> really want a card supporting
> >>>> multiple input queues -- igb, cxgb, mxge,
> etc.  if_em-only cards are
> >>
> >> PCI-E em(4) supports 2 RX queues.  82571/82572
> support 2 TX queues. I have
> >> not tested multi-TX queues, but em(4) multi-RX
> queues work well in dfly
> >> (tested with 82573 and 82571)
> >
> > You may not have seen, but in FreeBSD 7.x and higher,
> we have a new if_igb
> > driver to support more recent Intel gigabit devices,
> which now probes a few
> > of the devices historically associated with if_em. 
> For example, on one of
> > the boxes I use:
> 
> If I understand the code correctly, it only takes 82575 and
> 82576; I
> don't have the hardware, else I would have already
> added dfly support
> (with multi rx queues at least, it seems 82576 supports 16
> RX queues
> :)

Regarding if_igb:

1) Multiple TX queues are not supported. There's some hokey code to
test, but it doesn't properly separate flows to the queues.
2) 2 Rx queues don't work, so only 1 and 4 work
3) With 4 queues, it just sucks up CPU under heavy load on 4 cpus. It will
blow 4 cpus at a lower load than em will with 1
4) You'll need to fix DMA setup, as it sets the alignment requirement
to PAGE_SIZE. I haven't been able to convince Jack that it's wrong, not
that I've tried very hard since it's easy to just fix myself.

Barney


From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 21:56:26 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 7C72A1065743
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 21:56:26 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63906.mail.re1.yahoo.com (web63906.mail.re1.yahoo.com
	[69.147.97.121]) by mx1.freebsd.org (Postfix) with SMTP id 1FC998FC1F
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 21:56:25 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 25587 invoked by uid 60001); 7 Apr 2009 21:56:25 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1239141385; bh=KBo73T9ilm4q+IXA55Itnp76EQ8NJVXR0cQznLgkga8=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=FSbp4MK2Wm9w/5mhmOuhy+rNc+iovbGLZfJy1WcpScTwTvCM1JjnXsA87ObvUrIS2clbEycvW8I+8Db1srfCmoES5LSZQCrGIp5kf3Oq8Fs37RhD26g75wGGhaK4tCY7vccccjRInzWrO0e24kPEDIed/6utNKX7rvRgSZVaP/o=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=pGjMMLcxwmWigyEDeAggOYyQpwyEKz+EbMkNxxBIUcUcniibWmhR6kRufteM3FpTKoFajICPNKGxT9aCIe5Pssl1U8i8cu9Aljwxra29jXl2a4aoJ2ElSlmpO2K1nrDdyCA4tIoEMqKH7+YNY0blOPTmqdDhUJJpi8qN3EAXRzQ=;
Message-ID: <497906.25422.qm@web63906.mail.re1.yahoo.com>
X-YMail-OSG: COc4omAVM1llwXEVxppqfUU04BejMB9gaw5LPI1ufEdVDCHGPgIfYhVm1r3zts.hH1E4wx9ct6hwueImTOxpmChDAf7bxLuf5Sc8qzFwB0f8FninXZIpU5bS7R2Euyq.4QUtBp_rhCdccIXle_WNqw4cvVofmhvanJhh.YMMvey6CWNtpYAnMKzwftr0RWxoV6ll1RV7uFLMp_HZVNRerY4BeZEse7fwtfT.M0hH3_MZekW5MD1Ni_GjxAD2zz6Kj39CmeqUSIn_6Snh02WFTAPE2BjDFf8jub.Ry0Z6XQwvGF_fkAEwSHbGLDg3kSuHrsn_UXODc5PYJWIx2st.QJCz
Received: from [98.242.222.229] by web63906.mail.re1.yahoo.com via HTTP;
	Tue, 07 Apr 2009 14:56:25 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Tue, 7 Apr 2009 14:56:25 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: Robert Watson <rwatson@FreeBSD.org>
In-Reply-To: <alpine.BSF.2.00.0904071354521.45341@fledge.watson.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 21:56:27 -0000


--- On Tue, 4/7/09, Robert Watson <rwatson@FreeBSD.org> wrote:

> From: Robert Watson <rwatson@FreeBSD.org>
> Subject: Re: Advice on a multithreaded netisr  patch?
> To: "Barney Cordoba" <barney_cordoba@yahoo.com>
> Cc: freebsd-net@freebsd.org, "Ivan Voras" <ivoras@freebsd.org>
> Date: Tuesday, April 7, 2009, 8:56 AM
> On Tue, 7 Apr 2009, Barney Cordoba wrote:
> 
> >> Have you tried LOCK_PROFILING?  It would quickly
> tell you if driver locks were a source of significant
> contention.  It works quite well...
> > 
> > When I enabled LOCK_PROFILING my side modules, such as
> if_igb, stopped working. It seems that the ifnet structure
> or something changed with that option enabled. Is there a
> way to sync this without having to integrate everything into
> a specific kernel build?
> 
> LOCK_PROFILING changes the size of lock-related data
> structures, so requires both kernel and full set of modules
> to be rebuilt with the option.

It might be good to mention this in the man page. Most 3rd party
drivers build stand-alone, and even if you pull down the latest
drivers from Intel or Broadcom they're usually built outside the
kernel build. It's pretty frustrating to have random things failing,
mbuf leaks, etc. without any warning.

Barney


From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 22:00:21 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2FE24106578D
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 22:00:21 +0000 (UTC)
	(envelope-from freebsd-net@m.gmane.org)
Received: from ciao.gmane.org (main.gmane.org [80.91.229.2])
	by mx1.freebsd.org (Postfix) with ESMTP id A0A098FC08
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 22:00:20 +0000 (UTC)
	(envelope-from freebsd-net@m.gmane.org)
Received: from list by ciao.gmane.org with local (Exim 4.43)
	id 1LrJL8-0005TA-HE
	for freebsd-net@freebsd.org; Tue, 07 Apr 2009 22:00:19 +0000
Received: from 93-141-119-106.adsl.net.t-com.hr ([93.141.119.106])
	by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
	id 1AlnuQ-0007hv-00
	for <freebsd-net@freebsd.org>; Tue, 07 Apr 2009 22:00:18 +0000
Received: from ivoras by 93-141-119-106.adsl.net.t-com.hr with local (Gmexim
	0.1 (Debian)) id 1AlnuQ-0007hv-00
	for <freebsd-net@freebsd.org>; Tue, 07 Apr 2009 22:00:18 +0000
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-net@freebsd.org
From: Ivan Voras <ivoras@freebsd.org>
Date: Tue, 07 Apr 2009 23:59:34 +0200
Lines: 40
Message-ID: <grgid4$u6c$1@ger.gmane.org>
References: <ea7b9c170904070657y1670fc80qd67dc1fa9cde2ff6@mail.gmail.com>
	<409843.2186.qm@web63904.mail.re1.yahoo.com>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature";
	boundary="------------enigB390DE759E042013D08A7D71"
X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: 93-141-119-106.adsl.net.t-com.hr
User-Agent: Thunderbird 2.0.0.21 (Windows/20090302)
In-Reply-To: <409843.2186.qm@web63904.mail.re1.yahoo.com>
X-Enigmail-Version: 0.95.7
Sender: news <news@ger.gmane.org>
Subject: Re: Advice on a multithreaded netisr patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 22:00:22 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enigB390DE759E042013D08A7D71
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Barney Cordoba wrote:

> 1) Multiple TX queues are not supported. There's some hokey code to
> test, but it doesn't properly separate flows to the queues.
> 2) 2 Rx queues don't work, so only 1 and 4 work
> 3) With 4 queues, it just sucks up CPU under heavy load on 4 cpus. It will
> blow 4 cpus at a lower load than em will with 1
> 4) You'll need to fix DMA setup, as it sets the alignment requirement
> to PAGE_SIZE. I haven't been able to convince Jack that it's wrong, not
> that I've tried very hard since it's easy to just fix myself.

Reading this thread, it looks like the development of both Intel drivers
is a bit stalled, doesn't it? AFAIK the em driver is also
semi-officially abandoned, and from both my experience and others' it
looks like new development and patches are being rejected. Time to shop
for other hardware?


--------------enigB390DE759E042013D08A7D71
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAknbzNIACgkQldnAQVacBcikhgCfesB1stCznijfA0tadxj3CjtE
Nj8AnRXVnKZT8gLCDh4EODY9JM2ICE5p
=x0kZ
-----END PGP SIGNATURE-----

--------------enigB390DE759E042013D08A7D71--


From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 22:24:17 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D3FE310656C7
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 22:24:17 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63901.mail.re1.yahoo.com (web63901.mail.re1.yahoo.com
	[69.147.97.116]) by mx1.freebsd.org (Postfix) with SMTP id 8E2668FC15
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 22:24:17 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 65588 invoked by uid 60001); 7 Apr 2009 22:24:16 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1239143056; bh=SlEu34A7VdbgZUxz38n+v990JMVuEfGsN87WKdh5nvk=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:In-Reply-To:MIME-Version:Content-Type;
	b=nyPuwfNQn73q3EDyuh6x8tyxlIVsbQlnQeXFmuz804nVc38iBrFUxDfC2p41R9oe5RW/eC/oVa1xzyU9Q2oMq9OaOXX7IkWjjSXnwsM+1Wo3SqzaBcB6ogCvMODgRVOvPlQP4TIfS+4CXDWgjJ+5eRdvlQ9p6iXgX3BSB8cqBdM=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:In-Reply-To:MIME-Version:Content-Type;
	b=JnBzAqR8QcOsCqF+ifDSUMvF4b9EJegkhyQGXbJ64gqtkGDNAAgU4e/gHNnZglX3ePbEQagJCmn/MniVT8wuGEc6CihHAjaF3A+aoMOtJIK/WOcgVIcvUJdB5VK5UoNr+W8OdOEp7cFKQdcabJw1Lb1Urr9xV9+Yr8OOof9Of98=;
Message-ID: <900824.65358.qm@web63901.mail.re1.yahoo.com>
X-YMail-OSG: Lx63aHgVM1k9rc6b0Xup8J3ieUOaCnx7OiNoNhJAksQVPc6OI8dHNDwzefpR4zCQWOsicFm9k8H5kK0J3fzeeY4B57Tu5NsiG4xcI70OZry9GlwFq3eeEkDwCtw85C7cb5tTNL241vgE6MaM.z6GUwDDuDuEGmbHrYpeTPpggajUqKWbs80PsY8ncjp9vxrDDzta4.uR2PySp.bpHJkbftCdgGcmSJq3ZZRvOSnZbyiy8T4.2QdPn3SfNTWfEJpqGBEVbHuCb3QgPvgeEuqeB1sHLHeKrhmGY.5ud6uWWEwHeW66d4cYuYQV2aFxwfqhfwqEH5_Ec.2HQUzFfxYzNCAd
Received: from [98.242.222.229] by web63901.mail.re1.yahoo.com via HTTP;
	Tue, 07 Apr 2009 15:24:16 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Tue, 7 Apr 2009 15:24:16 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
In-Reply-To: <grgid4$u6c$1@ger.gmane.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: 
Subject: Re: Advice on a multithreaded netisr patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 22:24:18 -0000


--- On Tue, 4/7/09, Ivan Voras <ivoras@freebsd.org> wrote:

> From: Ivan Voras <ivoras@freebsd.org>
> Subject: Re: Advice on a multithreaded netisr patch?
> To: freebsd-net@freebsd.org
> Date: Tuesday, April 7, 2009, 5:59 PM
> Barney Cordoba wrote:
> 
> > 1) Multiple TX queues are not supported. There's
> some hokey code to
> > test, but it doesn't properly separate flows to
> the queues.
> > 2) 2 Rx queues don't work, so only 1 and 4 work
> > 3) With 4 queues, it just sucks up CPU under heavy
> load on 4 cpus. It will
> > blow 4 cpus at a lower load than em will with 1
> > 4) You'll need to fix DMA setup, as it sets the
> alignment requirement
> > to PAGE_SIZE. I haven't been able to convince Jack
> that its wrong, not
> > that I've tried very hard since its easy to just
> fix myself.
> 
> Reading this thread it looks like the development of both
> Intel drivers
> is a bit stalled, doesn't it? AFAIK the em driver is
> also
> semi-officially abandoned, and both from my experience and
> others it
> looks like new development and patches are being rejected.
> Time to shop
> other hardware?

To be fair, the OS doesn't really support multiqueue yet, or has
for only a few hours, so let's not go crazy.

It makes a lot more sense to have someone on the "team" work with
Jack on improving the performance and working out the kinks. When
I asked Jack about the poor performance of if_igb, he indicated that
Intel's position is that the drivers are "just samples", which really
doesn't give anyone much confidence that they want to run their business
on them. You already have Jack doing all of the hard work, that is,
supporting the new-chip-per-week that Intel puts out, so it seems to
me the best strategy would be to try to convince Intel that it's in
their best interest to have drivers that work well so people don't
think that their hardware stinks.

As an example, the Chelsio 10Gb bypass card is $3495 and an Intel
card is ~$1000, so it's a big win for the community as a whole to have
good Intel drivers going forward.

My work is commercially proprietary so I can't share my code, but
I can certainly share ideas on things that I've tested and discovered.

Barney


From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 22:52:51 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 906051065672;
	Tue,  7 Apr 2009 22:52:51 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 6C5918FC0A;
	Tue,  7 Apr 2009 22:52:51 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id ECB2E46BA6;
	Tue,  7 Apr 2009 18:52:50 -0400 (EDT)
Date: Tue, 7 Apr 2009 23:52:50 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Barney Cordoba <barney_cordoba@yahoo.com>
In-Reply-To: <497906.25422.qm@web63906.mail.re1.yahoo.com>
Message-ID: <alpine.BSF.2.00.0904072352250.85326@fledge.watson.org>
References: <497906.25422.qm@web63906.mail.re1.yahoo.com>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 22:52:52 -0000


On Tue, 7 Apr 2009, Barney Cordoba wrote:

>>> When I enabled LOCK_PROFILING my side modules, such as
>> if_ibg, stopped working. It seems that the ifnet structure or something 
>> changed with that option enabled. Is there a way to sync this without 
>> having to integrate everything into a specific kernel build?
>>
>> LOCK_PROFILING changes the size of lock-related data structures, so 
>> requires both kernel and full set of modules to be rebuilt with the option.
>
> It might be good to mention this in the man page. Most 3rd party drivers 
> build stand-alone, and even if you pull down the latest drivers from intel 
> or broadcom they're usually built out of the kernel build. Its pretty 
> frustrating to have random things failing, mbuf leaks, etc without any 
> warning.

>From the man page:

NOTES
      The LOCK_PROFILING option increases the size of struct lock_object, so a
      kernel built with that option will not work with modules built without
      it.
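
To illustrate why the mismatch in the note above breaks modules, here is a
minimal user-space sketch. The struct names and fields are hypothetical
stand-ins, not the real FreeBSD definitions; the only assumption taken from
the thread is that the profiling option makes the lock structure larger.
Anything that embeds the lock ahead of other fields then sees those fields
at different offsets, so a module built without the option reads and writes
the wrong memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock layout as seen by a module built WITHOUT profiling. */
struct demo_lock_plain {
    const char *lo_name;
    unsigned int lo_flags;
};

/* Hypothetical lock layout as seen by a kernel built WITH profiling. */
struct demo_lock_profiled {
    const char *lo_name;
    unsigned int lo_flags;
    void *lo_profile;       /* extra field that only exists with the option */
};

/* A driver-style structure that embeds the lock before its own fields. */
struct demo_softc_plain {
    struct demo_lock_plain lock;
    int rx_packets;
};

struct demo_softc_profiled {
    struct demo_lock_profiled lock;
    int rx_packets;
};

/* Offset of rx_packets as each side of the build believes it to be. */
size_t demo_offset_plain(void)
{
    return offsetof(struct demo_softc_plain, rx_packets);
}

size_t demo_offset_profiled(void)
{
    return offsetof(struct demo_softc_profiled, rx_packets);
}

/* Returns 1 when the two builds disagree about the layout. */
int demo_layouts_differ(void)
{
    return sizeof(struct demo_lock_plain) != sizeof(struct demo_lock_profiled) &&
        demo_offset_plain() != demo_offset_profiled();
}
```

Nothing flags the disagreement at load time in this sketch, which is
consistent with the "random things failing" symptom described earlier in
the thread.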

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 23:00:32 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8C0951065674
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 23:00:32 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63907.mail.re1.yahoo.com (web63907.mail.re1.yahoo.com
	[69.147.97.122]) by mx1.freebsd.org (Postfix) with SMTP id 2D9D08FC1D
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 23:00:31 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 29927 invoked by uid 60001); 7 Apr 2009 23:00:31 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1239145231; bh=AucuYJ5g0s5VtRGOipkOMWqmY/pXZLBiPwzwj0LZ27w=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=UtHCDwgWW8xsfqihpoCWbbeOwBXQ4zNJa9/Hu4HH6EbigYbFPgOiIquLYR3Wsks4no3LtW6f5FrxB/VmjChpjLdnbsH+VBOnV1/ESN39mWasARzRY6mcXKuSjUVkIL0EkK0sdlKhA7ChugYx1gaRNcqEQXXkWdqlTj3AIx18Vsc=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=6NA9LfSlNWYVhZPMYn+e+Y86PMvQolM2S78IczthrghDrO1XidAV2u8iNw1DGGWh8/8Uw6biEbQjsC1SBfeQeCYKt191mq2QJcdNMrkC4aqFoMknpaRI6XdSfV0oTuBcR/Ymfrxk5GCfTh4iOtBdLGhdtHqQBbbGb4K30VmZ6EU=;
Message-ID: <532949.28323.qm@web63907.mail.re1.yahoo.com>
X-YMail-OSG: gAm5F18VM1na0rTIhXdcTdBH7xEGGb2s1XZa66t8gdJTrriaMCwk8i7JUwTfG8KxLQmWmBtztFV9tXSW9WuCoiVjbprZ4q2Yo_Z1Fd2.evF_Wpynbu1HNSxxkYU2Efh._cSVZ.x6xMob4n7hPFX7xyNK5nA78TpJZCEKYxrh5ybk73fIY.WMgjLOjS8tLmCAoZLdfyUyT3UJtAhPs6t5Ki5jx4555wPKONyq3gsKpMjUj9GVF3Uzow0_hCMhc7931L4IEhXZMuGgoCBOR.NE9UZ5TaUnMfDBjVjsNXYAfQYw98Igq07KUqfSKhYrwBW5By1kHJ3QIF8uV3DW
Received: from [98.242.222.229] by web63907.mail.re1.yahoo.com via HTTP;
	Tue, 07 Apr 2009 16:00:31 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Tue, 7 Apr 2009 16:00:31 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: Robert Watson <rwatson@FreeBSD.org>
In-Reply-To: <alpine.BSF.2.00.0904072352250.85326@fledge.watson.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 23:00:33 -0000


--- On Tue, 4/7/09, Robert Watson <rwatson@FreeBSD.org> wrote:

> From: Robert Watson <rwatson@FreeBSD.org>
> Subject: Re: Advice on a multithreaded netisr  patch?
> To: "Barney Cordoba" <barney_cordoba@yahoo.com>
> Cc: freebsd-net@freebsd.org, "Ivan Voras" <ivoras@freebsd.org>
> Date: Tuesday, April 7, 2009, 6:52 PM
> On Tue, 7 Apr 2009, Barney Cordoba wrote:
> 
> >>> When I enabled LOCK_PROFILING my side modules,
> such as
> >> if_ibg, stopped working. It seems that the ifnet
> structure or something changed with that option enabled. Is
> there a way to sync this without having to integrate
> everything into a specific kernel build?
> >> 
> >> LOCK_PROFILING changes the size of lock-related
> data structures, so requires both kernel and full set of
> modules to be rebuilt with the option.
> > 
> > It might be good to mention this in the man page. Most
> 3rd party drivers build stand-alone, and even if you pull
> down the latest drivers from intel or broadcom they're
> usually built out of the kernel build. Its pretty
> frustrating to have random things failing, mbuf leaks, etc
> without any warning.
> 
> From the man page:
> 
> NOTES
>      The LOCK_PROFILING option increases the size of struct
> lock_object, so a
>      kernel built with that option will not work with
> modules built without
>      it.

Nice work. It's not in the 7.0 man page, unfortunately for me :(

BC


From owner-freebsd-net@FreeBSD.ORG  Tue Apr  7 23:01:51 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 29D131065672
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 23:01:51 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63908.mail.re1.yahoo.com (web63908.mail.re1.yahoo.com
	[69.147.97.123]) by mx1.freebsd.org (Postfix) with SMTP id CAE7C8FC1F
	for <freebsd-net@freebsd.org>; Tue,  7 Apr 2009 23:01:50 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 65348 invoked by uid 60001); 7 Apr 2009 23:01:50 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1239145310; bh=TNCarjbeDz92XOg0Nbmnx6LOTPn3fJ52QCo8dUdaz4w=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=Po/Zp4CUF5juQI8QykxyR9OjTT9Ka8FKm5nNiBUXfbU7y2QClAqUpT/SdM9ehGEutCiH+oX/7Np8n+pOVXRHaOpyNJ7PlMAAKCBNVXXkvI58Gwy4tbr8CJvL85e1D6mVujVZBWbLNZfaWsaG9P3VsvtzHTtxpn6aj+A0iFnsLNs=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=3N26WW+pmOnaAqQY+k4br6aE9NWIgQwt1j84xAUdCBRhnSYkop2+XjykqvodxYV1gR64xg/FVONU26r3peGupH3guMkn1UmhG+xO9jAvx9BAT+UkKOjVWX+n8zr0252A182n+8SU1B+rrQZRByUMMXI2cPAEziefAltnmzJz7RI=;
Message-ID: <362116.58661.qm@web63908.mail.re1.yahoo.com>
X-YMail-OSG: F2tI4QYVM1kRFnhVrO7qDwGFdOk7D5rkg8Ta7EykM0Z64Svz0.QTFM5UwSWfOObe7iukHkFLKisdsxpFkHyuMG63dKsRapz5p7LLoecphkyM.i9SxY0qFr9nbiRYeHAhe3RS8iBbzyc_.RonzfOFWiyzO4FVsePY1HiTy7x.Lx14fzpMTPRMAbNYxznVde7ZUtJYNikZ0udym0h7ceto_.atWWhsTTnTTqm0yFI82IvPIfoY3DvHGR2e3PLS4b53H_T8h8kvmUkJo.fkcaQRM8Ltlrk_lVilPKtuMdYeAsA6DH66tIGv6DvqcPnz
Received: from [98.242.222.229] by web63908.mail.re1.yahoo.com via HTTP;
	Tue, 07 Apr 2009 16:01:50 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Tue, 7 Apr 2009 16:01:50 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: Robert Watson <rwatson@FreeBSD.org>
In-Reply-To: <alpine.BSF.2.00.0904072352250.85326@fledge.watson.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Apr 2009 23:01:51 -0000


--- On Tue, 4/7/09, Robert Watson <rwatson@FreeBSD.org> wrote:

> From: Robert Watson <rwatson@FreeBSD.org>
> Subject: Re: Advice on a multithreaded netisr  patch?
> To: "Barney Cordoba" <barney_cordoba@yahoo.com>
> Cc: freebsd-net@freebsd.org, "Ivan Voras" <ivoras@freebsd.org>
> Date: Tuesday, April 7, 2009, 6:52 PM
> On Tue, 7 Apr 2009, Barney Cordoba wrote:
> 
> >>> When I enabled LOCK_PROFILING my side modules,
> such as
> >> if_ibg, stopped working. It seems that the ifnet
> structure or something changed with that option enabled. Is
> there a way to sync this without having to integrate
> everything into a specific kernel build?
> >> 
> >> LOCK_PROFILING changes the size of lock-related
> data structures, so requires both kernel and full set of
> modules to be rebuilt with the option.
> > 
> > It might be good to mention this in the man page. Most
> 3rd party drivers build stand-alone, and even if you pull
> down the latest drivers from intel or broadcom they're
> usually built out of the kernel build. Its pretty
> frustrating to have random things failing, mbuf leaks, etc
> without any warning.
> 
> From the man page:
> 
> NOTES
>      The LOCK_PROFILING option increases the size of struct
> lock_object, so a
>      kernel built with that option will not work with
> modules built without
>      it

Never mind. Obviously I just plain missed it.

BC


From owner-freebsd-net@FreeBSD.ORG  Wed Apr  8 06:25:25 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0F106106566B
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 06:25:25 +0000 (UTC)
	(envelope-from wahjava@gmail.com)
Received: from ti-out-0910.google.com (ti-out-0910.google.com [209.85.142.186])
	by mx1.freebsd.org (Postfix) with ESMTP id 654678FC1D
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 06:25:24 +0000 (UTC)
	(envelope-from wahjava@gmail.com)
Received: by ti-out-0910.google.com with SMTP id u5so3245576tia.3
	for <multiple recipients>; Tue, 07 Apr 2009 23:25:23 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:received:sender:received:date:from:to
	:cc:subject:message-id:references:mime-version:content-type
	:content-disposition:in-reply-to:x-face:x-attribution:x-os-kernel
	:x-os-version:x-os-architecture:x-uptime:x-url:x-mail-morse
	:x-openpgp-fingerprint:x-openpgp-id:organization:user-agent;
	bh=i0us+SNj+tluPzal/igKVuHLHXkXZGwwSWEQWBHs9pc=;
	b=i6KkXBiOasRh/Ft8UAVzu68bjqY0zaXyVYmAB1BUsLbtL2gRn8A1AghYi0vTWbzzVo
	OHZ/PXYZgzFBqYYMUiVpJ/TyEq3RkaYQVPOjSbbtVm8kCkZ042a73nOl226Wdd3nnYDm
	M3XWScRK9p80EKboe8sYyjqfrrqF3aWEdgm80=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=sender:date:from:to:cc:subject:message-id:references:mime-version
	:content-type:content-disposition:in-reply-to:x-face:x-attribution
	:x-os-kernel:x-os-version:x-os-architecture:x-uptime:x-url
	:x-mail-morse:x-openpgp-fingerprint:x-openpgp-id:organization
	:user-agent;
	b=bAVvaL1P0vhlUApk+8zaeFOuIF7x9+dDtY5bZRl4hRBMha66aJuXU3sP6rWH4bpxzW
	yLHre81Csipx/b0VJ1S9tC9639Mex7KFvlfGMkuRWRuR07F1vWHqeTuSG2Sza/SumGE3
	/afP6VBsOU56rDlYDFtEJE40lAZHojgB/UAoQ=
Received: by 10.110.5.14 with SMTP id 14mr1279863tie.40.1239171922891;
	Tue, 07 Apr 2009 23:25:22 -0700 (PDT)
Received: from chateau.d.lf ([122.162.186.216])
	by mx.google.com with ESMTPS id 25sm51766tif.12.2009.04.07.23.25.19
	(version=TLSv1/SSLv3 cipher=RC4-MD5);
	Tue, 07 Apr 2009 23:25:21 -0700 (PDT)
Sender: Ashish SHUKLA <wahjava@gmail.com>
Received: by chateau.d.lf (Postfix, from userid 1001)
	id CDE681E0F7; Wed,  8 Apr 2009 11:55:58 +0530 (IST)
Date: Wed, 8 Apr 2009 11:55:58 +0530
From: Ashish SHUKLA <wahjava.ml@gmail.com>
To: Hajimu UMEMOTO <ume@freebsd.org>
Message-ID: <20090408062558.GA10933@chateau.d.lf>
References: <87y6ud5p62.fsf@chateau.d.lf> <ygeeiw5uqkl.wl%ume@mahoroba.org>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="SUOF0GtieIMvvwua"
Content-Disposition: inline
In-Reply-To: <ygeeiw5uqkl.wl%ume@mahoroba.org>
X-Face: )vGQ9yK7Y$Flebu1C>(B\gYBm)[$zfKM+p&TT[[JWl6:]S>cc$%-z7-`46Zf0B*syL.C
	]oCq[upTG~zuS0.$"_%)|Q@$hA=9{3l{%u^h3jJ^Zl;t7
X-Attribution: =?unknown-8bit?B?4KSG4KS24KWA4KS3?=
X-OS-Kernel: FreeBSD
X-OS-Version: 8.0-CURRENT
X-OS-Architecture: amd64
X-Uptime: 11:53AM  up 23 mins, 7 users, load averages: 1.49, 1.30, 0.86
X-URL: http://wahjava.wordpress.com/
X-Mail-Morse: .-- .- .... .--- .- ...- .- .--.-. --. -- .- .. .-.. .-.-.-
	-.-. --- --
X-OpenPGP-Fingerprint: 1E00 4679 77E4 F8EE 2E4B 56F2 1F2F 8410 762E 5E74
X-OpenPGP-ID: 762E5E74
Organization: /\/0/\/3
User-Agent: Mutt/1.5.19 (2009-01-05)
Cc: freebsd-net@freebsd.org
Subject: Re: getaddrinfo() unable to resolve IPv6 addresses
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 08 Apr 2009 06:25:25 -0000


--SUOF0GtieIMvvwua
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

In <ygeeiw5uqkl.wl%ume@mahoroba.org>, Hajimu UMEMOTO wrote:

[...]

>
>No, I believe it was already fixed.  Please, re-cvsup and try it.

I re-cvsup'ed it and it worked, thanks for the reply.

--=20
Ashish SHUKLA

--SUOF0GtieIMvvwua
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.11 (FreeBSD)

iEYEARECAAYFAkncQ3UACgkQHy+EEHYuXnQctgCfQhVF7tiEQZJkACm+oxwo2kf+
BFUAn0C3UoXophnUhpqDQlQxFX04DU+M
=qxza
-----END PGP SIGNATURE-----

--SUOF0GtieIMvvwua--

From owner-freebsd-net@FreeBSD.ORG  Wed Apr  8 11:48:24 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 78B72106566C
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 11:48:24 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63902.mail.re1.yahoo.com (web63902.mail.re1.yahoo.com
	[69.147.97.117]) by mx1.freebsd.org (Postfix) with SMTP id 23A2A8FC1B
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 11:48:23 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 92696 invoked by uid 60001); 8 Apr 2009 11:48:23 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1239191303; bh=kdnpCcgWtzdpmadFoMyiWTti7itNG73pQlPd/szdRzg=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=DLsWxNIcUFM8ey1oQQLL8bse7E350lZ/0xyA2WZKAkM1XlJO9uFGeWemnWwrTMqwTna8dz6Vk94bv2BqGR6FtK1x1t2n0c2agBmhCTGsj8/95LR3q0OfqZktx0qRZcbBaH9JZDaxNR1TInKpLsDXwGzMcCQ11AGdUCjImKxcLmk=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=MwA6zm8SrK8bKsSzz/isZzPqqX2mHCNaSGcrybdN0RhLXoByRuFEZlHp/LUBaGVaTfItCrvDKYcAx4Z4ZomlzvxZMFPTXxzm41aIp5QAi475k66L8dBOPdRIj+nNRV4My1IVVmXVwO2+5zHw38rhcE/1hXMmpQBnXTOenxUSSTk=;
Message-ID: <477001.91824.qm@web63902.mail.re1.yahoo.com>
X-YMail-OSG: o_IEk9YVM1kTpayy3hf6OZf.0yGuAbFpXxpD1EhQlzTxpyJuFQIdDofcQW9VQjupgQQB46dwI9g7z8HOB0rDs5KNx3aWDcwqOozsxCfLAw8NiB2cTcA7x5.VSNfVQ7ksdVzKvZ.qGunr.YW7v0qFccFdfHIHgGg9gR5aF57QM.YyIbMQSPRGFD7DBcjzqthixMlTHU3oMS2tyzi8dr91paH1Hql9xQS1MimOZPeKcTdrTwyi64VvUESTsr8aWeQqQ3X9lbXsy55Yk0rSWqbZPycJfZTo7TCSOHx0HCxZnLMkOu79ZrkLlbFdR.yr
Received: from [98.242.222.229] by web63902.mail.re1.yahoo.com via HTTP;
	Wed, 08 Apr 2009 04:48:23 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Wed, 8 Apr 2009 04:48:23 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: "H.Fazaeli" <fazaeli@sepehrs.com>
In-Reply-To: <49DC3961.8090707@sepehrs.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 08 Apr 2009 11:48:24 -0000


--- On Wed, 4/8/09, H.Fazaeli <fazaeli@sepehrs.com> wrote:

> From: H.Fazaeli <fazaeli@sepehrs.com>
> Subject: Re: Advice on a multithreaded netisr patch?
> To: barney_cordoba@yahoo.com
> Cc: freebsd-net@freebsd.org, "Ivan Voras" <ivoras@freebsd.org>
> Date: Wednesday, April 8, 2009, 1:42 AM
> Barney Cordoba wrote:
> >
> >
> > --- On Tue, 4/7/09, Ivan Voras
> <ivoras@freebsd.org> wrote:
> >
> >   
> >> From: Ivan Voras <ivoras@freebsd.org>
> >> Subject: Re: Advice on a multithreaded netisr
> patch?
> >> To: freebsd-net@freebsd.org
> >> Date: Tuesday, April 7, 2009, 5:59 PM
> >> Barney Cordoba wrote:
> >>
> >>     
> >>> 1) Multiple TX queues are not supported.
> There's
> >>>       
> >> some hokey code to
> >>     
> >>> test, but it doesn't properly separate
> flows to
> >>>       
> >> the queues.
> >>     
> >>> 2) 2 Rx queues don't work, so only 1 and 4
> work
> >>> 3) With 4 queues, it just sucks up CPU under
> heavy
> >>>       
> >> load on 4 cpus. It will
> >>     
> >>> blow 4 cpus at a lower load than em will with
> 1
> >>> 4) You'll need to fix DMA setup, as it
> sets the
> >>>       
> >> alignment requirement
> >>     
> >>> to PAGE_SIZE. I haven't been able to
> convince Jack
> >>>       
> >> that its wrong, not
> >>     
> >>> that I've tried very hard since its easy
> to just
> >>>       
> >> fix myself.
> >>
> >> Reading this thread it looks like the development
> of both
> >> Intel drivers
> >> is a bit stalled, doesn't it? AFAIK the em
> driver is
> >> also
> >> semi-officially abandoned, and both from my
> experience and
> >> others it
> >> looks like new development and patches are being
> rejected.
> >> Time to shop
> >> other hardware?
> >>     
> >
> > To be fair, the OS doesn't really support
> multiqueue yet, or has
> > for only a few hours, so lets not go crazy.
> >
> > It makes a lot more sense to have someone on the
> "team" work with
> > Jack on improving the performance and working out the
> kinks. When
> > I asked Jack about the poor performance of if_igb, he
> indicated that
> > Intel's position is that the drivers are
> "just samples", which really
> > doesn't give anyone much confidence that they want
> to run their business
> > on them. You already  have Jack doing all of the hard
> work; that is 
> > supporting the new-chip-per-week that intel puts out,
> so it seems to 
> > me the best strategy would be to try to convince Intel
> that its in
> > their best interest to have drivers that work well so
> people don't 
> > think that their hardware stinks.
> >
> > As an example, the Chelsio 10gb bypass card is $3495.
> and an Intel
> > card is ~$1000, so its a big win for the community as
> a whole to have
> > good intel drivers going forward.
> >
> > My work is commercially proprietary so I can't
> share my code, but
> > I can certainly share ideas on things that I've
> tested and discovered.
> >
> >   
> can you provide more details on the improvements you
> achieved?
> 
> > Barney
> >
> >
> >       
> > _______________________________________________
> > freebsd-net@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to
> "freebsd-net-unsubscribe@freebsd.org"
> >
> >   
> 
> -- 

As all developers know, programming is 90% learning and 10% code. So
far, I've implemented multiqueue for 7.x and gotten everything
to work for both igb and ixgbe. igb isn't all that interesting
since em can easily handle 1 Gb/s; so ixgbe is really the
goal. The igb and ixgbe are similar designs so the work is
somewhat parallel. 

As of now, I'm working on separating the theory from the real
world and getting a feel for which design techniques work best.
I'm also *not* designing for a system that uses the stack (mine is a
filtering firewall type system), so the things that Robert talks about
apply differently. A web server, for example, will likely have only 1
controller and many user threads, while a router or firewall will have
2 equally loaded NICs with few if any user threads. It's quite likely
that completely different approaches are needed to optimize each.

I'm at the point of testing design approaches, so the jury is out as
to what can be achieved.

What I can say is that multiqueue isn't a panacea, or even desirable, if
it's not designed correctly. Out of the box, increasing the number of
queues just to "spread interrupts" doesn't seem to have any advantage;
in fact it seems to make things worse in terms of utilization. I'm not
entirely sure why as of yet.


Barney


From owner-freebsd-net@FreeBSD.ORG  Wed Apr  8 13:05:10 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 133F0106574E
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 13:05:10 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63906.mail.re1.yahoo.com (web63906.mail.re1.yahoo.com
	[69.147.97.121]) by mx1.freebsd.org (Postfix) with SMTP id C3DFE8FC26
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 13:05:09 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 36277 invoked by uid 60001); 8 Apr 2009 13:05:09 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1239195908; bh=YW2tnSmfnygNnRQuQwrrkN1nB3LAe25kxMHcXnYqLwc=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=mwRcjdSNzX2t6tZ0kuGlVSJecCzCc/jYyYSVQHR6IpUes0I90Ughtsf+fI2I60qZ97s/MUqMFBtP3cSW7Q/tOsWzi79YifUnEGe30ltB63BX12lAZwTmh+4WZ6X0vEzGtH64V4nsRios4HfkOuXdu7Scx352mK707mhw+bBWNdw=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=1QdE1+CIrbkqa+DN1kIZiyigAAKGWT7Mc8Wi395VMWLLC/Mv+hx4YuUry67M/lYqNLNUpy5Wnjn0Pvcb25mJdXvAofRDv7AK76jQjVtExJt5eMAiYSRK6dVyWu76zR76GrBJUw9DWJX7HTSc/fe0w7odWeeIr70K7DuaYiz3LTk=;
Message-ID: <871699.35154.qm@web63906.mail.re1.yahoo.com>
X-YMail-OSG: t3W3E6AVM1lR2mfu51oHbNoGyO4mKfzC02dDp96njH5sc6Pr15G275IBpIoPFDtN_WILwo9orGJg7lYKsrzlY_MjrmjZhIzdilc9Li3sutmd2cGt3ExkRAAsG0xEXRc1yi0e3hkmu0EC_WOh8PsrVx8Li.ahudCpkJbPQzOaY6PaVdkQMHwSMEy4h6q7F5NpIuIFLDgrP9jNBrwOIbbZDVHqv_54pWHbolJ15nKOderXTTt5bktGAtF6TLRcrsuxH8vv6w7WkLCMPtxrthhr.UXaBF9RCiTTomEHZeOG.vsrHp7JhAH_rxRioQk8
Received: from [98.242.222.229] by web63906.mail.re1.yahoo.com via HTTP;
	Wed, 08 Apr 2009 06:05:08 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Wed, 8 Apr 2009 06:05:08 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: Ivan Voras <ivoras@freebsd.org>, Robert Watson <rwatson@FreeBSD.org>
In-Reply-To: <alpine.BSF.2.00.0904061934240.18619@fledge.watson.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-net@freebsd.org
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 08 Apr 2009 13:05:10 -0000


--- On Mon, 4/6/09, Robert Watson <rwatson@FreeBSD.org> wrote:

> From: Robert Watson <rwatson@FreeBSD.org>
> Subject: Re: Advice on a multithreaded netisr  patch?
> To: "Ivan Voras" <ivoras@freebsd.org>
> Cc: freebsd-net@freebsd.org
> Date: Monday, April 6, 2009, 2:52 PM
> On Mon, 6 Apr 2009, Ivan Voras wrote:
> 
> >> I think we're talking slightly at cross
> purposes.  There are two
> >> transfers of interest:
> >> 
> >> (1) DMA of the packet data to main memory from the
> NIC
> >> (2) Servicing of CPU cache misses to access data
> in main memory
> >> 
> >> By the time you receive an interrupt, the DMA is
> complete, so once you
> > 
> > OK, this was what was confusing me - for a moment I
> thought you meant it's not so.
> 
> It's a polite lie that we will choose to believe for the
> purposes of simplification.  And probably true for all our
> drivers in practice right now.
> 
> >>     m = m_pullup(m, sizeof(*w));
> >>     if (m == NULL)
> >>         return;
> >>     w = mtod(m, struct whatever *);
> >> 
> >> m_pullup() here ensures that the first sizeof(*w)
> bytes of mbuf data are contiguously stored so that the cast
> of w to m's data will point at a
> > 
> > So, m_pullup() can resize / realloc() the mbuf? (not
> that it matters for this purpose)
> 
> Yes -- if it can't meet the contiguity requirements
> using the current mbuf chain, it may reallocate and return a
> new head to the chain (hence m being reassigned).  If that
> reallocation fails, it may return NULL.  Once you've
> called m_pullup(), existing pointers into the chain's
> data will be invalid, so if you've already called mtod()
> on it, you need to call it again.
> 
> >> - A TCP segment will need to be ACK'd, so if
> you're sending data in
> >> chunks in
> >>   one direction, the ACKs will not be piggy-backed
> on existing data
> >> transfers,
> >>   and instead be sent independently, hitting the
> network stack two more
> >> times.
> > 
> > No combination of these can make an accounting
> difference between 1,000 and 250,000 pps. I must be hitting
> something very bad here.
> 
> Yes, you definitely want to run tcpdump to see what's
> going on here.
> 
> >> - Remember that TCP works to expand its window,
> and then maintains the
> >> highest
> >>   performance it can by bumping up against the top
> of available bandwidth
> >>   continuously.  This involves detecting buffer
> limits by generating
> >> packets
> >>   that can't be sent, adding to the packet
> count.  With loopback
> >> traffic, the
> >>   drop point occurs when you exceed the size of
> the netisr's queue for
> >> IP, so
> >>   you might try bumping that from the default to
> something much larger.

Robert,

Is there any work being done on lighter weight locks for queues?
It seems ridiculous to avoid using queues because of lock contention
when the locks are only protecting a couple lines of code.

Barney


From owner-freebsd-net@FreeBSD.ORG  Wed Apr  8 13:16:54 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 60A25106566C;
	Wed,  8 Apr 2009 13:16:54 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 3D3A98FC17;
	Wed,  8 Apr 2009 13:16:54 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id D4DB646B86;
	Wed,  8 Apr 2009 09:16:53 -0400 (EDT)
Date: Wed, 8 Apr 2009 14:16:53 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Barney Cordoba <barney_cordoba@yahoo.com>
In-Reply-To: <871699.35154.qm@web63906.mail.re1.yahoo.com>
Message-ID: <alpine.BSF.2.00.0904081412540.61921@fledge.watson.org>
References: <871699.35154.qm@web63906.mail.re1.yahoo.com>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 08 Apr 2009 13:16:54 -0000


On Wed, 8 Apr 2009, Barney Cordoba wrote:

> Is there any work being done on lighter weight locks for queues? It seems 
> ridiculous to avoid using queues because of lock contention when the locks 
> are only protecting a couple lines of code.

My reading is that there are two, closely related, things going on: the first 
is lock contention, and the second is cache line contention.  We have a 
lockless atomic buffer primitive in 8.x (don't think it's been MFC'd yet) 
for use in drivers and other parts of the stack.  However, 
that addresses only lock contention, not line contention, which at a high PPS 
will be an issue as well.  Only by moving to independent data structures 
(i.e., on independent cache lines) can we reduce line contention.
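
As a rough illustration of the two effects (this is not the 8.x primitive, 
whose interface is not shown in this thread), here is a minimal 
single-producer/single-consumer ring in C11.  The names spsc_ring, 
ring_enqueue and ring_dequeue, the 8-slot size, and the 64-byte CACHE_LINE 
are all assumptions made for the sketch.  Dropping the mutex removes lock 
contention; putting the producer and consumer indices on separate lines is 
what addresses line contention.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_LINE 64	/* assumed line size; real hardware varies */
#define RING_SLOTS 8	/* must be a power of two */

/* Hypothetical ring: exactly one producer thread, one consumer thread. */
struct spsc_ring {
	/* Written only by the producer. */
	alignas(CACHE_LINE) atomic_size_t head;
	/* Written only by the consumer, kept on its own cache line. */
	alignas(CACHE_LINE) atomic_size_t tail;
	alignas(CACHE_LINE) void *slot[RING_SLOTS];
};

/* The two indices must not share a cache line. */
static_assert(offsetof(struct spsc_ring, tail) -
    offsetof(struct spsc_ring, head) >= CACHE_LINE,
    "head and tail share a cache line");

/* Producer side: returns false if the ring is full. */
static bool
ring_enqueue(struct spsc_ring *r, void *item)
{
	size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail == RING_SLOTS)
		return (false);
	r->slot[head & (RING_SLOTS - 1)] = item;
	/* Publish the slot contents before the new head becomes visible. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return (true);
}

/* Consumer side: returns NULL if the ring is empty. */
static void *
ring_dequeue(struct spsc_ring *r)
{
	size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
	void *item;

	if (tail == head)
		return (NULL);
	item = r->slot[tail & (RING_SLOTS - 1)];
	/* Release the slot back to the producer. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return (item);
}
```

A zero-initialised struct spsc_ring is an empty ring.  Note that even with 
the padding, each side still reads the other side's index, so some line 
traffic remains; fully independent per-CPU structures avoid that as well.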

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-net@FreeBSD.ORG  Wed Apr  8 13:18:47 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id EB16B1065670
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 13:18:47 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63905.mail.re1.yahoo.com (web63905.mail.re1.yahoo.com
	[69.147.97.120]) by mx1.freebsd.org (Postfix) with SMTP id 947DD8FC23
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 13:18:47 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 80998 invoked by uid 60001); 8 Apr 2009 13:18:47 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1239196727; bh=/fIMzpZLP2XmJMMmaEej6tbfxky/LBhSR1MX9uLDFew=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=XBE5Fn+1IAnjZlyEMnTRHa7le/OdxBr66m1wLDGDgqztv3uDV3emVPBhNY0fjqg9SN1BerT1CJ+BtvjnqbFaQ+qWFwbVd1oDxIvuFKywwoYEazXGclpuZvMye3F5S5DQoZ1Wd65xoLGUjx3e75xJ1pmE5bm8kxRDxxaDQjAdjmo=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=F5p9Y7DLyVq9iL4RBGD/l0s/Q+MQ9cQ/kzUZJzg2qWtgZNZtf1ve7u/XCpW9wJsw2LAoZTl+22CA938EO+Ep4hgKQn2tEygv1JsYAVi5/i3j4VihXYu3W2Sc6mROWXLNIQRH+YoEhFhi5sEb3T9rFTm5AWMUYvGXhvwRkddM8SY=;
Message-ID: <75700.80930.qm@web63905.mail.re1.yahoo.com>
X-YMail-OSG: Fd3qFk8VM1lxJrsNffrHuuzNOaXrjP8zb5PtUBPMKoQ1BISFadJK1rf9bhOd4ONKd9rjt1MNZX8SfGii8QL1QulA0ZuBZ0by9cFkEeFJUd9G9Lgxj7sTQbjJB8bKlu.pQxBBHMQDI7TeBROWL8qFg8Qlla4KNPTL9fDvLhwqA855OWQCe6JjD5bqiI4xeBNYALNcMr4DXL9h4GmtB5ntSJ.lEvqWe4E.I0CwaZfPQHn3.e2YLCpU1ASkrXbRuZgxbaaz5KQaYgKYc.hXAu6rSBGh2f8e8omHJqkJr.nEt_eOEwTzI3i8R0BfpfhP
Received: from [98.242.222.229] by web63905.mail.re1.yahoo.com via HTTP;
	Wed, 08 Apr 2009 06:18:46 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Wed, 8 Apr 2009 06:18:46 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: Robert Watson <rwatson@FreeBSD.org>
In-Reply-To: <alpine.BSF.2.00.0904071354521.45341@fledge.watson.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 08 Apr 2009 13:18:48 -0000


--- On Tue, 4/7/09, Robert Watson <rwatson@FreeBSD.org> wrote:

> From: Robert Watson <rwatson@FreeBSD.org>
> Subject: Re: Advice on a multithreaded netisr  patch?
> To: "Barney Cordoba" <barney_cordoba@yahoo.com>
> Cc: freebsd-net@freebsd.org, "Ivan Voras" <ivoras@freebsd.org>
> Date: Tuesday, April 7, 2009, 8:56 AM
> On Tue, 7 Apr 2009, Barney Cordoba wrote:
> 
> >> Have you tried LOCK_PROFILING?  It would quickly
> tell you if driver locks were a source of significant
> contention.  It works quite well...
> > 
> > When I enabled LOCK_PROFILING my side modules, such as
> if_ibg, stopped working. It seems that the ifnet structure
> or something changed with that option enabled. Is there a
> way to sync this without having to integrate everything into
> a specific kernel build?
> 
> LOCK_PROFILING changes the size of lock-related data
> structures, so requires both kernel and full set of modules
> to be rebuilt with the option.

What are the units for lock profiling? For example, the "average
wait" is in what units? 

Is there a way to reset the stats counters? If not, it might be nifty if 
toggling prof.enable reset the stats to run some different kinds of
tests without rebooting.

Barney


From owner-freebsd-net@FreeBSD.ORG  Wed Apr  8 13:54:19 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 35BCF10656FF
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 13:54:18 +0000 (UTC)
	(envelope-from spawk@acm.poly.edu)
Received: from acm.poly.edu (acm.poly.edu [128.238.9.200])
	by mx1.freebsd.org (Postfix) with ESMTP id C22788FC7E
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 13:54:08 +0000 (UTC)
	(envelope-from spawk@acm.poly.edu)
Received: (qmail 42326 invoked from network); 8 Apr 2009 13:54:08 -0000
Received: from unknown (HELO ?10.0.0.135?) (spawk@128.238.64.31)
	by acm.poly.edu with AES256-SHA encrypted SMTP;
	8 Apr 2009 13:54:08 -0000
Message-ID: <49DCAC1F.9000708@acm.poly.edu>
Date: Wed, 08 Apr 2009 09:52:31 -0400
From: Boris Kochergin <spawk@acm.poly.edu>
User-Agent: Thunderbird 2.0.0.19 (X11/20090108)
MIME-Version: 1.0
To: freebsd-net@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Multi-BSS problem with Atheros 5212
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 08 Apr 2009 13:54:39 -0000

Ahoy. I'm having trouble with multiple hostap-mode wlan pseudo-devices. 
The machine is an 8-CURRENT from yesterday:

# uname -a
FreeBSD test 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Tue Apr  7 16:54:56 UTC 
2009     root@test:/usr/obj/usr/src/sys/GENERIC  i386

# dmesg | grep ath
ath0: <Atheros 5212> mem 0xf4100000-0xf410ffff irq 11 at device 13.0 on pci0
ath0: [ITHREAD]
ath0: AR2413 mac 7.9 RF2413 phy 4.5

# cat /etc/rc.conf
wlans_ath0="wlan0 wlan1 wlan2"
create_args_wlan0="wlanmode hostap bssid"
create_args_wlan1="wlanmode hostap bssid"
create_args_wlan2="wlanmode hostap bssid"
ifconfig_wlan0="ssid wlan0 wepmode off up"
ifconfig_wlan1="ssid wlan1 wepmode off up"
ifconfig_wlan2="ssid wlan2 wepmode off up"

# ifconfig
ath0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 2290
        ether 00:18:e7:33:5e:24
        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g <hostap>
        status: running
fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 00:90:27:72:c4:f3
        inet 10.0.0.128 netmask 0xffffff00 broadcast 10.0.0.255
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=3<RXCSUM,TXCSUM>
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet6 ::1 prefixlen 128
        inet 127.0.0.1 netmask 0xff000000
wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 00:18:e7:33:5e:24
        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g <hostap>
        status: running
        ssid wlan0 channel 11 (2462 Mhz 11g) bssid 00:18:e7:33:5e:24
        country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60
        protmode CTS wme burst dtimperiod 1 -dfs
wlan1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 06:18:e7:33:5e:24
        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g <hostap>
        status: running
        ssid wlan1 channel 11 (2462 Mhz 11g) bssid 06:18:e7:33:5e:24
        country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60
        protmode CTS wme burst dtimperiod 1 -dfs
wlan2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 0a:18:e7:33:5e:24
        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g <hostap>
        status: running
        ssid wlan2 channel 11 (2462 Mhz 11g) bssid 0a:18:e7:33:5e:24
        country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60
        protmode CTS wme burst dtimperiod 1 -dfs

The client is a 7.0 machine with another 5212 card:

# uname -a
FreeBSD peer 7.0-RELEASE-p10 FreeBSD 7.0-RELEASE-p10 #0: Mon Mar 23 
09:26:18 EDT 2009     root@peer:/usr/obj/usr/src/sys/PEER  i386

# dmesg | grep ath
ath_hal: 0.10.5.6 (AR5210, AR5211, AR5212, AR5416, RF5111, RF5112, 
RF2413, RF5413, RF2133, RF2425, RF2417)
ath0: <Atheros 5212> mem 0xa8410000-0xa841ffff irq 11 at device 0.0 on 
cardbus0
ath0: [ITHREAD]
ath0: using obsoleted if_watchdog interface
ath0: Ethernet address: 00:14:d1:42:21:5a
ath0: mac 7.9 phy 4.5 radio 5.6

The three SSIDs configured on the CURRENT machine show up in a scan:

# ifconfig ath0 scan | grep wlan
wlan0           00:18:e7:33:5e:24   11   54M -66:-93  100 ES   WME
wlan1           06:18:e7:33:5e:24   11   54M -65:-93  100 ES   WME
wlan2           0a:18:e7:33:5e:24   11   54M -65:-93  100 ES   WME

The client is only able to associate with wlan1, however. When scanning 
channels while attempting to associate with any of the other ones, it 
gets stuck on channel 11 for a while before moving on, which seems 
relevant. Also interesting is the fact that if I do "ifconfig ath0 down" 
on the CURRENT machine, followed by, for example, "ifconfig ath0 ssid 
wlan0" (which did not associate before) on the client, followed by 
"ifconfig ath0 up" on the CURRENT machine, the client will associate 
with wlan0, but will not be able to associate with wlan1 or wlan2. Any 
ideas?

-Boris

From owner-freebsd-net@FreeBSD.ORG  Wed Apr  8 15:25:40 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 31D501065679
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 15:25:40 +0000 (UTC)
	(envelope-from sam@freebsd.org)
Received: from ebb.errno.com (ebb.errno.com [69.12.149.25])
	by mx1.freebsd.org (Postfix) with ESMTP id C309A8FC08
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 15:25:39 +0000 (UTC)
	(envelope-from sam@freebsd.org)
Received: from trouble.errno.com (trouble.errno.com [10.0.0.248])
	(authenticated bits=0)
	by ebb.errno.com (8.13.6/8.12.6) with ESMTP id n38FPWwJ051161
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 8 Apr 2009 08:25:35 -0700 (PDT) (envelope-from sam@freebsd.org)
Message-ID: <49DCC1EB.3040706@freebsd.org>
Date: Wed, 08 Apr 2009 08:25:31 -0700
From: Sam Leffler <sam@freebsd.org>
Organization: FreeBSD Project
User-Agent: Thunderbird 2.0.0.18 (X11/20081209)
MIME-Version: 1.0
To: Boris Kochergin <spawk@acm.poly.edu>
References: <49DCAC1F.9000708@acm.poly.edu>
In-Reply-To: <49DCAC1F.9000708@acm.poly.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-DCC-CTc-dcc2-Metrics: ebb.errno.com; whitelist
Cc: freebsd-net@freebsd.org
Subject: Re: Multi-BSS problem with Atheros 5212
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 08 Apr 2009 15:25:40 -0000

Boris Kochergin wrote:
> Ahoy. I'm having trouble with multiple hostap-mode wlan 
> pseudo-devices. The machine is an 8-CURRENT from yesterday:
>
> # uname -a
> FreeBSD test 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Tue Apr  7 16:54:56 
> UTC 2009     root@test:/usr/obj/usr/src/sys/GENERIC  i386
>
> # dmesg | grep ath
> ath0: <Atheros 5212> mem 0xf4100000-0xf410ffff irq 11 at device 13.0 
> on pci0
> ath0: [ITHREAD]
> ath0: AR2413 mac 7.9 RF2413 phy 4.5
>
> # cat /etc/rc.conf
> wlans_ath0="wlan0 wlan1 wlan2"
> create_args_wlan0="wlanmode hostap bssid"
> create_args_wlan1="wlanmode hostap bssid"
> create_args_wlan2="wlanmode hostap bssid"
> ifconfig_wlan0="ssid wlan0 wepmode off up"
> ifconfig_wlan1="ssid wlan1 wepmode off up"
> ifconfig_wlan2="ssid wlan2 wepmode off up"
>
> # ifconfig
> ath0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 
> 2290
>        ether 00:18:e7:33:5e:24
>        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g <hostap>
>        status: running
> fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 
> 1500
>        options=8<VLAN_MTU>
>        ether 00:90:27:72:c4:f3
>        inet 10.0.0.128 netmask 0xffffff00 broadcast 10.0.0.255
>        media: Ethernet autoselect (100baseTX <full-duplex>)
>        status: active
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>        options=3<RXCSUM,TXCSUM>
>        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
>        inet6 ::1 prefixlen 128
>        inet 127.0.0.1 netmask 0xff000000
> wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 
> 1500
>        ether 00:18:e7:33:5e:24
>        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g <hostap>
>        status: running
>        ssid wlan0 channel 11 (2462 Mhz 11g) bssid 00:18:e7:33:5e:24
>        country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60
>        protmode CTS wme burst dtimperiod 1 -dfs
> wlan1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 
> 1500
>        ether 06:18:e7:33:5e:24
>        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g <hostap>
>        status: running
>        ssid wlan1 channel 11 (2462 Mhz 11g) bssid 06:18:e7:33:5e:24
>        country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60
>        protmode CTS wme burst dtimperiod 1 -dfs
> wlan2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 
> 1500
>        ether 0a:18:e7:33:5e:24
>        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g <hostap>
>        status: running
>        ssid wlan2 channel 11 (2462 Mhz 11g) bssid 0a:18:e7:33:5e:24
>        country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60
>        protmode CTS wme burst dtimperiod 1 -dfs
>
> The client is a 7.0 machine with another 5212 card:
>
> # uname -a
> FreeBSD peer 7.0-RELEASE-p10 FreeBSD 7.0-RELEASE-p10 #0: Mon Mar 23 
> 09:26:18 EDT 2009     root@peer:/usr/obj/usr/src/sys/PEER  i386
>
> # dmesg | grep ath
> ath_hal: 0.10.5.6 (AR5210, AR5211, AR5212, AR5416, RF5111, RF5112, 
> RF2413, RF5413, RF2133, RF2425, RF2417)
> ath0: <Atheros 5212> mem 0xa8410000-0xa841ffff irq 11 at device 0.0 on 
> cardbus0
> ath0: [ITHREAD]
> ath0: using obsoleted if_watchdog interface
> ath0: Ethernet address: 00:14:d1:42:21:5a
> ath0: mac 7.9 phy 4.5 radio 5.6
>
> The three SSIDs configured on the CURRENT machine show up in a scan:
>
> # ifconfig ath0 scan | grep wlan
> wlan0           00:18:e7:33:5e:24   11   54M -66:-93  100 ES   WME
> wlan1           06:18:e7:33:5e:24   11   54M -65:-93  100 ES   WME
> wlan2           0a:18:e7:33:5e:24   11   54M -65:-93  100 ES   WME
>
> The client is only able to associate with wlan1, however. When 
> scanning channels while attempting to associate with any of the other 
> ones, it gets stuck on channel 11 for a while before moving on, which 
> seems relevant. Also interesting is the fact that if I do "ifconfig 
> ath0 down" on the CURRENT machine, followed by, for example, "ifconfig 
> ath0 ssid wlan0" (which did not associate before) on the client, 
> followed by "ifconfig ath0 up" on the CURRENT machine, the client will 
> associate with wlan0, but will not be able to associate with wlan1 or 
> wlan2. Any ideas?
wlandebug scan+auth+assoc on the client machine will show you why you 
cannot associate.  You can also enable the same info on the ap side to 
see what it thinks is happening.

    Sam


From owner-freebsd-net@FreeBSD.ORG  Wed Apr  8 16:16:20 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 9DBC01065670
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 16:16:20 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63905.mail.re1.yahoo.com (web63905.mail.re1.yahoo.com
	[69.147.97.120]) by mx1.freebsd.org (Postfix) with SMTP id 428148FC1C
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 16:16:19 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 64204 invoked by uid 60001); 8 Apr 2009 16:16:19 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1239207379; bh=9E32yJ5njQYjCMpsgWMkexb+eHzylQv07KatN5bnr98=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=0HBSMPSYM53a6KNOLhC38xWUoB5I+qINbeUNI7wjjXHU1kaf4DjPnGv2eYFTLs79qxCKYT/pOUeF0ErHdX0xMrB+Fum7O3siKiWR0cWJRLTshOB1OnPEMk2wxjvhoLyOcQheb9q6UbBrkim3xazEXq+mNo6SyEE/mwNURw6JqQo=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=wzoMJImAf2/fi+lt18m/UbB2oagPFktdnAt3b2DPH5YzegwD0xgGKnTrJe3XrZ2inz+pYPO5LQfJXCuC7QIv9TBxM0z5tK6YzXYfCJQISYPiwkEG46jydzjPd5hboNzvFjoQiorP8+kVLaqfdU7ssQxttgLAEAj+aWXFP8y1AKk=;
Message-ID: <564712.63955.qm@web63905.mail.re1.yahoo.com>
X-YMail-OSG: RIcXMnoVM1nSDVUxWli1IkeRfyq1MsdNp54pepsf7KIE05nZUYTwynr8F3NoNxkd8eJELxSG69htMuB89x3mDxn8LI5H5m7RCvlUA1QrBUdzkiOkXBFnfk9gKIO8tg8pkuxtU2TImDdid6ZWCeGCVgOp8IbpW1rBgpx0mckraHcW_8C7Incv2jUeV6haNAzB1U_FQjf0KIs2I5G.JrYmDk7orFHCej8gaXD.M.6zEP.iU9GyW30g7Ooh7kf60ki7x.ghR4yjBBIrxgakgkzJU1ox0zTAY8jPEa.JaKf9zSq7swnT64QP7y7TEyUCy_n4kADVHCnqGfq2.yaOnoKPHBfI
Received: from [98.242.222.229] by web63905.mail.re1.yahoo.com via HTTP;
	Wed, 08 Apr 2009 09:16:19 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Wed, 8 Apr 2009 09:16:19 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: Robert Watson <rwatson@FreeBSD.org>
In-Reply-To: <75700.80930.qm@web63905.mail.re1.yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 08 Apr 2009 16:16:21 -0000


--- On Wed, 4/8/09, Barney Cordoba <barney_cordoba@yahoo.com> wrote:

> From: Barney Cordoba <barney_cordoba@yahoo.com>
> Subject: Re: Advice on a multithreaded netisr  patch?
> To: "Robert Watson" <rwatson@FreeBSD.org>
> Cc: freebsd-net@freebsd.org, "Ivan Voras" <ivoras@freebsd.org>
> Date: Wednesday, April 8, 2009, 9:18 AM
> --- On Tue, 4/7/09, Robert Watson
> <rwatson@FreeBSD.org> wrote:
> 
> > From: Robert Watson <rwatson@FreeBSD.org>
> > Subject: Re: Advice on a multithreaded netisr  patch?
> > To: "Barney Cordoba"
> <barney_cordoba@yahoo.com>
> > Cc: freebsd-net@freebsd.org, "Ivan Voras"
> <ivoras@freebsd.org>
> > Date: Tuesday, April 7, 2009, 8:56 AM
> > On Tue, 7 Apr 2009, Barney Cordoba wrote:
> > 
> > >> Have you tried LOCK_PROFILING?  It would
> quickly
> > tell you if driver locks were a source of significant
> > contention.  It works quite well...
> > > 
> > > When I enabled LOCK_PROFILING my side modules,
> such as
> > if_ibg, stopped working. It seems that the ifnet
> structure
> > or something changed with that option enabled. Is
> there a
> > way to sync this without having to integrate
> everything into
> > a specific kernel build?
> > 
> > LOCK_PROFILING changes the size of lock-related data
> > structures, so requires both kernel and full set of
> modules
> > to be rebuilt with the option.
> 
> What are the units for lock profiling? For example, the
> "average
> wait" is in what units? 
> 
> Is there a way to reset the stats counters? If not, it
> might be nifty if 
> toggling prof.enable reset the stats to run some different
> kinds of
> tests without rebooting.
> 
> Barney

I know, I know. Read the man page...


From owner-freebsd-net@FreeBSD.ORG  Wed Apr  8 16:53:02 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 103B41065675
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 16:53:02 +0000 (UTC)
	(envelope-from sam@freebsd.org)
Received: from ebb.errno.com (ebb.errno.com [69.12.149.25])
	by mx1.freebsd.org (Postfix) with ESMTP id BD4468FC1F
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 16:53:01 +0000 (UTC)
	(envelope-from sam@freebsd.org)
Received: from trouble.errno.com (trouble.errno.com [10.0.0.248])
	(authenticated bits=0)
	by ebb.errno.com (8.13.6/8.12.6) with ESMTP id n38GqxjA051737
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Wed, 8 Apr 2009 09:53:00 -0700 (PDT) (envelope-from sam@freebsd.org)
Message-ID: <49DCD66B.6040504@freebsd.org>
Date: Wed, 08 Apr 2009 09:52:59 -0700
From: Sam Leffler <sam@freebsd.org>
Organization: FreeBSD Project
User-Agent: Thunderbird 2.0.0.18 (X11/20081209)
MIME-Version: 1.0
To: Boris Kochergin <spawk@acm.poly.edu>
References: <49DCAC1F.9000708@acm.poly.edu> <49DCC1EB.3040706@freebsd.org>
In-Reply-To: <49DCC1EB.3040706@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-DCC-CTc-dcc2-Metrics: ebb.errno.com; whitelist
Cc: freebsd-net@freebsd.org
Subject: Re: Multi-BSS problem with Atheros 5212
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 08 Apr 2009 16:53:02 -0000

Sam Leffler wrote:
> Boris Kochergin wrote:
>> Ahoy. I'm having trouble with multiple hostap-mode wlan 
>> pseudo-devices. The machine is an 8-CURRENT from yesterday:
>>
>> # uname -a
>> FreeBSD test 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Tue Apr  7 16:54:56 
>> UTC 2009     root@test:/usr/obj/usr/src/sys/GENERIC  i386
>>
>> # dmesg | grep ath
>> ath0: <Atheros 5212> mem 0xf4100000-0xf410ffff irq 11 at device 13.0 
>> on pci0
>> ath0: [ITHREAD]
>> ath0: AR2413 mac 7.9 RF2413 phy 4.5
>>
>> # cat /etc/rc.conf
>> wlans_ath0="wlan0 wlan1 wlan2"
>> create_args_wlan0="wlanmode hostap bssid"
>> create_args_wlan1="wlanmode hostap bssid"
>> create_args_wlan2="wlanmode hostap bssid"
>> ifconfig_wlan0="ssid wlan0 wepmode off up"
>> ifconfig_wlan1="ssid wlan1 wepmode off up"
>> ifconfig_wlan2="ssid wlan2 wepmode off up"
>>
>> # ifconfig
>> ath0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 
>> 2290
>>        ether 00:18:e7:33:5e:24
>>        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g <hostap>
>>        status: running
>> fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 
>> 1500
>>        options=8<VLAN_MTU>
>>        ether 00:90:27:72:c4:f3
>>        inet 10.0.0.128 netmask 0xffffff00 broadcast 10.0.0.255
>>        media: Ethernet autoselect (100baseTX <full-duplex>)
>>        status: active
>> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>>        options=3<RXCSUM,TXCSUM>
>>        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
>>        inet6 ::1 prefixlen 128
>>        inet 127.0.0.1 netmask 0xff000000
>> wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 
>> mtu 1500
>>        ether 00:18:e7:33:5e:24
>>        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g <hostap>
>>        status: running
>>        ssid wlan0 channel 11 (2462 Mhz 11g) bssid 00:18:e7:33:5e:24
>>        country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60
>>        protmode CTS wme burst dtimperiod 1 -dfs
>> wlan1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 
>> mtu 1500
>>        ether 06:18:e7:33:5e:24
>>        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g <hostap>
>>        status: running
>>        ssid wlan1 channel 11 (2462 Mhz 11g) bssid 06:18:e7:33:5e:24
>>        country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60
>>        protmode CTS wme burst dtimperiod 1 -dfs
>> wlan2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 
>> mtu 1500
>>        ether 0a:18:e7:33:5e:24
>>        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g <hostap>
>>        status: running
>>        ssid wlan2 channel 11 (2462 Mhz 11g) bssid 0a:18:e7:33:5e:24
>>        country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60
>>        protmode CTS wme burst dtimperiod 1 -dfs
>>
>> The client is a 7.0 machine with another 5212 card:
>>
>> # uname -a
>> FreeBSD peer 7.0-RELEASE-p10 FreeBSD 7.0-RELEASE-p10 #0: Mon Mar 23 
>> 09:26:18 EDT 2009     root@peer:/usr/obj/usr/src/sys/PEER  i386
>>
>> # dmesg | grep ath
>> ath_hal: 0.10.5.6 (AR5210, AR5211, AR5212, AR5416, RF5111, RF5112, 
>> RF2413, RF5413, RF2133, RF2425, RF2417)
>> ath0: <Atheros 5212> mem 0xa8410000-0xa841ffff irq 11 at device 0.0 
>> on cardbus0
>> ath0: [ITHREAD]
>> ath0: using obsoleted if_watchdog interface
>> ath0: Ethernet address: 00:14:d1:42:21:5a
>> ath0: mac 7.9 phy 4.5 radio 5.6
>>
>> The three SSIDs configured on the CURRENT machine show up in a scan:
>>
>> # ifconfig ath0 scan | grep wlan
>> wlan0           00:18:e7:33:5e:24   11   54M -66:-93  100 ES   WME
>> wlan1           06:18:e7:33:5e:24   11   54M -65:-93  100 ES   WME
>> wlan2           0a:18:e7:33:5e:24   11   54M -65:-93  100 ES   WME
>>
>> The client is only able to associate with wlan1, however. When
>> scanning channels while attempting to associate with any of the other
>> SSIDs, it gets stuck on channel 11 for a while before moving on, which
>> seems relevant. Also interesting: if I do "ifconfig ath0 down" on the
>> CURRENT machine, followed by, for example, "ifconfig ath0 ssid wlan0"
>> (which did not associate before) on the client, followed by "ifconfig
>> ath0 up" on the CURRENT machine, the client will associate with wlan0,
>> but will then not be able to associate with wlan1 or wlan2. Any ideas?
> wlandebug scan+auth+assoc on the client machine will show you why you 
> cannot associate.  You can also enable the same info on the ap side to 
> see what it thinks is happening.

FWIW I just set up 3 VAPs as you did above and hooked them into a
bridge.  I verified I could associate and pass traffic using a MBPro.
No problems.  I also destroyed the bridge and re-tested w/o issues.
Regardless, the debug msgs should identify what your problem is.

    Sam


From owner-freebsd-net@FreeBSD.ORG  Wed Apr  8 22:35:13 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0862D106566C
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 22:35:13 +0000 (UTC)
	(envelope-from fazaeli@sepehrs.com)
Received: from sepehrs.com (sepehrs.com [213.217.59.98])
	by mx1.freebsd.org (Postfix) with ESMTP id 267C08FC16
	for <freebsd-net@freebsd.org>; Wed,  8 Apr 2009 22:35:11 +0000 (UTC)
	(envelope-from fazaeli@sepehrs.com)
Received: from [192.168.1.180] ([192.168.3.1])
	by mail (8.14.3/8.14.3) with ESMTP id n385DOM8037672;
	Wed, 8 Apr 2009 09:43:24 +0430 (IRDT)
Message-ID: <49DC33DD.8000708@sepehrs.com>
Date: Wed, 08 Apr 2009 08:49:25 +0330
From: "H.Fazaeli" <fazaeli@sepehrs.com>
User-Agent: Thunderbird 2.0.0.21 (Windows/20090302)
To: jfvogel@gmail.com
References: <900824.65358.qm@web63901.mail.re1.yahoo.com>
In-Reply-To: <900824.65358.qm@web63901.mail.re1.yahoo.com>
Content-Transfer-Encoding: 7bit
MIME-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Cc: freebsd-net@freebsd.org
Subject: Re: Advice on a multithreaded netisr patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 08 Apr 2009 22:35:13 -0000


   Dear Jack,
   Can you please comment on the statements below?
   Is the assertion true for all OSes (Windows, Linux, ...) or is it
   just FreeBSD? I am actually concerned about how production-ready
   the igb driver is, in your opinion.
   As a matter of fact, we have been (and are) using the em driver for
   years on production systems in the biggest ICPs/ISPs/organizations
   without problems, and we have very good faith in it (I have not
   tested igb).
   Barney Cordoba wrote:


--- On Tue, 4/7/09, Ivan Voras [1]<ivoras@freebsd.org> wrote:


From: Ivan Voras [2]<ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr patch?
To: [3]freebsd-net@freebsd.org
Date: Tuesday, April 7, 2009, 5:59 PM
Barney Cordoba wrote:

1) Multiple TX queues are not supported. There's some hokey code to
test, but it doesn't properly separate flows to the queues.
2) 2 Rx queues don't work, so only 1 and 4 work
3) With 4 queues, it just sucks up CPU under heavy load on 4 cpus. It will
blow 4 cpus at a lower load than em will with 1
4) You'll need to fix DMA setup, as it sets the alignment requirement
to PAGE_SIZE. I haven't been able to convince Jack that it's wrong, not
that I've tried very hard since it's easy to just fix myself.

Reading this thread it looks like the development of both Intel drivers
is a bit stalled, doesn't it? AFAIK the em driver is also
semi-officially abandoned, and both from my experience and others' it
looks like new development and patches are being rejected. Time to shop
for other hardware?


To be fair, the OS doesn't really support multiqueue yet, or has
for only a few hours, so let's not go crazy.

It makes a lot more sense to have someone on the "team" work with
Jack on improving the performance and working out the kinks. When
I asked Jack about the poor performance of if_igb, he indicated that
Intel's position is that the drivers are "just samples", which really
doesn't give anyone much confidence that they want to run their business
on them. You already have Jack doing all of the hard work, that is,
supporting the new-chip-per-week that Intel puts out, so it seems to
me the best strategy would be to try to convince Intel that it's in
their best interest to have drivers that work well so people don't
think that their hardware stinks.

As an example, the Chelsio 10Gb bypass card is $3495 and an Intel
card is ~$1000, so it's a big win for the community as a whole to have
good Intel drivers going forward.

My work is commercially proprietary so I can't share my code, but
I can certainly share ideas on things that I've tested and discovered.

Barney


_______________________________________________
[4]freebsd-net@freebsd.org mailing list
[5]http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [6]"freebsd-net-unsubscribe@freebsd.org"


--


Best regards.

Hooman Fazaeli [7]<hf@sepehrs.com>
Sepehr S. T. Co. Ltd.

Web: [8]http://www.sepehrs.com
Tel: (9821)88975701-2
Fax: (9821)88983352

References

   1. mailto:ivoras@freebsd.org
   2. mailto:ivoras@freebsd.org
   3. mailto:freebsd-net@freebsd.org
   4. mailto:freebsd-net@freebsd.org
   5. http://lists.freebsd.org/mailman/listinfo/freebsd-net
   6. mailto:freebsd-net-unsubscribe@freebsd.org
   7. mailto:hf@sepehrs.com
   8. http://www.sepehrs.com/

From owner-freebsd-net@FreeBSD.ORG  Thu Apr  9 07:43:23 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 26EC21065693
	for <net@freebsd.org>; Thu,  9 Apr 2009 07:43:23 +0000 (UTC)
	(envelope-from bounces+305227.46043374.562566@icpbounce.com)
Received: from smtp2.icpbounce.com (smtp2.icpbounce.com [216.27.93.124])
	by mx1.freebsd.org (Postfix) with ESMTP id D6F618FC24
	for <net@freebsd.org>; Thu,  9 Apr 2009 07:43:22 +0000 (UTC)
	(envelope-from bounces+305227.46043374.562566@icpbounce.com)
Received: from localhost.localdomain (localhost.localdomain [127.0.0.1])
	by smtp2.icpbounce.com (Postfix) with ESMTP id C40FCF847A
	for <net@freebsd.org>; Thu,  9 Apr 2009 03:22:25 -0400 (EDT)
Date: Thu, 9 Apr 2009 03:22:25 -0400
To: net@freebsd.org
From: Global Access Travel <mailing@gaturkey.com>
Message-ID: <b48a7f12ace0aba94031194436642ccc@localhost.localdomain>
X-Priority: 3
X-Mailer: PHPMailer [version 1.72]
Errors-To: bounces+305227.46043374.562566@icpbounce.com
X-List-Unsubscribe: <http://app.icontact.com/icp/listunsubscribe.php?r=46043374&l=82228&s=CMEC&m=562566&c=305227>
X-Unsubscribe-Web: <http://app.icontact.com/icp/listunsubscribe.php?r=46043374&l=82228&s=CMEC&m=562566&c=305227>
X-ICPINFO: 
X-Return-Path-Hint: bounces+305227.46043374.562566@icpbounce.com
MIME-Version: 1.0
Content-Type: text/plain; charset = "utf-8"
Content-Transfer-Encoding: 8bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Cc: 
Subject: Private Shore Excursions-Turkey
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 09 Apr 2009 07:43:24 -0000

[http://www.turkeycalling.us]

PRIVATE SHORE EXCURSIONS- TURKEY

Your cruise clients will make the best of their time in Turkey on a
private shore excursion!

Istanbul

Kusadasi & Ephesus

[mailto:incoming@gaturkey.com?subject=Private Shore Excursions- Turkey]

****************************************************************************

Disclaimer;
This e-mail communication is intended only for the use of the individual
or entity to which it is addressed, and may contain information that is
privileged, confidential and that may not be made public by law or
agreement. If the recipient of this message is not the intended recipient
or entity, you are hereby notified that any further dissemination,
distribution or copying of this information is strictly prohibited. If you
have received this message in error, please immediately notify the sender
and delete it from your system. The Global Turizm Hizmetleri Anonim Sirketi
does not accept legal responsibility for the contents of this message.

 

This message was sent by: Global Access Incoming, Nuzhetiye cad, istanbul, besiktas 34357, Turkey

Powered by iContact: http://freetrial.icontact.com

To be removed click here:
http://app.icontact.com/icp/mmail-mprofile.pl?r=46043374&l=82228&s=CMEC&m=562566&c=305227

Forward to a friend: 
http://app.icontact.com/icp/sub/forward?m=562566&s=46043374&c=CMEC&cid=305227


From owner-freebsd-net@FreeBSD.ORG  Thu Apr  9 08:46:36 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id F212C1065670
	for <freebsd-net@freebsd.org>; Thu,  9 Apr 2009 08:46:35 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: from web63901.mail.re1.yahoo.com (web63901.mail.re1.yahoo.com
	[69.147.97.116]) by mx1.freebsd.org (Postfix) with SMTP id 44EE38FC22
	for <freebsd-net@freebsd.org>; Thu,  9 Apr 2009 08:46:34 +0000 (UTC)
	(envelope-from barney_cordoba@yahoo.com)
Received: (qmail 50282 invoked by uid 60001); 9 Apr 2009 08:46:33 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
	t=1239266793; bh=c+mMKOb/sPD5NT2W3z9f21Q8DEL30pjUU1TYlcoW2vc=;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=el1TQaNFskYy6fDzBbdutaxTjufCADCStmNQkaMnRysxfSzgA+yDZ57gQaMv74Gus7fb77obyX/jgnjGZMg8gtcN9emUyHzsHjXCXnrhDhHCuC7eI4OeYyx0AzU26dq5+uCM1cQEe02DJQf7Q51afczNPa0GSjWr5o45ZkyWHTo=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
	h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
	b=o68sFePnJq/rPh3ceG6jfBW3nls1D4bQzK4zzW68k4Rj2gIBvzIbUiBzJOqsu5RPsorFNlCQK5PySPQpyyfHPGA2Kf0eytogkdo1Q4S/xssUhEJfp1UxjesXug623hQGdvNOL35N9KbeeS7GMqZK8HwT7yZ9Tw830rGN8Sd/EgM=;
Message-ID: <792562.49628.qm@web63901.mail.re1.yahoo.com>
X-YMail-OSG: kM08mBsVM1mDDC2sKkY.BnuTQYSJtMaELpo5zvSVnQ6uvGgvpwsIukBWFI58anExShEW6lLo.muDZOFMlyRe7qcZPMD_QMm2.PaZTv9LlOaDMZZD8pWwz1h0P5SnYM5MSKSZdJR0S9_dpQZmV.6I3exGxitUa8dfO_VNh0SS9E33msolhD9eYhQa4hNje06CAgOPJRlNYX7tXXxGlr81Ub6GXk.bnnUliGsIqGzwWvYuUGab8FSXpoXXUcYelZyVgsl9E.jQD13m3wCbs1SCFDKjLuWluTiAPNjIFf9s6enhX42c6o7Kj7DdN07V
Received: from [98.242.222.229] by web63901.mail.re1.yahoo.com via HTTP;
	Thu, 09 Apr 2009 01:46:33 PDT
X-Mailer: YahooMailWebService/0.7.289.1
Date: Thu, 9 Apr 2009 01:46:33 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: Robert Watson <rwatson@FreeBSD.org>
In-Reply-To: <alpine.BSF.2.00.0904081412540.61921@fledge.watson.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-net@freebsd.org, Ivan Voras <ivoras@freebsd.org>
Subject: Re: Advice on a multithreaded netisr  patch?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: barney_cordoba@yahoo.com
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 09 Apr 2009 08:46:36 -0000


--- On Wed, 4/8/09, Robert Watson <rwatson@FreeBSD.org> wrote:

> From: Robert Watson <rwatson@FreeBSD.org>
> Subject: Re: Advice on a multithreaded netisr  patch?
> To: "Barney Cordoba" <barney_cordoba@yahoo.com>
> Cc: "Ivan Voras" <ivoras@freebsd.org>, freebsd-net@freebsd.org
> Date: Wednesday, April 8, 2009, 9:16 AM
> On Wed, 8 Apr 2009, Barney Cordoba wrote:
> 
> > Is there any work being done on lighter weight locks
> > for queues? It seems ridiculous to avoid using queues
> > because of lock contention when the locks are only
> > protecting a couple lines of code.
> 
> My reading is that there are two, closely related, things
> going on: the first is lock contention, and the second is
> cache line contention.  We have a primitive in 8.x
> (don't think it's been MFC'd yet) for a lockless
> atomic buffer primitive for use in drivers and other parts
> of the stack.  However, that addresses only lock contention,
> not line contention, which at a high PPS will be an issue as
> well.  Only by moving to independent data structures (i.e.,
> on independent cache lines) can we reduce line contention.
> 
> Robert N M Watson
> Computer Laboratory
> University of Cambridge
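
(To make the line-contention point concrete, here is a minimal userland
sketch. It only illustrates the layout idea and is not kernel code: the
64-byte line size, the four queues, and every name in it are assumptions
made up for the example.)

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE 64	/* assumed line size; the real value is CPU-specific */
#define NQUEUES    4	/* hypothetical number of receive queues */

/* Packed layout: all four counters sit in one 64-byte line. */
struct packed_stats {
	_Alignas(CACHE_LINE) uint64_t packets[NQUEUES];
};

/* Padded layout: the alignment pushes each slot onto its own line. */
struct padded_slot {
	_Alignas(CACHE_LINE) uint64_t packets;
};

static struct packed_stats packed;
static struct padded_slot padded[NQUEUES];

/* Return 1 if both addresses fall within the same cache line. */
static int
same_line(const void *a, const void *b)
{
	return ((uintptr_t)a / CACHE_LINE == (uintptr_t)b / CACHE_LINE);
}

int
packed_slots_share_line(void)
{
	return (same_line(&packed.packets[0], &packed.packets[1]));
}

int
padded_slots_share_line(void)
{
	return (same_line(&padded[0].packets, &padded[1].packets));
}

/* Sanity check of the two layouts. */
void
check_layout(void)
{
	assert(packed_slots_share_line());	/* neighbours contend */
	assert(!padded_slots_share_line());	/* neighbours are independent */
}
```

With the packed layout, two CPUs updating neighbouring counters keep
pulling the same line back and forth even if no lock is involved; the
padded layout trades memory for independence.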

Are mutexes smart enough to yield immediately to higher-priority threads
that are waiting? Something like:

mtx_unlock()
{
   do_unlock_stuff();
   if (higher_pri_waiting)
      sched_yield();
}

Also, is there a way to determine from the structure or flags whether some
other thread is waiting on the lock? Something like:

mtx_unlock(&mtx);
if (mtx.someone_is_waiting)
  sched_yield();

or better yet

if (higher_priority_is_waiting)
  sched_yield();

I don't quite have a handle on how the turnstile works, but it seems
that a lot of time is spent waiting for very short-lived locks. If
the tasks are on different CPUs, what is the granularity of the wait
time for a lock that is released almost immediately after another
thread tries to take it?

Also, is the waiting only extended when the threads are running on the
same cpu?

Barney
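
(A note on the "short-lived lock" question: as far as I can tell from
mutex(9), the implementation of default mutexes may spin for a while
before suspending the current thread, so a waiter on another CPU need
not sleep for a lock that is released almost at once. Below is a minimal
userland sketch of that spin-then-yield idea using C11 atomics. It is
not how the kernel mutex code is implemented; SPIN_LIMIT and the syl_*
names are hypothetical.)

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SPIN_LIMIT 1000		/* hypothetical tuning knob */

struct spin_yield_lock {
	atomic_flag held;
};

/* Try once; return true if we took the lock. */
static bool
syl_trylock(struct spin_yield_lock *l)
{
	return (!atomic_flag_test_and_set_explicit(&l->held,
	    memory_order_acquire));
}

/* Spin briefly in case the holder is about to release, then yield. */
static void
syl_lock(struct spin_yield_lock *l)
{
	for (;;) {
		for (int i = 0; i < SPIN_LIMIT; i++)
			if (syl_trylock(l))
				return;
		sched_yield();	/* give the holder a chance to run */
	}
}

static void
syl_unlock(struct spin_yield_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Single-threaded walk through the lock states. */
void
syl_demo(void)
{
	struct spin_yield_lock l = { ATOMIC_FLAG_INIT };

	syl_lock(&l);
	assert(!syl_trylock(&l));	/* already held */
	syl_unlock(&l);
	assert(syl_trylock(&l));	/* free again */
	syl_unlock(&l);
}
```

The sketch has no notion of priority or of who is waiting; answering
"is a higher-priority thread waiting" needs a waiter queue, which is
the job the turnstile does in the kernel.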


From owner-freebsd-net@FreeBSD.ORG  Thu Apr  9 09:58:56 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1CB5B106566C
	for <freebsd-net@freebsd.org>; Thu,  9 Apr 2009 09:58:56 +0000 (UTC)
	(envelope-from f.bonnet@esiee.fr)
Received: from mx1.esiee.fr (mx1.esiee.fr [147.215.1.35])
	by mx1.freebsd.org (Postfix) with ESMTP id D003F8FC08
	for <freebsd-net@freebsd.org>; Thu,  9 Apr 2009 09:58:55 +0000 (UTC)
	(envelope-from f.bonnet@esiee.fr)
Received: from mail.esiee.fr (mail.esiee.fr [147.215.1.3])
	by mx1.esiee.fr (Postfix) with ESMTP id 94D6F136855
	for <freebsd-net@freebsd.org>; Thu,  9 Apr 2009 11:41:06 +0200 (CEST)
Received: from mail.esiee.fr (localhost [127.0.0.1])
	by VAMS.dummy (Postfix) with SMTP id 162B83E648
	for <freebsd-net@freebsd.org>; Thu,  9 Apr 2009 11:41:05 +0200 (CEST)
Received: from secure.esiee.fr (secure.esiee.fr [147.215.1.19])
	by mail.esiee.fr (Postfix) with ESMTP id DB1D84513F
	for <freebsd-net@freebsd.org>; Thu,  9 Apr 2009 11:40:59 +0200 (CEST)
Received: from lisa.esiee.fr (lisa.esiee.fr [147.215.1.21])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested) (Authenticated sender: bonnetf)
	by secure.esiee.fr (Postfix) with ESMTPSA id D2A73E7B0B
	for <freebsd-net@freebsd.org>; Thu,  9 Apr 2009 11:40:59 +0200 (CEST)
Message-ID: <49DDC2AB.1090100@esiee.fr>
Date: Thu, 09 Apr 2009 11:40:59 +0200
From: Frank Bonnet <f.bonnet@esiee.fr>
User-Agent: Thunderbird 2.0.0.19 (X11/20090305)
MIME-Version: 1.0
To: freebsd-net@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: IBM X3650 at 7.1 with broadcom chips and CISCO LACP with LAGG driver
 ?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 09 Apr 2009 09:58:56 -0000

Hello

I plan to migrate our mailhub to 7.1, but before I do
I need some information about networking :-)

The machine is an IBM X3650 that has two Broadcom
Gigabit Ethernet interfaces.

I want to use the LAGG driver in LACP mode with a Cisco switch
to connect the machine to my LAN in bonding mode.

Are there any known network problems in 7.1 with this driver/machine
combination?

This machine permanently runs 500 to 600 IMAP(S) processes and handles
heavy SMTP traffic.

Thanks for any information.


From owner-freebsd-net@FreeBSD.ORG  Thu Apr  9 10:14:40 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D0327106564A
	for <freebsd-net@freebsd.org>; Thu,  9 Apr 2009 10:14:40 +0000 (UTC)
	(envelope-from pluknet@gmail.com)
Received: from mail-fx0-f167.google.com (mail-fx0-f167.google.com
	[209.85.220.167])
	by mx1.freebsd.org (Postfix) with ESMTP id 61E928FC14
	for <freebsd-net@freebsd.org>; Thu,  9 Apr 2009 10:14:40 +0000 (UTC)
	(envelope-from pluknet@gmail.com)
Received: by fxm11 with SMTP id 11so505822fxm.43
	for <freebsd-net@freebsd.org>; Thu, 09 Apr 2009 03:14:39 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:in-reply-to:references
	:date:message-id:subject:from:to:cc:content-type
	:content-transfer-encoding;
	bh=Dc8fQFk5Qb3Ayu9TfJIsAmja9iovMstwDLqgo5lzYuQ=;
	b=U8dxgHryMOUnYY+1tY9DUzjsRXWDdjm8gFd60w5j9FLDr1Ysw2NWRrUtWYZTHI1MCn
	YLDkTbXgt+qzeGFnUaKw+lZYoXmc4nsw2alBG9yaD2F//gGMPGk4wnD5HQ4cZoV3vOLX
	c5NPRymi/j2D3FcnkvgVqd+x4ue66JuFEKDb4=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type:content-transfer-encoding;
	b=gS2IxyPWDYON0SIvmWQ5d27j0kgFp5wmSlZp6bqMmQ9QlhlSskMlVpmQliLlYRCb9p
	khxr+SiTJTzYTcUT07zVmmKKTS+eD10QgNX/LHuqg+igw31FNLESUn41eA7buqvWWKPw
	lR8snqzuy2BhyZPFIW6u3aw+GYXNX7GcWqPw4=
MIME-Version: 1.0
Received: by 10.103.214.8 with SMTP id r8mr1138701muq.92.1239272079262; Thu, 
	09 Apr 2009 03:14:39 -0700 (PDT)
In-Reply-To: <49DDC2AB.1090100@esiee.fr>
References: <49DDC2AB.1090100@esiee.fr>
Date: Thu, 9 Apr 2009 14:14:39 +0400
Message-ID: <a31046fc0904090314p6e8787c6t43be372486831e44@mail.gmail.com>
From: pluknet <pluknet@gmail.com>
To: Frank Bonnet <f.bonnet@esiee.fr>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org
Subject: Re: IBM X3650 at 7.1 with broadcom chips and CISCO LACP with LAGG 
	driver ?
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 09 Apr 2009 10:14:41 -0000

2009/4/9 Frank Bonnet <f.bonnet@esiee.fr>:
> Hello
>
> I plan to migrate our mailhub to 7.1 but before I do it
> I need infos about network :-)
>
> The machine is an IBM X3650 that have two Broadcom
> gigaethernet interfaces.
[..]
> Is there any known network problem at 7.1 with such driver/machine ?

At work we have one of these running 7.1-R
with MySQL and mail services, though without high memory or
network pressure. The last uptime was 43 days.
No network problems were observed during that time.

-- 
wbr,
pluknet

From owner-freebsd-net@FreeBSD.ORG  Fri Apr 10 07:17:07 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8BDB01065673
	for <freebsd-net@FreeBSD.org>; Fri, 10 Apr 2009 07:17:07 +0000 (UTC)
	(envelope-from xernet@hotmail.it)
Received: from bay0-omc3-s32.bay0.hotmail.com (bay0-omc3-s32.bay0.hotmail.com
	[65.54.246.232])
	by mx1.freebsd.org (Postfix) with ESMTP id 76CF38FC1E
	for <freebsd-net@FreeBSD.org>; Fri, 10 Apr 2009 07:17:07 +0000 (UTC)
	(envelope-from xernet@hotmail.it)
Received: from BAY126-DS6 ([65.55.131.33]) by bay0-omc3-s32.bay0.hotmail.com
	with Microsoft SMTPSVC(6.0.3790.3959); 
	Fri, 10 Apr 2009 00:05:07 -0700
X-Originating-IP: [79.10.86.250]
X-Originating-Email: [xernet@hotmail.it]
Message-ID: <BAY126-DS6F5E450CBCBD5665A57C2A3800@phx.gbl>
From: "xer" <xernet@hotmail.it>
To: <freebsd-net@FreeBSD.org>
Date: Fri, 10 Apr 2009 09:05:11 +0200
MIME-Version: 1.0
Content-Type: text/plain; format=flowed; charset="iso-8859-1";
	reply-type=response
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
Importance: Normal
X-Mailer: Microsoft Windows Live Mail 14.0.8064.206
X-MimeOLE: Produced By Microsoft MimeOLE V14.0.8064.206
X-OriginalArrivalTime: 10 Apr 2009 07:05:07.0383 (UTC)
	FILETIME=[AE224470:01C9B9AA]
Cc: 
Subject: watchdog timeout
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: xer <xernet@hotmail.it>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 07:17:07 -0000

Hello, I sent this message to the stable mailing list, then found that
your address is listed as the maintainer for some bugs.
I'm asking whether this PR:
http://www.freebsd.org/cgi/query-pr.cgi?pr=129352

has any updates. I haven't found anything new, and it talks about
PRERELEASE while I'm running 6.4-STABLE.
Also, how is it possible that a kernel compiled in January still has
this bug?

Thanks in advance for your response.
Regards

--------------------------------------------------
From: "xer" <xernet@hotmail.it>
Sent: Wednesday, April 08, 2009 10:41 AM
To: <freebsd-stable@freebsd.org>
Subject: watchdog timeout

> Hello
> I have some problems with 3Com NICs after an upgrade from 5.5-STABLE to
> 6.4-STABLE.
>
> This machine has two 3Com NICs (one is LAN, the other is WAN) and I see
> too many "watchdog timeout" messages on both cards.
> This up/down flapping on the cards interrupts clients that are
> downloading from the Apache web server, especially on large files.
> --------------------------------------------
> xer:/root# dmesg
> xl1: watchdog timeout
> xl1: link state changed to DOWN
> xl1: link state changed to UP
> xl1: watchdog timeout
> xl1: link state changed to DOWN
> xl1: link state changed to UP
> xl1: watchdog timeout
> xl1: link state changed to DOWN
> xl1: link state changed to UP
> ---------------------------------------------
>
> xer:/root# cat /var/run/dmesg.boot | grep xl
> xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0xec00-0xec7f mem 0xfceffc00-0xfceffc7f irq 23 at device 11.0 on pci2
> miibus0: <MII bus> on xl0
> xlphy0: <3c905C 10/100 internal PHY> on miibus0
> xlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> xl0: Ethernet address: 00:01:02:e0:04:1b
> xl1: <3Com 3c905C-TX Fast Etherlink XL> port 0xe880-0xe8ff mem 0xfceff800-0xfceff87f irq 20 at device 12.0 on pci2
> miibus1: <MII bus> on xl1
> xlphy1: <3c905C 10/100 internal PHY> on miibus1
> xlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> xl1: Ethernet address: 00:01:02:df:fe:ed
> ---------------------------------------------
> Another possibility is my kernel config; maybe there is something wrong
> in it that I cannot see. I'll post it at the end of this message, because
> it is too long.
>
> As you can see, the cards are the 3c905C-TX model.
> Someone told me to change drivers, but I do not understand this advice.
> I got the same errors with the same cards on another mainboard: the
> watchdog timeouts appeared after an upgrade from 5.4-STABLE to 6.4-STABLE.
>
> I don't think that changing the NICs' PCI slots will solve the problem. I
> think that maybe changing the NICs would resolve the matter, but I cannot
> access either FreeBSD box physically, because they are too far from me
> (about 3500 km).
>
> I'm asking for some advice on how I can fix this problem.
> P.S. With both the old 5.4 and 5.5 kernels, the NICs were fine.
>
> Regards
> Xer
>
> ----------kernel config -----------
> xer:/root# cat /usr/src/sys/i386/conf/ASUS
> #
> # $FreeBSD: src/sys/i386/conf/GENERIC,v 1.429.2.18 2008/07/28 02:20:29 yongari Exp $
> #
> # custom kernel ASUS 01.15.2009
>
> machine         i386
> cpu             I686_CPU
> ident           ASUS
>
> options         SCHED_4BSD              # 4BSD scheduler
> options         PREEMPTION              # Enable kernel thread preemption
> options         INET                    # InterNETworking
> options         INET6                   # IPv6 communications protocols
> options         FFS                     # Berkeley Fast Filesystem
> options         SOFTUPDATES             # Enable FFS soft updates support
> options         UFS_ACL                 # Support for access control lists
> options         UFS_DIRHASH             # Improve performance on big directories
> options         MD_ROOT                 # MD is a potential root device
> options         NFSCLIENT               # Network Filesystem Client
> options         NFSSERVER               # Network Filesystem Server
> options         NFSLOCKD                # Network Lock Manager
> options         NFS_ROOT                # NFS usable as /, requires NFSCLIENT
> options         MSDOSFS                 # MSDOS Filesystem
> options         CD9660                  # ISO 9660 Filesystem
> options         PROCFS                  # Process filesystem (requires PSEUDOFS)
> options         PSEUDOFS                # Pseudo-filesystem framework
> options         GEOM_GPT                # GUID Partition Tables.
> options         COMPAT_43               # Compatible with BSD 4.3 [KEEP THIS!]
> options         COMPAT_FREEBSD4         # Compatible with FreeBSD4
> options         COMPAT_FREEBSD5         # Compatible with FreeBSD5
> options         KTRACE                  # ktrace(1) support
> options         SYSVSHM                 # SYSV-style shared memory
> options         SYSVMSG                 # SYSV-style message queues
> options         SYSVSEM                 # SYSV-style semaphores
> options         _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
> options         KBD_INSTALL_CDEV        # install a CDEV entry in /dev
> options         ADAPTIVE_GIANT          # Giant mutex is adaptive.
>
> device          apic                    # I/O APIC
>
> # Bus support.
> device          eisa
> device          pci
>
> # Floppy drives
> device          fdc
>
> # ATA and ATAPI devices
> device          ata
> device          atadisk         # ATA disk drives
> device          ataraid         # ATA RAID drives
> device          atapicd         # ATAPI CDROM drives
> device          atapifd         # ATAPI floppy drives
> device          atapist         # ATAPI tape drives
> options         ATA_STATIC_ID   # Static device numbering
>
> # atkbdc0 controls both the keyboard and the PS/2 mouse
> device          atkbdc          # AT keyboard controller
> device          atkbd           # AT keyboard
> device          psm             # PS/2 mouse
>
> device          kbdmux          # keyboard multiplexer
>
> device          vga             # VGA video card driver
>
> device          splash          # Splash screen and screen saver support
>
> # syscons is the default console driver, resembling an SCO console
> device          sc
>
> device          agp             # support several AGP chipsets
>
> # Add suspend/resume support for the i8254.
> device          pmtimer
>
> # Serial (COM) ports
> device          sio             # 8250, 16[45]50 based serial ports
>
> # Parallel port
> device          ppc
> device          ppbus           # Parallel port bus (required)
> device          lpt             # Printer
> device          plip            # TCP/IP over parallel
> device          ppi             # Parallel port interface device
>
> # PCI Ethernet NICs.
> device          de              # DEC/Intel DC21x4x (``Tulip'')
> device          em              # Intel PRO/1000 adapter Gigabit Ethernet 
> Card
> device          ixgb            # Intel PRO/10GbE Ethernet Card
> device          txp             # 3Com 3cR990 (``Typhoon'')
> device          vx              # 3Com 3c590, 3c595 (``Vortex'')
>
> # PCI Ethernet NICs that use the common MII bus controller code.
> # NOTE: Be sure to keep the 'device miibus' line in order to use these 
> NICs!
> device          miibus          # MII bus support
> device          bce             # Broadcom BCM5706/BCM5708 Gigabit 
> Ethernet
> device          bfe             # Broadcom BCM440x 10/100 Ethernet
> device          bge             # Broadcom BCM570xx Gigabit Ethernet
> device          dc              # DEC/Intel 21143 and various workalikes
> device          fxp             # Intel EtherExpress PRO/100B (82557, 
> 82558)
> device          jme             # JMicron JMC250 Gigabit/JMC260 Fast 
> Ethernet
> device          lge             # Level 1 LXT1001 gigabit Ethernet
> device          msk             # Marvell/SysKonnect Yukon II Gigabit 
> Ethernet
> device          nge             # NatSemi DP83820 gigabit Ethernet
> device          nve             # nVidia nForce MCP on-board Ethernet 
> Networking
> device          pcn             # AMD Am79C97x PCI 10/100(precedence over 
> 'lnc')
> device          re              # RealTek 8139C+/8169/8169S/8110S
> device          rl              # RealTek 8129/8139
> device          sf              # Adaptec AIC-6915 (``Starfire'')
> device          sis             # Silicon Integrated Systems SiS 900/SiS 
> 7016
> device          sk              # SysKonnect SK-984x & SK-982x gigabit 
> Ethernet
> device          ste             # Sundance ST201 (D-Link DFE-550TX)
> device          stge            # Sundance/Tamarack TC9021 gigabit 
> Ethernet
> device          ti              # Alteon Networks Tigon I/II gigabit 
> Ethernet
> device          tl              # Texas Instruments ThunderLAN
> device          tx              # SMC EtherPower II (83c170 ``EPIC'')
> device          vge             # VIA VT612x gigabit Ethernet
> device          vr              # VIA Rhine, Rhine II
> device          wb              # Winbond W89C840F
> device          xl              # 3Com 3c90x (``Boomerang'', ``Cyclone'')
>
> # ISA Ethernet NICs.  pccard NICs included.
> device          cs              # Crystal Semiconductor CS89x0 NIC
> # 'device ed' requires 'device miibus'
> device          ed              # NE[12]000, SMC Ultra, 3c503, DS8390 
> cards
> device          ex              # Intel EtherExpress Pro/10 and Pro/10+
> device          ep              # Etherlink III based cards
> device          fe              # Fujitsu MB8696x based cards
> device          ie              # EtherExpress 8/16, 3C507, StarLAN 10 
> etc.
> device          lnc             # NE2100, NE32-VL Lance Ethernet cards
> device          sn              # SMC's 9000 series of Ethernet chips
> device          xe              # Xircom pccard Ethernet
>
> # Pseudo devices.
> device          loop            # Network loopback
> device          random          # Entropy device
> device          ether           # Ethernet support
> device          sl              # Kernel SLIP
> device          ppp             # Kernel PPP
> device          tun             # Packet tunnel.
> device          pty             # Pseudo-ttys (telnet etc)
> device          md              # Memory "disks"
> device          gif             # IPv6 and IPv4 tunneling
> device          faith           # IPv6-to-IPv4 relaying (translation)
>
> # The `bpf' device enables the Berkeley Packet Filter.
> # Be aware of the administrative consequences of enabling this!
> # Note that 'bpf' is required for DHCP.
> device          bpf             # Berkeley packet filter
>
> # Firewall
> options         IPFIREWALL                      # enable ipfirewall 
> (required for dummynet)
> options         IPFIREWALL_VERBOSE              # enable firewall output 
> logging to syslogd(8)
> options         IPFIREWALL_VERBOSE_LIMIT=0      # limit firewall verbosity 
> output
> options         IPDIVERT                        # divert sockets
> options         DUMMYNET                        # enable dummynet 
> operation
> options         HZ=1000                         # set the timer 
> granularity
>
> 

From owner-freebsd-net@FreeBSD.ORG  Fri Apr 10 12:10:04 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A83E41065688
	for <freebsd-net@hub.freebsd.org>; Fri, 10 Apr 2009 12:10:04 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 7B2F88FC15
	for <freebsd-net@hub.freebsd.org>; Fri, 10 Apr 2009 12:10:04 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3ACA4iV092073
	for <freebsd-net@freefall.freebsd.org>; Fri, 10 Apr 2009 12:10:04 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3ACA4Hp092072;
	Fri, 10 Apr 2009 12:10:04 GMT (envelope-from gnats)
Date: Fri, 10 Apr 2009 12:10:04 GMT
Message-Id: <200904101210.n3ACA4Hp092072@freefall.freebsd.org>
To: freebsd-net@FreeBSD.org
From: Mikolaj Golub <to.my.trociny@gmail.com>
Cc: 
Subject: Re: kern/131310: [netgraph] [panic] 7.1 panics with mpd netgraph
	interface changes
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Mikolaj Golub <to.my.trociny@gmail.com>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 12:10:06 -0000

The following reply was made to PR kern/131310; it has been noted by GNATS.

From: Mikolaj Golub <to.my.trociny@gmail.com>
To: bug-followup@FreeBSD.org,Vitaly Dodonov <dreamer.two@gmail.com>
Cc: Semenchuk Oleg <darkibot@gmail.com>
Subject: Re: kern/131310: [netgraph] [panic] 7.1 panics with mpd netgraph interface changes
Date: Fri, 10 Apr 2009 15:09:38 +0300

 This PR is closely related to kern/130977. You can try the patch from it, which
 adds if_delgroup(ifp, IFG_ALL) to if_detach().
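
 To illustrate why the detach path has to drop the group membership, here is
 a minimal userland sketch. It is not the kernel code: struct iface,
 struct member, struct group and the group_*() helpers are hypothetical names
 made up for this example, and only the TAILQ macros from <sys/queue.h> are
 real. The point is that a group keeps a pointer to each member interface, so
 an interface that goes away without being removed leaves a stale pointer
 behind.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical stand-in for struct ifnet. */
struct iface {
	char name[16];
};

/* One membership record: a group pointing at an interface. */
struct member {
	struct iface *ifp;
	TAILQ_ENTRY(member) link;
};

/* Hypothetical stand-in for an interface group such as "all". */
struct group {
	char name[16];
	TAILQ_HEAD(, member) members;
};

static void
group_init(struct group *g, const char *name)
{
	snprintf(g->name, sizeof(g->name), "%s", name);
	TAILQ_INIT(&g->members);
}

/* Add ifp to the group; returns 0 on success, -1 if out of memory. */
static int
group_add(struct group *g, struct iface *ifp)
{
	struct member *m = malloc(sizeof(*m));

	if (m == NULL)
		return (-1);
	m->ifp = ifp;
	TAILQ_INSERT_TAIL(&g->members, m, link);
	return (0);
}

/*
 * Remove ifp from the group.  This is the step a detach routine must not
 * skip.  Returns 1 if a record was removed, 0 if ifp was not a member.
 */
static int
group_del(struct group *g, struct iface *ifp)
{
	struct member *m;

	TAILQ_FOREACH(m, &g->members, link)
		if (m->ifp == ifp)
			break;
	if (m == NULL)
		return (0);
	TAILQ_REMOVE(&g->members, m, link);
	free(m);
	return (1);
}

/* Count the membership records currently held by the group. */
static int
group_count(struct group *g)
{
	struct member *m;
	int n = 0;

	TAILQ_FOREACH(m, &g->members, link)
		n++;
	return (n);
}
```

 A detach routine in this model would call group_del() for the interface
 before freeing it; otherwise anything that later walks the member list would
 dereference freed memory.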
 
 -- 
 Mikolaj Golub

From owner-freebsd-net@FreeBSD.ORG  Fri Apr 10 12:40:07 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E8D4F1065670
	for <freebsd-net@freebsd.org>; Fri, 10 Apr 2009 12:40:07 +0000 (UTC)
	(envelope-from bzeeb-lists@lists.zabbadoz.net)
Received: from mail.cksoft.de (mail.cksoft.de [195.88.108.3])
	by mx1.freebsd.org (Postfix) with ESMTP id 9EE768FC2E
	for <freebsd-net@freebsd.org>; Fri, 10 Apr 2009 12:40:07 +0000 (UTC)
	(envelope-from bzeeb-lists@lists.zabbadoz.net)
Received: from localhost (amavis.fra.cksoft.de [192.168.74.71])
	by mail.cksoft.de (Postfix) with ESMTP id 91B2641C735;
	Fri, 10 Apr 2009 14:40:06 +0200 (CEST)
X-Virus-Scanned: amavisd-new at cksoft.de
Received: from mail.cksoft.de ([195.88.108.3])
	by localhost (amavis.fra.cksoft.de [192.168.74.71]) (amavisd-new,
	port 10024)
	with ESMTP id wgcrkY+NlyvK; Fri, 10 Apr 2009 14:40:06 +0200 (CEST)
Received: by mail.cksoft.de (Postfix, from userid 66)
	id 3A9F441C732; Fri, 10 Apr 2009 14:40:06 +0200 (CEST)
Received: from maildrop.int.zabbadoz.net (maildrop.int.zabbadoz.net
	[10.111.66.10])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mail.int.zabbadoz.net (Postfix) with ESMTP id 5B2E84448E6;
	Fri, 10 Apr 2009 12:36:47 +0000 (UTC)
Date: Fri, 10 Apr 2009 12:36:47 +0000 (UTC)
From: "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>
X-X-Sender: bz@maildrop.int.zabbadoz.net
To: sthaug@nethelp.no
In-Reply-To: <20090407.165708.74744827.sthaug@nethelp.no>
Message-ID: <20090410123559.D15361@maildrop.int.zabbadoz.net>
References: <20090405215842.C15361@maildrop.int.zabbadoz.net>
	<20090406.121959.74751582.sthaug@nethelp.no>
	<20090407144311.F15361@maildrop.int.zabbadoz.net>
	<20090407.165708.74744827.sthaug@nethelp.no>
X-OpenPGP-Key: 0x14003F198FEFA3E77207EE8D2B58B8F83CCF1842
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org
Subject: Re: IPv6 window scaling factor always 1 on initial SYN
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 12:40:08 -0000

On Tue, 7 Apr 2009, sthaug@nethelp.no wrote:

>>> I changed it, and that worked like a dream. Now I get basically the
>>> same throughput with IPv4 and IPv6. There are of course still issues
>>> like lots of IPv6 tunnels that add extra latency - but that's not the
>>> fault of FreeBSD.
>>>
>>> Anyway, thanks for your work. Below is a context diff (against 7-STABLE
>>> cvsupped last night). Do we need a PR to get this into FreeBSD?
>>
>> It's in HEAD now as of SVN r190800.
>
> Excellent news, thank you! And presumably we'll get a MFC after a
> suitable settling time?

If 3 days were suitable;)  It'll be part of 7.2-R as it is in stable/7
now.

Thanks a lot for reporting and testing!

/bz

-- 
Bjoern A. Zeeb                      The greatest risk is not taking one.

From owner-freebsd-net@FreeBSD.ORG  Fri Apr 10 14:50:05 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1EE56106564A
	for <freebsd-net@hub.freebsd.org>; Fri, 10 Apr 2009 14:50:05 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 0ECE98FC1A
	for <freebsd-net@hub.freebsd.org>; Fri, 10 Apr 2009 14:50:05 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3AEo4b8008055
	for <freebsd-net@freefall.freebsd.org>; Fri, 10 Apr 2009 14:50:04 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3AEo44I008054;
	Fri, 10 Apr 2009 14:50:04 GMT (envelope-from gnats)
Date: Fri, 10 Apr 2009 14:50:04 GMT
Message-Id: <200904101450.n3AEo44I008054@freefall.freebsd.org>
To: freebsd-net@FreeBSD.org
From: dfilter@FreeBSD.ORG (dfilter service)
Cc: 
Subject: Re: kern/131310: commit references a PR
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: dfilter service <dfilter@FreeBSD.ORG>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 14:50:05 -0000

The following reply was made to PR kern/131310; it has been noted by GNATS.

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/131310: commit references a PR
Date: Fri, 10 Apr 2009 14:42:02 +0000 (UTC)

 Author: mlaier
 Date: Fri Apr 10 14:41:51 2009
 New Revision: 190895
 URL: http://svn.freebsd.org/changeset/base/190895
 
 Log:
   Remove interfaces from IFG_ALL on detach.  This cures a couple of pf panics
   when using the "self" keyword in tables or as ()-style host address and
   fixes "ifconfig -g all" output.
   
   PR:		kern/130977, kern/131310
   Submitted by:	Mikolaj Golub
   MFC after:	3 days
 
 Modified:
   head/sys/net/if.c
 
 Modified: head/sys/net/if.c
 ==============================================================================
 --- head/sys/net/if.c	Fri Apr 10 14:24:12 2009	(r190894)
 +++ head/sys/net/if.c	Fri Apr 10 14:41:51 2009	(r190895)
 @@ -887,6 +887,7 @@ if_detach(struct ifnet *ifp)
  	rt_ifannouncemsg(ifp, IFAN_DEPARTURE);
  	EVENTHANDLER_INVOKE(ifnet_departure_event, ifp);
  	devctl_notify("IFNET", ifp->if_xname, "DETACH", NULL);
 +	if_delgroup(ifp, IFG_ALL);
  
  	IF_AFDATA_LOCK(ifp);
  	for (dp = domains; dp; dp = dp->dom_next) {
 

From owner-freebsd-net@FreeBSD.ORG  Fri Apr 10 15:02:19 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 42F5F106567D
	for <freebsd-net@freebsd.org>; Fri, 10 Apr 2009 15:02:19 +0000 (UTC)
	(envelope-from kfl@xiplink.com)
Received: from smtp191.iad.emailsrvr.com (smtp191.iad.emailsrvr.com
	[207.97.245.191])
	by mx1.freebsd.org (Postfix) with ESMTP id 1E25D8FC36
	for <freebsd-net@freebsd.org>; Fri, 10 Apr 2009 15:02:19 +0000 (UTC)
	(envelope-from kfl@xiplink.com)
Received: from relay9.relay.iad.mlsrvr.com (localhost [127.0.0.1])
	by relay9.relay.iad.mlsrvr.com (SMTP Server) with ESMTP id AC39F1E21A3
	for <freebsd-net@freebsd.org>; Fri, 10 Apr 2009 11:02:18 -0400 (EDT)
Received: by relay9.relay.iad.mlsrvr.com (Authenticated sender:
	kfodil-lemelin-AT-xiplink.com) with ESMTPSA id 9E4D11CCA41
	for <freebsd-net@freebsd.org>; Fri, 10 Apr 2009 11:02:18 -0400 (EDT)
Message-ID: <49DF5F75.6080607@xiplink.com>
Date: Fri, 10 Apr 2009 11:02:13 -0400
From: Karim Fodil-Lemelin <kfl@xiplink.com>
User-Agent: Thunderbird 2.0.0.21 (Windows/20090302)
MIME-Version: 1.0
To: freebsd-net@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: m_tag, malloc vs uma
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 15:02:20 -0000

Hello,

Are there any plans to integrate the mbuf tag subsystem with the 
universal memory allocator (UMA)? Allocating tags for mbufs still calls 
malloc in uipc_mbuf.c ... What would be the benefits of using UMA instead?

Karim.

From owner-freebsd-net@FreeBSD.ORG  Fri Apr 10 17:08:01 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E7F141065696
	for <freebsd-net@freebsd.org>; Fri, 10 Apr 2009 17:08:01 +0000 (UTC)
	(envelope-from spawk@acm.poly.edu)
Received: from acm.poly.edu (acm.poly.edu [128.238.9.200])
	by mx1.freebsd.org (Postfix) with ESMTP id AF5468FC12
	for <freebsd-net@freebsd.org>; Fri, 10 Apr 2009 17:08:01 +0000 (UTC)
	(envelope-from spawk@acm.poly.edu)
Received: (qmail 69250 invoked from network); 10 Apr 2009 17:08:01 -0000
Received: from unknown (HELO ?10.0.0.135?) (spawk@128.238.64.31)
	by acm.poly.edu with AES256-SHA encrypted SMTP;
	10 Apr 2009 17:08:01 -0000
Message-ID: <49DF7CE9.6060706@acm.poly.edu>
Date: Fri, 10 Apr 2009 13:07:53 -0400
From: Boris Kochergin <spawk@acm.poly.edu>
User-Agent: Thunderbird 2.0.0.19 (X11/20090108)
MIME-Version: 1.0
To: Sam Leffler <sam@freebsd.org>
References: <49DCAC1F.9000708@acm.poly.edu> <49DCC1EB.3040706@freebsd.org>
	<49DCD66B.6040504@freebsd.org>
In-Reply-To: <49DCD66B.6040504@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org
Subject: Re: Multi-BSS problem with Atheros 5212
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 17:08:02 -0000

Sam Leffler wrote:
> Sam Leffler wrote:
>> Boris Kochergin wrote:
>>> Ahoy. I'm having trouble with multiple hostap-mode wlan 
>>> pseudo-devices. The machine is an 8-CURRENT from yesterday:
>>>
>>> # uname -a
>>> FreeBSD test 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Tue Apr  7 16:54:56 
>>> UTC 2009     root@test:/usr/obj/usr/src/sys/GENERIC  i386
>>>
>>> # dmesg | grep ath
>>> ath0: <Atheros 5212> mem 0xf4100000-0xf410ffff irq 11 at device 13.0 
>>> on pci0
>>> ath0: [ITHREAD]
>>> ath0: AR2413 mac 7.9 RF2413 phy 4.5
>>>
>>> # cat /etc/rc.conf
>>> wlans_ath0="wlan0 wlan1 wlan2"
>>> create_args_wlan0="wlanmode hostap bssid"
>>> create_args_wlan1="wlanmode hostap bssid"
>>> create_args_wlan2="wlanmode hostap bssid"
>>> ifconfig_wlan0="ssid wlan0 wepmode off up"
>>> ifconfig_wlan1="ssid wlan1 wepmode off up"
>>> ifconfig_wlan2="ssid wlan2 wepmode off up"
>>>
>>> # ifconfig
>>> ath0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 
>>> mtu 2290
>>>        ether 00:18:e7:33:5e:24
>>>        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g 
>>> <hostap>
>>>        status: running
>>> fxp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 
>>> mtu 1500
>>>        options=8<VLAN_MTU>
>>>        ether 00:90:27:72:c4:f3
>>>        inet 10.0.0.128 netmask 0xffffff00 broadcast 10.0.0.255
>>>        media: Ethernet autoselect (100baseTX <full-duplex>)
>>>        status: active
>>> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>>>        options=3<RXCSUM,TXCSUM>
>>>        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
>>>        inet6 ::1 prefixlen 128
>>>        inet 127.0.0.1 netmask 0xff000000
>>> wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 
>>> mtu 1500
>>>        ether 00:18:e7:33:5e:24
>>>        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g 
>>> <hostap>
>>>        status: running
>>>        ssid wlan0 channel 11 (2462 Mhz 11g) bssid 00:18:e7:33:5e:24
>>>        country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60
>>>        protmode CTS wme burst dtimperiod 1 -dfs
>>> wlan1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 
>>> mtu 1500
>>>        ether 06:18:e7:33:5e:24
>>>        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g 
>>> <hostap>
>>>        status: running
>>>        ssid wlan1 channel 11 (2462 Mhz 11g) bssid 06:18:e7:33:5e:24
>>>        country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60
>>>        protmode CTS wme burst dtimperiod 1 -dfs
>>> wlan2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 
>>> mtu 1500
>>>        ether 0a:18:e7:33:5e:24
>>>        media: IEEE 802.11 Wireless Ethernet autoselect mode 11g 
>>> <hostap>
>>>        status: running
>>>        ssid wlan2 channel 11 (2462 Mhz 11g) bssid 0a:18:e7:33:5e:24
>>>        country US ecm authmode OPEN privacy OFF txpower 23 scanvalid 60
>>>        protmode CTS wme burst dtimperiod 1 -dfs
>>>
>>> The client is a 7.0 machine with another 5212 card:
>>>
>>> # uname -a
>>> FreeBSD peer 7.0-RELEASE-p10 FreeBSD 7.0-RELEASE-p10 #0: Mon Mar 23 
>>> 09:26:18 EDT 2009     root@peer:/usr/obj/usr/src/sys/PEER  i386
>>>
>>> # dmesg | grep ath
>>> ath_hal: 0.10.5.6 (AR5210, AR5211, AR5212, AR5416, RF5111, RF5112, 
>>> RF2413, RF5413, RF2133, RF2425, RF2417)
>>> ath0: <Atheros 5212> mem 0xa8410000-0xa841ffff irq 11 at device 0.0 
>>> on cardbus0
>>> ath0: [ITHREAD]
>>> ath0: using obsoleted if_watchdog interface
>>> ath0: Ethernet address: 00:14:d1:42:21:5a
>>> ath0: mac 7.9 phy 4.5 radio 5.6
>>>
>>> The three SSIDs configured on the CURRENT machine show up in a scan:
>>>
>>> # ifconfig ath0 scan | grep wlan
>>> wlan0           00:18:e7:33:5e:24   11   54M -66:-93  100 ES   WME
>>> wlan1           06:18:e7:33:5e:24   11   54M -65:-93  100 ES   WME
>>> wlan2           0a:18:e7:33:5e:24   11   54M -65:-93  100 ES   WME
>>>
>>> The client is only able to associate with wlan1, however. When 
>>> scanning channels while attempting to associate with any of the 
>>> other ones, it gets stuck on channel 11 for a while before moving 
>>> on, which seems relevant. Also interesting is the fact that if I do 
>>> "ifconfig ath0 down" on the CURRENT machine, followed by, for 
>>> example, "ifconfig ath0 ssid wlan0" (which did not associate before) 
>>> on the client, followed by "ifconfig ath0 up" on the CURRENT 
>>> machine, the client will associate with wlan0, but will not be able 
>>> to associate with wlan1 or wlan2. Any ideas?
>> wlandebug scan+auth+assoc on the client machine will show you why you 
>> cannot associate.  You can also enable the same info on the ap side 
>> to see what it thinks is happening.
>
> FWIW I just setup 3 vap's as you did above and hooked them into a 
> bridge.  I verified I could associate and pass traffic using a MBPro.  
> No problems.  I also destroyed the bridge and re-tested w/o issues.  
> Regardless the debug msgs should identify what your problem is.
>
>    Sam
>
I booted the hostap machine up and set wlandebug to scan+auth+assoc on 
wlan0, wlan1, and wlan2. I then inserted the PCMCIA card into the client 
machine, set wlandebug to scan+auth+assoc on it (ath0), and executed 
"ifconfig ath0 ssid wlan0 up". I let it scan around for a bit. The 
client-side debug messages are at 
http://acm.poly.edu/~spawk/wlan/wlan0.client, and the hostap machine did 
not emit any debug messages during the association attempts. I then 
ejected the card from the client and repeated the process for wlan1 (it 
associated). The client-side debug messages are at 
http://acm.poly.edu/~spawk/wlan/wlan1.client and the hostap-side debug 
messages are at http://acm.poly.edu/~spawk/wlan/wlan1.ap. I then ejected 
the card from the client and repeated the process for wlan2. The 
client-side debug messages are at 
http://acm.poly.edu/~spawk/wlan/wlan2.client, and the hostap machine did 
not emit any debug messages during the association attempts. In case 
it's relevant, the client card is a PCMCIA version of...

ath0@pci0:5:0:0:        class=0x020000 card=0x2051168c chip=0x0013168c 
rev=0x01 hdr=0x00
    vendor     = 'Atheros Communications Inc.'
    device     = 'AR5212, AR5213 802.11a/b/g Wireless Adapter'
    class      = network
    subclass   = ethernet

...and the hostap card is a PCI version of the same thing:

ath0@pci0:0:13:0:       class=0x020000 card=0x2051168c chip=0x0013168c 
rev=0x01 hdr=0x00
    vendor     = 'Atheros Communications Inc.'
    device     = 'AR5212, AR5213 802.11a/b/g Wireless Adapter'
    class      = network
    subclass   = ethernet

-Boris

From owner-freebsd-net@FreeBSD.ORG  Fri Apr 10 18:55:20 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 111E11065674
	for <freebsd-net@freebsd.org>; Fri, 10 Apr 2009 18:55:20 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id E251C8FC12
	for <freebsd-net@freebsd.org>; Fri, 10 Apr 2009 18:55:19 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id 9E4AF46B8A;
	Fri, 10 Apr 2009 14:55:19 -0400 (EDT)
Date: Fri, 10 Apr 2009 19:55:19 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Karim Fodil-Lemelin <kfl@xiplink.com>
In-Reply-To: <49DF5F75.6080607@xiplink.com>
Message-ID: <alpine.BSF.2.00.0904101950350.36143@fledge.watson.org>
References: <49DF5F75.6080607@xiplink.com>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org
Subject: Re: m_tag, malloc vs uma
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 18:55:20 -0000

On Fri, 10 Apr 2009, Karim Fodil-Lemelin wrote:

> Are there any plans to integrate the mbuf tag subsystem with the 
> universal memory allocator (UMA)? Allocating tags for mbufs still calls 
> malloc in uipc_mbuf.c ... What would be the benefits of using UMA instead?

Hi Karim:

Right now there are no specific plans for changes along these lines, although 
we have talked about moving towards better support for deep objects in m_tags. 
Right now, MAC requires a "deep" copy, because labels may be complex objects, 
and this is special-cased in the m_tag code.  One way to move in that 
direction would be to move from an explicit m_tag free pointer to a pointer to 
a vector of copy, free, and similar operations.  This would make it easier to 
support more flexible memory models there, rather than forcing the use of 
malloc(9).
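
As a rough userland sketch of what such an operations vector could look like 
(this is not the real struct m_tag, and every name below, xtag, xtag_ops, 
xo_copy, xo_free and the helpers, is hypothetical), each tag carries a pointer 
to a table of functions, so a tag type with a "deep" payload can supply its 
own copy and free routines and choose its own allocator:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct xtag;

/* Hypothetical per-type operations vector. */
struct xtag_ops {
	struct xtag	*(*xo_copy)(const struct xtag *);
	void		 (*xo_free)(struct xtag *);
};

/* Hypothetical tag with a separately allocated ("deep") payload. */
struct xtag {
	const struct xtag_ops	*xt_ops;
	size_t			 xt_len;	/* payload length in bytes */
	char			*xt_data;	/* payload owned by the tag */
};

static void
deep_free(struct xtag *t)
{
	free(t->xt_data);
	free(t);
}

/* Deep copy: duplicate the payload rather than sharing the pointer. */
static struct xtag *
deep_copy(const struct xtag *t)
{
	struct xtag *n = malloc(sizeof(*n));

	if (n == NULL)
		return (NULL);
	n->xt_data = malloc(t->xt_len);
	if (n->xt_data == NULL) {
		free(n);
		return (NULL);
	}
	memcpy(n->xt_data, t->xt_data, t->xt_len);
	n->xt_len = t->xt_len;
	n->xt_ops = t->xt_ops;
	return (n);
}

static const struct xtag_ops deep_ops = { deep_copy, deep_free };

/* Allocate a deep tag holding a private copy of len bytes from data. */
static struct xtag *
xtag_alloc_deep(const char *data, size_t len)
{
	struct xtag *t;

	assert(len > 0);
	t = malloc(sizeof(*t));
	if (t == NULL)
		return (NULL);
	t->xt_data = malloc(len);
	if (t->xt_data == NULL) {
		free(t);
		return (NULL);
	}
	memcpy(t->xt_data, data, len);
	t->xt_len = len;
	t->xt_ops = &deep_ops;
	return (t);
}

/* Generic code only ever goes through the vector. */
static struct xtag *
xtag_copy(const struct xtag *t)
{
	return (t->xt_ops->xo_copy(t));
}

static void
xtag_free(struct xtag *t)
{
	t->xt_ops->xo_free(t);
}
```

A tag type backed by a dedicated zone would, in this model, simply provide a 
different xtag_ops table; the generic xtag_copy() and xtag_free() callers 
would not change.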

That said, malloc(9) for "small" memory types is essentially a thin 
accounting wrapper around a set of fixed-size UMA zones:

ITEM                     SIZE     LIMIT      USED      FREE  REQUESTS  FAILURES
16:                        16,        0,     3703,      966, 55930783,        0
32:                        32,        0,     1455,      692, 30720298,        0
64:                        64,        0,     4794,     1224, 38352819,        0
128:                      128,        0,     3169,      341,  5705218,        0
256:                      256,        0,     1565,      535, 48338889,        0
512:                      512,        0,      386,      494,  9962475,        0
1024:                    1024,        0,       66,      354,  3418306,        0
2048:                    2048,        0,      314,      514,    29945,        0
4096:                    4096,        0,      250,      279,  4567645,        0

For larger memory sizes, malloc(9) becomes instead a thin wrapper around VM 
allocation of kernel address space and pages.  So as long as you're using 
smaller objects, malloc(9) actually offers most of the benefits of slab 
allocation.
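
To make the size-class idea concrete, here is a small sketch of how a request 
could be mapped to one of the zones listed above. It assumes power-of-two 
classes from 16 to 4096 bytes, as in that table; the names ZONE_MIN, ZONE_MAX 
and zone_size_for() are made up for the example and the real malloc(9) lookup 
may be implemented differently.

```c
#include <assert.h>
#include <stddef.h>

#define	ZONE_MIN	16	/* smallest zone in the table above */
#define	ZONE_MAX	4096	/* largest zone in the table above */

/*
 * Return the item size of the zone that would serve a request of req
 * bytes, or 0 if the request is too large for any fixed-size zone and
 * would instead be served from the VM-backed path.
 */
static size_t
zone_size_for(size_t req)
{
	size_t z;

	if (req > ZONE_MAX)
		return (0);
	/* Walk up the power-of-two classes until one fits. */
	for (z = ZONE_MIN; z < req; z <<= 1)
		;
	return (z);
}
```

Under these assumptions a 100-byte tag would come out of the 128-byte zone, 
so a small allocation already gets slab-style behaviour at the cost of some 
internal rounding.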

Because m_tag(9) is an interface used by a variety of base system and 
third-party components, changes to the KPI would need to be made with a major 
FreeBSD 
release -- for example with 8.0.  Such a change is definitely not precluded at 
this point, but in a couple of months we'll hit feature freeze and it won't be 
possible to make those changes after that time.

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-net@FreeBSD.ORG  Fri Apr 10 19:20:06 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 3FFF3106564A
	for <freebsd-net@hub.freebsd.org>; Fri, 10 Apr 2009 19:20:06 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 14CC58FC12
	for <freebsd-net@hub.freebsd.org>; Fri, 10 Apr 2009 19:20:06 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3AJK5Fx070897
	for <freebsd-net@freefall.freebsd.org>; Fri, 10 Apr 2009 19:20:05 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3AJK5rg070896;
	Fri, 10 Apr 2009 19:20:05 GMT (envelope-from gnats)
Date: Fri, 10 Apr 2009 19:20:05 GMT
Message-Id: <200904101920.n3AJK5rg070896@freefall.freebsd.org>
To: freebsd-net@FreeBSD.org
From: dfilter@FreeBSD.ORG (dfilter service)
Cc: 
Subject: Re: kern/131310: commit references a PR
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: dfilter service <dfilter@FreeBSD.ORG>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 19:20:06 -0000

The following reply was made to PR kern/131310; it has been noted by GNATS.

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: kern/131310: commit references a PR
Date: Fri, 10 Apr 2009 19:16:29 +0000 (UTC)

 Author: mlaier
 Date: Fri Apr 10 19:16:14 2009
 New Revision: 190903
 URL: http://svn.freebsd.org/changeset/base/190903
 
 Log:
   Follow up for r190895  It's not only the "all" group that is affected, but
   all groups on the given interface.
   
   PR:		kern/130977, kern/131310
   MFC after:	3 days (%vnet)
 
 Modified:
   head/sys/net/if.c
 
 Modified: head/sys/net/if.c
 ==============================================================================
 --- head/sys/net/if.c	Fri Apr 10 18:46:46 2009	(r190902)
 +++ head/sys/net/if.c	Fri Apr 10 19:16:14 2009	(r190903)
 @@ -141,6 +141,7 @@ static int	if_delmulti_locked(struct ifn
  static void	do_link_state_change(void *, int);
  static int	if_getgroup(struct ifgroupreq *, struct ifnet *);
  static int	if_getgroupmembers(struct ifgroupreq *);
 +static void	if_delgroups(struct ifnet *);
  
  #ifdef INET6
  /*
 @@ -887,7 +888,7 @@ if_detach(struct ifnet *ifp)
  	rt_ifannouncemsg(ifp, IFAN_DEPARTURE);
  	EVENTHANDLER_INVOKE(ifnet_departure_event, ifp);
  	devctl_notify("IFNET", ifp->if_xname, "DETACH", NULL);
 -	if_delgroup(ifp, IFG_ALL);
 +	if_delgroups(ifp);
  
  	IF_AFDATA_LOCK(ifp);
  	for (dp = domains; dp; dp = dp->dom_next) {
 @@ -1025,6 +1026,54 @@ if_delgroup(struct ifnet *ifp, const cha
  }
  
  /*
 + * Remove an interface from all groups
 + */
 +static void
 +if_delgroups(struct ifnet *ifp)
 +{
 +	INIT_VNET_NET(ifp->if_vnet);
 +	struct ifg_list		*ifgl;
 +	struct ifg_member	*ifgm;
 +	char groupname[IFNAMSIZ];
 +
 +	IFNET_WLOCK();
 +	while (!TAILQ_EMPTY(&ifp->if_groups)) {
 +		ifgl = TAILQ_FIRST(&ifp->if_groups);
 +
 +		strlcpy(groupname, ifgl->ifgl_group->ifg_group, IFNAMSIZ);
 +
 +		IF_ADDR_LOCK(ifp);
 +		TAILQ_REMOVE(&ifp->if_groups, ifgl, ifgl_next);
 +		IF_ADDR_UNLOCK(ifp);
 +
 +		TAILQ_FOREACH(ifgm, &ifgl->ifgl_group->ifg_members, ifgm_next)
 +			if (ifgm->ifgm_ifp == ifp)
 +				break;
 +
 +		if (ifgm != NULL) {
 +			TAILQ_REMOVE(&ifgl->ifgl_group->ifg_members, ifgm,
 +			    ifgm_next);
 +			free(ifgm, M_TEMP);
 +		}
 +
 +		if (--ifgl->ifgl_group->ifg_refcnt == 0) {
 +			TAILQ_REMOVE(&V_ifg_head, ifgl->ifgl_group, ifg_next);
 +			EVENTHANDLER_INVOKE(group_detach_event,
 +			    ifgl->ifgl_group);
 +			free(ifgl->ifgl_group, M_TEMP);
 +		}
 +		IFNET_WUNLOCK();
 +
 +		free(ifgl, M_TEMP);
 +
 +		EVENTHANDLER_INVOKE(group_change_event, groupname);
 +
 +		IFNET_WLOCK();
 +	}
 +	IFNET_WUNLOCK();
 +}
 +
 +/*
   * Stores all groups from an interface in memory pointed
   * to by data
   */
 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
 

From owner-freebsd-net@FreeBSD.ORG  Fri Apr 10 19:32:02 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id B5A61106573E
	for <freebsd-net@freebsd.org>; Fri, 10 Apr 2009 19:32:02 +0000 (UTC)
	(envelope-from kfl@xiplink.com)
Received: from smtp191.iad.emailsrvr.com (smtp191.iad.emailsrvr.com
	[207.97.245.191])
	by mx1.freebsd.org (Postfix) with ESMTP id 731BE8FC22
	for <freebsd-net@freebsd.org>; Fri, 10 Apr 2009 19:32:02 +0000 (UTC)
	(envelope-from kfl@xiplink.com)
Received: from relay9.relay.iad.mlsrvr.com (localhost [127.0.0.1])
	by relay9.relay.iad.mlsrvr.com (SMTP Server) with ESMTP id 12BDE1E4834; 
	Fri, 10 Apr 2009 15:32:02 -0400 (EDT)
Received: by relay9.relay.iad.mlsrvr.com (Authenticated sender:
	kfodil-lemelin-AT-xiplink.com) with ESMTPSA id AF6D11E44EB; 
	Fri, 10 Apr 2009 15:32:01 -0400 (EDT)
Message-ID: <49DF9EAD.1050609@xiplink.com>
Date: Fri, 10 Apr 2009 15:31:57 -0400
From: Karim Fodil-Lemelin <kfl@xiplink.com>
User-Agent: Thunderbird 2.0.0.21 (Windows/20090302)
MIME-Version: 1.0
To: Robert Watson <rwatson@FreeBSD.org>
References: <49DF5F75.6080607@xiplink.com>
	<alpine.BSF.2.00.0904101950350.36143@fledge.watson.org>
In-Reply-To: <alpine.BSF.2.00.0904101950350.36143@fledge.watson.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org
Subject: Re: m_tag, malloc vs uma
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 19:32:04 -0000

Robert Watson wrote:
> On Fri, 10 Apr 2009, Karim Fodil-Lemelin wrote:
>
>> Is there any plans on getting the mbuf tags sub-system integrated 
>> with the universal memory allocator? Getting tags for mbufs is still 
>> calling malloc in uipc_mbuf.c ... What would be the benefits of using 
>> uma instead?
>
> Hi Karim:
>
> Right now there are no specific plans for changes along these lines, 
> although we have talked about moving towards better support for deep 
> objects in m_tags. Right now, MAC requires a "deep" copy, because 
> labels may be complex objects, and this is special-cased in the m_tag 
> code.  One way to move in that direction would be to move from an 
> explicit m_tag free pointer to a pointer to a vector of copy, free, 
> etc, operations.  This would make it easier to support more flexible 
> memory models there, rather than forcing the use of malloc(9).
>
> That said, malloc(9) for "small" memory types is essentially a thin 
> wrapper accounting around a set of fixed-size UMA zones:
>
> ITEM                     SIZE     LIMIT      USED      FREE  REQUESTS  FAILURES
> 16:                        16,        0,     3703,      966, 55930783,        0
> 32:                        32,        0,     1455,      692, 30720298,        0
> 64:                        64,        0,     4794,     1224, 38352819,        0
> 128:                      128,        0,     3169,      341,  5705218,        0
> 256:                      256,        0,     1565,      535, 48338889,        0
> 512:                      512,        0,      386,      494,  9962475,        0
> 1024:                    1024,        0,       66,      354,  3418306,        0
> 2048:                    2048,        0,      314,      514,    29945,        0
> 4096:                    4096,        0,      250,      279,  4567645,        0
>
> For larger memory sizes, malloc(9) becomes instead a thin wrapper 
> around VM allocation of kernel address space and pages.  So as long as 
> you're using smaller objects, malloc(9) actually offers most of the 
> benefits of slab allocation.
>
> Because m_tag(9) is an interface used for a variety of base system and 
> third party parts, changes to the KPI would need to be made with a 
> major FreeBSD release -- for example with 8.0.  Such a change is 
> definitely not precluded at this point, but in a couple of months 
> we'll hit feature freeze and it won't be possible to make those 
> changes after that time.
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge
Hi Robert,

Thank you for the clear and concise answer. I asked because I had 
modified pf_get_mtag() to use uma directly, hoping it would be faster 
than calling malloc. But since pf_mtag is 20 bytes, malloc will end up 
using the fixed 32-byte zone anyway, so I shouldn't expect much of a 
speed gain (apart from not having to select the 32-byte zone) from 
something like:

extern uma_zone_t pf_mtag_zone;
static __inline struct pf_mtag *
pf_get_mtag(struct mbuf *m)
{
  struct m_tag    *mtag;

  if ((mtag = m_tag_find(m, PACKET_TAG_PF, NULL)) == NULL) {
    mtag = uma_zalloc(pf_mtag_zone, M_NOWAIT);
    if (mtag == NULL)
      return (NULL);
    m_tag_setup(mtag, MTAG_ABI_COMPAT, PACKET_TAG_PF,
        sizeof(struct pf_mtag));
    mtag->m_tag_free = pf_mtag_delete;
    bzero(mtag + 1, sizeof(struct pf_mtag));
    m_tag_prepend(m, mtag);
  }
  return ((struct pf_mtag *)(mtag + 1));
}


Where pf_mtag_delete is a wrapper around uma_zfree().
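
To make the size-class point concrete, here is a small userland sketch 
of rounding a request up to the power-of-two buckets shown in the zone 
listing above (16 through 4096). The name malloc_bucket() is 
hypothetical and this is not the kernel's actual lookup code; it only 
illustrates why a 20-byte request would be served from the 32-byte zone.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): return the smallest
 * power-of-two bucket in the range 16..4096 that can hold `size`
 * bytes, or 0 if the request is larger than the biggest bucket.
 */
static size_t
malloc_bucket(size_t size)
{
	size_t bucket;

	for (bucket = 16; bucket <= 4096; bucket <<= 1)
		if (size <= bucket)
			return (bucket);
	return (0);	/* too large for the fixed-size buckets */
}
```

Under these assumptions malloc_bucket(20) returns 32, so a dedicated 
zone would mainly save the bucket selection step.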

Regards,

Karim.

From owner-freebsd-net@FreeBSD.ORG  Fri Apr 10 20:02:44 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 41D42106566B
	for <freebsd-net@freebsd.org>; Fri, 10 Apr 2009 20:02:44 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 1DAEA8FC1A
	for <freebsd-net@freebsd.org>; Fri, 10 Apr 2009 20:02:44 +0000 (UTC)
	(envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
	by cyrus.watson.org (Postfix) with ESMTPS id CFB4D46B94;
	Fri, 10 Apr 2009 16:02:43 -0400 (EDT)
Date: Fri, 10 Apr 2009 21:02:43 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Karim Fodil-Lemelin <kfl@xiplink.com>
In-Reply-To: <49DF9EAD.1050609@xiplink.com>
Message-ID: <alpine.BSF.2.00.0904102057320.36143@fledge.watson.org>
References: <49DF5F75.6080607@xiplink.com>
	<alpine.BSF.2.00.0904101950350.36143@fledge.watson.org>
	<49DF9EAD.1050609@xiplink.com>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-net@freebsd.org
Subject: Re: m_tag, malloc vs uma
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 20:02:44 -0000

On Fri, 10 Apr 2009, Karim Fodil-Lemelin wrote:

> Thank you for the answer, clear and concise. I asked the question because I 
> had modified pf_get_mtag() to use uma directly in the hope that it would be 
> faster then calling malloc. But since pf_mtag is 20bytes, malloc will end up 
> using a fixed 32bytes zone and I shouldn't expect much speed gain from using 
> something like (except some savings from not having to select the 32bytes 
> zone):

There is another small overhead, the critical section used to protect the 
consistency of the per-CPU malloc type alloc and free counters, but it's also 
very small.

I think it would be desirable to make a change to more flexible m_tag types 
for 8.0, but I'm not sure I have time to implement/test it.  Is this something 
you might be interested in working on?  I'm thinking of basically replacing 
the m_tag_free pointer with a pointer to a small vector of operations, 
possibly something along these lines:

struct m_tag_ops {
 	void		(*m_tag_free)(struct m_tag *);
 	struct m_tag	*(*m_tag_copy)(struct m_tag *);
};

If the m_tag_ops pointer is NULL, we go with today's default (requiring 
minimal change of existing consumers).  I'm not sure if there are any other 
function pointers we'd need at this point?
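
The NULL-means-default dispatch described above could look roughly like 
the following userland sketch. All names here (struct tag, struct 
tag_ops, tag_copy(), tag_free()) are hypothetical stand-ins chosen for 
illustration, and malloc(3)/free(3) stand in for the kernel allocator; 
it is not the m_tag(9) code itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct tag;

/* Hypothetical per-type operations vector. */
struct tag_ops {
	void		(*to_free)(struct tag *);
	struct tag	*(*to_copy)(const struct tag *);
};

struct tag {
	const struct tag_ops	*t_ops;	/* NULL selects the defaults */
	size_t			 t_len;	/* payload bytes after the header */
};

/* Default copy: a flat copy of the header plus payload. */
static struct tag *
tag_copy_default(const struct tag *t)
{
	struct tag *n;

	n = malloc(sizeof(*n) + t->t_len);
	if (n != NULL)
		memcpy(n, t, sizeof(*n) + t->t_len);
	return (n);
}

/* Use the type's copy routine if it has one, else the default. */
static struct tag *
tag_copy(const struct tag *t)
{
	if (t->t_ops != NULL && t->t_ops->to_copy != NULL)
		return (t->t_ops->to_copy(t));
	return (tag_copy_default(t));
}

/* Use the type's free routine if it has one, else plain free(). */
static void
tag_free(struct tag *t)
{
	if (t->t_ops != NULL && t->t_ops->to_free != NULL)
		t->t_ops->to_free(t);
	else
		free(t);
}
```

A consumer needing a deep copy would supply its own to_copy; existing 
consumers leave t_ops NULL and keep today's behaviour.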

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-net@FreeBSD.ORG  Fri Apr 10 20:30:05 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 3F460106566B
	for <freebsd-net@hub.freebsd.org>; Fri, 10 Apr 2009 20:30:05 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 127568FC13
	for <freebsd-net@hub.freebsd.org>; Fri, 10 Apr 2009 20:30:05 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3AKU4i7067096
	for <freebsd-net@freefall.freebsd.org>; Fri, 10 Apr 2009 20:30:04 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3AKU4Lj067093;
	Fri, 10 Apr 2009 20:30:04 GMT (envelope-from gnats)
Date: Fri, 10 Apr 2009 20:30:04 GMT
Message-Id: <200904102030.n3AKU4Lj067093@freefall.freebsd.org>
To: freebsd-net@FreeBSD.org
From: Glen Barber <glen.j.barber@gmail.com>
Cc: 
Subject: Re: misc/129580: Netgear WG311v3 (ndis) causes kenel trap at boot.
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Glen Barber <glen.j.barber@gmail.com>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 20:30:05 -0000

The following reply was made to PR kern/129580; it has been noted by GNATS.

From: Glen Barber <glen.j.barber@gmail.com>
To: bug-followup@freebsd.org
Cc:  
Subject: Re: misc/129580: Netgear WG311v3 (ndis) causes kenel trap at boot.
Date: Fri, 10 Apr 2009 16:04:33 -0400

 Since malo(4) is available, I believe this PR can be closed.
 
 Thanks.
 
 -- 
 Glen Barber

From owner-freebsd-net@FreeBSD.ORG  Fri Apr 10 21:13:08 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0EC8D1065674;
	Fri, 10 Apr 2009 21:13:08 +0000 (UTC)
	(envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id D80298FC12;
	Fri, 10 Apr 2009 21:13:07 +0000 (UTC)
	(envelope-from linimon@FreeBSD.org)
Received: from freefall.freebsd.org (linimon@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3ALD70d037631;
	Fri, 10 Apr 2009 21:13:07 GMT
	(envelope-from linimon@freefall.freebsd.org)
Received: (from linimon@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3ALD7Fi037625;
	Fri, 10 Apr 2009 21:13:07 GMT (envelope-from linimon)
Date: Fri, 10 Apr 2009 21:13:07 GMT
Message-Id: <200904102113.n3ALD7Fi037625@freefall.freebsd.org>
To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-net@FreeBSD.org
From: linimon@FreeBSD.org
Cc: 
Subject: Re: kern/133572: [ppp] [hang] incoming PPTP connection hangs the
	system
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 21:13:08 -0000

Old Synopsis: incoming PPTP connection hangs the system
New Synopsis: [ppp] [hang] incoming PPTP connection hangs the system

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Fri Apr 10 21:11:38 UTC 2009
Responsible-Changed-Why: 
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=133572

From owner-freebsd-net@FreeBSD.ORG  Fri Apr 10 23:10:05 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0BC14106566B
	for <freebsd-net@hub.freebsd.org>; Fri, 10 Apr 2009 23:10:05 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id D388F8FC0C
	for <freebsd-net@hub.freebsd.org>; Fri, 10 Apr 2009 23:10:04 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3ANA41x086899
	for <freebsd-net@freefall.freebsd.org>; Fri, 10 Apr 2009 23:10:04 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3ANA46d086898;
	Fri, 10 Apr 2009 23:10:04 GMT (envelope-from gnats)
Date: Fri, 10 Apr 2009 23:10:04 GMT
Message-Id: <200904102310.n3ANA46d086898@freefall.freebsd.org>
To: freebsd-net@FreeBSD.org
From: Max Laier <max@love2party.net>
Cc: 
Subject: Re: kern/133572: [ppp] [hang] incoming PPTP connection hangs the
	system
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Max Laier <max@love2party.net>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Apr 2009 23:10:05 -0000

The following reply was made to PR kern/133572; it has been noted by GNATS.

From: Max Laier <max@love2party.net>
To: bug-followup@freebsd.org,
 dennis.melentyev@gmail.com
Cc:  
Subject: Re: kern/133572: [ppp] [hang] incoming PPTP connection hangs the system
Date: Fri, 10 Apr 2009 23:47:55 +0100

 Is it possible for you to turn on WITNESS on this machine to obtain possible 
 LORs that might be responsible for the hang?  Also, do you have the 
 possibility to enable DDB and drop into it from the console (if it is not a 
 hard hang but a live lock)?
 
 --
   Max

From owner-freebsd-net@FreeBSD.ORG  Sat Apr 11 01:54:38 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id AB9EF10656C5
	for <freebsd-net@FreeBSD.org>; Sat, 11 Apr 2009 01:54:38 +0000 (UTC)
	(envelope-from ccowart@rescomp.berkeley.edu)
Received: from hal.rescomp.berkeley.edu (hal.Rescomp.Berkeley.EDU
	[169.229.70.150])
	by mx1.freebsd.org (Postfix) with ESMTP id 918368FC17
	for <freebsd-net@FreeBSD.org>; Sat, 11 Apr 2009 01:54:38 +0000 (UTC)
	(envelope-from ccowart@rescomp.berkeley.edu)
Received: by hal.rescomp.berkeley.edu (Postfix, from userid 1225)
	id 8B0273C054F; Fri, 10 Apr 2009 18:38:34 -0700 (PDT)
Date: Fri, 10 Apr 2009 18:38:34 -0700
From: Chris Cowart <ccowart@rescomp.berkeley.edu>
To: "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>
Message-ID: <20090411013834.GB40655@hal.rescomp.berkeley.edu>
Mail-Followup-To: "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>,
	"Eugene M. Kim" <20080111.freebsd.org@ab.ote.we.lv>,
	freebsd-net@FreeBSD.org
References: <48693E39.4080104@ab.ote.we.lv>
	<20080630220842.X83875@maildrop.int.zabbadoz.net>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-ripemd160;
	protocol="application/pgp-signature"; boundary="H1spWtNR+x+ondvy"
Content-Disposition: inline
In-Reply-To: <20080630220842.X83875@maildrop.int.zabbadoz.net>
Organization: RSSP-IT, UC Berkeley
User-Agent: Mutt/1.5.18 (2008-05-17)
Cc: freebsd-net@FreeBSD.org,
	"Eugene M. Kim" <20080111.freebsd.org@ab.ote.we.lv>
Subject: Re: bridge(4) and IPv6 link-local address
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 11 Apr 2009 01:54:39 -0000


--H1spWtNR+x+ondvy
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Bjoern A. Zeeb wrote:
> On Mon, 30 Jun 2008, Eugene M. Kim wrote:
> > A quick question: Is bridge(4) supposed /not/ to automatically configure an
> > IPv6 link-local address?
>
> yes there is a check for this in the code and if removed (tried that
> lately) more things go wrong.
>
> > I'm trying to use it to bridge a wired segment and a wireless segment, and
> > router advertisement over bridge0 wouldn't work because, with bridge0 lacking
> > a LL address, the router uses a non-LL address as the source address for RA
> > packets, which then is ignored as invalid by other IPv6 nodes.
>
> yes, seen something similar lately but ETIMEOUT on debugging. The
> problem basically was:
>
>       lan    bridge    ath   ---  wlan client
>
> the LL address was on the "lan" interface.
>
> ping6 LL on lan from wlan client did not work. I could see the packets
> being bridged and visible on all interfaces and even the router on lan
> noticed them but there was no reply going to the client. ping6 from
> the bridge ``box'' to the wlan client and everything was fine as nd
> was seeded.
>
> Removing the check we ended up with the same LL address on both bridge
> and the lan interface if I can remember correctly and you do not want
> that... it's a bit tricky and there is something that does not work as
> expected, right. If you find the time to debug it I'll happily test
> patches;-)

I seem to be reviving a fairly old thread here, but this is what I found
when I went searching for the issue.

I am personally bridging a wireless NIC (ath0) with a VLAN interface
(vlan10). The bridge does not receive a link-local address. The bridge
interface (bridge0) is the default gateway for my LAN, both for v4 and
v6.

My Mac was logging this message in response to router advertisements:

| Apr 10 18:16:54 administrators-imac configd[29]: RTADV_VERIFY_PACKET:
| invalid RA with non link-local source from 2001:4830:1679:10::1 on en0

and was refusing to acknowledge them.

My solution was to assign a link-local address to bridge0 based on the
ethernet address (I think I did the EUI-48 stuff correctly):

| bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
|     ether 92:db:a2:b4:8e:ba
|     inet 10.1.10.1 netmask 0xffffff00 broadcast 10.1.10.255
|     inet6 2001:4830:1679:10::1 prefixlen 64
|     inet6 fe80::90db:a2ff:feb4:83ba%bridge0 prefixlen 64 scopeid 0xc
|     maxage 20 holdcnt 6 proto rstp maxaddr 100 timeout 1200
|     root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
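
For reference, the usual derivation of a link-local address from an
Ethernet address (modified EUI-64) can be sketched as below. The
function names mac_to_ifid() and format_linklocal() are hypothetical,
and this is a userland illustration, not the kernel's autoconfiguration
code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build the 64-bit interface identifier from a 48-bit MAC address:
 * flip the universal/local bit of the first octet and insert ff:fe
 * between the third and fourth octets.
 */
static void
mac_to_ifid(const uint8_t mac[6], uint8_t ifid[8])
{
	ifid[0] = mac[0] ^ 0x02;
	ifid[1] = mac[1];
	ifid[2] = mac[2];
	ifid[3] = 0xff;
	ifid[4] = 0xfe;
	ifid[5] = mac[3];
	ifid[6] = mac[4];
	ifid[7] = mac[5];
}

/* Format fe80::<ifid> as text; buf needs at least 26 bytes. */
static void
format_linklocal(const uint8_t ifid[8], char *buf, size_t len)
{
	snprintf(buf, len, "fe80::%x:%x:%x:%x",
	    ((unsigned int)ifid[0] << 8) | ifid[1],
	    ((unsigned int)ifid[2] << 8) | ifid[3],
	    ((unsigned int)ifid[4] << 8) | ifid[5],
	    ((unsigned int)ifid[6] << 8) | ifid[7]);
}
```

Fed the ether address shown above (92:db:a2:b4:8e:ba), this yields
fe80::90db:a2ff:feb4:8eba, which differs from the address in the
ifconfig output in one octet (8e versus 83).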

According to ifconfig(8):

| Basic IPv6 node operation requires a link-local address on each interface
| configured for IPv6.  Normally, such an address is automatically config-
| ured by the kernel on each interface added to the system; this behaviour
| may be disabled by setting the sysctl MIB variable
| net.inet6.ip6.auto_linklocal to 0.

The bridge(4) page does not add any disclaimer about bridge interfaces.
Neither man page gives a good how-to on assigning your own link-local
address (I guessed and got it right with the % notation).

Shouldn't the kernel assign link-local addresses to these interfaces? Should
this address be based on the ethernet address of the bridge interface?
I'm not sure I really understood the challenges with the implementation.

--=20
Chris Cowart
Network Technical Lead
Network & Infrastructure Services, RSSP-IT
UC Berkeley

--H1spWtNR+x+ondvy
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.10 (FreeBSD)

iQIcBAEBAwAGBQJJ3/SaAAoJEIGh6j3cHUNP+ecP+gKBGGQUMWgmJ1BQNgT/FfW3
rHkRDLUYNF8eJ+OX4yDfOLgsWCXtEDqvO99OwMHr+1GHhg4rJWYM2C1JJYJElAXE
fp79/eSM8Gjo0n9EiWqglkUL9HyRiPtRX7K7WbPLJG75J7ALkThK04UCTghF8GJ5
ZfeoKG9PauZJruH3j91v6aBZhV0E6GrSc8+KiJvx/NmxBiMzpBXOGb4h32R0zPfT
n1Fat3bJ5yxyBXaAEnRdOajTG4wUIXa1CFYrmskk8XA7uToaXK0CuiSdexHjrxIj
5GymlpLL33FuOvg32/nK9HDEaL/ktqHZNz+wt9n1p2T4VGk+bdd1TQQvOwXRXwG/
SEIHnpFTREasZ9K0RgVC6mFgkVvFbZGV6OGW/ugISg81u56l+O4IncH6JUCeT5CP
h4JUwcQgLAd4IC2ISqNBlYOaDj1yFikhyHWsv8BzUV5WmQT0fq4AToswbAUdQU6A
4lNH0Wq3YZurcRk1fcVQY4atGdin3ftGL6FOI54AB+yb+o1a6E/UxoiYh4tHW4lW
XFvkjV23yy+W94SWhbjyldQyy5GoCED0wzxF/x/R7lzQ9AF1/uumcubRWV7f/+EG
0HvIbBx34TvsTCDyocBNGwcred+BnHE7saS67dWuB6fvcUrOjyqqvjLoMIY4X8tB
zV9aw1X5E3Dmg0EAzdOa
=S/1z
-----END PGP SIGNATURE-----

--H1spWtNR+x+ondvy--

From owner-freebsd-net@FreeBSD.ORG  Sat Apr 11 08:50:02 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E31111065686
	for <freebsd-net@hub.freebsd.org>; Sat, 11 Apr 2009 08:50:02 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id CFCDE8FC30
	for <freebsd-net@hub.freebsd.org>; Sat, 11 Apr 2009 08:50:02 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3B8o2uA010511
	for <freebsd-net@freefall.freebsd.org>; Sat, 11 Apr 2009 08:50:02 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3B8o2ka010510;
	Sat, 11 Apr 2009 08:50:02 GMT (envelope-from gnats)
Date: Sat, 11 Apr 2009 08:50:02 GMT
Message-Id: <200904110850.n3B8o2ka010510@freefall.freebsd.org>
To: freebsd-net@FreeBSD.org
From: Mykola Dzham <freebsd@levsha.org.ua>
Cc: 
Subject: Re: bin/131365: r190758 break using 0 , 0/0, 0.0.0.0/0 as alias
	for 'default'
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Mykola Dzham <freebsd@levsha.org.ua>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 11 Apr 2009 08:50:04 -0000

The following reply was made to PR bin/131365; it has been noted by GNATS.

From: Mykola Dzham <freebsd@levsha.org.ua>
To: bug-followup@FreeBSD.org, rrs@FreeBSD.org
Cc:  
Subject: Re: bin/131365: r190758 break using 0 , 0/0, 0.0.0.0/0 as alias
	for 'default'
Date: Sat, 11 Apr 2009 11:20:20 +0300

 --UugvWAfsgieZRqgk
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 
 Hi!
 r190758 breaks using 0.0.0.0/0 as an alias for the default route:
 
 $ route -n get default     
    route to: default
 destination: default
        mask: default
     gateway: 192.168.1.1
   interface: em0
       flags: <UP,GATEWAY,DONE,STATIC>
  recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
        0         0         0         0         0         0      1500         0 
 
 $ route -n get -net 0.0.0.0
 route: writing to routing socket: No such process
 
 The attached patch fixes this.
 
 -- 
 Mykola Dzham, LEFT-(UANIC|RIPE)
 JID: levsha@jabber.net.ua
 
 --UugvWAfsgieZRqgk
 Content-Type: text/x-diff; charset=us-ascii
 Content-Disposition: attachment; filename="route.c.patch"
 
 Index: route.c
 ===================================================================
 --- route.c	(revision 190880)
 +++ route.c	(working copy)
 @@ -818,7 +818,8 @@
  		/* i holds the first non zero bit */
  		bits = 32 - (i*8);	
  	}
 -	mask = 0xffffffff << (32 - bits);
 +	if (bits != 0)
 +		mask = 0xffffffff << (32 - bits);
  
  	sin->sin_addr.s_addr = htonl(addr);
  	sin = &so_mask.sin;
 
 --UugvWAfsgieZRqgk--
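
The reason the guard matters: with bits == 0 the unpatched expression
shifts a 32-bit value by 32, which is undefined behaviour in C and does
not reliably produce an all-zero mask. A minimal standalone sketch of a
prefix-length-to-netmask conversion follows; prefix_to_mask() is a
hypothetical name and is not taken from route.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an IPv4 prefix length (0..32) to a host-order netmask.
 * Shifting a 32-bit value by 32 is undefined behaviour, so the
 * zero-length prefix is handled explicitly instead of being shifted.
 */
static uint32_t
prefix_to_mask(unsigned int bits)
{
	if (bits == 0)
		return (0);
	if (bits >= 32)
		return (0xffffffffU);
	return (0xffffffffU << (32 - bits));
}
```

With this shape, a /0 prefix maps to an all-zero mask, which is what
the default route needs.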

From owner-freebsd-net@FreeBSD.ORG  Sat Apr 11 10:10:03 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 324B2106564A
	for <freebsd-net@hub.freebsd.org>; Sat, 11 Apr 2009 10:10:03 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 1E8048FC14
	for <freebsd-net@hub.freebsd.org>; Sat, 11 Apr 2009 10:10:03 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3BAA2mE016571
	for <freebsd-net@freefall.freebsd.org>; Sat, 11 Apr 2009 10:10:03 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3BAA2QQ016570;
	Sat, 11 Apr 2009 10:10:02 GMT (envelope-from gnats)
Date: Sat, 11 Apr 2009 10:10:02 GMT
Message-Id: <200904111010.n3BAA2QQ016570@freefall.freebsd.org>
To: freebsd-net@FreeBSD.org
From: Randall Stewart <rrs@lakerest.net>
Cc: 
Subject: Re: bin/131365: r190758 break using 0 , 0/0,
	0.0.0.0/0 as alias for 'default'
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Randall Stewart <rrs@lakerest.net>
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 11 Apr 2009 10:10:03 -0000

The following reply was made to PR bin/131365; it has been noted by GNATS.

From: Randall Stewart <rrs@lakerest.net>
To: Mykola Dzham <freebsd@levsha.org.ua>
Cc: bug-followup@FreeBSD.org, rrs@FreeBSD.org
Subject: Re: bin/131365: r190758 break using 0 , 0/0, 0.0.0.0/0 as alias for 'default'
Date: Sat, 11 Apr 2009 06:04:37 -0400

 Good catch Mykola..
 
 I will get this in :)
 
 R
 On Apr 11, 2009, at 4:20 AM, Mykola Dzham wrote:
 
 > Hi!
 > r190758 break using 0.0.0.0/0 as alias for default rote:
 >
 > $ route -n get default
 >   route to: default
 > destination: default
 >       mask: default
 >    gateway: 192.168.1.1
 >  interface: em0
 >      flags: <UP,GATEWAY,DONE,STATIC>
 > recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
 >       0         0         0         0         0         0      1500         0
 >
 > $ route -n get -net 0.0.0.0
 > route: writing to routing socket: No such process
 >
 > The attached patch fixes this.
 >
 > -- 
 > Mykola Dzham, LEFT-(UANIC|RIPE)
 > JID: levsha@jabber.net.ua
 > <route.c.patch>
 
 ------------------------------
 Randall Stewart
 803-317-4952 (cell)
 803-345-0391(direct)
 

From owner-freebsd-net@FreeBSD.ORG  Sat Apr 11 19:56:37 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 0909A106564A
	for <freebsd-net@freebsd.org>; Sat, 11 Apr 2009 19:56:37 +0000 (UTC)
	(envelope-from kfl@xiplink.com)
Received: from smtp161.iad.emailsrvr.com (smtp161.iad.emailsrvr.com
	[207.97.245.161])
	by mx1.freebsd.org (Postfix) with ESMTP id D47618FC1A
	for <freebsd-net@freebsd.org>; Sat, 11 Apr 2009 19:56:36 +0000 (UTC)
	(envelope-from kfl@xiplink.com)
Received: from relay16.relay.iad.mlsrvr.com (localhost [127.0.0.1])
	by relay16.relay.iad.mlsrvr.com (SMTP Server) with ESMTP id 304821B4013;
	Sat, 11 Apr 2009 15:56:31 -0400 (EDT)
Received: by relay16.relay.iad.mlsrvr.com (Authenticated sender:
	kfodil-lemelin-AT-xiplink.com) with ESMTPSA id E9DE81B4003; 
	Sat, 11 Apr 2009 15:56:30 -0400 (EDT)
Message-ID: <49E0F5EF.3030807@xiplink.com>
Date: Sat, 11 Apr 2009 15:56:31 -0400
From: Karim Fodil-Lemelin <kfl@xiplink.com>
User-Agent: Thunderbird 2.0.0.21 (Windows/20090302)
MIME-Version: 1.0
To: Robert Watson <rwatson@FreeBSD.org>
References: <49DF5F75.6080607@xiplink.com>
	<alpine.BSF.2.00.0904101950350.36143@fledge.watson.org>
	<49DF9EAD.1050609@xiplink.com>
	<alpine.BSF.2.00.0904102057320.36143@fledge.watson.org>
In-Reply-To: <alpine.BSF.2.00.0904102057320.36143@fledge.watson.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org
Subject: Re: m_tag, malloc vs uma
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 11 Apr 2009 19:56:37 -0000

Robert Watson wrote:
> On Fri, 10 Apr 2009, Karim Fodil-Lemelin wrote:
>
>> Thank you for the answer, clear and concise. I asked the question 
>> because I had modified pf_get_mtag() to use uma directly in the hope 
>> that it would be faster than calling malloc. But since pf_mtag is 
>> 20 bytes, malloc will end up using a fixed 32-byte zone and I 
>> shouldn't expect much speed gain from using something like (except 
>> some savings from not having to select the 32bytes zone):
>
> There is another small overhead, the critical section used to protect 
> the consistency of the per-CPU malloc type alloc and free counters, 
> but it's also very small.
>
> I think it would be desirable to make a change to more flexible m_tag 
> types for 8.0, but I'm not sure I have time to implement/test it.  Is 
> this something you might be interested in working on?  I'm thinking of 
> basically replacing the m_tag_free pointer with a pointer to a small 
> vector of operations, possibly something along these lines:
>
> struct m_tag_ops {
>     void        (*m_tag_free)(struct m_tag *);
>     struct m_tag    *(*m_tag_copy)(struct m_tag *);
> };
>
> If the m_tag_ops pointer is NULL, we go with today's default 
> (requiring minimal change of existing consumers).  I'm not sure if 
> there are any other function pointers we'd need at this point? 

Is m_tag_copy here an 'overloaded' version of the current m_tag_copy, or 
something else? It could also be interesting to have another function 
pointer to overload m_tag_alloc, giving the user more control over which 
zone its tags come from (e.g. pf_mtag). The interest is there; I'm not 
sure the schedule will allow it, but that depends on whether the new 
m_tag design lets me squeeze out some performance.
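
To make the idea concrete, here is a minimal userland sketch of an ops 
vector with a NULL fallback, plus the extra m_tag_alloc hook suggested 
above. The struct layout, the pf_zone_* helpers, the counters and the 
tag_alloc/tag_free dispatchers are all hypothetical; libc malloc/free 
stands in for malloc(9) and for a dedicated UMA zone. It is not the 
kernel's m_tag code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct m_tag;

/* Hypothetical ops vector; m_tag_alloc is the extra hook, not part of the proposal. */
struct m_tag_ops {
	struct m_tag	*(*m_tag_alloc)(int len);
	void		 (*m_tag_free)(struct m_tag *);
};

/* Simplified userland stand-in for the kernel's struct m_tag. */
struct m_tag {
	const struct m_tag_ops	*m_tag_ops;	/* NULL selects the default path */
	int			 m_tag_len;
};

static int zone_allocs, zone_frees;	/* demo counters only */

/* Default path: libc malloc standing in for malloc(9). */
static struct m_tag *
default_alloc(int len)
{
	struct m_tag *t;

	assert(len >= 0);
	t = malloc(sizeof(*t) + (size_t)len);
	if (t != NULL) {
		t->m_tag_ops = NULL;
		t->m_tag_len = len;
	}
	return (t);
}

/* Hypothetical pf hooks; a kernel version would use a dedicated UMA zone. */
static struct m_tag *pf_zone_alloc(int len);
static void pf_zone_free(struct m_tag *t);

static const struct m_tag_ops pf_mtag_ops = { pf_zone_alloc, pf_zone_free };

static struct m_tag *
pf_zone_alloc(int len)
{
	struct m_tag *t = default_alloc(len);

	if (t != NULL) {
		t->m_tag_ops = &pf_mtag_ops;
		zone_allocs++;
	}
	return (t);
}

static void
pf_zone_free(struct m_tag *t)
{
	zone_frees++;
	free(t);
}

/* Dispatch: a NULL ops pointer (or NULL hook) falls back to the default. */
static struct m_tag *
tag_alloc(const struct m_tag_ops *ops, int len)
{
	if (ops != NULL && ops->m_tag_alloc != NULL)
		return (ops->m_tag_alloc(len));
	return (default_alloc(len));
}

static void
tag_free(struct m_tag *t)
{
	assert(t != NULL);
	if (t->m_tag_ops != NULL && t->m_tag_ops->m_tag_free != NULL)
		t->m_tag_ops->m_tag_free(t);
	else
		free(t);
}
```

With this shape, tag_alloc(NULL, 20) takes the default path and 
tag_alloc(&pf_mtag_ops, 20) goes through the pf hooks; tag_free() needs 
no extra argument because the tag remembers its own ops pointer.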

Karim.


From owner-freebsd-net@FreeBSD.ORG  Sat Apr 11 20:27:10 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D2BD81065673;
	Sat, 11 Apr 2009 20:27:10 +0000 (UTC) (envelope-from sam@freebsd.org)
Received: from ebb.errno.com (ebb.errno.com [69.12.149.25])
	by mx1.freebsd.org (Postfix) with ESMTP id 94E1B8FC0C;
	Sat, 11 Apr 2009 20:27:10 +0000 (UTC) (envelope-from sam@freebsd.org)
Received: from trouble.errno.com (trouble.errno.com [10.0.0.248])
	(authenticated bits=0)
	by ebb.errno.com (8.13.6/8.12.6) with ESMTP id n3BKR9qJ073505
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sat, 11 Apr 2009 13:27:10 -0700 (PDT) (envelope-from sam@freebsd.org)
Message-ID: <49E0FD1D.408@freebsd.org>
Date: Sat, 11 Apr 2009 13:27:09 -0700
From: Sam Leffler <sam@freebsd.org>
Organization: FreeBSD Project
User-Agent: Thunderbird 2.0.0.18 (X11/20081209)
MIME-Version: 1.0
To: Karim Fodil-Lemelin <kfl@xiplink.com>
References: <49DF5F75.6080607@xiplink.com>	<alpine.BSF.2.00.0904101950350.36143@fledge.watson.org>	<49DF9EAD.1050609@xiplink.com>	<alpine.BSF.2.00.0904102057320.36143@fledge.watson.org>
	<49E0F5EF.3030807@xiplink.com>
In-Reply-To: <49E0F5EF.3030807@xiplink.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-DCC-x.dcc-servers-Metrics: ebb.errno.com; whitelist
Cc: freebsd-net@freebsd.org, Robert Watson <rwatson@freebsd.org>
Subject: Re: m_tag, malloc vs uma
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 11 Apr 2009 20:27:11 -0000

Karim Fodil-Lemelin wrote:
> Robert Watson wrote:
>> On Fri, 10 Apr 2009, Karim Fodil-Lemelin wrote:
>>
>>> Thank you for the answer, clear and concise. I asked the question 
>>> because I had modified pf_get_mtag() to use uma directly in the hope 
>>> that it would be faster than calling malloc. But since pf_mtag is 
>>> 20 bytes, malloc will end up using a fixed 32-byte zone and I 
>>> shouldn't expect much speed gain from using something like (except 
>>> some savings from not having to select the 32bytes zone):
>>
>> There is another small overhead, the critical section used to protect 
>> the consistency of the per-CPU malloc type alloc and free counters, 
>> but it's also very small.
>>
>> I think it would be desirable to make a change to more flexible m_tag 
>> types for 8.0, but I'm not sure I have time to implement/test it.  Is 
>> this something you might be interested in working on?  I'm thinking 
>> of basically replacing the m_tag_free pointer with a pointer to a 
>> small vector of operations, possibly something along these lines:
>>
>> struct m_tag_ops {
>>     void        (*m_tag_free)(struct m_tag *);
>>     struct m_tag    *(*m_tag_copy)(struct m_tag *);
>> };
>>
>> If the m_tag_ops pointer is NULL, we go with today's default 
>> (requiring minimal change of existing consumers).  I'm not sure if 
>> there are any other function pointers we'd need at this point? 
>
> Is m_tag_copy here an 'overloaded' version of the current m_tag_copy, 
> or something else? It could also be interesting to have another 
> function pointer to overload m_tag_alloc, giving the user more control 
> over which zone its tags come from (e.g. pf_mtag). The interest is 
> there; I'm not sure the schedule will allow it, but that depends on 
> whether the new m_tag design lets me squeeze out some performance.

Typically tags are allocated in a context where decisions like the above 
can be made so I'm not sure where you think m_tag_alloc might be used.

At one point vlan-tagged packets were identified by an mbuf tag.  
Initially the tags were allocated by malloc, but I moved that to a 
dedicated zone w/ a noticeable benefit.  However the overhead was still 
too high, and so space was added to the mbuf pkt hdr explicitly to hold 
vlan data.

It's unlikely any scheme where the tags are allocated independently of 
the mbufs will scale well enough to handle existing high speed 
interfaces.  There's been discussion about supporting embedding of tags 
in the mbuf itself; this might come along as part of the variable-size 
mbuf work that Jeff Roberson was working on.  However, unless one 
pre-allocated space and/or defined a general mechanism for managing such 
space, you'd still potentially need to allocate tags separately when 
they are attached at a later time.  For embedded/inline mbuf tag space 
management I think m_tag_free and m_tag_copy would be sufficient for 
current usage.
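
As a rough illustration of that trade-off, here is a userland sketch in 
which a packet carries vlan data in its header, reserves a small inline 
area for tags, and falls back to a separate allocation once that area is 
full. Every name here (struct pkt, struct tag, PKT_INLINE_TAG_BYTES, the 
pkt_* functions) is made up for the example; it is not the mbuf layout 
or any existing kernel interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define	PKT_INLINE_TAG_BYTES	64	/* hypothetical reserved tag space */

struct tag {
	struct tag	*next;
	uint16_t	 id;
	uint16_t	 len;		/* payload bytes following the header */
	uint8_t		 is_inline;	/* 1 if carved out of the packet itself */
};

struct pkt {
	uint16_t	 vlan_tag;	/* hot data lives in the header, no tag */
	struct tag	*tags;
	size_t		 inline_used;
	union {				/* union forces struct tag alignment */
		struct tag	align;
		unsigned char	buf[PKT_INLINE_TAG_BYTES];
	} inline_area;
};

static void
pkt_init(struct pkt *p)
{
	memset(p, 0, sizeof(*p));
}

/* Attach a tag: use inline space if it fits, else allocate separately. */
static struct tag *
pkt_tag_attach(struct pkt *p, uint16_t id, uint16_t len)
{
	size_t need = sizeof(struct tag) + len;
	struct tag *t;

	/* Round up to pointer size so the next inline tag stays aligned
	 * (assumes struct tag needs no stricter alignment than a pointer). */
	need = (need + sizeof(void *) - 1) & ~(sizeof(void *) - 1);
	if (need <= PKT_INLINE_TAG_BYTES - p->inline_used) {
		t = (struct tag *)(void *)(p->inline_area.buf + p->inline_used);
		p->inline_used += need;
		t->is_inline = 1;
	} else {
		t = malloc(sizeof(*t) + len);
		if (t == NULL)
			return (NULL);
		t->is_inline = 0;
	}
	t->id = id;
	t->len = len;
	t->next = p->tags;
	p->tags = t;
	return (t);
}

/* Release all tags; only separately allocated ones go back to the heap. */
static void
pkt_tags_free(struct pkt *p)
{
	struct tag *t, *next;

	for (t = p->tags; t != NULL; t = next) {
		next = t->next;
		if (!t->is_inline)
			free(t);
	}
	p->tags = NULL;
	p->inline_used = 0;
}
```

The inline path costs no allocation at all, which is the point of 
embedding; the fallback keeps late or oversized tags working, at the 
price of a per-tag flag so the free path knows which storage it owns.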

    Sam


From owner-freebsd-net@FreeBSD.ORG  Sat Apr 11 21:52:05 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 9308A1065672;
	Sat, 11 Apr 2009 21:52:05 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 689F98FC0C;
	Sat, 11 Apr 2009 21:52:05 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3BLq5T0079875;
	Sat, 11 Apr 2009 21:52:05 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3BLq5sF079871;
	Sat, 11 Apr 2009 21:52:05 GMT (envelope-from gnats)
Date: Sat, 11 Apr 2009 21:52:05 GMT
Message-Id: <200904112152.n3BLq5sF079871@freefall.freebsd.org>
To: gnats@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-net@FreeBSD.org
From: gnats@FreeBSD.org
Cc: 
Subject: Re: kern/133613: [wpi] [panic] kernel panic in wpi(4)
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 11 Apr 2009 21:52:06 -0000

Old Synopsis: kernel panic in wpi(4)
New Synopsis: [wpi] [panic] kernel panic in wpi(4)

Responsible-Changed-From-To: freebsd-bugs->freebsd-net
Responsible-Changed-By: gnats
Responsible-Changed-When: Sat Apr 11 21:51:42 UTC 2009
Responsible-Changed-Why: 


http://www.freebsd.org/cgi/query-pr.cgi?pr=133613