From owner-freebsd-net@FreeBSD.ORG  Sun Dec  2 00:25:55 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 74B67757;
 Sun,  2 Dec 2012 00:25:55 +0000 (UTC)
 (envelope-from rmacklem@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id 5917C8FC08;
 Sun,  2 Dec 2012 00:25:55 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id qB20Ptbv006658;
 Sun, 2 Dec 2012 00:25:55 GMT
 (envelope-from rmacklem@freefall.freebsd.org)
Received: (from rmacklem@localhost)
 by freefall.freebsd.org (8.14.5/8.14.5/Submit) id qB20Ps1L006654;
 Sun, 2 Dec 2012 00:25:54 GMT (envelope-from rmacklem)
Date: Sun, 2 Dec 2012 00:25:54 GMT
Message-Id: <201212020025.qB20Ps1L006654@freefall.freebsd.org>
To: jas@cse.yorku.ca, rmacklem@FreeBSD.org, freebsd-net@FreeBSD.org
From: rmacklem@FreeBSD.org
Subject: Re: kern/173479: [nfs] chown and chgrp operations fail between
 FreeBSD 9.1RC3 NFSv4 server and RH63 NFSv4 client
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 02 Dec 2012 00:25:55 -0000

Synopsis: [nfs] chown and chgrp operations fail between FreeBSD 9.1RC3 NFSv4 server and RH63 NFSv4 client

State-Changed-From-To: open->closed
State-Changed-By: rmacklem
State-Changed-When: Sun Dec 2 00:20:25 UTC 2012
State-Changed-Why: 

This bug is caused by Linux 3.3 or greater kernels defaulting
to using numeric uids/gids in the owner and owner_group strings.
Support for this is defined in an internet draft that has not
yet been published as an RFC. To switch the Linux client to the
old behaviour you may:
- create /etc/modprobe.d
- put a file in there called nfs.conf with the following line in it
options nfs nfs4_disable_idmapping=N
Support for this new behaviour was added to head as r240720 and has
been MFC'd to stable/8 and stable/9.

http://www.freebsd.org/cgi/query-pr.cgi?pr=173479

From owner-freebsd-net@FreeBSD.ORG  Sun Dec  2 00:28:37 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@smarthost.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id C0852765;
 Sun,  2 Dec 2012 00:28:37 +0000 (UTC)
 (envelope-from rmacklem@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id A64838FC08;
 Sun,  2 Dec 2012 00:28:37 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id qB20Sbw6006813;
 Sun, 2 Dec 2012 00:28:37 GMT
 (envelope-from rmacklem@freefall.freebsd.org)
Received: (from rmacklem@localhost)
 by freefall.freebsd.org (8.14.5/8.14.5/Submit) id qB20Sbmf006809;
 Sun, 2 Dec 2012 00:28:37 GMT (envelope-from rmacklem)
Date: Sun, 2 Dec 2012 00:28:37 GMT
Message-Id: <201212020028.qB20Sbmf006809@freefall.freebsd.org>
To: jas@cse.yorku.ca, rmacklem@FreeBSD.org, freebsd-net@FreeBSD.org
From: rmacklem@FreeBSD.org
Subject: Re: kern/173481: [NFS] RH63 NFSv4 client does not reconnect to
 FreeBSD 9.1RC3 NFSv4 server after server is rebooted
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 02 Dec 2012 00:28:37 -0000

Synopsis: [NFS] RH63 NFSv4 client does not reconnect to FreeBSD 9.1RC3 NFSv4 server after server is rebooted

State-Changed-From-To: open->closed
State-Changed-By: rmacklem
State-Changed-When: Sun Dec 2 00:26:10 UTC 2012
State-Changed-Why: 

Upon further investigation, Jason determined that the Linux client
generated no network traffic to the server for some time, but then
did reconnect and recover. The slow reconnect seems to be a Linux
client issue. He recommended closing the PR.

http://www.freebsd.org/cgi/query-pr.cgi?pr=173481

From owner-freebsd-net@FreeBSD.ORG  Sun Dec  2 00:48:43 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 4C04EE7E;
 Sun,  2 Dec 2012 00:48:43 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 162FA8FC0C;
 Sun,  2 Dec 2012 00:48:42 +0000 (UTC)
Received: from secured.by.ipfw.ru ([95.143.220.47] helo=ws.su29.net)
 by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256)
 (Exim 4.76 (FreeBSD)) (envelope-from <melifaro@FreeBSD.org>)
 id 1Texmz-0008qu-Ll; Sun, 02 Dec 2012 04:52:10 +0400
Message-ID: <50BAA552.1010707@FreeBSD.org>
Date: Sun, 02 Dec 2012 04:48:18 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:9.0) Gecko/20120121 Thunderbird/9.0
MIME-Version: 1.0
To: Hiroki Sato <hrs@FreeBSD.org>
Subject: [CFT] Virtual BPF interfaces (was: CFR: ipfw0 pseudo-interface
 clonable)
References: <4F96D11B.2060007@FreeBSD.org>
 <20120425.020518.406495893112283552.hrs@allbsd.org>
 <4F96E71B.9020405@FreeBSD.org>
 <20120427.084414.1142593201575277510.hrs@allbsd.org>
 <4FD4AD29.3040204@FreeBSD.org>
In-Reply-To: <4FD4AD29.3040204@FreeBSD.org>
Content-Type: multipart/mixed; boundary="------------080701070400070005040601"
Cc: freebsd-ipfw@FreeBSD.org, delphij@freebsd.org,
 "freebsd-net@freebsd.org" <freebsd-net@FreeBSD.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 02 Dec 2012 00:48:43 -0000

This is a multi-part message in MIME format.
--------------080701070400070005040601
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

On 10.06.2012 18:20, Alexander V. Chernikov wrote:
> On 27.04.2012 03:44, Hiroki Sato wrote:
>> "Alexander V. Chernikov"<melifaro@FreeBSD.org> wrote
>> in<4F96E71B.9020405@FreeBSD.org>:
>>
>> me> On 24.04.2012 21:05, Hiroki Sato wrote:
>
> Proof-of-concept patch attached.

Hopefully, the libpcap code is easily extensible.
New version attached:
* BPF code is now able to use 'virtual' interfaces without a real ifnet
* New bpfattach3() / bpfdetach3() routines were added to attach and 
detach virtual ifaces
* New BIOCGIFLIST ioctl was added to let userland retrieve the 
available virtual interfaces
* A FreeBSD-specific 'platform_finddevs' version was added to the libpcap 
code (new file)
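
To make the BIOCGIFLIST reply format concrete, here is a rough userland 
sketch of how the records are laid out and walked. It does not call the 
ioctl; it builds a buffer by hand the same way bpf_getiflist() in the 
attached patch does (fixed header, NUL-terminated description, length 
rounded up to 4 bytes). struct ifreply is a local copy of the patch's 
struct bpf_ifreply, and record_len(), put_record() and walk_records() are 
names made up for this illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define BIFNAMSIZ 16

/* Local mirror of the patch's struct bpf_ifreply (assumed layout). */
struct ifreply {
	unsigned int	ifr_len;		/* total record length */
	unsigned int	ifr_spare[3];		/* spare data */
	char		ifr_name[BIFNAMSIZ + 1];	/* interface name */
	char		ifr_descr[];		/* description (variable) */
};

#define HDRSZ	offsetof(struct ifreply, ifr_descr)

/* Header plus description and its NUL, rounded up to a 4-byte boundary. */
static size_t
record_len(const char *descr)
{
	size_t len = HDRSZ + strlen(descr) + 1;

	return ((len + 3) & ~(size_t)3);
}

/*
 * Append one record at offset off in a buffer of cap bytes.
 * Returns the new offset, or 0 if the record does not fit.
 */
static size_t
put_record(char *buf, size_t cap, size_t off, const char *name,
    const char *descr)
{
	struct ifreply hdr;
	size_t len = record_len(descr);

	assert(strlen(name) <= BIFNAMSIZ);
	if (off > cap || len > cap - off)
		return (0);

	memset(&hdr, 0, sizeof(hdr));
	hdr.ifr_len = (unsigned int)len;
	memcpy(hdr.ifr_name, name, strlen(name) + 1);
	memcpy(buf + off, &hdr, HDRSZ);
	memcpy(buf + off + HDRSZ, descr, strlen(descr) + 1);
	return (off + len);
}

/* Walk total bytes of records, print each one, return the record count. */
static int
walk_records(const char *buf, size_t total)
{
	struct ifreply hdr;
	size_t off = 0;
	int n = 0;

	while (off + HDRSZ <= total) {
		memcpy(&hdr, buf + off, HDRSZ);
		/* Stop on a malformed length rather than overrun the buffer */
		if (hdr.ifr_len < HDRSZ + 1 || hdr.ifr_len > total - off)
			break;
		hdr.ifr_name[BIFNAMSIZ] = '\0';
		printf("%s (%.*s)\n", hdr.ifr_name,
		    (int)(hdr.ifr_len - HDRSZ), buf + off + HDRSZ);
		off += hdr.ifr_len;
		n++;
	}
	return (n);
}
```

For example, after put_record(buf, sizeof(buf), 0, "ipfw0", "ipfw log 
interface"), calling walk_records() over the returned length prints a 
single line, "ipfw0 (ipfw log interface)". Copying the header out with 
memcpy() instead of casting the buffer pointer is a choice made for this 
sketch so that it does not depend on buffer alignment.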

There are some rough edges (conditional code in pcap-bpf.c, lack of 
documentation, maybe some style issues), but generally it seems to work 
and does not interfere with contrib/ code much (from my point of view).

The ipfw log device was converted to use the new bpf(4) API; see the 
attached patch.


Small example:

4:17 [0] zfscurr0# tcpdump -D
1.em0
2.em1
3.lo0
4:17 [0] zfscurr0# kldload ipfw
4:17 [0] zfscurr0# ifconfig -l
em0 em1 lo0
4:17 [0] zfscurr0# tcpdump -D
1.em0
2.ipfw0 (ipfw log interface)
3.em1
4.lo0

4:40 [1] zfscurr0# ipfw add 100 count log logamount 0 ip from any to any
00100 count log ip from any to any

4:40 [0] zfscurr0# tcpdump -i ipfw0 -lns0
tcpdump: WARNING: SIOCGIFADDR: ipfw0: Device not configured
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ipfw0, link-type EN10MB (Ethernet), capture size 65535 bytes
04:41:27.233653 IP 10.0.0.92.22 > 10.0.0.5.59076: Flags [P.], seq 
2783103749:2783103941, ack 3836123088, win 1040, options [nop,nop,TS val 
1668094903 ecr 564715671], length 192
04:41:27.233678 IP 10.0.0.5.59076 > 10.0.0.92.22: Flags [.], ack 0, win 
1039, options [nop,nop,TS val 564715680 ecr 1668094903], length 0


Btw, do we still need the warning about the lack of an IPv4 address?

>
> Unfortunately, there are problems with this approach, too.
>
> pcap_findalldevs() uses external to BPF method (possibly NET_RT_IFLIST),
> so programs relying on that function for showing some kind of combo-box
> (like wireshark) with all possible variant won't allow user to specify
> such interface.
>
> Additionally, tcpdump assumes that passed interface name is real and
> warns us that SIOCGIFADDR returns error.
>
>
>>
>> -- Hiroki
>


--------------080701070400070005040601
Content-Type: text/plain;
 name="bpf_virtual.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="bpf_virtual.diff"

Index: lib/libpcap/Makefile
===================================================================
--- lib/libpcap/Makefile	(revision 243778)
+++ lib/libpcap/Makefile	(working copy)
@@ -6,7 +6,7 @@ SHLIBDIR?= /lib
 .include <bsd.own.mk>
 
 LIB=	pcap
-SRCS=	grammar.y tokdefs.h version.h pcap-bpf.c \
+SRCS=	grammar.y tokdefs.h version.h pcap-bpf.c pcap-freebsd.c \
 	pcap.c pcap-common.c inet.c fad-getad.c gencode.c optimize.c nametoaddr.c \
 	etherent.c savefile.c bpf_filter.c bpf_image.c bpf_dump.c \
 	scanner.l sf-pcap.c sf-pcap-ng.c version.c
Index: sys/net/bpf.c
===================================================================
--- sys/net/bpf.c	(revision 243778)
+++ sys/net/bpf.c	(working copy)
@@ -151,6 +151,7 @@ static void	bpf_detachd_locked(struct bpf_d *);
 static void	bpf_freed(struct bpf_d *);
 static int	bpf_movein(struct uio *, int, struct ifnet *, struct mbuf **,
 		    struct sockaddr *, int *, struct bpf_insn *);
+static int	bpf_getiflist(struct bpf_d *, struct bpf_iflist *);
 static int	bpf_setif(struct bpf_d *, struct ifreq *);
 static void	bpf_timed_out(void *);
 static __inline void
@@ -654,7 +655,7 @@ bpf_attachd(struct bpf_d *d, struct bpf_if *bp)
 	CTR3(KTR_NET, "%s: bpf_attach called by pid %d, adding to %s list",
 	    __func__, d->bd_pid, d->bd_writer ? "writer" : "active");
 
-	if (op_w == 0)
+	if ((op_w == 0) && (!BPF_IS_VIRTUAL(bp)))
 		EVENTHANDLER_INVOKE(bpf_track, bp->bif_ifp, bp->bif_dlt, 1);
 }
 
@@ -696,7 +697,8 @@ bpf_upgraded(struct bpf_d *d)
 
 	CTR2(KTR_NET, "%s: upgrade required by pid %d", __func__, d->bd_pid);
 
-	EVENTHANDLER_INVOKE(bpf_track, bp->bif_ifp, bp->bif_dlt, 1);
+	if (!BPF_IS_VIRTUAL(bp))
+		EVENTHANDLER_INVOKE(bpf_track, bp->bif_ifp, bp->bif_dlt, 1);
 }
 
 /*
@@ -743,6 +745,10 @@ bpf_detachd_locked(struct bpf_d *d)
 
 	bpf_bpfd_cnt--;
 
+	/* Nothing to do for fake interfaces */
+	if (BPF_IS_VIRTUAL(bp))
+		return;
+
 	/* Call event handler iff d is attached */
 	if (error == 0)
 		EVENTHANDLER_INVOKE(bpf_track, ifp, bp->bif_dlt, 0);
@@ -1037,7 +1043,11 @@ bpfwrite(struct cdev *dev, struct uio *uio, int io
 		return (ENXIO);
 	}
 
-	ifp = d->bd_bif->bif_ifp;
+	/* XXX: Writing to fake interfaces is not supported */
+	if ((ifp = d->bd_bif->bif_ifp) == NULL) {
+		d->bd_wdcount++;
+		return (ENXIO);
+	}
 
 	if ((ifp->if_flags & IFF_UP) == 0) {
 		d->bd_wdcount++;
@@ -1266,10 +1276,17 @@ bpfioctl(struct cdev *dev, u_long cmd, caddr_t add
 		{
 			struct ifnet *ifp;
 
-			if (d->bd_bif == NULL)
+			/*
+			 * Lock d since another thread can do a reattach,
+			 * causing d->bd_bif to be set to NULL
+			 */
+			BPFD_LOCK(d);
+			if ((d->bd_bif == NULL) || (BPF_IS_VIRTUAL(d->bd_bif))) {
 				error = EINVAL;
-			else {
+				BPFD_UNLOCK(d);
+			} else {
 				ifp = d->bd_bif->bif_ifp;
+				BPFD_UNLOCK(d);
 				error = (*ifp->if_ioctl)(ifp, cmd, addr);
 			}
 			break;
@@ -1325,6 +1342,13 @@ bpfioctl(struct cdev *dev, u_long cmd, caddr_t add
 			error = EINVAL;
 			break;
 		}
+
+		if (BPF_IS_VIRTUAL(d->bd_bif)) {
+			/* Silently ignore fake interfaces */
+			error = 0;
+			break;
+		}
+
 		if (d->bd_promisc == 0) {
 			error = ifpromisc(d->bd_bif->bif_ifp, 1);
 			if (error == 0)
@@ -1390,6 +1414,12 @@ bpfioctl(struct cdev *dev, u_long cmd, caddr_t add
 		BPF_UNLOCK();
 		break;
 
+	case BIOCGIFLIST:
+		BPF_LOCK();
+		error = bpf_getiflist(d, (struct bpf_iflist *)addr);
+		BPF_UNLOCK();
+		break;
+
 	/*
 	 * Get interface name.
 	 */
@@ -1401,7 +1431,8 @@ bpfioctl(struct cdev *dev, u_long cmd, caddr_t add
 			struct ifnet *const ifp = d->bd_bif->bif_ifp;
 			struct ifreq *const ifr = (struct ifreq *)addr;
 
-			strlcpy(ifr->ifr_name, ifp->if_xname,
+			strlcpy(ifr->ifr_name, BPF_IS_VIRTUAL(d->bd_bif) ?
+			    d->bd_bif->ifname : ifp->if_xname,
 			    sizeof(ifr->ifr_name));
 		}
 		BPF_UNLOCK();
@@ -1701,6 +1732,7 @@ bpfioctl(struct cdev *dev, u_long cmd, caddr_t add
 		break;
 	}
 	CURVNET_RESTORE();
+
 	return (error);
 }
 
@@ -1834,6 +1866,55 @@ bpf_setf(struct bpf_d *d, struct bpf_program *fp,
 }
 
 /*
+ * Get a list of available virtual interfaces
+ */
+static int
+bpf_getiflist(struct bpf_d *d, struct bpf_iflist *ifl)
+{
+	int len, tot_len, error;
+	struct bpf_if *bp;
+	struct bpf_ifreply ifr;
+	char *buffer;
+
+	BPF_LOCK_ASSERT();
+
+	tot_len = 0;
+	error = 0;
+	buffer = ifl->ifl_list;
+	LIST_FOREACH(bp, &bpf_iflist, bif_next) {
+		if (!BPF_IS_VIRTUAL(bp))
+			continue;
+
+		/* Count total length */
+		len = offsetof(struct bpf_ifreply, ifr_descr) +
+		    strlen(bp->ifdescr) + 1;
+		/* Align on 4-byte boundary */
+		len = roundup2(len, 4);
+
+		if (buffer != NULL) {
+			if (tot_len + len >= ifl->ifl_len)
+				return (ENOMEM);
+
+			/* Fill in interface record */
+			memset(&ifr, 0, sizeof(ifr));
+			ifr.ifr_len = len;
+			strlcpy(ifr.ifr_name, bp->ifname, IFNAMSIZ + 1);
+
+			copyout(&ifr, buffer, sizeof(ifr));
+			/* Write interface description */
+			error = copyout(bp->ifdescr,
+			    buffer + offsetof(struct bpf_ifreply, ifr_descr),
+			    strlen(bp->ifdescr) + 1);
+
+			buffer += len;
+		}
+		tot_len += len;
+	}
+	ifl->ifl_len = tot_len;
+	return (error);
+}
+
+/*
  * Detach a file from its current interface (if attached at all) and attach
  * to the interface indicated by the name stored in ifr.
  * Return an errno or 0.
@@ -1847,10 +1928,19 @@ bpf_setif(struct bpf_d *d, struct ifreq *ifr)
 	BPF_LOCK_ASSERT();
 
 	theywant = ifunit(ifr->ifr_name);
-	if (theywant == NULL || theywant->if_bpf == NULL)
-		return (ENXIO);
+	if (theywant == NULL || theywant->if_bpf == NULL) {
+		/* Check for fake interface existence */
+		LIST_FOREACH(bp, &bpf_iflist, bif_next) {
+			if (!BPF_IS_VIRTUAL(bp))
+				continue;
+			if (!strcmp(bp->ifname, ifr->ifr_name))
+				break;
+		}
 
-	bp = theywant->if_bpf;
+		if (bp == NULL)
+			return (ENXIO);
+	} else
+		bp = theywant->if_bpf;
 
 	/* Check if interface is not being detached from BPF */
 	BPFIF_RLOCK(bp);
@@ -2075,7 +2165,8 @@ bpf_tap(struct bpf_if *bp, u_char *pkt, u_int pktl
 			if (gottime < bpf_ts_quality(d->bd_tstamp))
 				gottime = bpf_gettime(&bt, d->bd_tstamp, NULL);
 #ifdef MAC
-			if (mac_bpfdesc_check_receive(d, bp->bif_ifp) == 0)
+			if (BPF_IS_VIRTUAL(bp) ||
+				(mac_bpfdesc_check_receive(d, bp->bif_ifp) == 0))
 #endif
 				catchpacket(d, pkt, pktlen, slen,
 				    bpf_append_bytes, &bt);
@@ -2085,6 +2176,7 @@ bpf_tap(struct bpf_if *bp, u_char *pkt, u_int pktl
 	BPFIF_RUNLOCK(bp);
 }
 
+/* Note i CAN be NULL */
 #define	BPF_CHECK_DIRECTION(d, r, i)				\
 	    (((d)->bd_direction == BPF_D_IN && (r) != (i)) ||	\
 	    ((d)->bd_direction == BPF_D_OUT && (r) == (i)))
@@ -2134,7 +2226,8 @@ bpf_mtap(struct bpf_if *bp, struct mbuf *m)
 			if (gottime < bpf_ts_quality(d->bd_tstamp))
 				gottime = bpf_gettime(&bt, d->bd_tstamp, m);
 #ifdef MAC
-			if (mac_bpfdesc_check_receive(d, bp->bif_ifp) == 0)
+			if ((BPF_IS_VIRTUAL(bp)) ||
+				(mac_bpfdesc_check_receive(d, bp->bif_ifp) == 0))
 #endif
 				catchpacket(d, (u_char *)m, pktlen, slen,
 				    bpf_append_mbuf, &bt);
@@ -2190,7 +2283,8 @@ bpf_mtap2(struct bpf_if *bp, void *data, u_int dle
 			if (gottime < bpf_ts_quality(d->bd_tstamp))
 				gottime = bpf_gettime(&bt, d->bd_tstamp, m);
 #ifdef MAC
-			if (mac_bpfdesc_check_receive(d, bp->bif_ifp) == 0)
+			if ((BPF_IS_VIRTUAL(bp)) ||
+				(mac_bpfdesc_check_receive(d, bp->bif_ifp) == 0))
 #endif
 				catchpacket(d, (u_char *)&mb, pktlen, slen,
 				    bpf_append_mbuf, &bt);
@@ -2484,6 +2578,45 @@ bpfattach2(struct ifnet *ifp, u_int dlt, u_int hdr
 }
 
 /*
+ * Attach fake interface to bpf. ifname is interface name to be attached,
+ * dlt is the link layer type, and hdrlen is the fixed size of the link header
+ * (variable length headers are not yet supported).
+ */
+void
+bpfattach3(char *ifname, char *ifdescr, u_int dlt, u_int hdrlen, struct bpf_if **driverp)
+{
+	struct bpf_if *bp;
+	int len;
+
+	len = strlen(ifdescr) + 1;
+
+	/* Assume bpf_if to be properly aligned */
+	bp = malloc(sizeof(*bp) + len, M_BPF, M_NOWAIT | M_ZERO);
+	if (bp == NULL)
+		panic("bpfattach");
+
+	LIST_INIT(&bp->bif_dlist);
+	LIST_INIT(&bp->bif_wlist);
+	strlcpy(bp->ifname, ifname, IFNAMSIZ + 1);
+	bp->ifdescr = (char *)(bp + 1);
+	strlcpy(bp->ifdescr, ifdescr, len);
+	bp->bif_dlt = dlt;
+	rw_init(&bp->bif_lock, "bpf interface lock");
+	KASSERT(*driverp == NULL, ("bpfattach3: driverp already initialized"));
+	*driverp = bp;
+
+	BPF_LOCK();
+	LIST_INSERT_HEAD(&bpf_iflist, bp, bif_next);
+	BPF_UNLOCK();
+
+	bp->bif_hdrlen = hdrlen;
+
+	if (bootverbose)
+		printf("%s: bpf attached\n", bp->ifname);
+}
+
+
+/*
  * Detach bpf from an interface. This involves detaching each descriptor
  * associated with the interface. Notify each descriptor as it's detached
  * so that any sleepers wake up and get ENXIO.
@@ -2546,6 +2679,54 @@ bpfdetach(struct ifnet *ifp)
 }
 
 /*
+ * Detach bpf from the fake interface. This involves detaching each descriptor
+ * associated with the interface. Notify each descriptor as it's detached
+ * so that any sleepers wake up and get ENXIO.
+ */
+void
+bpfdetach3(char *ifname)
+{
+	struct bpf_if	*bp;
+	struct bpf_d	*d;
+
+	BPF_LOCK();
+	/* Find the virtual bpf_if struct with a matching name. */
+	LIST_FOREACH(bp, &bpf_iflist, bif_next) {
+		if (!BPF_IS_VIRTUAL(bp))
+			continue;
+		if (!strcmp(bp->ifname, ifname))
+			break;
+	}
+
+	if (bp != NULL)
+		LIST_REMOVE(bp, bif_next);
+	
+	BPF_UNLOCK();
+
+	if (bp != NULL) {
+		while ((d = LIST_FIRST(&bp->bif_dlist)) != NULL) {
+			bpf_detachd_locked(d);
+			BPFD_LOCK(d);
+			bpf_wakeup(d);
+			BPFD_UNLOCK(d);
+		}
+		/* Free writer-only descriptors */
+		while ((d = LIST_FIRST(&bp->bif_wlist)) != NULL) {
+			bpf_detachd_locked(d);
+			BPFD_LOCK(d);
+			bpf_wakeup(d);
+			BPFD_UNLOCK(d);
+		}
+
+		/*
+		 * Since this interface is fake we can free our
+		 * structure immediately.
+		 */
+		rw_destroy(&bp->bif_lock);
+		free(bp, M_BPF);
+	}
+}
+/*
  * Interface departure handler.
  * Note departure event does not guarantee interface is going down.
  */
@@ -2594,6 +2775,9 @@ bpf_getdltlist(struct bpf_d *d, struct bpf_dltlist
 	LIST_FOREACH(bp, &bpf_iflist, bif_next) {
 		if (bp->bif_ifp != ifp)
 			continue;
+		/* Compare fake interfaces by name */
+		if ((ifp == NULL) && (strcmp(d->bd_bif->ifname, bp->ifname)))
+			continue;
 		if (bfl->bfl_list != NULL) {
 			if (n >= bfl->bfl_len)
 				return (ENOMEM);
@@ -2623,7 +2807,13 @@ bpf_setdlt(struct bpf_d *d, u_int dlt)
 	ifp = d->bd_bif->bif_ifp;
 
 	LIST_FOREACH(bp, &bpf_iflist, bif_next) {
-		if (bp->bif_ifp == ifp && bp->bif_dlt == dlt)
+		if (bp->bif_ifp != ifp)
+			continue;
+
+		if ((ifp == NULL) && strcmp(d->bd_bif->ifname, bp->ifname))
+			continue;
+
+		if (bp->bif_dlt == dlt)
 			break;
 	}
 
@@ -2718,8 +2908,10 @@ bpfstats_fill_xbpf(struct xbpf_d *d, struct bpf_d
 	d->bd_hlen = bd->bd_hlen;
 	d->bd_bufsize = bd->bd_bufsize;
 	d->bd_pid = bd->bd_pid;
-	strlcpy(d->bd_ifname,
-	    bd->bd_bif->bif_ifp->if_xname, IFNAMSIZ);
+	if (!BPF_IS_VIRTUAL(bd->bd_bif))
+		strlcpy(d->bd_ifname, bd->bd_bif->bif_ifp->if_xname, IFNAMSIZ);
+	else
+		strlcpy(d->bd_ifname, bd->bd_bif->ifname, IFNAMSIZ);
 	d->bd_locked = bd->bd_locked;
 	d->bd_wcount = bd->bd_wcount;
 	d->bd_wdcount = bd->bd_wdcount;
Index: sys/net/bpf.h
===================================================================
--- sys/net/bpf.h	(revision 243778)
+++ sys/net/bpf.h	(working copy)
@@ -147,6 +147,7 @@ struct bpf_zbuf {
 #define	BIOCSETFNR	_IOW('B', 130, struct bpf_program)
 #define	BIOCGTSTAMP	_IOR('B', 131, u_int)
 #define	BIOCSTSTAMP	_IOW('B', 132, u_int)
+#define	BIOCGIFLIST	_IOWR('B', 133, struct bpf_iflist)
 
 /* Obsolete */
 #define	BIOCGSEESENT	BIOCGDIRECTION
@@ -1224,6 +1225,25 @@ struct bpf_dltlist {
 	u_int	*bfl_list;	/* array of DLTs */
 };
 
+#define	BIFNAMSIZ	16
+#if !defined(_KERNEL) || defined(BPF_INTERNAL)
+/*
+ * Structure to retrieve virtual BPF interfaces.
+ */
+struct bpf_iflist {
+	u_int	ifl_len;	/* total memory size */
+	u_int	ifl_ver;	/* version (set to 0) */
+	char	*ifl_list;	/* array of interfaces */
+};
+
+struct bpf_ifreply {
+	u_int	ifr_len;	/* Total record length */
+	u_int	ifr_spare[3];	/* Spare data */
+	char	ifr_name[BIFNAMSIZ + 1];	/* Interface name */
+	char	ifr_descr[0];	/* Interface description (variable) */
+};
+#endif
+
 #ifdef _KERNEL
 #ifdef MALLOC_DECLARE
 MALLOC_DECLARE(M_BPF);
@@ -1262,6 +1282,8 @@ struct bpf_if {
 	struct rwlock bif_lock;		/* interface lock */
 	LIST_HEAD(, bpf_d)	bif_wlist;	/* writer-only list */
 	int flags;			/* Interface flags */
+	char ifname[IFNAMSIZ + 1];	/* Virtual interface name */
+	char *ifdescr;			/* Virtual interface description */
 #endif
 };
 
@@ -1272,7 +1294,9 @@ void	 bpf_mtap(struct bpf_if *, struct mbuf *);
 void	 bpf_mtap2(struct bpf_if *, void *, u_int, struct mbuf *);
 void	 bpfattach(struct ifnet *, u_int, u_int);
 void	 bpfattach2(struct ifnet *, u_int, u_int, struct bpf_if **);
+void	 bpfattach3(char *, char *, u_int, u_int, struct bpf_if **);
 void	 bpfdetach(struct ifnet *);
+void	 bpfdetach3(char *);
 
 void	 bpfilterattach(int);
 u_int	 bpf_filter(const struct bpf_insn *, u_char *, u_int, u_int);
Index: sys/net/bpfdesc.h
===================================================================
--- sys/net/bpfdesc.h	(revision 243778)
+++ sys/net/bpfdesc.h	(working copy)
@@ -102,6 +102,8 @@ struct bpf_d {
 	u_char		bd_compat32;	/* 32-bit stream on LP64 system */
 };
 
+#define	BPF_IS_VIRTUAL(x)	((x)->bif_ifp == NULL)
+
 /* Values for bd_state */
 #define BPF_IDLE	0		/* no select in progress */
 #define BPF_WAITING	1		/* waiting for read timeout in select */
Index: sys/netpfil/ipfw/ip_fw_log.c
===================================================================
--- sys/netpfil/ipfw/ip_fw_log.c	(revision 243778)
+++ sys/netpfil/ipfw/ip_fw_log.c	(working copy)
@@ -93,142 +93,31 @@ ipfw_log_bpf(int onoff)
 {
 }
 #else /* !WITHOUT_BPF */
-static struct ifnet *log_if;	/* hook to attach to bpf */
-static struct rwlock log_if_lock;
-#define	LOGIF_LOCK_INIT(x)	rw_init(&log_if_lock, "ipfw log_if lock")
-#define	LOGIF_LOCK_DESTROY(x)	rw_destroy(&log_if_lock)
-#define	LOGIF_RLOCK(x)		rw_rlock(&log_if_lock)
-#define	LOGIF_RUNLOCK(x)	rw_runlock(&log_if_lock)
-#define	LOGIF_WLOCK(x)		rw_wlock(&log_if_lock)
-#define	LOGIF_WUNLOCK(x)	rw_wunlock(&log_if_lock)
+static struct bpf_if *log_bpfif = NULL;	/* hook to attach to bpf */
+#define BPF_IFNAME	"ipfw0"
+#define	IPFW_MTAP(_if_bpf,_data,_dlen,_m) do {			\
+	if (bpf_peers_present(_if_bpf)) {		\
+		M_ASSERTVALID(_m);				\
+		bpf_mtap2((_if_bpf),(_data),(_dlen),(_m));	\
+	}							\
+} while (0)
 
-static const char ipfwname[] = "ipfw";
-
-/* we use this dummy function for all ifnet callbacks */
-static int
-log_dummy(struct ifnet *ifp, u_long cmd, caddr_t addr)
-{
-	return EINVAL;
-}
-
-static int
-ipfw_log_output(struct ifnet *ifp, struct mbuf *m,
-	struct sockaddr *dst, struct route *ro)
-{
-	if (m != NULL)
-		FREE_PKT(m);
-	return EINVAL;
-}
-
-static void
-ipfw_log_start(struct ifnet* ifp)
-{
-	panic("ipfw_log_start() must not be called");
-}
-
 static const u_char ipfwbroadcastaddr[6] =
 	{ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
 
-static int
-ipfw_log_clone_match(struct if_clone *ifc, const char *name)
-{
-
-	return (strncmp(name, ipfwname, sizeof(ipfwname) - 1) == 0);
-}
-
-static int
-ipfw_log_clone_create(struct if_clone *ifc, char *name, size_t len,
-    caddr_t params)
-{
-	int error;
-	int unit;
-	struct ifnet *ifp;
-
-	error = ifc_name2unit(name, &unit);
-	if (error)
-		return (error);
-
-	error = ifc_alloc_unit(ifc, &unit);
-	if (error)
-		return (error);
-
-	ifp = if_alloc(IFT_PFLOG);
-	if (ifp == NULL) {
-		ifc_free_unit(ifc, unit);
-		return (ENOSPC);
-	}
-	ifp->if_dname = ipfwname;
-	ifp->if_dunit = unit;
-	snprintf(ifp->if_xname, IFNAMSIZ, "%s%d", ipfwname, unit);
-	strlcpy(name, ifp->if_xname, len);
-	ifp->if_mtu = 65536;
-	ifp->if_flags = IFF_UP | IFF_SIMPLEX | IFF_MULTICAST;
-	ifp->if_init = (void *)log_dummy;
-	ifp->if_ioctl = log_dummy;
-	ifp->if_start = ipfw_log_start;
-	ifp->if_output = ipfw_log_output;
-	ifp->if_addrlen = 6;
-	ifp->if_hdrlen = 14;
-	ifp->if_broadcastaddr = ipfwbroadcastaddr;
-	ifp->if_baudrate = IF_Mbps(10);
-
-	LOGIF_WLOCK();
-	if (log_if == NULL)
-		log_if = ifp;
-	else {
-		LOGIF_WUNLOCK();
-		if_free(ifp);
-		ifc_free_unit(ifc, unit);
-		return (EEXIST);
-	}
-	LOGIF_WUNLOCK();
-	if_attach(ifp);
-	bpfattach(ifp, DLT_EN10MB, 14);
-
-	return (0);
-}
-
-static int
-ipfw_log_clone_destroy(struct if_clone *ifc, struct ifnet *ifp)
-{
-	int unit;
-
-	if (ifp == NULL)
-		return (0);
-
-	LOGIF_WLOCK();
-	if (log_if != NULL && ifp == log_if)
-		log_if = NULL;
-	else {
-		LOGIF_WUNLOCK();
-		return (EINVAL);
-	}
-	LOGIF_WUNLOCK();
-
-	unit = ifp->if_dunit;
-	bpfdetach(ifp);
-	if_detach(ifp);
-	if_free(ifp);
-	ifc_free_unit(ifc, unit);
-
-	return (0);
-}
-
-static struct if_clone *ipfw_log_cloner;
-
 void
 ipfw_log_bpf(int onoff)
 {
 
-	if (onoff) {
-		LOGIF_LOCK_INIT();
-		ipfw_log_cloner = if_clone_advanced(ipfwname, 0,
-		    ipfw_log_clone_match, ipfw_log_clone_create,
-		    ipfw_log_clone_destroy);
-	} else {
-		if_clone_detach(ipfw_log_cloner);
-		LOGIF_LOCK_DESTROY();
-	}
+	if (onoff) {
+		if (log_bpfif)
+			return;
+		bpfattach3(BPF_IFNAME, "ipfw log interface", DLT_EN10MB, 14, &log_bpfif);
+	} else {
+		if (log_bpfif != NULL)
+			bpfdetach3(BPF_IFNAME);
+		log_bpfif = NULL;
+	}
 }
 #endif /* !WITHOUT_BPF */
 
@@ -247,20 +136,18 @@ ipfw_log(struct ip_fw *f, u_int hlen, struct ip_fw
 
 	if (V_fw_verbose == 0) {
 #ifndef WITHOUT_BPF
-		LOGIF_RLOCK();
-		if (log_if == NULL || log_if->if_bpf == NULL) {
-			LOGIF_RUNLOCK();
+		if (log_bpfif == NULL)
 			return;
-		}
 
 		if (args->eh) /* layer2, use orig hdr */
-			BPF_MTAP2(log_if, args->eh, ETHER_HDR_LEN, m);
+			IPFW_MTAP(log_bpfif, args->eh, ETHER_HDR_LEN, m);
 		else
-			/* Add fake header. Later we will store
+			/*
+			 * Add fake header. Later we will store
 			 * more info in the header.
 			 */
-			BPF_MTAP2(log_if, "DDDDDDSSSSSS\x08\x00", ETHER_HDR_LEN, m);
-		LOGIF_RUNLOCK();
+			IPFW_MTAP(log_bpfif, "DDDDDDSSSSSS\x08\x00",
+			    ETHER_HDR_LEN, m);
 #endif /* !WITHOUT_BPF */
 		return;
 	}
Index: contrib/libpcap/pcap-bpf.c
===================================================================
--- contrib/libpcap/pcap-bpf.c	(revision 243778)
+++ contrib/libpcap/pcap-bpf.c	(working copy)
@@ -132,6 +132,8 @@ static int bpf_load(char *errbuf);
 #include "pcap-snf.h"
 #endif /* HAVE_SNF_API */
 
+#include "pcap-freebsd.h"
+
 #ifdef HAVE_OS_PROTO_H
 #include "os-proto.h"
 #endif
@@ -2311,6 +2313,8 @@ pcap_platform_finddevs(pcap_if_t **alldevsp, char
 	if (snf_platform_finddevs(alldevsp, errbuf) < 0)
 		return (-1);
 #endif /* HAVE_SNF_API */
+	if (freebsd_platform_finddevs(alldevsp, errbuf) < 0)
+		return (-1);
 
 	return (0);
 }
--- /dev/null	2012-12-02 04:22:01.000000000 +0400
+++ contrib/libpcap/pcap-freebsd.h	2012-12-02 02:50:44.251624161 +0400
@@ -0,0 +1 @@
+int freebsd_platform_finddevs(pcap_if_t **devlistp, char *errbuf);
--- /dev/null	2012-12-02 04:22:01.000000000 +0400
+++ contrib/libpcap/pcap-freebsd.c	2012-12-02 04:22:11.404710869 +0400
@@ -0,0 +1,138 @@
+/*
+ *  pcap-freebsd.c: Packet capture advanced interface to the FreeBSD kernel
+ *
+ *  License: BSD
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer. 
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. The name of the author may not be used to endorse or promote products
+ *    derived from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  
+ * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 
+ * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/ioctl.h>
+
+#include <net/if.h>
+#include <net/bpf.h>
+
+#include <ctype.h>
+#include <errno.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <fcntl.h>
+#include <unistd.h>
+
+#include "pcap-int.h"
+
+int
+freebsd_platform_finddevs(pcap_if_t **alldevsp, char *errbuf)
+{
+	int ret;
+
+	struct bpf_iflist ifl;
+	struct bpf_ifreply *ifr;
+	char *device = "/dev/bpf";
+	int fd, i, len, res;
+	caddr_t databuf;
+
+	if ((fd = open(device, O_RDWR)) == -1) {
+		snprintf(errbuf, PCAP_ERRBUF_SIZE,
+		    "(cannot open device) %s: %s",
+		    device, pcap_strerror(errno));
+
+		return (-1);
+	}
+
+	res = 0;
+
+	for (i = 0; i < 10; i++) {
+		/* Get size */
+		memset(&ifl, 0, sizeof(ifl));
+
+		if (ioctl(fd, BIOCGIFLIST, (caddr_t)&ifl) != 0) {
+			snprintf(errbuf, PCAP_ERRBUF_SIZE,
+			    "(cannot get interface list length): %s",
+			    pcap_strerror(errno));
+
+			close(fd);
+			return (-1);
+		}
+
+		/* Allocate requested length */
+		len = ifl.ifl_len + 1024;
+		databuf = calloc(1, len);
+
+		/* Try to read data */
+		ifl.ifl_list = databuf;
+		ifl.ifl_len = len;
+		
+		if (ioctl(fd, BIOCGIFLIST, (caddr_t)&ifl) != 0) {
+			if (errno == ENOMEM) {
+				/*
+				 * Probably new interface added.
+				 * Let's try another time.
+				 */
+				free(databuf);
+				databuf = NULL;
+				ifl.ifl_len = 0;
+				continue;
+			}
+
+			snprintf(errbuf, PCAP_ERRBUF_SIZE,
+			    "(cannot read interface list): %s",
+			    pcap_strerror(errno));
+			free(databuf);
+			close(fd);
+			return (-1);
+		}
+
+		res = 1;
+		break;
+	}
+
+	close(fd);
+
+	if (res == 0) {
+		snprintf(errbuf, PCAP_ERRBUF_SIZE,
+		    "(error reading interface list): retries exceeded");
+		return (-1);
+	}
+
+	/* Parse the variable-length records, then release the buffer. */
+	for (res = 0, len = 0; len < ifl.ifl_len; len += ifr->ifr_len) {
+		ifr = (struct bpf_ifreply *)&databuf[len];
+
+		if (pcap_add_if(alldevsp, ifr->ifr_name, 0,
+		    ifr->ifr_descr, errbuf) < 0) {
+			res = -1;
+			break;
+		}
+	}
+	free(databuf);
+	return (res);
+}
+

--------------080701070400070005040601--

From owner-freebsd-net@FreeBSD.ORG  Sun Dec  2 03:38:26 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 9DCD2364
 for <freebsd-net@freebsd.org>; Sun,  2 Dec 2012 03:38:26 +0000 (UTC)
 (envelope-from moonlightakkiy@yahoo.ca)
Received: from nm14-vm0.bullet.mail.bf1.yahoo.com
 (nm14-vm0.bullet.mail.bf1.yahoo.com [98.139.213.164])
 by mx1.freebsd.org (Postfix) with ESMTP id 01B7C8FC0C
 for <freebsd-net@freebsd.org>; Sun,  2 Dec 2012 03:38:25 +0000 (UTC)
Received: from [98.139.215.141] by nm14.bullet.mail.bf1.yahoo.com with NNFMP;
 02 Dec 2012 03:38:24 -0000
Received: from [98.139.213.12] by tm12.bullet.mail.bf1.yahoo.com with NNFMP;
 02 Dec 2012 03:38:24 -0000
Received: from [127.0.0.1] by smtp112.mail.bf1.yahoo.com with NNFMP;
 02 Dec 2012 03:38:24 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.ca; s=s1024;
 t=1354419504; bh=gDgQb+NYGEbZG5K2h4Hfjk2kihIV/REgvVGw7DKkxic=;
 h=X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:Received:Received:MIME-Version:Received:Received:In-Reply-To:References:Date:Message-ID:Subject:From:To:Cc:Content-Type;
 b=NtV4e6RI5Htm0wV4/08YsYFB+KeD1IEp9WH61uqhjRKGW1BUriPp6w5ScJi/33XQW1A33OmXcLLRCAg6EMo/BEbZLhYBrbe7FBYePRycljFdrOEN2IVfO9J8j0lir5b15Tg5mPPXm3iqqeOwh7efXxzWseMMD9zm6FUyzoQBy78=
X-Yahoo-Newman-Id: 713105.73332.bm@smtp112.mail.bf1.yahoo.com
X-Yahoo-Newman-Property: ymail-3
X-YMail-OSG: ezmBgt8VM1kDHqNXfrVqnXF5lJw_JZXFwbqQszOidZgXG8k
 w0BDIIdHfRZaHHw3vccIm44_yQg50qbTMr79llAQuOT2O3yeabWTZEuonxYo
 dK58sAw3HQLU2iks5Q9KYiZsF0iltq1IW48fNJT5.TslSAloWwlt5CojYzFI
 lyxtfdFEkr.QWgTDH0UoTbfIws9q7343c8MO.OmeOvPkhcOoFa3Zb87Z6nfb
 TOnoXucSoMCFswFXyjztQwNH2L7MvkwQeUMC2Uj4zPn6H_WJqhfefUUZPVi4
 QR9ha.A.phZkuBpXh09YaBCMla3OIjP2WJBrnnrmW9hY21TY7WO3bwJcYhiX
 KDu.OnS4mAHyvxtYiNoyl2qZD85vKgEkl_kL_.uNBl6Hr2P4.Oplw42l3uhP
 fXZPdqEfSID5olHOe361SnT8H8G0bWNJqAkIpMtZAjuvM.K2nJMRAfEdjfPm
 Afx.PmQaRgZOXAYr4vLBuFs0n5.ZZHpC.nsGnfWqBsMck8MU4F_Lu6_bm93c
 Z6VTj5PYNI20U0BVC5HHlh_DRb85rRrM53cUd87X.Y6aQoEit_DB8soxamZ7
 eyhb64ykgz5H90iQLN5FQ7oWqaYTO6P3oumy1miedW5ZqjFEB2Jyt3klRDg- -
X-Yahoo-SMTP: Xr6qjFWswBAEmd20sAvB4Q3keqXvXsIH9TjJ
Received: from mail-vc0-f182.google.com (moonlightakkiy@209.85.220.182 with
 plain)
 by smtp112.mail.bf1.yahoo.com with SMTP; 01 Dec 2012 19:38:24 -0800 PST
Received: by mail-vc0-f182.google.com with SMTP id fo14so1112501vcb.13
 for <freebsd-net@freebsd.org>; Sat, 01 Dec 2012 19:38:24 -0800 (PST)
MIME-Version: 1.0
Received: by 10.58.12.231 with SMTP id b7mr5401465vec.31.1354419504218; Sat,
 01 Dec 2012 19:38:24 -0800 (PST)
Received: by 10.58.182.72 with HTTP; Sat, 1 Dec 2012 19:38:23 -0800 (PST)
In-Reply-To: <CAK=C58K51H9g+vhuzurnQUv-ZxP4PCFpTHpnMsx7tKYcHcgJYg@mail.gmail.com>
References: <CAFZ_MYKUAtz=Lem-LQsC_Jgw6zzWPx6EibxJqrCB32faWx8PVA@mail.gmail.com>
 <CAK=C58+3bnU=_jqtitT8cauO9_CSuvAhMpp7VAgCp4h6EBhXyQ@mail.gmail.com>
 <CAFZ_MY+m=YNMC6iODQuSkEt7-UU=hAEgXLxqgiCoO+gtu3O+HA@mail.gmail.com>
 <CAK=C58+GPSTRZRWPE-c-9vwAECoV5pG8a7P7bv4uqDi_hqiNFA@mail.gmail.com>
 <CAFZ_MYKDGDHDi7iax5gR2trJ9expv-km9=cwCgKNHZF2h8ip2Q@mail.gmail.com>
 <CAK=C58K51H9g+vhuzurnQUv-ZxP4PCFpTHpnMsx7tKYcHcgJYg@mail.gmail.com>
Date: Sat, 1 Dec 2012 20:38:23 -0700
Message-ID: <CAFZ_MYLo4KjQaHSWs1L3LYqh4OcnBZj6yeRKBDpUT8J4msDs0A@mail.gmail.com>
Subject: Re: Ralink RT2860 Driver Code
From: PseudoCylon <moonlightakkiy@yahoo.ca>
To: Ramanujan Seshadri <ksramanujan@gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 02 Dec 2012 03:38:26 -0000

On Sat, Dec 1, 2012 at 3:08 PM, Ramanujan Seshadri
<ksramanujan@gmail.com> wrote:
> Hello,
>   Thanks for the explanation. When I saw the code I thought the same,
> but when I tried to print out the transmitted A-MPDUs I found something
> different.
>
> For example, the counter AggSize15Count should only show multiples of 15,
> but sometimes it doesn't. So my understanding is that the MPDUs are
> written into these registers, and an A-MPDU is formed only when there are
> enough MPDUs. For example, AggSize15Count sometimes shows 54, so there
> would be only (54/15) == 3 A-MPDUs (remainder 9).
>
> But then I am not sure what happens to the remaining 9 MPDUs. Does the
> register wait for 6 more MPDUs so that it can aggregate 15 MPDUs into one
> A-MPDU, or are they written to a different register, such as
> AggSize9Count, where these 9 MPDUs can be aggregated into an A-MPDU?
>
> Can you please explain ?

Maybe re-transmitted packets were counted multiple times.

If you need to know exactly what is going on, you have to figure it out
yourself, e.g. by reading the BA (block ack) packets or by checking what
the other end is receiving.

Unfortunately, this is what you have to do when you are writing a
driver without proper documentation.
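
To make the arithmetic in this thread concrete, here is a minimal sketch
in C. It is not Ralink's code: the helper names are hypothetical, and it
assumes (as interpreted earlier in the thread, without a datasheet) that
each register value is already in host byte order, that the low 16 bits
hold the counter at the lower offset, and that the counter for aggregate
size N counts MPDUs rather than A-MPDUs.

```c
#include <stdint.h>

/*
 * Split one 32-bit register value (host byte order) into its two
 * 16-bit counters. Assumption: low half == counter at the lower offset.
 */
static void
split_agg_reg(uint32_t reg, uint16_t *lo, uint16_t *hi)
{
	*lo = (uint16_t)(reg & 0xffff);
	*hi = (uint16_t)(reg >> 16);
}

/* Whole A-MPDUs implied by an MPDU counter for aggregate size agg_size. */
static uint32_t
ampdu_count(uint32_t mpdu_count, uint32_t agg_size)
{
	return (mpdu_count / agg_size);
}

/* MPDUs left over that do not fill a whole aggregate of that size. */
static uint32_t
ampdu_remainder(uint32_t mpdu_count, uint32_t agg_size)
{
	return (mpdu_count % agg_size);
}

/* For power-of-two sizes, a right shift is the same division. */
static uint32_t
ampdu_count_pow2(uint32_t mpdu_count, unsigned int log2_size)
{
	return (mpdu_count >> log2_size);
}
```

With the value from the question, ampdu_count(54, 15) gives 3 and
ampdu_remainder(54, 15) gives 9. A non-zero remainder is what you would
expect if the counter also includes MPDUs that were counted more than
once, but that is a guess, not something the sketch can prove.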


AK

>
> -Ram
>
> On Thu, Nov 29, 2012 at 3:35 AM, PseudoCylon <moonlightakkiy@yahoo.ca>
> wrote:
>>
>> On Wed, Nov 28, 2012 at 9:35 PM, Ramanujan Seshadri
>> <ksramanujan@gmail.com> wrote:
>> > Hello,
>> >
>> > Thanks for the reply. I just had one more doubt.
>> >
>> > In the counters to update the transmitted A-MPDU counter (Function Name:
>> > NICUpdateRawCounters),  i saw these lines of codes
>> >
>> > pRalinkCounters->TransmittedAMPDUCount.u.LowPart +=
>> > TxAggCnt0.field.AggSize1Count;
>> > pRalinkCounters->TransmittedAMPDUCount.u.LowPart +=
>> > (TxAggCnt0.field.AggSize2Count >> 1);
>> > pRalinkCounters->TransmittedAMPDUCount.u.LowPart +=
>> > (TxAggCnt0.field.AggSize3Count /3);
>> > .
>> > .
>> > .
>> > .
>> > pRalinkCounters->TransmittedAMPDUCount.u.LowPart +=
>> > (TxAggCnt0.field.AggSize15Count/ 15);
>> > pRalinkCounters->TransmittedAMPDUCount.u.LowPart +=
>> > (TxAggCnt0.field.AggSize16Count >> 4);
>> >
>> > Can you please explain the reason why the 'i'th counter is being divided
>> > by
>> > i, for example .TxAggCnt0.field.AggSize15Count is being divided by 15.
>>
>> [NB] For people who haven't seen Ralink's code, the code above is
>> theirs.
>>
>> I guess I didn't explain well. Those counters show the number of MPDU
>> packets, i.e. AggSize15Count == 30 means 30 MPDUs or 2 (30/15) A-MPDU
>> packets. (Because I don't have a datasheet, that is how I interpret
>> Ralink's code.)
>>
>> >
>> > Also if these were little endian counters then i could not understand
>> > the
>> > reason why the four counters "TxAggCnt0.field.AggSize2Count,
>> > TxAggCnt0.field.AggSize4Count, TxAggCnt0.field.AggSize8Count
>> > and TxAggCnt0.field.AggSize16Count " are shifted right by some bits,
>> > which
>> > means that they are multiplying them (since it is little endian
>> > registers)
>> > and why they are dividing the others.
>>
>> RTMP_IO_READ32() does the byte swapping. The values should be saved into
>> AggSizeNCount in the host's byte order. So right shifting means dividing,
>> regardless of the byte order.
>> >>1 == /2
>> ...
>> >>4 == /16
>> They are being nice to the CPU, I think.
>>
>>
>> AK
>>
>> >
>> > Thanks for the help.
>> >
>> > -ram
>> >
>> >
>> > On Tue, Nov 27, 2012 at 6:07 PM, PseudoCylon <moonlightakkiy@yahoo.ca>
>> > wrote:
>> >>
>> >> On Tue, Nov 27, 2012 at 1:23 PM, Ramanujan Seshadri
>> >> <ksramanujan@gmail.com> wrote:
>> >> > I want to know how many MPDU's are aggregated in each AMPDU
>> >> > transmission.
>> >>
>> >> You could use following statistic counters
>> >> RT2860_TX_AGG_CNT0 to 7
>> >>
>> >>
>> >> https://gitorious.org/run/run/blobs/11n_rc3/dev/usb/wlan/if_runreg.h#line186
>> >> Each 32-bit little-endian read-on-clear register contains 2 16-bit
>> >> counters (total 16 16-bit counters).
>> >> counter at offset 0x1720 MPDU count 1
>> >> counter at offset 0x1722 MPDU count 2
>> >>  ...
>> >> counter at offset 0x173c MPDU count 15
>> >> counter at offset 0x173e MPDU count >= 16
>> >>
>> >> These regs are identical on RT2800 and RT2700 (pci/usb).
>> >>
>> >> Example (see #if 0 part)
>> >>
>> >> https://gitorious.org/run/run/blobs/11n_rc3/dev/usb/wlan/if_run.c#line2493
>> >>
>> >> You can only get aggregate statistics (total Tx counts over the past X
>> >> seconds). You cannot find out the MPDU count of a particular packet,
>> >> i.e. an aggregated packet that was just transmitted, unless you read
>> >> the counters on each Tx.
>> >>
>> >>
>> >> AK
>> >>
>> >> >
>> >> > -ram
>> >> >
>> >> >
>> >> > On Tue, Nov 27, 2012 at 2:11 PM, PseudoCylon
>> >> > <moonlightakkiy@yahoo.ca>
>> >> > wrote:
>> >> >>
>> >> >> > ------------------------------
>> >> >> >
>> >> >> > Message: 12
>> >> >> > Date: Tue, 27 Nov 2012 04:33:37 -0500
>> >> >> > From: Ramanujan Seshadri <ksramanujan@gmail.com>
>> >> >> > To: freebsd-net@freebsd.org
>> >> >> > Subject: Ralink RT2860 Driver Code
>> >> >> > Message-ID:
>> >> >> >
>> >> >> >
>> >> >> > <CAK=C58L7YAp+YGk3PZ2VJG9toaKWcBHHi7xsaxth6-KYf0d6xg@mail.gmail.com>
>> >> >> > Content-Type: text/plain; charset=ISO-8859-1
>> >> >> >
>> >> >> > Hello,
>> >> >> >   Can i know how to get the MPDU's aggregated in each AMPDU in a
>> >> >> > ralink
>> >> >> > driver code for RT2860. I saw the existing counters of ralink and
>> >> >> > tried
>> >> >> > to
>> >> >> > get some info, but was not very useful.
>> >> >> >   Any help is greatly appreciated.
>> >> >> >
>> >> >>
>> >> >> https://gitorious.org/run/run/trees/11n_rc3/dev/usb/wlan
>> >> >>
>> >> >> What info are you trying to get?
>> >> >>
>> >> >>
>> >> >> AK
>> >> >>
>> >> >> > Thanks
>> >> >> > ram
>> >> >> >
>> >> >> >
>> >> >> > ------------------------------
>> >> >> >
>> >> >> > _______________________________________________
>> >> >> > freebsd-net@freebsd.org mailing list
>> >> >> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> >> >> > To unsubscribe, send any mail to
>> >> >> > "freebsd-net-unsubscribe@freebsd.org"
>> >> >> >
>> >> >> > End of freebsd-net Digest, Vol 504, Issue 2
>> >> >> > *******************************************
>> >> >
>> >> >
>> >
>> >
>
>

From owner-freebsd-net@FreeBSD.ORG  Sun Dec  2 03:47:47 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 9BC32441
 for <freebsd-net@freebsd.org>; Sun,  2 Dec 2012 03:47:47 +0000 (UTC)
 (envelope-from moonlightakkiy@yahoo.ca)
Received: from nm27-vm0.bullet.mail.bf1.yahoo.com
 (nm27-vm0.bullet.mail.bf1.yahoo.com [98.139.213.139])
 by mx1.freebsd.org (Postfix) with ESMTP id 1FC578FC12
 for <freebsd-net@freebsd.org>; Sun,  2 Dec 2012 03:47:47 +0000 (UTC)
Received: from [98.139.212.146] by nm27.bullet.mail.bf1.yahoo.com with NNFMP;
 02 Dec 2012 03:47:46 -0000
Received: from [98.139.213.2] by tm3.bullet.mail.bf1.yahoo.com with NNFMP;
 02 Dec 2012 03:47:46 -0000
Received: from [127.0.0.1] by smtp102.mail.bf1.yahoo.com with NNFMP;
 02 Dec 2012 03:47:46 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.ca; s=s1024;
 t=1354420066; bh=iP/bG+7ePlaH/SvhNV5fJrfTDDKgEwa+zdj+rgfrTr8=;
 h=X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:Received:Received:MIME-Version:Received:Received:In-Reply-To:References:Date:Message-ID:Subject:From:To:Content-Type;
 b=G+hwW83llhbXHQrkt02IGpVCXCERVyXrMHHZU/hABulEBMcdKYRcuUBR7WkYKrqo368Cj8JGhDpowqaLfK57f6Q0CCGARa7eJ4n7cQEUsJ9M9lIoNGbknv8aXEPD//z2DsofYn7Z9bJg8DTa0H1Xob4twfLLHytIIUO7DApkGHk=
X-Yahoo-Newman-Id: 630941.51747.bm@smtp102.mail.bf1.yahoo.com
X-Yahoo-Newman-Property: ymail-3
X-YMail-OSG: dwI.zrgVM1nVRoO5SHQ3rdelXoFlf4psDFkkZfAOBJcrJSx
 MFvjJnXW0cNEPSLKwbP2sd8rcagmzj_E0m31sXrRDpyjEgxgbxICFD.TZmlf
 TC_sEZDiReH2vayCY33pAp0VBFeuvLDAdJ5paSWRRF3VZVfd1g24kde8p9MN
 XMgUIPpJBVJgFdFurjg4PGd1JHeg.5Fd8MP1XcmELSa0syLKMcfRXy8xx3pQ
 sRP88sKtIFj6o3rAmz_jV_skz1AUydhg5AplzFKjFI5FJLiPfxw5znLJJrSP
 nJ4KKtA0jcWcO5gNQ5kaNRIpqc.D7tGwDnPLr_tg7Z8P9U13m1Iw9cchTtzC
 frToNACk5kvBHculMGWEirTxPjBNB3GqY1fxdS5CH.wRLVRCq9FKYzkLVIsE
 uC6uFT13HdR2CC143b_4cjvBu9JDCSYZabCGkHP.dPN7cQTZTfMt7xvbvnEi
 jfMuedhK_H0Ixr8Euoy6pqRuNzo4dTBKPcvzl1gFVRfU34uNiGS47pxa23zG
 2Mh8HsxiqI066Mo5ErXktPncfjaSYGMqRt2UmRXvgePrmC2ANRsG28A--
X-Yahoo-SMTP: Xr6qjFWswBAEmd20sAvB4Q3keqXvXsIH9TjJ
Received: from mail-vb0-f54.google.com (moonlightakkiy@209.85.212.54 with
 plain)
 by smtp102.mail.bf1.yahoo.com with SMTP; 01 Dec 2012 19:47:46 -0800 PST
Received: by mail-vb0-f54.google.com with SMTP id l1so911690vba.13
 for <freebsd-net@freebsd.org>; Sat, 01 Dec 2012 19:47:43 -0800 (PST)
MIME-Version: 1.0
Received: by 10.220.151.72 with SMTP id b8mr5180193vcw.38.1354420063383; Sat,
 01 Dec 2012 19:47:43 -0800 (PST)
Received: by 10.58.182.72 with HTTP; Sat, 1 Dec 2012 19:47:43 -0800 (PST)
In-Reply-To: <mailman.507.1354401070.67595.freebsd-net@freebsd.org>
References: <mailman.507.1354401070.67595.freebsd-net@freebsd.org>
Date: Sat, 1 Dec 2012 20:47:43 -0700
Message-ID: <CAFZ_MYL5QhK5ABOrBA2+r_PznqpY6aLmM53e3ZDf5_QsqeGL4w@mail.gmail.com>
Subject: Re: freebsd-net Digest, Vol 504, Issue 7
From: PseudoCylon <moonlightakkiy@yahoo.ca>
To: freebsd-net@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 02 Dec 2012 03:47:47 -0000

> ------------------------------
>
> Message: 12
> Date: Sat, 1 Dec 2012 17:12:53 -0500
> From: Ramanujan Seshadri <ksramanujan@gmail.com>
> To: freebsd-net@freebsd.org
> Subject: MCS selected for each transmission in Ralink RT2860
> Message-ID:
>         <CAK=C58LcR-Q0ZM7HWe4HRJYwwVCRiFXfwKtAFGWTbAia96uTBA@mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hello all,
>    i wanted to know if i can get the MCS (bit rate ) used to send each
> packet in a ralink rt2860 wireless NIC.
> I saw the ralink code and got to know that they have a rate-adaptation
> algorithm to select the best
> rate (when the HT_MCS parameter =33). But, i wanted to know if i can get
> the details the bit-rate used
> to send each packet.

The actual MCS used can be found here:
http://fxr.watson.org/fxr/source/dev/ral/rt2860.c#L1118

>
> Thanks
> ram
>
>

From owner-freebsd-net@FreeBSD.ORG  Sun Dec  2 14:25:30 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 8EE3E98C
 for <freebsd-net@freebsd.org>; Sun,  2 Dec 2012 14:25:30 +0000 (UTC)
 (envelope-from to.my.trociny@gmail.com)
Received: from mail-ee0-f54.google.com (mail-ee0-f54.google.com [74.125.83.54])
 by mx1.freebsd.org (Postfix) with ESMTP id 19CD38FC0C
 for <freebsd-net@freebsd.org>; Sun,  2 Dec 2012 14:25:29 +0000 (UTC)
Received: by mail-ee0-f54.google.com with SMTP id c13so1375389eek.13
 for <freebsd-net@freebsd.org>; Sun, 02 Dec 2012 06:25:29 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=sender:date:from:to:subject:message-id:mime-version:content-type
 :content-disposition:user-agent;
 bh=i+4l6owSa/UKNxUdrhKcWTzEy90bee3D3Vvr1nhCX9M=;
 b=F256IzNcsYSEZS3OXuGCaiaqE+bydGBwESz6MiYtTDY/QC6ZSFXI0I3FLLxb5/2sdu
 gN+idF8zjBzuZhjauiXkxN9JanlKeUquEMAhHvceBJmOkJxt5jYiCTquFDMy0igBiSlt
 E+KDgyij953iyoVUx78ikX+NXmKMg2W+cZtJRi6zfY757dsBua1lHTJjzz+eqg2+kS08
 p5XFqvCmaeg/vezIjZsvAb+y3f7G3Ch7DZJehLJRhca/HEoG3CwerH0iGjZBGeq02P9Y
 v4w4ZG22Sis6LVLxpwDCV4m4eaT4PkBkzhvwZux+bQuoQ1Mc/2wqs7HA84YU/RYgmeZQ
 NXSQ==
Received: by 10.14.176.66 with SMTP id a42mr26060180eem.34.1354458329068;
 Sun, 02 Dec 2012 06:25:29 -0800 (PST)
Received: from localhost ([178.150.115.244])
 by mx.google.com with ESMTPS id w3sm24784183eel.17.2012.12.02.06.25.27
 (version=TLSv1/SSLv3 cipher=OTHER);
 Sun, 02 Dec 2012 06:25:27 -0800 (PST)
Sender: Mikolaj Golub <to.my.trociny@gmail.com>
Date: Sun, 2 Dec 2012 16:25:25 +0200
From: Mikolaj Golub <trociny@FreeBSD.org>
To: freebsd-net@freebsd.org
Subject: lagg with wireless iface: iieee80211_waitfor_parent is called with a
 non-sleepable lock held
Message-ID: <20121202142524.GA8207@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 02 Dec 2012 14:25:30 -0000

Hi,

On my laptop I have lagg set up in failover mode between the wired and
wireless interfaces, as described in the handbook:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/network-aggregation.html#networking-lagg-wired-and-wireless

At startup I have been seeing witness warnings like the one below:

taskqueue_drain with the following non-sleepable locks held:
exclusive rw if_lagg rwlock (if_lagg rwlock) r = 0 (0xfffffe000aa9d408) locked @ /home/golub/freebsd/base/head/sys/modules/if_lagg/../../net/if_lagg.c:1065
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b
kdb_backtrace() at kdb_backtrace+0x39
witness_warn() at witness_warn+0x4b2
taskqueue_drain() at taskqueue_drain+0x3a
ieee80211_waitfor_parent() at ieee80211_waitfor_parent+0x28
ieee80211_ioctl() at ieee80211_ioctl+0x3e9
if_setflag() at if_setflag+0xc0
ifpromisc() at ifpromisc+0x2c
lagg_ioctl() at lagg_ioctl+0x7d5
if_setflag() at if_setflag+0xc0
ifpromisc() at ifpromisc+0x2c
bridge_ioctl_add() at bridge_ioctl_add+0x454
bridge_ioctl() at bridge_ioctl+0x268
in_control() at in_control+0x219
ifioctl() at ifioctl+0x1896
kern_ioctl() at kern_ioctl+0x1b0
sys_ioctl() at sys_ioctl+0x11f
amd64_syscall() at amd64_syscall+0x282
Xfast_syscall() at Xfast_syscall+0xfb
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8011815ca, rsp = 0x7fffffffd3f8, rbp = 0x7fffffffd4a0 ---

and eventually the panic "Sleeping thread owns a non-sleepable lock"
in lagg_input, when a packet arrives while ifconfig is running.

lagg acquires the if_lagg rwlock before calling if_setflag(), which
ends up calling ieee80211_ioctl() and ieee80211_waitfor_parent() (which
waits for all deferred parent interface tasks to complete).
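
As a toy illustration of the sequence in the backtrace (a single-threaded
userland sketch, not kernel code and not a proposed patch), the C below
models a sleepable call being made while a non-sleepable lock is held,
next to the common alternative of deciding under the lock and acting
after dropping it. All names are hypothetical stand-ins: toy_lock() and
toy_unlock() for the if_lagg rwlock, sleepable_op() for
taskqueue_drain(), and the violations counter for what witness reports.
Whether lagg can really drop its lock at that point is an open question.

```c
#include <stdbool.h>

static bool lock_held;		/* toy stand-in for the if_lagg rwlock */
static int violations;		/* sleepable calls made with the lock held */

static void
toy_lock(void)
{
	lock_held = true;
}

static void
toy_unlock(void)
{
	lock_held = false;
}

/* Stand-in for a call that may sleep, e.g. taskqueue_drain(). */
static void
sleepable_op(void)
{
	if (lock_held)
		violations++;	/* what witness would warn about */
}

/* Pattern from the backtrace: sleepable call made under the lock. */
static void
setflag_under_lock(void)
{
	toy_lock();
	sleepable_op();
	toy_unlock();
}

/* Alternative: record the decision under the lock, act after unlock. */
static void
setflag_deferred(void)
{
	bool need_update;

	toy_lock();
	need_update = true;	/* snapshot of what has to change */
	toy_unlock();
	if (need_update)
		sleepable_op();
}
```

Calling setflag_under_lock() bumps the violations counter once, while
setflag_deferred() leaves it unchanged.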

Does anybody see a way to solve this?

-- 
Mikolaj Golub

From owner-freebsd-net@FreeBSD.ORG  Sun Dec  2 19:48:18 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 1F9A817D
 for <freebsd-net@freebsd.org>; Sun,  2 Dec 2012 19:48:18 +0000 (UTC)
 (envelope-from Choupani@gmail.com)
Received: from sam.nabble.com (sam.nabble.com [216.139.236.26])
 by mx1.freebsd.org (Postfix) with ESMTP id ECBF58FC15
 for <freebsd-net@freebsd.org>; Sun,  2 Dec 2012 19:48:17 +0000 (UTC)
Received: from [192.168.236.26] (helo=sam.nabble.com)
 by sam.nabble.com with esmtp (Exim 4.72)
 (envelope-from <Choupani@gmail.com>) id 1TfFWS-0001x9-AK
 for freebsd-net@freebsd.org; Sun, 02 Dec 2012 11:48:16 -0800
Date: Sun, 2 Dec 2012 11:48:16 -0800 (PST)
From: Choupani <Choupani@gmail.com>
To: freebsd-net@freebsd.org
Message-ID: <1354477696312-5766007.post@n5.nabble.com>
Subject: protect common resources in kernel
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 02 Dec 2012 19:48:18 -0000

Dear all,
I'm working on the kernel in FreeBSD 9. I need to protect a
shared resource (for example, a queue).
There are four points of access (read/write) to this shared resource, as follows:
1. ether_input() – hardware interrupt
2. ip_input() & ip_output() – software interrupt
3. dev_ioctl() – local I/O control in our own kernel module
4. another kernel thread

Which of the following is appropriate for this purpose?

1. kernel mutex (MTX_DEF)
2. kernel mutex (MTX_SPIN)
3. kernel shared/exclusive lock
4. kernel reader/writer lock


--
View this message in context: http://freebsd.1045724.n5.nabble.com/protect-common-resources-in-kernel-tp5766007.html
Sent from the freebsd-net mailing list archive at Nabble.com.

From owner-freebsd-net@FreeBSD.ORG  Sun Dec  2 22:47:52 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id AB02E61C
 for <freebsd-net@freebsd.org>; Sun,  2 Dec 2012 22:47:52 +0000 (UTC)
 (envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
 by mx1.freebsd.org (Postfix) with ESMTP id 7926F8FC14
 for <freebsd-net@freebsd.org>; Sun,  2 Dec 2012 22:47:52 +0000 (UTC)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
 by cyrus.watson.org (Postfix) with ESMTPS id 2B9F946B2A;
 Sun,  2 Dec 2012 17:47:52 -0500 (EST)
Date: Sun, 2 Dec 2012 22:47:51 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Choupani <Choupani@gmail.com>
Subject: Re: protect common resources in kernel
In-Reply-To: <1354477696312-5766007.post@n5.nabble.com>
Message-ID: <alpine.BSF.2.00.1212022246160.18806@fledge.watson.org>
References: <1354477696312-5766007.post@n5.nabble.com>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED;
 BOUNDARY="621616949-1806727502-1354488472=:18806"
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 02 Dec 2012 22:47:52 -0000

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

--621616949-1806727502-1354488472=:18806
Content-Type: TEXT/PLAIN; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8BIT

On Sun, 2 Dec 2012, Choupani wrote:

> I'm working on kernel in FreeBSD-9. I need to protect a 
> common resource (for example a queue). 
> There are 4 points of access (read/write) to this common resource, as follows:
> 1. ether_input() – hardware interrupt
> 2. ip_input() & ip_output() – software interrupt
> 3. dev_ioctl() – local io control in our own kernel module
> 4. another kernel thread
>
> Which scenario is proper to use for this purpose:
>
> 1. kernel mutex (MTX_DEF)
> 2. kernel mutex (MTX_SPIN)
> 3. kernel share/exclusive lock
> 4. kernel reader/writer lock

Hi Choupani:

Assuming you are not accessing the resource from a low-level interrupt handler 
("filter") or within the scheduler, your best bets are (1) or (4), depending 
on whether you think you will benefit from read-locking as opposed to just 
write-locking.  (2) should be avoided unless in the low-level 
interrupt/scheduler context, as it takes additional overhead (disabling 
interrupts, etc), and (3) can't be used in contexts where unbounded sleeping 
isn't allowed (e.g., from ithreads, within most parts of the lower network 
stack).
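
The rule of thumb above can be written down as a small decision helper. 
This is only a sketch in plain C: choose_lock() and the enum are 
hypothetical names, not kernel API. In a real module the choices 
correspond to mtx(9) with MTX_DEF or MTX_SPIN, sx(9), and rwlock(9).

```c
#include <stdbool.h>

/* The four candidates from the question, in the same order. */
enum lock_choice {
	CHOICE_MTX_DEF = 1,	/* mtx(9) initialised with MTX_DEF */
	CHOICE_MTX_SPIN,	/* mtx(9) initialised with MTX_SPIN */
	CHOICE_SX,		/* sx(9) shared/exclusive lock */
	CHOICE_RW		/* rwlock(9) reader/writer lock */
};

/*
 * Hypothetical helper encoding the advice above.
 *  filter_or_sched: some accessor runs in a low-level interrupt
 *                   handler ("filter") or inside the scheduler.
 *  read_mostly:     read-locking is expected to pay off.
 * CHOICE_SX is never returned: the resource is reached from ithreads
 * and the lower network stack, where unbounded sleeping is not allowed.
 */
static enum lock_choice
choose_lock(bool filter_or_sched, bool read_mostly)
{
	if (filter_or_sched)
		return (CHOICE_MTX_SPIN);
	return (read_mostly ? CHOICE_RW : CHOICE_MTX_DEF);
}
```

For the four access points listed in the question, none of which is a 
filter handler, this gives CHOICE_MTX_DEF, or CHOICE_RW if readers 
dominate.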

Robert
--621616949-1806727502-1354488472=:18806--

From owner-freebsd-net@FreeBSD.ORG  Mon Dec  3 08:11:43 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 707C02E6;
 Mon,  3 Dec 2012 08:11:43 +0000 (UTC)
 (envelope-from glebius@FreeBSD.org)
Received: from cell.glebius.int.ru (glebius.int.ru [81.19.69.10])
 by mx1.freebsd.org (Postfix) with ESMTP id DC12E8FC15;
 Mon,  3 Dec 2012 08:11:42 +0000 (UTC)
Received: from cell.glebius.int.ru (localhost [127.0.0.1])
 by cell.glebius.int.ru (8.14.5/8.14.5) with ESMTP id qB38BZHS040405;
 Mon, 3 Dec 2012 12:11:35 +0400 (MSK)
 (envelope-from glebius@FreeBSD.org)
Received: (from glebius@localhost)
 by cell.glebius.int.ru (8.14.5/8.14.5/Submit) id qB38BYgH040404;
 Mon, 3 Dec 2012 12:11:34 +0400 (MSK)
 (envelope-from glebius@FreeBSD.org)
X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to
 glebius@FreeBSD.org using -f
Date: Mon, 3 Dec 2012 12:11:34 +0400
From: Gleb Smirnoff <glebius@FreeBSD.org>
To: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
Subject: Re: [CFT] Virtual BPF interfaces (was: CFR: ipfw0 pseudo-interface
 clonable)
Message-ID: <20121203081134.GO14202@glebius.int.ru>
References: <4F96D11B.2060007@FreeBSD.org>
 <20120425.020518.406495893112283552.hrs@allbsd.org>
 <4F96E71B.9020405@FreeBSD.org>
 <20120427.084414.1142593201575277510.hrs@allbsd.org>
 <4FD4AD29.3040204@FreeBSD.org> <50BAA552.1010707@FreeBSD.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline
In-Reply-To: <50BAA552.1010707@FreeBSD.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-ipfw@FreeBSD.org, Hiroki Sato <hrs@FreeBSD.org>,
 delphij@FreeBSD.org, "freebsd-net@freebsd.org" <freebsd-net@FreeBSD.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 03 Dec 2012 08:11:43 -0000

On Sun, Dec 02, 2012 at 04:48:18AM +0400, Alexander V. Chernikov wrote:
A> On 10.06.2012 18:20, Alexander V. Chernikov wrote:
A> > On 27.04.2012 03:44, Hiroki Sato wrote:
A> >> "Alexander V. Chernikov"<melifaro@FreeBSD.org> wrote
A> >> in<4F96E71B.9020405@FreeBSD.org>:
A> >>
A> >> me> On 24.04.2012 21:05, Hiroki Sato wrote:
A> >
A> > Proof-of-concept patch attached.
A> 
A> Hopefully, libpcap code is easily extendable.
A> New version attached:
A> * BPF code is now able to use 'virtual' interfaces without real ifnet
A> * New bpfattach3() / bpfdetach3() routines were added to attach virtual 
A> ifaces
A> * New BIOCGIFLIST ioctl is added to permit userland to retrieve 
A> available virtual interfaces
A> * freebsd-specific 'platform_finddevs' version is added to libpcap code 
A> (new file)
A> 
A> There are some rough edges (conditional code in pcap-bpf.c, lack of 
A> documentation, maybe some style issues), but generally it seems to work 
A> and does not interfere with contrib/ code much (from my point of view).
A> 
A> ipfw log device was converted to use new bpf(4) api, see attached patch.

Nice proof of concept, Alexander!

What prevents us from unifying all bpf providers to be "virtual" in
current terms? I think if we finish the divorce between ifnet and bpf,
the code would get simpler and you can proceed further with reducing
locking overhead.

-- 
Totus tuus, Glebius.

From owner-freebsd-net@FreeBSD.ORG  Mon Dec  3 11:06:48 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 73D0DDB
 for <freebsd-net@FreeBSD.org>; Mon,  3 Dec 2012 11:06:48 +0000 (UTC)
 (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2001:1900:2254:206c::16:87])
 by mx1.freebsd.org (Postfix) with ESMTP id 566A28FC1D
 for <freebsd-net@FreeBSD.org>; Mon,  3 Dec 2012 11:06:48 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
 by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id qB3B6mZC027628
 for <freebsd-net@FreeBSD.org>; Mon, 3 Dec 2012 11:06:48 GMT
 (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost)
 by freefall.freebsd.org (8.14.5/8.14.5/Submit) id qB3B6leN027626
 for freebsd-net@FreeBSD.org; Mon, 3 Dec 2012 11:06:47 GMT
 (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 3 Dec 2012 11:06:47 GMT
Message-Id: <201212031106.qB3B6leN027626@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to
 owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster <bugmaster@FreeBSD.org>
To: freebsd-net@FreeBSD.org
Subject: Current problem reports assigned to freebsd-net@FreeBSD.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 03 Dec 2012 11:06:48 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).
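
As a rough illustration of the note above, the query URL can be built from a
tracker ID as it appears in the listing (for example kern/173475). This is
only a sketch: the helper name pr_url and the assumption that the "(number)"
placeholder takes just the numeric part of the ID are mine, not part of the
bugmaster report.

```python
import re

# URL pattern quoted in the note above; "(number)" becomes the PR number.
PR_URL = "http://www.freebsd.org/cgi/query-pr.cgi?pr={number}"


def pr_url(tracker_id):
    """Return the query-pr URL for an ID such as 'kern/173475' or '173475'.

    Hypothetical helper: the category prefix (kern/, bin/, conf/) is
    dropped on the assumption that only the number is needed.
    """
    match = re.fullmatch(r"(?:[a-z0-9_-]+/)?(\d+)", tracker_id.strip())
    if match is None:
        raise ValueError("not a PR tracker ID: %r" % (tracker_id,))
    return PR_URL.format(number=match.group(1))


if __name__ == "__main__":
    print(pr_url("kern/173475"))
```

The category prefix is accepted but ignored, so IDs can be pasted straight
from the Tracker column of the listing.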

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.


S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/173475  net        [tun] tun(4) stays opened by PID after process is term
o kern/173201  net        [ixgbe] [patch] Missing / broken ixgbe sysctl's and tu
o kern/173137  net        [em] em(4) unable to run at gigabit with 9.1-RC2
o kern/173002  net        [patch] data type size problem in if_spppsubr.c
o kern/172985  net        [patch] [ip6] lltable leak when adding and removing IP
o kern/172895  net        [ixgb] [ixgbe] do not properly determine link-state
o kern/172683  net        [ip6] Duplicate IPv6 Link Local Addresses
o kern/172675  net        [netinet] [patch] sysctl_tcp_hc_list (net.inet.tcp.hos
o kern/172113  net        [panic] [e1000] [patch] 9.1-RC1/amd64 panices in igb(4
o kern/171840  net        [ip6] IPv6 packets transmitting only on queue 0
o kern/171838  net        [oce] [patch] Possible lock reversal and duplicate loc
o kern/171739  net        [bce] [panic] bce related kernel panic
o kern/171728  net        [arp] arp issue
o kern/171711  net        [dummynet] [panic] Kernel panic in dummynet
o kern/171697  net        [ip6] [ndp] panic when changing routes
o kern/171532  net        [ndis] ndis(4) driver includes 'pccard'-specific code,
o kern/171531  net        [ndis] undocumented dependency for ndis(4)
o kern/171524  net        [ipmi] ipmi driver crashes kernel by reboot or shutdow
s kern/171508  net        [epair] [request] Add the ability to name epair device
o kern/171228  net        [re] [patch] if_re - eeprom write issues
o kern/170701  net        [ppp] killl ppp or reboot with active ppp connection c
o kern/170267  net        [ixgbe] IXGBE_LE32_TO_CPUS is probably an unintentiona
o kern/170081  net        [fxp] pf/nat/jails not working if checksum offloading 
o kern/169898  net        ifconfig(8) fails to set MTU on multiple interfaces.
o kern/169676  net        [bge] [hang] system hangs, fully or partially after re
o kern/169664  net        [bgp] Wrongful replacement of interface connected net 
o kern/169620  net        [ng] [pf] ng_l2tp incoming packet bypass pf firewall
o kern/169459  net        [ppp] umodem/ppp/3g stopped working after update from 
o kern/169438  net        [ipsec] ipv4-in-ipv6 tunnel mode IPsec does not work
p kern/168294  net        [ixgbe] [patch] ixgbe driver compiled in kernel has no
o kern/168246  net        [em] Multiple em(4) not working with qemu
o kern/168245  net        [arp] [regression] Permanent ARP entry not deleted on 
o kern/168244  net        [arp] [regression] Unable to manually remove permanent
o kern/168183  net        [bce] bce driver hang system
o kern/167947  net        [setfib] [patch] arpresolve checks only the default FI
o kern/167603  net        [ip] IP fragment reassembly's broken: file transfer ov
o kern/167500  net        [em] [panic] Kernel panics in em driver
o kern/167325  net        [netinet] [patch] sosend sometimes return EINVAL with 
o kern/167202  net        [igmp]: Sending multiple IGMP packets crashes kernel
o kern/167059  net        [tcp] [panic] System does panic in in_pcbbind() and ha
o kern/166940  net        [ipfilter] [panic] Double fault in kern 8.2
o kern/166462  net        [gre] gre(4) when using a tunnel source address from c
o kern/166372  net        [patch] ipfilter drops UDP packets with zero checksum 
o kern/166285  net        [arp] FreeBSD v8.1 REL p8 arp: unknown hardware addres
o kern/166255  net        [net] [patch] It should be possible to disable "promis
o kern/165963  net        [panic] [ipf] ipfilter/nat NULL pointer deference
o kern/165903  net        mbuf leak
o kern/165643  net        [net] [patch] Missing vnet restores in net/if_ethersub
o kern/165622  net        [ndis][panic][patch] Unregistered use of FPU in kernel
s kern/165562  net        [request] add support for Intel i350 in FreeBSD 7.4
o kern/165526  net        [bxe] UDP packets checksum calculation whithin if_bxe 
o kern/165488  net        [ppp] [panic] Fatal trap 12 jails and ppp , kernel wit
o kern/165305  net        [ip6] [request] Feature parity between IP_TOS and IPV6
o kern/165296  net        [vlan] [patch] Fix EVL_APPLY_VLID, update EVL_APPLY_PR
o kern/165181  net        [igb] igb freezes after about 2 weeks of uptime
o kern/165174  net        [patch] [tap] allow tap(4) to keep its address on clos
o kern/165152  net        [ip6] Does not work through the issue of ipv6 addresse
o kern/164495  net        [igb] connect double head igb to switch cause system t
o kern/164490  net        [pfil] Incorrect IP checksum on pfil pass from ip_outp
o kern/164475  net        [gre] gre misses RUNNING flag after a reboot
o kern/164265  net        [netinet] [patch] tcp_lro_rx computes wrong checksum i
o kern/163903  net        [igb] "igb0:tx(0)","bpf interface lock" v2.2.5 9-STABL
o kern/163481  net        freebsd do not add itself to ping route packet
o kern/162927  net        [tun] Modem-PPP error ppp[1538]: tun0: Phase: Clearing
o kern/162926  net        [ipfilter] Infinite loop in ipfilter with fragmented I
o kern/162558  net        [dummynet] [panic] seldom dummynet panics
o kern/162153  net        [em] intel em driver 7.2.4 don't compile
o kern/162110  net        [igb] [panic] RELENG_9 panics on boot in IGB driver - 
o kern/162028  net        [ixgbe] [patch] misplaced #endif in ixgbe.c
o kern/161277  net        [em] [patch] BMC cannot receive IPMI traffic after loa
o kern/160873  net        [igb] igb(4) from HEAD fails to build on 7-STABLE
o kern/160750  net        Intel PRO/1000 connection breaks under load until rebo
o kern/160693  net        [gif] [em] Multicast packet are not passed from GIF0 t
o kern/160293  net        [ieee80211] [panic] kernel panic during network setup 
o kern/160206  net        [gif] gifX stops working after a while (IPv6 tunnel)
o kern/159817  net        [udp] write UDPv4: No buffer space available (code=55)
o kern/159629  net        [ipsec] [panic] kernel panic with IPsec in transport m
o kern/159621  net        [tcp] [panic] panic: soabort: so_count
o kern/159603  net        [netinet] [patch] in_ifscrubprefix() - network route c
o kern/159601  net        [netinet] [patch] in_scrubprefix() - loopback route re
o kern/159294  net        [em] em watchdog timeouts
o kern/159203  net        [wpi] Intel 3945ABG Wireless LAN not support IBSS
o kern/158930  net        [bpf] BPF element leak in ifp->bpf_if->bif_dlist
o kern/158726  net        [ip6] [patch] ICMPv6 Router Announcement flooding limi
o kern/158694  net        [ix] [lagg] ix0 is not working within lagg(4)
o kern/158665  net        [ip6] [panic] kernel pagefault in in6_setscope()
o kern/158635  net        [em] TSO breaks BPF packet captures with em driver
f kern/157802  net        [dummynet] [panic] kernel panic in dummynet
o kern/157785  net        amd64 + jail + ipfw + natd = very slow outbound traffi
o kern/157418  net        [em] em driver lockup during boot on Supermicro X9SCM-
o kern/157410  net        [ip6] IPv6 Router Advertisements Cause Excessive CPU U
o kern/157287  net        [re] [panic] INVARIANTS panic (Memory modified after f
o kern/157209  net        [ip6] [patch] locking error in rip6_input() (sys/netin
o kern/157200  net        [network.subr] [patch] stf(4) can not communicate betw
o kern/157182  net        [lagg] lagg interface not working together with epair 
o kern/156877  net        [dummynet] [panic] dummynet move_pkt() null ptr derefe
o kern/156667  net        [em] em0 fails to init on CURRENT after March 17
o kern/156408  net        [vlan] Routing failure when using VLANs vs. Physical e
o kern/156328  net        [icmp]: host can ping other subnet but no have IP from
o kern/156317  net        [ip6] Wrong order of IPv6 NS DAD/MLD Report
o kern/156283  net        [ip6] [patch] nd6_ns_input - rtalloc_mpath does not re
o kern/156279  net        [if_bridge][divert][ipfw] unable to correctly re-injec
o kern/156226  net        [lagg]: failover does not announce the failover to swi
o kern/156030  net        [ip6] [panic] Crash in nd6_dad_start() due to null ptr
o kern/155772  net        ifconfig(8): ioctl (SIOCAIFADDR): File exists on direc
o kern/155680  net        [multicast] problems with multicast
s kern/155642  net        [request] Add driver for Realtek RTL8191SE/RTL8192SE W
o kern/155597  net        [panic] Kernel panics with "sbdrop" message
o kern/155420  net        [vlan] adding vlan break existent vlan
o kern/155177  net        [route] [panic] Panic when inject routes in kernel
p kern/155030  net        [igb] igb(4) DEVICE_POLLING does not work with carp(4)
o kern/155010  net        [msk] ntfs-3g via iscsi using msk driver cause kernel 
o kern/154943  net        [gif] ifconfig gifX create on existing gifX clears IP
s kern/154851  net        [request]: Port brcm80211 driver from Linux to FreeBSD
o kern/154850  net        [netgraph] [patch] ng_ether fails to name nodes when t
o kern/154679  net        [em] Fatal trap 12: "em1 taskq" only at startup (8.1-R
o kern/154600  net        [tcp] [panic] Random kernel panics on tcp_output
o kern/154557  net        [tcp] Freeze tcp-session of the clients, if in the gat
o kern/154443  net        [if_bridge] Kernel module bridgestp.ko missing after u
o kern/154286  net        [netgraph] [panic] 8.2-PRERELEASE panic in netgraph
o kern/154255  net        [nfs] NFS not responding
o kern/154214  net        [stf] [panic] Panic when creating stf interface
o kern/154185  net        race condition in mb_dupcl
o kern/154169  net        [multicast] [ip6] Node Information Query multicast add
o kern/154134  net        [ip6] stuck kernel state in LISTEN on ipv6 daemon whic
o kern/154091  net        [netgraph] [panic] netgraph, unaligned mbuf?
o conf/154062  net        [vlan] [patch] change to way of auto-generatation of v
o kern/153937  net        [ral] ralink panics the system (amd64 freeBSDD 8.X) wh
o kern/153936  net        [ixgbe] [patch] MPRC workaround incorrectly applied to
o kern/153816  net        [ixgbe] ixgbe doesn't work properly with the Intel 10g
o kern/153772  net        [ixgbe] [patch] sysctls reference wrong XON/XOFF varia
o kern/153497  net        [netgraph] netgraph panic due to race conditions
o kern/153454  net        [patch] [wlan] [urtw] Support ad-hoc and hostap modes 
o kern/153308  net        [em] em interface use 100% cpu
o kern/153244  net        [em] em(4) fails to send UDP to port 0xffff
o kern/152893  net        [netgraph] [panic] 8.2-PRERELEASE panic in netgraph
o kern/152853  net        [em] tftpd (and likely other udp traffic) fails over e
o kern/152828  net        [em] poor performance on 8.1, 8.2-PRE
o kern/152569  net        [net]: Multiple ppp connections and routing table prob
o kern/152235  net        [arp] Permanent local ARP entries are not properly upd
o kern/152141  net        [vlan] [patch] encapsulate vlan in ng_ether before out
o kern/152036  net        [libc] getifaddrs(3) returns truncated sockaddrs for n
o kern/151690  net        [ep] network connectivity won't work until dhclient is
o kern/151681  net        [nfs] NFS mount via IPv6 leads to hang on client with 
o kern/151593  net        [igb] [panic] Kernel panic when bringing up igb networ
o kern/150920  net        [ixgbe][igb] Panic when packets are dropped with heade
o kern/150557  net        [igb] igb0: Watchdog timeout -- resetting
o kern/150251  net        [patch] [ixgbe] Late cable insertion broken
o kern/150249  net        [ixgbe] Media type detection broken
o bin/150224   net        ppp(8) does not reassign static IP after kill -KILL co
f kern/149969  net        [wlan] [ral] ralink rt2661 fails to maintain connectio
o kern/149937  net        [ipfilter] [patch] kernel panic in ipfilter IP fragmen
o kern/149643  net        [rum] device not sending proper beacon frames in ap mo
o kern/149609  net        [panic] reboot after adding second default route
o kern/149117  net        [inet] [patch] in_pcbbind: redundant test
o kern/149086  net        [multicast] Generic multicast join failure in 8.1
o kern/148018  net        [flowtable] flowtable crashes on ia64
o kern/147912  net        [boot] FreeBSD 8 Beta won't boot on Thinkpad i1300  11
o kern/147894  net        [ipsec] IPv6-in-IPv4 does not work inside an ESP-only 
o kern/147155  net        [ip6] setfb not work with ipv6
o kern/146845  net        [libc] close(2) returns error 54 (connection reset by 
f kern/146792  net        [flowtable] flowcleaner 100% cpu's core load
o kern/146719  net        [pf] [panic] PF or dumynet kernel panic
o kern/146534  net        [icmp6] wrong source address in echo reply
o kern/146427  net        [mwl] Additional virtual access points don't work on m
f kern/146394  net        [vlan] IP source address for outgoing connections
o bin/146377   net        [ppp] [tun] Interface doesn't clear addresses when PPP
o kern/146358  net        [vlan] wrong destination MAC address
o kern/146165  net        [wlan] [panic] Setting bssid in adhoc mode causes pani
o kern/146082  net        [ng_l2tp] a false invaliant check was performed in ng_
o kern/146037  net        [panic] mpd + CoA = kernel panic
o kern/145825  net        [panic] panic: soabort: so_count
o kern/145728  net        [lagg] Stops working lagg between two servers.
p kern/145600  net        TCP/ECN behaves different to CE/CWR than ns2 reference
f kern/144917  net        [flowtable] [panic] flowtable crashes system [regressi
o kern/144882  net        MacBookPro =>4.1 does not connect to BSD in hostap wit
o kern/144874  net        [if_bridge] [patch] if_bridge frees mbuf after pfil ho
o conf/144700  net        [rc.d] async dhclient breaks stuff for too many people
o kern/144616  net        [nat] [panic] ip_nat panic FreeBSD 7.2
f kern/144315  net        [ipfw] [panic] freebsd 8-stable reboot after add ipfw 
o kern/144231  net        bind/connect/sendto too strict about sockaddr length
o kern/143846  net        [gif] bringing gif3 tunnel down causes gif0 tunnel to 
s kern/143673  net        [stf] [request] there should be a way to support multi
s kern/143666  net        [ip6] [request] PMTU black hole detection not implemen
o kern/143622  net        [pfil] [patch] unlock pfil lock while calling firewall
o kern/143593  net        [ipsec] When using IPSec, tcpdump doesn't show outgoin
o kern/143591  net        [ral] RT2561C-based DLink card (DWL-510) fails to work
o kern/143208  net        [ipsec] [gif] IPSec over gif interface not working
o kern/143034  net        [panic] system reboots itself in tcp code [regression]
o kern/142877  net        [hang] network-related repeatable 8.0-STABLE hard hang
o kern/142774  net        Problem with outgoing connections on interface with mu
o kern/142772  net        [libc] lla_lookup: new lle malloc failed
f kern/142518  net        [em] [lagg] Problem on 8.0-STABLE with em and lagg
o kern/142018  net        [iwi] [patch] Possibly wrong interpretation of beacon-
o kern/141861  net        [wi] data garbled with WEP and wi(4) with Prism 2.5
f kern/141741  net        Etherlink III NIC won't work after upgrade to FBSD 8, 
o kern/140742  net        rum(4) Two asus-WL167G adapters cannot talk to each ot
o kern/140682  net        [netgraph] [panic] random panic in netgraph
f kern/140634  net        [vlan] destroying if_lagg interface with if_vlan membe
o kern/140619  net        [ifnet] [patch] refine obsolete if_var.h comments desc
o kern/140346  net        [wlan] High bandwidth use causes loss of wlan connecti
o kern/140142  net        [ip6] [panic] FreeBSD 7.2-amd64 panic w/IPv6
o kern/140066  net        [bwi] install report for 8.0 RC 2 (multiple problems)
o kern/139565  net        [ipfilter] ipfilter ioctl SIOCDELST broken
o kern/139387  net        [ipsec] Wrong lenth of PF_KEY messages in promiscuous 
o bin/139346   net        [patch] arp(8) add option to remove static entries lis
o kern/139268  net        [if_bridge] [patch] allow if_bridge to forward just VL
p kern/139204  net        [arp] DHCP server replies rejected, ARP entry lost bef
o kern/139117  net        [lagg] + wlan boot timing (EBUSY)
o kern/139058  net        [ipfilter] mbuf cluster leak on FreeBSD 7.2
o kern/138850  net        [dummynet] dummynet doesn't work correctly on a bridge
o kern/138782  net        [panic] sbflush_internal: cc 0 || mb 0xffffff004127b00
o kern/138688  net        [rum] possibly broken on 8 Beta 4 amd64: able to wpa a
o kern/138678  net        [lo] FreeBSD does not assign linklocal address to loop
o kern/138407  net        [gre] gre(4) interface does not come up after reboot
o kern/138332  net        [tun] [lor] ifconfig tun0 destroy causes LOR if_adata/
o kern/138266  net        [panic] kernel panic when udp benchmark test used as r
o kern/138177  net        [ipfilter] FreeBSD crashing repeatedly in ip_nat.c:257
f kern/138029  net        [bpf] [panic] periodically kernel panic and reboot
o kern/137881  net        [netgraph] [panic] ng_pppoe fatal trap 12
p bin/137841   net        [patch] wpa_supplicant(8) cannot verify SHA256 signed 
p kern/137776  net        [rum] panic in rum(4) driver on 8.0-BETA2
o bin/137641   net        ifconfig(8): various problems with "vlan_device.vlan_i
o kern/137392  net        [ip] [panic] crash in ip_nat.c line 2577
o kern/137372  net        [ral] FreeBSD doesn't support wireless interface from 
o kern/137089  net        [lagg] lagg falsely triggers IPv6 duplicate address de
o bin/136994   net        [patch] ifconfig(8) print carp mac address
o kern/136911  net        [netgraph] [panic] system panic on kldload ng_bpf.ko t
o kern/136618  net        [pf][stf] panic on cloning interface without unit numb
o kern/135502  net        [periodic] Warning message raised by rtfree function i
o kern/134583  net        [hang] Machine with jail freezes after random amount o
o kern/134531  net        [route] [panic] kernel crash related to routes/zebra
o kern/134157  net        [dummynet] dummynet loads cpu for 100% and make a syst
o kern/133969  net        [dummynet] [panic] Fatal trap 12: page fault while in 
o kern/133968  net        [dummynet] [panic] dummynet kernel panic
o kern/133736  net        [udp] ip_id not protected ...
o kern/133595  net        [panic] Kernel Panic at pcpu.h:195
o kern/133572  net        [ppp] [hang] incoming PPTP connection hangs the system
o kern/133490  net        [bpf] [panic] 'kmem_map too small' panic on Dell r900 
o kern/133235  net        [netinet] [patch] Process SIOCDLIFADDR command incorre
f kern/133213  net        arp and sshd errors on 7.1-PRERELEASE
o kern/133060  net        [ipsec] [pfsync] [panic] Kernel panic with ipsec + pfs
o kern/132889  net        [ndis] [panic] NDIS kernel crash on load BCM4321 AGN d
o conf/132851  net        [patch] rc.conf(5): allow to setfib(1) for service run
o kern/132734  net        [ifmib] [panic] panic in net/if_mib.c
o kern/132705  net        [libwrap] [patch] libwrap - infinite loop if hosts.all
o kern/132672  net        [ndis] [panic] ndis with rt2860.sys causes kernel pani
o kern/132554  net        [ipl] There is no ippool start script/ipfilter magic t
o kern/132354  net        [nat] Getting some packages to ipnat(8) causes crash
o kern/132277  net        [crypto] [ipsec] poor performance using cryptodevice f
o kern/131781  net        [ndis] ndis keeps dropping the link
o kern/131776  net        [wi] driver fails to init
o kern/131753  net        [altq] [panic] kernel panic in hfsc_dequeue
o kern/131601  net        [ipfilter] [panic] 7-STABLE panic in nat_finalise (tcp
o bin/131567   net        [socket] [patch] Update for regression/sockets/unix_cm
o bin/131365   net        route(8): route add changes interpretation of network 
f kern/130820  net        [ndis] wpa_supplicant(8) returns 'no space on device'
o kern/130628  net        [nfs] NFS / rpc.lockd deadlock on 7.1-R
o conf/130555  net        [rc.d] [patch] No good way to set ipfilter variables a
o kern/130525  net        [ndis] [panic] 64 bit ar5008 ndisgen-erated driver cau
o kern/130311  net        [wlan_xauth] [panic] hostapd restart causing kernel pa
o kern/130109  net        [ipfw] Can not set fib for packets originated from loc
f kern/130059  net        [panic] Leaking 50k mbufs/hour
f kern/129719  net        [nfs] [panic] Panic during shutdown, tcp_ctloutput: in
o kern/129517  net        [ipsec] [panic] double fault / stack overflow
f kern/129508  net        [carp] [panic] Kernel panic with EtherIP (may be relat
o kern/129219  net        [ppp] Kernel panic when using kernel mode ppp
o kern/129197  net        [panic] 7.0 IP stack related panic
o bin/128954   net        ifconfig(8) deletes valid routes
o bin/128602   net        [an] wpa_supplicant(8) crashes with an(4)
o kern/128448  net        [nfs] 6.4-RC1 Boot Fails if NFS Hostname cannot be res
o bin/128295   net        [patch] ifconfig(8) does not print TOE4 or TOE6 capabi
o bin/128001   net        wpa_supplicant(8), wlan(4), and wi(4) issues
o kern/127826  net        [iwi] iwi0 driver has reduced performance and connecti
o kern/127815  net        [gif] [patch] if_gif does not set vlan attributes from
o kern/127724  net        [rtalloc] rtfree: 0xc5a8f870 has 1 refs
f bin/127719   net        [arp] arp: Segmentation fault (core dumped)
f kern/127528  net        [icmp]: icmp socket receives icmp replies not owned by
p kern/127360  net        [socket] TOE socket options missing from sosetopt()
o bin/127192   net        routed(8) removes the secondary alias IP of interface 
f kern/127145  net        [wi]: prism (wi) driver crash at bigger traffic
o kern/126895  net        [patch] [ral] Add antenna selection (marked as TBD)
o kern/126874  net        [vlan]: Zebra problem if ifconfig vlanX destroy
o kern/126695  net        rtfree messages and network disruption upon use of if_
o kern/126339  net        [ipw] ipw driver drops the connection
o kern/126075  net        [inet] [patch] internet control accesses beyond end of
o bin/125922   net        [patch] Deadlock in arp(8)
o kern/125920  net        [arp] Kernel Routing Table loses Ethernet Link status 
o kern/125845  net        [netinet] [patch] tcp_lro_rx() should make use of hard
o kern/125258  net        [socket] socket's SO_REUSEADDR option does not work
o kern/125239  net        [gre] kernel crash when using gre
o kern/124341  net        [ral] promiscuous mode for wireless device ral0 looses
o kern/124225  net        [ndis] [patch] ndis network driver sometimes loses net
o kern/124160  net        [libc] connect(2) function loops indefinitely
o kern/124021  net        [ip6] [panic] page fault in nd6_output()
o kern/123968  net        [rum] [panic] rum driver causes kernel panic with WPA.
o kern/123892  net        [tap] [patch] No buffer space available
o kern/123890  net        [ppp] [panic] crash & reboot on work with PPP low-spee
o kern/123858  net        [stf] [patch] stf not usable behind a NAT
o kern/123796  net        [ipf] FreeBSD 6.1+VPN+ipnat+ipf: port mapping does not
o kern/123758  net        [panic] panic while restarting net/freenet6
o bin/123633   net        ifconfig(8) doesn't set inet and ether address in one 
o kern/123559  net        [iwi] iwi periodically disassociates/associates [regre
o bin/123465   net        [ip6] route(8): route add -inet6 <ipv6_addr> -interfac
o kern/123463  net        [ipsec] [panic] repeatable crash related to ipsec-tool
o conf/123330  net        [nsswitch.conf] Enabling samba wins in nsswitch.conf c
o kern/123160  net        [ip] Panic and reboot at sysctl kern.polling.enable=0
o kern/122989  net        [swi] [panic] 6.3 kernel panic in swi1: net
o kern/122954  net        [lagg] IPv6 EUI64 incorrectly chosen for lagg devices
f kern/122780  net        [lagg] tcpdump on lagg interface during high pps wedge
o kern/122685  net        It is not visible passing packets in tcpdump(1)
o kern/122319  net        [wi] imposible to enable ad-hoc demo mode with Orinoco
o kern/122290  net        [netgraph] [panic] Netgraph related "kmem_map too smal
o kern/122252  net        [ipmi] [bge] IPMI problem with BCM5704 (does not work 
o kern/122033  net        [ral] [lor] Lock order reversal in ral0 at bootup ieee
o bin/121895   net        [patch] rtsol(8)/rtsold(8) doesn't handle managed netw
s kern/121774  net        [swi] [panic] 6.3 kernel panic in swi1: net
o kern/121555  net        [panic] Fatal trap 12: current process = 12 (swi1: net
o kern/121443  net        [gif] [lor] icmp6_input/nd6_lookup
o kern/121437  net        [vlan] Routing to layer-2 address does not work on VLA
o bin/121359   net        [patch] [security] ppp(8): fix local stack overflow in
o kern/121257  net        [tcp] TSO + natd  -> slow outgoing tcp traffic
o kern/121181  net        [panic] Fatal trap 3: breakpoint instruction fault whi
o kern/120966  net        [rum] kernel panic with if_rum and WPA encryption
o kern/120566  net        [request]: ifconfig(8) make order of arguments more fr
o kern/120304  net        [netgraph] [patch] netgraph source assumes 32-bit time
o kern/120266  net        [udp] [panic] gnugk causes kernel panic when closing U
o bin/120060   net        routed(8) deletes link-level routes in the presence of
o kern/119945  net        [rum] [panic] rum device in hostap mode, cause kernel 
o kern/119791  net        [nfs] UDP NFS mount of aliased IP addresses from a Sol
o kern/119617  net        [nfs] nfs error on wpa network when reseting/shutdown
f kern/119516  net        [ip6] [panic] _mtx_lock_sleep: recursed on non-recursi
o kern/119432  net        [arp] route add -host <host> -iface <nic> causes arp e
o kern/119225  net        [wi] 7.0-RC1 no carrier with Prism 2.5 wifi card [regr
o kern/118727  net        [netgraph] [patch] [request] add new ng_pf module
o kern/117423  net        [vlan] Duplicate IP on different interfaces
o bin/117339   net        [patch] route(8): loading routing management commands 
o bin/116643   net        [patch] [request] fstat(1): add INET/INET6 socket deta
o kern/116185  net        [iwi] if_iwi driver leads system to reboot
o kern/115239  net        [ipnat] panic with 'kmem_map too small' using ipnat
o kern/115019  net        [netgraph] ng_ether upper hook packet flow stops on ad
o kern/115002  net        [wi] if_wi timeout. failed allocation (busy bit). ifco
o kern/114915  net        [patch] [pcn] pcn (sys/pci/if_pcn.c) ethernet driver f
o kern/113432  net        [ucom] WARNING: attempt to net_add_domain(netgraph) af
o kern/112722  net        [ipsec] [udp] IP v4 udp fragmented packet reject
o kern/112686  net        [patm] patm driver freezes System (FreeBSD 6.2-p4) i38
o bin/112557   net        [patch] ppp(8) lock file should not use symlink name
o kern/112528  net        [nfs] NFS over TCP under load hangs with "impossible p
o kern/111537  net        [inet6] [patch] ip6_input() treats mbuf cluster wrong
o kern/111457  net        [ral] ral(4) freeze
o kern/110284  net        [if_ethersubr] Invalid Assumption in SIOCSIFADDR in et
o kern/110249  net        [kernel] [regression] [patch] setsockopt() error regre
o kern/109470  net        [wi] Orinoco Classic Gold PC Card Can't Channel Hop
o bin/108895   net        pppd(8): PPPoE dead connections on 6.2 [regression]
o kern/107944  net        [wi] [patch] Forget to unlock mutex-locks
o conf/107035  net        [patch] bridge(8): bridge interface given in rc.conf n
o kern/106444  net        [netgraph] [panic] Kernel Panic on Binding to an ip to
o kern/106316  net        [dummynet] dummynet with multipass ipfw drops packets 
o kern/105945  net        Address can disappear from network interface
s kern/105943  net        Network stack may modify read-only mbuf chain copies
o bin/105925   net        problems with ifconfig(8) and vlan(4) [regression]
o kern/104851  net        [inet6] [patch] On link routes not configured when usi
o kern/104751  net        [netgraph] kernel panic, when getting info about my tr
o kern/103191  net        Unpredictable reboot
o kern/103135  net        [ipsec] ipsec with ipfw divert (not NAT) encodes a pac
o kern/102540  net        [netgraph] [patch] supporting vlan(4) by ng_fec(4)
o conf/102502  net        [netgraph] [patch] ifconfig name does't rename netgrap
o kern/102035  net        [plip] plip networking disables parallel port printing
o kern/101948  net        [ipf] [panic] Kernel Panic Trap No 12 Page Fault - cau
o kern/100709  net        [libc] getaddrinfo(3) should return TTL info
o kern/100519  net        [netisr] suggestion to fix suboptimal network polling
o kern/98978   net        [ipf] [patch] ipfilter drops OOW packets under 6.1-Rel
o kern/98597   net        [inet6] Bug in FreeBSD 6.1 IPv6 link-local DAD procedu
o bin/98218    net        wpa_supplicant(8) blacklist not working
o kern/97306   net        [netgraph] NG_L2TP locks after connection with failed 
o conf/97014   net        [gif] gifconfig_gif? in rc.conf does not recognize IPv
f kern/96268   net        [socket] TCP socket performance drops by 3000% if pack
o kern/95519   net        [ral] ral0 could not map mbuf
o kern/95288   net        [pppd] [tty] [panic] if_ppp panic in sys/kern/tty_subr
o kern/95277   net        [netinet] [patch] IP Encapsulation mask_match() return
o kern/95267   net        packet drops periodically appear
f kern/93378   net        [tcp] Slow data transfer in Postfix and Cyrus IMAP (wo
o kern/93019   net        [ppp] ppp and tunX problems: no traffic after restarti
o kern/92880   net        [libc] [patch] almost rewritten inet_network(3) functi
s kern/92279   net        [dc] Core faults everytime I reboot, possible NIC issu
o kern/91859   net        [ndis] if_ndis does not work with Asus WL-138
s kern/91777   net        [ipf] [patch] wrong behaviour with skip rule inside an
o kern/91364   net        [ral] [wep] WF-511 RT2500 Card PCI and WEP
o kern/91311   net        [aue] aue interface hanging
o kern/87521   net        [ipf] [panic] using ipfilter "auth" keyword leads to k
o kern/87421   net        [netgraph] [panic]: ng_ether + ng_eiface + if_bridge
o kern/86871   net        [tcp] [patch] allocation logic for PCBs in TIME_WAIT s
o kern/86427   net        [lor] Deadlock with FASTIPSEC and nat
o kern/86103   net        [ipf] Illegal NAT Traversal in IPFilter
o kern/85780   net        'panic: bogus refcnt 0' in routing/ipv6
o bin/85445    net        ifconfig(8): deprecated keyword to ifconfig inoperativ
p kern/85320   net        [gre] [patch] possible depletion of kernel stack in ip
o bin/82975    net        route change does not parse classfull network as given
o kern/82881   net        [netgraph] [panic] ng_fec(4) causes kernel panic after
o kern/82468   net        Using 64MB tcp send/recv buffers, trafficflow stops, i
o bin/82185    net        [patch] ndp(8) can delete the incorrect entry
o kern/81095   net        IPsec connection stops working if associated network i
o kern/78968   net        FreeBSD freezes on mbufs exhaustion (network interface
o kern/78090   net        [ipf] ipf filtering on bridged packets doesn't work if
o kern/77341   net        [ip6] problems with IPV6 implementation
s kern/77195   net        [ipf] [patch] ipfilter ioctl SIOCGNATL does not match 
o kern/75873   net        Usability problem with non-RFC-compliant IP spoof prot
s kern/75407   net        [an] an(4): no carrier after short time
a kern/71474   net        [route] route lookup does not skip interfaces marked d
o kern/71469   net        default route to internet magically disappears with mu
o kern/70904   net        [ipf] ipfilter ipnat problem with h323 proxy support
o kern/68889   net        [panic] m_copym, length > size of mbuf chain
o kern/66225   net        [netgraph] [patch] extend ng_eiface(4) control message
o kern/65616   net        IPSEC can't detunnel GRE packets after real ESP encryp
s kern/60293   net        [patch] FreeBSD arp poison patch
a kern/56233   net        IPsec tunnel (ESP) over IPv6: MTU computation is wrong
s bin/41647    net        ifconfig(8) doesn't accept lladdr along with inet addr
o kern/39937   net        ipstealth issue
a kern/38554   net        [patch] changing interface ipaddress doesn't seem to w
o kern/34665   net        [ipf] [hang] ipfilter rcmd proxy "hangs".
o kern/31940   net        ip queue length too short for >500kpps
o kern/31647   net        [libc] socket calls can return undocumented EINVAL
o kern/30186   net        [libc] getaddrinfo(3) does not handle incorrect servna
o kern/27474   net        [ipf] [ppp] Interactive use of user PPP and ipfilter c
f kern/24959   net        [patch] proper TCP_NOPUSH/TCP_CORK compatibility
o conf/23063   net        [arp] [patch] for static ARP tables in rc.network
o kern/21998   net        [socket] [patch] ident only for outgoing connections
o kern/5877    net        [socket] sb_cc counts control data as well as data dat

428 problems total.


From owner-freebsd-net@FreeBSD.ORG  Mon Dec  3 12:21:48 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id ABDB230F;
 Mon,  3 Dec 2012 12:21:48 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 39C778FC12;
 Mon,  3 Dec 2012 12:21:48 +0000 (UTC)
Received: from [2a02:6b8:0:401:222:4dff:fe50:cd2f]
 (helo=dhcp170-36-red.yandex.net)
 by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256)
 (Exim 4.76 (FreeBSD)) (envelope-from <melifaro@FreeBSD.org>)
 id 1TfV5I-00053l-89; Mon, 03 Dec 2012 16:25:16 +0400
Message-ID: <50BC989E.3080303@FreeBSD.org>
Date: Mon, 03 Dec 2012 16:18:38 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:13.0) Gecko/20120627 Thunderbird/13.0.1
MIME-Version: 1.0
To: Gleb Smirnoff <glebius@FreeBSD.org>
Subject: Re: [CFT] Virtual BPF interfaces
References: <4F96D11B.2060007@FreeBSD.org>
 <20120425.020518.406495893112283552.hrs@allbsd.org>
 <4F96E71B.9020405@FreeBSD.org>
 <20120427.084414.1142593201575277510.hrs@allbsd.org>
 <4FD4AD29.3040204@FreeBSD.org> <50BAA552.1010707@FreeBSD.org>
 <20121203081134.GO14202@glebius.int.ru>
In-Reply-To: <20121203081134.GO14202@glebius.int.ru>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-ipfw@FreeBSD.org, Hiroki Sato <hrs@FreeBSD.org>,
 delphij@FreeBSD.org, "freebsd-net@freebsd.org" <freebsd-net@FreeBSD.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 03 Dec 2012 12:21:48 -0000

On 03.12.2012 12:11, Gleb Smirnoff wrote:
> On Sun, Dec 02, 2012 at 04:48:18AM +0400, Alexander V. Chernikov wrote:
> A> On 10.06.2012 18:20, Alexander V. Chernikov wrote:
> A> > On 27.04.2012 03:44, Hiroki Sato wrote:
> A> >> "Alexander V. Chernikov"<melifaro@FreeBSD.org> wrote
> A> >> in<4F96E71B.9020405@FreeBSD.org>:
> A> >>
> A> >> me> On 24.04.2012 21:05, Hiroki Sato wrote:
> A> >
> A> > Proof-of-concept patch attached.
> A>
> A> Hopefully, libcap code is easily extendable.
> A> New version attached:
> A> * BPF code is now able to use 'virtual' interfaces without real ifnet
> A> * New bpfattach3() / bpfdetach3() routines were added to attach virtual
> A> ifaces
> A> * New BIOCGIFLIST ioctl is added to permit userland to retrieve
> A> available virtual interfaces
> A> * freebsd-specific 'platform_finddevs' version is added to libpcap code
> A> (new file)
> A>
> A> There are some rough edges (conditional code in pcap-bpf.c, lack of
> A> documentation, maybe some style issues), but generally it seems to work
> A> and does not interfere with contrib/ code much (from my point of view).
> A>
> A> ipfw log device was converted to use new bpf(4) api, see attached patch.
>
> Nice proof of concept, Alexander!
>
> What does prevent us from unifing all bpf providers to be "virtual" in
> current terms? I think if we finish divorce between ifnet and bpf, the code
> would get simplier and you can proceed further with reducing locking
> overhead.

We have to jump from the ifnet to the list of per-ifnet BPF consumers
somehow, so I'm not sure we can do much more here. BPF itself doesn't
require much from the parent ifnet.

What I really want to do next is the following:

1) Make BPF_PEERS_PRESENT(ifp) simply (ifp->if_bpf != NULL). This saves
some processing time and permits 'bpf_if' to be totally opaque
without any hacks.
2) Set the if_bpf pointer if and only if there are some consumers (and
set it back to NULL when all consumers are detached). This should work
well for the 'main' BPF DLT, but a single interface (currently, only
802.11) can hold more than one DLT. We can probably save the dst pointer
passed to bpfattach2() in the given bpf_if structure, and set this value
instead of ->if_bpf.
This, however, can lead to hard-to-find problems, since bpfattach[2] is
usually not called by the driver directly.
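
The two steps above can be illustrated with a small user-space model.
This is only a sketch under stated assumptions: ifnet_model,
bpf_if_model, PEERS_PRESENT(), consumer_attach() and consumer_detach()
are hypothetical names invented for illustration, not the kernel's
structures or API, and all locking and memory ordering is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct bpf_if_model {
	int nconsumers;			/* attached BPF consumers */
};

struct ifnet_model {
	struct bpf_if_model *if_bpf;	/* non-NULL only with consumers */
	struct bpf_if_model bpf_store;	/* backing storage owned by "BPF" */
};

/* Step 1: the check becomes a single pointer comparison. */
#define	PEERS_PRESENT(ifp)	((ifp)->if_bpf != NULL)

/* Step 2a: the first consumer publishes the pointer. */
static void
consumer_attach(struct ifnet_model *ifp)
{
	if (ifp->bpf_store.nconsumers++ == 0)
		ifp->if_bpf = &ifp->bpf_store;
}

/* Step 2b: the last consumer to leave resets it to NULL. */
static void
consumer_detach(struct ifnet_model *ifp)
{
	if (ifp->bpf_store.nconsumers > 0 &&
	    --ifp->bpf_store.nconsumers == 0)
		ifp->if_bpf = NULL;
}
```

In this model the fast path never looks inside bpf_if_model, which is
what would allow the real structure to stay opaque to drivers.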


>


-- 
WBR, Alexander


From owner-freebsd-net@FreeBSD.ORG  Mon Dec  3 15:36:57 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 087A86F0
 for <freebsd-net@freebsd.org>; Mon,  3 Dec 2012 15:36:57 +0000 (UTC)
 (envelope-from freebsd-net@m.gmane.org)
Received: from plane.gmane.org (plane.gmane.org [80.91.229.3])
 by mx1.freebsd.org (Postfix) with ESMTP id A76FB8FC15
 for <freebsd-net@freebsd.org>; Mon,  3 Dec 2012 15:36:56 +0000 (UTC)
Received: from list by plane.gmane.org with local (Exim 4.69)
 (envelope-from <freebsd-net@m.gmane.org>) id 1TfY4u-0000mO-Sl
 for freebsd-net@freebsd.org; Mon, 03 Dec 2012 16:37:04 +0100
Received: from broadband-77-37-234-86.nationalcablenetworks.ru ([77.37.234.86])
 by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
 id 1AlnuQ-0007hv-00
 for <freebsd-net@freebsd.org>; Mon, 03 Dec 2012 16:37:04 +0100
Received: from vadim_nuclight by
 broadband-77-37-234-86.nationalcablenetworks.ru with local (Gmexim 0.1
 (Debian)) id 1AlnuQ-0007hv-00
 for <freebsd-net@freebsd.org>; Mon, 03 Dec 2012 16:37:04 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-net@freebsd.org
From: Vadim Goncharov <vadim_nuclight@mail.ru>
Subject: Re: [CFT] Virtual BPF interfaces
Date: Mon, 3 Dec 2012 15:36:33 +0000 (UTC)
Organization: Nuclear Lightning @ Moscow, Home
Lines: 63
Message-ID: <slrnkbpho3.2f5k.vadim_nuclight@kernblitz.nuclight.ipfw.ru>
References: <4F96D11B.2060007@FreeBSD.org>
 <20120425.020518.406495893112283552.hrs@allbsd.org>
 <4F96E71B.9020405@FreeBSD.org>
 <20120427.084414.1142593201575277510.hrs@allbsd.org>
 <4FD4AD29.3040204@FreeBSD.org> <50BAA552.1010707@FreeBSD.org>
 <20121203081134.GO14202@glebius.int.ru> <50BC989E.3080303@FreeBSD.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: broadband-77-37-234-86.nationalcablenetworks.ru
X-Comment-To: Alexander V. Chernikov
User-Agent: slrn/0.9.9p1 (FreeBSD)
Cc: freebsd-ipfw@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: vadim_nuclight@mail.ru
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 03 Dec 2012 15:36:57 -0000

Hi Alexander V. Chernikov! 

On Mon, 03 Dec 2012 16:18:38 +0400; Alexander V. Chernikov wrote about 'Re: [CFT] Virtual BPF interfaces':

> On 03.12.2012 12:11, Gleb Smirnoff wrote:
>> On Sun, Dec 02, 2012 at 04:48:18AM +0400, Alexander V. Chernikov wrote:
>> A> On 10.06.2012 18:20, Alexander V. Chernikov wrote:
>> A>> On 27.04.2012 03:44, Hiroki Sato wrote:
>> A>>> "Alexander V. Chernikov"<melifaro@FreeBSD.org> wrote
>> A>>> in<4F96E71B.9020405@FreeBSD.org>:
>> A>>>
>> A>>> me> On 24.04.2012 21:05, Hiroki Sato wrote:
>> A>>
>> A>> Proof-of-concept patch attached.
>> A>
>> A> Hopefully, libcap code is easily extendable.
>> A> New version attached:
>> A> * BPF code is now able to use 'virtual' interfaces without real ifnet
>> A> * New bpfattach3() / bpfdetach3() routines were added to attach virtual
>> A> ifaces
>> A> * New BIOCGIFLIST ioctl is added to permit userland to retrieve
>> A> available virtual interfaces
>> A> * freebsd-specific 'platform_finddevs' version is added to libpcap code
>> A> (new file)
>> A>
>> A> There are some rough edges (conditional code in pcap-bpf.c, lack of
>> A> documentation, maybe some style issues), but generally it seems to work
>> A> and does not interfere with contrib/ code much (from my point of view).
>> A>
>> A> ipfw log device was converted to use new bpf(4) api, see attached patch.
>>
>> Nice proof of concept, Alexander!
>>
>> What does prevent us from unifing all bpf providers to be "virtual" in
>> current terms? I think if we finish divorce between ifnet and bpf, the code
>> would get simplier and you can proceed further with reducing locking
>> overhead.

> We have to jump from ifnet to the list of per-ifnet BPF consumers 
> somehow, so I'm not sure if we can do much more here. BPF itself doesn't 
> require much from parent ifnet.

> What I really want to do next is the following:

> 1) Make BPF_PEERS_PRESENT(ifp) to be (ifp->if_bpf != NULL). This saves 
> some processing time and permits 'bpf_if' to be be totally opaque 
> without any hacks.
> 2) Set if_bpf pointer IFF there are some consumers (and set it back to 
> NULL when all consumers are detached). This should work well for 'main' 
> BPF DLT, but single (currently, 802.11) interface can hold more than one 
> DLTs. Probably we can save dst pointer passed to bpfattach2() to given 

There will probably be more of them once we support tcpdump -i ifgroupname,
as an admin can decide to put interfaces with different DLTs into one group.

> bpf_if structure, and set this value instead of ->if_bpf.
> This, however, can lead to hard-to-find problems, since bpfattach[2] is 
> usually not called by driver directly.


-- 
WBR, Vadim Goncharov. ICQ#166852181       mailto:vadim_nuclight@mail.ru
[Anti-Greenpeace][Sober FreeBSD zealot][http://nuclight.livejournal.com]


From owner-freebsd-net@FreeBSD.ORG  Mon Dec  3 16:15:34 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 7290B399
 for <freebsd-net@freebsd.org>; Mon,  3 Dec 2012 16:15:34 +0000 (UTC)
 (envelope-from keith.arner@gmail.com)
Received: from mail-ee0-f54.google.com (mail-ee0-f54.google.com [74.125.83.54])
 by mx1.freebsd.org (Postfix) with ESMTP id 0429E8FC14
 for <freebsd-net@freebsd.org>; Mon,  3 Dec 2012 16:15:33 +0000 (UTC)
Received: by mail-ee0-f54.google.com with SMTP id c13so2065218eek.13
 for <freebsd-net@freebsd.org>; Mon, 03 Dec 2012 08:15:32 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:date:x-google-sender-auth:message-id:subject
 :from:to:content-type;
 bh=EmVW8Why0F+/C0Bq8S3HSCIg+0ECaNUG15YxnZ28rxM=;
 b=YNEheMW7L8z//47eHhKytnH+aj0J+8XtIJ72PA2BsYC2DEd3L7tfoTtl4pe2Vz9ZnF
 qIe9ZGP+eHeOmWENkqGmBzookx/LNiG4SDG+ne+qyjrDfoyTquWnjGux9RLiB3gxDZ/8
 wLbyYDgurmj/bLgqQFfrPuQt39G0DxNPMzHsliLigPok7l+jRTTy/XD9gE9Ti+dyxgH7
 rUIpHT+PbhURvH+uQ6Pag25iGY6zgynnCWrej0rRQ2TjXl4C1Cdf+W5hpgvVKb+7k3k4
 DFYpOQ2q2qTOx1vNJtadnqcjavoVd56NutsloKSjScSP5pJROi+W0F5y67tlHT/G22W5
 cOsA==
MIME-Version: 1.0
Received: by 10.14.218.69 with SMTP id j45mr37814254eep.35.1354551332826; Mon,
 03 Dec 2012 08:15:32 -0800 (PST)
Sender: keith.arner@gmail.com
Received: by 10.14.48.1 with HTTP; Mon, 3 Dec 2012 08:15:32 -0800 (PST)
Date: Mon, 3 Dec 2012 11:15:32 -0500
X-Google-Sender-Auth: 86FYCYWSKuVss6tQFoQnOPMLK7Q
Message-ID: <CAEo_tUHOtv2DQawyS85-gaoUPawRwLFinSXWsMGcKFWP0qCgYA@mail.gmail.com>
Subject: Re: Problems with ephemeral port selection
From: Keith Arner <vornum@gmail.com>
To: freebsd-net@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 03 Dec 2012 16:15:34 -0000

> Date: Sat, 01 Dec 2012 09:28:05 +0100
> From: Andre Oppermann <andre@freebsd.org>
>
> On 30.11.2012 15:09, Keith Arner wrote:
>> I've noticed some issues with ephemeral port number selection from
>> tcp_connect(),
>
> this is an excellent analysis.  Could you please file it as a problem
> report too and post the PR-number here so we can better track it?

Done.  PR-number is: kern/174087

> From: Fernando Gont <fernando@gont.com.ar>
> Subject: Re: Problems with ephemeral port selection
>
> Please take a look at the discussion on how to "steal" incomming
> connections in Section 3.1 of RFC 6056.

Fair point.  I added your comment to kern/174087 when I filed it.
The points made in RFC 6056 actually answer a few outstanding
questions I had about why in_pcbbind_setup() behaves the way
it does.  In particular, I previously couldn't figure out why it was
taking special consideration for unconnected sockets.

With that in mind, I believe the criteria for check_suitable_port()
(as described by RFC 6056) should be*:

  A candidate ephemeral port is suitable if and only if:
  1) There is no other existing local socket with the same 5-tuple.
  2) There is no local socket using the same local port number,
       and with either a wildcard fport or wildcard faddr.

I had previously suggested using in_pcblookup_hash() as
a check_suitable_port() function.  That would suffice for criterion
#1, but would fall short for criterion #2.  Looks like we need
yet another pcb lookup function.
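
As a rough illustration of the two criteria, here is a user-space
sketch that checks them with a linear scan. Everything in it is an
assumption made for the example: sock_model, port_is_suitable() and the
WILDCARD_* constants are hypothetical names, the protocol element of
the 5-tuple is left out (the table is assumed to be per-protocol), and
a real implementation would use the pcb hash tables rather than a scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define	WILDCARD_ADDR	0u	/* stands in for INADDR_ANY */
#define	WILDCARD_PORT	0u

/* Hypothetical, flattened view of one local socket. */
struct sock_model {
	uint32_t laddr, faddr;
	uint16_t lport, fport;
};

/* Return true if lport may be used for the proposed connection. */
static bool
port_is_suitable(const struct sock_model *tab, size_t n,
    uint32_t laddr, uint16_t lport, uint32_t faddr, uint16_t fport)
{
	for (size_t i = 0; i < n; i++) {
		const struct sock_model *s = &tab[i];

		if (s->lport != lport)
			continue;
		/* Criterion 2: unconnected socket on this local port. */
		if (s->faddr == WILDCARD_ADDR || s->fport == WILDCARD_PORT)
			return (false);
		/* Criterion 1: the same tuple is already in use. */
		if (s->laddr == laddr && s->faddr == faddr &&
		    s->fport == fport)
			return (false);
	}
	return (true);
}
```

Note that the sketch accepts a port already used by a connected socket
as long as the foreign end differs, which is the reuse that criterion 1
is meant to permit.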

Keith

* Yes, I realize that my terminology freely mixes the abstract
concepts in the RFC with the concrete language of the FreeBSD
implementation.

-- 
"A problem well put is half solved."

From owner-freebsd-net@FreeBSD.ORG  Mon Dec  3 17:34:16 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id EA2FFF0C
 for <freebsd-net@FreeBSD.org>; Mon,  3 Dec 2012 17:34:15 +0000 (UTC)
 (envelope-from andre@freebsd.org)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id 3E2578FC1E
 for <freebsd-net@FreeBSD.org>; Mon,  3 Dec 2012 17:34:14 +0000 (UTC)
Received: (qmail 92577 invoked from network); 3 Dec 2012 19:04:45 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <andre@freebsd.org>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <melifaro@FreeBSD.org>; 3 Dec 2012 19:04:45 -0000
Message-ID: <50BCE294.4070409@freebsd.org>
Date: Mon, 03 Dec 2012 18:34:12 +0100
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:16.0) Gecko/20121026 Thunderbird/16.0.2
MIME-Version: 1.0
To: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
Subject: Re: [CFT] Virtual BPF interfaces
References: <4F96D11B.2060007@FreeBSD.org>
 <20120425.020518.406495893112283552.hrs@allbsd.org>
 <4F96E71B.9020405@FreeBSD.org>
 <20120427.084414.1142593201575277510.hrs@allbsd.org>
 <4FD4AD29.3040204@FreeBSD.org> <50BAA552.1010707@FreeBSD.org>
 <20121203081134.GO14202@glebius.int.ru> <50BC989E.3080303@FreeBSD.org>
In-Reply-To: <50BC989E.3080303@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-ipfw@FreeBSD.org, delphij@FreeBSD.org,
 Hiroki Sato <hrs@FreeBSD.org>,
 "freebsd-net@freebsd.org" <freebsd-net@FreeBSD.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 03 Dec 2012 17:34:16 -0000

On 03.12.2012 13:18, Alexander V. Chernikov wrote:
> On 03.12.2012 12:11, Gleb Smirnoff wrote:
>> On Sun, Dec 02, 2012 at 04:48:18AM +0400, Alexander V. Chernikov wrote:
>> A> On 10.06.2012 18:20, Alexander V. Chernikov wrote:
>> A> > On 27.04.2012 03:44, Hiroki Sato wrote:
>> A> >> "Alexander V. Chernikov"<melifaro@FreeBSD.org> wrote
>> A> >> in<4F96E71B.9020405@FreeBSD.org>:
>> A> >>
>> A> >> me> On 24.04.2012 21:05, Hiroki Sato wrote:
>> A> >
>> A> > Proof-of-concept patch attached.
>> A>
>> A> Hopefully, libcap code is easily extendable.
>> A> New version attached:
>> A> * BPF code is now able to use 'virtual' interfaces without real ifnet
>> A> * New bpfattach3() / bpfdetach3() routines were added to attach virtual
>> A> ifaces
>> A> * New BIOCGIFLIST ioctl is added to permit userland to retrieve
>> A> available virtual interfaces
>> A> * freebsd-specific 'platform_finddevs' version is added to libpcap code
>> A> (new file)
>> A>
>> A> There are some rough edges (conditional code in pcap-bpf.c, lack of
>> A> documentation, maybe some style issues), but generally it seems to work
>> A> and does not interfere with contrib/ code much (from my point of view).
>> A>
>> A> ipfw log device was converted to use new bpf(4) api, see attached patch.
>>
>> Nice proof of concept, Alexander!
>>
>> What does prevent us from unifing all bpf providers to be "virtual" in
>> current terms? I think if we finish divorce between ifnet and bpf, the code
>> would get simplier and you can proceed further with reducing locking
>> overhead.
>
> We have to jump from ifnet to the list of per-ifnet BPF consumers somehow, so I'm not sure if we can
> do much more here. BPF itself doesn't require much from parent ifnet.
>
> What I really want to do next is the following:
>
> 1) Make BPF_PEERS_PRESENT(ifp) to be (ifp->if_bpf != NULL). This saves some processing time and
> permits 'bpf_if' to be be totally opaque without any hacks.

You have to be a bit careful with locking, or rather not locking.  When
the consumer is not doing any lock operations it may not (immediately)
pick up that the pointer was changed on another CPU.
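
One way to make the pointer update visible without a lock is a
release store paired with an acquire load. The sketch below uses
portable C11 atomics in user space purely as an illustration; the names
tap_model, tap_publish(), tap_clear() and tap_get() are hypothetical,
and kernel code would use the primitives from atomic(9) instead. It
also does not address when the pointed-to object may safely be freed
after the pointer is cleared.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object that readers reach through the pointer. */
struct tap_model {
	int dlt;
};

/* Shared pointer; zero-initialized, so it starts out as NULL. */
static _Atomic(struct tap_model *) tap_ptr;

/* Writer: release store, so earlier writes to *t are visible first. */
static void
tap_publish(struct tap_model *t)
{
	atomic_store_explicit(&tap_ptr, t, memory_order_release);
}

/* Writer: withdraw the pointer again. */
static void
tap_clear(void)
{
	atomic_store_explicit(&tap_ptr, (struct tap_model *)NULL,
	    memory_order_release);
}

/* Reader: acquire load pairs with the release stores above. */
static struct tap_model *
tap_get(void)
{
	return (atomic_load_explicit(&tap_ptr, memory_order_acquire));
}
```

A reader should call tap_get() once and work with the returned value,
rather than testing the pointer and then loading it a second time.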

> 2) Set if_bpf pointer IFF there are some consumers (and set it back to NULL when all consumers are
> detached). This should work well for 'main' BPF DLT, but single (currently, 802.11) interface can
> hold more than one DLTs. Probably we can save dst pointer passed to bpfattach2() to given bpf_if
> structure, and set this value instead of ->if_bpf.
> This, however, can lead to hard-to-find problems, since bpfattach[2] is usually not called by driver
> directly.

Separately from the above, BPF on the output side may be optimized by
passing the mbuf not from drv*_start() but from drv*_txeof().  There may
be a few microseconds of delay, but an mbuf (chain) copy is saved in the
transmit path.  As an additional benefit, only those packets that were
actually transmitted are presented to bpf.
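
The idea can be modelled with a toy transmit ring in user space. This
is a sketch only: txring_model, model_start() and model_txeof() are
hypothetical names, the done[] flag stands in for a hardware completion
status, and the tap is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	RING_SLOTS	4

struct pkt_model {
	int id;
};

struct txring_model {
	struct pkt_model *slot[RING_SLOTS];
	bool done[RING_SLOTS];	/* set by the "hardware" once sent */
	size_t head, tail;	/* head: next to reclaim, tail: next free */
	int tapped;		/* packets handed to the tap so far */
};

/* Start path: only queue the packet, no tap and no copy here. */
static bool
model_start(struct txring_model *r, struct pkt_model *p)
{
	if (r->tail - r->head == RING_SLOTS)
		return (false);		/* ring is full */
	r->slot[r->tail % RING_SLOTS] = p;
	r->done[r->tail % RING_SLOTS] = false;
	r->tail++;
	return (true);
}

/* Completion path: tap each packet as its descriptor is reclaimed. */
static void
model_txeof(struct txring_model *r)
{
	while (r->head != r->tail && r->done[r->head % RING_SLOTS]) {
		r->tapped++;	/* the packet really went out */
		r->slot[r->head % RING_SLOTS] = NULL;
		r->head++;
	}
}
```

Packets that are queued but never completed are never counted in this
model, which corresponds to the "only transmitted packets" benefit.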

-- 
Andre


From owner-freebsd-net@FreeBSD.ORG  Tue Dec  4 18:43:03 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 87770496
 for <freebsd-net@freebsd.org>; Tue,  4 Dec 2012 18:43:03 +0000 (UTC)
 (envelope-from jhb@freebsd.org)
Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net
 [IPv6:2001:470:1f10:75::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 474488FC1D
 for <freebsd-net@freebsd.org>; Tue,  4 Dec 2012 18:43:03 +0000 (UTC)
Received: from pakbsde14.localnet (unknown [38.105.238.108])
 by bigwig.baldwin.cx (Postfix) with ESMTPSA id 77BE5B9BC;
 Tue,  4 Dec 2012 13:43:02 -0500 (EST)
From: John Baldwin <jhb@freebsd.org>
To: Barney Cordoba <barney_cordoba@yahoo.com>
Subject: Re: Latency issues with buf_ring
Date: Tue, 4 Dec 2012 11:08:17 -0500
User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; )
References: <1353259441.19423.YahooMailClassic@web121605.mail.ne1.yahoo.com>
In-Reply-To: <1353259441.19423.YahooMailClassic@web121605.mail.ne1.yahoo.com>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201212041108.17645.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
 (bigwig.baldwin.cx); Tue, 04 Dec 2012 13:43:02 -0500 (EST)
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 04 Dec 2012 18:43:03 -0000

On Sunday, November 18, 2012 12:24:01 pm Barney Cordoba wrote:
> 
> --- On Thu, 1/19/12, John Baldwin <jhb@freebsd.org> wrote:
> 
> > From: John Baldwin <jhb@freebsd.org>
> > Subject: Latency issues with buf_ring
> > To: net@freebsd.org
> > Cc: "Ed Maste" <emaste@freebsd.org>, "Navdeep Parhar" <np@freebsd.org>
> > Date: Thursday, January 19, 2012, 11:41 AM
> > The current buf_ring usage in various
> > NIC drivers has a race that can
> > result in really high packet latencies in some cases. 
> > Specifically,
> > the common pattern in an if_transmit routine is to use a
> > try-lock on
> > the queue and if that fails enqueue the packet in the
> > buf_ring and
> > return.  The race, of course, is that the thread
> > holding the lock
> > might have just finished checking the buf_ring and found it
> > empty and
> > be in the process of releasing the lock when the original
> > thread fails
> > the try lock.  If this happens, then the packet queued
> > by the first
> > thread will be stalled until another thread tries to
> > transmit packets
> > for that queue.  Some drivers attempt to handle this
> > race (igb(4)
> > schedules a task to kick the transmit queue if the try lock
> > fails) and
> > others don't (cxgb(4) doesn't handle it at all).  At
> > work this race
> > was triggered very often after upgrading from 7 to 8 with
> > bursty
> > traffic and caused numerous problems, so it is not a rare
> > occurrence
> > and needs to be addressed.
> > 
> > (Note, all patches mentioned are against 8)
> > 
> > The first hack I tried to use was to simply always lock the
> > queue after
> > the drbr enqueue if the try lock failed and then drain the
> > queue if
> > needed (www.freebsd.org/~jhb/patches/try_fail.patch). 
> > While this fixed
> > my latency problems, it would seem that this breaks other
> > workloads
> > that the drbr design is trying to optimize.
> > 
> > After further hacking what I came up with was a variant of
> > drbr_enqueue()
> > that would atomically set a 'pending' flag during the
> > enqueue operation.
> > The first thread to fail the try lock sets this flag (it is
> > told that it
> > set the flag by a new return value (EINPROGRESS) from the
> > enqueue call).
> > The pending thread then explicitly clears the flag once it
> > acquires the
> > queue lock.  This should prevent multiple threads from
> > stacking up on the
> > queue lock so that if multiple threads are dumping packets
> > into the ring
> > concurrently all but two (the one draining the queue
> > currently and the
> > one waiting for the lock) can continue to drain the
> > queue.  One downside
> > of this approach though is that each driver has to be
> > changed to make
> > an explicit call to clear the pending flag after grabbing
> > the queue lock
> > if the try lock fails.  This is what I am currently
> > running in production
> > (www.freebsd.org/~jhb/patches/try_fail3.patch).
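
A rough model of the 'pending' flag idea, for illustration only. PendingRing, enqueue_pending() and clear_pending() are hypothetical names chosen for this sketch and are not taken from try_fail3.patch; the real flag update would have to be atomic, which a single-threaded toy does not show.

```python
import errno
from collections import deque


class PendingRing:
    """Toy ring with a 'pending' flag (names are hypothetical)."""

    def __init__(self):
        self.items = deque()
        self.pending = False

    def enqueue_pending(self, pkt):
        # Called only after the try-lock has failed.  In the kernel the flag
        # update would have to be atomic; this sketch is single-threaded.
        self.items.append(pkt)
        if self.pending:
            return 0                  # someone else is already waiting
        self.pending = True
        return errno.EINPROGRESS      # caller must go block on the lock

    def clear_pending(self):
        # Called by the waiting thread once it holds the queue lock.
        self.pending = False


ring = PendingRing()
first = ring.enqueue_pending("p1")    # first loser of the try-lock
second = ring.enqueue_pending("p2")   # later losers just return
ring.clear_pending()                  # the waiter got the lock
third = ring.enqueue_pending("p3")    # next loser becomes the new waiter
```

Only the caller that sees EINPROGRESS goes on to block on the queue lock, which is how at most one thread ends up waiting at a time.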
> > 
> > However, this still results in a lot of duplicated code in
> > each driver
> > that wants to support multiq.  Several folks have
> > expressed a desire
> > to move in a direction where the stack has explicit
> > knowledge of
> > transmit queues allowing us to hoist some of this duplicated
> > code out
> > of the drivers and up into the calling layer.  After
> > discussing this a
> > bit with Navdeep (np@), the approach I am looking at is to
> > alter the
> > buf_ring code flow a bit to more closely model the older
> > code-flow
> > with IFQ and if_start methods.  That is, have the
> > if_transmit methods
> > always enqueue each packet that arrives to the buf_ring and
> > then to
> > call an if_start-like method that drains a specific transmit
> > queue.
> > This approach simplifies a fair bit of driver code and means
> > we can
> > potentially move the enqueue, etc. bits up into the calling
> > layer and
> > instead have drivers provide the per-transmit queue start
> > routine as
> > the direct function pointer to the upper layers ala
> > if_start.
> > 
> > However, we would still need a way to close the latency
> > race.  I've
> > attempted to do that by inverting my previous 'thread
> > pending' flag.
> > Instead, I make the buf_ring store a 'busy' flag.  This
> > flag is
> > managed by the single-consumer buf_ring dequeue method
> > (that
> > drbr_dequeue() uses).  It is set to true when a packet
> > is removed from
> > the queue while there are more packets pending. 
> > Conversely, if there
> > are no other pending packets then it is set to false. 
> > The assumption
> > is that once a thread starts draining the queue, it will not
> > stop
> > until the queue is empty (or if it has to stop for some
> > other reason
> > such as the transmit ring being full, the driver will
> > restart draining
> > of the queue until it is empty, e.g. after it receives a
> > transmit
> > completion interrupt).  Now when the if_transmit
> > routine enqueues the
> > packet, it will get either a real error, 0 if the packet was
> > enqueued
> > and the queue was idle, or EINPROGRESS if the packet was
> > enqueued
> > and the queue was busy.  For the EINPROGRESS case the
> > if_transmit
> > routine just returns success.  For the 0 case it does a
> > blocking lock
> > on the queue lock and calls the queue's start routine (note
> > that this
> > means that the busy flag is similar to the old OACTIVE
> > interface
> > flag).  This does mean that in some cases you may have
> > one thread that
> > is sending what was the last packet in the buf_ring holding
> > the lock
> > when another thread blocks, and that the first thread will
> > see the new
> > packet when it loops back around so that the second thread
> > is wasting
> > its time spinning, but in the common case I believe it will
> > give the
> > same parallelism as the current code.  OTOH, there is
> > nothing to
> > prevent multiple threads from "stacking up" in the new
> > approach.  At
> > least the try_fail3 patch ensured only one thread at a time
> > would ever
> > potentially block on the queue lock.
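
The 'busy' flag scheme described above can be modelled roughly as follows. This is a sketch under assumptions: BusyRing, dequeue_sc() and the start callback are illustrative names rather than the contents of drbr.patch, the queue lock is left out, and everything runs in one thread.

```python
import errno
from collections import deque


class BusyRing:
    """Toy ring whose single-consumer dequeue maintains a 'busy' flag."""

    def __init__(self):
        self.items = deque()
        self.busy = False

    def enqueue(self, pkt):
        self.items.append(pkt)
        # EINPROGRESS: a drainer is active and will pick the packet up.
        # 0: the queue looked idle, so the caller must start it.
        return errno.EINPROGRESS if self.busy else 0

    def dequeue_sc(self):
        if not self.items:
            self.busy = False
            return None
        pkt = self.items.popleft()
        # Busy only while more packets remain behind the one just removed.
        self.busy = bool(self.items)
        return pkt


def if_transmit(ring, pkt, start):
    rc = ring.enqueue(pkt)
    if rc == errno.EINPROGRESS:
        return 0          # nothing more to do, report success
    if rc != 0:
        return rc         # a real error (never produced by this toy ring)
    start(ring)           # real code would take the queue lock first
    return 0


sent = []


def start(ring):
    # Drain until empty, as the scheme assumes drivers do.
    while True:
        pkt = ring.dequeue_sc()
        if pkt is None:
            break
        sent.append(pkt)


ring = BusyRing()
if_transmit(ring, "p1", start)
```

Note that the flag drops to false as soon as the last packet is removed, so an enqueue that arrives while that last packet is still being sent gets 0 and goes to the lock, which is the wasted-wait case mentioned above.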
> > 
> > Another approach might be to replace the 'busy' flag with
> > the 'thread
> > pending' flag from try_fail3.patch, but to clear the 'thread
> > pending'
> > flag anytime the dequeue method is called rather than using
> > an
> > explicit 'clear pending' method.  (Hadn't thought of
> > that until
> > writing this e-mail.)  That would prevent multiple
> > threads from
> > waiting on the queue lock perhaps.
> > 
> > Note that the 'busy' approach (or the modification I
> > mentioned above)
> > does rely on the assumption I stated above, i.e. once a
> > driver starts
> > draining a queue, it will drain it until empty unless it
> > hits an
> > "error" condition (link went down, transmit ring full,
> > etc.).  If it
> > hits an "error" condition, the driver is responsible for
> > restarting
> > transmit when the condition clears.  I believe our
> > drivers already
> > work this way now.
> > 
> > The 'busy' patch is at http://www.freebsd.org/~jhb/patches/drbr.patch
> > 
> > -- 
> > John Baldwin
> 
> Q1: Has this been corrected?

No.  I have yet to be able to raise a meaningful discussion about possible 
solutions to this.

> Q2: Are there any case studies or benchmarks for buf_ring, or is it just
> blindly being used because someone claimed it was better and offered it
> for free? One of the points of locking is to avoid race conditions, so the
> fact that you have races in a supposed lock-less scheme seems more than just
> ironic.

The buf_ring author claims it has benefits in high pps workloads.  I am not 
aware of any benchmarks, etc.

-- 
John Baldwin

From owner-freebsd-net@FreeBSD.ORG  Tue Dec  4 19:34:37 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 7CCC42E4;
 Tue,  4 Dec 2012 19:34:37 +0000 (UTC)
 (envelope-from adrian.chadd@gmail.com)
Received: from mail-wi0-f180.google.com (mail-wi0-f180.google.com
 [209.85.212.180])
 by mx1.freebsd.org (Postfix) with ESMTP id D9EA98FC0C;
 Tue,  4 Dec 2012 19:34:36 +0000 (UTC)
Received: by mail-wi0-f180.google.com with SMTP id hj13so854359wib.13
 for <multiple recipients>; Tue, 04 Dec 2012 11:34:35 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=nOwx4cI4NlCjeGnwjUHbdLlEYhtNXtuMfHL3cRMCfUM=;
 b=I8j4+NOl/OdkArwgmSZP1baSk+2pfcWYBRVLGMBLTHNpdQbG5wP3RZX70fG4O9fwHd
 kbM7pbs1hqtaoEyzjgnwOCe79BdBA/oEHXyriR0lpEhv/+5ADx5UZ18cVFRwnr2FIhze
 qXC0pNL+t+k+P6istLvFVCzGEIAVTRCgOnkqGl5un5rwnB65GKiGH57G/VMKmYCfzwZ7
 96T2cAyYvB3GGUisc+t2oU5DWEx/yMaN0MCBzcQuSkzRrOuRozIg38u2Pv5YdDbHtgmc
 uCl5Cqh9rfM75OLwKd8Wuf4ZSXAW4qkcVtyDZjVt6BVe+nZ+tveLPJj85FQlLjgRKkdo
 I3mQ==
MIME-Version: 1.0
Received: by 10.216.85.211 with SMTP id u61mr5672480wee.212.1354649675658;
 Tue, 04 Dec 2012 11:34:35 -0800 (PST)
Sender: adrian.chadd@gmail.com
Received: by 10.217.57.9 with HTTP; Tue, 4 Dec 2012 11:34:35 -0800 (PST)
In-Reply-To: <201212041108.17645.jhb@freebsd.org>
References: <1353259441.19423.YahooMailClassic@web121605.mail.ne1.yahoo.com>
 <201212041108.17645.jhb@freebsd.org>
Date: Tue, 4 Dec 2012 11:34:35 -0800
X-Google-Sender-Auth: ouURajnVNcCsSzqsreuwjeeKdNU
Message-ID: <CAJ-Vmo=tFFkeK2uADMPuBrgX6wN_9TSjAgs0WKPCrEfyhkG6Pw@mail.gmail.com>
Subject: Re: Latency issues with buf_ring
From: Adrian Chadd <adrian@freebsd.org>
To: John Baldwin <jhb@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1
Cc: Barney Cordoba <barney_cordoba@yahoo.com>, freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 04 Dec 2012 19:34:37 -0000

.. and it's important to note that buf_ring itself doesn't have the
race condition; it's the general driver implementation that's racy.

I have the same races in ath(4) with the watchdog programming. Exactly
the same issue.


Adrian

From owner-freebsd-net@FreeBSD.ORG  Tue Dec  4 20:02:32 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id A515EDF0
 for <freebsd-net@freebsd.org>; Tue,  4 Dec 2012 20:02:32 +0000 (UTC)
 (envelope-from oppermann@networx.ch)
Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2])
 by mx1.freebsd.org (Postfix) with ESMTP id 0D2578FC08
 for <freebsd-net@freebsd.org>; Tue,  4 Dec 2012 20:02:31 +0000 (UTC)
Received: (qmail 5917 invoked from network); 4 Dec 2012 21:32:44 -0000
Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2])
 (envelope-sender <oppermann@networx.ch>)
 by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP
 for <adrian@freebsd.org>; 4 Dec 2012 21:32:44 -0000
Message-ID: <50BE56C8.1030804@networx.ch>
Date: Tue, 04 Dec 2012 21:02:16 +0100
From: Andre Oppermann <oppermann@networx.ch>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:16.0) Gecko/20121026 Thunderbird/16.0.2
MIME-Version: 1.0
To: Adrian Chadd <adrian@freebsd.org>
Subject: Re: Latency issues with buf_ring
References: <1353259441.19423.YahooMailClassic@web121605.mail.ne1.yahoo.com>
 <201212041108.17645.jhb@freebsd.org>
 <CAJ-Vmo=tFFkeK2uADMPuBrgX6wN_9TSjAgs0WKPCrEfyhkG6Pw@mail.gmail.com>
In-Reply-To: <CAJ-Vmo=tFFkeK2uADMPuBrgX6wN_9TSjAgs0WKPCrEfyhkG6Pw@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Barney Cordoba <barney_cordoba@yahoo.com>, John Baldwin <jhb@freebsd.org>,
 freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 04 Dec 2012 20:02:32 -0000

On 04.12.2012 20:34, Adrian Chadd wrote:
> .. and it's important to note that buf_ring itself doesn't have the
> race condition; it's the general driver implementation that's racy.
>
> I have the same races in ath(4) with the watchdog programming. Exactly
> the same issue.

Our IF_* stack/driver boundary handoff isn't up to the task anymore.

Also the interactions are either poorly defined or understood in many
places.  I've had a few chats with yongari@ and am experimenting with
a modernized interface in my branch.

The reason I stumbled across it was because I'm extending the hardware
offload feature set and found out that the stack and the drivers (and
the drivers among themselves) are not really in sync with regard to behavior.

For most if not all ethernet drivers from 100Mbit/s the TX DMA rings
are so large that buffering at the IFQ level doesn't make sense anymore
and only adds latency.  So it could simply directly put everything into
the TX DMA and not even try to soft-queue.  If the TX DMA ring is full
ENOBUFS is returned instead of filling yet another queue.  However there
are ALTQ interactions and other mechanisms which have to be considered
too making it a bit more involved.
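
As a minimal sketch of the "no soft queue" idea (ignoring ALTQ entirely), the handoff could look something like this. TxDmaRing and its methods are hypothetical names for illustration, not an existing driver interface.

```python
import errno


class TxDmaRing:
    """Toy TX descriptor ring (hypothetical names, no soft queue)."""

    def __init__(self, size):
        self.size = size
        self.in_flight = 0

    def transmit(self, pkt):
        if self.in_flight >= self.size:
            return errno.ENOBUFS      # ring full: tell the caller right away
        self.in_flight += 1           # descriptor handed to the hardware
        return 0

    def tx_complete(self, count):
        # Called from the completion interrupt: reclaim finished descriptors.
        self.in_flight -= min(count, self.in_flight)


ring = TxDmaRing(size=4)
results = [ring.transmit(n) for n in range(5)]   # the fifth one does not fit
ring.tx_complete(2)
retry = ring.transmit(4)                         # room again after completion
```

The caller sees ENOBUFS immediately instead of having the packet parked in a second queue, so whatever sits above the driver has to decide what to do with it.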

I'm coming up with a draft and some benchmark results for an updated
stack/driver boundary in the next weeks before xmas.

-- 
Andre


From owner-freebsd-net@FreeBSD.ORG  Tue Dec  4 21:31:26 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id F26E2819;
 Tue,  4 Dec 2012 21:31:25 +0000 (UTC)
 (envelope-from adrian.chadd@gmail.com)
Received: from mail-we0-f182.google.com (mail-we0-f182.google.com
 [74.125.82.182])
 by mx1.freebsd.org (Postfix) with ESMTP id 4FAC78FC19;
 Tue,  4 Dec 2012 21:31:24 +0000 (UTC)
Received: by mail-we0-f182.google.com with SMTP id u54so2209471wey.13
 for <multiple recipients>; Tue, 04 Dec 2012 13:31:23 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=A+iA2cUXAhMGGI46BKigGPjZEnEKIEBPalVUClTApi4=;
 b=Uezhxn3hNNm7kZtALNJj2VS45tb18yKP4llDpfMt3OcwZCyQmmb9i+98lxs1Mk7aU8
 /DrJmTi41DQFeA3dLhCwG7W+JrIT1/RjSmDbu2O+ewj5asBgjd02hWKrM9yXBD0UzfQ+
 9aYxr34ul5F9pjPN3v8/Tq7sccBwfRnKuIptOP6dBHnXwOX6xqcNz1to9TXMqbWDNc/S
 GxT+rdm6O0QZnZ0fCsEluXq02HjHoZ8wb0Qca5V2rf+c/XcNhZGABdsfU7zpCiuQV1OQ
 lmqKR4palTm2FupfA0yPjgm5KkzK21ib09nG7q2P2zFcSEAPVBTbCzcBulvLsqwjPgab
 G6MA==
MIME-Version: 1.0
Received: by 10.216.139.140 with SMTP id c12mr5872057wej.46.1354656683288;
 Tue, 04 Dec 2012 13:31:23 -0800 (PST)
Sender: adrian.chadd@gmail.com
Received: by 10.217.57.9 with HTTP; Tue, 4 Dec 2012 13:31:23 -0800 (PST)
In-Reply-To: <50BE56C8.1030804@networx.ch>
References: <1353259441.19423.YahooMailClassic@web121605.mail.ne1.yahoo.com>
 <201212041108.17645.jhb@freebsd.org>
 <CAJ-Vmo=tFFkeK2uADMPuBrgX6wN_9TSjAgs0WKPCrEfyhkG6Pw@mail.gmail.com>
 <50BE56C8.1030804@networx.ch>
Date: Tue, 4 Dec 2012 13:31:23 -0800
X-Google-Sender-Auth: CK7HGl4-msBdEVpC_rzmRv2l7J8
Message-ID: <CAJ-Vmok+W_LgSCnETLOAogucqUSy+yBixsdNj-2Aepy+1Lo7gw@mail.gmail.com>
Subject: Re: Latency issues with buf_ring
From: Adrian Chadd <adrian@freebsd.org>
To: Andre Oppermann <oppermann@networx.ch>
Content-Type: text/plain; charset=ISO-8859-1
Cc: Barney Cordoba <barney_cordoba@yahoo.com>, John Baldwin <jhb@freebsd.org>,
 freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 04 Dec 2012 21:31:26 -0000

On 4 December 2012 12:02, Andre Oppermann <oppermann@networx.ch> wrote:

> Our IF_* stack/driver boundary handoff isn't up to the task anymore.

Right. Well, the current hand off is really "here's a packet, go do
stuff!" and the legacy if_start() method is just plain broken for SMP,
preemption and direct dispatch.

Things are also very special in the net80211 world, with the stack
layer having to get its grubby fingers into things.

I'm sure that the other examples of layered protocols (eg doing MPLS,
or even just straight PPPoE style tunneling) have the same issues.
Anything with sequence numbers and encryption being done by some other
layer is going to have the same issue, unless it's all enforced via
some other queue and a single thread handling the network stack
"stuff".

I bet direct-dispatch netgraph will have similar issues too, if it
ever comes into existence. :-)

> Also the interactions are either poorly defined or understood in many
> places.  I've had a few chats with yongari@ and am experimenting with
> a modernized interface in my branch.
>
> The reason I stumbled across it was because I'm extending the hardware
> offload feature set and found out that the stack and the drivers (and
> the drivers among themselves) are not really in sync with regard to behavior.
>
> For most if not all ethernet drivers from 100Mbit/s the TX DMA rings
> are so large that buffering at the IFQ level doesn't make sense anymore
> and only adds latency.  So it could simply directly put everything into
> the TX DMA and not even try to soft-queue.  If the TX DMA ring is full
> ENOBUFS is returned instead of filling yet another queue.  However there
> are ALTQ interactions and other mechanisms which have to be considered
> too making it a bit more involved.

net80211 has slightly different problems. We have requirements for
per-node, per-TID/per-AC state (not just for QOS, but separate
sequence numbers, different state machine handling for things like
aggregation and (later) U-APSD handling, etc) so we do need to direct
frames into different queues and then correctly serialise that mess.
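
A very rough picture of what per-node, per-TID state implies, as a toy in Python. PerTidQueues is a hypothetical structure for illustration and not net80211 code; the only hard fact used is that 802.11 sequence numbers are 12 bits wide.

```python
from collections import defaultdict, deque


class PerTidQueues:
    """Toy per-(node, TID) software queues (hypothetical structure)."""

    SEQ_MODULO = 4096   # 802.11 sequence numbers are 12 bits wide

    def __init__(self):
        self.queues = defaultdict(deque)
        self.next_seq = defaultdict(int)

    def enqueue(self, node, tid, frame):
        # Sequence assignment and queue insertion must happen in the same
        # order, which is why real code has to serialise this step.
        key = (node, tid)
        seq = self.next_seq[key]
        self.next_seq[key] = (seq + 1) % self.SEQ_MODULO
        self.queues[key].append((seq, frame))
        return seq


txq = PerTidQueues()
a0 = txq.enqueue("sta-a", 0, "frame1")
a1 = txq.enqueue("sta-a", 0, "frame2")
b0 = txq.enqueue("sta-b", 5, "frame3")   # separate sequence space
```

If two threads could run enqueue() for the same key concurrently, frames could reach the queue in a different order from their sequence numbers, which is the serialisation problem referred to above.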

> I'm coming up with a draft and some benchmark results for an updated
> stack/driver boundary in the next weeks before xmas.

Ok. Please don't rush into it though; I'd like time to think about it
after NY (as I may actually _have_ a holiday this xmas!) and I'd like
to try and rope in people from non-ethernet-packet-pushing backgrounds
to comment.
They may have much stricter and/or stranger requirements when it comes
to how the network layer passes, serialises and pushes packets to
other layers.

Thanks,


Adrian

From owner-freebsd-net@FreeBSD.ORG  Wed Dec  5 03:31:39 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id F3616D4F;
 Wed,  5 Dec 2012 03:31:38 +0000 (UTC)
 (envelope-from brde@optusnet.com.au)
Received: from fallbackmx06.syd.optusnet.com.au
 (fallbackmx06.syd.optusnet.com.au [211.29.132.8])
 by mx1.freebsd.org (Postfix) with ESMTP id 2824A8FC08;
 Wed,  5 Dec 2012 03:31:37 +0000 (UTC)
Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au
 [211.29.132.186])
 by fallbackmx06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
 qB53VUG7026821; Wed, 5 Dec 2012 14:31:31 +1100
Received: from c122-106-175-26.carlnfd1.nsw.optusnet.com.au
 (c122-106-175-26.carlnfd1.nsw.optusnet.com.au [122.106.175.26])
 by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id qB53VHcG016927
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
 Wed, 5 Dec 2012 14:31:21 +1100
Date: Wed, 5 Dec 2012 14:31:17 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Andre Oppermann <oppermann@networx.ch>
Subject: Re: Latency issues with buf_ring
In-Reply-To: <50BE56C8.1030804@networx.ch>
Message-ID: <20121205112511.Q932@besplex.bde.org>
References: <1353259441.19423.YahooMailClassic@web121605.mail.ne1.yahoo.com>
 <201212041108.17645.jhb@freebsd.org>
 <CAJ-Vmo=tFFkeK2uADMPuBrgX6wN_9TSjAgs0WKPCrEfyhkG6Pw@mail.gmail.com>
 <50BE56C8.1030804@networx.ch>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Optus-CM-Score: 0
X-Optus-CM-Analysis: v=2.0 cv=c8fz2mBl c=1 sm=1 a=ie5KVN3-GTQA:10
 a=kj9zAlcOel0A:10 a=PO7r1zJSAAAA:8 a=JzwRw_2MAAAA:8 a=SOXvLa97LiYA:10
 a=gr-qOqZ8CvggYKoQBowA:9 a=CjuIK1q_8ugA:10 a=bxQHXO5Py4tHmhUgaywp5w==:117
Cc: Barney Cordoba <barney_cordoba@yahoo.com>,
 Adrian Chadd <adrian@FreeBSD.org>, John Baldwin <jhb@FreeBSD.org>,
 freebsd-net@FreeBSD.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 05 Dec 2012 03:31:39 -0000

On Tue, 4 Dec 2012, Andre Oppermann wrote:

> For most if not all ethernet drivers from 100Mbit/s the TX DMA rings
> are so large that buffering at the IFQ level doesn't make sense anymore
> and only adds latency.

I found sort of the opposite for bge at 1Gbps.  Most or all bge NICs
have a tx ring size of 512.  The ifq length is the tx ring size minus
1 (511).  I needed to expand this to imax(2 * tick / 4, 10000) to
maximize pps.  This does bad things to latency and worse things to
caching (512 buffers might fit in the L2 cache, but 10000 buffers
bust any reasonable cache as they are cycled through), but I only
tried to optimize tx pps.

> So it could simply directly put everything into
> the TX DMA and not even try to soft-queue.  If the TX DMA ring is full
> ENOBUFS is returned instead of filling yet another queue.

That could work, but upper layers currently don't understand ENOBUFS
at all, so it would work poorly now.  Also, 512 entries is not many,
so even if upper layers understood ENOBUFS it is not easy for them to
_always_ respond fast enough to keep the tx active, unless there are
upstream buffers with many more than 512 entries.  There needs to be
enough buffering somewhere so that the tx ring can be replenished
almost instantly from the buffer, to handle the worst-case latency
for the threads generating new (unbuffered) packets.  At the line rate
of ~1.5 Mpps for 1 Gbps, the maximum latency that can be covered by
512 entries is only 340 usec.
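
The arithmetic behind those figures, as a small sketch. The 84-byte wire size (64-byte minimum frame plus 8 bytes of preamble/SFD and 12 bytes of inter-frame gap) is the standard Ethernet accounting; the function names are just for illustration.

```python
WIRE_BYTES_MIN = 84   # 64-byte frame + 8 preamble/SFD + 12 inter-frame gap


def min_frame_pps(link_bps):
    """Packets per second at line rate with minimum-size frames."""
    return link_bps / (WIRE_BYTES_MIN * 8)


def ring_cover_usec(entries, pps):
    """How long a full ring of `entries` packets lasts at `pps`."""
    return entries / pps * 1e6


gige_pps = min_frame_pps(1e9)                 # just under 1.49 Mpps
cover_exact = ring_cover_usec(512, gige_pps)  # about 344 usec
cover_rounded = ring_cover_usec(512, 1.5e6)   # about 341 usec with 1.5 Mpps
```

Either way a full 512-entry ring buys roughly a third of a millisecond before the transmitter runs dry.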

> However there
> are ALTQ interactions and other mechanisms which have to be considered
> too making it a bit more involved.

I didn't try to handle ALTQ or even optimize for TCP.

More details: to maximize pps, the main detail is to ensure that the tx
ring never becomes empty.  The tx then transmits as fast as possible.
This requires some watermark processing, but FreeBSD has almost none
for tx rings.  The following normally happens for packet generators
like ttcp and netsend:

- loop calling send() or sendto() until the tx ring (and also any
   upstream buffers) fill up.  Then ENOBUFS is returned.

- watermark processing is broken in the user API at this point.  There
   is no way for the application to wait for the ENOBUFS condition to
   go away (select() and poll() don't work).  Applications use poor
   workarounds:

- old (~1989) ttcp sleeps for 18 msec when send() returns ENOBUFS.  This
   was barely good enough for 1 Mbps ethernet (line rate ~1500 pps is 27
   per 18 msec, so IFQ_MAXLEN = 50 combined with just a 1-entry tx ring
   provides a safety factor of about 2).  Expansion of the tx ring size to
   512 makes this work with 10 Mbps ethernet too.  Expansion of the ifq
   to 511 gives another factor of 2.  After losing the safety factor of 2,
   we can now handle 40 Mbps ethernet, and are only a factor of 25 short
   for 1 Gbps.  My hardware can't do line rate for small packets -- it
   can only do 640 kpps.  Thus ttcp is only a factor of 11 short of
   supporting the hardware at 1 Gbps.

   This assumes that sleeps of 18 msec are actually possible, which
   they aren't with HZ = 100 giving a granularity of 10 msec so that
   sleep(18 msec) actually sleeps for an average of 23 msec.  -current
   uses the bad default of HZ = 1000.  With that sleep(18 msec) would
   average 18.5 msec.  Of course, ttcp should sleep for more like 1
   msec if that is possible.  Then the average sleep is 1.5 msec.  ttcp
   can keep up with the hardware with that, and is only slightly behind
   the hardware with the worst-case sleep of 2 msec (512+511 packets
   generated every 2 msec is 511.5 kpps).

   I normally use old ttcp, except I modify it to sleep for 1 msec instead
   of 18 in one version, and in another version I remove the sleep so that
   it busy-waits in a loop that calls send() which almost always returns
   ENOBUFS.  The latter wastes a lot of CPU, but is almost good enough
   for throughput testing.

- newer ttcp tries to program the sleep time in microseconds.  This doesn't
   really work, since the sleep granularity is normally at least a millisecond,
   and even if it could be the 340 microseconds needed by bge with no ifq
   (see above, and better divide the 340 by 2), then this is quite short
   and would take almost as much CPU as busy-waiting.  I consider HZ = 1000
   to be another form of polling/busy-waiting and don't use it except for
   testing.

- netrate/netsend also uses a programmed sleep time.  This doesn't really
   work, as above.  netsend also tries to limit its rate based on sleeping.
   This is further from working, since even finer-grained sleeps are needed
   to limit the rate accurately than to keep up with the maximum rate.
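
The sleep arithmetic in the items above can be checked with a few lines. This is only a sketch of the model used in the text (a sleep overshoots by half a tick on average, and the sender refills tx ring plus ifq once per sleep); the function names are made up.

```python
def average_sleep_ms(requested_ms, hz):
    # Model used in the text: requested time plus half a tick on average.
    tick_ms = 1000.0 / hz
    return requested_ms + tick_ms / 2


def sustainable_pps(buffered_packets, sleep_ms):
    # The buffers must last one whole sleep, so this is the best rate
    # the sender can sustain without the transmitter going idle.
    return buffered_packets / (sleep_ms / 1000.0)


buffered = 512 + 511                          # tx ring plus ifq
pps_18ms = sustainable_pps(buffered, 18)      # roughly 57 kpps
pps_2ms = sustainable_pps(buffered, 2)        # 511.5 kpps
avg_hz100 = average_sleep_ms(18, 100)         # 23 msec
avg_hz1000 = average_sleep_ms(18, 1000)       # 18.5 msec
shortfall = 640e3 / pps_18ms                  # roughly a factor of 11
```

57 kpps of minimum-size frames is a little under 40 Mbps of line rate, which is where the "40 Mbps" and "factor of 25 short for 1 Gbps" figures come from.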

Watermark processing at the kernel level is not quite as broken.  It
is mostly non-existent, but partly works, sort of accidentally.  The
difference is now that there is a tx "eof" or "completion" interrupt
which indicates the condition corresponding to the ENOBUFS condition
going away, so that the kernel doesn't have to poll for this.  This
is not really an "eof" interrupt (unless bge is programmed insanely,
to interrupt only after the tx ring is completely empty).  It acts as
primitive watermarking.  bge can be programmed to interrupt after
having sent every N packets (strictly, after every N buffer descriptors,
but for small packets these are the same).  When there are more than
N packets to start, say M, this acts as a watermark at M-N packets.
bge is normally misprogrammed with N = 10.  At the line rate of 1.5 Mpps,
this asks for an interrupt rate of 150 kHz, which is far too high and
is usually unreachable, so reaching the line rate is impossible due to
the CPU load from the interrupts.  I use N = 384 or 256 so that the
interrupt rate is not the dominant limit.  However, N = 10 is better
for latency and works under light loads.  It also reduces the amount
of buffering needed.
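
For reference, the interrupt rates implied by those values of N, assuming one tx interrupt per N packets at the nominal 1.5 Mpps (the function name is only for illustration):

```python
def tx_interrupt_rate_hz(pps, packets_per_interrupt):
    """Interrupts per second when the NIC interrupts every N packets."""
    return pps / packets_per_interrupt


line_rate = 1.5e6
rates = {n: tx_interrupt_rate_hz(line_rate, n) for n in (10, 256, 384)}
for n, hz in sorted(rates.items()):
    print("N = %3d: %.0f interrupts/sec" % (n, hz))
```

N = 10 gives 150 kHz, while N = 256 and N = 384 bring the rate down to roughly 5.9 kHz and 3.9 kHz.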

The ifq works more as part of accidentally watermarking than as a buffer.
It is the same size as the tx ring (actually 1 smaller for bogus reasons),
so it is not really useful as a buffer.  However, with no explicit
watermarking, any separate buffer like the ifq provides a sort of
watermark at the boundary between the buffers.  The usefulness of this
would be most obvious if the tx "eof" interrupt were actually for eof
(perhaps that is what it was originally).  Then on the eof interrupt,
there is no time at all to generate new packets, and the time when the
tx is idle can be minimized by keeping pre-generated packets handy where
they can be copied to the tx ring at tx "eof" interrupt time.  A buffer
of about the same size as the tx ring (or maybe 1/4 the size) is enough
for this.

OTOH, with bge misprogrammed to interrupt after every 10 tx packets, the
ifq is useless for its watermark purposes.  The watermark is effectively
in the tx ring, and very strangely placed there at 10 below the top
(ring full).  Normally tx watermarks are placed near the bottom (ring
empty).  They must not be placed too near the bottom, else there would
not be enough time to replenish the ring between the time when the "eof"
(really, the "watermark") interrupt is received and when the tx runs
dry.  They should not be placed too near the top like they are in -current's
bge, else the point of having a large tx ring is defeated and there are
too many interrupts.  However, when they are placed near the top, latency
requirements are reduced.

I recently worked on buffering for sio and noticed similar related
problems for tx watermarks.  Don't laugh -- serial i/o 1 character at
a time at 3.686400 Mbps has much the same timing requirements as
ethernet i/o 1 packet at a time at 1 Gbps.  Each serial character
takes ~2.7 usec and each minimal ethernet packet takes ~0.67 usec.
With tx "ring" sizes of 128 and 512 respectively, the ring times for
full to empty are 347 usec for serial i/o and 341 usec for ethernet i/o.
Strangely, tx is harder than rx because:
- perfection is possible and easier to measure for tx.  It consists of
   just keeping at least 1 entry in the tx ring at all times.  Latency
   must be kept below ~340 usec to have any chance of this.  This is not
   so easy to achieve under _all_ loads.
- for rx, you have an external source generating the packets, so you
   don't have to worry about latency affecting the generators.
- the need for watermark processing is better known for rx, since it
   obviously doesn't work to generate the rx "eof" interrupt near the
   top.
The serial timing was actually harder to satisfy, because I worked on
it on a 366 MHz CPU while I worked on bge on a 2 GHz CPU, and even the
2GHz CPU couldn't keep up with line rate (so from full to empty takes
800 usec).
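
The serial and ethernet ring times quoted above work out as follows. The sketch assumes 10 bits on the wire per serial character (8N1 framing: start bit, 8 data bits, stop bit), which is my assumption rather than something stated above, and uses the rounded 1.5 Mpps figure for ethernet.

```python
def ring_time_usec(entries, units_per_second):
    """Time for a full ring to drain at the given character/packet rate."""
    return entries / units_per_second * 1e6


SERIAL_BITS_PER_CHAR = 10   # assumes 8N1 framing: start + 8 data + stop
serial_cps = 3686400 / SERIAL_BITS_PER_CHAR

serial_ring = ring_time_usec(128, serial_cps)   # about 347 usec
ether_ring = ring_time_usec(512, 1.5e6)         # about 341 usec
slow_cpu_ring = ring_time_usec(512, 640e3)      # 800 usec at 640 kpps
```

The last line is the case where the CPU rather than the wire is the limit: at 640 kpps a 512-entry ring takes 800 usec to go from full to empty.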

It turned out that the best position for the tx low watermark is about
1/4 or 1/2 from the bottom for both sio and bge.  It must be fairly
high, else the latency requirements are not met.  In the middle is a
good general position.  Although it apparently "wastes" half of the ring
to make the latency requirements easier to meet (without very
system-dependent tuning), the efficiency lost from this is reasonably
small.
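
A sketch of the tradeoff, under the simplifying assumption that the ring is refilled to full on every watermark interrupt and drains at a steady 1.5 Mpps. The function names and the model are illustrative, not measurements.

```python
def refill_budget_usec(ring_size, watermark_fraction, drain_pps):
    """Time left to refill once the low watermark (from the bottom) is hit."""
    entries_left = ring_size * watermark_fraction
    return entries_left / drain_pps * 1e6


def interrupts_per_second(ring_size, watermark_fraction, drain_pps):
    # One interrupt each time the ring drains from full down to the watermark.
    drained = ring_size * (1 - watermark_fraction)
    return drain_pps / drained


for fraction in (0.25, 0.5):
    budget = refill_budget_usec(512, fraction, 1.5e6)
    rate = interrupts_per_second(512, fraction, 1.5e6)
    print("watermark at %.2f: %.0f usec to refill, %.0f interrupts/sec"
          % (fraction, budget, rate))
```

Moving the watermark from 1/4 to 1/2 doubles the time available to refill (about 85 to about 171 usec) while raising the interrupt rate only from about 3.9 kHz to about 5.9 kHz. These are the same interrupt spacings as N = 384 and N = 256 for a 512-entry ring.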

Bruce

From owner-freebsd-net@FreeBSD.ORG  Wed Dec  5 03:57:55 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 66EF01F9;
 Wed,  5 Dec 2012 03:57:55 +0000 (UTC)
 (envelope-from fodillemlinkarim@gmail.com)
Received: from mail-ie0-f179.google.com (mail-ie0-f179.google.com
 [209.85.223.179])
 by mx1.freebsd.org (Postfix) with ESMTP id EE27F8FC0C;
 Wed,  5 Dec 2012 03:57:54 +0000 (UTC)
Received: by mail-ie0-f179.google.com with SMTP id k14so7086016iea.10
 for <multiple recipients>; Tue, 04 Dec 2012 19:57:54 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=message-id:date:from:user-agent:mime-version:to:cc:subject
 :references:in-reply-to:content-type:content-transfer-encoding;
 bh=hkHoL6ot2rl5hN8jxnUoDl46qp9wOoFGrA83diK/XQ4=;
 b=A3jGiym1yazWI2tkDSYmEAzW9U5uMCESQYRDuqWe2iZI9KgjdFAKVcuiGubHm53uy/
 dIi8oIhl/9nA72e/h0EDPu0zpeKsbNltev0wkrGIzW5Z2WT2j3YueK+GWuCLNWNVH+JB
 LdYGuJhYk6VuZnqMaPwBbWEG4yh9P+K5UaFyKj54YyCXgC0f3Jkdo4GyRcb9PJO1lzpF
 ge+WLXKHnoYabs7H/HmchahvziTSCabuCXVFMOcgkA+sdPybjtudn0nSrkNmmptyqY8J
 d1gYnava0T9+WN2ZTJ2bmbu8JjQSkdyuoW3j8pG9tP0baXtuHMG8eH6rc2W8rPaNCn8t
 +F0w==
Received: by 10.50.33.173 with SMTP id s13mr582385igi.23.1354679874026;
 Tue, 04 Dec 2012 19:57:54 -0800 (PST)
Received: from [10.0.0.130] ([24.225.136.71])
 by mx.google.com with ESMTPS id uj11sm11568434igb.15.2012.12.04.19.57.51
 (version=SSLv3 cipher=OTHER); Tue, 04 Dec 2012 19:57:52 -0800 (PST)
Message-ID: <50BEC63B.6020801@gmail.com>
Date: Tue, 04 Dec 2012 22:57:47 -0500
From: Karim Fodil-Lemelin <fodillemlinkarim@gmail.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:17.0) Gecko/17.0 Thunderbird/17.0
MIME-Version: 1.0
To: Andre Oppermann <oppermann@networx.ch>
Subject: Re: Latency issues with buf_ring
References: <1353259441.19423.YahooMailClassic@web121605.mail.ne1.yahoo.com>
 <201212041108.17645.jhb@freebsd.org>
 <CAJ-Vmo=tFFkeK2uADMPuBrgX6wN_9TSjAgs0WKPCrEfyhkG6Pw@mail.gmail.com>
 <50BE56C8.1030804@networx.ch>
In-Reply-To: <50BE56C8.1030804@networx.ch>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Barney Cordoba <barney_cordoba@yahoo.com>,
 Adrian Chadd <adrian@freebsd.org>, John Baldwin <jhb@freebsd.org>,
 freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 05 Dec 2012 03:57:55 -0000

Hi,

On 04/12/2012 3:02 PM, Andre Oppermann wrote:
> On 04.12.2012 20:34, Adrian Chadd wrote:
>> .. and it's important to note that buf_ring itself doesn't have the
>> race condition; it's the general driver implementation that's racy.
>>
>> I have the same races in ath(4) with the watchdog programming. Exactly
>> the same issue.
>
> Our IF_* stack/driver boundary handoff isn't up to the task anymore.
>
> Also the interactions are either poorly defined or understood in many
> places.  I've had a few chats with yongari@ and am experimenting with
> a modernized interface in my branch.
>
> The reason I stumbled across it was because I'm extending the hardware
> offload feature set and found out that the stack and the drivers (and
> the drivers among themself) are not really in sync with regards to 
> behavior.
>
> For most if not all ethernet drivers from 100Mbit/s the TX DMA rings
> are so large that buffering at the IFQ level doesn't make sense anymore
> and only adds latency.  So it could simply directly put everything into
> the TX DMA and not even try to soft-queue.  If the TX DMA ring is full
> ENOBUFS is returned instead of filling yet another queue.  However there
> are ALTQ interactions and other mechanisms which have to be considered
> too making it a bit more involved.
I've been bumping into this 'internalization' of drbr for quite some time
now.

I have been toying with some ideas around a multi-queue capable ALTQ.
Not unlike IFQ_*, the whole class_queue_t code in ALTQ could use some
freshening up. One avenue I am looking into is using drbr queues (and
their associated TX lock) as the back end queue implementation for ALTQ.
ALTQ(9) has a concept of driver managed queues, and the approach tries to
keep the same paradigm but adapt it for buf_ring. In that context, it
doesn't feel natural to me that drbr logic is handled so low inside the
device drivers; it makes system level modifications to ALTQ
unnecessarily driver dependent.

ALTQ is also using very coarse grained locking (using the IFQ_LOCK for
everything), which doesn't make much sense in an SMP/multiqueue system, but
that's another story.
>
> I'm coming up with a draft and some benchmark results for an updated
> stack/driver boundary in the next weeks before xmas.
>
Sounds great, can't wait to read it while drinking that eggnog :)
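
For readers following the thread, here is a minimal sketch of the direct
handoff Andre describes above: packets go straight into a fixed-size TX
ring, and a full ring is reported to the caller as ENOBUFS instead of being
absorbed by a second soft queue. This is a toy Python model, not kernel
code; the names TxRing, transmit and tx_complete are hypothetical and do
not correspond to any FreeBSD interface.

```python
import errno
from collections import deque


class TxRing:
    """Toy model of a fixed-size TX DMA ring (hypothetical, not a driver API)."""

    def __init__(self, size):
        self.size = size
        self.slots = deque()

    def transmit(self, pkt):
        """Hand a packet straight to the ring; there is no soft queue in front.

        Returns 0 on success, or errno.ENOBUFS when the ring is full.
        """
        if len(self.slots) >= self.size:
            return errno.ENOBUFS
        self.slots.append(pkt)
        return 0

    def tx_complete(self, n=1):
        """Model the hardware finishing up to n descriptors and freeing slots."""
        done = 0
        while self.slots and done < n:
            self.slots.popleft()
            done += 1
        return done


if __name__ == "__main__":
    ring = TxRing(size=4)
    # The fifth packet finds the ring full and is rejected with ENOBUFS.
    print([ring.transmit(i) for i in range(5)])
```

The sketch deliberately leaves out ALTQ and any locking, which are exactly
the parts the thread says make the real change more involved.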

From owner-freebsd-net@FreeBSD.ORG  Wed Dec  5 12:32:59 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 291CA85A
 for <freebsd-net@freebsd.org>; Wed,  5 Dec 2012 12:32:59 +0000 (UTC)
 (envelope-from barney_cordoba@yahoo.com)
Received: from nm47-vm1.bullet.mail.ne1.yahoo.com
 (nm47-vm1.bullet.mail.ne1.yahoo.com [98.138.121.97])
 by mx1.freebsd.org (Postfix) with ESMTP id BF8FF8FC14
 for <freebsd-net@freebsd.org>; Wed,  5 Dec 2012 12:32:58 +0000 (UTC)
Received: from [98.138.90.49] by nm47.bullet.mail.ne1.yahoo.com with NNFMP;
 05 Dec 2012 12:32:58 -0000
Received: from [98.138.89.168] by tm2.bullet.mail.ne1.yahoo.com with NNFMP;
 05 Dec 2012 12:32:58 -0000
Received: from [127.0.0.1] by omp1024.mail.ne1.yahoo.com with NNFMP;
 05 Dec 2012 12:32:58 -0000
X-Yahoo-Newman-Property: ymail-3
X-Yahoo-Newman-Id: 65999.77670.bm@omp1024.mail.ne1.yahoo.com
Received: (qmail 33185 invoked by uid 60001); 5 Dec 2012 12:32:57 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
 t=1354710777; bh=mXB4TdkKnov/lNyNyy9JU4dr7hc96bRxKTGK/B/wMD4=;
 h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding;
 b=LuAjGzb6PF/nJ4jD1u1aAw++FWUG3JnT8r9tyM6wyCfjroaEjwF8mIF+RCYztZXmm/zUJTTYVKa0YBJ1pcMOvruqAODNZaRwNixkfDUEQXktLEW0Hn3SX9CQHFBTfb0leESkuZq61p68Z+x9SiZnA3Ojo/TfEEUQnufuCkYPG3Y=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
 h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding;
 b=NLvs1yrSntjcOnc7fMr6oLYb+TJuKlMRvVckTEVPqKvuKLm+Xa4qMqUFLq7ois+Q/MXGTciOfWhrJT8lFIH7Ft0MS/5YY2t+Gk2TCySomm+nN9WKKd/k/juQSrnIMvTDVL2cMEmLIeEjFgyNC3/nA8TIHOqH1/kgqx9q9Ql1PBw=;
X-YMail-OSG: Y58jqwoVM1n2ffkYFcd15Ta_8CZLzIirawoghO3vYwamjAG
 _FJDX.IY36WbxCSgBVnE9YhYEgYPiN35dXIkczK06bdiHPm7dXoq0XJKL.7M
 3n91A_kR0UWB01D.Ty9vJEkLWskDS.fC84UU9wJyeq805xBSY7AA.PxqL772
 nzXVeedQhTGmJvdJ.ZzDbkZxRRN7ivebLt50yga08AHIQB6uDVvGJk..Wz6E
 YyuFjY0OzBC2qrAJahwS9wINb5_fAkftKwb4GiOXZOW4la7T7kmOUSlxfU4g
 QylibeFFzQITUpqsQmmsA1jP0.WAKLoQjae2yOYOgAdErPj5hRJycFiA3FrY
 jYiTGKvI7JUIMQ6lQPNRBwJnY4x.v2Gar59iDP8CWDy0jj0pTN39tqndHBoo
 xh7w1zM9Qd5_T2CgR8P.R0Gdf2NvFPowomNoviu_J5BubQSEbHw17idnFYc4
 tXMT9uba76d3FrU3polVY7bh6CZeQY5B0BDug084qB5GIfEkKiYjfFTJlg8N
 8_bVpxE6v7snIniJ2cnPHduqlqq.AcQ--
Received: from [174.48.128.27] by web121602.mail.ne1.yahoo.com via HTTP;
 Wed, 05 Dec 2012 04:32:57 PST
X-Rocket-MIMEInfo: 001.001,
 CgotLS0gT24gVHVlLCAxMi80LzEyLCBCcnVjZSBFdmFucyA8YnJkZUBvcHR1c25ldC5jb20uYXU.IHdyb3RlOgoKPiBGcm9tOiBCcnVjZSBFdmFucyA8YnJkZUBvcHR1c25ldC5jb20uYXU.Cj4gU3ViamVjdDogUmU6IExhdGVuY3kgaXNzdWVzIHdpdGggYnVmX3JpbmcKPiBUbzogIkFuZHJlIE9wcGVybWFubiIgPG9wcGVybWFubkBuZXR3b3J4LmNoPgo.IENjOiAiQWRyaWFuIENoYWRkIiA8YWRyaWFuQEZyZWVCU0Qub3JnPiwgIkJhcm5leSBDb3Jkb2JhIiA8YmFybmV5X2NvcmRvYmFAeWFob28uY29tPiwgIkoBMAEBAQE-
X-Mailer: YahooMailClassic/15.1.1 YahooMailWebService/0.8.128.478
Message-ID: <1354710777.97879.YahooMailClassic@web121602.mail.ne1.yahoo.com>
Date: Wed, 5 Dec 2012 04:32:57 -0800 (PST)
From: Barney Cordoba <barney_cordoba@yahoo.com>
Subject: Re: Latency issues with buf_ring
To: Andre Oppermann <oppermann@networx.ch>, Bruce Evans <brde@optusnet.com.au>
In-Reply-To: <20121205112511.Q932@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@FreeBSD.org, Adrian Chadd <adrian@FreeBSD.org>,
 John Baldwin <jhb@FreeBSD.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 05 Dec 2012 12:32:59 -0000

--- On Tue, 12/4/12, Bruce Evans <brde@optusnet.com.au> wrote:

> From: Bruce Evans <brde@optusnet.com.au>
> Subject: Re: Latency issues with buf_ring
> To: "Andre Oppermann" <oppermann@networx.ch>
> Cc: "Adrian Chadd" <adrian@FreeBSD.org>, "Barney Cordoba" <barney_cordoba@yahoo.com>, "John Baldwin" <jhb@FreeBSD.org>, freebsd-net@FreeBSD.org
> Date: Tuesday, December 4, 2012, 10:31 PM
> On Tue, 4 Dec 2012, Andre Oppermann wrote:
>
> > For most if not all ethernet drivers from 100Mbit/s the TX DMA rings
> > are so large that buffering at the IFQ level doesn't make sense anymore
> > and only adds latency.
>
> I found sort of the opposite for bge at 1Gbps.  Most or all bge NICs
> have a tx ring size of 512.  The ifq length is the tx ring size minus
> 1 (511).  I needed to expand this to imax(2 * tick / 4, 10000) to
> maximize pps.  This does bad things to latency and worse things to
> caching (512 buffers might fit in the L2 cache, but 10000 buffers
> bust any reasonable cache as they are cycled through), but I only
> tried to optimize tx pps.
>
> > So it could simply directly put everything into
> > the TX DMA and not even try to soft-queue.  If the TX DMA ring is full
> > ENOBUFS is returned instead of filling yet another queue.
>
> That could work, but upper layers currently don't understand ENOBUFS
> at all, so it would work poorly now.  Also, 512 entries is not many,
> so even if upper layers understood ENOBUFS it is not easy for them to
> _always_ respond fast enough to keep the tx active, unless there are
> upstream buffers with many more than 512 entries.  There needs to be
> enough buffering somewhere so that the tx ring can be replenished
> almost instantly from the buffer, to handle the worst-case latency
> for the threads generating new (unbuffered) packets.  At the line rate
> of ~1.5 Mpps for 1 Gbps, the maximum latency that can be covered by
> 512 entries is only 340 usec.
>
> > However there
> > are ALTQ interactions and other mechanisms which have to be considered
> > too making it a bit more involved.
>
> I didn't try to handle ALTQ or even optimize for TCP.
>
> More details: to maximize pps, the main detail is to ensure that the tx
> ring never becomes empty.  The tx then transmits as fast as possible.
> This requires some watermark processing, but FreeBSD has almost none
> for tx rings.  The following normally happens for packet generators
> like ttcp and netsend:
>
> - loop calling send() or sendto() until the tx ring (and also any
>   upstream buffers) fill up.  Then ENOBUFS is returned.
>
> - watermark processing is broken in the user API at this point.  There
>   is no way for the application to wait for the ENOBUFS condition to
>   go away (select() and poll() don't work).  Applications use poor
>   workarounds:
>
> - old (~1989) ttcp sleeps for 18 msec when send() returns ENOBUFS.  This
>   was barely good enough for 1 Mbps ethernet (line rate ~1500 pps is 27
>   per 18 msec, so IFQ_MAXLEN = 50 combined with just a 1-entry tx ring
>   provides a safety factor of about 2).  Expansion of the tx ring size to
>   512 makes this work with 10 Mbps ethernet too.  Expansion of the ifq
>   to 511 gives another factor of 2.  After losing the safety factor of 2,
>   we can now handle 40 Mbps ethernet, and are only a factor of 25 short
>   for 1 Gbps.  My hardware can't do line rate for small packets -- it
>   can only do 640 kpps.  Thus ttcp is only a factor of 11 short of
>   supporting the hardware at 1 Gbps.
>
>   This assumes that sleeps of 18 msec are actually possible, which
>   they aren't with HZ = 100 giving a granularity of 10 msec so that
>   sleep(18 msec) actually sleeps for an average of 23 msec.  -current
>   uses the bad default of HZ = 1000.  With that sleep(18 msec) would
>   average 18.5 msec.  Of course, ttcp should sleep for more like 1
>   msec if that is possible.  Then the average sleep is 1.5 msec.  ttcp
>   can keep up with the hardware with that, and is only slightly behind
>   the hardware with the worst-case sleep of 2 msec (512+511 packets
>   generated every 2 msec is 511.5 kpps).
>
>   I normally use old ttcp, except I modify it to sleep for 1 msec instead
>   of 18 in one version, and in another version I remove the sleep so that
>   it busy-waits in a loop that calls send() which almost always returns
>   ENOBUFS.  The latter wastes a lot of CPU, but is almost good enough
>   for throughput testing.
>
> - newer ttcp tries to program the sleep time in microseconds.  This
>   doesn't really work, since the sleep granularity is normally at least
>   a millisecond, and even if it could be the 340 microseconds needed by
>   bge with no ifq (see above, and better divide the 340 by 2), then this
>   is quite short and would take almost as much CPU as busy-waiting.  I
>   consider HZ = 1000 to be another form of polling/busy-waiting and
>   don't use it except for testing.
>
> - netrate/netsend also uses a programmed sleep time.  This doesn't really
>   work, as above.  netsend also tries to limit its rate based on sleeping.
>   This is further from working, since even finer-grained sleeps are needed
>   to limit the rate accurately than to keep up with the maximum rate.
>
> Watermark processing at the kernel level is not quite as broken.  It
> is mostly non-existent, but partly works, sort of accidentally.  The
> difference is now that there is a tx "eof" or "completion" interrupt
> which indicates the condition corresponding to the ENOBUFS condition
> going away, so that the kernel doesn't have to poll for this.  This
> is not really an "eof" interrupt (unless bge is programmed insanely,
> to interrupt only after the tx ring is completely empty).  It acts as
> primitive watermarking.  bge can be programmed to interrupt after
> having sent every N packets (strictly, after every N buffer descriptors,
> but for small packets these are the same).  When there are more than
> N packets to start, say M, this acts as a watermark at M-N packets.
> bge is normally misprogrammed with N = 10.  At the line rate of 1.5 Mpps,
> this asks for an interrupt rate of 150 kHz, which is far too high and
> is usually unreachable, so reaching the line rate is impossible due to
> the CPU load from the interrupts.  I use N = 384 or 256 so that the
> interrupt rate is not the dominant limit.  However, N = 10 is better
> for latency and works under light loads.  It also reduces the amount
> of buffering needed.
>
> The ifq works more as part of accidentally watermarking than as a buffer.
> It is the same size as the tx ring (actually 1 smaller for bogus reasons),
> so it is not really useful as a buffer.  However, with no explicit
> watermarking, any separate buffer like the ifq provides a sort of
> watermark at the boundary between the buffers.  The usefulness of this
> would be most obvious if the tx "eof" interrupt were actually for eof
> (perhaps that is what it was originally).  Then on the eof interrupt,
> there is no time at all to generate new packets, and the time when the
> tx is idle can be minimized by keeping pre-generated packets handy where
> they can be copied to the tx ring at tx "eof" interrupt time.  A buffer
> of about the same size as the tx ring (or maybe 1/4 the size) is enough
> for this.
>
> OTOH, with bge misprogrammed to interrupt after every 10 tx packets, the
> ifq is useless for its watermark purposes.  The watermark is effectively
> in the tx ring, and very strangely placed there at 10 below the top
> (ring full).  Normally tx watermarks are placed near the bottom (ring
> empty).  They must not be placed too near the bottom, else there would
> not be enough time to replenish the ring between the time when the "eof"
> (really, the "watermark") interrupt is received and when the tx runs
> dry.  They should not be placed too near the top like they are in
> -current's bge, else the point of having a large tx ring is defeated
> and there are too many interrupts.  However, when they are placed near
> the top, latency requirements are reduced.
>
> I recently worked on buffering for sio and noticed similar related
> problems for tx watermarks.  Don't laugh -- serial i/o 1 character at
> a time at 3.686400 Mbps has much the same timing requirements as
> ethernet i/o 1 packet at a time at 1 Gbps.  Each serial character
> takes ~2.7 usec and each minimal ethernet packet takes ~0.67 usec.
> With tx "ring" sizes of 128 and 512 respectively, the ring times for
> full to empty are 347 usec for serial i/o and 341 usec for ethernet i/o.
> Strangely, tx is harder than rx because:
> - perfection is possible and easier to measure for tx.  It consists of
>   just keeping at least 1 entry in the tx ring at all times.  Latency
>   must be kept below ~340 usec to have any chance of this.  This is not
>   so easy to achieve under _all_ loads.
> - for rx, you have an external source generating the packets, so you
>   don't have to worry about latency affecting the generators.
> - the need for watermark processing is better known for rx, since it
>   obviously doesn't work to generate the rx "eof" interrupt near the
>   top.
> The serial timing was actually harder to satisfy, because I worked on
> it on a 366 MHz CPU while I worked on bge on a 2 GHz CPU, and even the
> 2GHz CPU couldn't keep up with line rate (so from full to empty takes
> 800 usec).
>
> It turned out that the best position for the tx low watermark is about
> 1/4 or 1/2 from the bottom for both sio and bge.  It must be fairly
> high, else the latency requirements are not met.  In the middle is a
> good general position.  Although it apparently "wastes" half of the ring
> to make the latency requirements easier to meet (without very
> system-dependent tuning), the efficiency lost from this is reasonably
> small.
>
> Bruce
>

I'm sure that Bill Paul is a nice man, but referencing drivers that were
written from a template and never properly load tested doesn't really
illustrate anything. All of his drivers are functional but optimized for
nothing.

BC
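
As a quick check of the figures Bruce quotes above, the ring drain times
and interrupt rates can be recomputed in a few lines. This is only a
back-of-the-envelope sketch in Python; the function names are made up for
illustration, the 1.5 Mpps line rate is the rounded value from his message,
and the serial figure assumes 10 bit times per character (8N1 framing),
which his message does not state.

```python
def drain_time_usec(ring_entries, rate_per_sec):
    """Time for a full ring to empty at the given drain rate, in microseconds."""
    return ring_entries / rate_per_sec * 1e6


def interrupt_rate_hz(pps, packets_per_interrupt):
    """Interrupt rate when the NIC interrupts after every N transmitted packets."""
    return pps / packets_per_interrupt


LINE_RATE_PPS = 1.5e6       # rounded small-packet rate at 1 Gbps, as quoted
SERIAL_CPS = 3.6864e6 / 10  # assumption: 10 bit times per serial character

if __name__ == "__main__":
    print(drain_time_usec(512, LINE_RATE_PPS))    # about 341 usec
    print(drain_time_usec(128, SERIAL_CPS))       # about 347 usec
    print(interrupt_rate_hz(LINE_RATE_PPS, 10))   # 150 kHz
    print(interrupt_rate_hz(LINE_RATE_PPS, 384))  # about 3.9 kHz
```

With N = 384 the interrupt rate falls by a factor of about 38 compared with
N = 10, which is the trade-off against latency that the quoted message
describes.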

From owner-freebsd-net@FreeBSD.ORG  Wed Dec  5 13:01:13 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 32AB4955
 for <freebsd-net@freebsd.org>; Wed,  5 Dec 2012 13:01:13 +0000 (UTC)
 (envelope-from barney_cordoba@yahoo.com)
Received: from nm59-vm3.bullet.mail.ne1.yahoo.com
 (nm59-vm3.bullet.mail.ne1.yahoo.com [98.138.121.127])
 by mx1.freebsd.org (Postfix) with ESMTP id 56EF58FC17
 for <freebsd-net@freebsd.org>; Wed,  5 Dec 2012 13:01:12 +0000 (UTC)
Received: from [98.138.226.176] by nm59.bullet.mail.ne1.yahoo.com with NNFMP;
 05 Dec 2012 12:58:17 -0000
Received: from [98.138.87.6] by tm11.bullet.mail.ne1.yahoo.com with NNFMP;
 05 Dec 2012 12:58:17 -0000
Received: from [127.0.0.1] by omp1006.mail.ne1.yahoo.com with NNFMP;
 05 Dec 2012 12:58:17 -0000
X-Yahoo-Newman-Property: ymail-3
X-Yahoo-Newman-Id: 728604.25428.bm@omp1006.mail.ne1.yahoo.com
Received: (qmail 69087 invoked by uid 60001); 5 Dec 2012 12:58:17 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
 t=1354712297; bh=B3h8bir2SgWE4ETUypD5mlhkUVnmkWxWg71yBpDHixE=;
 h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding;
 b=U5kzQd02Dqacf6gEID3KvufkzcxMnHkZCjTYTXWd3yGWcfayqU+nfudArnrFQa8uuAp44ig7J4CQsagW9zO+QYitfX6JK9ggXxljf2BIQUNMeeoEMbmju37T8dH6JPxDgkcZDWFefk3KM6W3cn1uXZ3mMg4GarLsDF0iZr8vFZY=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
 h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding;
 b=vRGINeoDBlhKH3F8ZSQFabVaUhVB4qqJSZgjhw7RcThgmj6ATebFOv1Anl8plOe5sbURVXvZZkVEopFsGFUZVIFl9nshYDmSBVfeCqROp69VSH9JuAMBz0xxiC4iN5vzM+QQtVGtl2CP/kfQXI/DsMlNqkNfAulsxJDKYHrgxVw=;
X-YMail-OSG: gYLHBpsVM1kSTeI7EZJA1XwTzbgNTHB43r8PYFfF_2hPTHB
 2uUJ2cRGRrIOfxGJO8rdKYlYMvdFqj15NbYWwRkKwbTLVxldISZZ2CL1.EJQ
 A3o8_XMeGeVmBKFJSUYQal7ECMWXeqT_fGsJczXkaSg_o70qp6WXFuKTJqNV
 ncSTNL48JptnHk5FGm2LUK1R7FL0ASNM3tvCtWV4nhWPd4YjMsZnCN6aRuj7
 rQyCdc3g1SAshcEnwVarNanEqGD5cxCiyPLGgqgYsSECMP5p.U3ew04Vb.re
 VVSnjtdS5XeBfQLFZ0mMVxoCyXKppjqnD4OpkbBxSEeXMARX32wyCcEHkaoW
 0V1eodsaWy_Hn.y9GceWJ4U2lRXO6ZiPFk7bx3xFXhxekJKh_2scsYXDQe4n
 ELNZ02zbii.k61Dbag.1cfmR1Ei3rFCrNfy1r3TXADqO16s2TWAi4POVSXKS
 uLLLsEjxAcZ9QBLk5OV.0bifG_kgDtQOfS0xgdluPw5O16qpcTkEkGjc8sBu
 3Y7E2KTPG_s0IE24DtXz7huu9sJ49gA--
Received: from [174.48.128.27] by web121606.mail.ne1.yahoo.com via HTTP;
 Wed, 05 Dec 2012 04:58:17 PST
X-Rocket-MIMEInfo: 001.001,
 CgotLS0gT24gVHVlLCAxMi80LzEyLCBBZHJpYW4gQ2hhZGQgPGFkcmlhbkBmcmVlYnNkLm9yZz4gd3JvdGU6Cgo.IEZyb206IEFkcmlhbiBDaGFkZCA8YWRyaWFuQGZyZWVic2Qub3JnPgo.IFN1YmplY3Q6IFJlOiBMYXRlbmN5IGlzc3VlcyB3aXRoIGJ1Zl9yaW5nCj4gVG86ICJBbmRyZSBPcHBlcm1hbm4iIDxvcHBlcm1hbm5AbmV0d29yeC5jaD4KPiBDYzogIkJhcm5leSBDb3Jkb2JhIiA8YmFybmV5X2NvcmRvYmFAeWFob28uY29tPiwgIkpvaG4gQmFsZHdpbiIgPGpoYkBmcmVlYnNkLm9yZz4sIGZyZWVic2QBMAEBAQE-
X-Mailer: YahooMailClassic/15.1.1 YahooMailWebService/0.8.128.478
Message-ID: <1354712297.65896.YahooMailClassic@web121606.mail.ne1.yahoo.com>
Date: Wed, 5 Dec 2012 04:58:17 -0800 (PST)
From: Barney Cordoba <barney_cordoba@yahoo.com>
Subject: Re: Latency issues with buf_ring
To: Andre Oppermann <oppermann@networx.ch>, Adrian Chadd <adrian@freebsd.org>
In-Reply-To: <CAJ-Vmok+W_LgSCnETLOAogucqUSy+yBixsdNj-2Aepy+1Lo7gw@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, John Baldwin <jhb@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 05 Dec 2012 13:01:13 -0000

--- On Tue, 12/4/12, Adrian Chadd <adrian@freebsd.org> wrote:

> From: Adrian Chadd <adrian@freebsd.org>
> Subject: Re: Latency issues with buf_ring
> To: "Andre Oppermann" <oppermann@networx.ch>
> Cc: "Barney Cordoba" <barney_cordoba@yahoo.com>, "John Baldwin" <jhb@freebsd.org>, freebsd-net@freebsd.org
> Date: Tuesday, December 4, 2012, 4:31 PM
> On 4 December 2012 12:02, Andre Oppermann <oppermann@networx.ch> wrote:
>
> > Our IF_* stack/driver boundary handoff isn't up to the task anymore.
>
> Right. Well, the current hand off is really "here's a packet, go do
> stuff!" and the legacy if_start() method is just plain broken for SMP,
> preemption and direct dispatch.
>
> Things are also very special in the net80211 world, with the stack
> layer having to get its grubby fingers into things.
>
> I'm sure that the other examples of layered protocols (eg doing MPLS,
> or even just straight PPPoE style tunneling) have the same issues.
> Anything with sequence numbers and encryption being done by some other
> layer is going to have the same issue, unless it's all enforced via
> some other queue and a single thread handling the network stack
> "stuff".
>
> I bet direct-dispatch netgraph will have similar issues too, if it
> ever comes into existence. :-)
>
> > Also the interactions are either poorly defined or understood in many
> > places.  I've had a few chats with yongari@ and am experimenting with
> > a modernized interface in my branch.
> >
> > The reason I stumbled across it was because I'm extending the hardware
> > offload feature set and found out that the stack and the drivers (and
> > the drivers among themself) are not really in sync with regards to
> > behavior.
> >
> > For most if not all ethernet drivers from 100Mbit/s the TX DMA rings
> > are so large that buffering at the IFQ level doesn't make sense anymore
> > and only adds latency.  So it could simply directly put everything into
> > the TX DMA and not even try to soft-queue.  If the TX DMA ring is full
> > ENOBUFS is returned instead of filling yet another queue.  However there
> > are ALTQ interactions and other mechanisms which have to be considered
> > too making it a bit more involved.
>
> net80211 has slightly different problems. We have requirements for
> per-node, per-TID/per-AC state (not just for QOS, but separate
> sequence numbers, different state machine handling for things like
> aggregation and (later) U-APSD handling, etc) so we do need to direct
> frames into different queues and then correctly serialise that mess.
>
> > I'm coming up with a draft and some benchmark results for an updated
> > stack/driver boundary in the next weeks before xmas.
>
> Ok. Please don't rush into it though; I'd like time to think about it
> after NY (as I may actually _have_ a holiday this xmas!) and I'd like
> to try and rope in people from non-ethernet-packet-pushing backgrounds
> to comment.
> They may have much stricter and/or stranger requirements when it comes
> to how the network layer passes, serialises and pushes packets to
> other layers.
>
> Thanks,
>
>
> Adrian

Something I'd like to see is a general modularization of function,
which will make all of the other stuff much easier. A big issue with
multipurpose OSes is that they tend to be bloated with stuff that almost
nobody uses. 99.9% of people are running either bridge/filters or straight
TCP/IP, and there is a different design goal for a single nic web server
and a router or firewall.

By modularization, I mean making the "pieces" threadable. The requirements
for threading vary by application, but the ability to control it can
make a world of difference in performance. Having a dedicated transmit
thread may make no sense on a web server, on a dual core system or
with a single queue adapter, but other times it might. Instead of having
one big honking routine that does everything, modularizing it not only
cleans up the code, but also makes the system more flexible without
making it a mess.

The design for the 99% should not be hindered by the need to support
stuff like ALTQ. The hooks for ALTQ should be possible, but the locking
and queuing only required for such outliers should be separable.

I'd also like to see a unification of all of the projects. Is it really
necessary to have 34 checks for different "ideas" in if_ethersubr.c?

As a developer I know that you always want to work on the next new thing,
but sometimes you need to stop, think, and clean up your code. The cleaner
code opens up new possibilities, and results in a better overall product.

BC

From owner-freebsd-net@FreeBSD.ORG  Wed Dec  5 14:00:18 2012
Message-ID: <50BF536C.3060909@freebsd.org>
Date: Wed, 05 Dec 2012 15:00:12 +0100
From: Andre Oppermann <andre@freebsd.org>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:16.0) Gecko/20121026 Thunderbird/16.0.2
MIME-Version: 1.0
To: Barney Cordoba <barney_cordoba@yahoo.com>
Subject: Re: Latency issues with buf_ring
References: <1354712297.65896.YahooMailClassic@web121606.mail.ne1.yahoo.com>
In-Reply-To: <1354712297.65896.YahooMailClassic@web121606.mail.ne1.yahoo.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, Adrian Chadd <adrian@freebsd.org>,
 John Baldwin <jhb@freebsd.org>

On 05.12.2012 13:58, Barney Cordoba wrote:
>
>
> --- On Tue, 12/4/12, Adrian Chadd <adrian@freebsd.org> wrote:
>
>> From: Adrian Chadd <adrian@freebsd.org>
>> Subject: Re: Latency issues with buf_ring
>> To: "Andre Oppermann" <oppermann@networx.ch>
>> Cc: "Barney Cordoba" <barney_cordoba@yahoo.com>, "John Baldwin" <jhb@freebsd.org>, freebsd-net@freebsd.org
>> Date: Tuesday, December 4, 2012, 4:31 PM
>> On 4 December 2012 12:02, Andre
>> Oppermann <oppermann@networx.ch>
>> wrote:
>>
>>> Our IF_* stack/driver boundary handoff isn't up to the
>> task anymore.
>>
>> Right. well, the current hand off is really "here's a
>> packet, go do
>> stuff!" and the legacy if_start() method is just plain
>> broken for SMP,
>> preemption and direct dispatch.
>>
>> Things are also very special in the net80211 world, with the
>> stack
>> layer having to get its grubby fingers into things.
>>
>> I'm sure that the other examples of layered protocols (eg
>> doing MPLS,
>> or even just straight PPPoE style tunneling) have the same
>> issues.
>> Anything with sequence numbers and encryption being done by
>> some other
>> layer is going to have the same issue, unless it's all
>> enforced via
>> some other queue and a single thread handling the network
>> stack
>> "stuff".
>>
>> I bet direct-dispatch netgraph will have similar issues too,
>> if it
>> ever comes into existence. :-)
>>
>>> Also the interactions are either poorly defined or
>> understood in many
>>> places.  I've had a few chats with yongari@ and am
>> experimenting with
>>> a modernized interface in my branch.
>>>
>>> The reason I stumbled across it was because I'm
>> extending the hardware
>>> offload feature set and found out that the stack and
>> the drivers (and
>>> the drivers among themselves) are not really in sync with
>> regards to behavior.
>>>
>>> For most if not all ethernet drivers from 100Mbit/s the
>> TX DMA rings
>>> are so large that buffering at the IFQ level doesn't
>> make sense anymore
>>> and only adds latency.  So it could simply
>> directly put everything into
>>> the TX DMA and not even try to soft-queue.  If the
>> TX DMA ring is full
>>> ENOBUFS is returned instead of filling yet another
>> queue.  However there
>>> are ALTQ interactions and other mechanisms which have
>> to be considered
>>> too making it a bit more involved.
>>
>> net80211 has slightly different problems. We have
>> requirements for
>> per-node, per-TID/per-AC state (not just for QOS, but
>> separate
>> sequence numbers, different state machine handling for
>> things like
>> aggregation and (later) U-APSD handling, etc) so we do need
>> to direct
>> frames into different queues and then correctly serialise
>> that mess.
>>
>>> I'm coming up with a draft and some benchmark results
>> for an updated
>>> stack/driver boundary in the next weeks before xmas.
>>
>> Ok. Please don't rush into it though; I'd like time to think
>> about it
>> after NY (as I may actually _have_ a holiday this xmas!) and
>> I'd like
>> to try and rope in people from non-ethernet-packet-pushing
>> backgrounds
>> to comment.
>> They may have much stricter and/or stranger requirements
>> when it comes
>> to how the network layer passes, serialises and pushes
>> packets to
>> other layers.
>>
>> Thanks,
>>
>>
>> Adrian
>
> Something I'd like to see is a general modularization of function,
> which will make all of the other stuff much easier. A big issue with
> multipurpose OSes is that they tend to be bloated with stuff that almost
> nobody uses. 99.9% of people are running either bridge/filters or straight
> TCP/IP, and there is a different design goal for a single nic web server
> and a router or firewall.
>
> By modularization, I mean making the "pieces" threadable. The requirements
> for threading vary by application, but the ability to control it can
> make a world of difference in performance. Having a dedicated transmit
> thread may make no sense on a web server, on a dual core system or
> with a single queue adapter, but other times it might. Instead of having
> one big honking routine that does everything, modularizing it not only
> cleans up the code, but also makes the system more flexible without
> making it a mess.
>
> The design for the 99% should not be hindered by the need to support
> stuff like ALTQ. The hooks for ALTQ should be possible, but the locking
> and queuing only required for such outliers should be separable.
>
> I'd also like to see a unification of all of the projects. Is it really
> necessary to have 34 checks for different "ideas" in if_ethersubr.c?
>
> As a developer I know that you always want to work on the next new thing,
> but sometimes you need to stop, think, and clean up your code. The cleaner
> code opens up new possibilities, and results in a better overall product.

I hear you.

-- 
Andre


From owner-freebsd-net@FreeBSD.ORG  Thu Dec  6 06:39:13 2012
Message-ID: <50C03D8F.3090106@kevlo.org>
Date: Thu, 06 Dec 2012 14:39:11 +0800
From: Kevin Lo <kevlo@kevlo.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/17.0 Thunderbird/17.0
MIME-Version: 1.0
To: freebsd-net@freebsd.org
Subject: Review request: fix return value of socket(2) on no family found
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Hi,

Here's the patch mostly from NetBSD to make socket(2) return EAFNOSUPPORT
rather than EPROTONOSUPPORT if the family cannot be found.

http://people.freebsd.org/~kevlo/patch-socket

The man page documents the behavior specified in POSIX.1-2008:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/socket.html

For reference, Linux, NetBSD, and OS X return EAFNOSUPPORT for this.

     Kevin
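
For anyone who wants to check what a given system returns, a minimal
Python sketch follows. It assumes that 9999 does not correspond to any
address family the running kernel knows about, and the helper name is
made up for illustration; it is not part of the patch.

```python
import errno
import socket


def errno_for_unknown_family(family: int = 9999) -> int:
    """Return the errno that socket(2) reports for an unknown address family.

    9999 is an arbitrary value assumed not to match any real AF_* constant.
    """
    try:
        sock = socket.socket(family, socket.SOCK_STREAM, 0)
    except OSError as exc:
        return exc.errno
    sock.close()
    return 0  # the family was unexpectedly supported


if __name__ == "__main__":
    code = errno_for_unknown_family()
    # Print the symbolic name (for example EAFNOSUPPORT) when one is known.
    print(errno.errorcode.get(code, code))
```

On a system that follows POSIX.1-2008 this should report EAFNOSUPPORT; a
kernel without the patch would be expected to report EPROTONOSUPPORT, per
the description above.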


From owner-freebsd-net@FreeBSD.ORG  Thu Dec  6 09:13:46 2012
Date: Thu, 6 Dec 2012 10:13:45 +0100
Message-ID: <CAPBZQG3T=Mvp80-mhFJJ5QRcv5+3SLwV__2bVgvP1YO=UFVOUA@mail.gmail.com>
Subject: ipfw(4) dynamic states/rules and its callout
From: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>
To: freebsd-net <freebsd-net@freebsd.org>, freebsd-ipfw@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1

Hello,

I was looking at the ipfw code for dynamic states/rules and see that it
unconditionally schedules a callout even if there is no work to do.

Wouldn't it be better to reschedule it only when there is something to do,
to avoid having a useless callout/event run every time on the system?

Is there any complication I am missing here?

Regards,
Ermal

From owner-freebsd-net@FreeBSD.ORG  Thu Dec  6 09:35:17 2012
Date: Thu, 6 Dec 2012 09:35:16 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: John Baldwin <jhb@freebsd.org>
Subject: Re: Latency issues with buf_ring
In-Reply-To: <201212041108.17645.jhb@freebsd.org>
Message-ID: <alpine.BSF.2.00.1212060929430.78351@fledge.watson.org>
References: <1353259441.19423.YahooMailClassic@web121605.mail.ne1.yahoo.com>
 <201212041108.17645.jhb@freebsd.org>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Barney Cordoba <barney_cordoba@yahoo.com>, freebsd-net@freebsd.org


On Tue, 4 Dec 2012, John Baldwin wrote:

>> Q2: Are there any case studies or benchmarks for buf_ring, or it is just 
>> blindly being used because someone claimed it was better and offered it for 
>> free? One of the points of locking is to avoid race conditions, so the
>
> fact that you have races in a supposed lock-less scheme seems more than just 
> ironic.
>
> The buf_ring author claims it has benefits in high pps workloads.  I am not 
> aware of any benchmarks, etc.

... joining this conversation a bit late -- still about two years behind on 
net@ :-) ...

There are several places where having a good buf_ring primitive should offer 
significant benefits over blocking locks around queues:

- ifnet transmit enqueue path, whether owned by the general stack (ifqueue) or
   the driver (as is often the case with if_transmit).

- netisr queues used in deferred input dispatch, including loopback.

- A future lockless hand-off of inbound TCP segments from the ithread/netisr
   to an already running user thread a la Van Jacobson's proposal to the Linux
   community (now implemented), which would significantly reduce contention on
   inpcb locks in many workloads.

I've measured significant lock contention in all those places in the past, and 
I believe buf_ring was intended to address at least the first case.  This 
isn't the same as having benchmarks showing that the current code is "better", 
but the right primitive used in the right way should almost certainly help all 
of those cases substantially.  I know that when Philip Paeps was working with 
the Solarflare driver, switching to lockless dispatch in the outbound path 
made a significant difference.  One thing we do need to make sure is handled 
well is bounds on queue length, since we don't want infinitely long queues 
when a backlog begins to form -- there's no reason this can't be done, 
although the specifics depend on what one wants to accomplish and how.

I would like to see us making use of lockless queue primitives in these kinds 
of scenarios, motivated by benchmarking, and ideally addressing architectures 
with weaker memory consistency properly.  We should definitely minimise the 
number of different implementations of those primitives as much as possible, 
since (as with locks themselves) they are very hard to get right, and 
debugging problems with them can be quite problematic.

Robert
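
The point about bounding queue length can be illustrated with a small
model of a fixed-size ring. This is only a single-threaded sketch: the
class and method names are made up for illustration and are not the
buf_ring API, and a real lockless ring would update its indices with
atomic compare-and-set operations and suitable memory barriers, which
plain Python does not model.

```python
import errno


class BoundedRing:
    """Fixed-capacity FIFO ring; hypothetical model, not the kernel buf_ring."""

    def __init__(self, size: int) -> None:
        if size < 2 or size & (size - 1):
            raise ValueError("size must be a power of two >= 2")
        self._slots = [None] * size
        self._mask = size - 1
        self._prod = 0  # index of the next slot to fill
        self._cons = 0  # index of the next slot to drain

    def __len__(self) -> int:
        return (self._prod - self._cons) & self._mask

    def enqueue(self, item) -> int:
        """Return 0 on success, ENOBUFS when the ring is full."""
        nxt = (self._prod + 1) & self._mask
        if nxt == self._cons:  # one slot stays empty to tell full from empty
            return errno.ENOBUFS
        self._slots[self._prod] = item
        self._prod = nxt
        return 0

    def dequeue(self):
        """Return the oldest item, or None when the ring is empty."""
        if self._cons == self._prod:
            return None
        item = self._slots[self._cons]
        self._slots[self._cons] = None
        self._cons = (self._cons + 1) & self._mask
        return item
```

Because the capacity is fixed when the ring is created, a backlog shows up
as an immediate ENOBUFS to the producer rather than as an ever-growing
queue; what the caller does with that error (drop, back-pressure, retry)
is the policy question left open above.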

From owner-freebsd-net@FreeBSD.ORG  Thu Dec  6 09:39:48 2012
Date: Thu, 6 Dec 2012 09:39:47 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Andre Oppermann <oppermann@networx.ch>
Subject: Re: Latency issues with buf_ring
In-Reply-To: <50BE56C8.1030804@networx.ch>
Message-ID: <alpine.BSF.2.00.1212060936010.78351@fledge.watson.org>
References: <1353259441.19423.YahooMailClassic@web121605.mail.ne1.yahoo.com>
 <201212041108.17645.jhb@freebsd.org>
 <CAJ-Vmo=tFFkeK2uADMPuBrgX6wN_9TSjAgs0WKPCrEfyhkG6Pw@mail.gmail.com>
 <50BE56C8.1030804@networx.ch>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Barney Cordoba <barney_cordoba@yahoo.com>,
 Adrian Chadd <adrian@freebsd.org>, John Baldwin <jhb@freebsd.org>,
 freebsd-net@freebsd.org

On Tue, 4 Dec 2012, Andre Oppermann wrote:

> For most if not all ethernet drivers from 100Mbit/s the TX DMA rings are so 
> large that buffering at the IFQ level doesn't make sense anymore and only 
> adds latency.  So it could simply directly put everything into the TX DMA 
> and not even try to soft-queue.  If the TX DMA ring is full ENOBUFS is 
> returned instead of filling yet another queue.  However there are ALTQ 
> interactions and other mechanisms which have to be considered too making it 
> a bit more involved.

I asserted for many years that software-side queueing would be subsumed by 
increasingly large DMA descriptor rings for the majority of devices and 
configurations.  However, this turns out not to have happened in a number of 
scenarios, and so I've revised my conclusions there.  I think we will continue 
to need to support transmit-side buffering, ideally in the form of a set of 
"libraries" that device drivers can use to avoid code replication and 
integrate queue management features fairly transparently.

I'm a bit worried by the level of copy-and-paste between 10gbps device drivers 
right now -- for 10/100/1000 drivers, the network stack contains the majority 
of the code, and the responsibility of the device driver is to advertise 
hardware features and manage interactions with rings, interrupts, etc.  On the 
10gbps side, we see lots of code replication, especially in queue management, 
and it suggests to me (as discussed for several years in a row at BSDCan and 
elsewhere) that it's time to do a bit of revisiting of ifnet, pull more code
back into the central stack and out of device drivers, etc.  That doesn't 
necessarily mean changing notions of ownership of event models, rather, 
centralising code in libraries rather than all over the place.  This is 
something to do with some care, of course.

Robert
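
The two positions in this exchange (hand frames straight to the TX DMA
ring and return ENOBUFS when it is full, versus keeping some transmit-side
buffering in shared library code) can be compared with a toy model.
Everything below is hypothetical: HwTxRing, TxPath and soft_limit are
illustrative names, not ifnet or driver interfaces, and ALTQ is not
modelled at all.

```python
import errno
from collections import deque


class HwTxRing:
    """Hypothetical stand-in for a NIC's TX descriptor ring."""

    def __init__(self, descriptors: int) -> None:
        self.descriptors = descriptors
        self.in_flight = deque()

    def post(self, frame) -> bool:
        """Hand one frame to the hardware; False when no descriptor is free."""
        if len(self.in_flight) >= self.descriptors:
            return False
        self.in_flight.append(frame)
        return True

    def reclaim(self, count: int) -> int:
        """Pretend the NIC finished sending up to `count` frames."""
        done = min(count, len(self.in_flight))
        for _ in range(done):
            self.in_flight.popleft()
        return done


class TxPath:
    """Shared transmit helper; soft_limit=0 gives the direct-to-ring policy."""

    def __init__(self, ring: HwTxRing, soft_limit: int = 0) -> None:
        self.ring = ring
        self.soft_limit = soft_limit
        self.softq = deque()

    def transmit(self, frame) -> int:
        # Preserve ordering: bypass the soft queue only when it is empty.
        if not self.softq and self.ring.post(frame):
            return 0
        if len(self.softq) < self.soft_limit:
            self.softq.append(frame)
            return 0
        return errno.ENOBUFS

    def tx_complete(self, count: int) -> None:
        """Free completed descriptors, then refill the ring from the soft queue."""
        self.ring.reclaim(count)
        while self.softq and self.ring.post(self.softq[0]):
            self.softq.popleft()
```

With soft_limit left at 0 the helper behaves like the direct-dispatch
proposal; a non-zero soft_limit adds a bounded software queue behind the
same interface, which is roughly the "library" shape suggested above. The
single class serving both policies is a design choice of this sketch only.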

From owner-freebsd-net@FreeBSD.ORG  Thu Dec  6 11:56:09 2012
Message-ID: <50C087D2.6020607@FreeBSD.org>
Date: Thu, 06 Dec 2012 15:56:02 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:9.0) Gecko/20120121 Thunderbird/9.0
MIME-Version: 1.0
To: =?ISO-8859-1?Q?Ermal_Lu=E7i?= <eri@freebsd.org>
Subject: Re: ipfw(4) dynamic states/rules and its callout
References: <CAPBZQG3T=Mvp80-mhFJJ5QRcv5+3SLwV__2bVgvP1YO=UFVOUA@mail.gmail.com>
In-Reply-To: <CAPBZQG3T=Mvp80-mhFJJ5QRcv5+3SLwV__2bVgvP1YO=UFVOUA@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Cc: freebsd-net <freebsd-net@freebsd.org>, freebsd-ipfw@freebsd.org

On 06.12.2012 13:13, Ermal Luçi wrote:
> Hello,
>
> I was looking at the ipfw code for dynamic states/rules and see that it
> unconditionally schedules a callout even if there is no work to do.
>
> Wouldn't it be better to reschedule it only when there is something to do,
> to avoid having a useless callout/event run every time on the system?
>
> Is there any complication I am missing here?
I thought about the same thing (and about possibly not allocating the
dynamic hash at all if we have no dynamic rules) while rewriting the
dynamic code.
The main "problem" is reliably determining whether we have dynamic rules
in our ruleset.

Rule checking can probably be done by adding an additional argument to
check_ipfw_struct(); however, the rest can be a bit more complicated,
since we can delete more than one rule (or a set with a bunch of rules)
at once.
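
The idea discussed here, arming the expiry callout only while dynamic
state actually exists, can be sketched as a toy model. FakeCallout and
DynStateTable are invented names, not ipfw code, and the sketch keys on
live state entries rather than on rules in the ruleset, so it does not
address the rule-deletion complication mentioned above.

```python
class FakeCallout:
    """Hypothetical one-shot timer; stands in for a kernel callout."""

    def __init__(self) -> None:
        self.armed = False
        self.fired = 0

    def schedule(self) -> None:
        self.armed = True

    def fire(self, handler) -> None:
        """Simulate the timer expiring: run the handler once if armed."""
        if self.armed:
            self.armed = False
            self.fired += 1
            handler()


class DynStateTable:
    """Toy dynamic-state table that ticks only while it holds entries."""

    def __init__(self, callout: FakeCallout) -> None:
        self.callout = callout
        self.states = {}  # flow -> remaining lifetime in ticks

    def add(self, flow, lifetime: int) -> None:
        was_empty = not self.states
        self.states[flow] = lifetime
        if was_empty:  # first entry: start the periodic expiry timer
            self.callout.schedule()

    def tick(self) -> None:
        """Age every entry, drop expired ones, re-arm only if work remains."""
        self.states = {f: t - 1 for f, t in self.states.items() if t > 1}
        if self.states:
            self.callout.schedule()
```

In this model the timer never runs while the table is empty, and it stops
re-arming itself once the last entry expires.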

>
> Regards,
> Ermal


From owner-freebsd-net@FreeBSD.ORG  Thu Dec  6 16:48:54 2012
Date: Thu, 6 Dec 2012 20:48:49 +0400
From: Gleb Smirnoff <glebius@FreeBSD.org>
To: Kevin Lo <kevlo@kevlo.org>
Subject: Re: Review request: fix return value of socket(2) on no family found
Message-ID: <20121206164849.GE48639@FreeBSD.org>
References: <50C03D8F.3090106@kevlo.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline
In-Reply-To: <50C03D8F.3090106@kevlo.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-net@FreeBSD.org

  Kevin,

On Thu, Dec 06, 2012 at 02:39:11PM +0800, Kevin Lo wrote:
K> Here's the patch mostly from NetBSD to make socket(2) return EAFNOSUPPORT
K> rather than EPROTONOSUPPORT if the family cannot be found.
K> 
K> http://people.freebsd.org/~kevlo/patch-socket
K> 
K> The man page documents the behavior specified in POSIX.1-2008:
K> 
K> http://pubs.opengroup.org/onlinepubs/9699919799/functions/socket.html
K> 
K> For reference, Linux, NetBSD, and OS X return EAFNOSUPPORT for this.

IMO, the proposed change is correct.

I'd suggest only a couple of things:

- Please commit the addition of the pffinddomain() function and its
  documentation separately from the socket() return value change.

- Maybe it is worth having a comment with a reference to POSIX in
  the code in uipc_socket.c that selects the appropriate error value.

-- 
Totus tuus, Glebius.

From owner-freebsd-net@FreeBSD.ORG  Thu Dec  6 18:02:10 2012
Message-ID: <1354816923.71234.YahooMailClassic@web121601.mail.ne1.yahoo.com>
Date: Thu, 6 Dec 2012 10:02:03 -0800 (PST)
From: Barney Cordoba <barney_cordoba@yahoo.com>
Subject: Re: Latency issues with buf_ring
To: Robert Watson <rwatson@FreeBSD.org>
In-Reply-To: <alpine.BSF.2.00.1212060936010.78351@fledge.watson.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: freebsd-net@freebsd.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 06 Dec 2012 18:02:10 -0000


--- On Thu, 12/6/12, Robert Watson <rwatson@FreeBSD.org> wrote:

> From: Robert Watson <rwatson@FreeBSD.org>
> Subject: Re: Latency issues with buf_ring
> To: "Andre Oppermann" <oppermann@networx.ch>
> Cc: "Barney Cordoba" <barney_cordoba@yahoo.com>,
>     "Adrian Chadd" <adrian@freebsd.org>,
>     "John Baldwin" <jhb@freebsd.org>, freebsd-net@freebsd.org
> Date: Thursday, December 6, 2012, 4:39 AM
> On Tue, 4 Dec 2012, Andre Oppermann wrote:
>
> > For most if not all ethernet drivers from 100Mbit/s the TX DMA rings
> > are so large that buffering at the IFQ level doesn't make sense
> > anymore and only adds latency.  So it could simply directly put
> > everything into the TX DMA and not even try to soft-queue.  If the
> > TX DMA ring is full ENOBUFS is returned instead of filling yet
> > another queue.  However there are ALTQ interactions and other
> > mechanisms which have to be considered too making it a bit more
> > involved.
>
> I asserted for many years that software-side queueing would be
> subsumed by increasingly large DMA descriptor rings for the majority
> of devices and configurations.  However, this turns out not to have
> happened in a number of scenarios, and so I've revised my conclusions
> there.  I think we will continue to need to support transmit-side
> buffering, ideally in the form of a set of "libraries" that device
> drivers can use to avoid code replication and integrate queue
> management features fairly transparently.
>
> I'm a bit worried by the level of copy-and-paste between 10gbps
> device drivers right now -- for 10/100/1000 drivers, the network
> stack contains the majority of the code, and the responsibility of
> the device driver is to advertise hardware features and manage
> interactions with rings, interrupts, etc.  On the 10gbps side, we see
> lots of code replication, especially in queue management, and it
> suggests to me (as discussed for several years in a row at BSDCan and
> elsewhere) that it's time to do a bit of revisiting of ifnet, pull
> more code back into the central stack and out of device drivers, etc.
> That doesn't necessarily mean changing notions of ownership of event
> models, rather, centralising code in libraries rather than all over
> the place.  This is something to do with some care, of course.
>
> Robert


More troubling than that is the notion that the same code that's suitable
for 10/100Mb/s should be used in a 10Gb/s environment. 10Gb/s requires a
completely different way of thinking.

BC

From owner-freebsd-net@FreeBSD.ORG  Thu Dec  6 18:31:30 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id E1F78335;
 Thu,  6 Dec 2012 18:31:30 +0000 (UTC)
 (envelope-from adrian.chadd@gmail.com)
Received: from mail-we0-f182.google.com (mail-we0-f182.google.com
 [74.125.82.182])
 by mx1.freebsd.org (Postfix) with ESMTP id 430358FC12;
 Thu,  6 Dec 2012 18:31:29 +0000 (UTC)
Received: by mail-we0-f182.google.com with SMTP id u54so3355301wey.13
 for <multiple recipients>; Thu, 06 Dec 2012 10:31:29 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=9N8tU9+r7+p/H1S/e9w8DvFsTOOWRV6viq8jbxsjCDE=;
 b=0l196Iqtljhs/jWu4mVSe6F/BfZnbP54cNR5WqCTHWcZlqISmSJuu3QJldnxpz938L
 V9Ip+eWmj+1e3yDifidFkNnu6voSswpn2tp4hg5LKsR+NkLsRRYOKoc7aLy+YwQm8VyU
 dtu6rGEkeUd0uLDI8g4Kdi22JkvzKIfg+KzEeaKCdtnhfKFVLc+y1wJDSS3N6ZqsxG77
 hIormynxB9DJhNUdtboy0S7p055mBvUd65NoV5gOsf/u3VwviLHLzniPB29DP/VAS0Sy
 KM77tvRvnFr6zncNI6LgDYz2FuizNUBMrhWfwhXMTWd5D72pmjQvSSJNvG4yK2U48Scc
 729w==
MIME-Version: 1.0
Received: by 10.180.104.69 with SMTP id gc5mr10477681wib.13.1354818689241;
 Thu, 06 Dec 2012 10:31:29 -0800 (PST)
Sender: adrian.chadd@gmail.com
Received: by 10.217.57.9 with HTTP; Thu, 6 Dec 2012 10:31:29 -0800 (PST)
In-Reply-To: <1354816923.71234.YahooMailClassic@web121601.mail.ne1.yahoo.com>
References: <alpine.BSF.2.00.1212060936010.78351@fledge.watson.org>
 <1354816923.71234.YahooMailClassic@web121601.mail.ne1.yahoo.com>
Date: Thu, 6 Dec 2012 10:31:29 -0800
X-Google-Sender-Auth: A_YlupdxiL_661PLRwV17rqfvQY
Message-ID: <CAJ-Vmo=FPQLNwkh4-d_BrB7Q1TrFG6nZAi9Ou-E1YZqLFsj68w@mail.gmail.com>
Subject: Re: Latency issues with buf_ring
From: Adrian Chadd <adrian@freebsd.org>
To: Barney Cordoba <barney_cordoba@yahoo.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: freebsd-net@freebsd.org, Robert Watson <rwatson@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 06 Dec 2012 18:31:31 -0000

There've been plenty of discussions about "better" ways of doing this
networking stuff.

Barney, are you able to make it to any of the developer summits?


adrian

From owner-freebsd-net@FreeBSD.ORG  Fri Dec  7 02:32:11 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 088B82B6;
 Fri,  7 Dec 2012 02:32:11 +0000 (UTC) (envelope-from mike@karels.net)
Received: from mail.karels.net (unknown [IPv6:2001:470:c004:1::5])
 by mx1.freebsd.org (Postfix) with ESMTP id BE7338FC0C;
 Fri,  7 Dec 2012 02:32:10 +0000 (UTC)
Received: from mail.karels.net (localhost [127.0.0.1])
 by mail.karels.net (8.14.5/8.14.5) with ESMTP id qB72W5ji039704;
 Thu, 6 Dec 2012 20:32:07 -0600 (CST) (envelope-from mike@karels.net)
Message-Id: <201212070232.qB72W5ji039704@mail.karels.net>
To: Gleb Smirnoff <glebius@freebsd.org>
From: Mike Karels <mike@karels.net>
Subject: Re: Review request: fix return value of socket(2) on no family found
In-reply-to: Your message of Thu, 06 Dec 2012 20:48:49 +0400.
 <20121206164849.GE48639@FreeBSD.org>
Date: Thu, 06 Dec 2012 20:32:05 -0600
Cc: freebsd-net@freebsd.org, Kevin Lo <kevlo@kevlo.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: mike@karels.net
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 07 Dec 2012 02:32:11 -0000

> On Thu, Dec 06, 2012 at 02:39:11PM +0800, Kevin Lo wrote:
> K> Here's the patch mostly from NetBSD to make socket(2) return EAFNOSUPPORT
> K> rather than EPROTONOSUPPORT if the family cannot be found.
> K> 
> K> http://people.freebsd.org/~kevlo/patch-socket
> K> 
> K> The man page documents the behavior specified in POSIX.1-2008:
> K> 
> K> http://pubs.opengroup.org/onlinepubs/9699919799/functions/socket.html
> K> 
> K> For reference, Linux, NetBSD, and OS X return EAFNOSUPPORT for this.

> IMO, the proposed change is correct.

I'd have to disagree.  EAFNOSUPPORT means "Address family not supported by
protocol family".  However, the socket syscall does not take an address
family parameter.  It takes a protocol family, a socket type, and an
optional protocol.  EPFNOSUPPORT would be the correct error if the protocol
family is not supported.  I don't remember if I missed this when POSIX
was being balloted, or if my objection was unsuccessful.

That said, I will say that consistency across systems and with the standard
is a useful thing, so I'll reluctantly agree with the change to the errno.
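
To check which errno a particular system actually returns in this case, a minimal userland sketch in Python follows. It is not part of the proposed patch. The helper name probe_socket_errno and the family value 9999 are assumptions made for illustration; 9999 is simply assumed not to correspond to any supported family.

```python
import errno
import socket


def probe_socket_errno(family):
    """Return the symbolic errno raised by socket(2) for `family`,
    or None if the socket was created successfully."""
    try:
        s = socket.socket(family, socket.SOCK_STREAM, 0)
    except OSError as exc:
        # Map the numeric errno to its name, e.g. "EAFNOSUPPORT".
        return errno.errorcode.get(exc.errno, str(exc.errno))
    s.close()
    return None


# Arbitrary value, assumed not to name any real protocol family.
UNSUPPORTED_FAMILY = 9999

result = probe_socket_errno(UNSUPPORTED_FAMILY)
print("socket(%d, SOCK_STREAM, 0) -> %s" % (UNSUPPORTED_FAMILY, result))
```

Running this on each system of interest shows which of EAFNOSUPPORT, EPFNOSUPPORT or EPROTONOSUPPORT it reports.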

However, the proposed text for socket(2) doesn't make sense:

+The address family (domain) is not supported or the
+specified domain is not supported by this protocol family.

The domain is the protocol family.  This could reasonably say just
"The protocol family (domain) is not supported."  It might further
say "This specific error value may not be accurate, but is specified
by POSIX.1-2008."

		Mike

From owner-freebsd-net@FreeBSD.ORG  Fri Dec  7 12:27:50 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx2.freebsd.org (mx2.freebsd.org [69.147.83.53])
 by hub.freebsd.org (Postfix) with ESMTP id B9FAB329;
 Fri,  7 Dec 2012 12:27:50 +0000 (UTC) (envelope-from ae@FreeBSD.org)
Received: from butcher-nb.yandex.net (hub.freebsd.org
 [IPv6:2001:1900:2254:206c::16:88])
 by mx2.freebsd.org (Postfix) with ESMTP id 7CCE33B4C04;
 Fri,  7 Dec 2012 12:27:39 +0000 (UTC)
Message-ID: <50C1E09A.5050301@FreeBSD.org>
Date: Fri, 07 Dec 2012 16:27:06 +0400
From: "Andrey V. Elsukov" <ae@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:17.0) Gecko/17.0 Thunderbird/17.0
MIME-Version: 1.0
To: freebsd-net@freebsd.org, freebsd-ipfw <freebsd-ipfw@freebsd.org>
Subject: [RFC] IPv6 ifaddr hash
X-Enigmail-Version: 1.4.6
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: melifaro@FreeBSD.org
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 07 Dec 2012 12:27:50 -0000

Hi All,

We have discovered that ipfw(4) performs very poorly with our rules.
One of the biggest problems is rules that use the O_IP6_XXX_ME opcodes,
which check whether a packet's addresses match the locally configured
IPv6 addresses.

For IPv4 we have the in_ifaddr hash for quick address lookup, but there
is nothing similar for IPv6. So I have implemented a first patch based on
the IPv4 code, but there are several questions I want to discuss.

The patch is here:
	http://people.freebsd.org/~ae/in6_ifaddrhash.diff

1. The hash size. I made it the same as for IPv4, but I think 512
buckets is too many.

2. Which hash function is best to use?

3. Using all 128 bits of the address for hashing seems like overkill.

-- 
WBR, Andrey V. Elsukov

From owner-freebsd-net@FreeBSD.ORG  Fri Dec  7 20:49:20 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 5A47744E
 for <freebsd-net@freebsd.org>; Fri,  7 Dec 2012 20:49:20 +0000 (UTC)
 (envelope-from yanegomi@gmail.com)
Received: from mail-lb0-f182.google.com (mail-lb0-f182.google.com
 [209.85.217.182])
 by mx1.freebsd.org (Postfix) with ESMTP id CAD558FC13
 for <freebsd-net@freebsd.org>; Fri,  7 Dec 2012 20:49:19 +0000 (UTC)
Received: by mail-lb0-f182.google.com with SMTP id go10so929794lbb.13
 for <freebsd-net@freebsd.org>; Fri, 07 Dec 2012 12:49:18 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:date:message-id:subject:from:to:content-type;
 bh=VM5IVFLdPcmF4olnribjjlibJA6ulb6YPihOOnrDBhI=;
 b=S1a2j1L9vP+kK8zCy66pm5ANMIm620YgStnN/DjpHDVSX2lXMvPXTF1XEjRMv3EkZo
 x6U5MSilE02OKfwX/s4aFSVo0qxZIhUx/V7cLSsZ44i+FLYP8NHEgTwsmnGb5OmUaL5v
 JR09coX4Jlo6Y59ifUc2azlyLNbsx+xonyx4fZhUTLTDpKxG/Uk4UYDGRXNb86gu97D9
 WX+hwU2DnJhCiDWml/u/oFn4HBsaazciYlMv9EuFXSha3KYPO4ISB5YpA7W29k2hxZup
 2MEJUzwcb2yjTforPlpH3ksUxgzhZa4cZXNvPAeqV7dgfsPha8X06F2y0qwk6jYiE3aI
 S9ZQ==
MIME-Version: 1.0
Received: by 10.152.45.229 with SMTP id q5mr6566784lam.34.1354913358674; Fri,
 07 Dec 2012 12:49:18 -0800 (PST)
Received: by 10.112.99.70 with HTTP; Fri, 7 Dec 2012 12:49:18 -0800 (PST)
Date: Fri, 7 Dec 2012 12:49:18 -0800
Message-ID: <CAGH67wRCa62R9cOT3CtW=WV2kYMT03u1QnB-hq7o62xzgbp42g@mail.gmail.com>
Subject: Can't create lagg interfaces on recent HEAD (2012.12.05 based sources)
From: Garrett Cooper <yanegomi@gmail.com>
To: freebsd-net@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 07 Dec 2012 20:49:20 -0000

    I can't seem to create a lagg'ed interface on HEAD with ixgbe
(it's failing when creating a cloned interface), whereas creating it
on 9.1-STABLE built from a couple weeks ago just worked. Ideas?
Thanks,
-Garrett

# cat /root/ISI-GENERIC
include         GENERIC
ident           ISI-GENERIC

makeoptions     MODULES_OVERRIDE="bxe cxgb cxgbe em igb ixgbe qlxgb"
nodevice        bxe
nodevice        cxgb
nodevice        cxgbe
nodevice        em
nodevice        igb
nodevice        ixgbe
nodevice        qlxgb

options         OFED
options         SDP
options         IPOIB_CM

device          ipoib
device          mlx4ib
device          mlxen
device          mthca
# uname -a
FreeBSD wf158.west.isilon.com 10.0-CURRENT FreeBSD 10.0-CURRENT #0: Thu Dec  6 23:41:57 PST 2012 root@wf158.west.isilon.com:/usr/obj/usr/src/sys/ISI-GENERIC  amd64
# service netif restart
Stopping Network: lo0 ix0 ix1.
lo0: flags=8048<LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ix0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=407bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO>
        ether 00:1b:21:88:51:c4
        inet6 fe80::21b:21ff:fe88:51c4%ix0 prefixlen 64 scopeid 0x2
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        status: active
ix1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=407bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO>
        ether 00:1b:21:88:51:c5
        inet6 fe80::21b:21ff:fe88:51c5%ix1 prefixlen 64 scopeid 0x3
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        status: active
ifconfig: SIOCIFCREATE2: Invalid argument
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Starting Network: lo0 ix0 ix1.
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
        inet 127.0.0.1 netmask 0xff000000
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
ix0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=407bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO>
        ether 00:1b:21:88:51:c4
        inet6 fe80::21b:21ff:fe88:51c4%ix0 prefixlen 64 scopeid 0x2
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        status: active
ix1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=407bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWTSO>
        ether 00:1b:21:88:51:c5
        inet6 fe80::21b:21ff:fe88:51c5%ix1 prefixlen 64 scopeid 0x3
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet autoselect (10Gbase-SR <full-duplex>)
        status: active
# cat /etc/rc.conf
hostname="wf158.west.isilon.com"
ifconfig_em0="DHCP"
sshd_enable="YES"
ntpd_enable="YES"
# Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
dumpdev="NO"

nfs_client_enable="YES"
rpc_lockd_enable="YES"
rpc_statd_enable="YES"
rpcbind_enable="YES"

tahi_interface="em1"
#
#ipv6_default_interface="$tahi_interface"
#ipv6_network_interfaces="$tahi_interface"
#
eval "ifconfig_${tahi_interface}_ipv6='inet6 up -accept_rtadv -auto_linklocal -nud'"

devfs_system_ruleset="tahi_bpf"

kld_list="ixgbe"
ifconfig_ix0="up mtu 9000"
ifconfig_ix1="up mtu 9000"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport ix0 laggport ix1 lagghash l3 7.7.6.42 netmask 255.255.255.0"

From owner-freebsd-net@FreeBSD.ORG  Sat Dec  8 09:49:37 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id AD12B96B;
 Sat,  8 Dec 2012 09:49:37 +0000 (UTC)
 (envelope-from melifaro@FreeBSD.org)
Received: from mail.ipfw.ru (unknown [IPv6:2a01:4f8:120:6141::2])
 by mx1.freebsd.org (Postfix) with ESMTP id 3CC928FC15;
 Sat,  8 Dec 2012 09:49:37 +0000 (UTC)
Received: from v6.mpls.in ([2a02:978:2::5] helo=ws.su29.net)
 by mail.ipfw.ru with esmtpsa (TLSv1:CAMELLIA256-SHA:256)
 (Exim 4.76 (FreeBSD)) (envelope-from <melifaro@FreeBSD.org>)
 id 1ThH5l-0009WS-SS; Sat, 08 Dec 2012 13:53:05 +0400
Message-ID: <50C30D21.6070804@FreeBSD.org>
Date: Sat, 08 Dec 2012 13:49:21 +0400
From: "Alexander V. Chernikov" <melifaro@FreeBSD.org>
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
 rv:9.0) Gecko/20120121 Thunderbird/9.0
MIME-Version: 1.0
To: "Andrey V. Elsukov" <ae@FreeBSD.org>
Subject: Re: [RFC] IPv6 ifaddr hash
References: <50C1E09A.5050301@FreeBSD.org>
In-Reply-To: <50C1E09A.5050301@FreeBSD.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-net@freebsd.org, freebsd-ipfw <freebsd-ipfw@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 08 Dec 2012 09:49:37 -0000

On 07.12.2012 16:27, Andrey V. Elsukov wrote:
> Hi All,
>
> We have discovered that ipfw(4) performs very poorly with our rules.
> One of the biggest problems is rules that use the O_IP6_XXX_ME opcodes,
> which check whether a packet's addresses match the locally configured
> IPv6 addresses.
>
> For IPv4 we have the in_ifaddr hash for quick address lookup, but there
> is nothing similar for IPv6. So I have implemented a first patch based on
> the IPv4 code, but there are several questions I want to discuss.
>
> The patch is here:
> 	http://people.freebsd.org/~ae/in6_ifaddrhash.diff
>
> 1. The hash size. I made it the same as for IPv4, but I think 512
> buckets is too many.
Although an IPv6 configuration can have up to twice as many addresses as
the equivalent IPv4 one (because of link-local addresses), 512 really is
too many. Maybe 64 or 128 would be better for the common case?
>
> 2. Which hash function is best to use?
We have at least three hashes in our kernel that I know of: the one in
ng_netflow, the one in flowtable, and the one in ipfw.

Can you provide some benchmarks and hash-quality figures on real-world
data for those?
>
> 3. Using all 128 bits of the address for hashing seems like overkill.

There are people who use IPv6 address space just like plain IPv4, e.g.:

XX:YY:ZZ::1, XX:YY:ZZ::2, ... ::n, or even XX:YY:ZZ::A.B.C.D, so hashing
only the upper 64 bits can lead to collisions.

Hashing the lower 64 bits is more promising, but there can be other use
cases, too.

IMHO we can just test the performance of the hash functions and see how
large the difference is and whether it is worth discussing.


There is another problem: link-local addresses. They are all the same
(or fall into a small number of distinct groups), so one (or more)
buckets will always be filled by them.

This can result in:
* some searches for global addresses being much slower
* IPv6 code accepting a packet addressed to the link-local address of
  another interface (RFC 4291 sec 2.5.6)

We can work around the first problem by adding global unicast addresses
at the list head and link-local addresses at the list tail, but this
leaves us with the second one.

One possible solution is to add the interface index as another parameter
to the hash function, and use it if and only if the address is
link-local.
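
A rough userland sketch of the points above, in Python rather than kernel C, and not the code from the patch. The bucket count, the fold() mixing function and its multiplier, and the names hash_low64 and hash_high64 are all assumptions chosen for illustration. It compares hashing the lower and upper 64 bits for sequentially numbered hosts in one /64, and mixes the interface index in only for link-local addresses.

```python
import ipaddress
from collections import Counter

NBUCKETS = 128          # assumed table size (64 or 128 were suggested above)
MASK64 = (1 << 64) - 1


def fold(value, ifindex=0):
    """Fold a 64-bit value (plus an optional interface index) into a bucket."""
    h = value ^ (value >> 32)
    h ^= h >> 16
    h ^= ifindex * 0x9E3779B1   # arbitrary odd multiplier, illustration only
    return h % NBUCKETS


def hash_low64(addr, ifindex=0):
    """Hash the lower 64 bits; ifindex is mixed in only for link-local."""
    ip = ipaddress.IPv6Address(addr)
    return fold(int(ip) & MASK64, ifindex if ip.is_link_local else 0)


def hash_high64(addr):
    """Hash only the upper 64 bits (the prefix), for comparison."""
    return fold(int(ipaddress.IPv6Address(addr)) >> 64)


# Sequential host addresses in one /64, numbered "IPv4-style".
addrs = ["2001:db8:1:2::%x" % n for n in range(1, 257)]
low = Counter(hash_low64(a) for a in addrs)
high = Counter(hash_high64(a) for a in addrs)

print("low 64 bits:  %d buckets used, longest chain %d"
      % (len(low), max(low.values())))
print("high 64 bits: %d buckets used, longest chain %d"
      % (len(high), max(high.values())))

# The same link-local address configured on two interfaces.
print("fe80::1 on ifindex 2 and 3:",
      hash_low64("fe80::1", ifindex=2), hash_low64("fe80::1", ifindex=3))
```

With this input every address shares the same upper 64 bits, so the prefix-only hash puts them all in one bucket, while the lower-64-bit hash spreads them evenly. Real-world address sets would still need the benchmarks asked for above.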

>


From owner-freebsd-net@FreeBSD.ORG  Sat Dec  8 14:18:35 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id F2F5613B
 for <freebsd-net@freebsd.org>; Sat,  8 Dec 2012 14:18:34 +0000 (UTC)
 (envelope-from barney_cordoba@yahoo.com)
Received: from nm36-vm3.bullet.mail.ne1.yahoo.com
 (nm36-vm3.bullet.mail.ne1.yahoo.com [98.138.229.115])
 by mx1.freebsd.org (Postfix) with ESMTP id 745AA8FC08
 for <freebsd-net@freebsd.org>; Sat,  8 Dec 2012 14:18:34 +0000 (UTC)
Received: from [98.138.90.52] by nm36.bullet.mail.ne1.yahoo.com with NNFMP;
 08 Dec 2012 14:16:21 -0000
Received: from [98.138.226.167] by tm5.bullet.mail.ne1.yahoo.com with NNFMP;
 08 Dec 2012 14:16:21 -0000
Received: from [127.0.0.1] by omp1068.mail.ne1.yahoo.com with NNFMP;
 08 Dec 2012 14:16:21 -0000
X-Yahoo-Newman-Property: ymail-3
X-Yahoo-Newman-Id: 902537.3254.bm@omp1068.mail.ne1.yahoo.com
Received: (qmail 39576 invoked by uid 60001); 8 Dec 2012 14:16:21 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024;
 t=1354976181; bh=cCcKVvzQe2jx/nQq3/qUi34qiBYrnA9hs/cr8i7KDU0=;
 h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
 b=H/BJKoZMNMlgTlzyj9Xb0TzIrrAbsnvRbd62ugBbhJNw6zizr/Te1wt4mMQ2FWU4MKp7kzV417AjvzJfaVE/VRyHeNoPEejcttNorT7T9CT7vXOFCOEDHQz3iqr6wlzDwSnN1mK5OSHxCKpCUdYg/h+Av1QvZBwM1jhgglcZ6KU=
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com;
 h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type;
 b=Mkov8DmvpED72KKv7CqWmLtSmq7nZwjxPVvMqyUwXJmwTylD4lCDB6SlKtuMs4X1o3jkU1PaXvNxqIA3wfqkpLhYmEqan9J/eH6ohQsKPLmjNcQKUP+zDU3ztRPTuZLrv4awYm5tZobRXeeSQmSO1nLlPx8vKRA7YmfZyv6Vmow=;
X-YMail-OSG: 6gSNFVwVM1nudH.8ncrzN5BQuH4BCsZ2O9T1RcHiadbpkBl
 W4IvR51hfzgwIYgtxadGLQ1p6cv2BYfMP.DY.VItBjyi48bmrTt4jXfNpZY6
 i7qzlV0r3YwAOI0XvffkLPuliEMV.BeWurqqu1sBm0df3P4opGSyo26JPr2y
 guLLUCfeL9Pv_hmPT1JHfazP59JKa7IPZ8T5GE.AZTz.4XYgp9A5Jm55E0Ok
 8JIQggz7T.JWavkv9muNP3p6YUDRxUcTt9KYfX.Pu2hkrpI0LS7Zf_TvW_5g
 Nd.IEHFXfwuJAsZamrNpa0KbdxPV460WqZ0NtA1EvOh8nXSMw1rkJ4KskCfU
 nv_cPcfbDzTrUM1CyDzJR.z8lGwFXlliAEqEJDB3wBhnTmiAEGrmwvC9uNxf
 tN_KtVY_F7p_gpX1kPm.uzjXd9KAXwnVTiACZZ2QZAkAYT1BhXoNleg3Tifk
 iwFzaU6d8Om2blx9LfuMkAQdkuqFPPn8wXkZbsvKBHGOXvfvyYOkTgkhYGq_
 p60wWQ07pxdi9SK2nZW1Os..6YFm.Ng--
Received: from [174.48.128.27] by web121603.mail.ne1.yahoo.com via HTTP;
 Sat, 08 Dec 2012 06:16:21 PST
X-Rocket-MIMEInfo: 001.001,
 CgotLS0gT24gVGh1LCAxMi82LzEyLCBBZHJpYW4gQ2hhZGQgPGFkcmlhbkBmcmVlYnNkLm9yZz4gd3JvdGU6Cgo.IEZyb206IEFkcmlhbiBDaGFkZCA8YWRyaWFuQGZyZWVic2Qub3JnPgo.IFN1YmplY3Q6IFJlOiBMYXRlbmN5IGlzc3VlcyB3aXRoIGJ1Zl9yaW5nCj4gVG86ICJCYXJuZXkgQ29yZG9iYSIgPGJhcm5leV9jb3Jkb2JhQHlhaG9vLmNvbT4KPiBDYzogZnJlZWJzZC1uZXRAZnJlZWJzZC5vcmcsICJSb2JlcnQgV2F0c29uIiA8cndhdHNvbkBmcmVlYnNkLm9yZz4KPiBEYXRlOiBUaHVyc2RheSwgRGUBMAEBAQE-
X-Mailer: YahooMailClassic/15.1.1 YahooMailWebService/0.8.128.478
Message-ID: <1354976181.39549.YahooMailClassic@web121603.mail.ne1.yahoo.com>
Date: Sat, 8 Dec 2012 06:16:21 -0800 (PST)
From: Barney Cordoba <barney_cordoba@yahoo.com>
Subject: Re: Latency issues with buf_ring
To: Adrian Chadd <adrian@freebsd.org>
In-Reply-To: <CAJ-Vmo=FPQLNwkh4-d_BrB7Q1TrFG6nZAi9Ou-E1YZqLFsj68w@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: freebsd-net@freebsd.org, Robert Watson <rwatson@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 08 Dec 2012 14:18:35 -0000


--- On Thu, 12/6/12, Adrian Chadd <adrian@freebsd.org> wrote:

> From: Adrian Chadd <adrian@freebsd.org>
> Subject: Re: Latency issues with buf_ring
> To: "Barney Cordoba" <barney_cordoba@yahoo.com>
> Cc: freebsd-net@freebsd.org, "Robert Watson" <rwatson@freebsd.org>
> Date: Thursday, December 6, 2012, 1:31 PM
> There've been plenty of discussions
> about "better" ways of doing this
> networking stuff.
> 
> Barney, are you able to make it to any of the developer
> summits?
> 

Perhaps the "summits" are part of the problem? The goal should be to
get the best ideas, not just the best ideas of those with the time,
resources, and desire to attend a summit.

Lists are the best summit. You can get ideas from people who may not be
allowed by their contract obligations to attend such a summit.

BC

From owner-freebsd-net@FreeBSD.ORG  Sat Dec  8 16:43:25 2012
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
 by hub.freebsd.org (Postfix) with ESMTP id 9460AA45
 for <freebsd-net@freebsd.org>; Sat,  8 Dec 2012 16:43:25 +0000 (UTC)
 (envelope-from adrian.chadd@gmail.com)
Received: from mail-we0-f182.google.com (mail-we0-f182.google.com
 [74.125.82.182])
 by mx1.freebsd.org (Postfix) with ESMTP id 2083F8FC08
 for <freebsd-net@freebsd.org>; Sat,  8 Dec 2012 16:43:24 +0000 (UTC)
Received: by mail-we0-f182.google.com with SMTP id u54so783095wey.13
 for <freebsd-net@freebsd.org>; Sat, 08 Dec 2012 08:43:23 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:sender:in-reply-to:references:date
 :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
 bh=fyUosl2T8pcnHmgZG/J8MvjZ5RpISZZrKw7iS0o1WYc=;
 b=ahvxIf4SFtfs77ZMFCGVVoYCVVLHLtoLjQS1Am34VXTHv9aje/MlGwdQ8azeIWsoEl
 7ki6JsksMP0eWKOQGPonJEA9Z+fogwUdCfJJKdE/WcbXKe5JoL/LIyN1IyTymEU8s6Eh
 LZ1IXY+1WJct2HuURYzMH4iDlCas/m1pM9IhNoTvUysXmX5GKwHBspHgZFXaTQDMkIVc
 QVONrNPcLSE0WoLUqNrC7nFdoef7hoARQAgU3DLkLzi6V2XQnVr8aaRv7Q0bUNr5fOgH
 YdY57HOAl1F6wpAUrdvg0F8XJgXnz5Ivt/9UydKNc0UlLQAV8VfMPmd/PcXhno2PtyVQ
 TyXA==
MIME-Version: 1.0
Received: by 10.216.85.211 with SMTP id u61mr3622786wee.212.1354985003781;
 Sat, 08 Dec 2012 08:43:23 -0800 (PST)
Sender: adrian.chadd@gmail.com
Received: by 10.217.57.9 with HTTP; Sat, 8 Dec 2012 08:43:23 -0800 (PST)
In-Reply-To: <1354976181.39549.YahooMailClassic@web121603.mail.ne1.yahoo.com>
References: <CAJ-Vmo=FPQLNwkh4-d_BrB7Q1TrFG6nZAi9Ou-E1YZqLFsj68w@mail.gmail.com>
 <1354976181.39549.YahooMailClassic@web121603.mail.ne1.yahoo.com>
Date: Sat, 8 Dec 2012 08:43:23 -0800
X-Google-Sender-Auth: bXq710l51IzsOcc5e4aDu_14LPs
Message-ID: <CAJ-Vmo=CccPiEMQ_6zfk=741p3sEknw0n=cN9fnD2pEddNHmwg@mail.gmail.com>
Subject: Re: Latency issues with buf_ring
From: Adrian Chadd <adrian@freebsd.org>
To: Barney Cordoba <barney_cordoba@yahoo.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: FreeBSD Net <freebsd-net@freebsd.org>
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 08 Dec 2012 16:43:25 -0000

The summits are just another tool for collaboration. Plenty of
discussion and coding is done via list interaction.

The problem isn't how the collaboration is done. The problem is having
people to design and code things up. :-)
Otherwise talk is just that - talk.

So, someone come up with a few examples of how to better implement the
network device producer/consumer model and get back to me/us about it.


Adrian