Date:      Tue, 8 Jul 2008 04:16:42 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Andre Oppermann <andre@freebsd.org>
Cc:        Ingo Flaschberger <if@xip.at>, FreeBSD Net <freebsd-net@freebsd.org>, Bart Van Kerckhove <bart@it-ss.be>, Paul <paul@gtcomm.net>
Subject:   Re: Freebsd IP Forwarding performance (question, and some info) [7-stable, current, em, smp]
Message-ID:  <20080708034304.R21502@delplex.bde.org>
In-Reply-To: <48724238.2020103@freebsd.org>
References:  <4867420D.7090406@gtcomm.net> <4869B025.9080006@gtcomm.net><486A7E45.3030902@gtcomm.net> <486A8F24.5010000@gtcomm.net><486A9A0E.6060308@elischer.org> <486B41D5.3060609@gtcomm.net><alpine.LFD.1.10.0807021052041.557@filebunker.xip.at><486B4F11.6040906@gtcomm.net><alpine.LFD.1.10.0807021155280.557@filebunker.xip.at><486BC7F5.5070604@gtcomm.net><20080703160540.W6369@delplex.bde.org><486C7F93.7010308@gtcomm.net><20080703195521.O6973@delplex.bde.org><486D35A0.4000302@gtcomm.net><alpine.LFD.1.10.0807041106591.19613@filebunker.xip.at><486DF1A3.9000409@gtcomm.net><alpine.LFD.1.10.0807041303490.20760@filebunker.xip.at><486E65E6.3060301@gtcomm.net> <alpine.LFD.1.10.0807052356130.2145@filebunker.xip.at> <2d3001c8def1$f4309b90$020b000a@bartwrkstxp> <486FFF70.3090402@gtcomm.net> <48701921.7090107@gtcomm.net> <4871E618.1080500@freebsd.org> <20080708002228.G680@besplex.bde.org> <48724238.2020103@freebsd.org>

On Mon, 7 Jul 2008, Andre Oppermann wrote:

> Bruce Evans wrote:
>> So it seems that the major overheads are not near the driver (as I already
>> knew), and upper layers are responsible for most of the cache misses.
>> The packet header is accessed even in monitor mode, so I think most of
>> the cache misses in upper layers are not related to the packet header.
>> Maybe they are due mainly to perfect non-locality for mbufs.
>
> Monitor mode doesn't access the payload packet header.  It only looks
> at the mbuf (which has a structure called mbuf packet header).  The mbuf
> header is hot in the cache because the driver just touched it and filled
> in the information.  The packet content (the payload) is cold and just
> arrived via DMA in DRAM.

Why does it use ntohs() then? :-).  From if_ethersubr.c:

% static void
% ether_input(struct ifnet *ifp, struct mbuf *m)
% {
% 	struct ether_header *eh;
% 	u_short etype;
% 
% 	if ((ifp->if_flags & IFF_UP) == 0) {
% 		m_freem(m);
% 		return;
% 	}
% #ifdef DIAGNOSTIC
% 	if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0) {
% 		if_printf(ifp, "discard frame at !IFF_DRV_RUNNING\n");
% 		m_freem(m);
% 		return;
% 	}
% #endif
% 	/*
% 	 * Do consistency checks to verify assumptions
% 	 * made by code past this point.
% 	 */
% 	if ((m->m_flags & M_PKTHDR) == 0) {
% 		if_printf(ifp, "discard frame w/o packet header\n");
% 		ifp->if_ierrors++;
% 		m_freem(m);
% 		return;
% 	}
% 	if (m->m_len < ETHER_HDR_LEN) {
% 		/* XXX maybe should pullup? */
% 		if_printf(ifp, "discard frame w/o leading ethernet "
% 				"header (len %u pkt len %u)\n",
% 				m->m_len, m->m_pkthdr.len);
% 		ifp->if_ierrors++;
% 		m_freem(m);
% 		return;
% 	}
% 	eh = mtod(m, struct ether_header *);

eh now points outside of the mbuf header, into the packet data.

% 	etype = ntohs(eh->ether_type);

First access outside of mbuf header.

But this seems to be bogus and might be fixed by compiler optimization,
since etype is not used until after the monitor-mode return.  This may
have been broken by debugging cruft -- in 5.2, etype is used immediately
after here in a printf about discarding oversize frames.  The compiler
might also pessimize things by reordering code.
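
A minimal user-space sketch of the reordering described above: the
ethertype load is moved to after the monitor-mode return, so a frame
claimed by monitor mode never touches packet data.  Everything prefixed
with sk_/SK_ is a hypothetical stand-in, not the kernel's struct ifnet,
struct mbuf or ether_input(), and returning 0 for a claimed frame is
only an assumption made to keep the example small.

```c
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>	/* ntohs(), htons() */

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
#define	SK_IFF_MONITOR	0x1

struct sk_ether_header {
	uint8_t		dhost[6];
	uint8_t		shost[6];
	uint16_t	type;		/* network byte order */
};

struct sk_ifnet {
	int		flags;
	unsigned long	ibytes;
};

struct sk_mbuf {
	const struct sk_ether_header *data;	/* packet data (cold after DMA) */
	size_t		pktlen;			/* metadata (hot, set by driver) */
};

/*
 * Returns 0 if monitor mode claimed the frame, otherwise the
 * host-order ethertype.  Only metadata is read before the monitor
 * check; the first load from packet data happens after it.
 */
static uint16_t
sk_ether_input(struct sk_ifnet *ifp, const struct sk_mbuf *m)
{
	ifp->ibytes += m->pktlen;		/* metadata only */

	if (ifp->flags & SK_IFF_MONITOR)
		return (0);			/* packet data never touched */

	return (ntohs(m->data->type));		/* first access to packet data */
}
```

In this sketch a monitor-mode frame can even carry a null data pointer
without faulting, which is one way to check that the early return really
avoids the packet data.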

% 	if (m->m_pkthdr.rcvif == NULL) {
% 		if_printf(ifp, "discard frame w/o interface pointer\n");
% 		ifp->if_ierrors++;
% 		m_freem(m);
% 		return;
% 	}
% #ifdef DIAGNOSTIC
% 	if (m->m_pkthdr.rcvif != ifp) {
% 		if_printf(ifp, "Warning, frame marked as received on %s\n",
% 			m->m_pkthdr.rcvif->if_xname);
% 	}
% #endif
% 
% 	if (ETHER_IS_MULTICAST(eh->ether_dhost)) {
% 		if (ETHER_IS_BROADCAST(eh->ether_dhost))
% 			m->m_flags |= M_BCAST;
% 		else
% 			m->m_flags |= M_MCAST;
% 		ifp->if_imcasts++;
% 	}

Another dereference of eh (2 dereferences unless optimizable and
optimized).  Here the result is actually used early, but I think you
don't care enough about maintaining if_imcasts to do this.
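
To show why this test has to read packet data, here is a small
user-space sketch of the classification.  It assumes the usual Ethernet
convention (group bit is the low-order bit of the first destination
octet, broadcast is all ones); sk_classify_dhost() and the SK_ names are
hypothetical and are not the kernel macros.

```c
#include <stdint.h>
#include <string.h>

#define	SK_ETHER_ADDR_LEN	6

enum sk_cast { SK_UNICAST, SK_MCAST, SK_BCAST };

/*
 * Classify a destination address.  The unicast case reads only the
 * first octet; the group case compares all six octets against the
 * broadcast address.
 */
static enum sk_cast
sk_classify_dhost(const uint8_t dhost[SK_ETHER_ADDR_LEN])
{
	static const uint8_t bcast[SK_ETHER_ADDR_LEN] =
	    { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

	if ((dhost[0] & 0x01) == 0)
		return (SK_UNICAST);
	if (memcmp(dhost, bcast, SK_ETHER_ADDR_LEN) == 0)
		return (SK_BCAST);
	return (SK_MCAST);
}
```

Either way at least one byte of the cold packet data is loaded, so doing
this before the monitor-mode check costs a cache miss per packet.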

% 
% #ifdef MAC
% 	/*
% 	 * Tag the mbuf with an appropriate MAC label before any other
% 	 * consumers can get to it.
% 	 */
% 	mac_ifnet_create_mbuf(ifp, m);
% #endif
% 
% 	/*
% 	 * Give bpf a chance at the packet.
% 	 */
% 	ETHER_BPF_MTAP(ifp, m);

I think this can access the whole packet, but usually doesn't.

% 
% 	/*
% 	 * If the CRC is still on the packet, trim it off. We do this once
% 	 * and once only in case we are re-entered. Nothing else on the
% 	 * Ethernet receive path expects to see the FCS.
% 	 */
% 	if (m->m_flags & M_HASFCS) {
% 		m_adj(m, -ETHER_CRC_LEN);
% 		m->m_flags &= ~M_HASFCS;
% 	}
% 
% 	ifp->if_ibytes += m->m_pkthdr.len;
% 
% 	/* Allow monitor mode to claim this frame, after stats are updated. */
% 	if (ifp->if_flags & IFF_MONITOR) {
% 		m_freem(m);
% 		return;
% 	}

Finally return in monitor mode.

I don't see any stats update before here except for the stray if_imcasts
one.

BTW, stats behave strangely in monitor mode:
- netstat -I <interface> 1 works except:
   - the byte counts are 0 every other second (the next second counts the
     previous 2), while the packet counts are updated every second
   - one system started showing bge0 stats for all interfaces.  Perhaps
     unrelated.
- systat -ip shows all counts 0.
I think this is due to stats maintained by the driver working but other
stats not.  The mixture seems strange at user level.

Bruce


