From: Fabien Thomas
Date: Tue, 21 Dec 2010 14:11:19 +0100
To: Eugene Grosbein
Cc: net@freebsd.org
Subject: Re: lagg/lacp poor traffic distribution
Message-Id: <84530C06-AC2E-4E2B-BFD4-693902BB0FA6@netasq.com>
In-Reply-To: <4D1083D6.6010707@rdtc.ru>
References: <4D0CFEFF.3000902@rdtc.ru> <1292844095.1917.136.camel@stormi> <4D1083D6.6010707@rdtc.ru>

>>> Hi!
>>>
>>> I've loaded router using two lagg interfaces in LACP mode.
>>> lagg0 has IP address and two ports (em0 and em1) and carry untagged frames.
>>> lagg1 has no IP address and has two ports (igb0 and igb1) and carry
>>> about 1000 dot-q vlans with lots of hosts in each vlan.
>>>
>>> For lagg1, lagg distributes outgoing traffic over two ports just fine.
>>> For lagg0 (untagged ethernet segment with only 2 MAC addresses)
>>> less than 0.07% (54Mbit/s max) of traffic goes to em0
>>> and over 99.92% goes to em1, that's bad.
>>>
>>> That's general traffic of several thousands of customers surfing the web,
>>> using torrents etc. I've glanced over lagg/lacp sources in src/sys/net/
>>> and found nothing suspicious, it should extract and use srcIP/dstIP for hash.
>>>
>>> How do I debug this problem?
>>>
>>> Eugene Grosbein
>>> _______________________________________________
>>> freebsd-net@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>>
>> I had this problem with the igb driver, and I found that lagg selects the
>> outgoing interface based on the packet header flowid field if the M_FLOWID
>> flag is set. And in the igb driver code flowid is set as
>>
>> #if __FreeBSD_version >= 800000
>> 			rxr->fmp->m_pkthdr.flowid = que->msix;
>> 			rxr->fmp->m_flags |= M_FLOWID;
>> #endif
>>
>> The same thing in the em driver with MULTIQUEUE
>>
>> That does not give enough flows to balance traffic well, so I
>> commented out the check in if_lagg.c
>>
>> lagg_lb_start(struct lagg_softc *sc, struct mbuf *m)
>> {
>> 	struct lagg_lb *lb = (struct lagg_lb *)sc->sc_psc;
>> 	struct lagg_port *lp = NULL;
>> 	uint32_t p = 0;
>>
>> //	if (m->m_flags & M_FLOWID)
>> //		p = m->m_pkthdr.flowid;
>> //	else
>>
>> and with this change I have much better load distribution across interfaces.
>>
>> Hope it helps.
>
> You are perfectly right. By disabling flow usage I've obtained load sharing
> close to even (final patch follows). Two questions:
>
> 1. Is it a bug or design problem?

How many queues do you have with igb? If there is only one, that would
explain why flowid is bad for load balancing with lagg.
The problem is that flowid only works well when the number of queues is
equal to, or a multiple of, the number of lagg ports.

> 2. Will I get problems like packet reordering by permanently disabling
> usage of these flows in lagg(4)?
>
> --- if_lagg.c.orig	2010-12-20 22:53:21.000000000 +0600
> +++ if_lagg.c	2010-12-21 13:37:20.000000000 +0600
> @@ -168,6 +168,11 @@
>      &lagg_failover_rx_all, 0,
>      "Accept input from any interface in a failover lagg");
>
> +int lagg_use_flows = 1;
> +SYSCTL_INT(_net_link_lagg, OID_AUTO, use_flows, CTLFLAG_RW,
> +    &lagg_use_flows, 1,
> +    "Use flows for load sharing");
> +
>  static int
>  lagg_modevent(module_t mod, int type, void *data)
>  {
> @@ -1666,7 +1671,7 @@
>  	struct lagg_port *lp = NULL;
>  	uint32_t p = 0;
>
> -	if (m->m_flags & M_FLOWID)
> +	if (lagg_use_flows && (m->m_flags & M_FLOWID))
>  		p = m->m_pkthdr.flowid;
>  	else
>  		p = lagg_hashmbuf(m, lb->lb_key);
> --- if_lagg.h.orig	2010-12-21 16:34:35.000000000 +0600
> +++ if_lagg.h	2010-12-21 16:35:27.000000000 +0600
> @@ -242,6 +242,8 @@
>  int		lagg_enqueue(struct ifnet *, struct mbuf *);
>  uint32_t	lagg_hashmbuf(struct mbuf *, uint32_t);
>
> +extern int lagg_use_flows;
> +
>  #endif /* _KERNEL */
>
>  #endif /* _NET_LAGG_H */
> --- ieee8023ad_lacp.c.orig	2010-12-21 16:36:09.000000000 +0600
> +++ ieee8023ad_lacp.c	2010-12-21 16:35:58.000000000 +0600
> @@ -812,7 +812,7 @@
>  		return (NULL);
>  	}
>
> -	if (m->m_flags & M_FLOWID)
> +	if (lagg_use_flows && (m->m_flags & M_FLOWID))
>  		hash = m->m_pkthdr.flowid;
>  	else
>  		hash = lagg_hashmbuf(m, lsc->lsc_hashkey);
>
> Eugene Grosbein
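
The remark about queue count versus port count can be checked with a small
userland model. This is only a sketch, not the lagg(4) code. It assumes that
the driver stamps each packet's flowid with a zero-based RX queue index, that
packets arrive round-robin over the queues, and that the output port is
picked as flowid modulo the number of ports. All function names below are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PORTS 8

/* Assumption: flowid is the zero-based index of the RX queue the packet
 * arrived on, and packets are spread round-robin over the queues. */
uint32_t
flowid_from_queue(uint32_t packet_no, uint32_t nqueues)
{
	assert(nqueues > 0);
	return (packet_no % nqueues);
}

/* Assumption: the output port is flowid modulo the number of ports. */
uint32_t
select_port(uint32_t flowid, uint32_t nports)
{
	assert(nports > 0 && nports <= MAX_PORTS);
	return (flowid % nports);
}

/* Count how many of npackets end up on each output port. */
void
count_per_port(uint32_t npackets, uint32_t nqueues, uint32_t nports,
    uint32_t counts[MAX_PORTS])
{
	uint32_t i, n;

	for (i = 0; i < MAX_PORTS; i++)
		counts[i] = 0;
	for (n = 0; n < npackets; n++)
		counts[select_port(flowid_from_queue(n, nqueues), nports)]++;
}
```

Under these assumptions, one queue and two ports puts every packet on a
single port, four queues split evenly, and three queues give a 2:1 split,
which matches the "equal to or a multiple of" rule.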