From owner-freebsd-bugs@FreeBSD.ORG Mon Jun 24 11:40:00 2013
Delivered-To: freebsd-bugs@smarthost.ysv.freebsd.org
Resent-Date: Mon, 24 Jun 2013 11:40:00 GMT
Resent-Message-Id: <201306241140.r5OBe0s5010521@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Boris Astardzhiev
Message-Id: <201306241132.r5OBWkdo085748@oldred.freebsd.org>
Date: Mon, 24 Jun 2013 11:32:46 GMT
From: Boris Astardzhiev
To:
    freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
Subject: kern/179926: LACP: active aggregator selection bug
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Bug reports
X-List-Received-Date: Mon, 24 Jun 2013 11:40:00 -0000

>Number:         179926
>Category:       kern
>Synopsis:       LACP: active aggregator selection bug
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jun 24 11:40:00 UTC 2013
>Closed-Date:
>Last-Modified:
>Originator:     Boris Astardzhiev
>Release:        FreeBSD 9.1-RELEASE #0 r243826
>Organization:
Smartcom Bulgaria AD
>Environment:
FreeBSD freebsd91 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243826: Tue Dec 4 06:55:39 UTC 2012 root@obrian.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386
>Description:
Hi,

I've been investigating the LACP implementation in FreeBSD and have
encountered a bug. Here's the setup:

"                    ---------          ----------             "
"                    | lagg1 |          | bond0  |             "
" ---------          |      xl0--------eth0      |   --------- "
" | hosts |----b1----1 FBSD rl0--------eth1 Linux|---| hosts | "
" ---------          | 9.1  rl1--------eth2      |   --------- "
"                    |       |          |        |             "
"                    ---------          ----------             "

On a FreeBSD 9.1-RELEASE #0 r243826 system a lagg is created and three
interfaces are added to it:
- xl0
- rl0
- rl1

On the Linux system *ONLY ONE* interface is added to the bonding interface:
- eth0

Note: I think the Linux side could be replaced with any other LACP
implementation.

The lagg protocol on both systems is LACP. LACPDU transmission/reception
takes place only between xl0 and eth0.

Here's the result:

root@freebsd91:/root # ifconfig lagg1
lagg1: flags=8843 metric 0 mtu 1500
	options=2008
	ether 00:10:b5:7f:97:fb
	inet6 fe80::210:b5ff:fe7f:97fb%lagg1 prefixlen 64 scopeid 0x9
	nd6 options=21
	media: Ethernet autoselect
	status: active
	laggproto lacp lagghash l2,l3,l4
	laggport: xl0 flags=18
	laggport: rl1 flags=1c
	laggport: rl0 flags=1c

As I see it, xl0 is the only usable link, so the aggregation must rely
on it. However, the LACP implementation has chosen the other two links,
which haven't received a single LACPDU.

I think the problem is related to the selection of the best active
aggregator in lacp_select_active_aggregator(). I've included the debug
output enabled via sysctl net.lacp_debug.

.. snippet ...
Jun 24 10:41:43 freebsd91 kernel: xl0: new pstate 3f
Jun 24 10:41:43 freebsd91 kernel: rl0: lacp_sm_mux: state 4
Jun 24 10:41:43 freebsd91 kernel: rl1: lacp_sm_mux: state 4
Jun 24 10:41:43 freebsd91 kernel: xl0: lacp_sm_mux: state 3
Jun 24 10:41:43 freebsd91 kernel: xl0: enable distributing on aggregator [(8000,00-10-B5-7F-97-FB,0126,0000,0000),(FFFF,E0-8F-EC-00-B5-2F,0009,0000,0000)], nports 0 -> 1
Jun 24 10:41:43 freebsd91 kernel: lacp_select_active_aggregator
Jun 24 10:41:43 freebsd91 kernel: [(8000,00-10-B5-7F-97-FB,0126,0000,0000),(FFFF,00-00-00-00-00-00,0000,0000,0000)], speed=200000000, nports=2
Jun 24 10:41:43 freebsd91 kernel: [(8000,00-10-B5-7F-97-FB,0126,0000,0000),(FFFF,E0-8F-EC-00-B5-2F,0009,0000,0000)], speed=100000000, nports=1
Jun 24 10:41:43 freebsd91 kernel: active aggregator not changed
Jun 24 10:41:43 freebsd91 kernel: new [(8000,00-10-B5-7F-97-FB,0126,0000,0000),(FFFF,00-00-00-00-00-00,0000,0000,0000)]
Jun 24 10:41:43 freebsd91 kernel: xl0: mux_state 3 -> 4
Jun 24 10:41:43 freebsd91 kernel: xl0: lacpdu transmit
.. snippet ...

Although there is an aggregator with an active partner, the
implementation has chosen the other aggregator, whose partner system ID
is all zeroes:

Jun 24 10:41:43 freebsd91 kernel: new [(8000,00-10-B5-7F-97-FB,0126,0000,0000),(FFFF,00-00-00-00-00-00,0000,0000,0000)]

Do you think that aggregators without a partner should be skipped in
favour of aggregators with active partners?

I've applied a patch that fixes this issue, and xl0 remains the only
active link, but I'm not sure that it is correct or that it takes the
right approach.

Any comments are appreciated.

Greetings,
Boris Astardzhiev,
Smartcom Bulgaria AD
>How-To-Repeat:
Follow the setup described above and the bug is reproduced.
>Fix:
A patch is attached.

Patch attached with submission follows:

diff --git a/sys/net/ieee8023ad_lacp.c b/sys/net/ieee8023ad_lacp.c
index 70de743..5060223 100644
--- a/sys/net/ieee8023ad_lacp.c
+++ b/sys/net/ieee8023ad_lacp.c
@@ -947,6 +947,7 @@ lacp_select_active_aggregator(struct lacp_softc *lsc)
 	struct lacp_aggregator *best_la = NULL;
 	uint64_t best_speed = 0;
 	char buf[LACP_LAGIDSTR_MAX+1];
+	u_char zero_mac[] = { 0, 0, 0, 0, 0, 0 };
 
 	LACP_TRACE(NULL);
 
@@ -957,6 +958,13 @@ lacp_select_active_aggregator(struct lacp_softc *lsc)
 			continue;
 		}
 
+		/*
+		 * Skip aggregators that have no partner.
+		 */
+		if (!memcmp(LACP_SYS_MAC(la->la_partner),
+		    zero_mac, sizeof(zero_mac)))
+			continue;
+
 		speed = lacp_aggregator_bandwidth(la);
 		LACP_DPRINTF((NULL, "%s, speed=%jd, nports=%d\n",
 		    lacp_format_lagid_aggregator(la, buf, sizeof(buf)),
diff --git a/sys/net/ieee8023ad_lacp.h b/sys/net/ieee8023ad_lacp.h
index 9481ce2..4076513 100644
--- a/sys/net/ieee8023ad_lacp.h
+++ b/sys/net/ieee8023ad_lacp.h
@@ -264,6 +264,7 @@ struct lacp_softc {
 	((((s1) ^ (s2)) & (mask)) == 0)
 
 #define	LACP_SYS_PRI(peer)	(peer).lip_systemid.lsi_prio
+#define	LACP_SYS_MAC(peer)	(peer).lip_systemid.lsi_mac
 
 #define	LACP_PORT(_lp)	((struct lacp_port *)(_lp)->lp_psc)
 #define	LACP_SOFTC(_sc)	((struct lacp_softc *)(_sc)->sc_psc)

>Release-Note:
>Audit-Trail:
>Unformatted:
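
For illustration only, here is a minimal userland sketch of the behaviour
discussed in this report. It is not the kernel code. It assumes, based on
the debug log and the diff context (lacp_aggregator_bandwidth, best_speed),
that selection prefers the aggregator with the highest bandwidth. The
names struct agg, agg_has_partner, select_active and log_aggs are
hypothetical and exist only in this sketch; the real function also
considers the currently active aggregator, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical, simplified stand-in for struct lacp_aggregator. */
struct agg {
	uint8_t  partner_mac[MAC_LEN];	/* all zeroes: no partner learned */
	uint64_t speed;			/* aggregate bandwidth in bits/s */
	int      nports;		/* ports attached to this aggregator */
};

/* The two aggregators reported in the debug log of this report. */
static const struct agg log_aggs[] = {
	/* rl0 + rl1: partner MAC 00-00-00-00-00-00, speed=200000000 */
	{ { 0, 0, 0, 0, 0, 0 }, 200000000, 2 },
	/* xl0: partner MAC E0-8F-EC-00-B5-2F, speed=100000000 */
	{ { 0xe0, 0x8f, 0xec, 0x00, 0xb5, 0x2f }, 100000000, 1 },
};

/* True when the partner system MAC is not all zeroes. */
static bool
agg_has_partner(const struct agg *a)
{
	static const uint8_t zero_mac[MAC_LEN] = { 0 };

	return (memcmp(a->partner_mac, zero_mac, sizeof(zero_mac)) != 0);
}

/*
 * Return the index of the aggregator with the highest bandwidth, or -1
 * if none qualifies. With skip_partnerless set, aggregators without a
 * partner are ignored, which mirrors the idea of the proposed patch.
 */
static int
select_active(const struct agg *aggs, size_t n, bool skip_partnerless)
{
	uint64_t best_speed = 0;
	int best = -1;

	assert(aggs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (aggs[i].nports == 0)
			continue;	/* empty aggregator */
		if (skip_partnerless && !agg_has_partner(&aggs[i]))
			continue;	/* no LACPDU ever seen from a peer */
		if (aggs[i].speed > best_speed) {
			best_speed = aggs[i].speed;
			best = (int)i;
		}
	}
	return (best);
}
```

Under these assumptions, select_active(log_aggs, 2, false) returns index 0
(the partnerless rl0/rl1 aggregator, as in the log), while
select_active(log_aggs, 2, true) returns index 1 (the xl0 aggregator).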