From owner-freebsd-net@freebsd.org Tue Sep 6 08:42:59 2016
Subject: Re: lagg Interfaces - don't do Gratuitous ARP?
To: freebsd-net@freebsd.org
References: <0D84203FAAFD0A8E7BBB24A3@[10.12.30.106]>
From: Steven Hartland
Date: Tue, 6 Sep 2016 09:42:59 +0100
MIME-Version: 1.0
In-Reply-To: <0D84203FAAFD0A8E7BBB24A3@[10.12.30.106]>
Content-Type: multipart/mixed; boundary="------------8DA201AA1DCCAD657B20F1BD"
List-Id: Networking and TCP/IP with FreeBSD

This is a multi-part message in MIME format.
--------------8DA201AA1DCCAD657B20F1BD
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit

Yes, known issue I'm afraid.

I created a patch set to address this, but there were objections so it
was removed. See the attached patch, which is based on 10.2-RELEASE.

On 06/09/2016 09:13, Karl Pielorz wrote:
>
> Hi,
>
> We've just changed the network config on a box - going from a single
> 'em1' adapter to a lagg failover of em0, em1.
>
> This works - but we noticed after the machine rebooted, we couldn't
> ping it from other hosts.
>
> Checking on other machines on the LAN they still had an ARP entry for
> the changed hosts old em1's MAC.
>
> On the lagg machine - the MAC used for the NIC's (and lagg) was now
> the MAC for em0 (which I believe is correct behaviour).
>
> Should the act of lagg / IP's coming up not send a gratuitous ARP for
> them or something to avoid this?
>
> As it was we had to log into a number of key boxes and 'arp -d' the
> IP's - and take a ~800 second 'hit' on other boxes timing out the old
> MAC.
>
>
> -Karl
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

--------------8DA201AA1DCCAD657B20F1BD
Content-Type: text/plain; charset=UTF-8; name="lagg-arp02-link.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="lagg-arp02-link.patch"

Index: sys/net/if.c
===================================================================
--- sys/net/if.c	(revision 291071)
+++ sys/net/if.c	(working copy)
@@ -126,7 +126,7 @@ SX_SYSINIT(ifdescr_sx, &ifdescr_sx, "ifnet descr")
 
 void	(*bridge_linkstate_p)(struct ifnet *ifp);
 void	(*ng_ether_link_state_p)(struct ifnet *ifp, int state);
-void	(*lagg_linkstate_p)(struct ifnet *ifp, int state);
+void	(*lagg_linkstate_p)(struct ifnet *ifp);
 /* These are external hooks for CARP. */
 void	(*carp_linkstate_p)(struct ifnet *ifp);
 void	(*carp_demote_adj_p)(int, char *);
@@ -1980,6 +1980,8 @@ if_unroute(struct ifnet *ifp, int flag, int fam)
 
 	if (ifp->if_carp)
 		(*carp_linkstate_p)(ifp);
+	if (ifp->if_lagg)
+		(*lagg_linkstate_p)(ifp);
 	rt_ifmsg(ifp);
 }
 
@@ -2001,6 +2003,8 @@ if_route(struct ifnet *ifp, int flag, int fam)
 			pfctlinput(PRC_IFUP, ifa->ifa_addr);
 	if (ifp->if_carp)
 		(*carp_linkstate_p)(ifp);
+	if (ifp->if_lagg)
+		(*lagg_linkstate_p)(ifp);
 	rt_ifmsg(ifp);
 #ifdef INET6
 	in6_if_up(ifp);
@@ -2015,17 +2019,27 @@ int	(*vlan_tag_p)(struct ifnet *, uint16_t *);
 int	(*vlan_setcookie_p)(struct ifnet *, void *);
 void	*(*vlan_cookie_p)(struct ifnet *);
 
+void
+if_link_state_change(struct ifnet *ifp, int link_state)
+{
+
+	return if_link_state_change_cond(ifp, link_state, 0);
+}
+
 /*
  * Handle a change in the interface link state. To avoid LORs
  * between driver lock and upper layer locks, as well as possible
  * recursions, we post event to taskqueue, and all job
  * is done in static do_link_state_change().
+ *
+ * If the current link state matches link_state and force isn't
+ * specified no action is taken.
  */
 void
-if_link_state_change(struct ifnet *ifp, int link_state)
+if_link_state_change_cond(struct ifnet *ifp, int link_state, int force)
 {
-	/* Return if state hasn't changed. */
-	if (ifp->if_link_state == link_state)
+
+	if (ifp->if_link_state == link_state && !force)
 		return;
 
 	ifp->if_link_state = link_state;
@@ -2053,7 +2067,7 @@ do_link_state_change(void *arg, int pending)
 	if (ifp->if_bridge)
 		(*bridge_linkstate_p)(ifp);
 	if (ifp->if_lagg)
-		(*lagg_linkstate_p)(ifp, link_state);
+		(*lagg_linkstate_p)(ifp);
 
 	if (IS_DEFAULT_VNET(curvnet))
 		devctl_notify("IFNET", ifp->if_xname,
Index: sys/net/if_lagg.c
===================================================================
--- sys/net/if_lagg.c	(revision 291071)
+++ sys/net/if_lagg.c	(working copy)
@@ -106,7 +106,7 @@ static int	lagg_port_create(struct lagg_softc *, s
 static int	lagg_port_destroy(struct lagg_port *, int);
 static struct mbuf *lagg_input(struct ifnet *, struct mbuf *);
 static void	lagg_linkstate(struct lagg_softc *);
-static void	lagg_port_state(struct ifnet *, int);
+static void	lagg_port_state(struct ifnet *);
 static int	lagg_port_ioctl(struct ifnet *, u_long, caddr_t);
 static int	lagg_port_output(struct ifnet *, struct mbuf *,
 		    const struct sockaddr *, struct route *);
@@ -1774,8 +1774,13 @@ lagg_linkstate(struct lagg_softc *sc)
 			break;
 		}
 	}
-	if_link_state_change(sc->sc_ifp, new_link);
 
+	/*
+	 * Force state change to ensure ifnet_link_event is generated allowing
+	 * protocols to notify other nodes of potential address move.
+	 */
+	if_link_state_change_cond(sc->sc_ifp, new_link, 1);
+
 	/* Update if_baudrate to reflect the max possible speed */
 	switch (sc->sc_proto) {
 		case LAGG_PROTO_FAILOVER:
@@ -1797,7 +1802,7 @@ lagg_linkstate(struct lagg_softc *sc)
 }
 
 static void
-lagg_port_state(struct ifnet *ifp, int state)
+lagg_port_state(struct ifnet *ifp)
 {
 	struct lagg_port *lp = (struct lagg_port *)ifp->if_lagg;
 	struct lagg_softc *sc = NULL;
@@ -1813,7 +1818,7 @@ static void
 	LAGG_WUNLOCK(sc);
 }
 
-struct lagg_port *
+static struct lagg_port *
 lagg_link_active(struct lagg_softc *sc, struct lagg_port *lp)
 {
 	struct lagg_port *lp_next, *rval = NULL;
Index: sys/net/if_lagg.h
===================================================================
--- sys/net/if_lagg.h	(revision 291071)
+++ sys/net/if_lagg.h	(working copy)
@@ -281,7 +281,7 @@ struct lagg_port {
 #define	LAGG_UNLOCK_ASSERT(_sc)	rm_assert(&(_sc)->sc_mtx, RA_UNLOCKED)
 
 extern struct mbuf *(*lagg_input_p)(struct ifnet *, struct mbuf *);
-extern void	(*lagg_linkstate_p)(struct ifnet *, int );
+extern void	(*lagg_linkstate_p)(struct ifnet *);
 
 int		lagg_enqueue(struct ifnet *, struct mbuf *);
 
Index: sys/net/if_var.h
===================================================================
--- sys/net/if_var.h	(revision 291071)
+++ sys/net/if_var.h	(working copy)
@@ -962,6 +962,7 @@ struct ifmultiaddr *
 void	if_free(struct ifnet *);
 void	if_initname(struct ifnet *, const char *, int);
 void	if_link_state_change(struct ifnet *, int);
+void	if_link_state_change_cond(struct ifnet *, int, int);
 int	if_printf(struct ifnet *, const char *, ...) __printflike(2, 3);
 void	if_qflush(struct ifnet *);
 void	if_ref(struct ifnet *);
Index: sys/netinet/if_ether.c
===================================================================
--- sys/netinet/if_ether.c	(revision 291071)
+++ sys/netinet/if_ether.c	(working copy)
@@ -97,12 +97,14 @@ VNET_PCPUSTAT_SYSUNINIT(arpstat);
 #endif /* VIMAGE */
 
 static VNET_DEFINE(int, arp_maxhold) = 1;
+static VNET_DEFINE(int, arp_on_link) = 1;
 
 #define	V_arpt_keep	VNET(arpt_keep)
 #define	V_arpt_down	VNET(arpt_down)
 #define	V_arp_maxtries	VNET(arp_maxtries)
 #define	V_arp_proxyall	VNET(arp_proxyall)
 #define	V_arp_maxhold	VNET(arp_maxhold)
+#define	V_arp_on_link	VNET(arp_on_link)
 
 SYSCTL_VNET_INT(_net_link_ether_inet, OID_AUTO, max_age, CTLFLAG_RW,
 	&VNET_NAME(arpt_keep), 0,
@@ -135,6 +137,7 @@ static void	in_arpinput(struct mbuf *);
 static void arp_iflladdr(void *arg __unused, struct ifnet *ifp);
 
 static eventhandler_tag iflladdr_tag;
+static eventhandler_tag ifnet_link_event_tag;
 
 static const struct netisr_handler arp_nh = {
 	.nh_name = "arp",
@@ -559,6 +562,9 @@ SYSCTL_INT(_net_link_ether_inet, OID_AUT
 	CTLFLAG_RW, &arp_maxpps, 0,
 	"Maximum number of remotely triggered ARP messages that can be "
 	"logged per second");
+SYSCTL_INT(_net_link_ether_inet, OID_AUTO, arp_on_link, CTLFLAG_VNET | CTLFLAG_RW,
+	&VNET_NAME(arp_on_link), 0,
+	"Send gratuitous ARP's on interface link up events");
 
 #define	ARP_LOG(pri, ...)	do {					\
 	if (ppsratecheck(&arp_lastlog, &arp_curpps, arp_maxpps))	\
@@ -972,7 +978,7 @@ arp_ifinit(struct ifnet *ifp, struct ifa
 	if (ntohl(dst_in->sin_addr.s_addr) == INADDR_ANY)
 		return;
 
-	arp_announce_ifaddr(ifp, dst_in->sin_addr, IF_LLADDR(ifp));
+	arp_announce_addr(ifp, &dst_in->sin_addr, IF_LLADDR(ifp));
 
 	/*
 	 * interface address is considered static entry
@@ -972,38 +978,91 @@ arp_ifinit(struct ifnet *ifp, struct ifa
 	ifa->ifa_rtrequest = NULL;
 }
 
-void
-arp_announce_ifaddr(struct ifnet *ifp, struct in_addr addr, u_char *enaddr)
+void __noinline
+arp_announce_addr(struct ifnet *ifp, const struct in_addr *addr, u_char *enaddr)
 {
 
-	if (ntohl(addr.s_addr) != INADDR_ANY)
-		arprequest(ifp, &addr, &addr, enaddr);
+	if (ntohl(addr->s_addr) != INADDR_ANY)
+		arprequest(ifp, addr, addr, enaddr);
 }
 
 /*
- * Sends gratuitous ARPs for each ifaddr to notify other
- * nodes about the address change.
+ * Send gratuitous ARPs for all interfaces addresses to notify other nodes of
+ * changes.
+ *
+ * This is a noop if the interface isn't up or has been flagged for no ARP.
  */
-static __noinline void
-arp_handle_ifllchange(struct ifnet *ifp)
+void __noinline
+arp_announce(struct ifnet *ifp)
 {
+	int i, cnt, entries;
+	u_char *lladdr;
 	struct ifaddr *ifa;
+	struct in_addr *addr, *head;
 
+	if (!(ifp->if_flags & IFF_UP) || (ifp->if_flags & IFF_NOARP))
+		return;
+
+	entries = 8;
+	cnt = 0;
+	head = malloc(sizeof(*addr) * entries, M_TEMP, M_NOWAIT);
+	if (head == NULL) {
+		log(LOG_INFO, "arp_announce: malloc %d entries failed\n",
+		    entries);
+		return;
+	}
+
+	/* Take a copy then process to avoid locking issues. */
+	IF_ADDR_RLOCK(ifp);
 	TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) {
-		if (ifa->ifa_addr->sa_family == AF_INET)
-			arp_ifinit(ifp, ifa);
+		if (ifa->ifa_addr->sa_family != AF_INET)
+			continue;
+
+		if (cnt == entries) {
+			addr = (struct in_addr *)realloc(head, sizeof(*addr) *
+			    (entries + 8), M_TEMP, M_NOWAIT);
+			if (addr == NULL) {
+				log(LOG_INFO, "arp_announce: realloc to %d "
+				    "entries failed\n", entries + 8);
+				/* Process what we have. */
+				break;
+			}
+			entries += 8;
+			head = addr;
+		}
+
+		addr = head + cnt;
+		bcopy(IFA_IN(ifa), addr, sizeof(*addr));
+		cnt++;
 	}
+	IF_ADDR_RUNLOCK(ifp);
+
+	lladdr = IF_LLADDR(ifp);
+	for (i = 0; i < cnt; i++) {
+		arp_announce_addr(ifp, head + i, lladdr);
+	}
+	free(head, M_TEMP);
+}
+
+/*
+ * A handler for interface linkstate change events.
+ */
+static void
+arp_ifnet_link_event(void *arg __unused, struct ifnet *ifp, int linkstate)
+{
+
+	if (linkstate == LINK_STATE_UP && V_arp_on_link)
+		arp_announce(ifp);
 }
 
 /*
- * A handler for interface link layer address change event.
+ * A handler for interface link layer address change events.
  */
 static __noinline void
 arp_iflladdr(void *arg __unused, struct ifnet *ifp)
 {
 
-	if ((ifp->if_flags & IFF_UP) != 0)
-		arp_handle_ifllchange(ifp);
+	arp_announce(ifp);
 }
 
 static void
@@ -1016,8 +1075,12 @@ arp_init(void)
 {
 
 	netisr_register(&arp_nh);
-	if (IS_DEFAULT_VNET(curvnet))
+
+	if (IS_DEFAULT_VNET(curvnet)) {
 		iflladdr_tag = EVENTHANDLER_REGISTER(iflladdr_event,
 		    arp_iflladdr, NULL, EVENTHANDLER_PRI_ANY);
+		ifnet_link_event_tag = EVENTHANDLER_REGISTER(ifnet_link_event,
+		    arp_ifnet_link_event, 0, EVENTHANDLER_PRI_ANY);
+	}
 }
 
 SYSINIT(arp, SI_SUB_PROTO_DOMAIN, SI_ORDER_ANY, arp_init, 0);
Index: sys/netinet/if_ether.h
===================================================================
--- sys/netinet/if_ether.h	(revision 291071)
+++ sys/netinet/if_ether.h	(working copy)
@@ -120,7 +120,8 @@ int	arpresolve(struct ifnet *ifp, struct
 void	arprequest(struct ifnet *, const struct in_addr *,
 	    const struct in_addr *, u_char *);
 void	arp_ifinit(struct ifnet *, struct ifaddr *);
-void	arp_announce_ifaddr(struct ifnet *, struct in_addr addr, u_char *);
+void	arp_announce(struct ifnet *);
+void	arp_announce_addr(struct ifnet *, const struct in_addr *addr, u_char *);
 void	arp_ifscrub(struct ifnet *, uint32_t);
 #endif
 
Index: sys/netinet/in_var.h
===================================================================
--- sys/netinet/in_var.h	(revision 291071)
+++ sys/netinet/in_var.h	(working copy)
@@ -126,6 +126,9 @@ extern struct rwlock in_ifaddr_lock;
 #define	IN_IFADDR_WLOCK_ASSERT()	rw_assert(&in_ifaddr_lock, RA_WLOCKED)
 #define	IN_IFADDR_WUNLOCK()	rw_wunlock(&in_ifaddr_lock)
 
+#define	IFA_IN(ifa) \
+	(&((struct sockaddr_in *)ifa->ifa_addr)->sin_addr)
+
 /*
  * Macro for finding the internet address structure (in_ifaddr)
  * corresponding to one of our IP addresses (in_addr).
Index: sys/netinet/ip_carp.c
===================================================================
--- sys/netinet/ip_carp.c	(revision 291071)
+++ sys/netinet/ip_carp.c	(working copy)
@@ -1009,13 +1009,12 @@ static void
 carp_send_arp(struct carp_softc *sc)
 {
 	struct ifaddr *ifa;
-	struct in_addr addr;
 
 	CARP_FOREACH_IFA(sc, ifa) {
 		if (ifa->ifa_addr->sa_family != AF_INET)
 			continue;
-		addr = ((struct sockaddr_in *)ifa->ifa_addr)->sin_addr;
-		arp_announce_ifaddr(sc->sc_carpdev, addr, LLADDR(&sc->sc_addr));
+		arp_announce_addr(sc->sc_carpdev, IFA_IN(ifa),
+		    LLADDR(&sc->sc_addr));
 	}
 }
 
@@ -1037,18 +1036,16 @@ carp_iamatch(struct ifaddr *ifa, uint8_t **enaddr)
 static void
 carp_send_na(struct carp_softc *sc)
 {
-	static struct in6_addr mcast = IN6ADDR_LINKLOCAL_ALLNODES_INIT;
 	struct ifaddr *ifa;
-	struct in6_addr *in6;
 
 	CARP_FOREACH_IFA(sc, ifa) {
-		if (ifa->ifa_addr->sa_family != AF_INET6)
+		if (ifa->ifa_addr->sa_family != AF_INET6 ||
+		    IFA_ND6_NA_UNSOLICITED_SKIP(ifa))
 			continue;
 
-		in6 = IFA_IN6(ifa);
-		nd6_na_output(sc->sc_carpdev, &mcast, in6,
-		    ND_NA_FLAG_OVERRIDE, 1, NULL);
-		DELAY(1000);	/* XXX */
+		nd6_na_output_unsolicited_addr(sc->sc_carpdev, IFA_IN6(ifa),
+		    IFA_ND6_NA_BASE_FLAGS(sc->sc_carpdev, ifa));
+		DELAY(nd6_na_unsolicited_addr_delay(ifa));
 	}
 }
 
Index: sys/netinet6/in6.c
===================================================================
--- sys/netinet6/in6.c	(revision 291071)
+++ sys/netinet6/in6.c	(working copy)
@@ -113,7 +113,7 @@ VNET_DECLARE(int, icmp6_nodeinfo_oldmcprefix);
 #define	V_icmp6_nodeinfo_oldmcprefix	VNET(icmp6_nodeinfo_oldmcprefix)
 
 /*
- * Definitions of some costant IP6 addresses.
+ * Definitions of some constant IP6 addresses.
  */
 const struct in6_addr in6addr_any = IN6ADDR_ANY_INIT;
 const struct in6_addr in6addr_loopback = IN6ADDR_LOOPBACK_INIT;
Index: sys/netinet6/in6_var.h
===================================================================
--- sys/netinet6/in6_var.h	(revision 291071)
+++ sys/netinet6/in6_var.h	(working copy)
@@ -399,6 +399,16 @@ struct in6_rrenumreq {
 #define	IA6_SIN6(ia)	(&((ia)->ia_addr))
 #define	IA6_DSTSIN6(ia)	(&((ia)->ia_dstaddr))
 #define	IFA_IN6(x)	(&((struct sockaddr_in6 *)((x)->ifa_addr))->sin6_addr)
+#define	IFA_IN6_FLAGS(ifa)	((struct in6_ifaddr *)ifa)->ia6_flags
+#define	IFA_ND6_NA_BASE_FLAGS(ifp, ifa) \
+	(IFA_IN6_FLAGS(ifa) & IN6_IFF_ANYCAST ? 0 : ND_NA_FLAG_OVERRIDE) | \
+	((V_ip6_forwarding && !(ND_IFINFO(ifp)->flags & ND6_IFF_ACCEPT_RTADV && \
+	V_ip6_norbit_raif)) ? ND_NA_FLAG_ROUTER : 0)
+#define	IFA_ND6_NA_UNSOLICITED_SKIP(ifa) \
+	(IFA_IN6_FLAGS(ifa) & (IN6_IFF_DUPLICATED | IN6_IFF_DEPRECATED | \
+	IN6_IFF_TENTATIVE)) != 0
+#define	IN6_MAX_ANYCAST_DELAY_TIME_MS	1000000
+#define	IN6_BROADCAST_DELAY_TIME_MS	1000
 #define	IFA_DSTIN6(x)	(&((struct sockaddr_in6 *)((x)->ifa_dstaddr))->sin6_addr)
 
 #define	IFPR_IN6(x)	(&((struct sockaddr_in6 *)((x)->ifpr_prefix))->sin6_addr)
Index: sys/netinet6/nd6.c
===================================================================
--- sys/netinet6/nd6.c	(revision 291071)
+++ sys/netinet6/nd6.c	(working copy)
@@ -39,6 +39,7 @@ __FBSDID("$FreeBSD: releng/10.2/sys/neti
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -103,8 +104,12 @@ VNET_DEFINE(int, nd6_maxnudhint) = 0;	/*
 					 * layer hints */
 static VNET_DEFINE(int, nd6_maxqueuelen) = 1; /* max pkts cached in unresolved
 					 * ND entries */
+
+static VNET_DEFINE(int, nd6_on_link) = 1; /* Send unsolicited ND's on link up */
+
 #define	V_nd6_maxndopt		VNET(nd6_maxndopt)
 #define	V_nd6_maxqueuelen	VNET(nd6_maxqueuelen)
+#define	V_nd6_on_link		VNET(nd6_on_link)
 
 #ifdef ND6_DEBUG
 VNET_DEFINE(int, nd6_debug) = 1;
@@ -112,6 +117,8 @@ VNET_DEFINE(int, nd6_debug) = 1;
 VNET_DEFINE(int, nd6_debug) = 0;
 #endif
 
+static eventhandler_tag ifnet_link_event_eh;
+
 /* for debugging? */
 #if 0
 static int nd6_inuse, nd6_allocated;
@@ -143,6 +150,14 @@ static VNET_DEFINE(struct callout, nd6_s
 
 VNET_DEFINE(struct callout, nd6_timer_ch);
 
+static void
+nd6_ifnet_link_event(void *arg __unused, struct ifnet *ifp, int linkstate)
+{
+
+	if (linkstate == LINK_STATE_UP && V_nd6_on_link)
+		nd6_na_output_unsolicited(ifp);
+}
+
 void
 nd6_init(void)
 {
@@ -158,6 +173,11 @@ nd6_init(void)
 	    nd6_slowtimo, curvnet);
 
 	nd6_dad_init();
+
+	if (IS_DEFAULT_VNET(curvnet)) {
+		ifnet_link_event_eh = EVENTHANDLER_REGISTER(ifnet_link_event,
+		    nd6_ifnet_link_event, NULL, EVENTHANDLER_PRI_ANY);
+	}
 }
 
 #ifdef VIMAGE
@@ -167,6 +187,9 @@ nd6_destroy()
 
 	callout_drain(&V_nd6_slowtimo_ch);
 	callout_drain(&V_nd6_timer_ch);
+	if (IS_DEFAULT_VNET(curvnet)) {
+		EVENTHANDLER_DEREGISTER(ifnet_link_event, ifnet_link_event_eh);
+	}
 }
 #endif
 
@@ -2250,13 +2273,18 @@ static int nd6_sysctl_prlist(SYSCTL_HAND
 SYSCTL_DECL(_net_inet6_icmp6);
 #endif
 SYSCTL_NODE(_net_inet6_icmp6, ICMPV6CTL_ND6_DRLIST, nd6_drlist,
-	CTLFLAG_RD, nd6_sysctl_drlist, "");
+	CTLFLAG_RD, nd6_sysctl_drlist, "List default routers");
 SYSCTL_NODE(_net_inet6_icmp6, ICMPV6CTL_ND6_PRLIST, nd6_prlist,
-	CTLFLAG_RD, nd6_sysctl_prlist, "");
+	CTLFLAG_RD, nd6_sysctl_prlist, "List prefixes");
 SYSCTL_VNET_INT(_net_inet6_icmp6, ICMPV6CTL_ND6_MAXQLEN, nd6_maxqueuelen,
-	CTLFLAG_RW, &VNET_NAME(nd6_maxqueuelen), 1, "");
+	CTLFLAG_RW, &VNET_NAME(nd6_maxqueuelen), 1,
+	"Max packets cached in unresolved ND entries");
 SYSCTL_VNET_INT(_net_inet6_icmp6, OID_AUTO, nd6_gctimer,
-	CTLFLAG_RW, &VNET_NAME(nd6_gctimer), (60 * 60 * 24), "");
+	CTLFLAG_RW, &VNET_NAME(nd6_gctimer), (60 * 60 * 24),
+	"Interval in seconds between garbage collection passes");
+SYSCTL_INT(_net_inet6_icmp6, OID_AUTO, nd6_on_link, CTLFLAG_VNET | CTLFLAG_RW,
+	&VNET_NAME(nd6_on_link), 0,
+	"Send unsolicited neighbor discovery on interface link up events");
 
 static int
 nd6_sysctl_drlist(SYSCTL_HANDLER_ARGS)
Index: sys/netinet6/nd6.h
===================================================================
--- sys/netinet6/nd6.h	(revision 291071)
+++ sys/netinet6/nd6.h	(working copy)
@@ -398,6 +398,10 @@ void nd6_init(void);
 #ifdef VIMAGE
 void nd6_destroy(void);
 #endif
+void nd6_na_output_unsolicited(struct ifnet *);
+void nd6_na_output_unsolicited_addr(struct ifnet *, const struct in6_addr *,
+    u_long);
+int nd6_na_unsolicited_addr_delay(struct ifaddr *);
 struct nd_ifinfo *nd6_ifattach(struct ifnet *);
 void nd6_ifdetach(struct nd_ifinfo *);
 int nd6_is_addr_neighbor(const struct sockaddr_in6 *, struct ifnet *);
Index: sys/netinet6/nd6_nbr.c
===================================================================
--- sys/netinet6/nd6_nbr.c	(revision 291071)
+++ sys/netinet6/nd6_nbr.c	(working copy)
@@ -124,20 +124,16 @@ nd6_ns_input(struct mbuf *m, int off, int icmp6len
 	struct in6_addr saddr6 = ip6->ip6_src;
 	struct in6_addr daddr6 = ip6->ip6_dst;
 	struct in6_addr taddr6;
-	struct in6_addr myaddr6;
 	char *lladdr = NULL;
 	struct ifaddr *ifa = NULL;
+	u_long flags;
 	int lladdrlen = 0;
-	int anycast = 0, proxy = 0, tentative = 0;
+	int proxy = 0;
 	int tlladdr;
-	int rflag;
 	union nd_opts ndopts;
 	struct sockaddr_dl proxydl;
 	char ip6bufs[INET6_ADDRSTRLEN], ip6bufd[INET6_ADDRSTRLEN];
 
-	rflag = (V_ip6_forwarding) ? ND_NA_FLAG_ROUTER : 0;
-	if (ND_IFINFO(ifp)->flags & ND6_IFF_ACCEPT_RTADV && V_ip6_norbit_raif)
-		rflag = 0;
 #ifndef PULLDOWN_TEST
 	IP6_EXTHDR_CHECK(m, off, icmp6len,);
 	nd_ns = (struct nd_neighbor_solicit *)((caddr_t)ip6 + off);
@@ -229,10 +225,7 @@ nd6_ns_input(struct mbuf *m, int off, int icmp6len
 	 * In implementation, we add target link-layer address by default.
 	 * We do not add one in MUST NOT cases.
 	 */
-	if (!IN6_IS_ADDR_MULTICAST(&daddr6))
-		tlladdr = 0;
-	else
-		tlladdr = 1;
+	tlladdr = IN6_IS_ADDR_MULTICAST(&daddr6) ? 1 : 0;
 
 	/*
 	 * Target address (taddr6) must be either:
@@ -289,9 +282,6 @@ nd6_ns_input(struct mbuf *m, int off, int icmp6len
 		 */
 		goto freeit;
 	}
-	myaddr6 = *IFA_IN6(ifa);
-	anycast = ((struct in6_ifaddr *)ifa)->ia6_flags & IN6_IFF_ANYCAST;
-	tentative = ((struct in6_ifaddr *)ifa)->ia6_flags & IN6_IFF_TENTATIVE;
 	if (((struct in6_ifaddr *)ifa)->ia6_flags & IN6_IFF_DUPLICATED)
 		goto freeit;
 
@@ -303,7 +293,7 @@ nd6_ns_input(struct mbuf *m, int off, int icmp6len
 		goto bad;
 	}
 
-	if (IN6_ARE_ADDR_EQUAL(&myaddr6, &saddr6)) {
+	if (IN6_ARE_ADDR_EQUAL(IFA_IN6(ifa), &saddr6)) {
 		nd6log((LOG_INFO, "nd6_ns_input: duplicate IP6 address %s\n",
 		    ip6_sprintf(ip6bufs, &saddr6)));
 		goto freeit;
@@ -321,7 +311,7 @@ nd6_ns_input(struct mbuf *m, int off, int icmp6len
 	 *
 	 * The processing is defined in RFC 2462.
 	 */
-	if (tentative) {
+	if (IFA_IN6_FLAGS(ifa) & IN6_IFF_TENTATIVE) {
 		/*
 		 * If source address is unspecified address, it is for
 		 * duplicate address detection.
@@ -335,6 +325,10 @@ nd6_ns_input(struct mbuf *m, int off, int icmp6len
 		goto freeit;
 	}
 
+	flags = IFA_ND6_NA_BASE_FLAGS(ifp, ifa);
+	if (proxy || !tlladdr)
+		flags &= ~ND_NA_FLAG_OVERRIDE;
+
 	/*
 	 * If the source address is unspecified address, entries must not
 	 * be created or updated.
@@ -349,10 +343,8 @@ nd6_ns_input(struct mbuf *m, int off, int icmp6len
 		in6_all = in6addr_linklocal_allnodes;
 		if (in6_setscope(&in6_all, ifp, NULL) != 0)
 			goto bad;
-		nd6_na_output_fib(ifp, &in6_all, &taddr6,
-		    ((anycast || proxy || !tlladdr) ? 0 : ND_NA_FLAG_OVERRIDE) |
-		    rflag, tlladdr, proxy ? (struct sockaddr *)&proxydl : NULL,
-		    M_GETFIB(m));
+		nd6_na_output_fib(ifp, &in6_all, &taddr6, flags, tlladdr,
+		    proxy ? (struct sockaddr *)&proxydl : NULL, M_GETFIB(m));
 		goto freeit;
 	}
 
@@ -359,10 +351,8 @@ nd6_ns_input(struct mbuf *m, int off, int icmp6len
 	nd6_cache_lladdr(ifp, &saddr6, lladdr, lladdrlen,
 	    ND_NEIGHBOR_SOLICIT, 0);
 
-	nd6_na_output_fib(ifp, &saddr6, &taddr6,
-	    ((anycast || proxy || !tlladdr) ? 0 : ND_NA_FLAG_OVERRIDE) |
-	    rflag | ND_NA_FLAG_SOLICITED, tlladdr,
-	    proxy ? (struct sockaddr *)&proxydl : NULL, M_GETFIB(m));
+	nd6_na_output_fib(ifp, &saddr6, &taddr6, flags | ND_NA_FLAG_SOLICITED,
+	    tlladdr, proxy ? (struct sockaddr *)&proxydl : NULL, M_GETFIB(m));
  freeit:
 	if (ifa != NULL)
 		ifa_free(ifa);
@@ -1589,3 +1579,110 @@ nd6_dad_na_input(struct ifaddr *ifa)
 		nd6_dad_rele(dp);
 	}
 }
+
+/*
+ * Send unsolicited neighbor advertisements for all interface addresses to
+ * notify other nodes of changes.
+ *
+ * This is a noop if the interface isn't up.
+ */
+void __noinline
+nd6_na_output_unsolicited(struct ifnet *ifp)
+{
+	int i, cnt, entries;
+	struct ifaddr *ifa;
+	struct ann {
+		struct in6_addr addr;
+		u_long flags;
+		int delay;
+	} *ann1, *head;
+
+	if (!(ifp->if_flags & IFF_UP))
+		return;
+
+	entries = 8;
+	cnt = 0;
+	head = malloc(sizeof(struct ann) * entries, M_TEMP, M_WAITOK);
+
+	/* Take a copy then process to avoid locking issues. */
+	IF_ADDR_RLOCK(ifp);
+	TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) {
+		if (ifa->ifa_addr->sa_family != AF_INET6 ||
+		    IFA_ND6_NA_UNSOLICITED_SKIP(ifa))
+			continue;
+
+		if (cnt == entries) {
+			ann1 = (struct ann*)realloc(head, sizeof(struct ann) *
+			    (entries + 8), M_TEMP, M_NOWAIT);
+			if (ann1 == NULL) {
+				log(LOG_INFO, "nd6_announce: realloc to %d "
+				    "entries failed\n", entries + 8);
+				/* Process what we have. */
+				break;
+			}
+			entries += 8;
+			head = ann1;
+		}
+
+		ann1 = head + cnt;
+		bcopy(IFA_IN6(ifa), &ann1->addr, sizeof(ann1->addr));
+		ann1->flags = IFA_ND6_NA_BASE_FLAGS(ifp, ifa);
+		ann1->delay = nd6_na_unsolicited_addr_delay(ifa);
+		cnt++;
+	}
+	IF_ADDR_RUNLOCK(ifp);
+
+	for (i = 0; i < cnt;) {
+		ann1 = head + i;
+		nd6_na_output_unsolicited_addr(ifp, &ann1->addr, ann1->flags);
+		i++;
+		if (i == cnt)
+			break;
+		DELAY(ann1->delay);
+	}
+	free(head, M_TEMP);
+}
+
+/*
+ * Return the delay required for announcements of the address as per RFC 4861.
+ */
+int
+nd6_na_unsolicited_addr_delay(struct ifaddr *ifa)
+{
+
+	if (IFA_IN6_FLAGS(ifa) & IN6_IFF_ANYCAST) {
+		/*
+		 * Random value between 0 and MAX_ANYCAST_DELAY_TIME
+		 * as per section 7.2.7.
+		 */
+		return (random() % IN6_MAX_ANYCAST_DELAY_TIME_MS);
+	}
+
+	/* Small delay as per section 7.2.6. */
+	return (IN6_BROADCAST_DELAY_TIME_MS);
+}
+
+/*
+ * Send an unsolicited neighbor advertisement for an address to notify other
+ * nodes of changes.
+ */
+void __noinline
+nd6_na_output_unsolicited_addr(struct ifnet *ifp, const struct in6_addr *addr,
+    u_long flags)
+{
+	int error;
+	struct in6_addr mcast;
+
+	mcast = in6addr_linklocal_allnodes;
+	if ((error = in6_setscope(&mcast, ifp, NULL)) != 0) {
+		/*
+		 * This shouldn't be possible as the only error is for loopback
+		 * address which we're not using.
+		 */
+		log(LOG_INFO, "in6_setscope: on mcast failed: %d\n", error);
+		return;
+	}
+	nd6_na_output(ifp, &mcast, addr, flags, 1, NULL);
+}
+
+

--------------8DA201AA1DCCAD657B20F1BD--