Date: Mon, 20 Oct 2014 09:57:34 -0700
From: Adrian Chadd <adrian@freebsd.org>
To: Bryan Venteicher <bryanv@freebsd.org>
Cc: "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, "src-committers@freebsd.org" <src-committers@freebsd.org>
Subject: Re: svn commit: r273331 - in head: sbin/ifconfig share/man/man4 sys/conf sys/modules sys/modules/if_vxlan sys/net sys/sys
Message-ID: <CAJ-VmonFwHqdw1SKYFxRkG0ML1cvAc6LNcEQ=_pRosdqZyBRYQ@mail.gmail.com>
In-Reply-To: <201410201442.s9KEggqt096167@svn.freebsd.org>
References: <201410201442.s9KEggqt096167@svn.freebsd.org>
Hi,

Can you please create a PR that says something like "review vxlan code for RSS after de-capsulation" and assign it to me? I'm going to have to insert a hash recalculation after decapsulation but I'm too busy at the moment to do it.

Thanks,

-a

On 20 October 2014 07:42, Bryan Venteicher <bryanv@freebsd.org> wrote:
> Author: bryanv
> Date: Mon Oct 20 14:42:42 2014
> New Revision: 273331
> URL: https://svnweb.freebsd.org/changeset/base/273331
>
> Log:
> Add vxlan interface
>
> vxlan creates a virtual LAN by encapsulating the inner Ethernet frame in
> a UDP packet. This implementation is based on RFC7348.
>
> Currently, the IPv6 support is not fully compliant with the specification:
> we should be able to receive UPDv6 packets with a zero checksum, but we
> need to support RFC6935 first. Patches for this should come soon.
>
> Encapsulation protocols such as vxlan emphasize the need for the FreeBSD
> network stack to support batching, GRO, and GSO. Each frame has to make
> two trips through the network stack, and each frame will be at most MTU
> sized. Performance suffers accordingly.
>
> Some latest generation NICs have begun to support vxlan HW offloads that
> we should also take advantage of. VIMAGE support should also be added soon.
> > Differential Revision: https://reviews.freebsd.org/D384 > Reviewed by: gnn > Relnotes: yes > > Added: > head/sbin/ifconfig/ifvxlan.c (contents, props changed) > head/share/man/man4/vxlan.4 (contents, props changed) > head/sys/modules/if_vxlan/ > head/sys/modules/if_vxlan/Makefile (contents, props changed) > head/sys/net/if_vxlan.c (contents, props changed) > head/sys/net/if_vxlan.h (contents, props changed) > Modified: > head/sbin/ifconfig/Makefile > head/sbin/ifconfig/ifconfig.8 > head/share/man/man4/Makefile > head/sys/conf/NOTES > head/sys/conf/files > head/sys/modules/Makefile > head/sys/sys/priv.h > > Modified: head/sbin/ifconfig/Makefile > ============================================================================== > --- head/sbin/ifconfig/Makefile Mon Oct 20 14:25:23 2014 (r273330) > +++ head/sbin/ifconfig/Makefile Mon Oct 20 14:42:42 2014 (r273331) > @@ -30,6 +30,7 @@ SRCS+= ifmac.c # MAC support > SRCS+= ifmedia.c # SIOC[GS]IFMEDIA support > SRCS+= iffib.c # non-default FIB support > SRCS+= ifvlan.c # SIOC[GS]ETVLAN support > +SRCS+= ifvxlan.c # VXLAN support > SRCS+= ifgre.c # GRE keys etc > SRCS+= ifgif.c # GIF reversed header workaround > > > Modified: head/sbin/ifconfig/ifconfig.8 > ============================================================================== > --- head/sbin/ifconfig/ifconfig.8 Mon Oct 20 14:25:23 2014 (r273330) > +++ head/sbin/ifconfig/ifconfig.8 Mon Oct 20 14:42:42 2014 (r273331) > @@ -28,7 +28,7 @@ > .\" From: @(#)ifconfig.8 8.3 (Berkeley) 1/5/94 > .\" $FreeBSD$ > .\" > -.Dd October 1, 2014 > +.Dd October 20, 2014 > .Dt IFCONFIG 8 > .Os > .Sh NAME > @@ -2541,6 +2541,76 @@ argument is useless and hence deprecated > .El > .Pp > The following parameters are used to configure > +.Xr vxlan 4 > +interfaces. > +.Bl -tag -width indent > +.It Cm vni Ar identifier > +This value is a 24-bit VXLAN Network Identifier (VNI) that identifies the > +virtual network segment membership of the interface. 
> +.It Cm local Ar address > +The source address used in the encapsulating IPv4/IPv6 header. > +The address should already be assigned to an existing interface. > +When the interface is configured in unicast mode, the listening socket > +is bound to this address. > +.It Cm remote Ar address > +The interface can be configured in a unicast, or point-to-point, mode > +to create a tunnel between two hosts. > +This is the IP address of the remote end of the tunnel. > +.It Cm group Ar address > +The interface can be configured in a multicast mode > +to create a virtual network of hosts. > +This is the IP multicast group address the interface will join. > +.It Cm localport Ar port > +The port number the interface will listen on. > +The default port number is 4789. > +.It Cm remoteport Ar port > +The destination port number used in the encapsulating IPv4/IPv6 header. > +The remote host should be listening on this port. > +The default port number is 4789. > +Note some other implementations, such as Linux, > +do not default to the IANA assigned port, > +but instead listen on port 8472. > +.It Cm portrange Ar low high > +The range of source ports used in the encapsulating IPv4/IPv6 header. > +The port selected within the range is based on a hash of the inner frame. > +A range is useful to provide entropy within the outer IP header > +for more effective load balancing. > +The default range is between the > +.Xr sysctl 8 > +variables > +.Va net.inet.ip.portrange.first > +and > +.Va net.inet.ip.portrange.last > +.It Cm timeout Ar timeout > +The maximum time, in seconds, before an entry in the forwarding table > +is pruned. > +The default is 1200 seconds (20 minutes). > +.It Cm maxaddr Ar max > +The maximum number of entries in the forwarding table. > +The default is 2000. > +.It Cm vxlandev Ar dev > +When the interface is configured in multicast mode, the > +.Cm dev > +interface is used to transmit IP multicast packets. 
> +.It Cm ttl Ar ttl > +The TTL used in the encapsulating IPv4/IPv6 header. > +The default is 64. > +.It Cm learn > +The source IP address and inner source Ethernet MAC address of > +received packets are used to dynamically populate the forwarding table. > +When in multicast mode, an entry in the forwarding table allows the > +interface to send the frame directly to the remote host instead of > +broadcasting the frame to the multicast group. > +This is the default. > +.It Fl learn > +The forwarding table is not populated by recevied packets. > +.It Cm flush > +Delete all dynamically-learned addresses from the forwarding table. > +.It Cm flushall > +Delete all addresses, including static addresses, from the forwarding table. > +.El > +.Pp > +The following parameters are used to configure > .Xr carp 4 > protocol on an interface: > .Bl -tag -width indent > @@ -2745,6 +2815,7 @@ tried to alter an interface's configurat > .Xr pfsync 4 , > .Xr polling 4 , > .Xr vlan 4 , > +.Xr vxlan 4 , > .Xr devd.conf 5 , > .\" .Xr eon 5 , > .Xr devd 8 , > > Added: head/sbin/ifconfig/ifvxlan.c > ============================================================================== > --- /dev/null 00:00:00 1970 (empty, because file is newly added) > +++ head/sbin/ifconfig/ifvxlan.c Mon Oct 20 14:42:42 2014 (r273331) > @@ -0,0 +1,648 @@ > +/*- > + * Copyright (c) 2014, Bryan Venteicher <bryanv@FreeBSD.org> > + * All rights reserved. > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions > + * are met: > + * 1. Redistributions of source code must retain the above copyright > + * notice unmodified, this list of conditions, and the following > + * disclaimer. > + * 2. Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in the > + * documentation and/or other materials provided with the distribution. 
> + * > + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR > + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES > + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. > + * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, > + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT > + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF > + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. > + */ > + > +#include <sys/cdefs.h> > +__FBSDID("$FreeBSD$"); > + > +#include <sys/param.h> > +#include <sys/ioctl.h> > +#include <sys/socket.h> > +#include <sys/sockio.h> > + > +#include <stdlib.h> > +#include <stdint.h> > +#include <unistd.h> > +#include <netdb.h> > + > +#include <net/ethernet.h> > +#include <net/if.h> > +#include <net/if_var.h> > +#include <net/if_vxlan.h> > +#include <net/route.h> > +#include <netinet/in.h> > + > +#include <ctype.h> > +#include <stdio.h> > +#include <string.h> > +#include <stdlib.h> > +#include <unistd.h> > +#include <err.h> > +#include <errno.h> > + > +#include "ifconfig.h" > + > +static struct ifvxlanparam params = { > + .vxlp_vni = VXLAN_VNI_MAX, > +}; > + > +static int > +get_val(const char *cp, u_long *valp) > +{ > + char *endptr; > + u_long val; > + > + errno = 0; > + val = strtoul(cp, &endptr, 0); > + if (cp[0] == '\0' || endptr[0] != '\0' || errno == ERANGE) > + return (-1); > + > + *valp = val; > + return (0); > +} > + > +static int > +do_cmd(int sock, u_long op, void *arg, size_t argsize, int set) > +{ > + struct ifdrv ifd; > + > + bzero(&ifd, sizeof(ifd)); > + > + strlcpy(ifd.ifd_name, ifr.ifr_name, sizeof(ifd.ifd_name)); > + ifd.ifd_cmd = op; > + ifd.ifd_len = 
argsize; > + ifd.ifd_data = arg; > + > + return (ioctl(sock, set ? SIOCSDRVSPEC : SIOCGDRVSPEC, &ifd)); > +} > + > +static int > +vxlan_exists(int sock) > +{ > + struct ifvxlancfg cfg; > + > + bzero(&cfg, sizeof(cfg)); > + > + return (do_cmd(sock, VXLAN_CMD_GET_CONFIG, &cfg, sizeof(cfg), 0) != -1); > +} > + > +static void > +vxlan_status(int s) > +{ > + struct ifvxlancfg cfg; > + char src[NI_MAXHOST], dst[NI_MAXHOST]; > + char srcport[NI_MAXSERV], dstport[NI_MAXSERV]; > + struct sockaddr *lsa, *rsa; > + int vni, mc, ipv6; > + > + bzero(&cfg, sizeof(cfg)); > + > + if (do_cmd(s, VXLAN_CMD_GET_CONFIG, &cfg, sizeof(cfg), 0) < 0) > + return; > + > + vni = cfg.vxlc_vni; > + lsa = &cfg.vxlc_local_sa.sa; > + rsa = &cfg.vxlc_remote_sa.sa; > + ipv6 = rsa->sa_family == AF_INET6; > + > + /* Just report nothing if the network identity isn't set yet. */ > + if (vni >= VXLAN_VNI_MAX) > + return; > + > + if (getnameinfo(lsa, lsa->sa_len, src, sizeof(src), > + srcport, sizeof(srcport), NI_NUMERICHOST | NI_NUMERICSERV) != 0) > + src[0] = srcport[0] = '\0'; > + if (getnameinfo(rsa, rsa->sa_len, dst, sizeof(dst), > + dstport, sizeof(dstport), NI_NUMERICHOST | NI_NUMERICSERV) != 0) > + dst[0] = dstport[0] = '\0'; > + > + if (!ipv6) { > + struct sockaddr_in *sin = (struct sockaddr_in *)rsa; > + mc = IN_MULTICAST(ntohl(sin->sin_addr.s_addr)); > + } else { > + struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)rsa; > + mc = IN6_IS_ADDR_MULTICAST(&sin6->sin6_addr); > + } > + > + printf("\tvxlan vni %d", vni); > + printf(" local %s%s%s:%s", ipv6 ? "[" : "", src, ipv6 ? "]" : "", > + srcport); > + printf(" %s %s%s%s:%s", mc ? "group" : "remote", ipv6 ? "[" : "", > + dst, ipv6 ? "]" : "", dstport); > + > + if (verbose) { > + printf("\n\t\tconfig: "); > + printf("%slearning portrange %d-%d ttl %d", > + cfg.vxlc_learn ? 
"" : "no", cfg.vxlc_port_min, > + cfg.vxlc_port_max, cfg.vxlc_ttl); > + printf("\n\t\tftable: "); > + printf("cnt %d max %d timeout %d", > + cfg.vxlc_ftable_cnt, cfg.vxlc_ftable_max, > + cfg.vxlc_ftable_timeout); > + } > + > + putchar('\n'); > +} > + > +#define _LOCAL_ADDR46 \ > + (VXLAN_PARAM_WITH_LOCAL_ADDR4 | VXLAN_PARAM_WITH_LOCAL_ADDR6) > +#define _REMOTE_ADDR46 \ > + (VXLAN_PARAM_WITH_REMOTE_ADDR4 | VXLAN_PARAM_WITH_REMOTE_ADDR6) > + > +static void > +vxlan_check_params(void) > +{ > + > + if ((params.vxlp_with & _LOCAL_ADDR46) == _LOCAL_ADDR46) > + errx(1, "cannot specify both local IPv4 and IPv6 addresses"); > + if ((params.vxlp_with & _REMOTE_ADDR46) == _REMOTE_ADDR46) > + errx(1, "cannot specify both remote IPv4 and IPv6 addresses"); > + if ((params.vxlp_with & VXLAN_PARAM_WITH_LOCAL_ADDR4 && > + params.vxlp_with & VXLAN_PARAM_WITH_REMOTE_ADDR6) || > + (params.vxlp_with & VXLAN_PARAM_WITH_LOCAL_ADDR6 && > + params.vxlp_with & VXLAN_PARAM_WITH_REMOTE_ADDR4)) > + errx(1, "cannot mix IPv4 and IPv6 addresses"); > +} > + > +#undef _LOCAL_ADDR46 > +#undef _REMOTE_ADDR46 > + > +static void > +vxlan_cb(int s, void *arg) > +{ > + > +} > + > +static void > +vxlan_create(int s, struct ifreq *ifr) > +{ > + > + vxlan_check_params(); > + > + ifr->ifr_data = (caddr_t) &params; > + if (ioctl(s, SIOCIFCREATE2, ifr) < 0) > + err(1, "SIOCIFCREATE2"); > +} > + > +static > +DECL_CMD_FUNC(setvxlan_vni, arg, d) > +{ > + struct ifvxlancmd cmd; > + u_long val; > + > + if (get_val(arg, &val) < 0 || val >= VXLAN_VNI_MAX) > + errx(1, "invalid network identifier: %s", arg); > + > + if (!vxlan_exists(s)) { > + params.vxlp_with |= VXLAN_PARAM_WITH_VNI; > + params.vxlp_vni = val; > + return; > + } > + > + bzero(&cmd, sizeof(cmd)); > + cmd.vxlcmd_vni = val; > + > + if (do_cmd(s, VXLAN_CMD_SET_VNI, &cmd, sizeof(cmd), 1) < 0) > + err(1, "VXLAN_CMD_SET_VNI"); > +} > + > +static > +DECL_CMD_FUNC(setvxlan_local, addr, d) > +{ > + struct ifvxlancmd cmd; > + struct addrinfo *ai; > + struct sockaddr 
*sa; > + int error; > + > + bzero(&cmd, sizeof(cmd)); > + > + if ((error = getaddrinfo(addr, NULL, NULL, &ai)) != 0) > + errx(1, "error in parsing local address string: %s", > + gai_strerror(error)); > + > + sa = ai->ai_addr; > + > + switch (ai->ai_family) { > +#ifdef INET > + case AF_INET: { > + struct in_addr addr = ((struct sockaddr_in *) sa)->sin_addr; > + > + if (IN_MULTICAST(ntohl(addr.s_addr))) > + errx(1, "local address cannot be multicast"); > + > + cmd.vxlcmd_sa.in4.sin_family = AF_INET; > + cmd.vxlcmd_sa.in4.sin_addr = addr; > + break; > + } > +#endif > +#ifdef INET6 > + case AF_INET6: { > + struct in6_addr *addr = &((struct sockaddr_in6 *)sa)->sin6_addr; > + > + if (IN6_IS_ADDR_MULTICAST(addr)) > + errx(1, "local address cannot be multicast"); > + > + cmd.vxlcmd_sa.in6.sin6_family = AF_INET6; > + cmd.vxlcmd_sa.in6.sin6_addr = *addr; > + break; > + } > +#endif > + default: > + errx(1, "local address %s not supported", addr); > + } > + > + freeaddrinfo(ai); > + > + if (!vxlan_exists(s)) { > + if (cmd.vxlcmd_sa.sa.sa_family == AF_INET) { > + params.vxlp_with |= VXLAN_PARAM_WITH_LOCAL_ADDR4; > + params.vxlp_local_in4 = cmd.vxlcmd_sa.in4.sin_addr; > + } else { > + params.vxlp_with |= VXLAN_PARAM_WITH_LOCAL_ADDR6; > + params.vxlp_local_in6 = cmd.vxlcmd_sa.in6.sin6_addr; > + } > + return; > + } > + > + if (do_cmd(s, VXLAN_CMD_SET_LOCAL_ADDR, &cmd, sizeof(cmd), 1) < 0) > + err(1, "VXLAN_CMD_SET_LOCAL_ADDR"); > +} > + > +static > +DECL_CMD_FUNC(setvxlan_remote, addr, d) > +{ > + struct ifvxlancmd cmd; > + struct addrinfo *ai; > + struct sockaddr *sa; > + int error; > + > + bzero(&cmd, sizeof(cmd)); > + > + if ((error = getaddrinfo(addr, NULL, NULL, &ai)) != 0) > + errx(1, "error in parsing remote address string: %s", > + gai_strerror(error)); > + > + sa = ai->ai_addr; > + > + switch (ai->ai_family) { > +#ifdef INET > + case AF_INET: { > + struct in_addr addr = ((struct sockaddr_in *)sa)->sin_addr; > + > + if (IN_MULTICAST(ntohl(addr.s_addr))) > + errx(1, "remote 
address cannot be multicast"); > + > + cmd.vxlcmd_sa.in4.sin_family = AF_INET; > + cmd.vxlcmd_sa.in4.sin_addr = addr; > + break; > + } > +#endif > +#ifdef INET6 > + case AF_INET6: { > + struct in6_addr *addr = &((struct sockaddr_in6 *)sa)->sin6_addr; > + > + if (IN6_IS_ADDR_MULTICAST(addr)) > + errx(1, "remote address cannot be multicast"); > + > + cmd.vxlcmd_sa.in6.sin6_family = AF_INET6; > + cmd.vxlcmd_sa.in6.sin6_addr = *addr; > + break; > + } > +#endif > + default: > + errx(1, "remote address %s not supported", addr); > + } > + > + freeaddrinfo(ai); > + > + if (!vxlan_exists(s)) { > + if (cmd.vxlcmd_sa.sa.sa_family == AF_INET) { > + params.vxlp_with |= VXLAN_PARAM_WITH_REMOTE_ADDR4; > + params.vxlp_remote_in4 = cmd.vxlcmd_sa.in4.sin_addr; > + } else { > + params.vxlp_with |= VXLAN_PARAM_WITH_REMOTE_ADDR6; > + params.vxlp_remote_in6 = cmd.vxlcmd_sa.in6.sin6_addr; > + } > + return; > + } > + > + if (do_cmd(s, VXLAN_CMD_SET_REMOTE_ADDR, &cmd, sizeof(cmd), 1) < 0) > + err(1, "VXLAN_CMD_SET_REMOTE_ADDR"); > +} > + > +static > +DECL_CMD_FUNC(setvxlan_group, addr, d) > +{ > + struct ifvxlancmd cmd; > + struct addrinfo *ai; > + struct sockaddr *sa; > + int error; > + > + bzero(&cmd, sizeof(cmd)); > + > + if ((error = getaddrinfo(addr, NULL, NULL, &ai)) != 0) > + errx(1, "error in parsing group address string: %s", > + gai_strerror(error)); > + > + sa = ai->ai_addr; > + > + switch (ai->ai_family) { > +#ifdef INET > + case AF_INET: { > + struct in_addr addr = ((struct sockaddr_in *)sa)->sin_addr; > + > + if (!IN_MULTICAST(ntohl(addr.s_addr))) > + errx(1, "group address must be multicast"); > + > + cmd.vxlcmd_sa.in4.sin_family = AF_INET; > + cmd.vxlcmd_sa.in4.sin_addr = addr; > + break; > + } > +#endif > +#ifdef INET6 > + case AF_INET6: { > + struct in6_addr *addr = &((struct sockaddr_in6 *)sa)->sin6_addr; > + > + if (!IN6_IS_ADDR_MULTICAST(addr)) > + errx(1, "group address must be multicast"); > + > + cmd.vxlcmd_sa.in6.sin6_family = AF_INET6; > + 
cmd.vxlcmd_sa.in6.sin6_addr = *addr; > + break; > + } > +#endif > + default: > + errx(1, "group address %s not supported", addr); > + } > + > + freeaddrinfo(ai); > + > + if (!vxlan_exists(s)) { > + if (cmd.vxlcmd_sa.sa.sa_family == AF_INET) { > + params.vxlp_with |= VXLAN_PARAM_WITH_REMOTE_ADDR4; > + params.vxlp_remote_in4 = cmd.vxlcmd_sa.in4.sin_addr; > + } else { > + params.vxlp_with |= VXLAN_PARAM_WITH_REMOTE_ADDR6; > + params.vxlp_remote_in6 = cmd.vxlcmd_sa.in6.sin6_addr; > + } > + return; > + } > + > + if (do_cmd(s, VXLAN_CMD_SET_REMOTE_ADDR, &cmd, sizeof(cmd), 1) < 0) > + err(1, "VXLAN_CMD_SET_REMOTE_ADDR"); > +} > + > +static > +DECL_CMD_FUNC(setvxlan_local_port, arg, d) > +{ > + struct ifvxlancmd cmd; > + u_long val; > + > + if (get_val(arg, &val) < 0 || val >= UINT16_MAX) > + errx(1, "invalid local port: %s", arg); > + > + if (!vxlan_exists(s)) { > + params.vxlp_with |= VXLAN_PARAM_WITH_LOCAL_PORT; > + params.vxlp_local_port = val; > + return; > + } > + > + bzero(&cmd, sizeof(cmd)); > + cmd.vxlcmd_port = val; > + > + if (do_cmd(s, VXLAN_CMD_SET_LOCAL_PORT, &cmd, sizeof(cmd), 1) < 0) > + err(1, "VXLAN_CMD_SET_LOCAL_PORT"); > +} > + > +static > +DECL_CMD_FUNC(setvxlan_remote_port, arg, d) > +{ > + struct ifvxlancmd cmd; > + u_long val; > + > + if (get_val(arg, &val) < 0 || val >= UINT16_MAX) > + errx(1, "invalid remote port: %s", arg); > + > + if (!vxlan_exists(s)) { > + params.vxlp_with |= VXLAN_PARAM_WITH_REMOTE_PORT; > + params.vxlp_remote_port = val; > + return; > + } > + > + bzero(&cmd, sizeof(cmd)); > + cmd.vxlcmd_port = val; > + > + if (do_cmd(s, VXLAN_CMD_SET_REMOTE_PORT, &cmd, sizeof(cmd), 1) < 0) > + err(1, "VXLAN_CMD_SET_REMOTE_PORT"); > +} > + > +static > +DECL_CMD_FUNC2(setvxlan_port_range, arg1, arg2) > +{ > + struct ifvxlancmd cmd; > + u_long min, max; > + > + if (get_val(arg1, &min) < 0 || min >= UINT16_MAX) > + errx(1, "invalid port range minimum: %s", arg1); > + if (get_val(arg2, &max) < 0 || max >= UINT16_MAX) > + errx(1, "invalid port 
range maximum: %s", arg2); > + if (max < min) > + errx(1, "invalid port range"); > + > + if (!vxlan_exists(s)) { > + params.vxlp_with |= VXLAN_PARAM_WITH_PORT_RANGE; > + params.vxlp_min_port = min; > + params.vxlp_max_port = max; > + return; > + } > + > + bzero(&cmd, sizeof(cmd)); > + cmd.vxlcmd_port_min = min; > + cmd.vxlcmd_port_max = max; > + > + if (do_cmd(s, VXLAN_CMD_SET_PORT_RANGE, &cmd, sizeof(cmd), 1) < 0) > + err(1, "VXLAN_CMD_SET_PORT_RANGE"); > +} > + > +static > +DECL_CMD_FUNC(setvxlan_timeout, arg, d) > +{ > + struct ifvxlancmd cmd; > + u_long val; > + > + if (get_val(arg, &val) < 0 || (val & ~0xFFFFFFFF) != 0) > + errx(1, "invalid timeout value: %s", arg); > + > + if (!vxlan_exists(s)) { > + params.vxlp_with |= VXLAN_PARAM_WITH_FTABLE_TIMEOUT; > + params.vxlp_ftable_timeout = val & 0xFFFFFFFF; > + return; > + } > + > + bzero(&cmd, sizeof(cmd)); > + cmd.vxlcmd_ftable_timeout = val & 0xFFFFFFFF; > + > + if (do_cmd(s, VXLAN_CMD_SET_FTABLE_TIMEOUT, &cmd, sizeof(cmd), 1) < 0) > + err(1, "VXLAN_CMD_SET_FTABLE_TIMEOUT"); > +} > + > +static > +DECL_CMD_FUNC(setvxlan_maxaddr, arg, d) > +{ > + struct ifvxlancmd cmd; > + u_long val; > + > + if (get_val(arg, &val) < 0 || (val & ~0xFFFFFFFF) != 0) > + errx(1, "invalid maxaddr value: %s", arg); > + > + if (!vxlan_exists(s)) { > + params.vxlp_with |= VXLAN_PARAM_WITH_FTABLE_MAX; > + params.vxlp_ftable_max = val & 0xFFFFFFFF; > + return; > + } > + > + bzero(&cmd, sizeof(cmd)); > + cmd.vxlcmd_ftable_max = val & 0xFFFFFFFF; > + > + if (do_cmd(s, VXLAN_CMD_SET_FTABLE_MAX, &cmd, sizeof(cmd), 1) < 0) > + err(1, "VXLAN_CMD_SET_FTABLE_MAX"); > +} > + > +static > +DECL_CMD_FUNC(setvxlan_dev, arg, d) > +{ > + struct ifvxlancmd cmd; > + > + if (!vxlan_exists(s)) { > + params.vxlp_with |= VXLAN_PARAM_WITH_MULTICAST_IF; > + strlcpy(params.vxlp_mc_ifname, arg, > + sizeof(params.vxlp_mc_ifname)); > + return; > + } > + > + bzero(&cmd, sizeof(cmd)); > + strlcpy(cmd.vxlcmd_ifname, arg, sizeof(cmd.vxlcmd_ifname)); > + > + if 
(do_cmd(s, VXLAN_CMD_SET_MULTICAST_IF, &cmd, sizeof(cmd), 1) < 0) > + err(1, "VXLAN_CMD_SET_MULTICAST_IF"); > +} > + > +static > +DECL_CMD_FUNC(setvxlan_ttl, arg, d) > +{ > + struct ifvxlancmd cmd; > + u_long val; > + > + if (get_val(arg, &val) < 0 || val > 256) > + errx(1, "invalid TTL value: %s", arg); > + > + if (!vxlan_exists(s)) { > + params.vxlp_with |= VXLAN_PARAM_WITH_TTL; > + params.vxlp_ttl = val; > + return; > + } > + > + bzero(&cmd, sizeof(cmd)); > + cmd.vxlcmd_ttl = val; > + > + if (do_cmd(s, VXLAN_CMD_SET_TTL, &cmd, sizeof(cmd), 1) < 0) > + err(1, "VXLAN_CMD_SET_TTL"); > +} > + > +static > +DECL_CMD_FUNC(setvxlan_learn, arg, d) > +{ > + struct ifvxlancmd cmd; > + > + if (!vxlan_exists(s)) { > + params.vxlp_with |= VXLAN_PARAM_WITH_LEARN; > + params.vxlp_learn = d; > + return; > + } > + > + bzero(&cmd, sizeof(cmd)); > + if (d != 0) > + cmd.vxlcmd_flags |= VXLAN_CMD_FLAG_LEARN; > + > + if (do_cmd(s, VXLAN_CMD_SET_LEARN, &cmd, sizeof(cmd), 1) < 0) > + err(1, "VXLAN_CMD_SET_LEARN"); > +} > + > +static void > +setvxlan_flush(const char *val, int d, int s, const struct afswtch *afp) > +{ > + struct ifvxlancmd cmd; > + > + bzero(&cmd, sizeof(cmd)); > + if (d != 0) > + cmd.vxlcmd_flags |= VXLAN_CMD_FLAG_FLUSH_ALL; > + > + if (do_cmd(s, VXLAN_CMD_FLUSH, &cmd, sizeof(cmd), 1) < 0) > + err(1, "VXLAN_CMD_FLUSH"); > +} > + > +static struct cmd vxlan_cmds[] = { > + > + DEF_CLONE_CMD_ARG("vni", setvxlan_vni), > + DEF_CLONE_CMD_ARG("local", setvxlan_local), > + DEF_CLONE_CMD_ARG("remote", setvxlan_remote), > + DEF_CLONE_CMD_ARG("group", setvxlan_group), > + DEF_CLONE_CMD_ARG("localport", setvxlan_local_port), > + DEF_CLONE_CMD_ARG("remoteport", setvxlan_remote_port), > + DEF_CLONE_CMD_ARG2("portrange", setvxlan_port_range), > + DEF_CLONE_CMD_ARG("timeout", setvxlan_timeout), > + DEF_CLONE_CMD_ARG("maxaddr", setvxlan_maxaddr), > + DEF_CLONE_CMD_ARG("vxlandev", setvxlan_dev), > + DEF_CLONE_CMD_ARG("ttl", setvxlan_ttl), > + DEF_CLONE_CMD("learn", 1, setvxlan_learn), > + 
DEF_CLONE_CMD("-learn", 0, setvxlan_learn), > + > + DEF_CMD_ARG("vni", setvxlan_vni), > + DEF_CMD_ARG("local", setvxlan_local), > + DEF_CMD_ARG("remote", setvxlan_remote), > + DEF_CMD_ARG("group", setvxlan_group), > + DEF_CMD_ARG("localport", setvxlan_local_port), > + DEF_CMD_ARG("remoteport", setvxlan_remote_port), > + DEF_CMD_ARG2("portrange", setvxlan_port_range), > + DEF_CMD_ARG("timeout", setvxlan_timeout), > + DEF_CMD_ARG("maxaddr", setvxlan_maxaddr), > + DEF_CMD_ARG("vxlandev", setvxlan_dev), > + DEF_CMD_ARG("ttl", setvxlan_ttl), > + DEF_CMD("learn", 1, setvxlan_learn), > + DEF_CMD("-learn", 0, setvxlan_learn), > + > + DEF_CMD("flush", 0, setvxlan_flush), > + DEF_CMD("flushall", 1, setvxlan_flush), > +}; > + > +static struct afswtch af_vxlan = { > + .af_name = "af_vxlan", > + .af_af = AF_UNSPEC, > + .af_other_status = vxlan_status, > +}; > + > +static __constructor void > +vxlan_ctor(void) > +{ > +#define N(a) (sizeof(a) / sizeof(a[0])) > + size_t i; > + > + for (i = 0; i < N(vxlan_cmds); i++) > + cmd_register(&vxlan_cmds[i]); > + af_register(&af_vxlan); > + callback_register(vxlan_cb, NULL); > + clone_setdefcallback("vxlan", vxlan_create); > +#undef N > +} > > Modified: head/share/man/man4/Makefile > ============================================================================== > --- head/share/man/man4/Makefile Mon Oct 20 14:25:23 2014 (r273330) > +++ head/share/man/man4/Makefile Mon Oct 20 14:42:42 2014 (r273331) > @@ -567,6 +567,7 @@ MAN= aac.4 \ > ${_virtio_scsi.4} \ > vkbd.4 \ > vlan.4 \ > + vxlan.4 \ > ${_vmx.4} \ > vpo.4 \ > vr.4 \ > @@ -743,6 +744,7 @@ MLINKS+=urndis.4 if_urndis.4 > MLINKS+=${_urtw.4} ${_if_urtw.4} > MLINKS+=vge.4 if_vge.4 > MLINKS+=vlan.4 if_vlan.4 > +MLINKS+=vxlan.4 if_vxlan.4 > MLINKS+=${_vmx.4} ${_if_vmx.4} > MLINKS+=vpo.4 imm.4 > MLINKS+=vr.4 if_vr.4 > > Added: head/share/man/man4/vxlan.4 > ============================================================================== > --- /dev/null 00:00:00 1970 (empty, because file is newly 
added) > +++ head/share/man/man4/vxlan.4 Mon Oct 20 14:42:42 2014 (r273331) > @@ -0,0 +1,235 @@ > +.\" Copyright (c) 2014 Bryan Venteicher > +.\" All rights reserved. > +.\" > +.\" Redistribution and use in source and binary forms, with or without > +.\" modification, are permitted provided that the following conditions > +.\" are met: > +.\" 1. Redistributions of source code must retain the above copyright > +.\" notice, this list of conditions and the following disclaimer. > +.\" 2. Redistributions in binary form must reproduce the above copyright > +.\" notice, this list of conditions and the following disclaimer in the > +.\" documentation and/or other materials provided with the distribution. > +.\" > +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND > +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE > +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE > +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE > +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL > +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS > +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) > +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT > +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY > +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF > +.\" SUCH DAMAGE. 
> +.\" > +.\" $FreeBSD$ > +.\" > +.Dd October 20, 2014 > +.Dt VXLAN 4 > +.Os > +.Sh NAME > +.Nm vxlan > +.Nd "Virtual eXtensible LAN interface" > +.Sh SYNOPSIS > +To compile this driver into the kernel, > +place the following line in your > +kernel configuration file: > +.Bd -ragged -offset indent > +.Cd "device vxlan" > +.Ed > +.Pp > +Alternatively, to load the driver as a > +module at boot time, place the following line in > +.Xr loader.conf 5 : > +.Bd -literal -offset indent > +if_vxlan_load="YES" > +.Ed > +.Sh DESCRIPTION > +The > +.Nm > +driver creates a virtual tunnel endpoint in a > +.Nm > +segment. > +A > +.Nm > +segment is a virtual Layer 2 (Ethernet) network that is overlaid > +in a Layer 3 (IP/UDP) network. > +.Nm > +is analogous to > +.Xr vlan 4 > +but is designed to be better suited for large, multiple tenant > +data center environments. > +.Pp > +Each > +.Nm > +interface is created at runtime using interface cloning. > +This is most easily done with the > +.Xr ifconfig 8 > +.Cm create > +command or using the > +.Va cloned_interfaces > +variable in > +.Xr rc.conf 5 . > +The interface may be removed with the > +.Xr ifconfig 8 > +.Cm destroy > +command. > +.Pp > +The > +.Nm > +driver creates a pseudo Ethernet network interface > +that supports the usual network > +.Xr ioctl 2 Ns s > +and is thus can be used with > +.Xr ifconfig 8 > +like any other Ethernet interface. > +The > +.Nm > +interface encapsulates the Ethernet frame > +by prepending IP/UDP and > +.Nm > +headers. > +Thus, the encapsulated (inner) frame is able to transmitted > +over a routed, Layer 3 network to the remote host. > +.Pp > +The > +.Nm > +interface may be configured in either unicast or multicast mode. > +When in unicast mode, > +the interface creates a tunnel to a single remote host, > +and all traffic is transmitted to that host. 
> +When in multicast mode, > +the interface joins an IP multicast group, > +and receives packets sent to the group address, > +and transmits packets to either the multicast group address, > +or directly the remote host if there is an appropriate > +forwarding table entry. > +.Pp > +When the > +.Nm > +interface is brought up, a > +.Xr UDP 4 > +.Xr socket 9 > +is created based on the configuration, > +such as the local address for unicast mode or > +the group address for multicast mode, > +and the listening (local) port number. > +Since multiple > +.Nm > +interfaces may be created that either > +use the same local address > +or join the same group address, > +and use the same port, > +the driver may share a socket among multiple interfaces. > +However, each interface within a socket must belong to > +a unique > +.Nm > +segment. > +The analogous > +.Xr vlan 4 > +configuration would be a physical interface configured as > +the parent device for multiple VLAN interfaces, each with > +a unique VLAN tag. > +Each > +.Nm > +segment is identified by a 24-bit value in the > +.Nm > +header called the > +.Dq VXLAN Network Identifier , > +or VNI. > +.Pp > +When configured with the > +.Xr ifconfig 8 > +.Cm learn > +parameter, the interface dynamically creates forwarding table entries > +from received packets. > +An entry in the forwarding table maps the inner source MAC address > +to the outer remote IP address. > +During transmit, the interface attempts to lookup an entry for > +the encapsulated destination MAC address. > +If an entry is found, the IP address in the entry is used to directly > +transmit the encapsulated frame to the destination. > +Otherwise, when configured in multicast mode, > +the interface must flood the frame to all hosts in the group. > +The maximum number of entries in the table is configurable with the > +.Xr ifconfig 8 > +.Cm maxaddr > +command. > +Stale entries in the table periodically pruned. 
> +The timeout is configurable with the > +.Xr ifconfig 8 > +.Cm timeout > +command. > +The table may be viewed with the > +.Xr sysctl 8 > +.Cm net.link.vlxan.N.ftable.dump > +command. > +.Sh MTU > +Since the > +.Nm > +interface encapsulates the Ethernet frame with an IP, UDP, and > +.Nm > +header, the resulting frame may be larger than the MTU of the > +physical network. > +The > +.Nm > +specification recommends the physical network MTU be configured > +to use jumbo frames to accommodate the encapsulated frame size. > +Alternatively, the > +.Xr ifconfig 8 > +.Cm mtu > +command may be used to reduce the MTU size on the > +.Nm > +interface to allow the encapsulated frame to fit in the > +current MTU of the physical network. > +.Sh EXAMPLES > +Create a > +.Nm > +interface in unicast mode > +with the > +.Cm local > +tunnel address of 192.168.100.1, > +and the > +.Cm remote > +tunnel address of 192.168.100.2. > +.Bd -literal -offset indent > +ifconfig vxlan create vni 108 local 192.168.100.1 remote 192.168.100.2 > +.Ed > +.Pp > +Create a > +.Nm > +interface in multicast mode, > +with the > +.Cm local > +address of 192.168.10.95, > +and the > +.Cm group > +address of 224.0.2.6. > +The em0 interface will be used to transmit multicast packets. > +.Bd -literal -offset indent > +ifconfig vxlan create vni 42 local 192.168.10.95 group 224.0.2.6 vxlandev em0 > +.Ed > +.Pp > +Once created, the > +.Nm > +interface can be configured with > +.Xr ifconfig 8 . > +.Sh SEE ALSO > +.Xr ifconfig 8 , > +.Xr inet 4 , > +.Xr inet 6 , > +.Xr sysctl 8 , > +.Xr vlan 8 > > *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** >
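For readers following the quoted man page text, the 24-bit VNI it describes can be illustrated with a small, standalone sketch of the 8-byte VXLAN header layout from RFC 7348 (one flags byte with the "I" bit 0x08, three reserved bytes, three VNI bytes, one reserved byte). This is not code from the commit: the `ex_` names and constants below are hypothetical and chosen only for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names for illustration; not taken from if_vxlan.h. */
#define EX_VXLAN_HDR_LEN   8
#define EX_VXLAN_FLAG_VNI  0x08u        /* "I" flag: the VNI field is valid */
#define EX_VXLAN_VNI_LIMIT (1u << 24)   /* a VNI must fit in 24 bits */

/*
 * Write an RFC 7348 VXLAN header for `vni` into `buf`.
 * Returns 0 on success, -1 if `vni` does not fit in 24 bits.
 */
int
ex_vxlan_encode(uint8_t buf[EX_VXLAN_HDR_LEN], uint32_t vni)
{
	if (vni >= EX_VXLAN_VNI_LIMIT)
		return (-1);

	memset(buf, 0, EX_VXLAN_HDR_LEN);   /* reserved fields are zero */
	buf[0] = EX_VXLAN_FLAG_VNI;
	buf[4] = (vni >> 16) & 0xff;        /* VNI in network byte order */
	buf[5] = (vni >> 8) & 0xff;
	buf[6] = vni & 0xff;
	return (0);
}

/*
 * Read the VNI back out of a header.
 * Returns 0 on success, -1 if the "I" flag is not set.
 */
int
ex_vxlan_decode(const uint8_t buf[EX_VXLAN_HDR_LEN], uint32_t *vni)
{
	if ((buf[0] & EX_VXLAN_FLAG_VNI) == 0)
		return (-1);

	*vni = ((uint32_t)buf[4] << 16) | ((uint32_t)buf[5] << 8) | buf[6];
	return (0);
}
```

Encoding the VNI 108 from the first man page example and decoding the result should yield 108 again, while any value of 2^24 or above is rejected.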