From: Luigi Rizzo
Date: Sun, 25 Apr 2004 09:49:40 -0700
To: current@freebsd.org
cc: net@freebsd.org
Subject: new arp code snapshot for review...
Message-ID: <20040425094940.A50968@xorpc.icir.org>

Here is a snapshot of the new arp code that i have been working on lately, based on Andre's ideas. (I say 'ARP' for brevity; what i mean is the layer3-to-layer2 address translation code -- arp, aarp, nd6 all fit in the category.)

The basic idea is to have per-ifp, per-af tables linked to the ifnet itself. Each table is address-family specific, and as such is managed by the protocol itself. It can be structured as a list, an array with direct access, or a hash table depending on the requirements. The search key is always the layer3 address.
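To make the per-ifp, per-af layout concrete, here is a minimal userland sketch. It is not the kernel code from the patch: every name prefixed with sk_ (and SK_AF_INET) is a hypothetical stand-in invented for illustration, and the table is the simplest of the three shapes mentioned above, a plain list keyed by the layer3 address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userland stand-ins; these are NOT the kernel structures. */
#define SK_AF_INET 2

struct sk_entry {
	struct sk_entry *next;
	uint32_t l3_addr;	/* search key: the layer3 address */
	uint8_t mac[6];		/* layer2 address, zeroed until resolved */
};

struct sk_table {
	struct sk_table *next;	/* next table on the same interface */
	int af;			/* address family that owns this table */
	struct sk_entry *head;	/* plain list; could be an array or a hash */
};

struct sk_ifnet {
	struct sk_table *tables;	/* per-af tables hang off the interface */
};

/* Find the table for address family af, optionally creating it. */
static struct sk_table *
sk_table_get(struct sk_ifnet *ifp, int af, int create)
{
	struct sk_table *t;

	assert(ifp != NULL);
	for (t = ifp->tables; t != NULL; t = t->next)
		if (t->af == af)
			return (t);
	if (!create)
		return (NULL);
	t = calloc(1, sizeof(*t));
	if (t != NULL) {
		t->af = af;
		t->next = ifp->tables;	/* insert at the head of the list */
		ifp->tables = t;
	}
	return (t);
}

/* Look up l3_addr; on a miss, insert a zeroed entry if create is set. */
static struct sk_entry *
sk_lookup(struct sk_ifnet *ifp, uint32_t l3_addr, int create)
{
	struct sk_table *t = sk_table_get(ifp, SK_AF_INET, create);
	struct sk_entry *e;

	if (t == NULL)
		return (NULL);
	for (e = t->head; e != NULL; e = e->next)
		if (e->l3_addr == l3_addr)
			return (e);
	if (!create)
		return (NULL);
	e = calloc(1, sizeof(*e));
	if (e != NULL) {
		e->l3_addr = l3_addr;
		e->next = t->head;
		t->head = e;
	}
	return (e);
}
```

A caller would start from an empty interface (struct sk_ifnet ifp = { NULL };), call sk_lookup(&ifp, addr, 1) to create an entry, and later call sk_lookup(&ifp, addr, 0) to find it again without touching the routing table at all.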
The advantage is a reduction in size of the routing table, because it does not have to store ARP entries anymore, and a likely speedup of the arp lookups because now the table lends itself nicely to quick lookup and easy management.

Also, when the approach is used for INET6 as well (which is the only other AF using the routing table to store arp entries) rtentry's will not need to support cloning anymore, nor store 'rt_gwroute', 'rt_llinfo', 'rt_genmask' and 'rt_parent' fields, which means another large chunk of code simply goes away.

Entries in the table are tagged with some flags so the code knows which ones refer to dynamic entries, local interface addresses, or statically configured entries.

Compatibility with userland tools is preserved using some stub routines which trap requests on the routing sockets and manipulate the arp tables accordingly. I have tried to keep the changes to a minimum (see below).

Basically all the existing functionality should be preserved, with a few minor differences:

+ routing entries associated with interfaces are now non-clonable

+ the 'useloopback' flag is not yet implemented, because i have some doubts on its semantics. At the moment, and despite what you might think, 'useloopback' means "when you create (by cloning) a routing entry to the local host, use the loopback interface if useloopback is set at the time of cloning". Because there is no cloning anymore, the above semantics (which is not a design decision, just an accident) has to change slightly, to one of these two forms:

  - use the loopback interface for any local traffic if useloopback=1
  - create a routing entry that uses the loopback interface if useloopback is set when you assign an address to an interface

  The former is a lot simpler, so i would vote for that.

I also have patches for nd6, but these are a bit more extensive and i am trying to see if i can write them in a way to minimize differences with the existing code. In any case, ipv6 should work unmodified.
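As a rough illustration of how the per-entry flags drive expiry, the sketch below mirrors the decision made in newarptimer() in the patch further down. The flag values are copied from the if_ether.h hunk; struct sk_lle and sk_should_purge() are hypothetical names made up for this example, and the real code walks the per-interface tables rather than taking a single entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Flag values as they appear in the if_ether.h part of the patch. */
#define LLE_DELETED	0x0001	/* entry must be deleted */
#define LLE_STATIC	0x0002	/* entry is static */
#define LLE_IFADDR	0x0004	/* entry is interface addr */

/* Hypothetical cut-down entry: only the fields the decision needs. */
struct sk_lle {
	uint16_t flags;
	time_t expire;
};

/*
 * Return 1 if the timer should free the entry, 0 if it should stay.
 * Deleted entries always go, static entries never expire, and
 * dynamic entries go once their expiry time has been reached.
 */
static int
sk_should_purge(const struct sk_lle *e, time_t now)
{
	assert(e != NULL);
	if (e->flags & LLE_DELETED)
		return (1);
	if (e->flags & LLE_STATIC)
		return (0);
	return (now >= e->expire);
}
```

Note the ordering: LLE_DELETED is checked first, so a static entry (including a local interface address, which arp_ifinit() marks as LLE_STATIC | LLE_IFADDR) still goes away once it has been flagged as deleted.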
--- Code changes: ---

src/usr.sbin/arp/arp.c
    one small change to make 'arp' requests clearly identifiable;

src/sys/net/route.c
    rtinit calls the new arp code, arp_ifscrub(), to remove an interface address when the address goes away. It also creates a route-to-interface, non-clonable, entry when a new interface address is configured.

src/sys/net/rtsock.c
    route_output() calls the new arp code, arp_rt_output(), to implement routing socket requests that relate to the arp table. Another method, sysctl_dumparp(), is used for the sysctl interface to the arp table. In both cases, the input and output data format is the same as before.

src/sys/netinet/if_ether.c
    this is the core of the new arp code for ipv4. At the moment this file also contains a number of generic routines which are not specific to ipv4 and so could well be moved to a different file. Note that arpresolve now completely ignores the 'rtentry' parameter passed by the upper layer.

src/sys/netinet/if_ether.h
    contains the definition of the 'struct lltable' and various flags that control the behaviour of each entry. All this should probably go elsewhere as it is not INET specific.

----------------------

comments welcome. The question i have is mainly: Have i forgotten anything? (the routing API is quite hard to follow...)

Please keep in mind that some things such as malloc vs uma, field and variable names, location of code are going to change, so if you have preferences please state them.
Also, as you see, there is no locking in place yet, i am leaving that task to the locking gurus cheers luigi ========================================= Index: src/usr.sbin/arp/arp.c =================================================================== RCS file: /home/ncvs/src/usr.sbin/arp/arp.c,v retrieving revision 1.50 diff -u -p -r1.50 arp.c --- src/usr.sbin/arp/arp.c 13 Apr 2004 14:16:37 -0000 1.50 +++ src/usr.sbin/arp/arp.c 25 Apr 2004 15:42:21 -0000 @@ -439,6 +439,17 @@ delete(char *host, int do_proxy) !(rtm->rtm_flags & RTF_GATEWAY) && valid_type(sdl->sdl_type) ) break; /* found it */ + /* check the new arp interface */ + if (sdl->sdl_family == AF_LINK && + !(rtm->rtm_flags & RTF_GATEWAY) && + valid_type(sdl->sdl_type) ) { + /* + * found it. But overwrite the address to make + * sure that we really get it. + */ + addr->sin_addr.s_addr = dst->sin_addr.s_addr; + break; + } if (dst->sin_other & SIN_PROXY) { fprintf(stderr, "delete: cannot locate %s\n",host); return (1); Index: src/sys/net/route.c =================================================================== RCS file: /home/ncvs/src/sys/net/route.c,v retrieving revision 1.104 diff -u -p -r1.104 route.c --- src/sys/net/route.c 25 Apr 2004 01:39:00 -0000 1.104 +++ src/sys/net/route.c 25 Apr 2004 16:13:39 -0000 @@ -42,6 +42,7 @@ #include #include +#include /* for sockaddr_dl */ #include #include @@ -1105,9 +1106,13 @@ rt_maskedcopy(struct sockaddr *src, stru bzero((caddr_t)cp2, (unsigned)(cplim2 - cp2)); } +void arp_ifscrub(struct ifnet *ifp, uint32_t addr); + /* * Set up a routing table entry, normally * for an interface. + * Instead of the destination address, use a sockaddr_dl for the + * gateway, using the index and type of the interface. 
*/ int rtinit(struct ifaddr *ifa, int cmd, int flags) @@ -1118,6 +1123,7 @@ rtinit(struct ifaddr *ifa, int cmd, int struct rtentry *rt = NULL; struct rt_addrinfo info; int error; + static struct sockaddr_dl null_sdl = {sizeof(null_sdl), AF_LINK}; if (flags & RTF_HOST) { dst = ifa->ifa_dstaddr; @@ -1126,6 +1132,13 @@ rtinit(struct ifaddr *ifa, int cmd, int dst = ifa->ifa_addr; netmask = ifa->ifa_netmask; } + printf("rtinit cmd %d flags 0x%x, ifa_ifp %p dst %d:0x%x gw %d:0x%x\n", + cmd, flags, ifa->ifa_ifp, + dst->sa_family, + ntohl(((struct sockaddr_in *)dst)->sin_addr.s_addr), + ifa->ifa_addr->sa_family, + ntohl(((struct sockaddr_in *)ifa->ifa_addr)->sin_addr.s_addr)); + /* * If it's a delete, check that if it exists, it's on the correct * interface or we might scrub a route to another ifa which would @@ -1136,6 +1149,9 @@ rtinit(struct ifaddr *ifa, int cmd, int struct radix_node_head *rnh; struct radix_node *rn; + if (dst->sa_family == AF_INET) + arp_ifscrub(ifa->ifa_ifp, + ((struct sockaddr_in *)dst)->sin_addr.s_addr); /* * It's a delete, so it should already exist.. 
* If it's a net, mask off the host bits @@ -1175,10 +1191,14 @@ bad: info.rti_ifa = ifa; info.rti_flags = flags | ifa->ifa_flags; info.rti_info[RTAX_DST] = dst; - info.rti_info[RTAX_GATEWAY] = ifa->ifa_addr; + info.rti_info[RTAX_GATEWAY] = (struct sockaddr *)&null_sdl; info.rti_info[RTAX_NETMASK] = netmask; error = rtrequest1(cmd, &info, &rt); if (error == 0 && rt != NULL) { + ((struct sockaddr_dl *)rt->rt_gateway)->sdl_type = + rt->rt_ifp->if_type; + ((struct sockaddr_dl *)rt->rt_gateway)->sdl_index = + rt->rt_ifp->if_index; /* * notify any listening routing agents of the change */ Index: src/sys/net/rtsock.c =================================================================== RCS file: /home/ncvs/src/sys/net/rtsock.c,v retrieving revision 1.107 diff -u -p -r1.107 rtsock.c --- src/sys/net/rtsock.c 19 Apr 2004 07:20:32 -0000 1.107 +++ src/sys/net/rtsock.c 25 Apr 2004 15:39:49 -0000 @@ -91,6 +91,10 @@ static void rt_getmetrics(const struct r struct rt_metrics *out); static void rt_dispatch(struct mbuf *, const struct sockaddr *); +/* support new arp code */ +int arp_rt_output(struct rt_msghdr *rtm, struct rt_addrinfo *info); +int sysctl_dumparp(int af, struct sysctl_req *wr); + /* * It really doesn't make any sense at all for this code to share much * with raw_usrreq.c, since its functionality is so restricted. 
XXX @@ -275,6 +279,8 @@ static struct pr_usrreqs route_usrreqs = sosend, soreceive, sopoll, pru_sosetlabel_null }; + + /*ARGSUSED*/ static int route_output(struct mbuf *m, struct socket *so) @@ -350,6 +356,11 @@ route_output(struct mbuf *m, struct sock if (info.rti_info[RTAX_GATEWAY] == NULL) senderr(EINVAL); saved_nrt = NULL; + if (info.rti_info[RTAX_GATEWAY]->sa_family == AF_LINK) { + /* support for new ARP code */ + arp_rt_output(rtm, &info); + break; + } error = rtrequest1(RTM_ADD, &info, &saved_nrt); if (error == 0 && saved_nrt) { RT_LOCK(saved_nrt); @@ -363,6 +374,11 @@ route_output(struct mbuf *m, struct sock case RTM_DELETE: saved_nrt = NULL; + if (info.rti_info[RTAX_GATEWAY]->sa_family == AF_LINK) { + /* support for new ARP code */ + arp_rt_output(rtm, &info); + break; + } error = rtrequest1(RTM_DELETE, &info, &saved_nrt); if (error == 0) { RT_LOCK(saved_nrt); @@ -1069,6 +1085,7 @@ sysctl_rtsock(SYSCTL_HANDLER_ARGS) int i, lim, s, error = EINVAL; u_char af; struct walkarg w; + int found = 0; name ++; namelen--; @@ -1100,8 +1117,17 @@ sysctl_rtsock(SYSCTL_HANDLER_ARGS) error = rnh->rnh_walktree(rnh, sysctl_dumpentry, &w);/* could sleep XXX */ /* RADIX_NODE_HEAD_UNLOCK(rnh); */ - } else if (af != 0) - error = EAFNOSUPPORT; + if (error) + break; + found = 1; + } + /* + * take care of llinfo entries. XXX check AF_INET ? + */ + if (w.w_op == NET_RT_FLAGS && (RTF_LLINFO & w.w_arg)) + error = sysctl_dumparp(af, w.w_req); + else if (af != 0 && found == 0) + error = EAFNOSUPPORT; break; case NET_RT_IFLIST: Index: src/sys/netinet/if_ether.c =================================================================== RCS file: /home/ncvs/src/sys/netinet/if_ether.c,v retrieving revision 1.127 diff -u -p -r1.127 if_ether.c --- src/sys/netinet/if_ether.c 25 Apr 2004 15:00:17 -0000 1.127 +++ src/sys/netinet/if_ether.c 25 Apr 2004 16:14:03 -0000 @@ -27,7 +27,7 @@ * SUCH DAMAGE. 
* * @(#)if_ether.c 8.1 (Berkeley) 6/10/93 - * $FreeBSD: src/sys/netinet/if_ether.c,v 1.127 2004/04/25 15:00:17 luigi Exp $ + * $FreeBSD$ */ /* @@ -101,7 +101,6 @@ struct llinfo_arp { static LIST_HEAD(, llinfo_arp) llinfo_arp; static struct ifqueue arpintrq; -static int arp_allocated; static int arp_maxtries = 5; static int useloopback = 1; /* use loopback interface for local traffic */ @@ -116,18 +115,303 @@ SYSCTL_INT(_net_link_ether_inet, OID_AUT &arp_proxyall, 0, ""); static void arp_init(void); -static void arp_rtrequest(int, struct rtentry *, struct rt_addrinfo *); static void arprequest(struct ifnet *, struct in_addr *, struct in_addr *, u_char *); static void arpintr(struct mbuf *); static void arptfree(struct llinfo_arp *); static void arptimer(void *); -static struct llinfo_arp - *arplookup(u_long, int, int); +struct llentry *arplookup(struct ifnet *ifp, uint32_t addr, uint32_t flags); #ifdef INET static void in_arpinput(struct mbuf *); #endif +/*** + *** + *** Start of new arp support routines which should go to a separate file. + *** + ***/ +#define DEB(x) +#define DDB(x) x + +struct llentry { + struct llentry *lle_next; + struct mbuf *la_hold; + uint16_t flags; /* see values in if_ether.h */ + uint8_t la_preempt; + uint8_t la_asked; + time_t expire; + struct in_addr l3_addr; + union { + uint64_t mac_aligned; + uint16_t mac16[3]; + } ll_addr; +}; + +MALLOC_DEFINE(M_ARP, "arp", "arp entries"); /* XXX will move to UMA */ + +int arp_rt_output(struct rt_msghdr *rtm, struct rt_addrinfo *info); +int sysctl_dumparp(int af, struct sysctl_req *wr); +void arp_ifscrub(struct ifnet *ifp, uint32_t addr); + +/* + * called by in_ifscrub to remove entry from the table when + * the interface goes away + */ +void +arp_ifscrub(struct ifnet *ifp, uint32_t addr) +{ + arplookup(ifp, addr, LLE_DELETE | LLE_IFADDR); +} + +/* + * Find an interface address matching the ifp-addr pair. 
+ * This may replicate some of the functions of ifa_ifwithnet() + */ +static struct ifaddr * +find_ifa(struct ifnet *ifp, uint32_t addr) +{ + struct ifaddr *ifa; + + if (ifp == NULL) + return NULL; + TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { + if (ifa->ifa_addr->sa_family != AF_INET) + continue; + if (ifp->if_flags & IFF_POINTOPOINT) + break; + if (((addr ^ SIN(ifa->ifa_addr)->sin_addr.s_addr) & + SIN(ifa->ifa_netmask)->sin_addr.s_addr ) == 0) + break; /* found! */ + } + return ifa; +} + +static void +llentry_free(struct llentry **e) +{ + struct llentry *x; + + if (e == 0) + panic("llentry_free: null ptr"); + x = *e; + *e = x->lle_next; + if (x->la_hold) + m_freem(x->la_hold); + free(x, M_ARP); +} + +/* + * Add a new table at the head of the list for interface ifp + */ +struct lltable * +lltable_new(struct ifnet *ifp, int af) +{ + struct lltable *t; + + t = malloc(sizeof (struct lltable), M_ARP, M_DONTWAIT | M_ZERO); + if (t != NULL) { + t->llt_next = ifp->lltables; + t->llt_af = af; + ifp->lltables = t; + } + return t; +} + +struct lltable ** +lltable_free(struct lltable **t) +{ + struct lltable *x; + + if (t == NULL) + panic("lltable_free: null ptr"); + x = *t; + *t = x->llt_next; + free(x, M_ARP); + return t; +} + +static void +newarptimer(__unused void *ignored_arg) +{ + struct lltable *t; + struct llentry **e; + struct ifnet *ifp; + + IFNET_RLOCK(); + printf("arptimer!\n"); + TAILQ_FOREACH(ifp, &ifnet, if_link) { + for (t = ifp->lltables; t ; t = t->llt_next) { + if (t->llt_af != AF_INET) + continue; + for (e = (struct llentry **)&t->lle_head; *e; ) { + int kill; + + if ((*e)->flags & LLE_DELETED) + kill = 1; + else if ((*e)->flags & LLE_STATIC) + kill = 0; + else + kill = time_second >= (*e)->expire; + if (kill) + llentry_free(e); + else + e = &((*e)->lle_next); + } + } + } + IFNET_RUNLOCK(); + callout_reset(&arp_callout, arpt_prune * hz, newarptimer, NULL); +} + +static int +inet_dumparp(struct ifnet *ifp, void *head, struct sysctl_req *wr) +{ + 
struct llentry *e; + int error = 0; + + for (e = head; e; e = e->lle_next) { + struct { + struct rt_msghdr rtm; + struct sockaddr_inarp sin2; + struct sockaddr_dl sdl; + //struct sockaddr_inarp addr2; + } d; + + DEB(printf("ifp %p index %d flags 0x%x ip %x %s\n", + ifp, ifp->if_index, + e->flags, + ntohl(e->l3_addr.s_addr), + (e->flags & LLA_VALID) ? "valid" : "incomplete");) + if (e->flags & LLE_DELETED) /* skip deleted entries */ + continue; + /* + * produce a msg made of: + * struct rt_msghdr; + * struct sockaddr_inarp; + * struct sockaddr_dl; + */ + bzero(&d, sizeof (d)); + d.rtm.rtm_msglen = sizeof(d); + d.sin2.sin_family = AF_INET; + d.sin2.sin_len = sizeof(d.sin2); + d.sin2.sin_addr.s_addr = e->l3_addr.s_addr; + + if (e->flags & LLA_VALID) { /* valid MAC */ + d.sdl.sdl_family = AF_LINK; + d.sdl.sdl_len = sizeof(d.sdl); + d.sdl.sdl_alen = ifp->if_addrlen; + d.sdl.sdl_index = ifp->if_index; + d.sdl.sdl_type = ifp->if_type; + bcopy(&e->ll_addr, LLADDR(&d.sdl), ifp->if_addrlen); + } + d.rtm.rtm_rmx.rmx_expire = + e->flags & LLE_STATIC ? 0 : e->expire; + d.rtm.rtm_flags = RTF_LLINFO; + if (e->flags & LLE_STATIC) + d.rtm.rtm_flags |= RTF_STATIC; + d.rtm.rtm_index = ifp->if_index; + error = SYSCTL_OUT(wr, &d, sizeof(d)); + if (error) + break; + } + return error; +} + +/* + * glue to dump arp tables + */ +int +sysctl_dumparp(int af, struct sysctl_req *wr) +{ + struct lltable *t; + struct ifnet *ifp; + int error = 0; + + IFNET_RLOCK(); + TAILQ_FOREACH(ifp, &ifnet, if_link) { + for (t = ifp->lltables; t ; t = t->llt_next) { + if (af != 0 && t->llt_af != af) + continue; + switch (af) { + case AF_INET: + error = inet_dumparp(ifp, t->lle_head, wr); + break; + /* other handlers, if any */ + } + if (error) + goto done; + } + } +done: + IFNET_RUNLOCK(); + return (error); +} + +/* + * Called in route_output when adding/deleting a route to an interface. 
+ */ +int +arp_rt_output(struct rt_msghdr *rtm, struct rt_addrinfo *info) +{ + struct sockaddr_dl *dl = + (struct sockaddr_dl *)info->rti_info[RTAX_GATEWAY]; + struct sockaddr_in *dst = + (struct sockaddr_in *)info->rti_info[RTAX_DST]; + struct ifnet *ifp; + struct llentry *la; + u_int flags; + + printf("arp_rt_output type %d af: gw %d dst %d:%x if_index %d\n", + rtm->rtm_type, + dl ? dl->sdl_family : 0, + dst ? dst->sin_family : 0, + dst && dst->sin_family == AF_INET ? + ntohl(dst->sin_addr.s_addr) : 0, + dl ? dl->sdl_index : 0); + if (dl == NULL || dl->sdl_family != AF_LINK) { + /* XXX should also check (dl->sdl_index < if_indexlim) */ + printf("invalid gateway/index\n"); + return EINVAL; + } + ifp = ifnet_byindex(dl->sdl_index); + if (ifp == NULL) { + printf("invalid ifp\n"); + return EINVAL; + } + + switch (rtm->rtm_type) { + case RTM_ADD: + flags = LLE_CREATE; + break; + + case RTM_CHANGE: + default: + return EINVAL; /* XXX not implemented yet */ + + case RTM_DELETE: + flags = LLE_DELETE; + break; + } + la = arplookup(ifp, dst->sin_addr.s_addr, flags); + if (la == NULL) { + bcopy(LLADDR(dl), &la->ll_addr, ifp->if_addrlen); + la->flags |= LLA_VALID; + if (rtm->rtm_flags & RTF_STATIC) + la->flags |= LLE_STATIC; + else + la->expire = time_second + arpt_keep; + } + return 0; +} + + + +/*** + *** + *** End of new arp support routines which should go to a separate file. + *** + ***/ + /* * Timeout routine. Age arp_tab entries periodically. */ @@ -152,6 +436,9 @@ arptimer(ignored_arg) callout_reset(&arp_callout, arpt_prune * hz, arptimer, NULL); } +#if 0 /* this is unused */ +static int arp_allocated; + /* * Parallel to llc_rtrequest. */ @@ -284,6 +571,7 @@ arp_rtrequest(req, rt, info) Free((caddr_t)la); } } +#endif /* arp_rtrequest unused */ /* * Broadcast an ARP request. 
Caller specifies: @@ -301,6 +589,28 @@ arprequest(ifp, sip, tip, enaddr) struct arphdr *ah; struct sockaddr sa; + if (sip == NULL) { + /* + * The caller did not supply a source address, try to find + * a compatible one among those assigned to this interface. + */ + struct ifaddr *ifa; + + TAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) { + if (!ifa->ifa_addr || + ifa->ifa_addr->sa_family != AF_INET) + continue; + sip = &SIN(ifa->ifa_addr)->sin_addr; + if (0 == ((sip->s_addr ^ tip->s_addr) & + SIN(ifa->ifa_netmask)->sin_addr.s_addr) ) + break; /* found it. */ + } + } + if (sip == NULL) { + printf(" cannot find matching address, no arprequest\n"); + return; + } + if ((m = m_gethdr(M_DONTWAIT, MT_DATA)) == NULL) return; m->m_len = sizeof(*ah) + 2*sizeof(struct in_addr) + @@ -344,16 +654,11 @@ int arpresolve(struct ifnet *ifp, struct rtentry *rt0, struct mbuf *m, struct sockaddr *dst, u_char *desten) { - struct llinfo_arp *la = 0; + struct llentry *la = 0; struct sockaddr_dl *sdl; - int error; struct rtentry *rt; - - error = rt_check(&rt, &rt0, dst); - if (error) { - m_freem(m); - return error; - } + u_int flags = (ifp->if_flags & (IFF_NOARP | IFF_STATICARP)) ? + 0 : LLE_CREATE; if (m->m_flags & M_BCAST) { /* broadcast */ (void)memcpy(desten, ifp->if_broadcastaddr, ifp->if_addrlen); @@ -363,51 +668,39 @@ arpresolve(struct ifnet *ifp, struct rte ETHER_MAP_IP_MULTICAST(&SIN(dst)->sin_addr, desten); return (0); } - if (rt) - la = (struct llinfo_arp *)rt->rt_llinfo; - if (la == 0) { - la = arplookup(SIN(dst)->sin_addr.s_addr, 1, 0); - if (la) - rt = la->la_rt; - } - if (la == 0 || rt == 0) { - log(LOG_DEBUG, "arpresolve: can't allocate llinfo for %s%s%s\n", - inet_ntoa(SIN(dst)->sin_addr), la ? "la" : "", - rt ? 
"rt" : ""); + la = arplookup(ifp, SIN(dst)->sin_addr.s_addr, flags); + if (la == NULL) { + if (flags & LLE_CREATE) + log(LOG_DEBUG, + "arpresolve: can't allocate llinfo for %s\n", + inet_ntoa(SIN(dst)->sin_addr)); m_freem(m); return (EINVAL); /* XXX */ } sdl = SDL(rt->rt_gateway); /* - * Check the address family and length is valid, the address - * is resolved; otherwise, try to resolve. + * If the entry is valid and not expired, use it. */ - if ((rt->rt_expire == 0 || rt->rt_expire > time_second) && - sdl->sdl_family == AF_LINK && sdl->sdl_alen != 0) { + if (la->flags & LLA_VALID && + (la->flags & LLE_STATIC || la->expire > time_second)) { + bcopy(&la->ll_addr, desten, ifp->if_addrlen); /* * If entry has an expiry time and it is approaching, * see if we need to send an ARP request within this * arpt_down interval. */ - if ((rt->rt_expire != 0) && - (time_second + la->la_preempt > rt->rt_expire)) { - arprequest(ifp, - &SIN(rt->rt_ifa->ifa_addr)->sin_addr, - &SIN(dst)->sin_addr, - IF_LLADDR(ifp)); + if (!(la->flags & LLE_STATIC) && + time_second + la->la_preempt > la->expire) { + arprequest(ifp, NULL, + &SIN(dst)->sin_addr, IF_LLADDR(ifp)); + la->la_preempt--; } - - bcopy(LLADDR(sdl), desten, sdl->sdl_alen); return (0); } - /* - * If ARP is disabled or static on this interface, stop. - * XXX - * Probably should not allocate empty llinfo struct if we are - * not going to be sending out an arp request. - */ - if (ifp->if_flags & (IFF_NOARP | IFF_STATICARP)) { + if (la->flags & LLE_STATIC) { /* should not happen! 
*/ + log(LOG_DEBUG, "arpresolve: ouch, empty static llinfo for %s\n", + inet_ntoa(SIN(dst)->sin_addr)); m_freem(m); return (EINVAL); } @@ -419,26 +712,26 @@ arpresolve(struct ifnet *ifp, struct rte if (la->la_hold) m_freem(la->la_hold); la->la_hold = m; - if (rt->rt_expire) { - RT_LOCK(rt); - rt->rt_flags &= ~RTF_REJECT; - if (la->la_asked == 0 || rt->rt_expire != time_second) { - rt->rt_expire = time_second; - if (la->la_asked++ < arp_maxtries) { - arprequest(ifp, - &SIN(rt->rt_ifa->ifa_addr)->sin_addr, - &SIN(dst)->sin_addr, - IF_LLADDR(ifp)); - } else { - rt->rt_flags |= RTF_REJECT; - rt->rt_expire += arpt_down; - la->la_asked = 0; - la->la_preempt = arp_maxtries; - } - + /* + * Now implement the logic to issue requests -- we can send up + * to arp_maxtries with a 1-sec spacing, followed by a pause + * of arpt_down seconds if no replies are coming back. + * Take the chance to enforce limits on arp_maxtries and arpt_down + */ + if (la->expire <= time_second) { /* ok, expired */ + if (arp_maxtries > 100) /* enforce a sane limit */ + arp_maxtries = 100; + else if (arp_maxtries < 3) + arp_maxtries = 3; + if (la->la_asked++ < arp_maxtries) + la->expire = time_second + 1; + else { + la->la_asked = 0; + la->expire = time_second + arpt_down; + la->la_preempt = arp_maxtries; } - RT_UNLOCK(rt); - } + arprequest(ifp, NULL, &SIN(dst)->sin_addr, IF_LLADDR(ifp)); + } return (EWOULDBLOCK); } @@ -518,16 +811,12 @@ in_arpinput(m) { struct arphdr *ah; struct ifnet *ifp = m->m_pkthdr.rcvif; - struct iso88025_header *th = (struct iso88025_header *)0; - struct iso88025_sockaddr_dl_data *trld; - struct llinfo_arp *la = 0; - struct rtentry *rt; + struct llentry *la = 0; struct ifaddr *ifa; struct in_ifaddr *ia; - struct sockaddr_dl *sdl; struct sockaddr sa; struct in_addr isaddr, itaddr, myaddr; - int op, rif_len; + int op; int req_len; req_len = arphdr_len2(ifp->if_addrlen, sizeof(struct in_addr)); @@ -540,6 +829,19 @@ in_arpinput(m) op = ntohs(ah->ar_op); (void)memcpy(&isaddr, 
ar_spa(ah), sizeof (isaddr)); (void)memcpy(&itaddr, ar_tpa(ah), sizeof (itaddr)); + /* + * sanity check for the address length. + * XXX this does not work for protocols with variable address + * length. -is + */ + if (ifp->if_addrlen != ah->ar_hln) { + log(LOG_WARNING, + "arp from %*D: addr len: new %d, i/f %d (ignored)", + ifp->if_addrlen, (u_char *) ar_sha(ah), ":", + ah->ar_hln, ifp->if_addrlen); + goto drop; + } + #ifdef BRIDGE #define BRIDGE_TEST (do_bridge) #else @@ -592,62 +894,41 @@ match: } if (ifp->if_flags & IFF_STATICARP) goto reply; - la = arplookup(isaddr.s_addr, itaddr.s_addr == myaddr.s_addr, 0); - if (la && (rt = la->la_rt) && (sdl = SDL(rt->rt_gateway))) { - /* the following is not an error when doing bridging */ - if (!BRIDGE_TEST && rt->rt_ifp != ifp) { - if (log_arp_wrong_iface) - log(LOG_ERR, "arp: %s is on %s but got reply from %*D on %s\n", - inet_ntoa(isaddr), - rt->rt_ifp->if_xname, - ifp->if_addrlen, (u_char *)ar_sha(ah), ":", - ifp->if_xname); - goto reply; - } - if (sdl->sdl_alen && - bcmp(ar_sha(ah), LLADDR(sdl), sdl->sdl_alen)) { - if (rt->rt_expire) { - if (log_arp_movements) - log(LOG_INFO, "arp: %s moved from %*D to %*D on %s\n", - inet_ntoa(isaddr), - ifp->if_addrlen, (u_char *)LLADDR(sdl), ":", - ifp->if_addrlen, (u_char *)ar_sha(ah), ":", - ifp->if_xname); - } else { + /* Look up the source. If I am the target, create an entry for it. */ + la = arplookup(ifp, isaddr.s_addr, + (itaddr.s_addr == myaddr.s_addr) ? LLE_CREATE : 0); + if (la != NULL) { + /* We have a valid entry. Check and store the MAC. 
*/ + if (la->flags & LLA_VALID && + bcmp(ar_sha(ah), &la->ll_addr, ifp->if_addrlen)) { + if (la->flags & LLE_STATIC) { log(LOG_ERR, "arp: %*D attempts to modify permanent entry for %s on %s\n", ifp->if_addrlen, (u_char *)ar_sha(ah), ":", inet_ntoa(isaddr), ifp->if_xname); goto reply; } + if (log_arp_movements) + log(LOG_INFO, "arp: %s moved from %*D to %*D on %s\n", + inet_ntoa(isaddr), + ifp->if_addrlen, (u_char *)&la->ll_addr, ":", + ifp->if_addrlen, (u_char *)ar_sha(ah), ":", + ifp->if_xname); } - /* - * sanity check for the address length. - * XXX this does not work for protocols with variable address - * length. -is - */ - if (sdl->sdl_alen && - sdl->sdl_alen != ah->ar_hln) { - log(LOG_WARNING, - "arp from %*D: new addr len %d, was %d", - ifp->if_addrlen, (u_char *) ar_sha(ah), ":", - ah->ar_hln, sdl->sdl_alen); - } - if (ifp->if_addrlen != ah->ar_hln) { - log(LOG_WARNING, - "arp from %*D: addr len: new %d, i/f %d (ignored)", - ifp->if_addrlen, (u_char *) ar_sha(ah), ":", - ah->ar_hln, ifp->if_addrlen); - goto reply; - } - (void)memcpy(LLADDR(sdl), ar_sha(ah), - sdl->sdl_alen = ah->ar_hln); + bcopy(ar_sha(ah), &la->ll_addr, ifp->if_addrlen); + la->flags |= LLA_VALID; +#if 0 /* XXX this needs to be fixed */ /* * If we receive an arp from a token-ring station over * a token-ring nic then try to save the source * routing info. 
*/ if (ifp->if_type == IFT_ISO88025) { + struct iso88025_header *th; + struct iso88025_sockaddr_dl_data *trld; + struct sockaddr_dl *sdl; + int rif_len; + th = (struct iso88025_header *)m->m_pkthdr.header; trld = SDL_ISO88025(sdl); rif_len = TR_RCF_RIFLEN(th->rcf); @@ -673,15 +954,20 @@ match: m->m_pkthdr.len += 8; th->rcf = trld->trld_rcf; } - RT_LOCK(rt); - if (rt->rt_expire) - rt->rt_expire = time_second + arpt_keep; - rt->rt_flags &= ~RTF_REJECT; - RT_UNLOCK(rt); +#endif + if (!(la->flags & LLE_STATIC)) + la->expire = time_second + arpt_keep; la->la_asked = 0; la->la_preempt = arp_maxtries; if (la->la_hold) { - (*ifp->if_output)(ifp, la->la_hold, rt_key(rt), rt); + struct sockaddr_in sin; + + bzero(&sin, sizeof(sin)); + sin.sin_len = sizeof(struct sockaddr_in); + sin.sin_family = AF_INET; + sin.sin_addr.s_addr = la->l3_addr.s_addr; + ifp->if_output(ifp, la->la_hold, + (struct sockaddr *)&sin, NULL); la->la_hold = 0; } } @@ -693,9 +979,10 @@ reply: (void)memcpy(ar_tha(ah), ar_sha(ah), ah->ar_hln); (void)memcpy(ar_sha(ah), IF_LLADDR(ifp), ah->ar_hln); } else { - la = arplookup(itaddr.s_addr, 0, SIN_PROXY); + la = arplookup(ifp, itaddr.s_addr, LLE_PROXY); if (la == NULL) { struct sockaddr_in sin; + struct rtentry *rt; if (!arp_proxyall) goto drop; @@ -747,10 +1034,8 @@ reply: inet_ntoa(itaddr)); #endif } else { - rt = la->la_rt; (void)memcpy(ar_tha(ah), ar_sha(ah), ah->ar_hln); - sdl = SDL(rt->rt_gateway); - (void)memcpy(ar_sha(ah), LLADDR(sdl), ah->ar_hln); + (void)memcpy(ar_sha(ah), &la->ll_addr, ah->ar_hln); } } @@ -798,66 +1083,77 @@ arptfree(la) /* * Lookup or enter a new address in arptab. 
*/ -static struct llinfo_arp * -arplookup(addr, create, proxy) - u_long addr; - int create, proxy; +struct llentry * +arplookup(struct ifnet *ifp, uint32_t l3addr, u_int flags) { - struct rtentry *rt; - struct sockaddr_inarp sin; - const char *why = 0; - - bzero(&sin, sizeof(sin)); - sin.sin_len = sizeof(sin); - sin.sin_family = AF_INET; - sin.sin_addr.s_addr = addr; - if (proxy) - sin.sin_other = SIN_PROXY; - rt = rtalloc1((struct sockaddr *)&sin, create, 0UL); - if (rt == 0) - return (0); - - if (rt->rt_flags & RTF_GATEWAY) - why = "host is not on local network"; - else if ((rt->rt_flags & RTF_LLINFO) == 0) - why = "could not allocate llinfo"; - else if (rt->rt_gateway->sa_family != AF_LINK) - why = "gateway route is not ours"; - - if (why) { -#define ISDYNCLONE(_rt) \ - (((_rt)->rt_flags & (RTF_STATIC | RTF_WASCLONED)) == RTF_WASCLONED) - if (create) - log(LOG_DEBUG, "arplookup %s failed: %s\n", - inet_ntoa(sin.sin_addr), why); - /* - * If there are no references to this Layer 2 route, - * and it is a cloned route, and not static, and - * arplookup() is creating the route, then purge - * it from the routing table as it is probably bogus. - */ - if (rt->rt_refcnt == 1 && ISDYNCLONE(rt)) - rtexpunge(rt); - RTFREE_LOCKED(rt); - return (0); -#undef ISDYNCLONE - } else { - RT_REMREF(rt); - RT_UNLOCK(rt); - return ((struct llinfo_arp *)rt->rt_llinfo); - } + struct llentry *e; + struct lltable *t; + // uint proxy = flags & LLE_PROXY; + + if (ifp == NULL) + return NULL; + /* LOCK_IFNET */ + for (t = ifp->lltables; t && t->llt_af != AF_INET; t = t->llt_next) + ; + if (t == NULL && flags & LLE_CREATE) + t = lltable_new(ifp, AF_INET); + if (t == NULL) { + /* UNLOCK_ALL_TABLES */ + return NULL; /* failed! 
*/ + } + /* LOCK_TABLE(t) */ + /* UNLOCK_ALL_TABLES */ + for (e = (struct llentry *)t->lle_head; e ; e = e->lle_next) { + if (e->flags & LLE_DELETED) + continue; + if (l3addr == e->l3_addr.s_addr) + break; + } + if (e == NULL) { /* entry not found */ + if (!(flags & LLE_CREATE)) + goto done; + if (find_ifa(ifp, l3addr) == NULL) { + printf("host is not on local network\n"); + goto done; + } + e = malloc(sizeof (struct llentry), M_ARP, M_DONTWAIT | M_ZERO); + if (e == NULL) { + printf("arp malloc failed\n"); + goto done; + } + e->expire = time_second; /* mark expired */ + e->l3_addr.s_addr = l3addr; + e->lle_next = t->lle_head; + t->lle_head = e; + } + if (flags & LLE_DELETE && + (e->flags & LLE_IFADDR) == (flags & LLE_IFADDR)) + e->flags = LLE_DELETED; +done: + /* UNLOCK(t) */ + return e; } + void arp_ifinit(ifp, ifa) struct ifnet *ifp; struct ifaddr *ifa; { + struct llentry *la; + + printf("arp_ifinit ifp %p addr 0x%x\n", + ifp, ntohl(IA_SIN(ifa)->sin_addr.s_addr)); + if (ntohl(IA_SIN(ifa)->sin_addr.s_addr) != INADDR_ANY) arprequest(ifp, &IA_SIN(ifa)->sin_addr, &IA_SIN(ifa)->sin_addr, IF_LLADDR(ifp)); - ifa->ifa_rtrequest = arp_rtrequest; - ifa->ifa_flags |= RTF_CLONING; + la = arplookup(ifp, IA_SIN(ifa)->sin_addr.s_addr, LLE_CREATE); + if (la) { /* store our address */ + bcopy(IF_LLADDR(ifp), &la->ll_addr, ifp->if_addrlen); + la->flags |= LLA_VALID | LLE_STATIC | LLE_IFADDR; + } + ifa->ifa_rtrequest = NULL; } static void @@ -866,9 +1162,8 @@ arp_init(void) arpintrq.ifq_maxlen = 50; mtx_init(&arpintrq.ifq_mtx, "arp_inq", NULL, MTX_DEF); - LIST_INIT(&llinfo_arp); callout_init(&arp_callout, CALLOUT_MPSAFE); netisr_register(NETISR_ARP, arpintr, &arpintrq, NETISR_MPSAFE); - callout_reset(&arp_callout, hz, arptimer, NULL); + callout_reset(&arp_callout, hz, newarptimer, NULL); } SYSINIT(arp, SI_SUB_PROTO_DOMAIN, SI_ORDER_ANY, arp_init, 0); Index: src/sys/netinet/if_ether.h =================================================================== RCS file: 
/home/ncvs/src/sys/netinet/if_ether.h,v retrieving revision 1.30 diff -u -p -r1.30 if_ether.h --- src/sys/netinet/if_ether.h 7 Apr 2004 20:46:13 -0000 1.30 +++ src/sys/netinet/if_ether.h 25 Apr 2004 15:09:46 -0000 @@ -112,6 +112,33 @@ extern u_char ether_ipmulticast_max[ETHE int arpresolve(struct ifnet *ifp, struct rtentry *rt, struct mbuf *m, struct sockaddr *dst, u_char *desten); void arp_ifinit(struct ifnet *, struct ifaddr *); + +/* + * Support routines for the new arp table + */ +struct lltable *lltable_new(struct ifnet *ifp, int af); +struct lltable **lltable_free(struct lltable **t); #endif +struct lltable { + struct lltable *llt_next; + void *lle_head; /* pointer to the list of address entries */ + int llt_af; /* address family */ +}; + +/* + * flags to be passed to arplookup. + */ +#define LLE_DELETED 0x0001 /* entry must be deleted */ +#define LLE_STATIC 0x0002 /* entry is static */ +#define LLE_IFADDR 0x0004 /* entry is interface addr */ +#define LLA_VALID 0x0008 /* ll_addr is valid */ +#define LLE_PROXY 0x0010 /* proxy entry ??? */ +#define LLE_PUB 0x0020 /* publish entry ??? */ +#define LLE_CREATE 0x8000 /* create on a lookup miss */ +#define LLE_DELETE 0x4000 /* delete on a lookup - match LLE_IFADDR */ + +/* + * End of support code for the new arp table + */ #endif
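
For readers following the arpresolve() hunk above, the request pacing it describes (up to arp_maxtries requests with 1-sec spacing, then a pause of arpt_down seconds) can be sketched in userland as follows. This is only an approximation under stated assumptions: struct sk_probe and sk_probe_tick() are hypothetical names, the limits are passed as parameters instead of being read from the sysctl variables, and la_preempt and the held mbuf are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical cut-down entry state: only what the pacing needs. */
struct sk_probe {
	int asked;	/* requests sent in the current burst */
	time_t expire;	/* earliest time the next request may go out */
};

/*
 * Return 1 if a request should be sent at time 'now', 0 otherwise.
 * Mirrors the patch: clamp maxtries to [3, 100], space requests one
 * second apart, then back off for 'down' seconds. The patch clamps
 * the global arp_maxtries; here only the local copy is clamped.
 */
static int
sk_probe_tick(struct sk_probe *p, time_t now, int maxtries, int down)
{
	assert(p != NULL);
	if (p->expire > now)		/* not expired yet: stay quiet */
		return (0);
	if (maxtries > 100)
		maxtries = 100;
	else if (maxtries < 3)
		maxtries = 3;
	if (p->asked++ < maxtries)
		p->expire = now + 1;	/* retry in one second */
	else {
		p->asked = 0;		/* burst over: start the pause */
		p->expire = now + down;
	}
	return (1);			/* a request goes out either way */
}
```

One detail worth noticing when reading the hunk: the request is sent in both branches, so a burst is really arp_maxtries + 1 requests before the pause starts.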