Date:      Sat, 12 May 2012 10:21:05 +0300
From:      Nikolay Denev <ndenev@gmail.com>
To:        freebsd-net@freebsd.org
Subject:   Re: setfib/arpresolve behaviour bug?
Message-ID:  <59FAFD0B-A107-4173-9FA9-BA3349D499E2@gmail.com>
In-Reply-To: <4B587DD0.8020700@icritical.com>
References:  <4B587DD0.8020700@icritical.com>


On Jan 21, 2010, at 6:16 PM, Matt Burke wrote:

> Box is running 8.0-RELEASE-p2 cvsupped two days ago.
> 
> NICs are em bonded with lagg failover and running a few vlan interfaces.
> 
> net.my_fibnum: 0
> net.add_addr_allfibs: 1
> net.fibs: 4
> 
> This is reproducible, but with the lack of (accessible?) documentation on
> multiple routing tables, I don't know if this is intended behaviour or a bug.
> 
> It seems processes using a non-default fib cannot perform arp lookups
> unless the fib 0 has a routing table entry for the attached network:
> 
> [root@host ~]# ifconfig vlan11 a.a.a.92/27
> [root@host ~]# route delete -net a.a.a.64/27
> delete net a.a.a.64
> [root@host ~]# setfib 1 ping a.a.a.65
> PING a.a.a.65 (a.a.a.65): 56 data bytes
> ping: sendto: Invalid argument
> ^C
> --- a.a.a.65 ping statistics ---
> 1 packets transmitted, 0 packets received, 100.0% packet loss
> [root@host ~]# dmesg |tail -1
> arpresolve: can't allocate llinfo for a.a.a.65
> 
> 
> Putting the entry into the arp cache before removing the route results in
> success:
> 
> [root@host ~]# ifconfig vlan11 a.a.a.92/27
> [root@host ~]# setfib 1 ping a.a.a.65
> PING a.a.a.65 (a.a.a.65): 56 data bytes
> 64 bytes from a.a.a.65: icmp_seq=0 ttl=255 time=1.437 ms
> ^C
> --- a.a.a.65 ping statistics ---
> 1 packets transmitted, 1 packets received, 0.0% packet loss
> round-trip min/avg/max/stddev = 1.437/1.437/1.437/0.000 ms
> [root@host ~]# route delete -net a.a.a.64/27
> delete net a.a.a.64
> [root@host ~]# setfib 1 ping a.a.a.65
> PING a.a.a.65 (a.a.a.65): 56 data bytes
> 64 bytes from a.a.a.65: icmp_seq=0 ttl=255 time=0.762 ms
> ^C
> --- a.a.a.65 ping statistics ---
> 1 packets transmitted, 1 packets received, 0.0% packet loss
> round-trip min/avg/max/stddev = 0.762/0.762/0.762/0.000 ms
> 
> 
> and deleting it again results in failure:
> 
> [root@host ~]# arp -an
> ? (a.a.a.92) at 00:11:27:00:d7:c4 on vlan11 permanent [vlan]
> ? (a.a.a.65) at 00:1a:e4:00:60:bf on vlan11 [vlan]
> ...
> [root@host ~]# arp -d a.a.a.65
> delete: cannot locate a.a.a.65
> [root@host ~]# setfib 1 arp -d a.a.a.65
> a.a.a.65 (a.a.a.65) deleted
> [root@host ~]# setfib 1 ping -c1 a.a.a.65
> PING a.a.a.65 (a.a.a.65): 56 data bytes
> ping: sendto: Invalid argument
> ^C
> --- a.a.a.65 ping statistics ---
> 1 packets transmitted, 0 packets received, 100.0% packet loss
> 
> 
> This behaviour seems a little inconsistent, with fib 1 requesting arp
> lookups, fib 0 performing and displaying them, but fib 1 needing to delete
> them...
> 
> 
> 

I've encountered exactly the same problem today.

I have a machine with public addresses and an interface with a private address for out-of-band management, and I wanted to use
a separate FIB for the private interface and its routes.

When I deleted the routes for the private interface from the main FIB, arpresolve stopped working.

Then I patched sys/netinet/in.c as follows:

--- sys/netinet/in.c.orig	2012-05-12 08:57:17.000000000 +0200
+++ sys/netinet/in.c	2012-05-12 08:56:43.000000000 +0200
@@ -1418,21 +1418,21 @@
 
 static int
 in_lltable_rtcheck(struct ifnet *ifp, u_int flags, const struct sockaddr *l3addr)
 {
 	struct rtentry *rt;
 
 	KASSERT(l3addr->sa_family == AF_INET,
 	    ("sin_family %d", l3addr->sa_family));
 
 	/* XXX rtalloc1 should take a const param */
-	rt = rtalloc1(__DECONST(struct sockaddr *, l3addr), 0, 0);
+	rt = rtalloc1_fib(__DECONST(struct sockaddr *, l3addr), 0, 0, ifp->if_fib);
 
 	if (rt == NULL)
 		return (EINVAL);
 
 	/*
 	 * If the gateway for an existing host route matches the target L3
 	 * address, which is a special route inserted by some implementation
 	 * such as MANET, and the interface is of the correct type, then
 	 * allow for ARP to proceed.
 	 */


And this seems to fix the issue.
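
To illustrate why the lookup table matters, here is a minimal user-space sketch. It is a toy model, not kernel code: struct route_model, struct ifnet_model, route_add(), route_delete(), rtcheck_default_fib() and rtcheck_if_fib() are all hypothetical names made up for this example, and the 10.0.0.64/27 network in the note below is just a placeholder. The first check always consults FIB 0, which is roughly what the unpatched rtalloc1() call does; the second consults the FIB recorded on the interface, as the patched rtalloc1_fib() call does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_FIBS   4   /* mirrors net.fibs: 4 from the quoted report */
#define MAX_ROUTES 8   /* arbitrary capacity for this toy model */

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct route_model { uint32_t net, mask; int in_use; };
struct ifnet_model { int if_fib; };

static struct route_model fibs[NUM_FIBS][MAX_ROUTES];

/* Add a network route to one FIB; returns 0, or ENOSPC if that table is full. */
int
route_add(int fib, uint32_t net, uint32_t mask)
{
	assert(fib >= 0 && fib < NUM_FIBS);
	for (size_t i = 0; i < MAX_ROUTES; i++) {
		if (!fibs[fib][i].in_use) {
			fibs[fib][i] = (struct route_model){ net & mask, mask, 1 };
			return (0);
		}
	}
	return (ENOSPC);
}

/* Delete a network route from one FIB; returns 0, or ESRCH if it is absent. */
int
route_delete(int fib, uint32_t net, uint32_t mask)
{
	assert(fib >= 0 && fib < NUM_FIBS);
	for (size_t i = 0; i < MAX_ROUTES; i++) {
		struct route_model *r = &fibs[fib][i];
		if (r->in_use && r->net == (net & mask) && r->mask == mask) {
			r->in_use = 0;
			return (0);
		}
	}
	return (ESRCH);
}

/* Return 1 if any route in the given FIB covers addr (no longest-prefix logic). */
static int
route_exists(int fib, uint32_t addr)
{
	assert(fib >= 0 && fib < NUM_FIBS);
	for (size_t i = 0; i < MAX_ROUTES; i++) {
		const struct route_model *r = &fibs[fib][i];
		if (r->in_use && (addr & r->mask) == r->net)
			return (1);
	}
	return (0);
}

/* Models the unpatched check: always looks in FIB 0, ignoring the interface. */
int
rtcheck_default_fib(const struct ifnet_model *ifp, uint32_t addr)
{
	(void)ifp;
	return (route_exists(0, addr) ? 0 : EINVAL);
}

/* Models the patched check: looks in the FIB assigned to the interface. */
int
rtcheck_if_fib(const struct ifnet_model *ifp, uint32_t addr)
{
	return (route_exists(ifp->if_fib, addr) ? 0 : EINVAL);
}
```

In this model, if 10.0.0.64/27 is added to FIBs 0 and 1 and then deleted from FIB 0, rtcheck_default_fib() returns EINVAL for 10.0.0.65 on an interface whose if_fib is 1, while rtcheck_if_fib() still returns 0. The EINVAL case corresponds to the "sendto: Invalid argument" error in the quoted report.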

Now that the multi-FIB code is in GENERIC, this (or a similar fix) should probably be committed.

P.S.: I also wonder why the loopback route for an interface address is installed explicitly in the default FIB as well.





