Date:      Wed, 9 Jun 2010 14:17:43 -0400
From:      Nick Rogers <ncrogers@gmail.com>
To:        freebsd-hackers@freebsd.org
Subject:   arp(8) performance w/ many aliases assigned to an interface
Message-ID:  <AANLkTikLZCREKNUdon_kRHtzvPkk-XbbXF9ghUuBjoGw@mail.gmail.com>

I have an 8.0-RELEASE system with 4000 "permanent" ARP entries due to having
a network interface (em(4)) configured with 4000 aliases. The "arp -na"
command takes what I consider to be an extremely long time to finish (up to
30s on an otherwise unloaded system). I am able to replicate this in a test
environment by installing 8.0-RELEASE-amd64 on a VMWare VM w/ 1GB of RAM and
a 2GHz CPU. The 4000 aliases/entries is arbitrary, but nicely illustrates
the performance problem.

The performance is much worse on a real, loaded system. I realize that 4k
aliases on an interface is unusual, but I have been using this configuration
effectively in my network to keep each of my end users on his/her
own broadcast domain. The box is a router and I allocate addresses to each
user and put each on his/her own subnet with a netmask of /30. If you would
like more info on this I can provide it, but it has worked effectively in
FreeBSD 6.0-7.2. The slow performance of "arp -na" is an issue for me
because I have a web/CGI tool that runs various reports, many of them
relying on acquiring the current "ARP table", and the performance of arp(8)
makes the web interface extremely slow.

I believe the problem was introduced between 7.2 and 8.0, when, as far as I
understand, parts of the ARP subsystem were improved. In 7.2, the aliases
configured on an interface were not considered ARP entries (at least
according to arp(8)), but as of 8.0 they are marked as "permanent" ARP
entries and displayed by arp(8), which seems to contribute to the performance
problem.

I ran the following Perl script to set up my test system. This script was run
after installing 8.0-RELEASE and adding the bash, perl, and p5-NetAddr-IP
packages via pkg_add -r.

#!/usr/bin/perl

use strict;
use diagnostics;

use NetAddr::IP;

my $interface = 'em1';
my $cidr        = '10.0.0.1/18';

# configure the interface with 4000 or so aliases
foreach my $na (@{NetAddr::IP->new($cidr)->splitref(30)}) {
    my $ip        = $na->addr();
    my $mask  = $na->mask();
    my $bcast  = $na->broadcast()->addr();

    my $cmd = "ifconfig $interface inet alias $ip netmask $mask broadcast $bcast";
    print STDERR "$cmd\n";
    system($cmd);
}
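
For reference, a /18 block splits into 2^(30-18) = 4096 subnets of size /30, which is where the "4000 or so" aliases come from. The same arithmetic can be sketched in C; the helper names below are illustrative only and are not part of NetAddr::IP or any FreeBSD tool.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative helpers (hypothetical names, not from NetAddr::IP). */

/* Number of /inner subnets inside one /outer block.
 * Assumes outer <= inner and inner - outer < 32. */
static uint32_t
subnet_count(unsigned outer, unsigned inner)
{
    return (uint32_t)1 << (inner - outer);
}

struct subnet30 {
    uint32_t network;    /* first address of the /30, host byte order */
    uint32_t broadcast;  /* last address of the /30, host byte order */
};

/* The n-th /30 (counting from 0) starting at a /30-aligned base address. */
static struct subnet30
nth_subnet30(uint32_t base, uint32_t n)
{
    struct subnet30 s;

    s.network = base + 4 * n;      /* each /30 spans 4 addresses */
    s.broadcast = s.network + 3;
    return s;
}

/* Render a host-byte-order IPv4 address as dotted quad. */
static void
format_ipv4(uint32_t addr, char *buf, size_t len)
{
    snprintf(buf, len, "%u.%u.%u.%u",
        (unsigned)((addr >> 24) & 0xff), (unsigned)((addr >> 16) & 0xff),
        (unsigned)((addr >> 8) & 0xff), (unsigned)(addr & 0xff));
}
```

With base 10.0.0.0 (0x0A000000), subnet number 1 comes out as 10.0.0.4 with broadcast 10.0.0.7, matching the second alias in the ifconfig output below.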


The results are as follows:

[root@ ~]# uname -a
FreeBSD .localdomain 8.0-RELEASE FreeBSD 8.0-RELEASE #0: Sat Nov 21 15:02:08
UTC 2009     root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64
[root@ ~]# ifconfig -a | wc -l
    4113
[root@ ~]# ifconfig -a | head -15
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
ether 00:0c:29:65:4d:3e
inet 172.16.16.244 netmask 0xffffff00 broadcast 172.16.16.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
ether 00:0c:29:65:4d:48
inet 10.0.0.0 netmask 0xfffffffc broadcast 10.0.0.3
inet 10.0.0.4 netmask 0xfffffffc broadcast 10.0.0.7
inet 10.0.0.8 netmask 0xfffffffc broadcast 10.0.0.11
inet 10.0.0.12 netmask 0xfffffffc broadcast 10.0.0.15
inet 10.0.0.16 netmask 0xfffffffc broadcast 10.0.0.19
inet 10.0.0.20 netmask 0xfffffffc broadcast 10.0.0.23
[root@ ~]# time ifconfig -a > /dev/null

real 0m0.032s
user 0m0.023s
sys 0m0.008s


[root@ ~]# arp -na | wc -l
    4100
[root@ ~]# arp -na | tail -15
? (10.0.5.80) at 00:0c:29:65:4d:48 on em1 permanent [ethernet]
? (10.0.5.48) at 00:0c:29:65:4d:48 on em1 permanent [ethernet]
? (10.0.5.16) at 00:0c:29:65:4d:48 on em1 permanent [ethernet]
? (10.0.1.244) at 00:0c:29:65:4d:48 on em1 permanent [ethernet]
? (10.0.1.212) at 00:0c:29:65:4d:48 on em1 permanent [ethernet]
? (10.0.1.180) at 00:0c:29:65:4d:48 on em1 permanent [ethernet]
? (10.0.1.148) at 00:0c:29:65:4d:48 on em1 permanent [ethernet]
? (10.0.1.116) at 00:0c:29:65:4d:48 on em1 permanent [ethernet]
? (10.0.1.84) at 00:0c:29:65:4d:48 on em1 permanent [ethernet]
? (10.0.1.52) at 00:0c:29:65:4d:48 on em1 permanent [ethernet]
? (10.0.1.20) at 00:0c:29:65:4d:48 on em1 permanent [ethernet]
? (172.16.16.1) at 00:50:56:c0:00:08 on em0 [ethernet]
? (172.16.16.2) at 00:50:56:ea:ea:1a on em0 [ethernet]
? (172.16.16.254) at 00:50:56:f2:75:00 on em0 [ethernet]
? (172.16.16.244) at 00:0c:29:65:4d:3e on em0 permanent [ethernet]
[root@ ~]# uptime
 7:28PM  up 42 mins, 2 users, load averages: 0.00, 0.00, 0.00
[root@ ~]# time arp -na > /dev/null

real 0m12.761s
user 0m2.959s
sys 0m9.753s
[root@ ~]#


Notice that "arp -na" takes about 13s to execute even though there is no
other load. The slowdown can be a few orders of magnitude worse on a
loaded machine in a production environment, and it seems to scale up linearly
as more aliases are added to the interface (that is, as more permanent ARP
entries are created).


I tried the same scenario on 8.1-BETA1 and it still takes a very long time
for arp(8) to complete. I was able to isolate the performance bottleneck to
a small piece of the arp(8) code. It seems that looking up the interface for
an ARP entry is a very heavy operation when that entry corresponds to an
alias assigned to the interface. Permanent ARP entries that do not
correspond to an interface alias do not seem to make arp(8) stall on
the interface lookup.

The following commands and code diff illustrate how arp(8) can be modified
to run a lot faster in this scenario, but obviously the associated interface
is no longer printed for each entry.

[root@ /usr/src/usr.sbin/arp]# uname -a
FreeBSD .localdomain 8.1-BETA1 FreeBSD 8.1-BETA1 #0: Thu May 27 15:03:30 UTC
2010     root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64
[root@ /usr/src/usr.sbin/arp]# time /usr/sbin/arp -na | wc -l
    4100

real 0m14.903s
user 0m3.133s
sys 0m11.519s
[root@ /usr/src/usr.sbin/arp]# pwd
/usr/src/usr.sbin/arp
[root@ /usr/src/usr.sbin/arp]# !diff
diff -ruN arp.c.orig arp.c
--- arp.c.orig 2010-06-05 18:25:24.000000000 +0000
+++ arp.c 2010-06-05 18:28:19.000000000 +0000
@@ -562,7 +562,7 @@
  const char *host;
  struct hostent *hp;
  struct iso88025_sockaddr_dl_data *trld;
- char ifname[IF_NAMESIZE];
+ //char ifname[IF_NAMESIZE];
  int seg;

  if (nflag == 0)
@@ -591,8 +591,8 @@
  }
  } else
  printf("(incomplete)");
- if (if_indextoname(sdl->sdl_index, ifname) != NULL)
- printf(" on %s", ifname);
+ //if (if_indextoname(sdl->sdl_index, ifname) != NULL)
+ //printf(" on %s", ifname);
  if (rtm->rtm_rmx.rmx_expire == 0)
  printf(" permanent");
  else {
[root@ /usr/src/usr.sbin/arp]# make clean && make
rm -f arp arp.o arp.4.gz arp.8.gz arp.4.cat.gz arp.8.cat.gz
Warning: Object directory not changed from original /usr/src/usr.sbin/arp
cc -O2 -pipe  -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall
-Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes
-Wmissing-prototypes -Wpointer-arith -Wno-uninitialized -Wno-pointer-sign -c
arp.c
cc -O2 -pipe  -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall
-Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes
-Wmissing-prototypes -Wpointer-arith -Wno-uninitialized -Wno-pointer-sign
 -o arp arp.o
gzip -cn arp.4 > arp.4.gz
gzip -cn arp.8 > arp.8.gz
[root@ /usr/src/usr.sbin/arp]# time ./arp -na | wc -l
    4099

real 0m0.036s
user 0m0.015s
sys 0m0.021s
[root@ /usr/src/usr.sbin/arp]#

Notice that 0.036s without the interface lookup is a heck of a lot faster
than 14.903s when doing the interface lookup.

Is there something that can be done to speed up the call to if_indextoname(),
or would it be worthwhile for me to submit a patch that adds the ability to
skip the interface lookup as an arp(8) option?
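
A third possibility, short of dropping the interface name, would be to cache the result of if_indextoname() per interface index, so the expensive lookup happens once per interface instead of once per ARP entry. The sketch below is only an illustration of that idea: cached_indextoname(), the ifcache table, and the ifcache_misses counter are hypothetical names, not existing arp(8) or libc code. It also rests on an assumption worth verifying against the libc source, namely that if_indextoname() is built on getifaddrs() and therefore walks every configured address on each call.

```c
#include <net/if.h>
#include <stddef.h>

/* Hypothetical helper, not part of arp(8) or libc. */
#define IFCACHE_SLOTS 8

struct ifcache_slot {
    unsigned int index;             /* interface index; 0 means unused */
    int          found;             /* did if_indextoname() succeed? */
    char         name[IF_NAMESIZE]; /* resolved name when found != 0 */
};

static struct ifcache_slot ifcache[IFCACHE_SLOTS];
static unsigned long ifcache_misses;  /* count of real lookups performed */

/*
 * Return the interface name for an index, or NULL if there is none.
 * Only a cache miss calls if_indextoname(); repeated queries for the
 * same index are answered from the slot. Cached names can go stale if
 * interfaces are renamed or destroyed, which is acceptable for a
 * short-lived process such as one "arp -na" run.
 */
static const char *
cached_indextoname(unsigned int index)
{
    struct ifcache_slot *slot = &ifcache[index % IFCACHE_SLOTS];

    if (slot->index != index) {
        ifcache_misses++;
        slot->index = index;
        slot->found = (if_indextoname(index, slot->name) != NULL);
    }
    return slot->found ? slot->name : NULL;
}
```

In the diff above, the call if_indextoname(sdl->sdl_index, ifname) would then become something like cached_indextoname(sdl->sdl_index), with the " on %s" output kept. With 4000 entries on em1 this would mean one real lookup for em1 rather than 4000.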


