From: Vladislav Prodan
Date: Thu, 10 Apr 2014 00:14:10 +0300
Subject: Some gruesome moments with performance of FreeBSD at over 20K interfaces
To: stable@freebsd.org
Cc: hackers@freebsd.org, net@freebsd.org
Message-Id: <1397077963.756961709.gspkmzvd@frv35.fwdcdn.com>
List-Id: Production branch of FreeBSD source code

Dear Colleagues!

I had a task, using FreeBSD 10.0-STABLE:

1) Receive 20-30 Q-in-Q VLANs (IEEE 802.1ad), each containing 2k-4k VLANs (IEEE 802.1Q), for a total of ~60K VLANs.
2) Assign IPv4 and IPv6 addresses to every VLAN interface, define routes to the IPv4 and IPv6 addresses on the other side of each VLAN (IP unnumbered), and also route an IPv6 /64 network via the IPv6 address on the other side of the VLAN.
3) Route traffic from the world to all of these IPv4/IPv6 addresses and IPv6 networks inside the ~60K VLANs.

For the first task I have no alternative to Netgraph.
I noticed incorrect behavior of ngctl(8) after adding the 560th VLAN (bin/187835).

After that, adding 4k, 8k, and 12k VLANs was painfully slow:

10 minutes for the first 4k VLANs
18 minutes for the first 5k VLANs
28 minutes for the first 6k VLANs
52 minutes for the first 8k VLANs

Then I added another 4k VLANs:

20 minutes - 9500 VLANs
33 minutes - 10500 VLANs
58 minutes - 12k VLANs

In total, adding 4k, 8k, and 12k VLANs took 10, 52, and 110 minutes respectively.
It is hard to imagine how much time would be needed to add ~60K VLANs :(

Stopping the devd, bsnmpd, and ntpd services sped the process up a little, but it exposed other problems and limitations.
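
A rough way to quantify the slowdown is to compute the marginal cost of each batch from the timings above. The Python sketch below does that and then extrapolates. It assumes that the second series (20/33/58 minutes) was measured from the 8k mark, which is consistent with the 110-minute total for 12k. The quadratic model t = a * n^2 is only an assumption about how the trend continues, not a measurement.

```python
# Cumulative (vlan count, minutes elapsed) taken from the timings above.
# Assumption: the second series is offset by the 52 minutes spent on the first 8k.
SAMPLES = [
    (4000, 10), (5000, 18), (6000, 28), (8000, 52),
    (9500, 52 + 20), (10500, 52 + 33), (12000, 52 + 58),
]


def marginal_minutes_per_1000(samples):
    """Minutes spent per 1000 vlans within each consecutive interval."""
    rates = []
    prev_n, prev_t = 0, 0
    for n, t in samples:
        rates.append((n, (t - prev_t) * 1000 / (n - prev_n)))
        prev_n, prev_t = n, t
    return rates


def quadratic_estimate(samples, target):
    """Extrapolate with t = a * n**2; a is a least-squares fit through the origin."""
    a = sum(t * n**2 for n, t in samples) / sum(n**4 for n, _ in samples)
    return a * target**2


if __name__ == "__main__":
    for n, rate in marginal_minutes_per_1000(SAMPLES):
        print(f"up to {n:>5} vlans: {rate:5.1f} min per 1000")
    hours = quadratic_estimate(SAMPLES, 60000) / 60
    print(f"rough quadratic estimate for 60000 vlans: {hours:.0f} hours")
```

The per-1000 cost rises from 2.5 minutes for the first 4k to well over 10 minutes beyond 8k, so the total grows much faster than linearly. If the quadratic assumption held, 60K VLANs would take on the order of two days.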
For example:

a) ntpd refuses to start with 12K interfaces:

ntpd[2195]: Too many sockets in use, FD_SETSIZE 16384 exceeded

Note that in /usr/src/sys/sys/select.h and /usr/include/sys/select.h the value of FD_SETSIZE is only 1024U.

b) bsnmpd starts with 12K interfaces, but immediately loads the CPU to 80-100%:

last pid: 64011;  load averages:  1.00,  0.97,  0.90    up 0+05:25:39  21:26:36
58 processes:  3 running, 54 sleeping, 1 waiting
CPU: 68.2% user,  0.0% nice, 30.6% system,  1.2% interrupt,  0.0% idle
Mem: 125M Active, 66M Inact, 435M Wired, 200K Cache, 525M Free
ARC: 66M Total, 28M MFU, 36M MRU, 16K Anon, 614K Header, 2035K Other
Swap: 1024M Total, 1024M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME    WCPU COMMAND
63863 root        1  96    0   136M   119M RUN     35:31  79.98% bsnmpd
...

c) The field widths in the output of netstat(1) (netstat -inW) are insufficient (bin/188153).

d) When an interface is given to netstat, it is impossible to tell which IPv4/IPv6 networks are shown, since the fields are cut off:

# netstat -I ngeth123.223 -nW
Name     Mtu Network        Address            Ipkts Ierrs Idrop Opkts Oerrs Coll
ngeth12 1500                08:00:27:cd:9b:8e      0     0     0     1     5    0
ngeth12    - 172.18.206.13  172.18.206.139         0     -     -     0     -    -
ngeth12    - fe80::a00:27f  fe80::a00:27ff:fe      0     -     -     1     -    -
ngeth12    - 2001:570:28:1  2001:570:28:140::      0     -     -     0     -    -

e) The arp command is very slow:

# ngctl list | grep ngeth | wc -l
12003
# ifconfig -a | egrep -e 'inet ' | wc -l
12007
# time /usr/sbin/arp -na > /dev/null
150.661u 551.002s 11:53.71 98.3% 20+172k 1+0io 0pf+0w

More info at
http://freebsd.1045724.n5.nabble.com/arp-8-performance-use-if-nameindex-instead-of-if-indextoname-td5898205.html

After applying the patch, the speed became acceptable:

# time /usr/sbin/arp -na > /dev/null
0.114u 0.090s 0:00.14 142.8% 20+170k 0+0io 0pf+0w

I suspect that the throughput of the standard network stack will be too low for the third task, routing ~60K VLANs.
I have no idea how to use netmap(4) in this situation :(

Please help me accomplish this task.

P.S.
A colleague who uses Linux is working on the same task and bragging:
on Debian, in a test (kernel 3.13), 80K VLANs came up in 20 minutes, using 3 GB of RAM.
Deleting those VLANs also took 20 minutes.

-- 
Vladislav V. Prodan
System & Network Administrator
http://support.od.ua
+380 67 4584408, +380 99 4060508
VVP88-RIPE
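
On item e) above: the title of the linked thread proposes using if_nameindex() instead of if_indextoname() in arp(8). The Python sketch below illustrates that general idea only, using the socket module's wrappers for both calls: names are resolved from one table fetched up front instead of through one library call per entry. It is not the actual patch, the helper names are hypothetical, and how expensive the per-call variant is depends on the libc implementation.

```python
import socket


def names_per_call(indexes, lookup=socket.if_indextoname):
    """Resolve each interface index with a separate library call."""
    return [lookup(i) for i in indexes]


def names_from_table(indexes, table=None):
    """Resolve indexes from a single snapshot of the interface table."""
    if table is None:
        table = socket.if_nameindex()  # list of (index, name) pairs
    by_index = dict(table)
    # Unknown indexes map to None instead of raising.
    return [by_index.get(i) for i in indexes]


if __name__ == "__main__":
    indexes = [i for i, _ in socket.if_nameindex()]
    print(len(names_from_table(indexes)), "interfaces resolved from one table")
```

With N entries to print, the first helper makes N library calls, while the second makes one call and N dictionary lookups.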