From owner-freebsd-net@freebsd.org Tue Aug 27 20:59:33 2019
Subject: Re: finding optimal ipfw strategy
To: Eugene Grosbein, "Andrey V. Elsukov", freebsd-net@freebsd.org
From: Victor Gamov
Organization: OTCnet
Date: Tue, 27 Aug 2019 23:59:29 +0300
On 27/08/2019 23:30, Eugene Grosbein wrote:
> 28.08.2019 2:20, Victor Gamov wrote:
>
>> sysctl.conf
>> =====
>> net.link.ether.ipfw=1
>> net.link.bridge.ipfw=1
>> net.link.bridge.ipfw_arp=1
>> net.link.bridge.pfil_member=1
>>
>> net.inet.ip.fw.verbose_limit=100
>> net.inet.ip.fw.verbose=1
>> =====
>
> You should avoid passing the same packet multiple times through the ruleset.
> Fewer checks, better performance.

Yes, I feel it :-)

> Do you really use ipfw filtering based on layer2 parameters like MAC addresses?
> If not, you should disable net.link.ether.ipfw. If yes, you should use the "layer2" keyword
> explicitly in rules that filter by ethernet headers, place these rules above the others,
> and use "allow ip from any to any layer2" after L2 filtering is done,
> so L2 packets do not go through the other rules an extra time.
>
> Do you really need to filter each bridged L3 packet twice? Once as "out xmit $bridge"
> and once as "out xmit $bridge_member"? If not, you should disable
> net.link.bridge.ipfw and keep net.link.bridge.pfil_member=1 only.

Packets must be filtered on input VLANs (bridge members) and on output VLANs. So net.link.bridge.pfil_member=1.

> Perhaps you are ruining the performance with such settings, making the same work happen 3 times without real need.
>
> Do you really need to filter ARP? Disable net.link.bridge.ipfw_arp if not.

I need to drop ARP moving via the bridge. As I use many VLANs, all VLANs must be isolated and only multicast must be bridged from one VLAN to the others.
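Eugene's "layer2 rules first" advice could be sketched roughly like this (rule numbers are illustrative, and bridge1202 is taken from the rule quoted later in this thread; not verified on this system):

```shell
# Sketch: do all layer2 checks first, then pass remaining layer2 frames
# so they never traverse the L3 part of the ruleset a second time.

# Drop ARP (ethertype 0x0806) crossing the bridge, matched at layer2
ipfw add 100 deny ip from any to any mac-type 0x0806 layer2 via bridge1202

# Once layer2 filtering is done, accept all other layer2 frames here,
# as Eugene suggests, so they skip the L3 rules below
ipfw add 200 allow ip from any to any layer2

# ... L3 rules (numbered 300 and up) follow and only ever see each
# packet once, on its L3 pass
```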
To block ARP the following rule is used:

deny ip from any to any mac-type 0x0806 via bridge1202

If I understand correctly, I need net.link.bridge.ipfw_arp and net.link.bridge.ipfw to do this. I'm not sure about net.link.ether.ipfw.

>> `sysctl net.isr`
>> =====
>> net.isr.numthreads: 1
>> net.isr.maxprot: 16
>> net.isr.defaultqlimit: 256
>> net.isr.maxqlimit: 10240
>> net.isr.bindthreads: 0
>> net.isr.maxthreads: 1
>> net.isr.dispatch: direct
>> =====
>>
>> I don't know about the internals, but I think high interrupt load is bad, perhaps because the NIC does not support per-CPU queues, for example.
>
> All decent igb(4) NICs support at least 8 hardware input queues unless disabled by the driver/kernel.
> However, the net.isr settings are not about such queues.
>
> A high interrupt count is definitely better than the NIC chip dropping input frames
> due to overflow of its internal buffers, just because the CPU was not notified that it's time to get traffic
> out of those buffers. The driver tries not to overload the CPU with interrupts, and that's fine,
> but the default limit of 8000 is not adequate for modern CPUs and has not been for many years.
> Raise the limit to 32000.

I see. Thanks! I'll tune net.isr ASAP.

>>> If not, you should try something like this. For loader.conf:
>>
>> Sorry, it's a production system and I can reboot it only in the middle of October.
>>
>>> # substitute the total number of CPU cores in the system here
>>> net.isr.maxthreads=4
>>> # EOF
>>
>> Is it OK for multicast? It's UDP traffic which must stay ordered. I read that 'maxthreads=1' is used to keep TCP traffic ordered.
>
> It's the uplink's job to feed your bridge with ordered UDP flows. If you use the igb(4) driver,
> the FreeBSD kernel will keep flows ordered automatically. There is no place in the code
> where they could be reordered unless you use lagg(4) without LACP.

Thanks again. I'll set maxthreads=4 at the next reboot.
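Pulling the reboot-time tuning from this thread together, a /boot/loader.conf fragment might look like this (a sketch only: the 32000 interrupt limit and maxthreads=4 are the values suggested above, and hw.igb.max_interrupt_rate is assumed to be the relevant tunable for the stock igb(4) driver on this FreeBSD version):

```shell
# /boot/loader.conf fragment -- takes effect at next reboot

# raise the igb(4) interrupt rate limit from the old 8000 default,
# per Eugene's advice, so frames are not dropped in the NIC buffers
hw.igb.max_interrupt_rate=32000

# one netisr thread per CPU core (4 in this thread's example)
net.isr.maxthreads=4
```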
>>> And if you haven't already seen it, you may find my blog post useful
>>> (in Russian): https://dadv.livejournal.com/139170.html
>>> It's a bit old but can still shed some light.
>>
>> Yes, I have read it already :-) Also some Calomel articles. I'll try to tune the system at the next reboot.
>> The main questions for me now are "how correct is this architecture" and "how much traffic is it possible to process".
>
> You have read the numbers from my posts. ipfw+dummynet+PPPoE+routing+LACP+vlan tagging takes much more CPU power
> than just ipfw+bridging, and my system still processed much more traffic.
>
> Make sure you don't pass the same packets multiple times through the ipfw rules.
> ipfw has its own per-rule counters and you should use them to sum up octets carefully
> and compare with the numbers shown by netstat or systat (they both have the same in-kernel source)
> to verify whether or not packets go through ipfw extra times.

It's not too easy, but I'll try to build a test system and check on it. If 'bridge + drop on outgoing' is not the bottleneck, I'll tune the system and use this approach while it's possible.

--
CU,
Victor Gamov