Message-ID: <51D13D6D.7030603@FreeBSD.org>
Date: Mon, 01 Jul 2013 12:27:25 +0400
From: "Alexander V. Chernikov"
To: Navdeep Parhar
Cc: net@freebsd.org
Subject: Re: cxgbetool & hw filtering issues
References: <51D03FCE.1060102@FreeBSD.org> <51D089D9.6080901@FreeBSD.org>
In-Reply-To: <51D089D9.6080901@FreeBSD.org>
List-Id: Networking and TCP/IP with FreeBSD

On 30.06.2013 23:41, Navdeep Parhar wrote:
> On 06/30/13 07:25, Alexander V. Chernikov wrote:
>> Hello list!
>>
>> While experimenting with Chelsio T440-CR (cxgbe) internal firewall, I'm
>> getting some kind of unexpected results:
> One bit of general advice to begin with: add "hitcnts 1" to all your
> filter rules and then you can see how many incoming packets hit that
> filter in the output of "cxgbetool t4nex0 filter list". I really should
> make hitcnts=1 the default in the driver.
Thanks for the hint.
>
>> filtering 'type ipv4 action drop' permits IPv4 TCP traffic with bad
>> checksum.
> It may be that a bad checksum makes it an invalid IPv4 packet to the
> chip and so it doesn't hit the "type ipv4" rule. There is an entirely
> separate knob available to have the chip drop bad packets if you don't
> want to see them. The default is to let them through so that users can
> examine them with tcpdump etc.
That's OK, Intel also has such a tunable (the IXGBE_FCTRL_SBP flag).
How can I tune this?
>
>> filtering 'type IPv6 action drop' permits IPv6 traffic to multicast
>> addresses (MLDv2, etc..)
> The DMAC is an L2 multicast address? Try "proto 58 hitcnts 1 action
> drop" to get these ICMP6 packets.
>
>> filtering 'ethtype 34525 action drop' (drop all IPv6) results in
>> 'CHELSIO_T4_SET_FILTER: Argument list too long' despite to what is said
>> in budget table from cxgbetool.8
> This _would_ have gotten everything with ethertype ipv6 but the default
> filter mode doesn't have ethtype enabled, which is why it's complaining:
> # cxgbetool t4nex0 filter mode
> ipv4 ipv6 sip dip sport dport matchtype proto ivlan iport fcoe
Well,
./cxgbetool t4nex0 filter mode ipv4 ipv6 sip dip sport dport matchtype proto vlan iport
cxgbetool: CHELSIO_T4_SET_FILTER_MODE: Operation not supported

(Probably because -t4_set_filter_mode() is still under "#ifdef notyet" in
t4_main.c) :)
>
>> filtering 'matchtype 4 action drop' or similar (4,5,4:0,4:4, 5:0, 5:5)
>> does not match anything despite some traffic definitely falls into that
>> conditions.
>> filtering 'action drop' and 'iport X action drop' filters IPv4 traffic
>> only.
> Strange. I use "iport X action drop hitcnts 1" as a packet black hole
> all the time. Were these the only filters when you tried them? Are you
> sure your packets didn't hit some other rule and were delivered as a
> result of that?
> Check the order in "cxgbetool t4nex0 filter list"

TESTING COUNTER:
# ipfw show 200
00200 432677 57910898 deny ip from any to any via cxgbe3

# while true; do sleep 1; ipfw show 200 ; ipfw -q zero 200 ;done
[## EMPTY # ./cxgbetool t4nex0 filter list ##]
00200 281878 80450397 deny ip from any to any via cxgbe3
00200 281451 80296577 deny ip from any to any via cxgbe3
00200 299594 85351560 deny ip from any to any via cxgbe3
[##
# ./cxgbetool t4nex0 filter 0 iport 3 hitcnts 1 action drop
# ./cxgbetool t4nex0 filter list
Idx  Hits     FCoE  Port  vld:VLAN       Prot   MPS  Frag  DIP                SIP                DPORT      SPORT      Action
  0  1841792  0/0   3/7   0:0000/0:0000  00/00  0/0  0/0   00000000/00000000  00000000/00000000  0000/0000  0000/0000  Drop
##]
00200 115487 15451587 deny ip from any to any via cxgbe3
00200 115148 15414229 deny ip from any to any via cxgbe3
00200 116008 15526682 deny ip from any to any via cxgbe3

[ ## the same, IPv4 TCP with bad csum packets, and IPv6 traffic with L2 multicast macs:
# tcpdump -i cxgbe3 -lns0 -c1 ip
tcpdump: WARNING: cxgbe3: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on cxgbe3, link-type EN10MB (Ethernet), capture size 65535 bytes
12:09:42.249299 IP 95.108.170.36.39215 > 93.158.158.93.80: Flags [P.], seq 2064108148:2064108546, ack 4252238260, win 1040, options [nop,nop,TS val 538195909 ecr 1194268184], length 398

12:12 [0] test25# tcpdump -i cxgbe3 -lnes0 -c10 ip6
tcpdump: WARNING: cxgbe3: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on cxgbe3, link-type EN10MB (Ethernet), capture size 65535 bytes
12:12:16.728912 80:49:71:11:8d:a2 > 33:33:00:00:00:fb, ethertype IPv6 (0x86dd), length 324: fe80::8249:71ff:fe11:8da2.5353 > ff02::fb.5353: 0*- [0q] 2/0/7 PTR zivot-osx._smb._tcp.local., TXT "model=MacBookAir4,2" (262)
12:12:16.728923 00:25:90:0e:00:b8 > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 134: fe80::225:90ff:fe0e:b8 > ff02::1: ICMP6, router advertisement, length 80
12:12:16.728942 5c:26:0a:6e:b4:76 > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 130: fe80::884:a1e8:86ae:57f7 > ff02::16: HBH ICMP6, multicast listener report v2, 3 group record(s), length 68
12:12:16.728968 5c:26:0a:6e:b4:76 > 33:33:ff:0e:00:b8, ethertype IPv6 (0x86dd), length 86: fe80::884:a1e8:86ae:57f7 > ff02::1:ff0e:b8: ICMP6, neighbor solicitation, who has fe80::225:90ff:fe0e:b8, length 32
12:12:16.728971 5c:26:0a:6e:b4:76 > 33:33:ff:0e:00:b8, ethertype IPv6 (0x86dd), length 86: fe80::884:a1e8:86ae:57f7 > ff02::1:ff0e:b8: ICMP6, neighbor solicitation, who has fe80::225:90ff:fe0e:b8, length 32
12:12:16.729011 5c:26:0a:6e:b4:76 > 33:33:ff:0e:00:b8, ethertype IPv6 (0x86dd), length 86: fe80::884:a1e8:86ae:57f7 > ff02::1:ff0e:b8: ICMP6, neighbor solicitation, who has fe80::225:90ff:fe0e:b8, length 32
12:12:16.729011 20:c9:d0:2b:b7:28 > 33:33:00:00:00:fb, ethertype IPv6 (0x86dd), length 321: fe80::22c9:d0ff:fe2b:b728.5353 > ff02::fb.5353: 0*- [0q] 2/0/7 PTR octo-osx._smb._tcp.local., TXT "model=MacBookAir5,2" (259)
12:12:16.729012 20:c9:d0:7c:cb:1d > 33:33:00:00:00:fb, ethertype IPv6 (0x86dd), length 95: fe80::22c9:d0ff:fe7c:cb1d.5353 > ff02::fb.5353: 0 PTR (QM)? _smb._tcp.local. (33)
12:12:16.729021 5c:26:0a:6e:b4:76 > 33:33:ff:4f:dc:69, ethertype IPv6 (0x86dd), length 78: :: > ff02::1:ff4f:dc69: ICMP6, neighbor solicitation, who has 2a02:6b8:0:401:c599:50e2:184f:dc69, length 24
12:12:16.729022 5c:26:0a:6e:b4:76 > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: fe80::884:a1e8:86ae:57f7 > ff02::2: ICMP6, router solicitation, length 16
## ]
>
> Also, are you going by the ifnet rx stats as displayed by netstat etc.?
> Right now the driver fills the ifnet stats directly from hardware
> registers rather than counting the packets that it actually received
> from the chip. The hardware registers include packets that would have
> been delivered to the driver if no filters were present but are dropped
> due to a filter.
I'm counting packets with the "deny ip from any to any via cxgbe3" ipfw
counter, as I specified in the setup scenario :)
>
>> filter 'type ipv6 ...' can be set on (0,4,8,12,...) filter numbers
>> yelling 'CHELSIO_T4_SET_FILTER: Invalid argument' on other numbers.
> Yes, IPv6 filters take 4 tid's (non-IPv6 take 1) and these tid's have to
> start at a naturally aligned boundary. No way around this.
No problem :)
>
>> What can I do to debug further/fix this behavior?
>>
>> Some more questions:
>> Does anybody known how I can get/set total number of HW firewall
>> records? There is such tunable in Linux version.
> I will add a simple sysctl for this. For now you can indirectly figure
> this out from the output of "sysctl -n dev.t4nex.0.misc.tids" -- the
> FTIDs are the filter tids. For example I see 1456 filters on this card:
> trantor:~# sysctl -n dev.t4nex.0.misc.tids
> ATID range: 0-8191, in use: 0
> TID range: 2048-18431, in use: 0
> STID range: 0-511, in use: 0
> FTID range: 512-1967
> HW TID usage: 0 IP users, 0 IPv6 users
> trantor:~# echo $((1967 - 512 + 1))
> 1456
Thanks!
>
>> Is there any way to retrieve _host_ interface statistic (e.g. how much
>> traffic in packets/bytes are thrown to NIC driver)?
> cxgbe(4) doesn't count this stuff itself. Currently it just reads the
> hardware registers once per second and it's done.
Understood. I'll use the hitcnts counters, then.
> Software stats would have to be per queue (and then aggregated from
> time to time).
The counter(9) framework handles this automatically for sysctls.
> I'll wait to see where the PCPU counter work in the kernel goes before
> reworking this part of the driver.
Well, it's actually working, and working great :)
We're using PCPU counters for if_vlan, ipfw and IP stack statistics.
>
> Regards,
> Navdeep
>
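
The two rules of thumb from this thread (the filter count derived from the FTID range, and IPv6 filters having to start at an index that is a multiple of 4) can be scripted. What follows is only a minimal sketch and not part of the original exchange: it assumes the "FTID range: A-B" line format shown in the sysctl output quoted above, and the helper names ftid_count and ipv6_idx_ok are hypothetical. The sample text is fed in from a variable so the snippet runs without a T4 card.

```sh
#!/bin/sh
# Hypothetical helper: read text in the format quoted in this thread for
# "sysctl -n dev.t4nex.0.misc.tids" on stdin and print the number of
# filter tids, i.e. (last FTID - first FTID + 1).
ftid_count() {
    # Split on runs of ':', ' ' and '-', so that for "FTID range: 512-1967"
    # field 3 is the first FTID and field 4 is the last one.
    awk -F'[: -]+' '/^FTID range:/ { print $4 - $3 + 1 }'
}

# Hypothetical helper: succeed if index $1 is usable for an IPv6 filter.
# Per the thread, an IPv6 filter takes 4 tids and must start on a
# naturally aligned boundary (0, 4, 8, 12, ...).
ipv6_idx_ok() {
    [ $(( $1 % 4 )) -eq 0 ]
}

# Sample output copied from the message above (assumed representative).
sample='ATID range: 0-8191, in use: 0
TID range: 2048-18431, in use: 0
STID range: 0-511, in use: 0
FTID range: 512-1967
HW TID usage: 0 IP users, 0 IPv6 users'

printf '%s\n' "$sample" | ftid_count    # prints 1456
```

On a live system the same function would presumably be used as "sysctl -n dev.t4nex.0.misc.tids | ftid_count", provided the output format matches the sample.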