From owner-freebsd-isp@FreeBSD.ORG  Mon Jun 12 12:50:24 2006
Date: Mon, 12 Jun 2006 19:48:50 +0700
To: "Eduardo Meyer" <dudu.meyer@gmail.com>, freebsd-current@freebsd.org
References: <optax2g7jq4fjv08@nuclight.avtf.net>
	<70e8236f0606110836j38f7ca33wa3058eaecf386fb5@mail.gmail.com>
	<optazz26kn17d6mn@nuclight.avtf.net>
	<d3ea75b30606111534q337aa27aj87baa1f20550ac1c@mail.gmail.com>
From: "Vadim Goncharov" <vadim_nuclight@mail.ru>
Organization: AVTF TPU Hostel
Content-Type: text/plain; format=flowed; delsp=yes; charset=koi8-r
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Message-ID: <opta09vodb17d6mn@nuclight.avtf.net>
In-Reply-To: <d3ea75b30606111534q337aa27aj87baa1f20550ac1c@mail.gmail.com>
User-Agent: Opera M2/7.54 (Win32, build 3865)
Cc: freebsd-isp@freebsd.org, freebsd-net@freebsd.org
Subject: Re: [PATCH] ng_tag - new netgraph node,
	please test (L7 filtering possibility)

12.06.06 @ 05:34 Eduardo Meyer wrote:

> I read the messages and man page but did not understand. Maybe it is
> my lack of knowledge regarding netgraph? Well, in man page it seems
> that you looked at ipfw source code (.h in fact) to find out the tag
> number. Can you explain this?

Yes, netgraph has always been a somewhat programmer-oriented system, and
this is especially true of ng_tag, since it tries to be a generalized
interface for manipulating mbuf_tags(9), which are kernel internals. For
simple use, however, you don't need to bother with all those details: just
remember the magic number and where to put it, and using it with ipfw tags
becomes simple.
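
As a rough sketch of "where to put it", the script below prints (by
default it does not run) the commands that connect ipfw to an ng_bpf node
and an ng_tag node, following the example in the ng_tag(4) manual page as
I recall it. The node names (l7bpf, l7tag), the hook number 41, the tag
number 412 and the rule numbers are illustrative assumptions, and the
cookie value should be checked against MTAG_IPFW in netinet/ip_fw.h on
your system.

```sh
#!/bin/sh
# Sketch only: names and numbers below are illustrative assumptions.
MTAG_IPFW=1148380143   # ipfw tag cookie; verify in netinet/ip_fw.h
TAG_ID=412             # arbitrary ipfw tag number chosen for this example

# Print each command instead of executing it, unless RUN=yes is set.
run() {
    if [ "${RUN:-no}" = yes ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

setup_l7_tagging() {
    # ipfw -> ng_bpf (hook "ipfw"), ng_bpf -> ng_tag (hooks "matched"/"th1")
    run ngctl mkpeer ipfw: bpf 41 ipfw
    run ngctl name ipfw:41 l7bpf
    run ngctl mkpeer l7bpf: tag matched th1
    run ngctl name l7bpf:matched l7tag
    # ng_tag: echo everything back on th1 with the ipfw tag attached
    run ngctl msg l7tag: sethookin "{ thisHook=\"th1\" ifNotMatch=\"th1\" }"
    run ngctl msg l7tag: sethookout "{ thisHook=\"th1\" tag_cookie=$MTAG_IPFW tag_id=$TAG_ID }"
    # ng_bpf: packets coming back from ng_tag all go to ipfw (accept-all program)
    run ngctl msg l7bpf: setprogram "{ thisHook=\"matched\" ifMatch=\"ipfw\" bpf_prog_len=1 bpf_prog=[ { code=6 k=8192 } ] }"
    # ipfw: send TCP to netgraph, then act on packets that came back tagged
    run ipfw add 100 netgraph 41 tcp from any to any
    run ipfw add 110 reset tcp from any to any tagged $TAG_ID
}

setup_l7_tagging
```

The "ipfw" hook of the ng_bpf node still has to be programmed with the
actual matching filter, and packets only continue to the next ipfw rule
after returning from netgraph if net.inet.ip.fw.one_pass is 0.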

> A practical example, how could I, for example, block Kazaa or
> bittorrent based on L7 with ng_tag? Can you please explain the steps
> on how to do this?

The truth is that ng_tag doesn't do any traffic analysis itself. It
merely provides an easy way to distinguish packets after they return to
ipfw. Currently the only analyzing node in the FreeBSD src tree is
ng_bpf(4), and it merely splits incoming packets into two streams, matched
and not matched. There are reasons for this: netgraph is meant to be
modular, where each node does one small thing but does it well. For a long
time ng_bpf was used for other purposes in the kernel, and now that new
ipfw features have appeared, ng_tag has come along for easy integration.

So this is merely a framework that lets you create custom filters. If
you need to match some kind of traffic, you have to sit down, work out
what patterns that traffic has, and then program ng_bpf(4) with an
appropriate filter. In fact, the filter can be generated from tcpdump(1)
expressions, so you don't need to be a C programmer, and that's good,
isn't it? :)
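
A minimal sketch of that conversion, modelled on the script in the
ng_bpf(4) manual page: "tcpdump -ddd" prints the compiled filter as
decimal numbers, and the function below rewrites them into the form a
setprogram message expects. The function name ddd_to_ngbpf and the node
and hook names in the usage comment are assumptions made for this
example.

```sh
#!/bin/sh
# ddd_to_ngbpf: read "tcpdump -ddd" output on stdin (first line is the
# instruction count, then one "code jt jf k" line per instruction) and
# print it as the bpf_prog_len/bpf_prog part of an ng_bpf setprogram message.
ddd_to_ngbpf() {
    read -r len
    printf 'bpf_prog_len=%s bpf_prog=[' "$len"
    while read -r code jt jf k; do
        printf ' { code=%s jt=%s jf=%s k=%s }' "$code" "$jt" "$jf" "$k"
    done
    printf ' ]\n'
}

# Possible use (hypothetical node "l7bpf:" with hooks "ipfw" and "matched";
# needs tcpdump and a loaded ng_bpf node, so it is left commented out):
#   PROG=$(tcpdump -s 8192 -ddd "$PATTERN" | ddd_to_ngbpf)
#   ngctl msg l7bpf: setprogram "{ thisHook=\"ipfw\" ifMatch=\"matched\" ifNotMatch=\"ipfw\" $PROG }"
```

Note that tcpdump compiles the expression for the link type of the
interface it would capture on, so offsets written with "ether[]" only
make sense when that is an Ethernet-type interface.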

> I don't run -CURRENT but I need this kind of feature very much, I am
> downloading a 7.0 snapshot just to test this with ipfw tag.

You'll be able to do this with RELENG_6 in about two weeks. I simply
couldn't wait a month for the MFC and wrote it earlier :)

> How this addresses the problem on system level L7 filtering? I always
> thought that someone would show up with a userland application that
> tags packets and returns the tag to ipfw filtering, but you came up
> with a kernel approach. How better and why it is when compared to evil
> regexp evaluation on kernel or how efficient is this when compared to
> Linux L7 which is known to fail a lot (let a number of packets pass)?

Yes, in the general case you are right: the correct way is to have a
userland application do the analysis, as this is easier, simpler and safer
(imagine a security flaw inside a kernel matcher). Snort is an example.
But the main disadvantage is that it is SLOW. And for many kinds of
traffic you do not need to perform complete flow analysis, because it is
enough to match a single packet and then say "Huh.. I found such a packet,
so the entire connection must be of that type". Actually, I found a Linux
iptables P2P matching module named ipp2p at http://www.ipp2p.org/ which is
said to work reasonably well, looked at the code and found that a
one-packet match is enough for this job. So per-packet matching can be
implemented in the kernel.

After that I discovered that FreeBSD has had an in-kernel packet matcher
for a long time, since 4.0. A brief inspection of the ipp2p code showed
that most of the recognized P2P types can be matched by tcpdump
expressions and are thus programmable in ng_bpf(4). For some patterns,
though, that's not enough, as bpf can't search for a substring at a
variable (not fixed) offset. For those we can imagine another netgraph
node which would do substring search (like iptables --string), so with
both bpf and string matching all P2P traffic could be caught.

Anyway, that work is yet to be done. The main benefit of ng_tag at the
moment is that anyone who wants this no longer faces fundamental barriers
such as needing the skills to write a kernel module or even a userland
matching daemon.

> Sorry for all those questions, but I am an end user in the average,
> so, I can not understand it myself only reading the code.
>
> Thank you for your work and help. It seems that I will have a 7.0
> snapshot doing this job to me until the ipfw tag MFC happens, if I
> can understand this approach.

I hope my explanation was clear enough :) Also, if you will be using
7.0, include BPF_JITTER in your kernel config, as this enables native
code compilation for bpf and ng_bpf, which will speed things up.

==========================================================================

P.S. Here is a quick-and-dirty primer on how to convert ipp2p functions
into tcpdump(1) expressions for input to ng_bpf(4). Go to
http://www.ipp2p.org/, download the source, unpack it, open the file
ipt_ipp2p.c and find the function for your P2P type; let it be BitTorrent
for our example. So look (I've reformatted that bad Linux code a little
to be more style(9)-ish):

int
search_bittorrent(const unsigned char *payload, const u16 plen)
{
	if (plen > 20) {
		/* test for match 0x13+"BitTorrent protocol" */
		if (payload[0] == 0x13)
			if (memcmp(payload+1, "BitTorrent protocol", 19) == 0)
				return (IPP2P_BIT * 100);

		/*
		 * get tracker commands, all start with "GET /",
		 * then can follow: scrape | announce,
		 * and then ?info_hash=
		 */
		if (memcmp(payload, "GET /", 5) == 0) {
			/* message scrape */
			if (memcmp(payload+5, "scrape?info_hash=", 17) == 0)
				return (IPP2P_BIT * 100 + 1);
			/* message announce */
			if (memcmp(payload+5, "announce?info_hash=", 19) == 0)
				return (IPP2P_BIT * 100 + 2);
		}
	} else {
		/*
		 * bitcomet encrypts the first packet, so we have to detect
		 * another one later in the flow
		 */
		/* first try failed, too many misdetections */
		//if (size == 5 && get_u32(t,0) == __constant_htonl(1) && t[4] < 3)
		//	return (IPP2P_BIT * 100 + 3);

		/* second try: block request packets */
		if ((plen == 17) &&
		    (get_u32(payload,0) == __constant_htonl(0x0d)) &&
		    (payload[4] == 0x06) &&
		    (get_u32(payload,13) == __constant_htonl(0x4000)))
			return (IPP2P_BIT * 100 + 3);
	}
	return (0);
}

So, what do we see? A BitTorrent packet can start with one of three fixed
strings (we see the memcmp() checks for them). The author of ipp2p employs
one more check, but as we can see from the comments, he's not sure about
it.

Let's find out what the byte sequences for these strings are:

$ echo -n "BitTorrent protocol" | hd
00000000  42 69 74 54 6f 72 72 65  6e 74 20 70 72 6f 74 6f  |BitTorrent proto|
00000010  63 6f 6c                                          |col|
00000013
$ echo -n "GET /scrape?info_hash=" | hd
00000000  47 45 54 20 2f 73 63 72  61 70 65 3f 69 6e 66 6f  |GET /scrape?info|
00000010  5f 68 61 73 68 3d                                 |_hash=|
00000016
$ echo -n "GET /announce?info_hash=" | hd
00000000  47 45 54 20 2f 61 6e 6e  6f 75 6e 63 65 3f 69 6e  |GET /announce?in|
00000010  66 6f 5f 68 61 73 68 3d                           |fo_hash=|
00000018

We can give tcpdump 1, 2 or 4 bytes to compare at a time. The "payload"
variable in the source points to the beginning of the data in a TCP
packet. Remember from man ng_tag that tcpdump assumes packets have a
14-byte Ethernet header when it computes offsets for its arrays like
"tcp[]", but packets come from ipfw to ng_bpf without this header, and
that affects our offset calculations. So we must give offsets from the
very beginning of the packet, which is done through tcpdump's "ether[]"
primitive, and parse the headers manually. Let's assume (for simplicity
and speed), however, that the IP and TCP headers carry no options and thus
are always 20 bytes long each; then ipp2p's "payload[0]" is tcpdump's
"ether[40]". Also, let's assume that ipfw has checked the packet length
for us, so we don't do that in netgraph too.
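
To make the offset arithmetic concrete, here is a small sketch that
computes where the TCP payload starts from the two header-length fields.
The function name payload_offset and the guard expression are
illustrative assumptions, not part of ipp2p or ng_bpf. It also shows
what the "no options" assumption costs: hosts that negotiate TCP
timestamps send 32-byte TCP headers, so their payload starts at offset
52, not 40.

```sh
#!/bin/sh
# payload_offset VERHL TCPOFF
#   VERHL  - first byte of the IP header (version and header length)
#   TCPOFF - byte 12 of the TCP header (data offset in the high nibble)
# Prints the offset of the TCP payload from the start of the IP packet.
payload_offset() {
    ihl=$(( ($1 & 0x0f) * 4 ))            # IP header length in bytes
    thl=$(( (($2 >> 4) & 0x0f) * 4 ))     # TCP header length in bytes
    echo $(( ihl + thl ))
}

payload_offset 0x45 0x50    # no options in either header: prints 40
payload_offset 0x45 0x80    # 12 bytes of TCP options:     prints 52

# A guard that could be ANDed in front of a pattern so that fixed offsets
# are only trusted when both headers really are 20 bytes long
# (ether[0] is the first IP byte, ether[32] is TCP header byte 12).
NO_OPTIONS_GUARD="ether[0]=0x45 && ether[32] & 0xf0 = 0x50"
```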

Then we simply take the hex bytes in the order hd(1) showed them, as this
is also network byte order, and write them as tcpdump expressions
(remember that the first string ("...protocol") actually has 0x13
prepended to it). So we write the following in the ng_bpf(4) script:

PATTERN="(ether[40:4]=0x13426974 &&
           ether[44:4]=0x546f7272 &&
           ether[48:4]=0x656e7420 &&
           ether[52:4]=0x70726f74 &&
           ether[56:4]=0x6f636f6c
          ) ||
          (ether[40:4]=0x47455420 &&
           ((ether[44:4]=0x2f736372 &&
             ether[48:4]=0x6170653f &&
             ether[52:4]=0x696e666f &&
             ether[56:4]=0x5f686173 &&
             ether[60:2]=0x683d
            ) ||
            (ether[44:4]=0x2f616e6e &&
             ether[48:4]=0x6f756e63 &&
             ether[52:4]=0x653f696e &&
             ether[56:4]=0x666f5f68 &&
             ether[60:4]=0x6173683d))
          ) ||
          (ether[2:2]=57 &&
           ether[40:4]=0x0000000d &&
           ether[44]=0x06 &&
           ether[53:4]=0x00004000)"
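
Writing such comparisons by hand is error-prone, so here is a sketch of
a helper that generates them from a string. The function names
bytes_to_ether and str_to_ether are made up for this example; od(1) is
used instead of hd(1) only because its output is easier to split into
words.

```sh
#!/bin/sh
# bytes_to_ether OFFSET HEXBYTE...
# Turn a list of two-digit hex bytes into tcpdump comparisons of the form
# "ether[off:len]=0x...", taking 4 bytes at a time and falling back to
# 2 and 1 bytes for the tail.
bytes_to_ether() {
    off=$1
    shift
    expr_out=""
    while [ $# -gt 0 ]; do
        if [ $# -ge 4 ]; then n=4
        elif [ $# -ge 2 ]; then n=2
        else n=1
        fi
        val=""
        i=0
        while [ "$i" -lt "$n" ]; do
            val="$val$1"
            shift
            i=$((i + 1))
        done
        if [ -n "$expr_out" ]; then
            expr_out="$expr_out && "
        fi
        expr_out="${expr_out}ether[$off:$n]=0x$val"
        off=$((off + n))
    done
    printf '%s\n' "$expr_out"
}

# str_to_ether OFFSET STRING
# Same, but takes the bytes from a string (od prints them as hex words).
str_to_ether() {
    bytes_to_ether "$1" $(printf '%s' "$2" | od -An -v -tx1)
}

str_to_ether 40 'GET /scrape?info_hash='
# The handshake string needs the 0x13 length byte prepended:
bytes_to_ether 40 13 $(printf '%s' 'BitTorrent protocol' | od -An -v -tx1)
```

The offset 40 carries the same assumption as the hand-written pattern:
20-byte IP and TCP headers with no options.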

Note the last OR block in the expression: it is the translation of that
"not sure" check for request packets. I've written the packet length
explicitly: plen=17 + 20-byte IP header + 20-byte TCP header = 57, checked
at offset 2 in the IP header (the Total Length field), according to RFC
791. The construction "get_u32 == __constant_htonl()" means comparing
4-byte values at the given offset.

P.P.S. I have not tested that pattern on real packets, as I have no  
BitTorrent today, but it should work.

-- 
WBR, Vadim Goncharov