Date: Sun, 31 Jul 2016 17:17:44 -0300
From: "Dr. Rolf Jansen" <rj@obsigna.com>
To: freebsd-ipfw@freebsd.org
Cc: Ian Smith <smithi@nimnet.asn.au>
Subject: Re: ipfw divert filter for IPv4 geo-blocking
Message-ID: <166B5A24-8E5A-444D-BBF5-2B5883DE76A7@obsigna.com>
In-Reply-To: <20160801030317.P29054@sola.nimnet.asn.au>
References: <61DFB3E2-6E34-4EEA-8AC6-70094CEACA72@cyclaero.com>
 <CAHu1Y739PvFqqEKE74BjzgLa7NNG6Kh55NPnU5MaA-8HsrjkFw@mail.gmail.com>
 <4D047727-F7D0-4BEE-BD42-2501F44C9550@obsigna.com>
 <c2cd797d-66db-8673-af4e-552dfa916a76@freebsd.org>
 <9641D08A-0501-4AA2-9DF6-D5AFE6CB2975@obsigna.com>
 <4d76a492-17ae-cbff-f92f-5bbbb1339aad@freebsd.org>
 <C0CC7001-16FE-40BF-A96A-1FA51A0AFBA7@obsigna.com>
 <677900fb-c717-743f-fcfe-86b603466e33@freebsd.org>
 <0D3C9016-7A4A-46BA-B35F-3844D07562A8@obsigna.com>
 <CAFPNf59w6BHgDjLNHW=rQckZAFG4gqPHL49vLXiDmMAxVPOcKg@mail.gmail.com>
 <1E1DB7E0-D354-4D7A-B657-0ECF94C12CE0@obsigna.com>
 <50d405a4-3f8f-a706-9cac-d1162925e56a@freebsd.org>
 <c62fa048-63c8-aef6-5bad-b0a6719f6acb@freebsd.org>
 <9222BB10-C700-4DE7-83A3-BE7A38A11713@obsigna.com>
 <1B36CAD7-A139-436B-B7EC-0FFF232F9C6A@obsigna.com>
 <20160801030317.P29054@sola.nimnet.asn.au>

> On 31.07.2016 at 15:38, Ian Smith <smithi@nimnet.asn.au> wrote:
>
> On Sat, 30 Jul 2016 11:17:13 -0300, Dr. Rolf Jansen wrote:
>> I finished the work on CIDR conformity of the IP range tables
>> generated by the tool geoip. The main constraint is that the start
>> and end address of an IP block given by the delegation files MUST BE
>> PRESERVED during the transformation to a set of CIDR records. This
>> target is achieved by:
>>
>> 1. Finding the largest common netmask boundary of the start address,
>>    utilizing int(log2(addr_count)), then iterating like Euclid's
>>    algorithm in computing a GCD.
>>
>> 2. Outputting the CIDR with the given start address and the masklen
>>    belonging to the found netmask.
>>
>> 3. If the CIDR does not match the whole original IP range, setting
>>    the start address of the next CIDR block to the next boundary of
>>    the common netmask, and looping from step 1 until the original
>>    range has been satisfied.
>>
>> I carefully tested the algorithm, and a table that I pipe by the new
>> geoip tool into ipfw is 100 % identical to the output of the ipfw
>> command 'table N list'.
>
> Great. I suppose that caters for some of the odd delegations one sees,
> such as perhaps a /16 then a /15 (ie 3/4 of a /14) followed by maybe a
> /12, maybe with another /15 tacked on the end .. but I'm unsure if that
> applies to country allocations as much as it does within countries.
>
>> It is worth noting that the original RIR delegation files already
>> contain 457 non-CIDR-conforming IPv4 ranges in a total of 165815
>> original records. I guess that this number will increase in the
>> future because the RIRs ran out of new IPv4 ranges and are urged
>> to subdivide returned old ranges for new delegations. The above
>> algorithm is ready for this.
>
> Yes, and just as well. I'm surprised it's as few as 457 ..
> I looked into it a bit back when 115.70/17 was first allocated to my
> ISP, after previously having been, as I recall, in China .. so of
> course we fell foul of a number of (probably perennially) out-of-date
> geoip blockers, for months in some cases .. malevolent beasts if not
> kept well fed :)
>
>> Generally, CIDR conforming tables are more than twice as large as
>> optimized (joined adjacencies) IP range tables. All said changes have
>> been pushed to GitHub already.
>
> So how many table entries does 'the world' come to, around 400,000?

No, it is not that bad. The total number of original entries in the
delegation statistics files of all 5 RIRs is about 166000. The ipdb
tool compiles these ranges into a consolidated, sorted binary table
that is suitable for loading directly into a binary search tree, and
it reduces the number of entries to a bit more than one half, namely
ca. 83500.

Consolidation primarily means resolving overlaps, because a binary
search tree could not handle these in a meaningful way. As an
additional benefit in the same pass, the routine combines adjacent
ranges that have the same country code. Skipping this combination
would not be a show stopper for the BST; it only serves to increase
the performance.

The geoip tool, which generates the tables of CIDR ranges per country
code out of the consolidated tables, would output a count of 167500
entries for all countries. That is a little bit more than the original
count. However, this table is still optimized, because original ranges
that form a new valid CIDR when combined are not broken down again;
the combined CIDR is passed instead.

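The three steps quoted at the top of this message can be sketched as follows. This is only an illustration in Python, not the source of the geoip tool; the function name `range_to_cidrs` and the example range are assumptions made for the sketch.

```python
import ipaddress


def range_to_cidrs(first, last):
    """Split the inclusive IPv4 range [first, last] into CIDR blocks.

    Hypothetical helper: start and end address of the range are
    preserved exactly, no address is added or dropped.
    """
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    cidrs = []
    while start <= end:
        count = end - start + 1
        # Step 1: int(log2(count)) host bits is the largest block that
        # still fits into the remaining address count ...
        host_bits = count.bit_length() - 1
        # ... then shrink the block until the start address sits on a
        # boundary of the corresponding netmask.
        while start & ((1 << host_bits) - 1):
            host_bits -= 1
        # Step 2: emit the CIDR with the given start address.
        cidrs.append(f"{ipaddress.IPv4Address(start)}/{32 - host_bits}")
        # Step 3: continue at the next boundary of that netmask.
        start += 1 << host_bits
    return cidrs


# A made-up range of 4096 addresses that does not start on a /20 boundary:
print(range_to_cidrs("201.222.20.0", "201.222.35.255"))
# ['201.222.20.0/22', '201.222.24.0/21', '201.222.32.0/22']
```

For such a range the result should agree with `ipaddress.summarize_address_range()` from the Python standard library, which makes a convenient cross-check.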
>> I am still a little bit amazed how ipfw comes to accept incorrect
>> CIDR ranges and arbitrarily moves the start/end addresses in order to
>> achieve CIDR conformity, and that without any further notice, and
>> that given that ipfw can be considered as being quite relevant to
>> system security. Or, may I assume that ipfw always knows better than
>> the user what should be allowed or denied. Otherwise, perhaps I am
>> the only one ever who input incorrect CIDR ranges for processing by
>> ipfw.
>
> You've lost me here, Rolf. Do you mean that ipfw adds incorrect table
> entries for a given IPv4 address and mask length? Or that it c/should
> itself accept IP ranges and generate the needed CIDR entries to match?

Perhaps an example may explain it better. Remember that the first
incarnation of geoip passed the incorrect range 201.222.20.0/20 to
ipfw. This is an incorrect CIDR because the start address does not
match a mask boundary defined by the given masklen. The point now is
that this error is caused EITHER by an incorrect masklen OR by an
incorrect start address. ipfw can only determine that the CIDR is
incorrect, and it rectifies it for further processing:

   # ipfw table 1 add 201.222.20.0/20
   # ipfw table 1 list
   --> 201.222.16.0/20 0

So ipfw actually takes an incorrect CIDR happily and transforms it
into a correct one, under the arbitrary assumption that the masklen is
the correct part and the start address is sort of variable.
Technically, the entry added by ipfw is a correct CIDR, but this is
not necessarily the range the user wanted to add.

> If the former, how to reproduce for a bug report? If the latter, might
> you contemplate adding that functionality to ipfw

Well, in the meantime, I saw that this kind of automatic rectification
of incorrect CIDR entries happens throughout ipfw.

   # ipfw add 50000 allow ip from 201.222.20.0/20 to any
   --> 50000 allow ip from 201.222.16.0/20 to any

At least in this case the user is informed directly about the CIDR
that has actually been used. Still, it is not exactly the range the
user asked for, and the tiny difference can easily be overlooked.

Given that ipfw can't know what the user actually intended (did he
mistype the start address, or did he somehow obtain a wrong masklen?),
I advocate that ipfw should accept the wrong CIDR and transform it to
a correct one exactly as above, but should output a warning, either to
the shell or to syslog in case no shell is connected. Ideally this
warning would give a useful explanation of why the input CIDR is wrong
and how it could be made correct by adjusting either the masklen or
the start address.

> - or is ipfw better
> off being driven to generate tables from the output of such as geoip?

Of course, tools like geoip HAVE TO produce 100 % valid CIDRs;
otherwise the tool is buggy and the bug must be fixed, as in the case
of geoip. Of course, ipfw must not receive any so-called "bug fix"
that may immediately break lots of firewall installations around the
world, given that it is so easy to enter an invalid CIDR, and that
ipfw has happily accepted these up to now.

I don't want to be more Catholic than the Pope. My main concern is not
that ipfw makes a choice when correcting an invalid CIDR; my concern
is that the choice is sort of arbitrary. I advocate logging a suitable
warning, while continuing to process invalid CIDRs by transforming
them into valid ones exactly as before.

Best regards

Rolf
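
The warning advocated above could look roughly like the following sketch. It is written in Python for brevity and is not ipfw code; the function name `check_cidr` and the wording of the warning are assumptions made for illustration. It keeps the behaviour shown in the examples (masking the start address) and additionally names the alternative correction that keeps the start address.

```python
import ipaddress


def check_cidr(text):
    """Return (normalized_cidr, warning) for an IPv4 CIDR string.

    Hypothetical helper: warning is None when the start address
    already sits on the boundary given by the masklen.
    """
    addr_text, masklen_text = text.split("/")
    addr = int(ipaddress.IPv4Address(addr_text))
    masklen = int(masklen_text)
    host_mask = (1 << (32 - masklen)) - 1
    # Clearing the host bits keeps the masklen and moves the start
    # address down to the mask boundary.
    masked = f"{ipaddress.IPv4Address(addr & ~host_mask)}/{masklen}"
    if addr & host_mask == 0:
        return masked, None
    # Alternative reading: the start address is right and the masklen
    # is wrong. The lowest set bit of the address gives the largest
    # block that is aligned at this start address.
    trailing_zeros = (addr & -addr).bit_length() - 1
    aligned = f"{addr_text}/{32 - trailing_zeros}"
    warning = (f"{text} is not a valid CIDR: use {masked} to keep the "
               f"masklen or {aligned} to keep the start address")
    return masked, warning


cidr, warning = check_cidr("201.222.20.0/20")
print(cidr)     # 201.222.16.0/20
print(warning)  # names 201.222.16.0/20 and 201.222.20.0/22 as the two fixes
```

Python's own `ipaddress.ip_network("201.222.20.0/20", strict=False)` performs the same masking, while the default `strict=True` rejects the input with a `ValueError`.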