Date:      Sun, 31 Jul 2016 17:17:44 -0300
From:      "Dr. Rolf Jansen" <rj@obsigna.com>
To:        freebsd-ipfw@freebsd.org
Cc:        Ian Smith <smithi@nimnet.asn.au>
Subject:   Re: ipfw divert filter for IPv4 geo-blocking
Message-ID:  <166B5A24-8E5A-444D-BBF5-2B5883DE76A7@obsigna.com>
In-Reply-To: <20160801030317.P29054@sola.nimnet.asn.au>
References:  <61DFB3E2-6E34-4EEA-8AC6-70094CEACA72@cyclaero.com> <CAHu1Y739PvFqqEKE74BjzgLa7NNG6Kh55NPnU5MaA-8HsrjkFw@mail.gmail.com> <4D047727-F7D0-4BEE-BD42-2501F44C9550@obsigna.com> <c2cd797d-66db-8673-af4e-552dfa916a76@freebsd.org> <9641D08A-0501-4AA2-9DF6-D5AFE6CB2975@obsigna.com> <4d76a492-17ae-cbff-f92f-5bbbb1339aad@freebsd.org> <C0CC7001-16FE-40BF-A96A-1FA51A0AFBA7@obsigna.com> <677900fb-c717-743f-fcfe-86b603466e33@freebsd.org> <0D3C9016-7A4A-46BA-B35F-3844D07562A8@obsigna.com> <CAFPNf59w6BHgDjLNHW=rQckZAFG4gqPHL49vLXiDmMAxVPOcKg@mail.gmail.com> <1E1DB7E0-D354-4D7A-B657-0ECF94C12CE0@obsigna.com> <50d405a4-3f8f-a706-9cac-d1162925e56a@freebsd.org> <c62fa048-63c8-aef6-5bad-b0a6719f6acb@freebsd.org> <9222BB10-C700-4DE7-83A3-BE7A38A11713@obsigna.com> <1B36CAD7-A139-436B-B7EC-0FFF232F9C6A@obsigna.com> <20160801030317.P29054@sola.nimnet.asn.au>

> On 31.07.2016 at 15:38, Ian Smith <smithi@nimnet.asn.au> wrote:
> On Sat, 30 Jul 2016 11:17:13 -0300, Dr. Rolf Jansen wrote:
>> I finished the work on CIDR conformity of the IP ranges tables
>> generated by the tool geoip. The main constraint is that the start
>> and end address of an IP block given by the delegation files MUST BE
>> PRESERVED during the transformation to a set of CIDR records. This
>> target is achieved by:
>>
>> 1. Finding the largest common netmask boundary of the start address utilizing
>>    int(log2(addr_count)); then iteration like Euclid's algorithm in computing
>>    a GCD.
>>
>> 2. Output the CIDR with the given start address and the masklen belonging
>>    to the found netmask.
>>
>> 3. If the CIDR does not match the whole original IP range then set the start
>>    address of the next CIDR block to the next boundary of the common netmask,
>>    and loop over starting at 1. until the original range has been satisfied.
>>
>> I carefully tested the algorithm and a table that I pipe by the new
>> geoip tool into ipfw is 100 % identical to the output of the ipfw
>> command 'table N list'.
>
> Great.  I suppose that caters for some of the odd delegations one sees,
> such as perhaps a /16 then a /15 (ie 3/4 of a /14) followed by maybe a
> /12, maybe with another /15 tacked on the end .. but I'm unsure if that
> applies to country allocations as much as it does within countries.
>
>> It is worth to note, that already the original RIR delegation files
>> contain 457 non CIDR conforming IPv4 ranges in a total of 165815
>> original records. I guess that this number will increase in the
>> future because the RIR's ran empty on new IPv4 ranges and are urged
>> to subdivide returned old ranges for new delegations. The above
>> algorithm is ready for this.
>
> Yes, and just as well.  I'm surprised it's as few as 457 .. I looked
> into it a bit back when 115.70/17 was first allocated to my ISP, after
> previously having been, as I recall, in China .. so of course we fell
> foul of a number of (probably perennially) out-of-date geoip blockers,
> for months in some cases .. malevolent beasts if not kept well fed :)
>
>> Generally, CIDR conforming tables are more than twice as large as
>> optimized (joined adjacencies) IP range tables. All said changes have
>> been pushed to GitHub already.
>
> So how many table entries does 'the world' come to, around 400,000?

No, it is not that bad. The total number of original entries in the
delegation statistics files of all 5 RIRs is about 166000. The ipdb
tool, which compiles these ranges into a consolidated, sorted binary
table suitable for loading directly into a binary search tree, reduces
the number of entries to a bit more than half, namely ca. 83500.

Consolidation primarily means resolving overlaps, because a binary
search tree cannot handle these in a meaningful way. As an additional
benefit, the same pass combines adjacent ranges that have the same
country code. Skipping this combination would not be a show stopper for
the BST; it only serves to increase performance.
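
To illustrate only the adjacency part of the consolidation, here is a
minimal Python sketch. It is not the ipdb code: the function name
merge_adjacent and the record layout (start, end, country code) with
inclusive integer bounds are assumptions made for this example, and the
resolution of overlaps between different country codes is deliberately
left out.

```python
def merge_adjacent(records):
    """Join ranges of the same country code that touch or overlap.

    records: iterable of (start, end, cc) tuples with inclusive integer
    bounds (a hypothetical layout, not the ipdb table format).
    """
    merged = []
    for start, end, cc in sorted(records):
        if merged and merged[-1][2] == cc and start <= merged[-1][1] + 1:
            # Same country code and no gap: extend the previous range.
            prev_start, prev_end, _ = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), cc)
        else:
            merged.append((start, end, cc))
    return merged


print(merge_adjacent([(0, 255, "BR"), (256, 511, "BR"), (512, 767, "AR")]))
# [(0, 511, 'BR'), (512, 767, 'AR')]
```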

The geoip tool, which generates the tables of CIDR ranges per country
code from the consolidated tables, would output 167500 entries for all
countries. That is slightly more than the original count, but the table
is still optimized: original ranges that combine into a new valid CIDR
are not broken down again; the combined CIDR is passed instead.
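
The transformation of an IP range into CIDR records that preserves the
start and end address can be sketched as follows. This is a Python
illustration under my own assumptions, not the C code of geoip: the
function name range_to_cidrs is hypothetical, and it takes the largest
aligned block directly from the lowest set bit of the start address
instead of iterating down from int(log2(addr_count)).

```python
import ipaddress


def range_to_cidrs(start, end):
    """Split the inclusive IPv4 range [start, end] into CIDR blocks."""
    lo = int(ipaddress.IPv4Address(start))
    hi = int(ipaddress.IPv4Address(end))
    cidrs = []
    while lo <= hi:
        # Largest block size on whose boundary the start address lies.
        align = (lo & -lo) or (1 << 32)
        # Largest power of two that still fits into the remaining range.
        count = hi - lo + 1
        size = min(align, 1 << (count.bit_length() - 1))
        masklen = 32 - (size.bit_length() - 1)
        cidrs.append(f"{ipaddress.IPv4Address(lo)}/{masklen}")
        lo += size  # continue at the next boundary
    return cidrs


# Ian's "a /16 then a /15 (3/4 of a /14)" case, with made-up addresses:
print(range_to_cidrs("10.1.0.0", "10.3.255.255"))
# ['10.1.0.0/16', '10.2.0.0/15']
```

The result can be cross-checked against
ipaddress.summarize_address_range() from the Python standard library,
which performs the same kind of decomposition.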

>> I am still a little bit amazed how ipfw come to accept incorrect CIDR
>> ranges and arbitrarily moves the start/end addresses in order to
>> achieve CIDR conformity, and that without any further notice, and
>> that given that ipfw can be considered as being quite relevant to
>> system security. Or, may I assume that ipfw knows always better than
>> the user what should be allowed or denied. Otherwise, perhaps I am
>> the only one ever who input incorrect CIDR ranges for processing by
>> ipfw.
>
> You've lost me here, Rolf.  Do you mean that ipfw adds incorrect table
> entries for a given IPv4 address and mask length?  Or that it c/should
> itself accept IP ranges and generate the needed CIDR entries to match?

Perhaps an example explains it better. Remember that the first
incarnation of geoip passed the incorrect range 201.222.20.0/20 to ipfw.
This CIDR is incorrect because the start address does not lie on a mask
boundary defined by the given masklen. The point is that EITHER the
masklen OR the start address may be the incorrect part. ipfw can
determine only that the CIDR is incorrect, and it rectifies it for
further processing:

  # ipfw table 1 add 201.222.20.0/20
  # ipfw table 1 list
  -->  201.222.16.0/20 0

So ipfw happily takes an incorrect CIDR and transforms it into a correct
one under the arbitrary assumption that the masklen is the correct part
and the start address is the variable one. Technically the entry added
by ipfw is a correct CIDR, but it is not necessarily the range the user
wanted to add.
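
The same rectification can be reproduced outside of ipfw. The following
minimal sketch uses Python's standard ipaddress module, not ipfw's own
code; the helper names is_conforming and rectify are made up for this
example. Keeping the masklen and clearing the host bits yields exactly
the entry shown above.

```python
import ipaddress


def is_conforming(cidr):
    """True if the start address lies on the boundary given by masklen."""
    try:
        ipaddress.ip_network(cidr, strict=True)
        return True
    except ValueError:
        return False


def rectify(cidr):
    """Keep the masklen and clear the host bits of the start address."""
    return str(ipaddress.ip_network(cidr, strict=False))


print(is_conforming("201.222.20.0/20"))  # False
print(rectify("201.222.20.0/20"))        # 201.222.16.0/20
```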

> If the former, how to reproduce for a bug report?  If the latter, might
> you contemplate adding that functionality to ipfw

Well, in the meantime I saw that this kind of automatic rectification of
incorrect CIDR entries happens throughout ipfw.

  # ipfw add 50000 allow ip from 201.222.20.0/20 to any
  -->  50000 allow ip from 201.222.16.0/20 to any

At least in this case the user is informed directly about the CIDR that
has actually been used. Still, it is not exactly the range the user
asked for, and the tiny difference can easily be overlooked.

Given that ipfw can't know what the user actually intended (did he
mistype the start address, or somehow obtain a wrong masklen?), I
advocate that ipfw should accept the wrong CIDR and transform it to a
correct one exactly as above, but should output a warning, either to the
shell or to syslog in case no shell is connected. Ideally this warning
would explain why the input CIDR is wrong and how it could be corrected
by adjusting either the masklen or the start address.
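
What such a warning could contain is sketched below in Python. This is
purely an illustration of the idea and not a patch for ipfw: the
function name cidr_warning and the wording of the message are
hypothetical. It offers both possible corrections, the moved start
address that ipfw uses today and the longer masklen that would keep the
given start address.

```python
import ipaddress


def cidr_warning(cidr):
    """Return None for a valid IPv4 CIDR, else a warning with both fixes."""
    addr_text, masklen_text = cidr.split("/")
    addr = int(ipaddress.IPv4Address(addr_text))
    masklen = int(masklen_text)
    host_mask = (1 << (32 - masklen)) - 1
    if addr & host_mask == 0:
        return None  # start address already lies on the mask boundary

    # Fix 1: keep the masklen, move the start address down to the boundary.
    moved_start = ipaddress.IPv4Address(addr & ~host_mask & 0xFFFFFFFF)
    # Fix 2: keep the start address, use the shortest masklen that fits it.
    trailing_zero_bits = (addr & -addr).bit_length() - 1
    fitted_masklen = 32 - trailing_zero_bits

    return (f"warning: {cidr} does not start on a /{masklen} boundary; "
            f"using {moved_start}/{masklen}, "
            f"did you mean {addr_text}/{fitted_masklen}?")


print(cidr_warning("201.222.20.0/20"))
# warning: 201.222.20.0/20 does not start on a /20 boundary; using 201.222.16.0/20, did you mean 201.222.20.0/22?
```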

> - or is ipfw better
> off being driven to generate tables from the output of such as geoip?

Of course, tools like geoip HAVE TO produce 100 % valid CIDR; otherwise
the tool is buggy and the bug must be fixed, as in the case of geoip.

Of course, ipfw must not receive any so-called "bug fix" that might
immediately break lots of firewall installations around the world, given
that it is so easy to enter an invalid CIDR and that ipfw has happily
accepted these up to now.

I don't want to be more Catholic than the Pope. My main concern is not
that ipfw makes a choice when correcting an invalid CIDR; my concern is
that the choice is somewhat arbitrary. I advocate logging a suitable
warning while continuing to process invalid CIDRs by transforming them
into valid ones exactly as before.

Best regards

Rolf



