Date: Wed, 07 Jan 2004 23:35:04 -0000 From: Andre Oppermann <andre@freebsd.org> To: rw@codeburst.co.uk, freebsd-net@freebsd.org Subject: Re: kern/60889 - zero IP id change issues in 5.2RC2 Message-ID: <3FFC97A1.4F397CC3@freebsd.org> References: <200401072023.UAA18922@starburst.demon.co.uk> <3FFC8CBE.13B527A3@freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
Andre Oppermann wrote: > > Richard, > > I've attached a patch that fixes the problem with FIN/ACK and one more > case which got it wrong. I also fixed the host byte order problem in > ip_output.c for the normal ip_id incrementor. > > Some comments and questions: > > 1. Do you think it is neccessary to do a htons() on the randomized > ip_id too? I'd say yes if there is a case where it has to > monotonically increase afterwards. Does it? > > 2. I have a Win2k machine but have check out how I can get tcp header > compression to work with my Cisco AS5300 (if it doesn't do that by > default). Will I see the problem when I do a download from a FreeBSD > 5.2RC2 machine or do I have to use the Windoze as router sending > packets upwards?` > > 3. There are indeed devices clearing the DF bit. For example Cisco > is recommending this in it's trouble shooting section for broadband > access via DSL/L2TP where the MTU is lower than 1500 because of the > tunnelling overhead. Make a route-map to unset the DF bit and apply > it to the incoming interface. 4. After reading the pf.conf man page from OpenBSD (where it talks about scrubbing IP packets) I don't think it's a good idea to set the ip_id to zero in the DF case. It seems to cause various kinds of problems when for some reason DF is removed along the path (which it shouldn't but whatever). I think it is clearly better to put a ip_id into every packet no matter what. Although the ip_randomid() function doesn't look really cheap to run... but then security comes at a cost. 5. Random ip_id is already a kernel option but it is *not* enabled by default in 5.2RC2 GENERIC or -current. On the other hand the code *does* zero the ip_id when DF is set in any case which is troublesome as we just found out. Updated fix attached. Same tab-brokenness due to c&p. -- Andre Index: ip_output.c =================================================================== RCS file: /home/ncvs/src/sys/netinet/ip_output.c,v retrieving revision 1.203 diff -u -p -r1.203 ip_output.c --- ip_output.c 20 Nov 2003 20:07:38 -0000 1.203 +++ ip_output.c 7 Jan 2004 23:31:56 -0000 @@ -229,10 +229,10 @@ ip_output(struct mbuf *m0, struct mbuf * /* * Fill in IP header. If we are not allowing fragmentation, - * then the ip_id field is meaningless, so send it as zero - * to reduce information leakage. Otherwise, if we are not - * randomizing ip_id, then don't bother to convert it to network - * byte order -- it's just a nonce. Note that a 16-bit counter + * then the ip_id field is meaningless, but we don't set it + * to zero. Doing so causes various problems when devices along + * the path (routers, load balancers, firewalls, etc.) illegally + * disable DF on our packet. Note that a 16-bit counter * will wrap around in less than 10 seconds at 100 Mbit/s on a * medium with MTU 1500. See Steven M. Bellovin, "A Technique * for Counting NATted Hosts", Proc. IMW'02, available at @@ -241,17 +241,11 @@ ip_output(struct mbuf *m0, struct mbuf * if ((flags & (IP_FORWARDING|IP_RAWOUTPUT)) == 0) { ip->ip_v = IPVERSION; ip->ip_hl = hlen >> 2; - if ((ip->ip_off & IP_DF) == 0) { - ip->ip_off = 0; #ifdef RANDOM_IP_ID - ip->ip_id = ip_randomid(); + ip->ip_id = ip_randomid(); #else - ip->ip_id = ip_id++; + ip->ip_id = htons(ip_id++); #endif - } else { - ip->ip_off = IP_DF; - ip->ip_id = 0; - } ipstat.ips_localout++; } else { hlen = ip->ip_hl << 2; Index: tcp_subr.c =================================================================== RCS file: /home/ncvs/src/sys/netinet/tcp_subr.c,v retrieving revision 1.172 diff -u -p -r1.172 tcp_subr.c --- tcp_subr.c 6 Jan 2004 23:29:46 -0000 1.172 +++ tcp_subr.c 7 Jan 2004 23:31:57 -0000 @@ -459,6 +459,8 @@ tcp_respond(tp, ipgen, th, m, ack, seq, tlen += sizeof (struct tcpiphdr); ip->ip_len = tlen; ip->ip_ttl = ip_defttl; + if (path_mtu_discovery) + ip->ip_off |= IP_DF; } m->m_len = tlen; m->m_pkthdr.len = tlen; @@ -1733,6 +1735,8 @@ tcp_twrespond(struct tcptw *tw, struct s m->m_pkthdr.csum_flags = CSUM_TCP; m->m_pkthdr.csum_data = offsetof(struct tcphdr, th_sum); ip->ip_len = m->m_pkthdr.len; + if (path_mtu_discovery) + ip->ip_off |= IP_DF; error = ip_output(m, inp->inp_options, NULL, (tw->tw_so_options & SO_DONTROUTE), NULL, inp); } > If these questions are answered I can prepare the final patch and > commit it pending review and re@ approval. > > -- > Andre > > (Note: tabs converted to whitespace because of c&p) > > Index: ip_output.c > =================================================================== > RCS file: /home/ncvs/src/sys/netinet/ip_output.c,v > retrieving revision 1.203 > diff -u -p -r1.203 ip_output.c > --- ip_output.c 20 Nov 2003 20:07:38 -0000 1.203 > +++ ip_output.c 7 Jan 2004 22:43:54 -0000 > @@ -246,7 +246,7 @@ ip_output(struct mbuf *m0, struct mbuf * > #ifdef RANDOM_IP_ID > ip->ip_id = ip_randomid(); > #else > - ip->ip_id = ip_id++; > + ip->ip_id = htons(ip_id++); > #endif > } else { > ip->ip_off = IP_DF; > Index: tcp_subr.c > =================================================================== > RCS file: /home/ncvs/src/sys/netinet/tcp_subr.c,v > retrieving revision 1.172 > diff -u -p -r1.172 tcp_subr.c > --- tcp_subr.c 6 Jan 2004 23:29:46 -0000 1.172 > +++ tcp_subr.c 7 Jan 2004 22:43:55 -0000 > @@ -459,6 +459,8 @@ tcp_respond(tp, ipgen, th, m, ack, seq, > tlen += sizeof (struct tcpiphdr); > ip->ip_len = tlen; > ip->ip_ttl = ip_defttl; > + if (path_mtu_discovery) > + ip->ip_off |= IP_DF; > } > m->m_len = tlen; > m->m_pkthdr.len = tlen; > @@ -1733,6 +1735,8 @@ tcp_twrespond(struct tcptw *tw, struct s > m->m_pkthdr.csum_flags = CSUM_TCP; > m->m_pkthdr.csum_data = offsetof(struct tcphdr, th_sum); > ip->ip_len = m->m_pkthdr.len; > + if (path_mtu_discovery) > + ip->ip_off |= IP_DF; > error = ip_output(m, inp->inp_options, NULL, > (tw->tw_so_options & SO_DONTROUTE), NULL, inp); > } > > Richard Wendland wrote: > > > > I've been asked (for freebsd-bugs) to open a discussion about PR > > kern/60889 on freebsd-net, to decide if a recent change to IP should be > > reversed before 5.2-RELEASE (scheduled this month) to give more time > > for some serious issues and risks my PR has raised to be considered > > and tested. My proposal is to revert to 5.1 behaviour for now. > > > > The recent change is to emit a zero fragmentation id when DF is set, with > > the objective of improving privacy from external id sequence observation, > > an issue raised by Steve Bellovin's paper in IMW'02. > > > > I've identified 4 problems with this change: > > > > 1) Even with path_mtu_discovery set, for some reason TCP emits FIN-ACK > > without DF. This causes two problems for this change: > > > > a) Currently the change doesn't really meet its objective for TCP, > > it just means FIN-ACK must be observed. > > > > b) Because now just one id is consumed per TCP connection on the > > FIN-ACK, for most/many systems id becomes a close approximate > > count of the number of TCP connections it has made, which external > > observers can see, not possible before this change. To me this > > seems more of a privacy issue for all FreeBSD users than the NAT > > issue in Steve Bellovin's paper this change seeks to solve. > > > > 2) Linux made exactly this change some time ago, around Linux 2.4.4, > > but was forced to back-out the change because (I think) of practical > > connectivity issues related to the comment in include/net/ip.h > > ip_select_ident() where it now implements an incrementing ip_id for DF: > > > > /* This is only to work around buggy Windows95/2000 > > * VJ compression implementations. If the ID field > > * does not change, they drop every other packet in > > * a TCP stream using header compression. > > */ > > > > I'm not aware that anyone checked that this buggy Windows95/2000 VJ > > compression problem is no longer an issue in practice. I doubt that > > the large web hosters who use FreeBSD would be best-pleased if they > > ran into this for even just a few users - especially as there is no > > config option to disable this change. > > > > 3) This change causes ip_id for non-DF to be output in native > > byte order in ip_output.c. Unfortunately ip_id is still output in > > Network Byte Order in ip_mroute.c and raw_ip.c, so this change risks > > little-endian machines emitting the same fragmentation id at about > > the same time from these different modules, rather than the usual > > 64k cycle; creating a small but real risk of re-assembly errors. > > [This isn't in my PR, I've only just noticed it.] > > > > I also have a suspicion that some middle-boxes (like HTTP load-balancers) > > may clear DF without setting a fresh IP id - clearing DF would save them > > the bother of routing ICMP fragmentation needed back to the source server. > > If this is so this is another problem this change could show up which > > may cause re-assembly errors. I know the bug is elsewhere, but it would > > still become a practical problem for some FreeBSD users. > > > > So before going with this change I think four things need to be done: > > > > 1) TCP changed so FIN-ACK goes out with DF if path_mtu_discovery set. > > > > 2) Tests with Windows95/2000 TCP VJ compression (RFC1144) run. > > > > 3) ip_id should be emitted in the same byte order everywhere. > > > > 4) The change made a config option, so sites can disable it should > > they run into problems, just as RANDOM_IP_ID is an option. > > > > This all seems too much to do for 5.2-RELEASE, and as I think the problems > > and risks sufficiently serious, I propose reversing the change until this > > can be done. We need a quick decision on this to get it into 5.2-RELEASE. > > > > The PR has some more detail and tcpdump output demonstrating the > > issue: > > > > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/60889 > > > > The change to be reversed can be seen at: > > > > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/ip_output.c#rev1.189 > > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/netinet/ip_output.c.diff?r1=1.188&r2=1.189 > > > > NB If anyone can easily run tests with Windows95/2000 TCP VJ compression that > > would be great (using tcpdump to see if there is abnormal retransmission). > > > > NB2 If anyone knows why FIN-ACK goes out without DF that would be helpful. > > I've quickly looked at the source, and I can't see why. It's emitted > > as TCP moves from FIN_WAIT_n state to TIME_WAIT probably at line 3091 of > > tcp_input.c with a call to tcp_output(). tcp_output() always appears to > > set IP_DF at line 998 of tcp_output.c, if path_mtu_discovery is enabled. > > A puzzle. > > > > Thanks to Boris Staeblow <bs@dva.in-berlin.de> and Tim Rylance > > <tkr@puffball.demon.co.uk> for highlighting this change and helping me > > diagnose the issues. > > > > Richard > > -- > > Richard Wendland richard@wendland.org.uk > > _______________________________________________ > > freebsd-net@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-net > > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3FFC97A1.4F397CC3>