Date: Mon, 21 Aug 2006 18:09:45 +0300
From: Rostislav Krasny <rosti.bsd@gmail.com>
To: Daniel Hartmeier <daniel@benzedrine.cx>
Cc: freebsd-net@freebsd.org
Subject: Re: PF or "traceroute -e -P TCP" bug?
Message-ID: <20060821180945.6a75bc44.rosti.bsd@gmail.com>
In-Reply-To: <20060821092350.GL20788@insomnia.benzedrine.cx>
References: <20060818235756.25f72db4.rosti.bsd@gmail.com> <20060821092350.GL20788@insomnia.benzedrine.cx>

On Mon, 21 Aug 2006 11:23:50 +0200
Daniel Hartmeier <daniel@benzedrine.cx> wrote:

> [ I'm CC'ing Crist, maybe he can explain why -e behaves like it does ]
>
> On Fri, Aug 18, 2006 at 11:57:56PM +0300, Rostislav Krasny wrote:
>
> > I've tried the new "-e" traceroute option on today's RELENG_6 and
> > found the following problem:
> >
> > > traceroute -nq 1 -e -P TCP -p 80 216.136.204.117
>
> As I understand the -e option, that should send a sequence of TCP SYNs
> with
>
>   - constant source port (randomly picked per invocation)
>   - constant destination port 80
>   - increasing TTL per probe
>
> Assuming you pass the packets with pf, it matters whether you create
> state or not. Filtering statelessly (without 'keep state'), there should
> be no problem at all. I assume you're filtering statefully.

I don't use 'keep state' in any pf rule. But I use a nat rule like this:

nat on $ext_if from $internal_net to any -> ($ext_if)

and according to 'pfctl -s state' any NAT-ed TCP connection creates a
state. For example, during the above traceroute:

self tcp 192.168.1.2:34345 -> xxx.xxx.xxx.xxx:50646 -> 216.136.204.117:80 SYN_SENT:CLOSED

> With constant source and destination ports, the first probe should
> create a state entry and all further probes (of the same traceroute
> invocation) should match that state entry.
>
> What you changed in your patch is switching to a sequential (instead of
> constant) source port. This forces creation of one state per probe,
> treating each probe as a separate connection.

Correct.

> I don't think that's in
> the spirit of the -e option. There's really no need for that, once the
> underlying problem is fixed.
>
> So, why doesn't -e without your patch produce probes that all match a
> single state entry?

By the way, I asked a friend from IRC to try "traceroute -e -P TCP"
through his router, which does NAT with natd, and it worked there.
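
The difference between constant and sequential source ports discussed above can be sketched with a toy state table. This is only an illustration under loud assumptions: `conn_key`, `state_table`, `track_probe` and `states_after_probes` are hypothetical names, the lookup is a linear scan, and the translated NAT address is ignored. It is not how pf stores or matches state; it only shows why probes that share one address/port pair hit a single entry, while a new source port per probe creates a new entry each time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified connection key; real pf state keys hold more. */
struct conn_key {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
};

#define MAX_STATES 64

struct state_table {
    struct conn_key keys[MAX_STATES];
    size_t count;
};

static int key_equal(const struct conn_key *a, const struct conn_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Returns 1 if a new entry was created, 0 if the probe matched an
 * existing entry, -1 if the toy table is full. */
int track_probe(struct state_table *t, const struct conn_key *k)
{
    for (size_t i = 0; i < t->count; i++)
        if (key_equal(&t->keys[i], k))
            return 0;
    if (t->count == MAX_STATES)
        return -1;
    t->keys[t->count++] = *k;
    return 1;
}

/* Simulate n probes (n <= MAX_STATES) and return how many entries exist
 * afterwards. With sequential_sport set, each probe uses the next
 * source port, as in the patch discussed above. */
size_t states_after_probes(int n, int sequential_sport)
{
    struct state_table t = { .count = 0 };
    /* 192.168.1.2:34345 -> 216.136.204.117:80, taken from the message */
    const struct conn_key base = { 0xC0A80102u, 0xD888CC75u, 34345, 80 };

    for (int i = 0; i < n; i++) {
        struct conn_key probe = base;
        if (sequential_sport)
            probe.src_port = (uint16_t)(base.src_port + i);
        (void)track_probe(&t, &probe);
    }
    return t.count;
}
```

In this model, `states_after_probes(10, 0)` leaves one entry and `states_after_probes(10, 1)` leaves ten, which mirrors the one-state-per-probe effect of the sequential source port.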

> Look at how the TCP sequence numbers are generated across the probes:
>
>   tcp->th_seq = (tcp->th_sport << 16) | (tcp->th_dport +
>       (fixedPort ? outdata->seq : 0));
>
> This is the problem. traceroute increments the sequence number with each
> probe. I don't know why that is done. Why not use the same th_seq for
> all probes, like an ISN (initial sequence number) would be re-used in
> retransmissions in a real TCP handshake?
>
> If you create state on the first TCP SYN pf sees, pf will note the ISN
> from the traceroute side. When pf sees further SYNs from that side, it
> will deal with them like with any client retransmitting the SYN of the
> handshake (before the peer replies with a SYN+ACK, giving its side's
> ISN). Subsequent TCP SYNs with different ISN matching the address/port
> pairs will be blocked by pf.
>
> If this happens on the IP forwarding path (i.e. pf blocks the packet
> outgoing), the stack produces the ICMP host unreachable error that shows
> up as "!H" in traceroute. I assume you have a "pass out on $ext_if keep
> state" rule, and don't filter on the internal interface. If you add
> stateful filtering on the internal interface, I think you'll find that
> subsequent TCP SYNs are blocked without eliciting the ICMP error.
>
> I suggest traceroute with -e uses fixed th_seq, as in
>
> -	tcp->th_seq = (tcp->th_sport << 16) | (tcp->th_dport +
> -	    (fixedPort ? outdata->seq : 0));
> +	tcp->th_seq = (tcp->th_sport << 16) tcp->th_dport;

Even if I add the accidentally deleted '|', it doesn't fix the problem:

> traceroute -nq 1 -e -P TCP -p 80 www.freebsd.org
traceroute to www.freebsd.org (216.136.204.117), 64 hops max, 52 byte packets
 1  192.168.1.1  0.525 ms
 2  10.0.0.138  2.122 ms
 3  *
 4  *
 5  *
 6  *
 7  *
 8  *
 9  *
10  152.63.3.122  191.562 ms
11  *
12  *
^C

I can decrease the number of "*" hops with the -w option:

> traceroute -nq 1 -e -w 10 -P TCP -p 80 www.freebsd.org
traceroute to www.freebsd.org (216.136.204.117), 64 hops max, 52 byte packets
 1  192.168.1.1  0.506 ms
 2  10.0.0.138  1.886 ms
 3  *
 4  *
 5  *
 6  *
 7  212.143.12.45  151.282 ms
 8  *
^C

Judging by repeated runs of 'pfctl -s state | grep 216.136.204.117',
this really is related to the TCP states in pf. Before the
212.143.12.45 hop the state was closed, and after that hop a new state
was created.

And by the way, I think the tcp_check() function checks tcp->th_seq
incorrectly:

tcp->th_seq == (ident << 16) | (port + seq)

In the original version, or after my patch, it should be changed to this:

tcp->th_seq == (htons(ident) << 16) | (port + (fixedPort ? seq : 0))

and after your patch to this:

tcp->th_seq == (htons(ident) << 16) | port

It looks like the return value of tcp_check() isn't used anywhere
anyway.
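
The th_seq variants and the comparison discussed above can be tried in isolation with a small sketch. The function names and the plain integer parameters are assumptions made for illustration; the real traceroute code works on `struct tcphdr` fields and its own globals (`ident`, `port`, `fixedPort`), which are not reproduced here. One extra point worth checking against the actual source: in C, `==` binds more tightly than `|`, so a comparison written literally as `th_seq == (a << 16) | b` compares only against `(a << 16)` and then ORs in `b`. The surrounding code is not shown in this message, so it may already carry the needed parentheses.

```c
#include <stdint.h>

/* th_seq as in the quoted original line: with a fixed port, the probe
 * sequence number is folded into th_seq, so it changes per probe. */
uint32_t probe_seq_original(uint16_t sport, uint16_t dport,
                            int fixed_port, int seq)
{
    return ((uint32_t)sport << 16) |
           (uint32_t)(dport + (fixed_port ? seq : 0));
}

/* th_seq as in the suggested change (with the '|' restored): the same
 * value for every probe, like a retransmitted SYN reusing its ISN. */
uint32_t probe_seq_constant(uint16_t sport, uint16_t dport)
{
    return ((uint32_t)sport << 16) | dport;
}

/* How C parses "th_seq == (sport << 16) | dport" when written without
 * outer parentheses: the comparison happens first, then the OR. The
 * result is non-zero whenever dport is non-zero. */
int check_as_written(uint32_t th_seq, uint16_t sport, uint16_t dport)
{
    return (th_seq == ((uint32_t)sport << 16)) | dport;
}

/* The presumably intended comparison against the whole expected value. */
int check_parenthesized(uint32_t th_seq, uint16_t sport, uint16_t dport)
{
    return th_seq == (((uint32_t)sport << 16) | dport);
}
```

With source port 0x1234 and destination port 80, `probe_seq_original` yields a different value for each probe number when `fixed_port` is set, while `probe_seq_constant` always yields 0x12340050. `check_as_written` accepts an arbitrary th_seq such as 0xdeadbeef, whereas `check_parenthesized` rejects it.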