Date: Wed, 22 Apr 2015 12:30:55 +1000
From: Lawrence Stewart <lstewart@freebsd.org>
To: "freebsd-net@freebsd.org" <freebsd-net@FreeBSD.org>
Subject: FYI: TSO results in significant IP ID reuse on short time scales
Message-ID: <553707DF.20305@freebsd.org>

Hi all,

An interesting observation I made last night digging into some pcap files captured on a server doing TSO/LRO and a client-side bridge seeing all packets as they were on the wire...

When doing TSO, the NIC takes the stack-supplied template IP/TCP header and increments the IP ID field for each segment put on the wire, resulting in IP IDs of n, n+1, n+2, etc. For the next TSO chunk, the stack thinks it's sending the next logical segment, and if no or few packets have been sent in the meantime, sets the IP ID to n+1, and the NIC then diligently sends out segments with n+1, n+2, n+3, etc.

This behaviour causes a lot of segments within the same window/RTT to have duplicate IP IDs. For obvious reasons, this is rather unfortunate if packets are not marked DF. This also potentially might upset some middle boxes doing 5-tuple based flow normalisation, although IP ID randomisation has probably weeded out most boxes that would have had problems with repeated IP IDs.

At any rate, this is a cautionary note to file away in the back of your minds, especially when DF is not set for your packets (probably when PMTUD is disabled or not working).

There's also an open question as to whether there might be some value in having the IP code increment the IP ID counter based on the number of MSS segments the packet nominally represents, so that the stack-generated IP ID will not overlap with TSO segments when sequential generation is used. Thoughts welcome.

Below is a concrete example from the pcap files I was poking at last night, with IP addresses sanitised. Note the client was behind a NAT doing port translation, but it was passing IP IDs through untouched as far as I can tell.

server side:

> 17:05:12.882836 IP (tos 0x0, ttl 64, id 60276, offset 0, flags [DF], proto TCP (6), length 14532, bad cksum 0 (->c7b7)!)
>     server.80 > client-nat.24122: Flags [.], seq 68344824:68359304, ack 2096907437, win 2050, options [nop,nop,TS val 1580984794 ecr 87033596], length 14480
> 17:05:12.885424 IP (tos 0x0, ttl 64, id 60277, offset 0, flags [DF], proto TCP (6), length 17428, bad cksum 0 (->bc66)!)
>     server.80 > client-nat.24122: Flags [.], seq 68359304:68376680, ack 2096907437, win 2050, options [nop,nop,TS val 1580984796 ecr 87033598], length 17376

is seen as this on the client side:

> 17:05:12.882597 IP (tos 0x0, ttl 57, id 60276, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xb45b (correct), seq 68344824:68346272, ack 2096907437, win 2050, options [nop,nop,TS val 1580984794 ecr 87033596], length 1448
> 17:05:12.882673 IP (tos 0x0, ttl 57, id 60277, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xb1f4 (correct), seq 68346272:68347720, ack 2096907437, win 2050, options [nop,nop,TS val 1580984794 ecr 87033596], length 1448
> 17:05:12.882676 IP (tos 0x0, ttl 57, id 60278, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0x0821 (correct), seq 68347720:68349168, ack 2096907437, win 2050, options [nop,nop,TS val 1580984794 ecr 87033596], length 1448
> 17:05:12.882678 IP (tos 0x0, ttl 57, id 60279, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xa768 (correct), seq 68349168:68350616, ack 2096907437, win 2050, options [nop,nop,TS val 1580984794 ecr 87033596], length 1448
> 17:05:12.882681 IP (tos 0x0, ttl 57, id 60280, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xc26c (correct), seq 68350616:68352064, ack 2096907437, win 2050, options [nop,nop,TS val 1580984794 ecr 87033596], length 1448
> 17:05:12.882683 IP (tos 0x0, ttl 57, id 60281, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xf750 (correct), seq 68352064:68353512, ack 2096907437, win 2050, options [nop,nop,TS val 1580984794 ecr 87033596], length 1448
> 17:05:12.882686 IP (tos 0x0, ttl 57, id 60282, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0x95e9 (correct), seq 68353512:68354960, ack 2096907437, win 2050, options [nop,nop,TS val 1580984794 ecr 87033596], length 1448
> 17:05:12.882688 IP (tos 0x0, ttl 57, id 60283, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xb0d6 (correct), seq 68354960:68356408, ack 2096907437, win 2050, options [nop,nop,TS val 1580984794 ecr 87033596], length 1448
> 17:05:12.882698 IP (tos 0x0, ttl 57, id 60284, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xe7ed (correct), seq 68356408:68357856, ack 2096907437, win 2050, options [nop,nop,TS val 1580984794 ecr 87033596], length 1448
> 17:05:12.882772 IP (tos 0x0, ttl 57, id 60285, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xec4d (correct), seq 68357856:68359304, ack 2096907437, win 2050, options [nop,nop,TS val 1580984794 ecr 87033596], length 1448

Note the reuse begins here for the transmission of the second TSO chunk of 17376 bytes.

> 17:05:12.884413 IP (tos 0x0, ttl 57, id 60277, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xd935 (correct), seq 68359304:68360752, ack 2096907437, win 2050, options [nop,nop,TS val 1580984796 ecr 87033598], length 1448
> 17:05:12.884416 IP (tos 0x0, ttl 57, id 60278, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xce96 (correct), seq 68360752:68362200, ack 2096907437, win 2050, options [nop,nop,TS val 1580984796 ecr 87033598], length 1448
> 17:05:12.884418 IP (tos 0x0, ttl 57, id 60279, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0x273d (correct), seq 68362200:68363648, ack 2096907437, win 2050, options [nop,nop,TS val 1580984796 ecr 87033598], length 1448
> 17:05:12.884420 IP (tos 0x0, ttl 57, id 60280, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xc86b (correct), seq 68363648:68365096, ack 2096907437, win 2050, options [nop,nop,TS val 1580984796 ecr 87033598], length 1448
> 17:05:12.884422 IP (tos 0x0, ttl 57, id 60281, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xbde3 (correct), seq 68365096:68366544, ack 2096907437, win 2050, options [nop,nop,TS val 1580984796 ecr 87033598], length 1448
> 17:05:12.884423 IP (tos 0x0, ttl 57, id 60282, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0x144a (correct), seq 68366544:68367992, ack 2096907437, win 2050, options [nop,nop,TS val 1580984796 ecr 87033598], length 1448
> 17:05:12.884425 IP (tos 0x0, ttl 57, id 60283, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xb999 (correct), seq 68367992:68369440, ack 2096907437, win 2050, options [nop,nop,TS val 1580984796 ecr 87033598], length 1448
> 17:05:12.884427 IP (tos 0x0, ttl 57, id 60284, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0x982d (correct), seq 68369440:68370888, ack 2096907437, win 2050, options [nop,nop,TS val 1580984796 ecr 87033598], length 1448
> 17:05:12.884498 IP (tos 0x0, ttl 57, id 60285, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xa79a (correct), seq 68370888:68372336, ack 2096907437, win 2050, options [nop,nop,TS val 1580984796 ecr 87033598], length 1448
> 17:05:12.884500 IP (tos 0x0, ttl 57, id 60286, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xbc50 (correct), seq 68372336:68373784, ack 2096907437, win 2050, options [nop,nop,TS val 1580984796 ecr 87033598], length 1448
> 17:05:12.884501 IP (tos 0x0, ttl 57, id 60287, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0xc84c (correct), seq 68373784:68375232, ack 2096907437, win 2050, options [nop,nop,TS val 1580984796 ecr 87033598], length 1448
> 17:05:12.884503 IP (tos 0x0, ttl 57, id 60288, offset 0, flags [DF], proto TCP (6), length 1500)
>     server.80 > client-priv.57140: Flags [.], cksum 0x7468 (correct), seq 68375232:68376680, ack 2096907437, win 2050, options [nop,nop,TS val 1580984796 ecr 87033598], length 1448

Cheers,
Lawrence
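
For readers who want to check the arithmetic in the trace, the overlap can be modelled in a few lines of Python. This is only an illustrative sketch, not FreeBSD stack or NIC code: the function names (`wire_ip_ids`, `duplicated`) and the `advance_by_segments` switch are hypothetical, and the model assumes sequential IP ID generation, an MSS of 1448 bytes as seen in the capture, and no other packets sent between the two TSO chunks. With `advance_by_segments=True` it models the idea floated in the message of advancing the stack's counter by the number of MSS segments a chunk represents.

```python
from collections import Counter

MSS = 1448  # per-segment payload seen in the client-side capture


def wire_ip_ids(chunk_lengths, first_id, mss=MSS, advance_by_segments=False):
    """Return the IP IDs that appear on the wire for a series of TSO chunks.

    Hypothetical model: the stack stamps each chunk with its current ID,
    and the NIC increments that ID once per segment it emits.
    """
    ids = []
    stack_id = first_id
    for length in chunk_lengths:
        nseg = (length + mss - 1) // mss  # segments the NIC will emit
        ids.extend((stack_id + i) & 0xFFFF for i in range(nseg))
        # Observed behaviour: stack advances by 1 per chunk.
        # Proposed behaviour: stack advances by the segment count.
        step = nseg if advance_by_segments else 1
        stack_id = (stack_id + step) & 0xFFFF  # IP ID is a 16-bit field
    return ids


def duplicated(ids):
    """Return the sorted list of IDs that occur more than once."""
    return sorted(i for i, n in Counter(ids).items() if n > 1)


# The two chunks from the server-side capture: 14480 and 17376 bytes.
observed = wire_ip_ids([14480, 17376], first_id=60276)
proposed = wire_ip_ids([14480, 17376], first_id=60276, advance_by_segments=True)

print(duplicated(observed))  # IDs 60277 through 60285 are reused
print(duplicated(proposed))  # no reuse under the proposed counter step
```

Under these assumptions the model reproduces the capture: the first chunk yields IDs 60276 to 60285 and the second yields 60277 to 60288, so nine IDs repeat within a couple of milliseconds.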