From owner-freebsd-net Sun Jun 11 11:24:25 2000
Delivered-To: freebsd-net@freebsd.org
Received: from alive.znep.com (alive.znep.com [207.167.15.58])
	by hub.freebsd.org (Postfix) with ESMTP id C0ACD37B8FD
	for ; Sun, 11 Jun 2000 11:24:18 -0700 (PDT)
	(envelope-from marcs@znep.com)
Received: from localhost (marcs@localhost)
	by alive.znep.com (8.9.3/8.9.1) with ESMTP id MAA65898;
	Sun, 11 Jun 2000 12:24:05 -0600 (MDT)
	(envelope-from marcs@znep.com)
Date: Sun, 11 Jun 2000 12:24:05 -0600 (MDT)
From: Marc Slemko
To: Dann Lunsford
Cc: net@FreeBSD.ORG
Subject: Re: Strange MTU related (?) problem
In-Reply-To: <20000611105000.A7581@greycat.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-net@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Sun, 11 Jun 2000, Dann Lunsford wrote:

> I started playing with traceroute, nmap, etc. Nothing obvious; the last
> few hops to slashdot.org were blotto, but with so many idiots blocking
> ICMP indiscriminately, that wasn't surprising. Then I realized that path
> MTU discovery depended on ICMP, so I started playing with the MTU on
> this box. BINGO! Any MTU larger than 1024 (extremely suspicious number,
> I think!), slashdot not accessible; 1024 or smaller, everything OK.
> Turning net.inet.tcp.path_mtu_discovery on or off had no effect,

Yup, because it is the remote end that is the problem.

(oh, and before I get my rant going I should mention
http://users.worldgate.com/~marcs/mtu/ which contains an overview of the
general PMTU-D horkage issue for people not familiar with it)

> Now this is a solution to the immediate problem, But it irritates me greatly
> because, among other things, it affects my LAN performance, So, questions:
> Anybody else see this sort of behavior, and, is there any way to pinpoint
> what machine is doing this so polite letters can be written to the
> responsible parties (Packages with high explosives are not an option; much
> too messy :-) ) ?

Well, there are two possible ways to avoid it.

First, somewhere between you and them there is a link with an MTU
smaller than the MTU of your local interface. Normally, this link is
closer to you than them. So you can work around it by making sure that
none of the links close to you have an MTU smaller than your systems
do. This assumes, of course, they are using ethernet and not something
with a bigger MTU that makes this impractical. That is still just a
workaround, though it often makes sense for other reasons anyway. Most
networks have a setup like this workaround suggests, which is why they
don't run into the problem.

Second, the problem is likely with slashdot's load balancing system. I
haven't used the Arrowpoint load balancer that slashdot is using, but in
my experience most load balancers are too braindead to properly deal
with PMTU-D because the people writing code for them often barely
understand what TCP is. What they need to do, when they get an ICMP
can't fragment message, is send it to _all_ the backend boxes (well,
they can narrow that down a bit if they take extra pains, but it isn't
necessarily worth it). Cisco's local director pile of junk doesn't.
F5's bigips don't (well, supposedly they do now after a big stink was
raised on nanog about how they broke whois, but I haven't seen it in
action yet and they certainly refused to fix it a year or so before that
incident, when we complained). I would not be at all surprised if the
Arrowpoint doesn't either.

The general rule is that if you have systems behind a load balancer, you
must disable PMTU-D on them unless you are positive that your load
balancer handles it properly.

Ten to one, that is the problem here. It could be elsewhere, but it
isn't as likely. I can tell you that the www.slashdot.org web servers
do have PMTU-D enabled.

It isn't clear why this suddenly started happening for you.
Perhaps there was some change closer to your end of the world that
either increased the MTU on your systems or decreased the MTU of some
link. Or perhaps slashdot really did change something recently. I
suppose it is also possible that something in the path between you and
them started blocking ICMP can't fragment messages.

In any case, the proper fix is probably to complain to slashdot and tell
them to fix their probably broken systems. Unfortunately, without
access to both ends, determining for sure whether that is the case
isn't easy.
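
For anyone who wants to see the load balancer failure mode in miniature,
here is a rough Python sketch. It is a toy model, not how any real
product (Arrowpoint, Cisco, F5) is implemented. The class and function
names (Backend, LoadBalancer, fetch_succeeds) are made up for
illustration, and the 1024-byte narrow link is only an assumption
borrowed from the threshold reported above. It models a backend that
sends full-size packets with DF set, a router on a narrow link that
answers with ICMP can't fragment, and a balancer that either drops that
ICMP message or passes it to all the backends.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    """Hypothetical server behind the balancer, with PMTU-D on (DF always set)."""
    name: str
    local_mtu: int = 1500
    # Learned path MTU per client address (toy stand-in for the route cache).
    path_mtu: dict = field(default_factory=dict)

    def packet_size_for(self, client: str) -> int:
        # Largest packet this backend will send toward the client.
        return min(self.local_mtu, self.path_mtu.get(client, self.local_mtu))

    def handle_frag_needed(self, client: str, next_hop_mtu: int) -> None:
        # ICMP can't fragment received: lower the cached path MTU.
        self.path_mtu[client] = min(self.packet_size_for(client), next_hop_mtu)


@dataclass
class LoadBalancer:
    """Hypothetical balancer; fan_out_icmp selects broken vs. correct behavior."""
    backends: list
    fan_out_icmp: bool

    def deliver_frag_needed(self, client: str, next_hop_mtu: int) -> None:
        # The ICMP error is addressed to the virtual IP, so the simple
        # approach is to hand it to every backend.
        if not self.fan_out_icmp:
            return  # dropped: no backend ever learns the smaller MTU
        for backend in self.backends:
            backend.handle_frag_needed(client, next_hop_mtu)


def fetch_succeeds(balancer, backend, client, link_mtu, max_rounds=3):
    """Return True if a full-size response packet eventually fits the link."""
    for _ in range(max_rounds):
        if backend.packet_size_for(client) <= link_mtu:
            return True
        # Packet too big with DF set: the router drops it and sends
        # ICMP can't fragment back toward the virtual IP.
        balancer.deliver_frag_needed(client, link_mtu)
    return False  # every retransmission was dropped: the connection hangs
```

With fan_out_icmp=False, fetch_succeeds(...) over a 1024-byte link
returns False, which is the hang described above; with fan_out_icmp=True
it returns True after one round. If the narrowest link is 1500 bytes,
both variants succeed, which is why most networks never notice.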