From owner-freebsd-net Tue Jan 9 9:16:30 2001 Delivered-To: freebsd-net@freebsd.org Received: from freesbee.wheel.dk (freesbee.wheel.dk [193.162.159.97]) by hub.freebsd.org (Postfix) with ESMTP id 466F937B6C7 for ; Tue, 9 Jan 2001 09:16:10 -0800 (PST) Received: by freesbee.wheel.dk (Postfix, from userid 1001) id 631353E59; Tue, 9 Jan 2001 18:16:09 +0100 (CET) Date: Tue, 9 Jan 2001 18:16:09 +0100 From: Jesper Skriver To: Don Lewis Cc: freebsd-net@FreeBSD.ORG Subject: Re: ICMP error processing (was: Re: what to do now ? Was: cvs commit: src/sys/netinet ip_icmp.c tcp_subr.c tcp_var.h) Message-ID: <20010109181609.B10100@skriver.dk> References: <20001218182600.C1856@skriver.dk> <20001219222730.A29741@skriver.dk> <200012201046.CAA19456@salsa.gv.tsc.tdk.com> <20001220155118.N81814@skriver.dk> <20001231210746.A81834@skriver.dk> <200101090702.XAA15712@salsa.gv.tsc.tdk.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <200101090702.XAA15712@salsa.gv.tsc.tdk.com>; from Don.Lewis@tsc.tdk.com on Mon, Jan 08, 2001 at 11:02:17PM -0800 Sender: owner-freebsd-net@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.org On Mon, Jan 08, 2001 at 11:02:17PM -0800, Don Lewis wrote: > [ cc: trimmed ] > > On Dec 31, 9:07pm, Jesper Skriver wrote: > } Subject: Re: what to do now ? Was: cvs commit: src/sys/netinet ip_icmp.c > } On Wed, Dec 20, 2000 at 03:51:18PM +0100, Jesper Skriver wrote: > } > On Wed, Dec 20, 2000 at 02:46:21AM -0800, Don Lewis wrote: > } > > } > > } @@ -714,6 +715,15 @@ > } > > } (lport && inp->inp_lport != lport) || > } > > } (laddr.s_addr && inp->inp_laddr.s_addr != laddr.s_addr) || > } > > } (fport && inp->inp_fport != fport)) { > } > > } + inp = inp->inp_list.le_next; > } > > } + continue; > } > > > } > > Wouldn't it be more cleaner (gets rid of the loop) and more efficient (if > } > > we're getting blasted with ICMP messages) to use in_pcblookup_hash()? > } > > } > I didn't change the loop, but I'll have a look at this code, to see if > } > we can improve it, but again, to get moving, I'd like to commit this, > } > and leave this for a later improvement, ok ? > } > } I've looked at this, and as far as I can see we cannot use > } in_pcblookup_hash, as it lookup a single session, and the code can in > } other cases act on multiple sessions, path MTU discovery is such a case. > > Actually, I don't see where path MTU discovery acts on multiple sessions. It was not something I studied in great detail, but the code currently has the ability to do so. > It looks like we pass a non-NULL third argument to tcp_ctlinput(), which > causes tcp_ctlinput() to pass the optional IP address and port information > to in_pcbnotify(), and in_pcbnotify() doesn't handle the PRC_MSGSIZE > specially, like it does with PRC_IS_REDIRECT and PRC_HOSTDEAD. Bug? > Should path MTU discovery be handled by pfctlinput() to notify all the > protocols? This would make sense, the MTU is protocol independent > My concern is that having to search through all the PCBs each time an > ICMP error message is received makes it too easy to DoS a busy server > that has a lot of open sockets. > > The way icmp -> notify code is structured is kind of strange anyway > (there's a nice diagram in TCP/IP Illustrated Volume 2 - Figure 22.32). > I think it would make more sense to validate the ICMP error and redirect > packets fairly early on by doing the PCB lookup and dropping any ICMP > packets that don't have a matching PCB. Redirects *and* path MTU > discovery messages would then be handled by pfctlinput(), and the other > errors by the protocol specific handler. I think it is a mistake to > call the same *_ctlinput() functions from both icmp_input() for one > connection and pfctlinput() for many connections. I think the call > chain should look something like the following, though some more thought > needs to be given to sensibly unwinding things to avoid duplicate code > and avoid doing duplicate work. > > icmp_input->{tcp,udp}_ctlinput1->in_pcbnotify1-> > {pfctlinput->{tcp,udp}_ctlinput->in_pcbnotify->..., {tcp,udp}_notify, ...} > > In the current code, it looks like an ICMP error that the addresses > and ports zeroed out will erroneously affect many sessions (what do > the RFCs say?). rfc793 doesn't mention any specific range for port numbers apart form being a 16 bit number, but it does say explicitly (in section 3.3) that the sequence number is a range from 0 to 2**32 - 1 If the port numbers and/or addresses are zero we affect many (or even all) sessions, I don't have anything near a full overview of the kernel, so I don't know if there is any check anywhere else for zero in those addresses and/or port numbers. But one "easy" fix for that would be to change the loop as you suggested to a in_pcblookup_hash lookup, I can have a look at this, but I'd appreciate if someone could review and commit kern/23986 so I don't have too many outstanding changes. > Another bit of strangeness is pfctlinput(PRC_IF{UP,DOWN}, ...) which > is passed the interface address, which is matched against the > *destination* address in the TCP and UDP PCBs. Fortunately these > protocols seem to ignore these commands ... As I see this we need a new function to do the same just for the source address in TCP and UDP PCBs, right ? /Jesper -- Jesper Skriver, jesper(at)skriver(dot)dk - CCIE #5456 Work: Network manager @ AS3292 (Tele Danmark DataNetworks) Private: Geek @ AS2109 (A much smaller network ;-) One Unix to rule them all, One Resolver to find them, One IP to bring them all and in the zone to bind them. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-net" in the body of the message