From owner-freebsd-current@FreeBSD.ORG Wed Jan 17 18:00:13 2007 Return-Path: X-Original-To: freebsd-current@freebsd.org Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 48AC716A54F; Wed, 17 Jan 2007 18:00:13 +0000 (UTC) (envelope-from rizzo@icir.org) Received: from xorpc.icir.org (xorpc.icir.org [192.150.187.68]) by mx1.freebsd.org (Postfix) with ESMTP id 0C1C313C455; Wed, 17 Jan 2007 18:00:13 +0000 (UTC) (envelope-from rizzo@icir.org) Received: from xorpc.icir.org (localhost [127.0.0.1]) by xorpc.icir.org (8.12.11/8.13.6) with ESMTP id l0HI0CTj022728; Wed, 17 Jan 2007 10:00:12 -0800 (PST) (envelope-from rizzo@xorpc.icir.org) Received: (from rizzo@localhost) by xorpc.icir.org (8.12.11/8.12.3/Submit) id l0HI0CQR022727; Wed, 17 Jan 2007 10:00:12 -0800 (PST) (envelope-from rizzo) Date: Wed, 17 Jan 2007 10:00:12 -0800 From: Luigi Rizzo To: John Baldwin Message-ID: <20070117100012.A22625@xorpc.icir.org> References: <20061130105537.A69725@xorpc.icir.org> <20061201100821.A85139@xorpc.icir.org> <200701171209.54142.jhb@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <200701171209.54142.jhb@freebsd.org>; from jhb@freebsd.org on Wed, Jan 17, 2007 at 12:09:53PM -0500 Cc: freebsd-current@freebsd.org, current@freebsd.org Subject: Re: [bug found] Re: byte swapped udp length in diskless bootp request ? X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Jan 2007 18:00:13 -0000 On Wed, Jan 17, 2007 at 12:09:53PM -0500, John Baldwin wrote: > On Friday 01 December 2006 13:08, Luigi Rizzo wrote: > > On Thu, Nov 30, 2006 at 10:55:37AM -0800, Luigi Rizzo wrote: > > > i was just trying to diskless-boot a -current kernel, > > > and when it was time for the kernel to acquire the address > > > i was getting the usual > > > > > > DHCP/BOOTP timeout for server 255.255.255.255 > > > > > > Usually it is because of lack of connectivity, but > > > a bit of inspection on the server showed (as you can see > > > below) that the UDP len field is byte-swapped - the 05bc > > > in the packet is in little-endian format, causing the > > > server to reject it. > > > > [ actually, it is the IP len that is byte-swapped ] > > > > > I am trying to follow the code in sys/nfsclient/bootp_subr.c > > > (which should send the packet) but it seemd to call sosend() > > > (at line 755) to generate the packet, so it looks really strange > > > that the bug is in such a central place... any ideas ? > > > > as a followup: > > > > Downgrading sys/kern/uipc_socket.c to version 1.284 make HEAD > > work again with in-kernel bootp.. > > > > i managed to locate the bug in the following commit: > > > > > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/uipc_socket.c.diff?r1=1.284&r2=1.285 > > > > Revision 1.285 Thu Nov 2 17:45:28 2006 UTC (4 weeks ago) by andre > > Branch: MAIN > > Changes since 1.284: +29 -1 lines > > Diff to previous 1.284 (colored) > > > > Use the improved m_uiotombuf() function instead of home grown > sosend_copyin() > > to do the userland to kernel copying in sosend_generic() and > sosend_dgram(). > > > > sosend_copyin() is retained for ZERO_COPY_SOCKETS which are not yet > supported > > by m_uiotombuf(). > > > > I don't know exactly where the problem is, but the bug i found is triggered > > by in-kernel sockets (such as the one used by the internal bootp client) > > so maybe this was a case not tested by andre. > > > > I am unclear on where is the actual bug. hopefully something simple... > > Does the bootp code try to send a 0 byte packet per chance? no, the problem was a long standing bug in the code that loops back a copy of the packet in the case of simplex interfaces doing only a shallow copy of the mbuf chain. Before the change 1.285 above, the bug was not triggered because the ip header (modified in the copy looped back) was in an mbuf so the shallow copy was enough. With 1.285, somehow the IP header ended up in the cluster and so the modifications affected also the packet that was about to be transmitted. I committed a fix shortly after the bug was detected. it needs to be revisited but for the time being it should be enough to keep us running. cheers luigi mbuf. and modifies the (shared) mbuf before it is actually transmitted on the wire. It was triggered by the above change because in the past the IP header was in a separate mbuf > -- > John Baldwin