From owner-freebsd-alpha Tue Jan 22 15:57:25 2002
Delivered-To: freebsd-alpha@freebsd.org
Received: from scaup.prod.itd.earthlink.net (scaup.mail.pas.earthlink.net [207.217.120.49])
    by hub.freebsd.org (Postfix) with ESMTP id 1B7FB37B425
    for ; Tue, 22 Jan 2002 15:56:41 -0800 (PST)
Received: from pool0657.cvx21-bradley.dialup.earthlink.net ([209.179.194.147] helo=mindspring.com)
    by scaup.prod.itd.earthlink.net with esmtp (Exim 3.33 #1)
    id 16TAm7-000728-00; Tue, 22 Jan 2002 15:56:23 -0800
Message-ID: <3C4DFC23.F5391D2D@mindspring.com>
Date: Tue, 22 Jan 2002 15:56:19 -0800
From: Terry Lambert
X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Peter Wemm
Cc: Andrew Gallatin , alpha@FreeBSD.ORG
Subject: Re: Is anybody actually able to netboot at the moment?
References: <20020122234007.1983E3BAD@overcee.wemm.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-alpha@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

Peter Wemm wrote:
> > Actually, there's a bug in the one's complement case on the
> > FreeBSD checksum calculation, sometimes. I was able to see
> > incorrect checksums on a number of packets. I think it's in
> > the incremental update code, but since it doesn't seem to
> > stop things from working, I never tracked down the source of
> > the ethreal traces where I saw this.
>
> Terry, what crack are you smoking this time? We dont do incremental
> checksums in the libstand code. That stuff is as simple and as unoptimized
> as it gets.

The bug is on transmit, not on receive, Peter. 8-). Working checksum
validation on receive would reject packets with bad checksums, and
that would stop the load.

To see if this is the problem, it would be wise to do a dump of a
failed boot attempt with Ethereal, which flags checksum errors on
packets on the wire.
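
For anyone checking a trace by hand, the one's complement checksum in question can be sketched roughly as below. This is only an illustration in the style of RFC 1071, not the libstand code; the name in_cksum_sketch is made up for this example, and the routine assumes packet-sized buffers (well under 128 KB) so the 32-bit accumulator cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: Internet checksum over buf[0..len-1].
 * Bytes are combined explicitly as big-endian 16-bit words, so the
 * result does not depend on host byte order or buffer alignment.
 */
uint16_t in_cksum_sketch(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    /* Add up whole 16-bit words. */
    while (len > 1) {
        sum += ((uint32_t)buf[0] << 8) | buf[1];
        buf += 2;
        len -= 2;
    }

    /* Odd length: treat the last byte as if followed by a zero byte. */
    if (len == 1)
        sum += (uint32_t)buf[0] << 8;

    /* Fold the carries back in (end-around carry). */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

Note how the odd trailing byte is handled: it is summed with an explicit zero pad. If hardware or a driver substitutes some other byte for that pad, the sum comes out wrong only on odd-length packets, which is the kind of intermittent failure a trace would show.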

As always, this may or may not be the problem at all, but in the
spirit of Sherlock Holmes...

> The alpha problems were in boot1 (the 7.5K loader) and that shares no
> code with netboot at all.

OK. I typically don't use netboot, so I can believe this...

> I have experimented with alignment in the ethernet frame send code.. it
> seems that we are trying to send with 2-byte alignment for the bootp case.
> Fixing it doesn't seem to make much difference. However, I wonder if SRM
> is doing some length rounding or something because the lengths are not 4 or
> 8 byte multiples for the bootp queries but are for the working rarp
> queries. However, even that doesn't make sense because it sometimes works.
> I'm more suspicious of interactions between the tulip cards when being
> driven by SRM and the switch at the moment.

OK, another shot in the dark. The first 16 bit NE1000 cards had an
interesting problem: unless you sent an even number of bus transfer
units, the card would always do an even transfer anyway, the last two
bytes would be byte-swapped when you went to checksum them, and you'd
sum some garbage byte instead of the right byte. The fix for this was
to always send an even number of bytes, even if the payload was an
odd length.

Maybe this is a byte-order problem? If it is, the place to fix it is
on the server (again), by making it pad packets out to a 2 (or 4 or
8?) byte boundary so that the received packets are transferred as a
unit, but only the payload portion is checked.

This "fix" would only apply if the packets sent on the wire were good
in both directions (i.e. it's still time for the Ethereal trace by an
otherwise uninvolved third party machine).

Hope this helps... I'm waving my hands as fast as I can... ;^)

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-alpha" in the body of the message