From owner-freebsd-stable Thu Dec 7 16: 2:36 2000
From owner-freebsd-stable@FreeBSD.ORG Thu Dec 7 16:02:30 2000
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from spoon.alink.net (spoon.alink.net [207.135.127.97]) by hub.freebsd.org (Postfix) with ESMTP id 2502837B400; Thu, 7 Dec 2000 16:02:30 -0800 (PST)
Received: from [216.39.8.88] (netility88.hq.netility.com [216.39.8.88]) by spoon.alink.net (8.9.3/8.9.3) with ESMTP id QAA28026; Thu, 7 Dec 2000 16:02:28 -0800 (PST)
Mime-Version: 1.0
X-Sender: jbrowne@pop.alink.net
Message-Id:
In-Reply-To: <200012070813.eB78D7F00560@mass.osd.bsdi.com>
References: <200012070813.eB78D7F00560@mass.osd.bsdi.com>
Date: Thu, 7 Dec 2000 16:02:26 -0800
To: Mike Smith, Matt Dillon, jhb@FreeBSD.ORG, ps@FreeBSD.ORG
From: Jim Browne
Subject: Re: More on BTX halted / crashes trying to use -stable /boot/loader
Cc: freebsd-stable@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

At 00:13 -0800 12/7/00, Mike Smith wrote:
> > The option works wonderfully for /boot/pxeboot. But it turns out
> > that the normal /boot/loader, when compiled with the above
> > option, will crash horribly whenever it tries to open() a file and
> > can't find it in the UFS filesystem on disk... it falls through the
> > filesystem list until it hits the tftp FS and BEWM. Explosion.
> >
> > I sure would appreciate it if one of the bootstrap gurus could take
> > a look at what happens when the tftp open routine is called from a
> > normal disk-based /boot/loader!
>
>Probably hits an uninitialised function vector; this would be a good
>catch for someone looking to learn a bit about the loader and libstand.

Devsw "pxedisk" treats struct open_file member f_devdata as a pointer to a socket number[1]. Other devsw drivers treat f_devdata as a pointer to a struct i386_devdesc[2].

When you boot via PXE, sys/boot/i386/loader/main.c sets the current device to "pxedisk". If you do not boot via PXE, your current device is likely to be some take on "disk".

When TFTP tries to open a file, it is expecting struct open_file member f_devdata to be a pointer to a socket number. When currdev is "pxe", that assumption is correct. When currdev is "disk*", that assumption is incorrect. Specifically, tftp.c does:

    tftpfile->iodesc = io = socktodesc(*(int *) (f->f_devdata));

In my case, that often winds up making tftpfile->iodesc = 0. That parameter is later passed in tftp_makereq to sendrecv as the iodesc, which via sendudp (and possibly the ARP functions) winds up calling netif_put. netif_put derefs the bogus iodesc to get a function pointer for the put function of the network interface and calls it. WHAM. QED. :)

I happen to be knee deep in this code right now as I am adding two things: support for booting from a flash based FS and porting the netboot Ethernet drivers to work under libstand(3) so I can use loader(8) with an AMD LANCE compatible chip. I was lurking until my code was finished, but your problem (which I was debugging today for my own configuration) is a good opportunity to speak up.

I think the correct solution is to not overload f_devdata. Perhaps another field should be added to struct open_file specifically for a socket number and perhaps some error checking code is in order? :)

I have to have my code working yesterday, so I'll keep plugging along on a solution. I'll email patches when finished. However, there are others who are far more familiar with this code than I, so pointers are appreciated especially from Alpha aware people. (I haven't even looked at the Alpha version of loader(8).)

[1] sys/boot/i386/libi386/pxe.c function pxe_open towards the bottom. Actually, pxe.c just overwrites what is likely a pointer to an i386_devdesc that was allocated by i386_parsedev (i.e. memory leak).
[2] sys/boot/i386/libi386/devicename.c function i386_parsedev

On a final note: why is netif_drivers defined in pxe.c rather than conf.c? I'm currently working around that with a Makefile define, but I really think the definition of netif_drivers belongs in conf.c, especially if one is to have more than one netif_driver compiled into the binary (i.e. "pxe" and "ether").

Jim Browne                                    jbrowne@jbrowne.com
"We lost our lease. You lose culture" - sign on SF Arts Comission Bldg

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message