Date: Thu, 23 May 1996 21:17:42 -0400 (EDT)
From: Bill Paul <wpaul@skynet.ctr.columbia.edu>
To: freebsd-hackers@freebsd.org
Subject: three stage boot again
Message-ID: <199605240117.VAA03425@skynet.ctr.columbia.edu>

Yes, I'm still here. After much hair pulling, code scrutinizing, book
reading, and Elvis only knows how much trial and error, I finally managed
to cobble together an assembly language startup routine that lets me
build the existing second stage bootstrap into a standalone program that
the bootstrap itself can load. Basically, here's the magic I've managed
to unravel:

- The program is an OMAGIC binary link edited for address 0. This is
  necessary because of the real mode/protected mode switching business.
  When in protected mode, we execute at physical memory location 0x10000,
  but with a code segment descriptor that basically maps 0x10000 to 0x0.
  (So the program thinks it's executing at 0x0 but really isn't.) In real
  mode, we're executing at 0x1000:0, which again makes the code think
  it's executing at 0x0. (And this is why we can't make it larger than
  64K, since that would cause the program to extend into 0x2000:0, and
  all the offsets and addresses calculated by the linker would no longer
  work.)

- The program ultimately runs at 0x10000, which is the same location as
  the existing bootstrap (this was so that I could steal the existing
  global descriptor table values until I understood them well enough to
  change them). The program is actually loaded into memory at a
  different location and copies itself to 0x10000. (It could actually go
  somewhere else, like 0x20000. I'm saving that for later.)

- Even though the binary is link edited for address 0x0, its a.out
  header is massaged by a small fixup program that changes its entry
  address to 0x100000. This is to fool the existing boot block into
  loading it correctly: the second stage boot loads files into memory
  based on their entry points; however, we can't link the program for
  its entry point, since then it won't work when we relocate it.

- The existing bootstrap needs to be modified slightly to allow loading
  of OMAGIC binaries. Currently, it expects to load ZMAGIC binaries,
  which I think have their sections page aligned.
  To account for this, the bootstrap skips a chunk of memory between
  loading the text and data segments; this makes the bootstrap blow up
  when it tries to load an OMAGIC binary. The code needs to be changed
  to check the magic value of the binary and only skip the space for
  ZMAGIC binaries instead of doing it unconditionally.

- Once the standalone program is loaded, it copies itself down to
  0x10000. This clobbers the global descriptor table left behind by the
  second stage boot, so we have to build a new one. The standalone image
  has its own table and it resets the GDT register to use it. It then
  performs an intersegment (far) jump to reload the code segment
  selector and to start executing in the new segment (with the 0x10000
  offset). Then it sets the DS, SS and ES segment selectors to match the
  new code segment selector, resets the stack pointer and jumps to
  boot().

Sounds simple, right? Hah.

All this took me a couple of weeks to figure out. I started off trying
to figure out how the mach_kboot program worked, but that only
frustrated me since it seems to have been written for a different
assembler. Fortunately, I found a couple of reasonably helpful books on
i386 architecture and programming in the Columbia engineering library.
(Being a Columbia University employee has its perks: you get to check
out books free of charge. :) It was with these that I finally learned
what a global descriptor table was and what segment selector registers
did (and how they were different from segment registers in real mode).
Of course, progress was slow even with these books since they don't use
gas for their examples. (And will somebody please tell me what the hell
'data32/addr32' means?)

Anyway. Now that that's done, there's still one more obstacle to
overcome. When I went to link the new startup routine with the boot code
for the first time, I ended up with an unresolved symbol called
'_disklabel'.
It turns out that this symbol is defined in start.S and looks like it's
meant to just provide a pointer to a particular area of memory. I assume
that this is supposed to contain the BSD disklabel for the disk from
which the bootstrap was loaded, but I can't tell how it's supposed to
know where it is. I also don't quite understand the following gas
syntax:

	ENTRY(disklabel)
		. = EXT(boot1) + 400

I realize that the ENTRY() macro is being (ab)used to turn disklabel
into a global symbol, but I don't quite understand what the next line
does. In a fuzzy sort of way, I think what's happening is that the
disklabel ends up slapped into the boot block somehow and gets loaded
somewhere along the way by the first stage.

My problem is that the third stage will need this disklabel information.
I'm not sure if I should somehow arrange to save this disklabel info and
pass it to the third stage or if I should make the third stage read it
over again. (It should be able to do it by itself, I suppose.)

I might be able to figure this part out on my own, but my brain still
itches from the last part. Sage advice from those who know how this
stuff works would be most welcome.

-Bill

PS: Yes, I'm having tremendous fun, dammit.

--
=============================================================================
-Bill Paul            (212) 854-6020 | System Manager
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
License error: The license for this .sig file has expired. You must obtain
a new license key before any more witty phrases will appear in this space.
=============================================================================
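
For anyone following along, the segment arithmetic from the list near the
top of this message can be sketched in C rather than gas, so it can be
compiled and checked on its own. This is only an illustration, not code
from the bootstrap: the function names are made up, and the access byte
and flags passed to make_descriptor() are whatever the caller assumes,
not the values in the existing global descriptor table. It also assumes
paging is off, so a linear address is the same as a physical one.

```c
#include <assert.h>
#include <stdint.h>

/* Real mode: physical address = segment * 16 + 16-bit offset.
 * The offset cannot exceed 0xFFFF, hence the 64K ceiling. */
uint32_t real_mode_phys(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Pack base/limit/access/flags into the 8-byte i386 segment descriptor
 * layout. 'limit' is the 20-bit limit field; 'flags' is the 4-bit nibble
 * holding G (bit 3), D/B (bit 2) and AVL (bit 0). Hypothetical helper. */
uint64_t make_descriptor(uint32_t base, uint32_t limit,
                         uint8_t access, uint8_t flags)
{
    uint64_t d = 0;

    assert(limit <= 0xFFFFFu);
    assert(flags <= 0xFu);

    d |= (uint64_t)(limit & 0xFFFFu);             /* limit 15:0  -> bits 0..15  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;      /* base 23:0   -> bits 16..39 */
    d |= (uint64_t)access << 40;                  /* access byte -> bits 40..47 */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;  /* limit 19:16 -> bits 48..51 */
    d |= (uint64_t)(flags & 0xFu) << 52;          /* flags       -> bits 52..55 */
    d |= (uint64_t)(base >> 24) << 56;            /* base 31:24  -> bits 56..63 */
    return d;
}

/* Recover the 32-bit base from a packed descriptor. */
uint32_t descriptor_base(uint64_t d)
{
    return (uint32_t)((d >> 16) & 0xFFFFFFu) | ((uint32_t)(d >> 56) << 24);
}

/* Protected mode (paging off): linear address = descriptor base + offset. */
uint32_t protected_mode_linear(uint64_t d, uint32_t off)
{
    return descriptor_base(d) + off;
}
```

With an assumed access byte of 0x9A (present, readable code) and flags of
0x4 (32-bit default size), make_descriptor(0x10000, 0xFFFF, 0x9A, 0x4)
packs to 0x00409A010000FFFF. Offset 0x1234 then resolves to 0x11234
through that descriptor, and 0x1000:0x1234 resolves to the same 0x11234
in real mode, which is why a binary linked for address 0 works in both.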