Date: Sun, 26 Jan 2014 02:30:34 +0000 (UTC) From: Warren Block <wblock@FreeBSD.org> To: doc-committers@freebsd.org, svn-doc-all@freebsd.org, svn-doc-head@freebsd.org Subject: svn commit: r43645 - head/en_US.ISO8859-1/books/arch-handbook/boot Message-ID: <201401260230.s0Q2UYPh092495@svn.freebsd.org>
Author: wblock Date: Sun Jan 26 02:30:34 2014 New Revision: 43645 URL: http://svnweb.freebsd.org/changeset/doc/43645 Log: Rewrite of portions of the Boot chapter by Sergio Andrés Gómez del Real. Committed version is a modified version of the one submitted with the patch. Thanks to Sergio Andrés Gómez del Real for the submission, to John-Mark Gurney for technical review, and to both for their patience. PR: docs/185780 Submitted by: Sergio Andrés Gómez del Real <Sergio.G.DelReal@gmail.com> Reviewed by: jmg Modified: head/en_US.ISO8859-1/books/arch-handbook/boot/chapter.xml Modified: head/en_US.ISO8859-1/books/arch-handbook/boot/chapter.xml ============================================================================== --- head/en_US.ISO8859-1/books/arch-handbook/boot/chapter.xml Sun Jan 26 00:10:46 2014 (r43644) +++ head/en_US.ISO8859-1/books/arch-handbook/boot/chapter.xml Sun Jan 26 02:30:34 2014 (r43645) @@ -4,6 +4,8 @@ The FreeBSD Documentation Project Copyright (c) 2002 Sergey Lyubka <devnull@uptsoft.com> All rights reserved +Copyright (c) 2014 Sergio Andrés Gómez del Real <Sergio.G.delReal@gmail.com> +All rights reserved $FreeBSD$ --> @@ -25,6 +27,18 @@ $FreeBSD$ </author> <!-- devnull@uptsoft.com 12 Jun 2002 --> </authorgroup> + + <authorgroup> + <author> + <personname> + <firstname>Sergio Andrés</firstname> + <surname> Gómez del Real</surname> + </personname> + + <contrib>Updated and enhanced by </contrib> + </author> + <!-- Sergio.G.DelReal@gmail.com Jan 2014 --> + </authorgroup> </info> <sect1 xml:id="boot-synopsis"> @@ -37,88 +51,103 @@ $FreeBSD$ <indexterm><primary>booting</primary></indexterm> <indexterm><primary>system initialization</primary></indexterm> <para>This chapter is an overview of the boot and system - initialization process, starting from the BIOS (firmware) POST, - to the first user process creation. 
Since the initial steps of - system startup are very architecture dependent, the IA-32 - architecture is used as an example.</para> + initialization processes, starting from the <acronym>BIOS</acronym> (firmware) + <acronym>POST</acronym>, to the first user process creation. Since the initial + steps of system startup are very architecture dependent, the + IA-32 architecture is used as an example.</para> + + <para>The &os; boot process can be surprisingly complex. After + control is passed from the <acronym>BIOS</acronym>, a considerable amount of + low-level configuration must be done before the kernel can be + loaded and executed. This setup must be done in a simple and + flexible manner, allowing the user a great deal of customization + possibilities.</para> </sect1> <sect1 xml:id="boot-overview"> <title>Overview</title> - <para>A computer running FreeBSD can boot by several methods, - although the most common method, booting from a harddisk where - the OS is installed, will be discussed here. The boot process - is divided into several steps:</para> - - <itemizedlist> - <listitem><para>BIOS POST</para></listitem> - <listitem><para><literal>boot0</literal> stage</para></listitem> - <listitem><para><literal>boot2</literal> stage</para></listitem> - <listitem><para>loader stage</para></listitem> - <listitem><para>kernel initialization</para></listitem> - </itemizedlist> + <para>The boot process is an extremely machine-dependent + activity. Not only must code be written for every computer + architecture, but there may also be multiple types of booting on + the same architecture. For example, looking at + <filename class="directory">/usr/src/sys/boot</filename> + reveals a great amount of architecture-dependent code. There is + a directory for each of the various supported architectures. 
In + the x86-specific <filename class="directory">i386</filename> + directory, there are subdirectories for different boot standards + like <filename>mbr</filename> (Master Boot Record), + <filename>gpt</filename> (<acronym>GUID</acronym> Partition + Table), and <filename>efi</filename> (Extensible Firmware + Interface). Each boot standard has its own conventions and data + structures. The example that follows shows booting an x86 + computer from an <acronym>MBR</acronym> hard drive with the &os; + <filename>boot0</filename> multi-boot loader stored in the very + first sector. That boot code starts the &os; three-stage boot + process.</para> + + <para>The key to understanding this process is that it is a series + of stages of increasing complexity. These stages are + <filename>boot1</filename>, <filename>boot2</filename>, and + <filename>loader</filename> (see &man.boot.8; for more detail). + The boot system executes each stage in sequence. The last + stage, <filename>loader</filename>, is responsible for loading + the &os; kernel. Each stage is examined in the following + sections.</para> - <indexterm><primary>BIOS POST</primary></indexterm> - <indexterm><primary>boot0</primary></indexterm> - <indexterm><primary>boot2</primary></indexterm> - <indexterm><primary>loader</primary></indexterm> - <para>The <literal>boot0</literal> and <literal>boot2</literal> - stages are also referred to as <emphasis>bootstrap stages 1 and - 2</emphasis> in &man.boot.8; as the first steps in FreeBSD's - 3-stage bootstrapping procedure. Various information is printed - on the screen at each stage, so you may visually recognize them - using the table that follows. Please note that the actual data + <para>Here is an example of the output generated by the + different boot stages. 
Actual output may differ from machine to machine:</para> <informaltable frame="none" pgwide="0"> <tgroup cols="2"> <tbody> <row> - <entry><para>Output (may vary)</para></entry> - <entry><para>BIOS (firmware) messages</para></entry> + <entry>&os; Component</entry> + <entry>Output (may vary)</entry> </row> <row> - <entry><para><screen>F1 FreeBSD + <entry><literal>boot0</literal></entry> + <entry><screen>F1 FreeBSD F2 BSD -F5 Disk 2</screen></para></entry> - <entry><para><literal>boot0</literal></para></entry> +F5 Disk 2</screen></entry> </row> <row> - <entry><para><screen>>>FreeBSD/i386 BOOT -Default: 1:ad(1,a)/boot/loader -boot:</screen></para></entry> - <entry><para><literal>boot2</literal> + <entry><literal>boot2</literal> <footnote><para>This prompt will appear if the user presses a key just after selecting an OS to boot at the <literal>boot0</literal> - stage.</para></footnote></para></entry> + stage.</para></footnote></entry> + <entry><screen>>>FreeBSD/i386 BOOT +Default: 1:ad(1,a)/boot/loader +boot:</screen></entry> </row> <row> - <entry><para><screen>BTX loader 1.0 BTX version is 1.01 -BIOS drive A: is disk0 -BIOS drive C: is disk1 -BIOS 639kB/64512kB available memory -FreeBSD/i386 bootstrap loader, Revision 0.8 + <entry><filename>loader</filename></entry> + <entry><screen>BTX loader 1.00 BTX version is 1.02 +Consoles: internal video/keyboard +BIOS drive C: is disk0 +BIOS 639kB/2096064kB available memory + +FreeBSD/x86 bootstrap loader, Revision 1.1 Console internal video/keyboard -(jkh@bento.freebsd.org, Mon Nov 20 11:41:23 GMT 2000) -/kernel text=0x1234 data=0x2345 syms=[0x4+0x3456] -Hit [Enter] to boot immediately, or any other key for command prompt -Booting [kernel] in 9 seconds..._</screen></para></entry> - <entry><para>loader</para></entry> +(root@snap.freebsd.org, Thu Jan 16 22:18:05 UTC 2014) +Loading /boot/defaults/loader.conf +/boot/kernel/kernel text=0xed9008 data=0x117d28+0x176650 syms=[0x8+0x137988+0x8+0x1515f8]</screen></entry> </row> <row> - 
<entry><para><screen>Copyright (c) 1992-2002 The FreeBSD Project. + <entry>kernel</entry> + <entry><screen>Copyright (c) 1992-2013 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. -FreeBSD 4.6-RC #0: Sat May 4 22:49:02 GMT 2002 - devnull@kukas:/usr/obj/usr/src/sys/DEVNULL -Timecounter "i8254" frequency 1193182 Hz</screen></para></entry> - <entry><para>kernel</para></entry> +FreeBSD is a registered trademark of The FreeBSD Foundation. +FreeBSD 10.0-RELEASE #0 r260789: Thu Jan 16 22:34:59 UTC 2014 + root@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64 +FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610</screen></entry> </row> </tbody> </tgroup> @@ -126,84 +155,114 @@ Timecounter "i8254" frequency 1193182 H </sect1> <sect1 xml:id="boot-bios"> - <title>BIOS POST</title> + <title>The <acronym>BIOS</acronym></title> - <para>When the PC powers on, the processor's registers are set - to some predefined values. One of the registers is the + <para>When the computer powers on, the processor's registers are + set to some predefined values. One of the registers is the <emphasis>instruction pointer</emphasis> register, and its value after a power on is well defined: it is a 32-bit value of - 0xfffffff0. The instruction pointer register points to code to - be executed by the processor. One of the registers is the + <literal>0xfffffff0</literal>. The instruction pointer register + (also known as the Program Counter) points to code to be + executed by the processor. Another important register is the <literal>cr0</literal> 32-bit control register, and its value - just after the reboot is 0. One of the cr0's bits, the bit PE - (Protection Enabled) indicates whether the processor is running - in protected or real mode. Since at boot time this bit is - cleared, the processor boots in real mode. Real mode means, + just after a reboot is <literal>0</literal>. 
One of + <literal>cr0</literal>'s bits, the PE (Protection Enabled) bit, + indicates whether the processor is running in 32-bit protected + mode or 16-bit real mode. Since this bit is cleared at boot + time, the processor boots in 16-bit real mode. Real mode means, among other things, that linear and physical addresses are - identical.</para> - - <para>The value of 0xfffffff0 is slightly less then 4Gb, so unless - the machine has 4Gb physical memory, it cannot point to a valid - memory address. The computer's hardware translates this address - so that it points to a BIOS memory block.</para> - - <para>BIOS stands for <emphasis>Basic Input Output - System</emphasis>, and it is a chip on the motherboard that - has a relatively small amount of read-only memory (ROM). This + identical. The reason for the processor not to start + immediately in 32-bit protected mode is backwards compatibility. + In particular, the boot process relies on the services provided + by the <acronym>BIOS</acronym>, and the <acronym>BIOS</acronym> + itself works in legacy, 16-bit code.</para> + + <para>The value of <literal>0xfffffff0</literal> is slightly less + than 4 GB, so unless the machine has 4 GB of physical + memory, it cannot point to a valid memory address. The + computer's hardware translates this address so that it points to + a <acronym>BIOS</acronym> memory block.</para> + + <para>The <acronym>BIOS</acronym> (Basic Input Output + System) is a chip on the motherboard that has a relatively small + amount of read-only memory (<acronym>ROM</acronym>). This memory contains various low-level routines that are specific to - the hardware supplied with the motherboard. So, the processor - will first jump to the address 0xfffffff0, which really resides - in the BIOS's memory. Usually this address contains a jump - instruction to the BIOS's POST routines.</para> - - <para>POST stands for <emphasis>Power On Self Test</emphasis>. 
- This is a set of routines including the memory check, system bus - check and other low-level stuff so that the CPU can initialize - the computer properly. The important step on this stage is - determining the boot device. All modern BIOS's allow the boot - device to be set manually, so you can boot from a floppy, - CD-ROM, harddisk etc.</para> - - <para>The very last thing in the POST is the <literal>INT - 0x19</literal> instruction. That instruction reads 512 bytes - from the first sector of boot device into the memory at address - 0x7c00. The term <emphasis>first sector</emphasis> originates - from harddrive architecture, where the magnetic plate is divided - to a number of cylindrical tracks. Tracks are numbered, and - every track is divided by a number (usually 64) sectors. Track - number 0 is the outermost on the magnetic plate, and sector 1, - the first sector (tracks, or, cylinders, are numbered starting - from 0, but sectors - starting from 1), has a special meaning. - It is also called Master Boot Record, or MBR. The remaining - sectors on the first track are never used <footnote><para>Some - utilities such as &man.disklabel.8; may store the - information in this area, mostly in the second - sector.</para></footnote>.</para> + the hardware supplied with the motherboard. The processor will + first jump to the address 0xfffffff0, which really resides in + the <acronym>BIOS</acronym>'s memory. Usually this address + contains a jump instruction to the <acronym>BIOS</acronym>'s + POST routines.</para> + + <para>The <acronym>POST</acronym> (Power On Self Test) + is a set of routines including the memory check, system bus + check, and other low-level initialization so the + <acronym>CPU</acronym> can set up the computer properly. The + important step of this stage is determining the boot device. 
+ Modern <acronym>BIOS</acronym> implementations permit the + selection of a boot device, allowing booting from a floppy, + <acronym>CD-ROM</acronym>, hard disk, or other devices.</para> + + <para>The very last thing in the <acronym>POST</acronym> is the + <literal>INT 0x19</literal> instruction. The + <literal>INT 0x19</literal> handler reads 512 bytes from the + first sector of boot device into the memory at address + <literal>0x7c00</literal>. The term + <emphasis>first sector</emphasis> originates from hard drive + architecture, where the magnetic plate is divided into a number + of cylindrical tracks. Tracks are numbered, and every track is + divided into a number (usually 64) of sectors. Track numbers + start at 0, but sector numbers start from 1. Track 0 is the + outermost on the magnetic plate, and sector 1, the first sector, + has a special purpose. It is also called the + <acronym>MBR</acronym>, or Master Boot Record. The remaining + sectors on the first track are never used.</para> + + <para>This sector is our boot-sequence starting point. As we will + see, this sector contains a copy of our + <filename>boot0</filename> program. A jump is made by the + <acronym>BIOS</acronym> to address <literal>0x7c00</literal> so + it starts executing.</para> </sect1> <sect1 xml:id="boot-boot0"> - <title><literal>boot0</literal> Stage</title> + <title>The Master Boot Record (<literal>boot0</literal>)</title> <indexterm><primary>MBR</primary></indexterm> - <para>Take a look at the file <filename>/boot/boot0</filename>. - This is a small 512-byte file, and it is exactly what FreeBSD's - installation procedure wrote to your harddisk's MBR if you chose - the <quote>bootmanager</quote> option at installation - time.</para> + + <para>After control is received from the <acronym>BIOS</acronym> + at memory address <literal>0x7c00</literal>, + <filename>boot0</filename> starts executing. It is the first + piece of code under &os; control. 
The task of + <filename>boot0</filename> is quite simple: scan the partition + table and let the user choose which partition to boot from. The + Partition Table is a special, standard data structure embedded + in the <acronym>MBR</acronym> (hence embedded in + <filename>boot0</filename>) describing the four standard PC + <quote>partitions</quote> + <footnote> + <para><link + xlink:href="http://en.wikipedia.org/wiki/Master_boot_record"></link></para></footnote>. + <filename>boot0</filename> resides in the filesystem as + <filename>/boot/boot0</filename>. It is a small 512-byte file, + and it is exactly what &os;'s installation procedure wrote to + the hard disk's <acronym>MBR</acronym> if you chose the <quote>bootmanager</quote> + option at installation time. Indeed, + <filename>boot0</filename> <emphasis>is</emphasis> the + <acronym>MBR</acronym>.</para> <para>As mentioned previously, the <literal>INT 0x19</literal> - instruction loads an MBR, i.e., the <filename>boot0</filename> - content, into the memory at address 0x7c00. Taking a look at - the file <filename>sys/boot/i386/boot0/boot0.S</filename> can - give a guess at what is happening there - this is the boot - manager, which is an awesome piece of code written by Robert - Nordier.</para> - - <para>The MBR, or, <filename>boot0</filename>, has a special - structure starting from offset 0x1be, called the - <emphasis>partition table</emphasis>. It has 4 records of 16 - bytes each, called <emphasis>partition records</emphasis>, which - represent how the harddisk(s) are partitioned, or, in FreeBSD's + instruction causes the <literal>INT 0x19</literal> handler to + load an <acronym>MBR</acronym> (<filename>boot0</filename>) into + memory at address <literal>0x7c00</literal>. 
The source file + for <filename>boot0</filename> can be found in + <filename>sys/boot/i386/boot0/boot0.S</filename> - which is an + awesome piece of code written by Robert Nordier.</para> + + <para>A special structure starting from offset + <literal>0x1be</literal> in the <acronym>MBR</acronym> is called + the <emphasis>partition table</emphasis>. It has four records + of 16 bytes each, called <emphasis>partition records</emphasis>, + which represent how the hard disk is partitioned, or, in &os;'s terminology, sliced. One byte of those 16 says whether a partition (slice) is bootable or not. Exactly one record must have that flag set, otherwise <filename>boot0</filename>'s code @@ -229,186 +288,1471 @@ Timecounter "i8254" frequency 1193182 H </listitem> </itemizedlist> - <para>A partition record descriptor has the information about + <para>A partition record descriptor contains information about where exactly the partition resides on the drive. Both - descriptors, LBA and CHS, describe the same information, but in - different ways: LBA (Logical Block Addressing) has the starting - sector for the partition and the partition's length, while CHS - (Cylinder Head Sector) has coordinates for the first and last - sectors of the partition.</para> - - <para>The boot manager scans the partition table and prints the - menu on the screen so the user can select what disk and what - slice to boot. By pressing an appropriate key, - <filename>boot0</filename> performs the following - actions:</para> + descriptors, <acronym>LBA</acronym> and <acronym>CHS</acronym>, + describe the same information, but in different ways: + <acronym>LBA</acronym> (Logical Block Addressing) has the + starting sector for the partition and the partition's length, + while <acronym>CHS</acronym> (Cylinder Head Sector) has + coordinates for the first and last sectors of the partition. 
+ The partition table ends with the special signature + <literal>0xaa55</literal>.</para> + + <para>The <acronym>MBR</acronym> must fit into 512 bytes, a single + disk sector. This program uses low-level <quote>tricks</quote> + like taking advantage of the side effects of certain + instructions and reusing register values from previous + operations to make the most out of the fewest possible + instructions. Care must also be taken when handling the + partition table, which is embedded in the <acronym>MBR</acronym> + itself. For these reasons, be very careful when modifying + <filename>boot0.S</filename>.</para> + + <para>Note that the <filename>boot0.S</filename> source file + is assembled <quote>as is</quote>: instructions are translated + one by one to binary, with no additional information (no + <acronym>ELF</acronym> file format, for example). This kind of + low-level control is achieved at link time through special + control flags passed to the linker. For example, the text + section of the program is set to be located at address + <literal>0x600</literal>. In practice this means that + <filename>boot0</filename> must be loaded to memory address + <literal>0x600</literal> in order to function properly.</para> + + <para>It is worth looking at the <filename>Makefile</filename> for + <filename>boot0</filename> + (<filename>sys/boot/i386/boot0/Makefile</filename>), as it + defines some of the run-time behavior of + <filename>boot0</filename>. For instance, if a terminal + connected to the serial port (COM1) is used for I/O, the macro + <literal>SIO</literal> must be defined + (<literal>-DSIO</literal>). <literal>-DPXE</literal> enables + boot through <acronym>PXE</acronym> by pressing + <keycap>F6</keycap>. Additionally, the program defines a set of + <emphasis>flags</emphasis> that allow further modification of + its behavior. All of this is illustrated in the + <filename>Makefile</filename>. 
For example, look at the + linker directives which command the linker to start the text + section at address <literal>0x600</literal>, and to build the + output file <quote>as is</quote> (strip out any file + formatting):</para> + + <figure xml:id="boot-boot0-makefile-as-is"> + <title><filename>sys/boot/i386/boot0/Makefile</filename></title> + + <programlisting> BOOT_BOOT0_ORG?=0x600 + LDFLAGS=-e start -Ttext ${BOOT_BOOT0_ORG} \ + -Wl,-N,-S,--oformat,binary</programlisting> + </figure> + + <para>Let us now start our study of the <acronym>MBR</acronym>, or + <filename>boot0</filename>, starting where execution + begins.</para> + + <note> + <para>Some modifications have been made to some instructions in + favor of better exposition. For example, some macros are + expanded, and some macro tests are omitted when the result of + the test is known. This applies to all of the code examples + shown.</para> + </note> + + <figure xml:id="boot-boot0-entrypoint"> + <title><filename>sys/boot/i386/boot0/boot0.S</filename></title> + + <programlisting>start: + cld # String ops inc + xorw %ax,%ax # Zero + movw %ax,%es # Address + movw %ax,%ds # data + movw %ax,%ss # Set up + movw $0x7c00,%sp # stack</programlisting> + </figure> + + <para>This first block of code is the entry point of the program. + It is where the <acronym>BIOS</acronym> transfers control. + First, it makes sure that the string operations autoincrement + their pointer operands (the <literal>cld</literal> instruction) + <footnote> + <para>When in doubt, we refer the reader to the official Intel + manuals, which describe the exact semantics for each + instruction: <link + xlink:href="http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html"></link>.</para></footnote>. + Then, as it makes no assumption about the state of the segment + registers, it initializes them. 
Finally, it sets the stack + pointer register (<literal>%sp</literal>) to address + <literal>0x7c00</literal>, so we have a working stack.</para> + + <para>The next block is responsible for the relocation and + subsequent jump to the relocated code.</para> + + <figure xml:id="boot-boot0-relocation"> + <title><filename>sys/boot/i386/boot0/boot0.S</filename></title> + + <programlisting> movw $0x7c00,%si # Source + movw $0x600,%di # Destination + movw $512,%cx # Byte count + rep # Relocate + movsb # code + movw %di,%bp # Address variables + movb $16,%cl # Bytes to clear + rep # Zero + stosb # them + incb -0xe(%di) # Set the S field to 1 + jmp main-0x7c00+0x600 # Jump to relocated code</programlisting> + </figure> + + <para>Because <filename>boot0</filename> is loaded by the + <acronym>BIOS</acronym> to address <literal>0x7c00</literal>, it + copies itself to address <literal>0x600</literal> and then + transfers control there (recall that it was linked to execute at + address <literal>0x600</literal>). The source address, + <literal>0x7c00</literal>, is copied to register + <literal>%si</literal>. The destination address, + <literal>0x600</literal>, is copied to register <literal>%di</literal>. + The number of bytes to copy, <literal>512</literal> (the + program's size), is copied to register <literal>%cx</literal>. + Next, the <literal>rep</literal> instruction repeats the + instruction that follows, that is, <literal>movsb</literal>, the + number of times dictated by the <literal>%cx</literal> register. + The <literal>movsb</literal> instruction copies the byte pointed + to by <literal>%si</literal> to the address pointed to by + <literal>%di</literal>. This is repeated another 511 times. On + each repetition, both the source and destination registers, + <literal>%si</literal> and <literal>%di</literal>, are + incremented by one. 
Thus, upon completion of the 512-byte copy, + <literal>%di</literal> has the value + <literal>0x600</literal>+<literal>512</literal>= + <literal>0x800</literal>, and <literal>%si</literal> has the + value <literal>0x7c00</literal>+<literal>512</literal>= + <literal>0x7e00</literal>; we have thus completed the code + <emphasis>relocation</emphasis>.</para> + + <para>Next, the destination register + <literal>%di</literal> is copied to <literal>%bp</literal>. + <literal>%bp</literal> gets the value <literal>0x800</literal>. + The value <literal>16</literal> is copied to + <literal>%cl</literal> in preparation for a new string operation + (like our previous <literal>movsb</literal>). Now, + <literal>stosb</literal> is executed 16 times. This instruction + copies a <literal>0</literal> value to the address pointed to by + the destination register (<literal>%di</literal>, which is + <literal>0x800</literal>), and increments it. This is repeated + another 15 times, so <literal>%di</literal> ends up with value + <literal>0x810</literal>. Effectively, this clears the address + range <literal>0x800</literal>-<literal>0x80f</literal>. This + range is used as a (fake) partition table for writing the + <acronym>MBR</acronym> back to disk. Finally, the sector field + for the <acronym>CHS</acronym> addressing of this fake partition + is given the value 1 and a jump is made to the main function + from the relocated code. Note that until this jump to the + relocated code, any reference to an absolute address was + avoided.</para> + + <para>The following code block tests whether the drive number + provided by the <acronym>BIOS</acronym> should be used, or + the one stored in <filename>boot0</filename>.</para> + + <figure xml:id="boot-boot0-drivenumber"> + <title><filename>sys/boot/i386/boot0/boot0.S</filename></title> + + <programlisting>main: + testb $SETDRV,-69(%bp) # Set drive number? + jnz disable_update # Yes + testb %dl,%dl # Drive number valid? 
+ js save_curdrive # Possibly (0x80 set)</programlisting> + </figure> + + <para>This code tests the <literal>SETDRV</literal> bit + (<literal>0x20</literal>) in the <emphasis>flags</emphasis> + variable. Recall that register <literal>%bp</literal> points to + address location <literal>0x800</literal>, so the test is done + to the <emphasis>flags</emphasis> variable at address + <literal>0x800</literal>-<literal>69</literal>= + <literal>0x7bb</literal>. This is an example of the type of + modifications that can be done to <filename>boot0</filename>. + The <literal>SETDRV</literal> flag is not set by default, but it + can be set in the <filename>Makefile</filename>. When set, the + drive number stored in the <acronym>MBR</acronym> is used + instead of the one provided by the <acronym>BIOS</acronym>. We + assume the defaults, and that the <acronym>BIOS</acronym> + provided a valid drive number, so we jump to + <literal>save_curdrive</literal>.</para> + + <para>The next block saves the drive number provided by the + <acronym>BIOS</acronym>, and calls <literal>putn</literal> to + print a new line on the screen.</para> + + <figure xml:id="boot-boot0-savedrivenumber"> + <title><filename>sys/boot/i386/boot0/boot0.S</filename></title> + + <programlisting>save_curdrive: + movb %dl, (%bp) # Save drive number + pushw %dx # Also in the stack +#ifdef TEST /* test code, print internal bios drive */ + rolb $1, %dl + movw $drive, %si + call putkey +#endif + callw putn # Print a newline</programlisting> + </figure> + + <para>Note that we assume <varname>TEST</varname> is not defined, + so the conditional code in it is not assembled and will not + appear in our executable <filename>boot0</filename>.</para> + + <para>Our next block implements the actual scanning of the + partition table. It prints to the screen the partition type for + each of the four entries in the partition table. It compares + each type with a list of well-known operating system file + systems. 
Examples of recognized partition types are + <acronym>NTFS</acronym> (&windows;, ID 0x7), + <literal>ext2fs</literal> (&linux;, ID 0x83), and, of course, + <literal>ffs</literal>/<literal>ufs2</literal> (&os;, ID 0xa5). + The implementation is fairly simple.</para> + + <figure xml:id="boot-boot0-partition-scan"> + <title><filename>sys/boot/i386/boot0/boot0.S</filename></title> + + <programlisting> movw $(partbl+0x4),%bx # Partition table (+4) + xorw %dx,%dx # Item number + +read_entry: + movb %ch,-0x4(%bx) # Zero active flag (ch == 0) + btw %dx,_FLAGS(%bp) # Entry enabled? + jnc next_entry # No + movb (%bx),%al # Load type + test %al, %al # skip empty partition + jz next_entry + movw $bootable_ids,%di # Lookup tables + movb $(TLEN+1),%cl # Number of entries + repne # Locate + scasb # type + addw $(TLEN-1), %di # Adjust + movb (%di),%cl # Partition + addw %cx,%di # description + callw putx # Display it + +next_entry: + incw %dx # Next item + addb $0x10,%bl # Next entry + jnc read_entry # Till done</programlisting> + </figure> + + <para>It is important to note that the active flag for each entry + is cleared, so after the scanning, <emphasis>no</emphasis> + partition entry is active in our memory copy of + <filename>boot0</filename>. Later, the active flag will be set + for the selected partition. This ensures that only one active + partition exists if the user chooses to write the changes back + to disk.</para> + + <para>The next block tests for other drives. At startup, + the <acronym>BIOS</acronym> writes the number of drives present + in the computer to address <literal>0x475</literal>. If there + are any other drives present, <filename>boot0</filename> prints + the current drive to screen. 
The user may command + <filename>boot0</filename> to scan partitions on another drive + later.</para> + + <figure xml:id="boot-boot0-test-drives"> + <title><filename>sys/boot/i386/boot0/boot0.S</filename></title> + + <programlisting> popw %ax # Drive number + subb $0x7f,%al # Does next + cmpb 0x475,%al # drive exist? (from BIOS?) + jb print_drive # Yes + decw %ax # Already drive 0? + jz print_prompt # Yes</programlisting> + </figure> + + <para>We make the assumption that a single drive is present, so + the jump to <literal>print_drive</literal> is not performed. We + also assume nothing strange happened, so we jump to + <literal>print_prompt</literal>.</para> + + <para>This next block just prints out a prompt followed by the + default option:</para> + + <figure xml:id="boot-boot0-prompt"> + <title><filename>sys/boot/i386/boot0/boot0.S</filename></title> + + <programlisting>print_prompt: + movw $prompt,%si # Display + callw putstr # prompt + movb _OPT(%bp),%dl # Display + decw %si # default + callw putkey # key + jmp start_input # Skip beep</programlisting> + </figure> + + <para>Finally, a jump is performed to + <literal>start_input</literal>, where the + <acronym>BIOS</acronym> services are used to start a timer and + for reading user input from the keyboard; if the timer expires, + the default option will be selected:</para> + + <figure xml:id="boot-boot0-start-input"> + <title><filename>sys/boot/i386/boot0/boot0.S</filename></title> + + <programlisting>start_input: + xorb %ah,%ah # BIOS: Get + int $0x1a # system time + movw %dx,%di # Ticks when + addw _TICKS(%bp),%di # timeout +read_key: + movb $0x1,%ah # BIOS: Check + int $0x16 # for keypress + jnz got_key # Have input + xorb %ah,%ah # BIOS: int 0x1a, 00 + int $0x1a # get system time + cmpw %di,%dx # Timeout? + jb read_key # No</programlisting> + </figure> + + <para>An interrupt is requested with number + <literal>0x1a</literal> and argument <literal>0</literal> in + register <literal>%ah</literal>. 
The <acronym>BIOS</acronym> + has a predefined set of services, requested by applications as + software-generated interrupts through the <literal>int</literal> + instruction and receiving arguments in registers (in this case, + <literal>%ah</literal>). Here, particularly, we are requesting + the number of clock ticks since last midnight; this value is + computed by the <acronym>BIOS</acronym> through the + <acronym>RTC</acronym> (Real Time Clock). This clock can be + programmed to work at frequencies ranging from 2 Hz to + 8192 Hz. The <acronym>BIOS</acronym> sets it to + 18.2 Hz at startup. When the request is satisfied, a + 32-bit result is returned by the <acronym>BIOS</acronym> in + registers <literal>%cx</literal> and <literal>%dx</literal> + (lower bytes in <literal>%dx</literal>). This result (the + <literal>%dx</literal> part) is copied to register + <literal>%di</literal>, and the value of the + <varname>TICKS</varname> variable is added to + <literal>%di</literal>. This variable resides in + <filename>boot0</filename> at offset <literal>_TICKS</literal> + (a negative value) from register <literal>%bp</literal> (which, + recall, points to <literal>0x800</literal>). The default value + of this variable is <literal>0xb6</literal> (182 in decimal). + Now, the idea is that <filename>boot0</filename> constantly + requests the time from the <acronym>BIOS</acronym>, and when the + value returned in register <literal>%dx</literal> is greater + than the value stored in <literal>%di</literal>, the time is up + and the default selection will be made. Since the RTC ticks + 18.2 times per second, this condition will be met after 10 + seconds (this default behaviour can be changed in the + <filename>Makefile</filename>). 
Until this time has passed, + <filename>boot0</filename> continually asks the + <acronym>BIOS</acronym> for any user input; this is done through + <literal>int 0x16</literal>, argument <literal>1</literal> in + <literal>%ah</literal>.</para> + + <para>Whether a key was pressed or the time expired, subsequent + code validates the selection. Based on the selection, the + register <literal>%si</literal> is set to point to the + appropriate partition entry in the partition table. This new + selection overrides the previous default one. Indeed, it + becomes the new default. Finally, the ACTIVE flag of the + selected partition is set. If it was enabled at compile time, + the in-memory version of <filename>boot0</filename> with these + modified values is written back to the <acronym>MBR</acronym> on + disk. We leave the details of this implementation to the + reader.</para> + + <para>We now end our study with the last code block from the + <filename>boot0</filename> program:</para> + + <figure xml:id="boot-boot0-check-bootable"> + <title><filename>sys/boot/i386/boot0/boot0.S</filename></title> + + <programlisting> movw $0x7c00,%bx # Address for read + movb $0x2,%ah # Read sector + callw intx13 # from disk + jc beep # If error + cmpw $0xaa55,0x1fe(%bx) # Bootable? + jne beep # No + pushw %si # Save ptr to selected part. + callw putn # Leave some space + popw %si # Restore, next stage uses it + jmp *%bx # Invoke bootstrap</programlisting> + </figure> + + <para>Recall that <literal>%si</literal> points to the selected + partition entry. This entry tells us where the partition begins + on disk. 
We assume, of course, that the partition selected is + actually a &os; slice.</para> + + <note> + <para>From now on, we will favor the use of the technically + more accurate term <quote>slice</quote> rather than + <quote>partition</quote>.</para> + </note> + + <para>The transfer buffer is set to <literal>0x7c00</literal> + (register <literal>%bx</literal>), and a read for the first + sector of the &os; slice is requested by calling + <literal>intx13</literal>. We assume that everything went okay, + so a jump to <literal>beep</literal> is not performed. In + particular, the new sector read must end with the magic sequence + <literal>0xaa55</literal>. Finally, the value in + <literal>%si</literal> (the pointer to the selected partition + table entry) is preserved for use by the next stage, and a jump is + performed to address <literal>0x7c00</literal>, where execution + of our next stage (the just-read block) is started.</para> + </sect1> + + <sect1 xml:id="boot-boot1"> + <title><literal>boot1</literal> Stage</title> + + <para>So far we have gone through the following sequence:</para> <itemizedlist> <listitem> - <para>modifies the bootable flag for the selected partition to - make it bootable, and clears the previous</para> + <para>The <acronym>BIOS</acronym> did some early hardware + initialization, including the <acronym>POST</acronym>. The + <acronym>MBR</acronym> (<filename>boot0</filename>) was + loaded from absolute disk sector one to address + <literal>0x7c00</literal>. Execution control was passed to + that location.</para> </listitem> <listitem> - <para>saves itself to disk to remember what partition (slice) - has been selected so to use it as the default on the next - boot</para> + <para><filename>boot0</filename> relocated itself to the + location it was linked to execute + (<literal>0x600</literal>), followed by a jump to continue + execution at the appropriate place. 
Finally, + <filename>boot0</filename> loaded the first disk sector from + the &os; slice to address <literal>0x7c00</literal>. + Execution control was passed to that location.</para> </listitem> + </itemizedlist> + + <para><filename>boot1</filename> is the next step in the + boot-loading sequence. It is the first of three boot stages. + Note that we have been dealing exclusively + with disk sectors. Indeed, the <acronym>BIOS</acronym> loads + the absolute first sector, while <filename>boot0</filename> + loads the first sector of the &os; slice. Both loads are to + address <literal>0x7c00</literal>. We can conceptually think of + these disk sectors as containing the files + <filename>boot0</filename> and <filename>boot1</filename>, + respectively, but in reality this is not entirely true for + <filename>boot1</filename>. Strictly speaking, unlike + <filename>boot0</filename>, <filename>boot1</filename> is not + part of the boot blocks + <footnote> + <para>There is a file <filename>/boot/boot1</filename>, but it + is not written to the beginning of the &os; slice. + Instead, it is concatenated with <filename>boot2</filename> + to form <filename>boot</filename>, which + <emphasis>is</emphasis> written to the beginning of the &os; + slice and read at boot time.</para></footnote>. + Instead, a single, full-blown file, <filename>boot</filename> + (<filename>/boot/boot</filename>), is what ultimately is + written to disk. This file is a combination of + <filename>boot1</filename>, <filename>boot2</filename> and the + <literal>Boot Extender</literal> (or <acronym>BTX</acronym>). + This single file is greater in size than a single sector + (greater than 512 bytes). 
Fortunately, + <filename>boot1</filename> occupies <emphasis>exactly</emphasis> + the first 512 bytes of this single file, so when + <filename>boot0</filename> loads the first sector of the &os; + slice (512 bytes), it is actually loading + <filename>boot1</filename> and transferring control to + it.</para> + + <para>The main task of <filename>boot1</filename> is to load the + next boot stage. This next stage is somewhat more complex. It + is composed of a server called the <quote>Boot Extender</quote>, + or <acronym>BTX</acronym>, and a client, called + <filename>boot2</filename>. As we will see, the last boot + stage, <filename>loader</filename>, is also a client of the + <acronym>BTX</acronym> server.</para> + + <para>Let us now look in detail at what exactly is done by + <filename>boot1</filename>, starting like we did for + <filename>boot0</filename>, at its entry point:</para> + + <figure xml:id="boot-boot1-entry"> + <title><filename>sys/boot/i386/boot2/boot1.S</filename></title> + + <programlisting>start: + jmp main</programlisting> + </figure> + + <para>The entry point at <literal>start</literal> simply jumps + past a special data area to the label <literal>main</literal>, + which in turn looks like this:</para> + + <figure xml:id="boot-boot1-main"> + <title><filename>sys/boot/i386/boot2/boot1.S</filename></title> + + <programlisting>main: + cld # String ops inc + xor %cx,%cx # Zero + mov %cx,%es # Address + mov %cx,%ds # data + mov %cx,%ss # Set up + mov $start,%sp # stack + mov %sp,%si # Source + mov $0x700,%di # Destination + incb %ch # Word count + rep # Copy + movsw # code</programlisting> + </figure> + + <para>Just like <filename>boot0</filename>, this + code relocates <filename>boot1</filename>, + this time to memory address <literal>0x700</literal>. However, + unlike <filename>boot0</filename>, it does not jump there. 
+ <filename>boot1</filename> is linked to execute at + address <literal>0x7c00</literal>, effectively where it was + loaded in the first place. The reason for this relocation will + be discussed shortly.</para> + + <para>Next comes a loop that looks for the &os; slice. Although + <filename>boot0</filename> loaded <filename>boot1</filename> + from the &os; slice, no information was passed to it about this + <footnote> + <para>Actually, we did pass a pointer to the slice entry in + register <literal>%si</literal>. However, + <filename>boot1</filename> does not assume that it was + loaded by <filename>boot0</filename> (perhaps some other + <acronym>MBR</acronym> loaded it, and did not pass this + information), so it assumes nothing.</para></footnote>, + so <filename>boot1</filename> must rescan the + partition table to find where the &os; slice starts. Therefore + it rereads the <acronym>MBR</acronym>:</para> + + <figure xml:id="boot-boot1-find-freebsd"> + <title><filename>sys/boot/i386/boot2/boot1.S</filename></title> + + <programlisting> mov $part4,%si # Partition + cmpb $0x80,%dl # Hard drive? + jb main.4 # No + movb $0x1,%dh # Block count + callw nread # Read MBR</programlisting> + </figure> + + <para>In the code above, register <literal>%dl</literal> + maintains information about the boot device. This is passed on + by the <acronym>BIOS</acronym> and preserved by the + <acronym>MBR</acronym>. Numbers <literal>0x80</literal> and + greater tell us that we are dealing with a hard drive, so a + call is made to <literal>nread</literal>, where the + <acronym>MBR</acronym> is read. Arguments to + <literal>nread</literal> are passed through + <literal>%si</literal> and <literal>%dh</literal>. The memory + address at label <literal>part4</literal> is copied to + <literal>%si</literal>. This memory address holds a + <quote>fake partition</quote> to be used by + <literal>nread</literal>. 
The following is the data in the fake + partition:</para> + + <figure xml:id="boot-boot2-make-fake-partition"> + <title><filename>sys/boot/i386/boot2/boot1.S</filename></title> + + <programlisting> part4: + .byte 0x80, 0x00, 0x01, 0x00 + .byte 0xa5, 0xfe, 0xff, 0xff + .byte 0x00, 0x00, 0x00, 0x00 + .byte 0x50, 0xc3, 0x00, 0x00</programlisting> + </figure> + + <para>In particular, the <acronym>LBA</acronym> for this fake + partition is hardcoded to zero. This is used as an argument to + the <acronym>BIOS</acronym> for reading absolute sector one from + the hard drive. Alternatively, CHS addressing could be used. + In this case, the fake partition holds cylinder 0, head 0 and + sector 1, which is equivalent to absolute sector one.</para> + + <para>Let us now look at + <literal>nread</literal>:</para> + + <figure xml:id="boot-boot1-nread"> + <title><filename>sys/boot/i386/boot2/boot1.S</filename></title> + + <programlisting>nread: + mov $0x8c00,%bx # Transfer buffer + mov 0x8(%si),%ax # Get + mov 0xa(%si),%cx # LBA + push %cs # Read from + callw xread.1 # disk + jnc return # If success, return</programlisting> + </figure> + + <para>Recall that <literal>%si</literal> points to the fake + partition. The word + <footnote> + <para>In the context of 16-bit real mode, a word is 2 + bytes.</para></footnote> + at offset <literal>0x8</literal> is copied to register + <literal>%ax</literal> and the word at offset <literal>0xa</literal> + to <literal>%cx</literal>. They are interpreted by the + <acronym>BIOS</acronym> as the lower 4-byte value denoting the + LBA to be read (the upper four bytes are assumed to be zero). + Register <literal>%bx</literal> holds the memory address where + the <acronym>MBR</acronym> will be loaded. The instruction + pushing <literal>%cs</literal> onto the stack is very + interesting. In this context, it accomplishes nothing. 
However, as + we will see shortly, <filename>boot2</filename>, in conjunction + with the <acronym>BTX</acronym> server, also uses + <literal>xread.1</literal>. This mechanism will be discussed in + the next section.</para> + + <para>The code at <literal>xread.1</literal> further calls *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
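Illustrative note, not part of the commit: the 16 bytes at label part4 in the diff above can be decoded by hand to check the claims made in the text (LBA hardcoded to zero, CHS start of cylinder 0, head 0, sector 1, type 0xa5). The following is a minimal sketch in Python, assuming the conventional 16-byte MBR partition entry layout. The names PART4, decode_chs, decode_entry and nread_registers are hypothetical helpers introduced only for this sketch; they do not exist in the FreeBSD sources.

```python
import struct

# The 16 bytes at label part4, copied from the listing in the diff.
PART4 = bytes([
    0x80, 0x00, 0x01, 0x00,
    0xa5, 0xfe, 0xff, 0xff,
    0x00, 0x00, 0x00, 0x00,
    0x50, 0xc3, 0x00, 0x00,
])


def decode_chs(raw):
    """Decode a 3-byte packed CHS address into (cylinder, head, sector)."""
    head = raw[0]
    sector = raw[1] & 0x3f                      # low 6 bits
    cylinder = ((raw[1] & 0xc0) << 2) | raw[2]  # 2 high bits + 8 low bits
    return cylinder, head, sector


def decode_entry(entry):
    """Decode one partition entry (assumed standard MBR layout)."""
    flag, chs_start, ptype, chs_end, lba, sectors = struct.unpack(
        "<B3sB3sII", entry)
    return {
        "flag": flag,
        "chs_start": decode_chs(chs_start),
        "type": ptype,
        "chs_end": decode_chs(chs_end),
        "lba": lba,
        "sectors": sectors,
    }


def nread_registers(entry):
    """Mimic the two loads in nread: words at 0x8 and 0xa go to %ax, %cx."""
    ax, cx = struct.unpack_from("<HH", entry, 8)
    return ax, cx


if __name__ == "__main__":
    print(decode_entry(PART4))
    print(nread_registers(PART4))
```

Under these assumptions, the decoded starting LBA is zero and both words loaded by nread are zero, which matches the description of reading the absolute first sector of the disk.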