Date:      Mon, 18 Apr 2022 16:37:21 +0200
From:      Sergio Carlavilla <carlavilla@freebsd.org>
To:        Daniel Ebdrup Jensen <debdrup@freebsd.org>
Cc:        doc-committers@freebsd.org, dev-commits-doc-all@freebsd.org
Subject:   Re: git: 954bbbabe3 - main - arch-handbook: Update boot chapter
Message-ID:  <CAFwocyM9xNBOQi9wz4YarmgBOA627FQR5-Xmu0=y1sOeKGwhXw@mail.gmail.com>
In-Reply-To: <202204181433.23IEXIQk023321@gitrepo.freebsd.org>
References:  <202204181433.23IEXIQk023321@gitrepo.freebsd.org>

On Mon, 18 Apr 2022 at 16:33, Daniel Ebdrup Jensen <debdrup@freebsd.org> wrote:
>
> The branch main has been updated by debdrup:
>
> URL: https://cgit.FreeBSD.org/doc/commit/?id=954bbbabe38e5dddddeee2774f4330f99b62d912
>
> commit 954bbbabe38e5dddddeee2774f4330f99b62d912
> Author:     Isa <isa@isoux.org>
> AuthorDate: 2022-04-03 21:29:27 +0000
> Commit:     Daniel Ebdrup Jensen <debdrup@FreeBSD.org>
> CommitDate: 2022-04-18 09:17:23 +0000
>
>     arch-handbook: Update boot chapter
>
>     A lot has changed in the code since RELEASE 10.0 in 2014, when this
>     document last received a major content change.
>
>     One significant change is in the path to the boot folder, ie
>     src/sys/boot has become src/stand/.
>
>     Another change is that various code blocks have had their sample texts
>     updated, such as the dmesg now looking like it does on a new install.
>
>     Similarly, the assembly code has been updated with the relevant sections
>     from the source tree. The spacing has been changed to be maximally
>     compatible with the original version.
>
>     Reviewed by:    imp (src), Pau Amma <pauamma@gundo.com>
>     Pull Request:   https://github.com/freebsd/freebsd-doc/pull/60
> ---
>  .../en/books/arch-handbook/boot/_index.adoc        | 466 ++++++++++-----------
>  1 file changed, 233 insertions(+), 233 deletions(-)
>
> diff --git a/documentation/content/en/books/arch-handbook/boot/_index.adoc b/documentation/content/en/books/arch-handbook/boot/_index.adoc
> index ebed0609ca..c280b5fe12 100644
> --- a/documentation/content/en/books/arch-handbook/boot/_index.adoc
> +++ b/documentation/content/en/books/arch-handbook/boot/_index.adoc
> @@ -50,14 +50,14 @@ endif::[]
>  [[boot-synopsis]]
>  == Synopsis
>
> -This chapter is an overview of the boot and system initialization processes, starting from the BIOS (firmware) POST, to the first user process creation. Since the initial steps of system startup are very architecture dependent, the IA-32 architecture is used as an example.
> +This chapter is an overview of the boot and system initialization processes, starting from the BIOS (firmware) POST, to the first user process creation. Since the initial steps of system startup are very architecture dependent, the IA-32 architecture is used as an example. But the AMD64 and ARM64 architectures are much more important and compelling examples and should be explained in the near future according to the topic of this document.
>
>  The FreeBSD boot process can be surprisingly complex. After control is passed from the BIOS, a considerable amount of low-level configuration must be done before the kernel can be loaded and executed. This setup must be done in a simple and flexible manner, allowing the user a great deal of customization possibilities.
>
>  [[boot-overview]]
>  == Overview
>
> -The boot process is an extremely machine-dependent activity. Not only must code be written for every computer architecture, but there may also be multiple types of booting on the same architecture. For example, a directory listing of [.filename]#/usr/src/sys/boot# reveals a great amount of architecture-dependent code. There is a directory for each of the various supported architectures. In the x86-specific [.filename]#i386# directory, there are subdirectories for different boot standards like [.filename]#mbr# (Master Boot Record), [.filename]#gpt# (GUID Partition Table), and [.filename]#efi# (Extensible Firmware Interface). Each boot standard has its own conventions and data structures. The example that follows shows booting an x86 computer from an MBR hard drive with the FreeBSD [.filename]#boot0# multi-boot loader stored in the very first sector. That boot code starts the FreeBSD three-stage boot process.
> +The boot process is an extremely machine-dependent activity. Not only must code be written for every computer architecture, but there may also be multiple types of booting on the same architecture. For example, a directory listing of [.filename]#stand# reveals a great amount of architecture-dependent code. There is a directory for each of the various supported architectures. FreeBSD supports the CSM boot standard (Compatibility Support Module). So CSM is supported (with both GPT and MBR partitioning support) and UEFI booting (GPT is totally supported, MBR is mostly supported). It also supports loading files from ext2fs, MSDOS, UFS and ZFS. FreeBSD also supports the boot environment feature of ZFS which allows the HOST OS to communicate details about what to boot that go beyond a simple partition as was possible in the past. But UEFI is more relevant than the CMS these days. The example that follows shows booting an x86 computer from an MBR-partitioned hard drive with the FreeBSD [.filename]#boot0# multi-boot loader stored in the very first sector. That boot code starts the FreeBSD three-stage boot process.
>
>  The key to understanding this process is that it is a series of stages of increasing complexity. These stages are [.filename]#boot1#, [.filename]#boot2#, and [.filename]#loader# (see man:boot[8] for more detail). The boot system executes each stage in sequence. The last stage, [.filename]#loader#, is responsible for loading the FreeBSD kernel. Each stage is examined in the following sections.
>
> @@ -85,8 +85,8 @@ a|
>
>  [source,bash]
>  ....
> ->>FreeBSD/i386 BOOT
> -Default: 1:ad(1,a)/boot/loader
> +>>FreeBSD/x86 BOOT
> +Default: 0:ad(0p4)/boot/loader
>  boot:
>  ....
>
> @@ -102,7 +102,7 @@ BIOS 639kB/2096064kB available memory
>
>  FreeBSD/x86 bootstrap loader, Revision 1.1
>  Console internal video/keyboard
> -(root@snap.freebsd.org, Thu Jan 16 22:18:05 UTC 2014)
> +(root@releng1.nyi.freebsd.org, Fri Apr  9 04:04:45 UTC 2021)
>  Loading /boot/defaults/loader.conf
>  /boot/kernel/kernel text=0xed9008 data=0x117d28+0x176650 syms=[0x8+0x137988+0x8+0x1515f8]
>  ....
> @@ -112,13 +112,13 @@ a|
>
>  [source,bash]
>  ....
> -Copyright (c) 1992-2013 The FreeBSD Project.
> +Copyright (c) 1992-2021 The FreeBSD Project.
>  Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>          The Regents of the University of California. All rights reserved.
>  FreeBSD is a registered trademark of The FreeBSD Foundation.
> -FreeBSD 10.0-RELEASE 0 r260789: Thu Jan 16 22:34:59 UTC 2014
> -    root@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
> -FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610
> +FreeBSD 13.0-RELEASE 0 releng/13.0-n244733-ea31abc261f: Fri Apr  9 04:04:45 UTC 2021
> +    root@releng1.nyi.freebsd.org:/usr/obj/usr/src/i386.i386/sys/GENERIC i386
> +FreeBSD clang version 11.0.1 (git@github.com:llvm/llvm-project.git llvmorg-11.0.1-0-g43ff75f2c3fe)
>  ....
>
>  |===
> @@ -143,7 +143,7 @@ This sector is our boot-sequence starting point. As we will see, this sector con
>
>  After control is received from the BIOS at memory address `0x7c00`, [.filename]#boot0# starts executing. It is the first piece of code under FreeBSD control. The task of [.filename]#boot0# is quite simple: scan the partition table and let the user choose which partition to boot from. The Partition Table is a special, standard data structure embedded in the MBR (hence embedded in [.filename]#boot0#) describing the four standard PC "partitions". [.filename]#boot0# resides in the filesystem as [.filename]#/boot/boot0#. It is a small 512-byte file, and it is exactly what FreeBSD's installation procedure wrote to the hard disk's MBR if you chose the "bootmanager" option at installation time. Indeed, [.filename]#boot0#_is_ the MBR.
>
> -As mentioned previously, the `INT 0x19` instruction causes the `INT 0x19` handler to load an MBR ([.filename]#boot0#) into memory at address `0x7c00`. The source file for [.filename]#boot0# can be found in [.filename]#sys/boot/i386/boot0/boot0.S# - which is an awesome piece of code written by Robert Nordier.
> +As mentioned previously, we're calling the BIOS `INT 0x19` to load the MBR ([.filename]#boot0#) into memory at address `0x7c00`. The source file for [.filename]#boot0# can be found in [.filename]#stand/i386/boot0/boot0.S# - which is an awesome piece of code written by Robert Nordier.
>
>  A special structure starting from offset `0x1be` in the MBR is called the _partition table_. It has four records of 16 bytes each, called _partition records_, which represent how the hard disk is partitioned, or, in FreeBSD's terminology, sliced. One byte of those 16 says whether a partition (slice) is bootable or not. Exactly one record must have that flag set, otherwise [.filename]#boot0#'s code will refuse to proceed.
>
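For readers following along, the partition table layout described in the paragraph above can be sketched in a few lines of Python. This is only an illustration, assuming the classic MBR record layout (flag byte at offset 0, type byte at offset 4, 32-bit LBA start at offset 8, 32-bit sector count at offset 12). The function names and the synthetic sector are made up for this example and are not taken from the FreeBSD tree.

```python
import struct

PRT_OFF = 0x1BE     # offset of the partition table inside the MBR
PRT_NUM = 4         # number of partition records
PRT_SIZE = 16       # bytes per partition record
MAGIC_OFF = 0x1FE   # offset of the 0xaa55 boot signature
ACTIVE = 0x80       # value of the flag byte for a bootable slice


def parse_partition_table(sector):
    """Return a list of (active, type, lba_start, sector_count) tuples."""
    if len(sector) != 512:
        raise ValueError("an MBR is exactly one 512-byte sector")
    (magic,) = struct.unpack_from("<H", sector, MAGIC_OFF)
    if magic != 0xAA55:
        raise ValueError("missing 0xaa55 boot signature")
    entries = []
    for i in range(PRT_NUM):
        # B = flag, 3x = CHS start (skipped), B = type, 3x = CHS end (skipped),
        # I = LBA start, I = sector count; all little-endian.
        flag, ptype, lba, count = struct.unpack_from(
            "<B3xB3xII", sector, PRT_OFF + i * PRT_SIZE)
        entries.append((flag == ACTIVE, ptype, lba, count))
    return entries


def make_demo_mbr():
    """Build a synthetic sector with one active FreeBSD (0xa5) slice."""
    buf = bytearray(512)
    struct.pack_into("<B3xB3xII", buf, PRT_OFF, ACTIVE, 0xA5, 63, 1000)
    struct.pack_into("<H", buf, MAGIC_OFF, 0xAA55)
    return bytes(buf)


entries = parse_partition_table(make_demo_mbr())
active = [e for e in entries if e[0]]
print(entries[0], len(active))
```

The sketch only decodes the table; it does not try to reproduce what boot0 does when zero or several records are flagged active.
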
> @@ -160,16 +160,15 @@ The MBR must fit into 512 bytes, a single disk sector. This program uses low-lev
>
>  Note that the [.filename]#boot0.S# source file is assembled "as is": instructions are translated one by one to binary, with no additional information (no ELF file format, for example). This kind of low-level control is achieved at link time through special control flags passed to the linker. For example, the text section of the program is set to be located at address `0x600`. In practice this means that [.filename]#boot0# must be loaded to memory address `0x600` in order to function properly.
>
> -It is worth looking at the [.filename]#Makefile# for [.filename]#boot0# ([.filename]#sys/boot/i386/boot0/Makefile#), as it defines some of the run-time behavior of [.filename]#boot0#. For instance, if a terminal connected to the serial port (COM1) is used for I/O, the macro `SIO` must be defined (`-DSIO`). `-DPXE` enables boot through PXE by pressing kbd:[F6]. Additionally, the program defines a set of _flags_ that allow further modification of its behavior. All of this is illustrated in the [.filename]#Makefile#. For example, look at the linker directives which command the linker to start the text section at address `0x600`, and to build the output file "as is" (strip out any file formatting):
> +It is worth looking at the [.filename]#Makefile# for [.filename]#boot0# ([.filename]#stand/i386/boot0/Makefile#), as it defines some of the run-time behavior of [.filename]#boot0#. For instance, if a terminal connected to the serial port (COM1) is used for I/O, the macro `SIO` must be defined (`-DSIO`). `-DPXE` enables boot through PXE by pressing kbd:[F6]. Additionally, the program defines a set of _flags_ that allow further modification of its behavior. All of this is illustrated in the [.filename]#Makefile#. For example, look at the linker directives which command the linker to start the text section at address `0x600`, and to build the output file "as is" (strip out any file formatting):
>
>  [.programlisting]
>  ....
>        BOOT_BOOT0_ORG?=0x600
> -      LDFLAGS=-e start -Ttext ${BOOT_BOOT0_ORG} \
> -      -Wl,-N,-S,--oformat,binary
> +      ORG=${BOOT_BOOT0_ORG}
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/Makefile# [[boot-boot0-makefile-as-is]]
> +.[.filename]#stand/i386/boot0/Makefile# [[boot-boot0-makefile-as-is]]
>  Let us now start our study of the MBR, or [.filename]#boot0#, starting where execution begins.
>
>  [NOTE]
> @@ -185,46 +184,50 @@ start:
>        movw %ax,%es             # Address
>        movw %ax,%ds             #  data
>        movw %ax,%ss             # Set up
> -      movw 0x7c00,%sp          #  stack
> +      movw $LOAD,%sp           #  stack
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-entrypoint]]
> -This first block of code is the entry point of the program. It is where the BIOS transfers control. First, it makes sure that the string operations autoincrement its pointer operands (the `cld` instruction) footnote:[When in doubt, we refer the reader to the official Intel manuals, which describe the exact semantics for each instruction: .]. Then, as it makes no assumption about the state of the segment registers, it initializes them. Finally, it sets the stack pointer register (`%sp`) to address `0x7c00`, so we have a working stack.
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-entrypoint]]
> +This first block of code is the entry point of the program. It is where the BIOS transfers control. First, it makes sure that the string operations autoincrement its pointer operands (the `cld` instruction) footnote:[When in doubt, we refer the reader to the official Intel manuals, which describe the exact semantics for each instruction: .]. Then, as it makes no assumption about the state of the segment registers, it initializes them. Finally, it sets the stack pointer register (`%sp`) to ($LOAD = address `0x7c00`), so we have a working stack.
>
>  The next block is responsible for the relocation and subsequent jump to the relocated code.
>
>  [.programlisting]
>  ....
> -      movw $0x7c00,%si # Source
> -      movw $0x600,%di          # Destination
> -      movw $512,%cx            # Word count
> +      movw %sp,%si     # Source
> +      movw $start,%di          # Destination
> +      movw $0x100,%cx          # Word count
>        rep                      # Relocate
> -      movsb                    #  code
> +      movsw                    #  code
>        movw %di,%bp             # Address variables
> -      movb $16,%cl             # Words to clear
> +      movb $0x8,%cl            # Words to clear
>        rep                      # Zero
> -      stosb                    #  them
> +      stosw                    #  them
>        incb -0xe(%di)           # Set the S field to 1
> -      jmp main-0x7c00+0x600    # Jump to relocated code
> +      jmp main-LOAD+ORIGIN     # Jump to relocated code
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-relocation]]
> -As [.filename]#boot0# is loaded by the BIOS to address `0x7C00`, it copies itself to address `0x600` and then transfers control there (recall that it was linked to execute at address `0x600`). The source address, `0x7c00`, is copied to register `%si`. The destination address, `0x600`, to register `%di`. The number of bytes to copy, `512` (the program's size), is copied to register `%cx`. Next, the `rep` instruction repeats the instruction that follows, that is, `movsb`, the number of times dictated by the `%cx` register. The `movsb` instruction copies the byte pointed to by `%si` to the address pointed to by `%di`. This is repeated another 511 times. On each repetition, both the source and destination registers, `%si` and `%di`, are incremented by one. Thus, upon completion of the 512-byte copy, `%di` has the value `0x600`+`512`= `0x800`, and `%si` has the value `0x7c00`+`512`= `0x7e00`; we have thus completed the code _relocation_.
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-relocation]]
> +As [.filename]#boot0# is loaded by the BIOS to address `0x7C00`, it copies itself to address `0x600` and then transfers control there (recall that it was linked to execute at address `0x600`). The source address, `0x7c00`, is copied to register `%si`. The destination address, `0x600`, to register `%di`. The number of words to copy, `256` (the program's size = 512 bytes), is copied to register `%cx`. Next, the `rep` instruction repeats the instruction that follows, that is, `movsw`, the number of times dictated by the `%cx` register. The `movsw` instruction copies the word pointed to by `%si` to the address pointed to by `%di`. This is repeated another 255 times. On each repetition, both the source and destination registers, `%si` and `%di`, are incremented by one. Thus, upon completion of the 256-word (512-byte) copy, `%di` has the value `0x600`+`512`= `0x800`, and `%si` has the value `0x7c00`+`512`= `0x7e00`; we have thus completed the code _relocation_. Since the last update of this document, the copy instructions have changed in the code, so instead of the movsb and stosb, movsw and stosw have been introduced, which copy 2 bytes(1 word) in one iteration.
>
> -Next, the destination register `%di` is copied to `%bp`. `%bp` gets the value `0x800`. The value `16` is copied to `%cl` in preparation for a new string operation (like our previous `movsb`). Now, `stosb` is executed 16 times. This instruction copies a `0` value to the address pointed to by the destination register (`%di`, which is `0x800`), and increments it. This is repeated another 15 times, so `%di` ends up with value `0x810`. Effectively, this clears the address range `0x800`-`0x80f`. This range is used as a (fake) partition table for writing the MBR back to disk. Finally, the sector field for the CHS addressing of this fake partition is given the value 1 and a jump is made to the main function from the relocated code. Note that until this jump to the relocated code, any reference to an absolute address was avoided.
> +Next, the destination register `%di` is copied to `%bp`. `%bp` gets the value `0x800`. The value `8` is copied to `%cl` in preparation for a new string operation (like our previous `movsw`). Now, `stosw` is executed 8 times. This instruction copies a `0` value to the address pointed to by the destination register (`%di`, which is `0x800`), and increments it. This is repeated another 7 times, so `%di` ends up with value `0x810`. Effectively, this clears the address range `0x800`-`0x80f`. This range is used as a (fake) partition table for writing the MBR back to disk. Finally, the sector field for the CHS addressing of this fake partition is given the value 1 and a jump is made to the main function from the relocated code. Note that until this jump to the relocated code, any reference to an absolute address was avoided.
>
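A small simulation may help check the register arithmetic in the two paragraphs above. It is a sketch, not the real code: it models memory as a flat bytearray, assumes the direction flag is clear (as after `cld`), and uses the names `rep_movsw` and `rep_stosw` purely for illustration. With word-sized string operations each iteration advances the pointers by two bytes, which is how 256 iterations of `movsw` end at `0x800` and `0x7e00`, and 8 iterations of `stosw` end at `0x810`.

```python
LOAD = 0x7C00     # address where the BIOS loads the MBR
ORIGIN = 0x600    # address boot0 is linked to run at


def rep_movsw(mem, si, di, cx):
    """Copy cx 16-bit words from si to di, stepping both pointers by 2."""
    while cx:
        mem[di:di + 2] = mem[si:si + 2]
        si += 2
        di += 2
        cx -= 1
    return si, di


def rep_stosw(mem, di, cx, ax=0):
    """Store the 16-bit value ax at di, cx times, stepping di by 2."""
    while cx:
        mem[di:di + 2] = ax.to_bytes(2, "little")
        di += 2
        cx -= 1
    return di


mem = bytearray(0x10000)                        # 64 KiB of simulated memory
image = bytes(i % 251 for i in range(512))      # stand-in for the boot0 code
mem[LOAD:LOAD + 512] = image
mem[0x800:0x810] = b"\xff" * 16                 # garbage that must be cleared

si, di = rep_movsw(mem, LOAD, ORIGIN, 0x100)    # relocate 256 words
bp = di                                         # movw %di,%bp
di = rep_stosw(mem, di, 0x8)                    # zero 8 words

print(hex(si), hex(bp), hex(di))
```
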
>  The following code block tests whether the drive number provided by the BIOS should be used, or the one stored in [.filename]#boot0#.
>
>  [.programlisting]
>  ....
>  main:
> -      testb $SETDRV,-69(%bp)   # Set drive number?
> +      testb $SETDRV,_FLAGS(%bp)        # Set drive number?
> +#ifndef CHECK_DRIVE    /* disable drive checks */
> +      jz save_curdrive         # no, use the default
> +#else
>        jnz disable_update       # Yes
>        testb %dl,%dl            # Drive number valid?
>        js save_curdrive         # Possibly (0x80 set)
> +#endif
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-drivenumber]]
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-drivenumber]]
>  This code tests the `SETDRV` bit (`0x20`) in the _flags_ variable. Recall that register `%bp` points to address location `0x800`, so the test is done to the _flags_ variable at address `0x800`-`69`= `0x7bb`. This is an example of the type of modifications that can be done to [.filename]#boot0#. The `SETDRV` flag is not set by default, but it can be set in the [.filename]#Makefile#. When set, the drive number stored in the MBR is used instead of the one provided by the BIOS. We assume the defaults, and that the BIOS provided a valid drive number, so we jump to `save_curdrive`.
>
>  The next block saves the drive number provided by the BIOS, and calls `putn` to print a new line on the screen.
> @@ -242,7 +245,7 @@ save_curdrive:
>        callw putn               # Print a newline
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-savedrivenumber]]
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-savedrivenumber]]
>  Note that we assume `TEST` is not defined, so the conditional code in it is not assembled and will not appear in our executable [.filename]#boot0#.
>
>  Our next block implements the actual scanning of the partition table. It prints to the screen the partition type for each of the four entries in the partition table. It compares each type with a list of well-known operating system file systems. Examples of recognized partition types are NTFS (Windows(R), ID 0x7), `ext2fs` (Linux(R), ID 0x83), and, of course, `ffs`/`ufs2` (FreeBSD, ID 0xa5). The implementation is fairly simple.
> @@ -274,7 +277,7 @@ next_entry:
>        jnc read_entry           # Till done
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-partition-scan]]
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-partition-scan]]
>  It is important to note that the active flag for each entry is cleared, so after the scanning, _no_ partition entry is active in our memory copy of [.filename]#boot0#. Later, the active flag will be set for the selected partition. This ensures that only one active partition exists if the user chooses to write the changes back to disk.
>
>  The next block tests for other drives. At startup, the BIOS writes the number of drives present in the computer to address `0x475`. If there are any other drives present, [.filename]#boot0# prints the current drive to screen. The user may command [.filename]#boot0# to scan partitions on another drive later.
> @@ -282,14 +285,14 @@ The next block tests for other drives. At startup, the BIOS writes the number of
>  [.programlisting]
>  ....
>        popw %ax                 # Drive number
> -      subb $0x79,%al           # Does next
> -      cmpb 0x475,%al           #  drive exist? (from BIOS?)
> +      subb $0x80-0x1,%al               # Does next
> +      cmpb NHRDRV,%al          #  drive exist? (from BIOS?)
>        jb print_drive           # Yes
>        decw %ax                 # Already drive 0?
>        jz print_prompt          # Yes
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-test-drives]]
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-test-drives]]
>  We make the assumption that a single drive is present, so the jump to `print_drive` is not performed. We also assume nothing strange happened, so we jump to `print_prompt`.
>
>  This next block just prints out a prompt followed by the default option:
> @@ -305,7 +308,7 @@ print_prompt:
>        jmp start_input          # Skip beep
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-prompt]]
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-prompt]]
>  Finally, a jump is performed to `start_input`, where the BIOS services are used to start a timer and for reading user input from the keyboard; if the timer expires, the default option will be selected:
>
>  [.programlisting]
> @@ -325,7 +328,7 @@ read_key:
>        jb read_key              # No
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-start-input]]
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-start-input]]
>  An interrupt is requested with number `0x1a` and argument `0` in register `%ah`. The BIOS has a predefined set of services, requested by applications as software-generated interrupts through the `int` instruction and receiving arguments in registers (in this case, `%ah`). Here, particularly, we are requesting the number of clock ticks since last midnight; this value is computed by the BIOS through the RTC (Real Time Clock). This clock can be programmed to work at frequencies ranging from 2 Hz to 8192 Hz. The BIOS sets it to 18.2 Hz at startup. When the request is satisfied, a 32-bit result is returned by the BIOS in registers `%cx` and `%dx` (lower bytes in `%dx`). This result (the `%dx` part) is copied to register `%di`, and the value of the `TICKS` variable is added to `%di`. This variable resides in [.filename]#boot0# at offset `_TICKS` (a negative value) from register `%bp` (which, recall, points to `0x800`). The default value of this variable is `0xb6` (182 in decimal). Now, the idea is that [.filename]#boot0# constantly requests the time from the BIOS, and when the value returned in register `%dx` is greater than the value stored in `%di`, the time is up and the default selection will be made. Since the RTC ticks 18.2 times per second, this condition will be met after 10 seconds (this default behavior can be changed in the [.filename]#Makefile#). Until this time has passed, [.filename]#boot0# continually asks the BIOS for any user input; this is done through `int 0x16`, argument `1` in `%ah`.
>
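The timeout arithmetic in the paragraph above is easy to verify. The sketch below uses only the two figures quoted in the text (18.2 ticks per second and a default `TICKS` value of `0xb6`); the function names are illustrative. It also shows that, because `%di` is a 16-bit register, the computed deadline wraps around modulo 0x10000.

```python
TICK_HZ = 18.2    # BIOS tick rate quoted in the text
TICKS = 0xB6      # default value of the TICKS variable (182 decimal)


def timeout_seconds(ticks=TICKS, hz=TICK_HZ):
    """Length of the boot prompt timeout in seconds."""
    return ticks / hz


def deadline(dx_now, ticks=TICKS):
    """Value left in %di after adding TICKS to the low word of the tick count.

    The addition is done in a 16-bit register, so it wraps at 0x10000.
    """
    return (dx_now + ticks) & 0xFFFF


print(round(timeout_seconds(), 3))
print(hex(deadline(0x1000)), hex(deadline(0xFFF0)))
```
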
>  Whether a key was pressed or the time expired, subsequent code validates the selection. Based on the selection, the register `%si` is set to point to the appropriate partition entry in the partition table. This new selection overrides the previous default one. Indeed, it becomes the new default. Finally, the ACTIVE flag of the selected partition is set. If it was enabled at compile time, the in-memory version of [.filename]#boot0# with these modified values is written back to the MBR on disk. We leave the details of this implementation to the reader.
> @@ -334,11 +337,11 @@ We now end our study with the last code block from the [.filename]#boot0# progra
>
>  [.programlisting]
>  ....
> -      movw $0x7c00,%bx         # Address for read
> +      movw $LOAD,%bx           # Address for read
>        movb $0x2,%ah            # Read sector
>        callw intx13             #  from disk
>        jc beep                  # If error
> -      cmpw $0xaa55,0x1fe(%bx)  # Bootable?
> +      cmpw $MAGIC,0x1fe(%bx)   # Bootable?
>        jne beep                 # No
>        pushw %si                        # Save ptr to selected part.
>        callw putn               # Leave some space
> @@ -346,7 +349,7 @@ We now end our study with the last code block from the [.filename]#boot0# progra
>        jmp *%bx                 # Invoke bootstrap
>  ....
>
> -.[.filename]#sys/boot/i386/boot0/boot0.S# [[boot-boot0-check-bootable]]
> +.[.filename]#stand/i386/boot0/boot0.S# [[boot-boot0-check-bootable]]
>  Recall that `%si` points to the selected partition entry. This entry tells us where the partition begins on disk. We assume, of course, that the partition selected is actually a FreeBSD slice.
>
>  [NOTE]
> @@ -376,7 +379,7 @@ start:
>         jmp main
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-entry]]
> +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-entry]]
>  The entry point at `start` simply jumps past a special data area to the label `main`, which in turn looks like this:
>
>  [.programlisting]
> @@ -389,13 +392,13 @@ main:
>        mov %cx,%ss              # Set up
>        mov $start,%sp           #  stack
>        mov %sp,%si              # Source
> -      mov $0x700,%di           # Destination
> +      mov $MEM_REL,%di         # Destination
>        incb %ch                 # Word count
>        rep                      # Copy
>        movsw                    #  code
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-main]]
> +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-main]]
>  Just like [.filename]#boot0#, this code relocates [.filename]#boot1#, this time to memory address `0x700`. However, unlike [.filename]#boot0#, it does not jump there. [.filename]#boot1# is linked to execute at address `0x7c00`, effectively where it was loaded in the first place. The reason for this relocation will be discussed shortly.
>
>  Next comes a loop that looks for the FreeBSD slice. Although [.filename]#boot0# loaded [.filename]#boot1# from the FreeBSD slice, no information was passed to it about this footnote:[Actually we did pass a pointer to the slice entry in register %si. However, boot1 does not assume that it was loaded by boot0 (perhaps some other MBR loaded it, and did not pass this information), so it assumes nothing.], so [.filename]#boot1# must rescan the partition table to find where the FreeBSD slice starts. Therefore it rereads the MBR:
> @@ -409,7 +412,7 @@ Next comes a loop that looks for the FreeBSD slice. Although [.filename]#boot0#
>        callw nread              # Read MBR
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-find-freebsd]]
> +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-find-freebsd]]
>  In the code above, register `%dl` maintains information about the boot device. This is passed on by the BIOS and preserved by the MBR. Numbers `0x80` and greater tells us that we are dealing with a hard drive, so a call is made to `nread`, where the MBR is read. Arguments to `nread` are passed through `%si` and `%dh`. The memory address at label `part4` is copied to `%si`. This memory address holds a "fake partition" to be used by `nread`. The following is the data in the fake partition:
>
>  [.programlisting]
> @@ -421,7 +424,7 @@ In the code above, register `%dl` maintains information about the boot device. T
>         .byte 0x50, 0xc3, 0x00, 0x00
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/Makefile# [[boot-boot2-make-fake-partition]]
> +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot2-make-fake-partition]]
>  In particular, the LBA for this fake partition is hardcoded to zero. This is used as an argument to the BIOS for reading absolute sector one from the hard drive. Alternatively, CHS addressing could be used. In this case, the fake partition holds cylinder 0, head 0 and sector 1, which is equivalent to absolute sector one.
>
>  Let us now proceed to take a look at `nread`:
> @@ -429,7 +432,7 @@ Let us now proceed to take a look at `nread`:
>  [.programlisting]
>  ....
>  nread:
> -      mov $0x8c00,%bx          # Transfer buffer
> +      mov $MEM_BUF,%bx         # Transfer buffer
>        mov 0x8(%si),%ax         # Get
>        mov 0xa(%si),%cx         #  LBA
>        push %cs                 # Read from
> @@ -437,7 +440,7 @@ nread:
>        jnc return               # If success, return
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-nread]]
> +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-nread]]
>  Recall that `%si` points to the fake partition. The word footnote:[In the context of 16-bit real mode, a word is 2 bytes.] at offset `0x8` is copied to register `%ax` and word at offset `0xa` to `%cx`. They are interpreted by the BIOS as the lower 4-byte value denoting the LBA to be read (the upper four bytes are assumed to be zero). Register `%bx` holds the memory address where the MBR will be loaded. The instruction pushing `%cs` onto the stack is very interesting. In this context, it accomplishes nothing. However, as we will see shortly, [.filename]#boot2#, in conjunction with the BTX server, also uses `xread.1`. This mechanism will be discussed in the next section.
>
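To make the `%ax`/`%cx` split in the paragraph above concrete, here is a hedged sketch of how the two 16-bit words at offsets `0x8` and `0xa` of a partition record combine into the low 32 bits of the LBA. The helper name `lba_from_entry` and the sample records are assumptions made for illustration. Only the four bytes `0x50, 0xc3, 0x00, 0x00` come from the listing quoted earlier, and they decode to the little-endian value 50000.

```python
import struct


def lba_from_entry(entry):
    """Combine the words at offsets 0x8 (low) and 0xa (high) into an LBA."""
    ax, cx = struct.unpack_from("<HH", entry, 0x8)
    return (cx << 16) | ax


# Hypothetical 16-byte record whose LBA start is 0x00012345.
sample = struct.pack("<B3xB3xII", 0x80, 0xA5, 0x00012345, 1000)

# Hypothetical "fake partition" with the LBA hardcoded to zero, as in the text.
fake = struct.pack("<B3xB3xII", 0x80, 0xA5, 0, 50000)

# The last four bytes of the quoted listing, read as a little-endian dword.
(last_dword,) = struct.unpack("<I", bytes([0x50, 0xC3, 0x00, 0x00]))

print(hex(lba_from_entry(sample)), lba_from_entry(fake), last_dword)
```
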
>  The code at `xread.1` further calls the `read` function, which actually calls the BIOS asking for the disk sector:
> @@ -460,7 +463,7 @@ xread.1:
>         lret                    # To far caller
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-xread1]]
> +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-xread1]]
>  Note the long return instruction at the end of this block. This instruction pops out the `%cs` register pushed by `nread`, and returns. Finally, `nread` also returns.
>
>  With the MBR loaded to memory, the actual loop for searching the FreeBSD slice begins:
> @@ -469,10 +472,10 @@ With the MBR loaded to memory, the actual loop for searching the FreeBSD slice b
>  ....
>         mov $0x1,%cx             # Two passes
>  main.1:
> -       mov $0x8dbe,%si # Partition table
> +       mov $MEM_BUF+PRT_OFF,%si # Partition table
>         movb $0x1,%dh            # Partition
>  main.2:
> -       cmpb $0xa5,0x4(%si)      # Our partition type?
> +       cmpb $PRT_BSD,0x4(%si)   # Our partition type?
>         jne main.3               # No
>         jcxz main.5              # If second pass
>         testb $0x80,(%si)        # Active?
> @@ -480,32 +483,32 @@ main.2:
>  main.3:
>         add $0x10,%si            # Next entry
>         incb %dh                 # Partition
> -       cmpb $0x5,%dh            # In table?
> +       cmpb $0x1+PRT_NUM,%dh            # In table?
>         jb main.2                # Yes
>         dec %cx                  # Do two
>         jcxz main.1              #  passes
>  ....
>
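
The two-pass search in the listing above can be sketched as follows. This is an illustration only: it assumes that the instruction elided by the hunk boundary after `testb` branches to `main.5` when the active flag is set, and it takes `PRT_BSD = 0xa5` and `PRT_NUM = 4` from the literal values on the removed lines. The function name and the tuple layout are hypothetical.

```python
PRT_BSD = 0xA5  # FreeBSD partition type (literal on the removed line)
PRT_NUM = 4     # entries in the table (removed line compared %dh with 0x5)
ACTIVE = 0x80   # active flag tested with testb $0x80,(%si)

def find_freebsd_slice(entries):
    """entries: list of (flag, partition_type). Return 1-based number or None."""
    # Pass one accepts only an active FreeBSD entry; pass two accepts any.
    for require_active in (True, False):
        for number, (flag, ptype) in enumerate(entries[:PRT_NUM], start=1):
            if ptype != PRT_BSD:
                continue
            if not require_active or flag & ACTIVE:
                return number
    return None

# An inactive FreeBSD entry in slot 2 loses to an active one in slot 3.
print(find_freebsd_slice([(0, 0x83), (0, 0xA5), (0x80, 0xA5), (0, 0)]))
```
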
> -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-find-part]]
> +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-find-part]]
>  If a FreeBSD slice is identified, execution continues at `main.5`. Note =
that when a FreeBSD slice is found `%si` points to the appropriate entry in=
 the partition table, and `%dh` holds the partition number. We assume that =
a FreeBSD slice is found, so we continue execution at `main.5`:
>
>  [.programlisting]
>  ....
>  main.5:
> -       mov %dx,0x900                      # Save args
> -       movb $0x10,%dh                     # Sector count
> +       mov %dx,MEM_ARG                    # Save args
> +       movb $NSECT,%dh                    # Sector count
>         callw nread                        # Read disk
> -       mov $0x9000,%bx                    # BTX
> +       mov $MEM_BTX,%bx                           # BTX
>         mov 0xa(%bx),%si                   # Get BTX length and set
>         add %bx,%si                        #  %si to start of boot2.bin
> -       mov $0xc000,%di                    # Client page 2
> -       mov $0xa200,%cx                    # Byte
> +       mov $MEM_USR+SIZ_PAG*2,%di                         # Client page 2
> +       mov $MEM_BTX+(NSECT-1)*SIZ_SEC,%cx                         # Byte
>         sub %si,%cx                        #  count
>         rep                                # Relocate
>         movsb                              #  client
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-main5]]
> +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-main5]]
>  Recall that at this point, register `%si` points to the FreeBSD slice en=
try in the MBR partition table, so a call to `nread` will effectively read =
sectors at the beginning of this partition. The argument passed on register=
 `%dh` tells `nread` to read 16 disk sectors. Recall that the first 512 byt=
es, or the first sector of the FreeBSD slice, coincides with the [.filename=
]#boot1# program. Also recall that the file written to the beginning of the=
 FreeBSD slice is not [.filename]#/boot/boot1#, but [.filename]#/boot/boot#=
. Let us look at the size of these files in the filesystem:
>
>  [source,bash]
> @@ -550,7 +553,7 @@ seta20.3:
>         jmp 0x9010              # Start BTX
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/boot1.S# [[boot-boot1-seta20]]
> +.[.filename]#stand/i386/boot2/boot1.S# [[boot-boot1-seta20]]
>  Note that right before the jump, interrupts are enabled.
>
>  [[btx-server]]
> @@ -562,7 +565,7 @@ Next in our boot sequence is the BTX Server. Let us q=
uickly remember how we got
>  * [.filename]#boot0# relocates itself to `0x600`, the address at which it was linked to execute, and jumps there. It then reads the first sector of the FreeBSD slice (which consists of [.filename]#boot1#) into address `0x7c00` and jumps there.
>  * [.filename]#boot1# loads the first 16 sectors of the FreeBSD slice into address `0x8c00`. These 16 sectors, or 8192 bytes, are the whole file [.filename]#boot#. The file is a concatenation of [.filename]#boot1# and [.filename]#boot2#. [.filename]#boot2#, in turn, contains the BTX server and the [.filename]#boot2# client. Finally, a jump is made to address `0x9010`, the entry point of the BTX server.
>
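
The addresses in the recap above fit together arithmetically. The sketch below assumes that the loaded image is laid out as one sector of [.filename]#boot1#, then the 512-byte [.filename]#boot2.ldr# area, then the BTX server; that ordering is an inference from the load address `0x8c00` and the BTX link address `0x9000`, not something stated in this hunk. The dictionary keys are labels chosen for illustration.

```python
SECTOR = 512
MEM_BUF = 0x8C00  # where boot1 loads the first sectors of the slice
NSECT = 16        # sectors read by boot1

# Assumed layout of the loaded image (see the lead-in).
layout = {
    "boot1": MEM_BUF,
    "boot2.ldr": MEM_BUF + SECTOR,
    "btx": MEM_BUF + 2 * SECTOR,
}
load_end = MEM_BUF + NSECT * SECTOR  # first address past the loaded image

for name, address in layout.items():
    print(f"{name:10s} {address:#06x}")
print(f"{'end':10s} {load_end:#06x}")
```
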
> -Before studying the BTX Server in detail, let us further review how the =
single, all-in-one [.filename]#boot# file is created. The way [.filename]#b=
oot# is built is defined in its [.filename]#Makefile# ([.filename]#/usr/src=
/sys/boot/i386/boot2/Makefile#). Let us look at the rule that creates the [=
.filename]#boot# file:
> +Before studying the BTX Server in detail, let us further review how the =
single, all-in-one [.filename]#boot# file is created. The way [.filename]#b=
oot# is built is defined in its [.filename]#Makefile# ([.filename]#stand/i3=
86/boot2/Makefile#). Let us look at the rule that creates the [.filename]#b=
oot# file:
>
>  [.programlisting]
>  ....
> @@ -570,19 +573,19 @@ Before studying the BTX Server in detail, let us fu=
rther review how the single,
>         cat boot1 boot2 > boot
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/Makefile# [[boot-boot1-make-boot]]
> +.[.filename]#stand/i386/boot2/Makefile# [[boot-boot1-make-boot]]
>  This tells us that [.filename]#boot1# and [.filename]#boot2# are needed,=
 and the rule simply concatenates them to produce a single file called [.fi=
lename]#boot#. The rules for creating [.filename]#boot1# are also quite sim=
ple:
>
>  [.programlisting]
>  ....
>        boot1: boot1.out
> -       objcopy -S -O binary boot1.out boot1
> +       ${OBJCOPY} -S -O binary boot1.out ${.TARGET}
>
>        boot1.out: boot1.o
> -       ld -e start -Ttext 0x7c00 -o boot1.out boot1.o
> +       ${LD} ${LD_FLAGS} -e start --defsym ORG=${ORG1} -T ${LDSCRIPT} -o ${.TARGET} boot1.o
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/Makefile# [[boot-boot1-make-boot1]]
> +.[.filename]#stand/i386/boot2/Makefile# [[boot-boot1-make-boot1]]
>  To apply the rule for creating [.filename]#boot1#, [.filename]#boot1.out=
# must be resolved. This, in turn, depends on the existence of [.filename]#=
boot1.o#. This last file is simply the result of assembling our familiar [.=
filename]#boot1.S#, without linking. Now, the rule for creating [.filename]=
#boot1.out# is applied. This tells us that [.filename]#boot1.o# should be l=
inked with `start` as its entry point, and starting at address `0x7c00`. Fi=
nally, [.filename]#boot1# is created from [.filename]#boot1.out# applying t=
he appropriate rule. This rule is the [.filename]#objcopy# command applied =
to [.filename]#boot1.out#. Note the flags passed to [.filename]#objcopy#: `=
-S` tells it to strip all relocation and symbolic information; `-O binary` =
indicates the output format, that is, a simple, unformatted binary file.
>
>  Having [.filename]#boot1#, let us take a look at how [.filename]#boot2# =
is constructed:
> @@ -590,30 +593,22 @@ Having [.filename]#boot1#, let us take a look at ho=
w [.filename]#boot2# is const
>  [.programlisting]
>  ....
>        boot2: boot2.ld
> -       @set -- `ls -l boot2.ld`; x=$$((7680-$$5)); \
> +       @set -- `ls -l ${.ALLSRC}`; x=$$((${BOOT2SIZE}-$$5)); \
>             echo "$$x bytes available"; test $$x -ge 0
> -       dd if=boot2.ld of=boot2 obs=7680 conv=osync
> +       ${DD} if=${.ALLSRC} of=${.TARGET} bs=${BOOT2SIZE} conv=sync
>
> -      boot2.ld: boot2.ldr boot2.bin ../btx/btx/btx
> -       btxld -v -E 0x2000 -f bin -b ../btx/btx/btx -l boot2.ldr \
> -           -o boot2.ld -P 1 boot2.bin
> +      boot2.ld: boot2.ldr boot2.bin ${BTXKERN}
> +       btxld -v -E ${ORG2} -f bin -b ${BTXKERN} -l boot2.ldr \
> +           -o ${.TARGET} -P 1 boot2.bin
>
>        boot2.ldr:
> -       dd if=/dev/zero of=boot2.ldr bs=512 count=1
> +       ${DD} if=/dev/zero of=${.TARGET} bs=512 count=1
>
>        boot2.bin: boot2.out
> -       objcopy -S -O binary boot2.out boot2.bin
> +       ${OBJCOPY} -S -O binary boot2.out ${.TARGET}
>
> -      boot2.out: ../btx/lib/crt0.o boot2.o sio.o
> -       ld -Ttext 0x2000 -o boot2.out
> -
> -      boot2.o: boot2.s
> -       ${CC} ${ACFLAGS} -c boot2.s
> -
> -      boot2.s: boot2.c boot2.h ${.CURDIR}/../../common/ufsread.c
> -       ${CC} ${CFLAGS} -S -o boot2.s.tmp ${.CURDIR}/boot2.c
> -       sed -e '/align/d' -e '/nop/d' "MISSING" boot2.s.tmp > boot2.s
> -       rm -f boot2.s.tmp
> +      boot2.out: ${BTXCRT} boot2.o sio.o ashldi3.o
> +       ${LD} ${LD_FLAGS} --defsym ORG=${ORG2} -T ${LDSCRIPT} -o ${.TARGET} ${.ALLSRC}
>
>        boot2.h: boot1.out
>         ${NM} -t d ${.ALLSRC} | awk '/([0-9])+ T xread/ \
> @@ -623,21 +618,19 @@ Having [.filename]#boot1#, let us take a look at ho=
w [.filename]#boot2# is const
>             REL1=`printf "%d" ${REL1}` > ${.TARGET}
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/Makefile# [[boot-boot1-make-boot2]]
> +.[.filename]#stand/i386/boot2/Makefile# [[boot-boot1-make-boot2]]
>  The mechanism for building [.filename]#boot2# is far more elaborate. Let=
 us point out the most relevant facts. The dependency list is as follows:
>
>  [.programlisting]
>  ....
>        boot2: boot2.ld
> -      boot2.ld: boot2.ldr boot2.bin ${BTXDIR}/btx/btx
> +      boot2.ld: boot2.ldr boot2.bin ${BTXDIR}
>        boot2.bin: boot2.out
> -      boot2.out: ${BTXDIR}/lib/crt0.o boot2.o sio.o
> -      boot2.o: boot2.s
> -      boot2.s: boot2.c boot2.h ${.CURDIR}/../../common/ufsread.c
> +      boot2.out: ${BTXDIR} boot2.o sio.o ashldi3.o
>        boot2.h: boot1.out
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/Makefile# [[boot-boot1-make-boot2-more]]
> +.[.filename]#stand/i386/boot2/Makefile# [[boot-boot1-make-boot2-more]]
>  Note that initially there is no header file [.filename]#boot2.h#, but it=
s creation depends on [.filename]#boot1.out#, which we already have. The ru=
le for its creation is a bit terse, but the important thing is that the out=
put, [.filename]#boot2.h#, is something like this:
>
>  [.programlisting]
> @@ -645,12 +638,12 @@ Note that initially there is no header file [.filen=
ame]#boot2.h#, but its creati
>  #define XREADORG 0x725
>  ....
>
> -.[.filename]#sys/boot/i386/boot2/boot2.h# [[boot-boot1-make-boot2h]]
> +.[.filename]#stand/i386/boot2/boot2.h# [[boot-boot1-make-boot2h]]
>  Recall that [.filename]#boot1# was relocated (i.e., copied from `0x7c00`=
 to `0x700`). This relocation will now make sense, because as we will see, =
the BTX server reclaims some memory, including the space where [.filename]#=
boot1# was originally loaded. However, the BTX server needs access to [.fil=
ename]#boot1#'s `xread` function; this function, according to the output of=
 [.filename]#boot2.h#, is at location `0x725`. Indeed, the BTX server uses =
the `xread` function from [.filename]#boot1#'s relocated code. This functio=
n is now accessible from within the [.filename]#boot2# client.
>
> -We next build [.filename]#boot2.s# from files [.filename]#boot2.h#, [.filename]#boot2.c# and [.filename]#/usr/src/sys/boot/common/ufsread.c#. The rule for this is to compile the code in [.filename]#boot2.c# (which includes [.filename]#boot2.h# and [.filename]#ufsread.c#) into assembly code. Having [.filename]#boot2.s#, the next rule assembles [.filename]#boot2.s#, creating the object file [.filename]#boot2.o#. The next rule directs the linker to link various files ([.filename]#crt0.o#, [.filename]#boot2.o# and [.filename]#sio.o#). Note that the output file, [.filename]#boot2.out#, is linked to execute at address `0x2000`. Recall that [.filename]#boot2# will be executed in user mode, within a special user segment set up by the BTX server. This segment starts at `0xa000`. Also, remember that the [.filename]#boot2# portion of [.filename]#boot# was copied to address `0xc000`, that is, offset `0x2000` from the start of the user segment, so [.filename]#boot2# will work properly when we transfer control to it. Next, [.filename]#boot2.bin# is created from [.filename]#boot2.out# by stripping its symbols and format information; boot2.bin is a _raw_ binary. Now, note that a file [.filename]#boot2.ldr# is created as a 512-byte file full of zeros. This space is reserved for the bsdlabel.
> +The next rule directs the linker to link various files ([.filename]#ashldi3.o#, [.filename]#boot2.o# and [.filename]#sio.o#). Note that the output file, [.filename]#boot2.out#, is linked to execute at address `0x2000` (`${ORG2}`). Recall that [.filename]#boot2# will be executed in user mode, within a special user segment set up by the BTX server. This segment starts at `0xa000`. Also, remember that the [.filename]#boot2# portion of [.filename]#boot# was copied to address `0xc000`, that is, offset `0x2000` from the start of the user segment, so [.filename]#boot2# will work properly when we transfer control to it. Next, [.filename]#boot2.bin# is created from [.filename]#boot2.out# by stripping its symbols and format information; [.filename]#boot2.bin# is a _raw_ binary. Now, note that a file [.filename]#boot2.ldr# is created as a 512-byte file full of zeros. This space is reserved for the bsdlabel.
>
> -Now that we have files [.filename]#boot1#, [.filename]#boot2.bin# and [.=
filename]#boot2.ldr#, only the BTX server is missing before creating the al=
l-in-one [.filename]#boot# file. The BTX server is located in [.filename]#/=
usr/src/sys/boot/i386/btx/btx#; it has its own [.filename]#Makefile# with i=
ts own set of rules for building. The important thing to notice is that it =
is also compiled as a _raw_ binary, and that it is linked to execute at add=
ress `0x9000`. The details can be found in [.filename]#/usr/src/sys/boot/i3=
86/btx/btx/Makefile#.
> +Now that we have files [.filename]#boot1#, [.filename]#boot2.bin# and [.=
filename]#boot2.ldr#, only the BTX server is missing before creating the al=
l-in-one [.filename]#boot# file. The BTX server is located in [.filename]#s=
tand/i386/btx/btx#; it has its own [.filename]#Makefile# with its own set o=
f rules for building. The important thing to notice is that it is also comp=
iled as a _raw_ binary, and that it is linked to execute at address `0x9000=
`. The details can be found in [.filename]#stand/i386/btx/btx/Makefile#.
>
>  Having the files that comprise the [.filename]#boot# program, the final =
step is to _merge_ them. This is done by a special program called [.filenam=
e]#btxld# (source located in [.filename]#/usr/src/usr.sbin/btxld#). Some ar=
guments to this program include the name of the output file ([.filename]#bo=
ot#), its entry point (`0x2000`) and its file format (raw binary). The vari=
ous files are finally merged by this utility into the file [.filename]#boot=
#, which consists of [.filename]#boot1#, [.filename]#boot2#, the `bsdlabel`=
 and the BTX server. This file, which takes exactly 16 sectors, or 8192 byt=
es, is what is actually written to the beginning of the FreeBSD slice durin=
g installation. Let us now proceed to study the BTX server program.
>
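
The size check and the padding done by the `boot2` rule, and the resulting 16-sector total, can be sketched as follows. The sketch assumes `${BOOT2SIZE}` still equals 7680, the literal used on the removed lines, and it models `dd ... conv=sync` as zero-padding a short input up to one full block. `pad_boot2` and the 7000-byte image are hypothetical.

```python
SECTOR = 512
BOOT2SIZE = 7680  # assumed: same value as the literal in the old rule

def pad_boot2(image: bytes, size: int = BOOT2SIZE) -> bytes:
    """Mimic the rule: fail if the image is too large, else zero-pad it."""
    available = size - len(image)  # what the rule echoes as "bytes available"
    if available < 0:
        raise ValueError(f"boot2 is {-available} bytes too large")
    return image + bytes(available)

boot1 = bytes(SECTOR)              # boot1 occupies exactly one sector
boot2 = pad_boot2(b"\x90" * 7000)  # hypothetical 7000-byte boot2.ld
boot = boot1 + boot2               # cat boot1 boot2 > boot
print(len(boot), len(boot) // SECTOR)
```

With these assumptions the concatenation is always 8192 bytes, regardless of how much of the 7680 bytes the real image uses.
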
> @@ -680,7 +673,7 @@ btx_hdr:    .byte 0xeb                      # Machine=
 ID
>                 .long 0x0                       # Entry address
>  ....
>
> -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-header]]
> +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-header]]
>  Note the first two bytes are `0xeb` and `0xe`. In the IA-32 architecture=
, these two bytes are interpreted as a relative jump past the header into t=
he entry point, so in theory, [.filename]#boot1# could jump here (address `=
0x9000`) instead of address `0x9010`. Note that the last field in the BTX h=
eader is a pointer to the client's ([.filename]#boot2#) entry point. This f=
ield is patched at link time.
>
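
To see why the bytes `0xeb 0xe` at `0x9000` lead to `0x9010`, recall that a short jump is two bytes long and that its signed 8-bit displacement is relative to the address of the following instruction. A minimal sketch, with a hypothetical helper name:

```python
def short_jump_target(address: int, code: bytes) -> int:
    """Return the destination of an IA-32 short jump (opcode 0xEB, rel8)."""
    if code[0] != 0xEB:
        raise ValueError("not a short jump")
    rel8 = int.from_bytes(code[1:2], "little", signed=True)
    # The displacement is added to the address of the next instruction.
    return address + 2 + rel8

print(hex(short_jump_target(0x9000, bytes([0xEB, 0x0E]))))
```
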
>  Immediately following the header is the BTX server's entry point:
> @@ -693,14 +686,14 @@ Immediately following the header is the BTX server'=
s entry point:
>  init:          cli                             # Disable interrupts
>                 xor %ax,%ax                     # Zero/segment
>                 mov %ax,%ss                     # Set up
> -               mov $0x1800,%sp         #  stack
> +               mov $MEM_ESP0,%sp               #  stack
>                 mov %ax,%es                     # Address
>                 mov %ax,%ds                     #  data
>                 pushl $0x2                      # Clear
>                 popfl                           #  flags
>  ....
>
> -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-init]]
> +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-init]]
>  This code disables interrupts, sets up a working stack (starting at address `0x1800`) and clears the flags in the EFLAGS register. Note that the `popfl` instruction pops a doubleword (4 bytes) from the stack and places it in the EFLAGS register. As the value actually popped is `2`, the EFLAGS register is effectively cleared (IA-32 requires that bit 1 of the EFLAGS register always be 1).
>
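
A small sketch of why popping the value `2` leaves every flag clear: the only bit set is bit 1 (counting from 0), which is the reserved, always-one bit. The table covers only the common status and control flags, and the helper name is hypothetical.

```python
# Bit positions of the common EFLAGS status and control flags.
FLAG_BITS = {0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF",
             8: "TF", 9: "IF", 10: "DF", 11: "OF"}
RESERVED_ONE = 1 << 1  # bit 1 always reads as 1

def flags_set(value: int) -> list:
    """List the named flags that are set in an EFLAGS value."""
    return [name for bit, name in FLAG_BITS.items() if value & (1 << bit)]

print(flags_set(0x2))    # value popped by popfl in init
print(flags_set(0x202))  # value pushed later for the client
```

The second call uses the `0x202` value that the server later pushes for the client, which differs from `2` only by the interrupt-enable flag.
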
>  Our next code block clears (sets to `0`) the memory range `0x5e00-0x8fff=
`. This range is where the various data structures will be created:
> @@ -710,13 +703,13 @@ Our next code block clears (sets to `0`) the memory=
 range `0x5e00-0x8fff`. This
>  /*
>   * Initialize memory.
>   */
> -               mov $0x5e00,%di         # Memory to initialize
> -               mov $(0x9000-0x5e00)/2,%cx      # Words to zero
> +               mov $MEM_IDT,%di                # Memory to initialize
> +               mov $(MEM_ORG-MEM_IDT)/2,%cx    # Words to zero
>                 rep                             # Zero-fill
>                 stosw                           #  memory
>  ....
>
> -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-clear-mem]]
> +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-clear-mem]]
>  Recall that [.filename]#boot1# was originally loaded to address `0x7c00`=
, so, with this memory initialization, that copy effectively disappeared. H=
owever, also recall that [.filename]#boot1# was relocated to `0x700`, so _t=
hat_ copy is still in memory, and the BTX server will make use of it.
>
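
The arithmetic behind the zero-fill can be checked with a short sketch. It uses `0x5e00` and `0x9000`, the literals that `MEM_IDT` and `MEM_ORG` replaced in the listing; `is_cleared` is a hypothetical helper.

```python
MEM_IDT = 0x5E00  # start of the area to zero
MEM_ORG = 0x9000  # BTX origin; the fill stops just below it

words_to_zero = (MEM_ORG - MEM_IDT) // 2  # stosw stores one 2-byte word per step

def is_cleared(address: int) -> bool:
    """True if the address lies inside the zero-filled range."""
    return MEM_IDT <= address < MEM_ORG

print(words_to_zero)
print(is_cleared(0x7C00), is_cleared(0x700))
```

The original load address of [.filename]#boot1# falls inside the range, while the relocated copy at `0x700` does not.
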
>  Next, the real-mode IVT (Interrupt Vector Table) is updated. The IVT is an array of segment/offset pairs for exception and interrupt handlers. The BIOS normally maps hardware interrupts to interrupt vectors `0x8` to `0xf` and `0x70` to `0x77` but, as will be seen, the 8259A Programmable Interrupt Controller, the chip controlling the actual mapping of hardware interrupts to interrupt vectors, is programmed to remap these interrupt vectors from `0x8-0xf` to `0x20-0x27` and from `0x70-0x77` to `0x28-0x2f`. Thus, interrupt handlers are provided for interrupt vectors `0x20-0x2f`. The reason the BIOS-provided handlers are not used directly is that they work in 16-bit real mode, but not 32-bit protected mode. Processor mode will be switched to 32-bit protected mode shortly. However, the BTX server sets up a mechanism to effectively use the handlers provided by the BIOS:
> @@ -737,7 +730,7 @@ init.0:             mov %bx,(%di)                   #=
 Store IP
>                 loop init.0                     # Next IRQ
>  ....
>
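
The remapping described before the listing can be expressed as two small functions, one for the BIOS default and one for the layout the BTX server programs into the 8259A. Both function names are hypothetical; the vector ranges are the ones quoted in the text.

```python
def bios_vector(irq: int) -> int:
    """Vector used by the BIOS default mapping for a hardware IRQ (0-15)."""
    if not 0 <= irq <= 15:
        raise ValueError("IRQ out of range")
    return 0x08 + irq if irq < 8 else 0x70 + (irq - 8)

def btx_vector(irq: int) -> int:
    """Vector after the remap: IRQ 0-15 become vectors 0x20-0x2f."""
    if not 0 <= irq <= 15:
        raise ValueError("IRQ out of range")
    return 0x20 + irq

for irq in (0, 7, 8, 15):
    print(irq, hex(bios_vector(irq)), hex(btx_vector(irq)))
```
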
> -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-ivt]]
> +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-ivt]]
>  The next block creates the IDT (Interrupt Descriptor Table). The IDT is =
analogous, in protected mode, to the IVT in real mode. That is, the IDT des=
cribes the various exception and interrupt handlers used when the processor=
 is executing in protected mode. In essence, it also consists of an array o=
f segment/offset pairs, although the structure is somewhat more complex, be=
cause segments in protected mode are different than in real mode, and vario=
us protection mechanisms apply:
>
>  [.programlisting]
> @@ -745,7 +738,7 @@ The next block creates the IDT (Interrupt Descriptor =
Table). The IDT is analogou
>  /*
>   * Create IDT.
>   */
> -               mov $0x5e00,%di                 # IDT's address
> +               mov $MEM_IDT,%di                # IDT's address
>                 mov $idtctl,%si                 # Control string
>  init.1:                lodsb                           # Get entry
>                 cbw                             #  count
> @@ -768,7 +761,7 @@ init.3:             lea 0x8(%di),%di                #=
 Next entry
>                 jmp init.1                      # Continue
>  ....
>
> -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-idt]]
> +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-idt]]
>  Each entry in the `IDT` is 8 bytes long. Besides the segment/offset information, they also describe the segment type, privilege level, and whether the segment is present in memory or not. The construction is such that interrupt vectors from `0` to `0xf` (exceptions) are handled by function `intx00`; vector `0x10` (also an exception) is handled by `intx10`; hardware interrupts, which are later configured to start at interrupt vector `0x20` all the way to interrupt vector `0x2f`, are handled by function `intx20`. Lastly, interrupt vector `0x30`, which is used for system calls, is handled by `intx30`, and vectors `0x31` and `0x32` are handled by `intx31`. It must be noted that only descriptors for interrupt vectors `0x30`, `0x31` and `0x32` are given privilege level 3, the same privilege level as the [.filename]#boot2# client, which means the client can execute a software-generated interrupt to these vectors through the `int` instruction without failing (this is the way [.filename]#boot2# uses the services provided by the BTX server). Also, note that _only_ software-generated interrupts are protected from code executing in lesser privilege levels. Hardware-generated interrupts and processor-generated exceptions are _always_ handled adequately, regardless of the actual privileges involved.
>
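
As an illustration of the 8-byte entries described above, the sketch below packs an IA-32 interrupt gate and models the privilege check applied to the `int` instruction. It assumes 32-bit interrupt gates (type `0xe`); the selector `0x8` and the handler offsets are hypothetical placeholders, not values read from btx.S.

```python
import struct

def make_gate(offset: int, selector: int, dpl: int, gate_type: int = 0xE) -> bytes:
    """Pack an 8-byte IA-32 gate descriptor (present bit set)."""
    attributes = 0x80 | (dpl << 5) | gate_type  # P | DPL | type
    return struct.pack("<HHBBH", offset & 0xFFFF, selector, 0,
                       attributes, (offset >> 16) & 0xFFFF)

def int_allowed(cpl: int, gate_dpl: int) -> bool:
    """A software `int` succeeds only if the caller's CPL <= the gate's DPL."""
    return cpl <= gate_dpl

exception_gate = make_gate(0x1234, 0x8, dpl=0)  # hypothetical handler offset
syscall_gate = make_gate(0x5678, 0x8, dpl=3)    # like vectors 0x30-0x32
print(exception_gate.hex(), syscall_gate.hex())
print(int_allowed(3, 0), int_allowed(3, 3))
```

The last line mirrors the text: a client running at privilege level 3 may raise a software interrupt only through a gate whose privilege level is 3.
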
>  The next step is to initialize the TSS (Task-State Segment). The TSS is =
a hardware feature that helps the operating system or executive software im=
plement multitasking functionality through process abstraction. The IA-32 a=
rchitecture demands the creation and use of _at least_ one TSS if multitask=
ing facilities are used or different privilege levels are defined. Since th=
e [.filename]#boot2# client is executed in privilege level 3, but the BTX s=
erver runs in privilege level 0, a TSS must be defined:
> @@ -783,7 +776,7 @@ init.4:             movb $_ESP0H,TSS_ESP0+1(%di)    #=
 Set ESP0
>                 movb $_TSSIO,TSS_MAP(%di)       # Set I/O bit map base
>  ....
>
> -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-tss]]
> +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-tss]]
>  Note that a value is given for the Privilege Level 0 stack pointer and s=
tack segment in the TSS. This is needed because, if an interrupt or excepti=
on is received while executing [.filename]#boot2# in Privilege Level 3, a c=
hange to Privilege Level 0 is automatically performed by the processor, so =
a new working stack is needed. Finally, the I/O Map Base Address field of t=
he TSS is given a value, which is a 16-bit offset from the beginning of the=
 TSS to the I/O Permission Bitmap and the Interrupt Redirection Bitmap.
>
>  After the IDT and TSS are created, the processor is ready to switch to p=
rotected mode. This is done in the next block:
> @@ -807,7 +800,7 @@ init.8:             xorl %ecx,%ecx                  #=
 Zero
>                 movw %cx,%ss                    #  stack
>  ....
>
> -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-prot]]
> +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-prot]]
>  First, a call is made to `setpic` to program the 8259A PIC (Programmable Interrupt Controller). This chip is connected to multiple hardware interrupt sources. Upon receiving an interrupt from a device, it signals the processor with the appropriate interrupt vector. This can be customized so that specific interrupts are associated with specific interrupt vectors, as explained before. Next, the IDTR (Interrupt Descriptor Table Register) and GDTR (Global Descriptor Table Register) are loaded with the instructions `lidt` and `lgdt`, respectively. These registers are loaded with the base address and limit address for the IDT and GDT. The following three instructions set the Protection Enable (PE) bit of the `%cr0` register. This effectively switches the processor to 32-bit protected mode. Next, a long jump is made to `init.8` using segment selector SEL_SCODE, which selects the Supervisor Code Segment. The processor is effectively executing in CPL 0, the most privileged level, after this jump. Finally, the Supervisor Data Segment is selected for the stack by assigning the segment selector SEL_SDATA to the `%ss` register. This data segment also has a privilege level of `0`.
>
>  Our last code block is responsible for loading the TR (Task Register) wi=
th the segment selector for the TSS we created earlier, and setting the Use=
r Mode environment before passing execution control to the [.filename]#boot=
2# client.
> @@ -819,7 +812,7 @@ Our last code block is responsible for loading the TR=
 (Task Register) with the s
>   */
>                 movb $SEL_TSS,%cl               # Set task
>                 ltr %cx                         #  register
> -               movl $0xa000,%edx               # User base address
> +               movl $MEM_USR,%edx              # User base address
>                 movzwl %ss:BDA_MEM,%eax         # Get free memory
>                 shll $0xa,%eax                  # To bytes
>                 subl $ARGSPACE,%eax             # Less arg space
> @@ -838,6 +831,9 @@ Our last code block is responsible for loading the TR=
 (Task Register) with the s
>                 movb $0x7,%cl                   # Set remaining
>  init.9:                push $0x0                       #  general
>                 loop init.9                     #  registers
> +#ifdef BTX_SERIAL
> +               call sio_init                   # setup the serial console
> +#endif
>                 popa                            #  and initialize
>                 popl %es                        # Initialize
>                 popl %ds                        #  user
> @@ -846,7 +842,7 @@ init.9:             push $0x0                       #=
  general
>                 iret                            # To user mode
>  ....
>
> -.[.filename]#sys/boot/i386/btx/btx/btx.S# [[btx-end]]
> +.[.filename]#stand/i386/btx/btx/btx.S# [[btx-end]]
>  Note that the client's environment includes a stack segment selector and stack pointer (registers `%ss` and `%esp`). Indeed, once the TR is loaded with the appropriate TSS segment selector (instruction `ltr`), the stack pointer is calculated and pushed onto the stack along with the stack's segment selector. Next, the value `0x202` is pushed onto the stack; it is the value that the EFLAGS will get when control is passed to the client. Also, the User Mode code segment selector and the client's entry point are pushed. Recall that this entry point is patched in the BTX header at link time. Finally, segment selectors (stored in register `%ecx`) for the segment registers `%gs, %fs, %ds and %es` are pushed onto the stack, along with the value at `%edx` (`0xa000`). Keep in mind the various values that have been pushed onto the stack (they will be popped out shortly). Next, values for the remaining general purpose registers are also pushed onto the stack (note the `loop` that pushes the value `0` seven times). Now, values start to be popped off the stack. First, the `popa` instruction pops off the stack the latest seven values pushed. They are stored in the general purpose registers in order `%edi, %esi, %ebp, %ebx, %edx, %ecx, %eax`. Then, the various segment selectors pushed are popped into the various segment registers. Five values still remain on the stack. They are popped when the `iret` instruction is executed. This instruction first pops the value that was pushed from the BTX header. This value is a pointer to [.filename]#boot2#'s entry point. It is placed in the register `%eip`, the instruction pointer register. Next, the segment selector for the User Code Segment is popped and copied to register `%cs`. Remember that this segment's privilege level is 3, the least privileged level. This means that we must provide values for the stack of this privilege level. This is why the processor, besides further popping the value for the EFLAGS register, does two more pops out of the stack. These values go to the stack pointer (`%esp`) and the stack segment (`%ss`). Now, execution continues at [.filename]#boot2#'s entry point.
>
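
The five values consumed by `iret` can be modelled with a plain list used as a stack. This is only a sketch of the pop order for a return to a less privileged level; the selector values, the entry point and the stack pointer below are hypothetical placeholders, not constants taken from btx.S.

```python
def build_frame(entry, user_cs, user_esp, user_ss, eflags=0x202):
    """Push the values in the order the server does; the last push is on top."""
    stack = []
    for value in (user_ss, user_esp, eflags, user_cs, entry):
        stack.append(value)
    return stack

def iret(stack):
    """Pop EIP, CS and EFLAGS, then ESP and SS because the privilege level changes."""
    registers = {}
    for name in ("eip", "cs", "eflags", "esp", "ss"):
        registers[name] = stack.pop()
    return registers

# Hypothetical values chosen for illustration only.
frame = build_frame(entry=0x2000, user_cs=0x2B, user_esp=0x9F000, user_ss=0x33)
state = iret(frame)
print({name: hex(value) for name, value in state.items()})
```
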
>  It is important to note how the User Code Segment is defined. This segment's _base address_ is set to `0xa000`. This means that code memory addresses are _relative_ to address `0xa000`; if code being executed is fetched from address `0x2000`, the _actual_ memory addressed is `0xa000+0x2000=0xc000`.
> @@ -886,9 +882,9 @@ struct bootinfo {
>
>  [.programlisting]
>  ....
> -sys/boot/i386/boot2/boot2.c:
> +stand/i386/boot2/boot2.c:
>      __exec((caddr_t)addr, RB_BOOTINFO | (opts & RBX_MASK),
> -          MAKEBOOTDEV(dev_maj[dsk.type], 0, dsk.slice, dsk.unit, dsk.part),
> +          MAKEBOOTDEV(dev_maj[dsk.type], dsk.slice, dsk.unit, dsk.part),
>            0, 0, 0, VTOP(&bootinfo));
>  ....
>
> @@ -901,21 +897,21 @@ The main task for the loader is to boot the kernel.=
 When the kernel is loaded in
>
>  [.programlisting]
>  ....
> -sys/boot/common/boot.c:
> +stand/common/boot.c:
>      /* Call the exec handler from the loader matching the kernel */
> -    module_formats[km->m_loader]->l_exec(km);
> +    file_formats[fp->f_loader]->l_exec(fp);
>  ....
>
>  [[boot-kernel]]
>  == Kernel Initialization
>
> -Let us take a look at the command that links the kernel. This will help =
identify the exact location where the loader passes execution to the kernel=
. This location is the kernel's actual entry point.
> +Let us take a look at the command that links the kernel. This will help identify the exact location where the loader passes execution to the kernel. This location is the kernel's actual entry point. This command is no longer present in [.filename]#sys/conf/Makefile.i386#. The content that interests us can be found in [.filename]#/usr/obj/usr/src/i386.i386/sys/GENERIC/#.
>
>  [.programlisting]
>  ....
> -sys/conf/Makefile.i386:
> -ld -elf -Bdynamic -T /usr/src/sys/conf/ldscript.i386  -export-dynamic \
> --dynamic-linker /red/herring -o kernel -X locore.o \
> +/usr/obj/usr/src/i386.i386/sys/GENERIC/kernel.meta:
> +ld -m elf_i386_fbsd -Bdynamic -T /usr/src/sys/conf/ldscript.i386 --build-id=sha1 --no-warn-mismatch \
> +--warn-common --export-dynamic  --dynamic-linker /red/herring -X -o kernel locore.o
>  <lots of kernel .o files>
>  ....
>
> @@ -959,7 +955,7 @@ sys/i386/i386/locore.s:
>         mov     %ax, %gs
>  ....
>
> -btext calls the routines `recover_bootinfo()`, `identify_cpu()`, `create=
_pagetables()`, which are also defined in [.filename]#locore.s#. Here is a =
description of what they do:
> +btext calls the routines `recover_bootinfo()` and `identify_cpu()`, which are also defined in [.filename]#locore.s#. Here is a description of what they do:
>
>  [.informaltable]
>  [cols="1,1", frame="none"]
> @@ -969,29 +965,27 @@ btext calls the routines `recover_bootinfo()`, `ide=
ntify_cpu()`, `create_pagetab
>  |This routine parses the parameters to the kernel passed from the bootst=
rap. The kernel may have been booted in 3 ways: by the loader, described ab=
ove, by the old disk boot blocks, or by the old diskless boot procedure. Th=
is function determines the booting method, and stores the `struct bootinfo`=
 structure into the kernel memory.
>
>  |`identify_cpu`
> -|This functions tries to find out what CPU it is running on, storing the=
 value found in a variable `_cpu`.
> -
> -|`create_pagetables`
> -|This function allocates and fills out a Page Table Directory at the top=
 of the kernel memory area.
> +|This function tries to find out what CPU it is running on, storing the =
value found in a variable `_cpu`.
>  |===
>
>  The next step is to enable VME, if the CPU supports it:
>
>  [.programlisting]
>  ....
> -       testl   $CPUID_VME, R(_cpu_feature)
> -       jz      1f
> -       movl    %cr4, %eax
> -       orl     $CR4_VME, %eax
> -       movl    %eax, %cr4
> +sys/i386/i386/mpboot.s:
> +       testl   $CPUID_VME,%edx
> +       jz      3f
> +       orl     $CR4_VME,%eax
> +3:     movl    %eax,%cr4
>  ....
>
>  Then, enabling paging:
>
>  [.programlisting]
>  ....
> +sys/i386/i386/mpboot.s:
>  /* Now enable paging */
> -       movl    R(_IdlePTD), %eax
> +       movl    IdlePTD_nopae, %eax
>         movl    %eax,%cr3                       /* load ptd addr into mmu */
>         movl    %cr0,%eax                       /* get control word */
>         orl     $CR0_PE|CR0_PG,%eax             /* enable paging */
> @@ -1002,11 +996,12 @@ The next three lines of code are because the paging was set, so the jump is need
>
>  [.programlisting]
>  ....
> -       pushl   $begin                          /* jump to high virtualized address */
> +sys/i386/i386/mpboot.s:
> +       pushl   $mp_begin                               /* jump to high mem */
>         ret
>
>  /* now running relocated at KERNBASE where the system is linked to run */
> -begin:
> +mp_begin:      /* now running relocated at KERNBASE */
>  ....
>
>  The function `init386()` is called with a pointer to the first free physical page, after that `mi_startup()`. `init386` is an architecture dependent initialization function, and `mi_startup()` is an architecture independent one (the 'mi_' prefix stands for Machine Independent). The kernel never returns from `mi_startup()`, and by calling it, the kernel finishes booting:
> @@ -1014,11 +1009,12 @@ The function `init386()` is called with a pointer to the first free physical pag
>  [.programlisting]
>  ....
>  sys/i386/i386/locore.s:
> -       movl    physfree, %esi
> -       pushl   %esi                            /* value of first for init386(first) */
> -       call    _init386                        /* wire 386 chip for unix operation */
> -       call    _mi_startup                     /* autoconfiguration, mountroot etc */
> -       hlt             /* never returns to here */
> +       pushl   physfree                        /* value of first for init386(first) */
> +       call    init386                         /* wire 386 chip for unix operation */
> +       addl    $4,%esp
> +       movl    %eax,%esp                       /* Switch to true top of stack. */
> +       call    mi_startup                      /* autoconfiguration, mountroot etc */
> +       /* NOTREACHED */
>  ....
>
>  === `init386()`
> @@ -1032,15 +1028,13 @@ sys/i386/i386/locore.s:
>  * Initialize the DDB, if it is compiled into kernel.
>  * Initialize the TSS.
>  * Prepare the LDT.
> -* Set up proc0's pcb.
> +* Set up thread0's pcb.
>
>  `init386()` initializes the tunable parameters passed from bootstrap by setting the environment pointer (envp) and calling `init_param1()`. The envp pointer has been passed from loader in the `bootinfo` structure:
>
>  [.programlisting]
>  ....
>  sys/i386/i386/machdep.c:
> -               kern_envp = (caddr_t)bootinfo.bi_envp + KERNBASE;
> -
>         /* Init basic tunables, hz etc */
>         init_param1();
>  ....
> @@ -1050,8 +1044,10 @@ sys/i386/i386/machdep.c:
>  [.programlisting]
>  ....
>  sys/kern/subr_param.c:
> -       hz = HZ;
> +       hz = -1;
>         TUNABLE_INT_FETCH("kern.hz", &hz);
> +       if (hz == -1)
> +               hz = vm_guest > VM_GUEST_NO ? HZ_VM : HZ;
>  ....
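
For anyone reading along, the new fallback in the hunk above can be sketched in userland C roughly as follows. This is only an illustration under stated assumptions: `pick_hz()` and its parameters are hypothetical stand-ins (the kernel reads `kern.hz` from its own environment via `TUNABLE_INT_FETCH`, not from a string argument), and the two default values are placeholders, since the real `HZ` and `HZ_VM` depend on the platform and kernel configuration.

```c
#include <stdlib.h>

/* Placeholder defaults; the real HZ and HZ_VM are platform dependent. */
#define SKETCH_HZ	1000
#define SKETCH_HZ_VM	100

/*
 * Hypothetical helper: "tunable" stands in for the value of kern.hz in the
 * kernel environment (NULL when unset), "is_guest" for vm_guest > VM_GUEST_NO.
 */
static int
pick_hz(const char *tunable, int is_guest)
{
	int hz = -1;			/* -1 means "not set by the user" */

	if (tunable != NULL) {
		char *end;
		long v = strtol(tunable, &end, 0);

		/* Accept the value only if the whole string parsed. */
		if (end != tunable && *end == '\0')
			hz = (int)v;
	}
	if (hz == -1)			/* fall back to a default */
		hz = is_guest ? SKETCH_HZ_VM : SKETCH_HZ;
	return (hz);
}
```

The point of starting from -1 instead of `HZ` is that the code can tell "no tunable given" apart from an explicit setting, and only then choose between the bare-metal and the virtual-machine default.
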
>
>  TUNABLE_<typename>_FETCH is used to fetch the value from the environment:
> @@ -1069,30 +1065,36 @@ Then `init386()` prepares the Global Descriptors Table (GDT). Every task on an x
>  [.programlisting]
>  ....
>  sys/i386/i386/machdep.c:
> -union descriptor gdt[NGDT * MAXCPU];   /* global descriptor table */
> +union descriptor gdt0[NGDT];   /* initial global descriptor table */
> +union descriptor *gdt = gdt0;  /* global descriptor table */
>
> -sys/i386/include/segments.h:
> +sys/x86/include/segments.h:
>  /*
>   * Entries in the Global Descriptor Table (GDT)
>   */
>  #define        GNULL_SEL       0       /* Null Descriptor */
> -#define        GCODE_SEL       1       /* Kernel Code Descriptor */
> -#define        GDATA_SEL       2       /* Kernel Data Descriptor */
> -#define        GPRIV_SEL       3       /* SMP Per-Processor Private Data */
> -#define        GPROC0_SEL      4       /* Task state process slot zero and up */
> -#define        GLDT_SEL        5       /* LDT - eventually one per process */
> -#define        GUSERLDT_SEL    6       /* User LDT */
> -#define        GTGATE_SEL      7       /* Process task switch gate */
> +#define        GPRIV_SEL       1       /* SMP Per-Processor Private Data */
> +#define        GUFS_SEL        2       /* User %fs Descriptor (order critical: 1) */
> +#define        GUGS_SEL        3       /* User %gs Descriptor (order critical: 2) */
> +#define        GCODE_SEL       4       /* Kernel Code Descriptor (order critical: 1) */
> +#define        GDATA_SEL       5       /* Kernel Data Descriptor (order critical: 2) */
> +#define        GUCODE_SEL      6       /* User Code Descriptor (order critical: 3) */
> +#define        GUDATA_SEL      7       /* User Data Descriptor (order critical: 4) */
>  #define        GBIOSLOWMEM_SEL 8       /* BIOS low memory access (must be entry 8) */
> -#define        GPANIC_SEL      9       /* Task state to consider panic f=
rom */
> -#define GBIOSCODE32_SEL        10      /* BIOS interface (32bit Code) */
> -#define GBIOSCODE16_SEL        11      /* BIOS interface (16bit Code) */
> -#define GBIOSDATA_SEL  12      /* BIOS interface (Data) */
> -#define GBIOSUTIL_SEL  13      /* BIOS interface (Utility) */
> -#define GBIOSARGS_SEL  14      /* BIOS interface (Arguments) */
> +#define        GPROC0_SEL      9       /* Task state process slot zero and up */
> +#define        GLDT_SEL        10      /* Default User LDT */
> +#define        GUSERLDT_SEL    11      /* User LDT */
> +#define        GPANIC_SEL      12      /* Task state to consider panic f=
rom */
> +#define        GBIOSCODE32_SEL 13      /* BIOS interface (32bit Code) */
> +#define        GBIOSCODE16_SEL 14      /* BIOS interface (16bit Code) */
> +#define        GBIOSDATA_SEL   15      /* BIOS interface (Data) */
> +#define        GBIOSUTIL_SEL   16      /* BIOS interface (Utility) */
> +#define        GBIOSARGS_SEL   17      /* BIOS interface (Arguments) */
> +#define        GNDIS_SEL       18      /* For the NDIS layer */
> +#define        NGDT            19
>  ....
>
> -Note that those #defines are not selectors themselves, but just a field INDEX of a selector, so they are exactly the indices of the GDT. for example, an actual selector for the kernel code (GCODE_SEL) has the value 0x08.
> +Note that those #defines are not selectors themselves, but just a field INDEX of a selector, so they are exactly the indices of the GDT. for example, an actual selector for the kernel code (GCODE_SEL) has the value 0x20.
>
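
A small sketch of why the value moves from 0x08 to 0x20 when GCODE_SEL goes from index 1 to index 4: on x86 a segment selector keeps the table index in bits 3 and up, the table indicator (0 for the GDT) in bit 2, and the requested privilege level in bits 0-1. The helper name `make_gdt_selector()` is made up for this note; the two privilege constants are defined locally so the snippet stands alone.

```c
#include <assert.h>
#include <stdint.h>

#define SEL_KPL		0	/* kernel privilege level */
#define SEL_UPL		3	/* user privilege level */
#define GCODE_SEL	4	/* kernel code index, as in the updated listing */

/*
 * Hypothetical helper: build a GDT selector from a table index and a
 * requested privilege level. Bit 2 (table indicator) stays 0 for the GDT.
 */
static inline uint16_t
make_gdt_selector(unsigned int index, unsigned int rpl)
{
	assert(index < 8192 && rpl <= 3);	/* 13-bit index, 2-bit RPL */
	return ((uint16_t)((index << 3) | rpl));
}
```

With these definitions, `make_gdt_selector(GCODE_SEL, SEL_KPL)` is 4 << 3 = 0x20, while the old index 1 gave 1 << 3 = 0x08, which matches both versions of the sentence in the hunk.
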
>  The next step is to initialize the Interrupt Descriptor Table (IDT). This table is referenced by the processor when a software or hardware interrupt occurs. For example, to make a system call, user application issues the `INT 0x80` instruction. This is a software interrupt, so the processor's hardware looks up a record with index 0x80 in the IDT. This record points to the routine that handles this interrupt, in this particular case, this will be the kernel's syscall gate. The IDT may have a maximum of 256 (0x100) records. The kernel allocates NIDT records for the IDT, where NIDT is the maximum (256):
>
> @@ -1108,8 +1110,8 @@ For each interrupt, an appropriate handler is set. The syscall gate for `INT 0x8
>  [.programlisting]
>  ....
>  sys/i386/i386/machdep.c:
> -       setidt(0x80, &IDTVEC(int0x80_syscall),
> -                       SDT_SYS386TGT, SEL_UPL, GSEL(GCODE_SEL, SEL_KPL));
> +       setidt(IDT_SYSCALL, &IDTVEC(int0x80_syscall),
> +                       SDT_SYS386IGT, SEL_UPL, GSEL(GCODE_SEL, SEL_KPL));
>  ....
>
>  So when a userland application issues the `INT 0x80` instruction, control will transfer to the function `_Xint0x80_syscall`, which is in the kernel code segment and will be executed with supervisor privileges.
> @@ -1121,10 +1123,10 @@ Console and DDB are then initialized:
>  sys/i386/i386/machdep.c:
>         cninit();
>  /* skipped */
> -#ifdef DDB
> -       kdb_init();
> +  kdb_init();
> +#ifdef KDB
>         if (boothowto & RB_KDB)
> -               Debugger("Boot flags requested debugger");
> +               kdb_enter(KDB_WHY_BOOTFLAGS, "Boot flags requested debugger");
>  #endif
>  ....
>
> @@ -1134,25 +1136,27 @@ The Local Descriptors Table is used to reference userland code and data. Several
>
>  [.programlisting]
>  ....
> -/usr/include/machine/segments.h:
> +sys/x86/include/segments.h:
>  #define        LSYS5CALLS_SEL  0       /* forced by intel BCS */
>  #define        LSYS5SIGR_SEL   1
> -#define        L43BSDCALLS_SEL 2       /* notyet */
>  #define        LUCODE_SEL      3
> -#define        LSOL26CALLS_SEL 4       /* Solaris >= 2.6 system call gate */
>  #define        LUDATA_SEL      5
> -/* separate stack, es,fs,gs sels ? */
> -/* #define     LPOSIXCALLS_SEL 5*/     /* notyet */
> -#define LBSDICALLS_SEL 16      /* BSDI system call gate */
> -#define NLDT           (LBSDICALLS_SEL + 1)
> +#define        NLDT            (LUDATA_SEL + 1)
>  ....
>
> -Next, proc0's Process Control Block (`struct pcb`) structure is initialized. proc0 is a `struct proc` structure that describes a kernel process. It is always present while the kernel is running, therefore it is declared as global:
> +Next, proc0's Process Control Block (`struct pcb`) structure is initialized. proc0 is a `struct proc` structure that describes a kernel process. It is always present while the kernel is running, therefore it is linked with thread0:
>
>  [.programlisting]
>  ....
> -sys/kern/kern_init.c:
> -    struct     proc proc0;
> +sys/i386/i386/machdep.c:
> +register_t
> +init386(int first)
> +{
> +    /* ... skipped ... */
> +
> +    proc_linkup0(&proc0, &thread0);
> +    /* ... skipped ... */
> +}
>  ....
>
>  The structure `struct pcb` is a part of a proc structure. It is defined in [.filename]#/usr/include/machine/pcb.h# and has a process's information specific to the i386 architecture, such as registers values.
> @@ -1164,7 +1168,7 @@ This function performs a bubble sort of all the system initialization objects an
>  [.programlisting]
>  ....
>  sys/kern/init_main.c:
> -       for (sipp = sysinit; *sipp; sipp++) {
> +       for (sipp = sysinit; sipp < sysinit_end; sipp++) {
>
>                 /* ... skipped ... */
>
> @@ -1186,10 +1190,11 @@ print_caddr_t(void *data __unused)
>  {
>         printf("%s", (char *)data);
>  }
> -SYSINIT(announce, SI_SUB_COPYRIGHT, SI_ORDER_FIRST, print_caddr_t, copyright)
> +/* ... skipped ... */
> +SYSINIT(announce, SI_SUB_COPYRIGHT, SI_ORDER_FIRST, print_caddr_t, copyright);
>  ....
>
> -The subsystem ID for this object is SI_SUB_COPYRIGHT (0x0800001), which comes right after the SI_SUB_CONSOLE (0x0800000). So, the copyright message will be printed out first, just after the console initialization.
> +The subsystem ID for this object is SI_SUB_COPYRIGHT (0x0800001). So, the copyright message will be printed out first, just after the console initialization.
>
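
To make the ordering argument concrete, here is a rough userland sketch of how entries could be ordered by subsystem ID and then by order before being called in sequence. Everything in it is hypothetical (`struct sysinit_sketch`, `run_sysinits()`, `log_tag()`), and it uses `qsort()` for brevity, whereas the chapter describes the kernel doing a bubble sort over its linker-set array; treat it as an illustration, not the kernel code.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct sysinit. */
struct sysinit_sketch {
	unsigned int	subsystem;	/* coarse ordering key */
	unsigned int	order;		/* tie-breaker within a subsystem */
	void		(*func)(const void *);
	const void	*udata;
};

static char sketch_log[64];		/* records the call order */

/* Sample init function: append the tag passed as udata to the log. */
static void
log_tag(const void *udata)
{
	strncat(sketch_log, (const char *)udata,
	    sizeof(sketch_log) - strlen(sketch_log) - 1);
}

/* Compare by subsystem first, then by order. */
static int
sysinit_cmp(const void *a, const void *b)
{
	const struct sysinit_sketch *x = a, *y = b;

	if (x->subsystem != y->subsystem)
		return (x->subsystem < y->subsystem ? -1 : 1);
	if (x->order != y->order)
		return (x->order < y->order ? -1 : 1);
	return (0);
}

/* Sort the set, then call every entry in order, bounded by an end pointer. */
static void
run_sysinits(struct sysinit_sketch *set, size_t n)
{
	qsort(set, n, sizeof(*set), sysinit_cmp);
	for (struct sysinit_sketch *sipp = set; sipp < set + n; sipp++)
		sipp->func(sipp->udata);
}
```

An entry with a larger subsystem ID always runs after every entry with a smaller one, regardless of the order in which the entries were declared, which is the property the sentence about SI_SUB_COPYRIGHT relies on.
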
>  Let us take a look at what exactly the macro `SYSINIT()` does. It expands to a `C_SYSINIT()` macro. The `C_SYSINIT()` macro then expands to a static `struct sysinit` structure declaration with another `DATA_SET` macro call:
>
> @@ -1198,91 +1203,62 @@ Let us take a look at what exactly the macro `SYSINIT()` does. It expands to a `
>  /usr/include/sys/kernel.h:
>        #define C_SYSINIT(uniquifier, subsystem, order, func, ident) \
>        static struct sysinit uniquifier ## _sys_init = { \ subsystem, \
> -      order, \ func, \ ident \ }; \ DATA_SET(sysinit_set,uniquifier ##
> +      order, \ func, \ (ident) \ }; \ DATA_WSET(sysinit_set,uniquifier ##
>        _sys_init);
>
>  #define        SYSINIT(uniquifier, subsystem, order, func, ident)      \
>         C_SYSINIT(uniquifier, subsystem, order,                 \
> -       (sysinit_cfunc_t)(sysinit_nfunc_t)func, (void *)ident)
> +       (sysinit_cfunc_t)(sysinit_nfunc_t)func, (void *)(ident))
>  ....
>
> -The `DATA_SET()` macro expands to a `MAKE_SET()`, and that macro is the point where all the sysinit magic is hidden:
> +The `DATA_SET()` macro expands to a `_MAKE_SET()`, and that macro is the point where all the sysinit magic is hidden:
>
>  [.programlisting]
>  ....
>  /usr/include/linker_set.h:
> -#define MAKE_SET(set, sym)                                             \
> -       static void const * const __set_##set##_sym_##sym = sym;        \
> -       __asm(".section .set." #set ",\"aw\"");                         \
> -       __asm(".long " #sym);                                           \
> -       __asm(".previous")
> -#endif
> -#define TEXT_SET(set, sym) MAKE_SET(set, sym)
> -#define DATA_SET(set, sym) MAKE_SET(set, sym)
> +#define TEXT_SET(set, sym) _MAKE_SET(set, sym)
> +#define DATA_SET(set, sym) _MAKE_SET(set, sym)
>  ....
>
> -In our case, the following declaration will occur:
> -
> -[.programlisting]
> -....
> -static struct sysinit announce_sys_init = {
> -       SI_SUB_COPYRIGHT,
> -       SI_ORDER_FIRST,
> -       (sysinit_cfunc_t)(sysinit_nfunc_t)  print_caddr_t,
> -       (void *) copyright
> -};
> -
> -static void const *const __set_sysinit_set_sym_announce_sys_init =
> -    announce_sys_init;
> -__asm(".section .set.sysinit_set" ",\"aw\"");
> -__asm(".long " "announce_sys_init");
> -__asm(".previous");
> -....
> -
> -The first `__asm` instruction will create an ELF section within the kernel's executable. This will happen at kernel link time. The section will have the name `.set.sysinit_set`. The content of this section is one 32-bit value, the address of announce_sys_init structure, and that is what the second `__asm` is. The third `__asm` instruction marks the end of a section. If a directive with the same section name occurred before, the content, i.e., the 32-bit value, will be appended to the existing section, so forming an array of 32-bit pointers.
> -
> +After executing these macros, various sections were made in the kernel, including`set.sysinit_set`.
>  Running objdump on a kernel binary, you may notice the presence of such small sections:
>
>  [source,bash]
>  ....
> -% objdump -h /kernel
> -  7 .set.cons_set 00000014  c03164c0  c03164c0  002154c0  2**2
> -                  CONTENTS, ALLOC, LOAD, DATA
> -  8 .set.kbddriver_set 00000010  c03164d4  c03164d4  002154d4  2**2
> -                  CONTENTS, ALLOC, LOAD, DATA
> -  9 .set.scrndr_set 00000024  c03164e4  c03164e4  002154e4  2**2
> -                  CONTENTS, ALLOC, LOAD, DATA
> - 10 .set.scterm_set 0000000c  c0316508  c0316508  00215508  2**2
> -                  CONTENTS, ALLOC, LOAD, DATA
> - 11 .set.sysctl_set 0000097c  c0316514  c0316514  00215514  2**2
> -                  CONTENTS, ALLOC, LOAD, DATA
> - 12 .set.sysinit_set 00000664  c0316e90  c0316e90  00215e90  2**2
> -                  CONTENTS, ALLOC, LOAD, DATA
> +% llvm-objdump -h /kernel
> +Sections:
> +Idx Name                               Size     VMA      Type
> *** 126 LINES SKIPPED ***

Thanks for this upgrade!!!


