Welcome neighbors. In this blog I will be publishing notes I have taken on UEFI, BIOS, bootloading, ELF, and other technical topics that interest me and seem to lack documentation or explanation. I will also be keeping a list of UEFI, bootloading, and other resources I have found useful on my resources page.
The rest of this post will be a whirlwind tour of bootloading and thus fairly introductory, so if you are already familiar with the world of bootloaders you might as well move on and read something else (although I would like to encourage you to look at the section where I propose new general bootloader terminology). In case you want to stick around for the full blog post, I will be discussing:
- my motivations behind studying bootloading
- bootloader terminology
- how to navigate the plethora of bootloader implementations
- specifications that relate to bootloading
- BIOS, UEFI, and how they came to be
- how to write a simple boot sector
Why study bootloading?
Bootloaders provide a window into how hardware and systems operate. They reveal dependencies between hardware and software components as well as complexities that are masked by drivers. The lowest-level bootloaders provide a structured way of studying hardware because they systematically initialize the most important hardware components (memory, input/output, etc). For all of these reasons, part of my motivation for researching bootloading is to better understand hardware.
The other half of my motivation is that I am a security researcher and I know that if we cannot trust the bootloader or hardware, we cannot trust the system. Bootloaders contain the first code that runs on a system and are implicitly trusted by all software that acquires control from a bootloader (either directly or indirectly). I am especially interested in the data that drive the bootloading process since I have already discovered how well-formed ELF metadata can act as instructions to an unintentionally Turing-complete ld.so dynamic loader.
The grand overview of booting
Before I began to research bootloading, that which happened between the moment I hit my machine’s power button and the moment when Linux was loaded seemed magical. I hope to help demystify some of that magic in this blog. The exact form of the magic invoked by the power button is highly dependent on the system’s architecture, processor, chipset, and peripheral hardware. In this blog post I will mostly discuss bootloading in general and not focus on a single architecture. I will also not specifically discuss architectures with multiple processors but much of what is said here is applicable to such settings.
The many stages of bootloading
Bootloaders work hand-in-hand with their hardware to configure it to execute more powerful and feature-rich software. The architecture, processor type, and hardware configuration all have a large influence over what a bootloader needs to accomplish and which bootloader flavors can be used. Most bootloading happens in stages where each stage is a package of code and data that is loaded and executed by the prior stage until the target payload (typically the kernel/operating system) has been loaded and invoked. This sequential combination of multiple bootloader stages is often referred to as a bootloading chain. Most systems implement multiple bootloading stages due to varying degrees of space and addressing constraints, flexibility requirements that exist throughout the boot process, and the need to incorporate low-level software from multiple sources/groups of developers. Early boot stages typically initialize the hardware just enough to locate and load a larger and more powerful stage (and perhaps to allow for debugging), eventually loading other types of stages until the target kernel (OS) and/or application (on some embedded systems) is located and loaded.
The bootloading process is kicked off when the system’s processor receives a reset signal generated (for example) when it first receives power. Different processors handle this signal in different ways, but they all ultimately end up executing instructions that are either in ROM/firmware that is embedded in the same chip as the processor or some firmware external to the processor that is configured to be accessible from a physical address (as defined in the processor’s specifications).
On Bootloader Terminology
UEFI, BIOS, coreboot, DAS U-Boot, firmware, barebox, GRUB, LILO, stage 1 bootloaders, secondary bootloaders: it can be easy to get lost in the world of bootloaders when you are simply trying to get a sense of the space. Wikipedia’s table comparing bootloaders is a good place to start but its tables of bootloaders and features are not exhaustive. It also doesn’t help that there is no standard language to describe bootloading stages – everyone uses their own terminology and branding that sound similar but have discrepancies in meaning. This can be disorienting to those who are new to the space.
In my attempt to standardize and use consistent terminology I will refer to the first set of instructions the primary processor executes as the kickoff stage. After the kickoff stage there may be multiple intermediary stages, followed by a penultimate stage which loads the system’s target. For example, consider how I labeled a sample Linux boot chain below that uses the GRUB bootloader for its intermediary and penultimate stages.
Navigating the sea of bootloaders
Many bootloaders are legacy BIOS or UEFI-based but different architectures such as PowerPC and SPARC have their own bootloading traditions. Embedded devices have their own limitations, requirements, end goals, and bootloader implementations. Whatever is first executed at startup may be fixed in ROM or in some re-writable non-volatile medium. In modern PCs we typically find this initial bootloader to be BIOS-style or UEFI-compliant. Embedded systems may have ROM or re-programmable firmware-based initial bootloaders. There is no reason that bootloaders in embedded devices cannot be UEFI-compliant, but it is rare to find UEFI-compliant bootloaders in embedded devices.1
Some bootloaders are kickoff bootloaders, some transition from a resource-constrained/hardware-specific previous stage to a penultimate bootloader stage, and others exist simply to make it easier for an end-user to configure the bootloading process. Some assume that they will be loaded from the first sector of the disk drive by a previous stage in the bootloading process (such as MBR/boot sector-based bootloaders like GRUB), and others assume that they will be loaded by an EFI-based previous stage.
How do we navigate this sea of bootloaders? One way to do so is to think of the chain of bootloading stages as a sequence of adapters. Each link in the bootloading chain has its own expectations of what system resources it can use, how it can find the arguments passed to it, what arguments it expects, where it can find the next stage, and how it expects this next stage to be formatted. Whether or not two arbitrary bootloaders can be linked together consecutively greatly depends on these expectations. To understand and evaluate a given bootloader we should look to answer the following questions:
- What architectures does the bootloader run on (ARM? x86?)
- Where can this bootloader reside? (On disk? In firmware? In some other form of non-volatile memory? On a remote disk?)
- How large is this bootloader once compiled into a binary format?
- Where can this bootloader be located? (The first sector of a disk? In a UEFI System Partition? Some other previously known position on a disk? In a UEFI Firmware Volume/Firmware File System? At some known position in firmware? At some known address in memory? Via the network?)
- How is this bootloader packaged? (UEFI File? UEFI Firmware Volume? As an MBR? PE? ELF? With an OMAP 35xx Configuration Header?)
- What arguments does this bootloader expect and where can it locate them?
- What devices can this bootloader load its next stage from? (Disk? USB device? Network? Non-volatile memory?)
- What formats can this bootloader parse as it locates the next stage? (FAT? GPT? UEFI Firmware Volume/Firmware File System?)
- How can the next stage be packaged? (ELF? UEFI Firmware Volume? PE? TE? OMAP 35xx Configuration Header? Encryption? Compression? With a signature?)
- How does it pass arguments to the next stage?
Once we answer all these questions about a given bootloader we will have a clearer picture of how it fits in the whole bootloading chain of a system. These questions illuminate the adapter-like qualities of a given stage and how the stages link together; however, they do not describe other qualities of a bootloading stage or the ways it transforms the system’s state. Nevertheless, sometimes a bootloading stage does nothing more than act as an adapter.
For example, the BeagleBoard UEFI implementation’s SEC stage does little besides loading the DXE stage from the UEFI firmware image because the x-loader boot stage that executes immediately before the UEFI SEC stage performs most platform initialization.
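The adapter view can be sketched as a small data structure. This is a rough illustration only: the `StageProfile` class, its field names, and the string labels are all made up for this example, and a real compatibility check would have to cover every question in the list above (arguments, system state, size limits, and so on).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageProfile:
    """Hypothetical summary of what one bootloading stage expects and offers."""
    name: str
    packaged_as: str                       # how this stage's own image is packaged
    stored_at: str                         # where this stage's image lives
    loads_formats: frozenset = frozenset() # packaging it can parse in the next stage
    loads_from: frozenset = frozenset()    # places it can search for the next stage


def can_chain(prev: StageProfile, nxt: StageProfile) -> bool:
    """True if `prev` can both find and parse the image of `nxt`."""
    return nxt.packaged_as in prev.loads_formats and nxt.stored_at in prev.loads_from


# Illustrative labels only; they are not taken from any specification.
legacy_bios = StageProfile(
    name="legacy BIOS firmware",
    packaged_as="firmware image",
    stored_at="flash chip",
    loads_formats=frozenset({"boot sector"}),
    loads_from=frozenset({"first sector of disk"}),
)
boot_sector = StageProfile(
    name="boot sector bootloader",
    packaged_as="boot sector",
    stored_at="first sector of disk",
)
efi_application = StageProfile(
    name="EFI-loaded bootloader",
    packaged_as="PE",
    stored_at="UEFI System Partition",
)

print(can_chain(legacy_bios, boot_sector))      # the two stages' expectations line up
print(can_chain(legacy_bios, efi_application))  # they do not line up
```

Two stages link only when the expectations on both sides of the boundary match, which is the sense in which each stage acts as an adapter.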
Example bootloading chains
The BeagleBone (more specifically the BeagleBone Rev A6b), which is different from the BeagleBoard described earlier, is a development board that uses an AM3358 processor. The AM3358 contains on-chip boot ROM (which I refer to as a “kickoff stage” and TI refers to as a “Primary Program Loader” (PPL)) that by default performs some simple tests to determine where the next bootloading stage lives (which I will call the “intermediary stage” and TI/u-boot calls the “Secondary Program Loader” (SPL)). BeagleBones are configured so the stage loaded by the kickoff stage lives on an MMC/SD card. This card must be formatted with a FAT file system so that the kickoff stage can find the intermediary stage in a file named MLO located in the root directory, load it at an address specified by the MLO’s header, and execute it as the next stage. By default, the BeagleBone comes with a u-boot bootloader that is split into two stages: one called the u-boot Secondary Program Loader (the MLO file) and a second stage that is just called u-boot. This second stage is larger and more feature-rich than the first stage (the MLO file) and acts as the penultimate stage that loads the target Linux kernel.
Meanwhile, over in the land of the standard x86 PC, when the machine is first powered on the CPU begins to execute whatever code happens to be sitting 16 bytes from the end of its address space. The kickoff bootloader is typically located on a chip in non-volatile memory (such as in SPI Flash). Due to various constraints, the size of this primary bootloader is tiny and it is up to this small piece of code to set up interrupt vectors, memory, and hardware. This system’s kickoff boot firmware also enumerates devices that could contain the next boot stage and eventually loads and executes an image (either an intermediary or penultimate stage) that drives the subsequent stage (intermediary, penultimate, or target) of the bootloading process. These layers/chains/stages of bootloaders allow for flexibility and customization of the bootloading process for the end user as well as generic mechanisms for initialization, configuration, and communication between the system’s hardware components.
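To make “16 bytes from the end of its address space” concrete, here is a small arithmetic sketch. The function name is made up, and the address width is a parameter because it differs between processor generations; the two widths shown are the 32-bit case and the 20-bit case of the original 8086.

```python
def reset_vector(address_bits: int) -> int:
    """Return the address 16 bytes below the top of an address space of this width."""
    return (1 << address_bits) - 16


print(hex(reset_vector(32)))  # 0xfffffff0
print(hex(reset_vector(20)))  # 0xffff0
```

Sixteen bytes is not much room, so the code found there is usually little more than a jump to the rest of the firmware.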
Specifications related to bootloading
If you crack open your computer or even just inspect an embedded computing device such as a BeagleBone you will likely find hardware components made by different manufacturers. Hardware manufacturers have adopted various conventions to allow mixed and custom hardware components to work together on a single system. For example, bus standards such as PCI allow multiple devices to communicate and share hardware resources. Most bus standards also allow for more flexibility by defining a method of enumerating the devices on the bus without prior knowledge of which devices are attached.
Some standards are highly documented and public, and others… not-so-much. In general, the “standards” (in the general sense of the word) involved in the bootloading process will fit into one or more of the following categories:
- Data format (and parser) standards. Be that a file/executable format, a filesystem format, an image format, signature format, or any other data that may be passed between multiple parties during the boot process. Examples include MBR, Firmware File System, FAT, ELF, TE, SREC, SMBIOS, and chip-specific formats.
- Calling convention standards. I mean “calling convention” in a general sense: how different actors in the system pass execution flow amongst themselves and what their expectations are regarding the state of the system/parameters/return values before and after execution is passed to a different actor as well as how/where shared data can be found. Examples include parts of the BIOS Boot Specification, UEFI, and application binary interface (ABI) specifications.
- Protocol standards. How different components (often working in parallel) expect to communicate and share system resources. Examples include PCI and SPI.
There are probably some pieces of bootloading-related standards that fit in a miscellaneous category, but I will not be discussing them here.
Since a system’s bootloaders may not have hard-coded knowledge of all its peripheral components, they must use a combination of hard-coded knowledge of their environment and bus/data/calling convention standards to herd the hardware components into a more useful and powerful state so that all hardware works together as a cohesive whole during and after boot. For each separate component (including non-removable components) the system must be able to:
- Communicate with the component
- Allocate resources for the component
- Initialize the component
- (Optional) Allow other components to communicate with the component
- (Optional) Inform others about the component (such as the next bootloading stage)
BIOS and Cat Herding
Although we can all agree that BIOS stands for Basic Input/Output System, there are mixed ideas of what a BIOS really is and how to use BIOS-related terminology. Some folks consider “BIOS” to mean the firmware that drives the boot process on any system, but I find this definition misleading because I view BIOS as more of a style of system firmware found in “IBM PC-compatible” systems.
It is easier to understand what “IBM PC-compatible” and “BIOS” mean and why BIOS is so loosely defined when they are placed in their historical context. The original IBM Personal Computer (5150) was introduced in 1981. The IBM 5150 and its successors were built mostly out of hardware and software from outside vendors and were designed to allow users to add and swap hardware components. In fact, one of the few components that IBM designed specifically for their PC was the BIOS, although they published much of its source code (including system diagrams) in their documentation. IBM discouraged vendors from building software that interacted directly with hardware, instead encouraging them to interface with the BIOS, thus allowing the BIOS to act as a consistent interface layered above an evolving set of hardware.
Due to the popularity of the IBM PC, outside vendors began releasing their own IBM PC clones that included ROM that behaved like an IBM PC BIOS so that hardware and software that worked on the IBM PC would also be compatible with their clone. The original IBM PC BIOS set the stage and tone for how BIOS has evolved since the era of the IBM PC. What we consider to be legacy BIOS was never fully or centrally standardized; it was just what vendors needed to comply with in order to develop machines, hardware, and software that could exist and interact in the prominent PC ecosystem. The IBM PC BIOS was what cat herded outside vendors towards an ecosystem in which they not only co-exist but also interact.
Thus what we think of as the BIOS “standard” is really just a set of expectations that exist between the hardware and software components in a system. Hardware attached to a system with BIOS-style firmware can make specific assumptions about how it is going to be communicated with, handled, and initialized by the system’s firmware at boot; the subsequent bootloader that is invoked can make certain assumptions about how it is going to be located, loaded, and run; and the kernel and bootloaders can interact with the hardware via BIOS-defined interfaces (although modern kernels generally do not directly interact with the BIOS).
Although there is no single definitive BIOS specification, there are multiple published specifications that are part of (legacy) BIOS-style firmware.2 BIOS-related specifications do not cover the breadth of what is expected to exist in BIOS-style firmware; some BIOS expectations are not defined in a specification but instead are copied from other popular BIOS implementations. Thus in order to understand BIOS, one should not only read specifications but also read a BIOS manufacturer’s technical reference such as the PhoenixBIOS 4.0 Programmer’s Guide, one of the older references such as an IBM PC technical reference, and it might do you some good to browse through the boot record/partition table information in a DOS Technical Reference to gain insight on how BIOS-style firmware locates the next bootloading stage on disk.
I am no BIOS historian, but judging from the language used in the BIOS-related specifications and their publication dates it appears that many of the written specifications associated with BIOS are more of a reflection of how the boot process worked in IBM PC-compatible system firmware as opposed to a premeditated design to which manufacturers were expected to conform. Thus the “cat herding” – what BIOS really is is a snapshot of how structure emerged from the multiple parties involved in developing IBM PC-compatible systems and hardware.
UEFI with an Emphasis on Interface
UEFI, the Unified Extensible Firmware Interface, is a boot firmware standard developed to be the industry standard and the successor to legacy BIOS firmware. To truly understand UEFI, we must constantly remind ourselves that the purpose of UEFI is to define standard firmware-related interfaces. UEFI-compliant firmware often works with the same bus protocols as legacy BIOS firmware, but the details of these protocols and other standards are abstracted away into UEFI-defined bus drivers/interfaces. It is also important to note that UEFI-compliant firmware does not require trusted boot to be in place, so all of you who deeply believe that UEFI is pure evil, take a few breaths and remember that the true purpose of UEFI is to define standard firmware interfaces.
UEFI vs. BIOS
UEFI is often compared to BIOS because it is the successor of legacy BIOS and addresses many of the limitations inherent in legacy BIOS firmware. UEFI and BIOS are similar in that they accomplish the same major goals such as initializing hardware and allowing for flexibility in which kernel ultimately gets booted, although UEFI-compliant firmware is much more feature-rich than its plain legacy BIOS counterpart. UEFI is a firmware interface standard whereas BIOS is a firmware interface convention. UEFI specifies how different hardware and software components can expect to interact during boot and runtime, down to which function calls should be used and how various data structures should look, whereas the BIOS convention applies mostly to hardware-level interaction (how hardware components communicate via interrupts). Nevertheless, BIOS and UEFI are not mutually exclusive: system firmware can implement UEFI-compliant interfaces but still support legacy BIOS booting (with a feature called the “Compatibility Support Module”).
How to jump on the bootloader bandwagon
So you want to write your own bootloader but you don’t know where to begin? Let me suggest you write an MBR/boot sector-based bootloader. It may not be as fulfilling as writing a bootloader for an embedded system because it doesn’t sit in firmware or require you to read large chunks of hardware specifications, but it is a widely-used stage in legacy BIOS bootloading. There are many interesting boot sector toys and project seedlings such as:
- A PDF that is also bootable
- A boot sector-based Tetris game
- A boot sector written in Rust
Our toy bootloader does not have to do anything interesting, it does not even have to load additional images or stages. For sake of simplicity let us write a boot sector that also happens to be the bootloading target.
We will build a bootloader for the qemu-system-i386 virtual machine that has a legacy BIOS kickoff stage. In BIOS-style bootloading this bootloading stage is known as a boot sector because it is located in the first sector of a hard or floppy disk.
What do we need to know to get started? We know that this machine has a BIOS-based kickoff stage and we are looking to write the next stage. We need to answer the following questions:
- How should the bootloader code be packaged in non-volatile storage?
- What size restrictions are there (both in non-volatile and volatile storage)?
- Where should the bootloader image be stored?
- What state is the system in (operating mode, available memory, stack)?
- How and where is our bootloader mapped to memory by the previous stage and what is the first address executed (the entrypoint)?
- How can our bootloader locate arguments passed to it (if any)?
If we want our bootloader to be able to load and execute a kernel image, then we must also answer these questions with respect to the kernel so that we can set the environment up to match the kernel’s expectations; however, we will not be doing that here.
Since there is no authoritative BIOS standard or other primary source (that I can find) that answers all the questions above, we must resort to referencing existing boot sector code and tutorials. The OSDev Wiki page on “Rolling Your Own Bootloader” is a nice resource that answers some of these questions, and we can find most of the other answers in Pierre ANCELOT’s blog post “Bootsector in Assembly”. Armed with this information we can answer the above questions as follows:
- We don’t know exactly what image “format” is used, but we know that the nasm assembler will give us what we need when passed the “-f bin” option and source code with specific characteristics that I will describe later.
- The packaged bootloader image must be no bigger than 512 bytes. At runtime it initially cannot address memory above 1M and it has about 512KB of memory to work with.
- It must live in the first sector of a bootable disk.
- The CPU is executing in 16-bit real mode and we can make no assumptions about the stack.
- The previous stage has loaded our entire bootloader image to physical address 0x7c00 and it begins execution at address 0x7c00.
- We don’t need to locate any arguments passed in by the previous stage.
If we want to do something interesting with our bootloader such as print characters to the screen, we will find that BIOS-based bootloaders allow us to accomplish this via interrupts. You can read more about BIOS interrupts in the “BIOS Interrupt Call” Wikipedia article. In case you are not familiar with interrupts, what you should know is that when an INT instruction is executed, the processor starts executing a function selected by the instruction’s operand, which acts as an index into a table of function pointers (an interrupt table) that (hopefully) has been set up before we invoked the interrupt. At this stage in the boot process, the BIOS should have set up an interrupt table for us. All BIOS-style firmware handles many interrupts similarly but not exactly the same. If you want to be sure of how your particular BIOS handles an interrupt, you should check your BIOS’s technical reference, that is if you can find a copy. QEMU uses SeaBIOS, and if you look for its developer documentation regarding BIOS interrupts you will find that it asks you to consult Ralf Brown’s interrupt list, which is a comprehensive listing of interrupt calls across many chipsets and also points out differences in implementations.
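The “table of function pointers” can be modeled in a few lines. In x86 real mode the interrupt table starts at physical address 0 and each entry is four bytes: a 16-bit offset followed by a 16-bit segment. The sketch below assumes that layout; the function names and the handler location F000:1234 are made up for illustration and do not describe where any real BIOS keeps its video handler.

```python
import struct

IVT_ENTRY_SIZE = 4  # 16-bit offset followed by 16-bit segment, little-endian


def ivt_entry_address(vector: int) -> int:
    """Physical address of the table entry consulted by `int <vector>`."""
    return vector * IVT_ENTRY_SIZE


def handler_address(memory: bytes, vector: int) -> int:
    """Physical address of the handler that the table entry points to."""
    offset, segment = struct.unpack_from("<HH", memory, ivt_entry_address(vector))
    return (segment << 4) + offset  # real-mode segment:offset translation


# A fake first KiB of memory with one made-up entry for vector 0x10.
memory = bytearray(1024)
struct.pack_into("<HH", memory, ivt_entry_address(0x10), 0x1234, 0xF000)

print(hex(ivt_entry_address(0x10)))        # 0x40
print(hex(handler_address(memory, 0x10)))  # 0xf1234
```

So `int 0x10` amounts to a far call through the four bytes at address 0x40, whatever the firmware happened to store there.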
Here is a toy bootloader written for the NASM assembler that simply prints a string “Hello, boot sector!” and hangs. I will explain the code in grueling detail below.
 1 BITS 16           ; Tells the assembler that the processor is operating in 16-bit mode
 2 ORG 0x7C00        ; Assembler should assume the program is loaded at address 0x7C00
 3
 4 ; This is the first instruction that is executed when this bootloader is invoked
 5 mov ebx, msg      ; Load address of message into a general purpose register.
 6                   ; ebx will hold the address of the next byte in the message to print
 7 mov ah, 0x3       ; Setup "Get cursor position & shape" interrupt args
 8 int 0x10          ; Causes current page number to be stored in bh
 9 printnextchar:
10 mov al, [ebx]     ; Move next character in message to al register
11 inc ebx           ; Increment the address of the next byte to print
12 cmp al, 0         ; Have we passed the end of the string we are printing?
13 je finish         ; If we have printed the full string jump to the label 'finish'
14 mov ah, 0x0e      ; Move the magic value 0x0e to the register ah
15 int 0x10          ; Invokes the "Video Services" interrupt
16                   ; Because the magic value 0x0e is in register ah Video Services
17                   ; Will write a character in TTY mode and it will print character
18                   ; Specified by register al (which we set earlier)
19 jmp printnextchar ; Loop to print the next character
20
21 finish: jmp finish ; Loop in place to cause the bootloader to hang
22
23 ; Define a null-terminated string we want to print
24 msg: db 'Hello, boot sector!', 0
25
26 ; Pad the rest of the bootloader with zeros through byte 510
27 times (510) - ($ - $$) db 0
28
29 ; A valid boot sector is expected to have bytes 511-512 contain 0x55 and 0xaa
30 BIOS_signature:
31 db 0x55
32 db 0xaa
The first thing the code does is load the address of the message it wants to print into register ebx. After that it calls interrupt 0x10 with register ah set to value 0x3. According to Ralf Brown’s interrupt list, INT 10 is the video interrupt and when register ah is set to 0x03, the BIOS’s video interrupt will execute the “get cursor position and size” function which stores cursor information in various registers. We call this interrupt because it stores the current page number into register bh which we need in order to print out a character. I will not be explaining what “page number” means beyond the fact that we need it to print a character to screen properly.
The code then iterates through each character of msg which is pointed to by the ebx register. For each character in the string, it copies the character into the al register, increments ebx to point to the next character, checks if the character’s value in al is zero (null), and if it is null it jumps to the tight loop at offset 0x19. If the character is not null, it copies the magic value 0xe to the ah register, calls interrupt 0x10, and loops to print the next character. When this particular interrupt is invoked, the BIOS video services interrupt handler will invoke the “teletype output” function because register ah is set to 0xe. The “teletype output” function will print the character in register al to the screen at the cursor’s current position and at the page number in bh (which was set by our first INT instruction), it will then move the cursor forward one position. After all the characters are printed, we will jump to the last line of code which will cause the bootloader to hang in a tight loop.
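For readers who find the control flow easier to follow in a high-level language, here is a rough Python model of the same loop. It is only an analogy: `teletype_output` is a made-up stand-in for the BIOS teletype function, `memory` is a plain byte string rather than real memory, and the page number is carried along but otherwise ignored.

```python
def teletype_output(screen: list, char: int, page: int) -> None:
    """Hypothetical stand-in for `int 0x10` with ah = 0x0e: emit one character."""
    screen.append(chr(char))


def run_print_loop(memory: bytes, msg_address: int, page: int = 0) -> str:
    """Mirror the assembly loop: print bytes until a zero byte is read."""
    screen = []
    ebx = msg_address                      # mov ebx, msg
    while True:
        al = memory[ebx]                   # mov al, [ebx]
        ebx += 1                           # inc ebx
        if al == 0:                        # cmp al, 0 / je finish
            break
        teletype_output(screen, al, page)  # mov ah, 0x0e / int 0x10
    return "".join(screen)


memory = b"Hello, boot sector!\x00"
print(run_print_loop(memory, 0))
```

As in the assembly, the pointer is incremented before the null check, so the terminating zero is read but never printed.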
Let us now see what this boot sector image looks like on disk. You can grab a copy of this source code with a Makefile and other goodies from my GitHub repository. If you are playing along at home, go ahead and build the image (see the README file) so we can inspect the image to understand how the toy bootloader works. The Makefile builds a disk image called hello.img that can be executed via qemu a la:
$ qemu-system-i386 -fda hello.img
If we disassemble the boot sector image we can see that the assembly code we wrote is located at the start of the image up until offset 0x1b of the file.
$ ndisasm hello.img -o7c00h | head -n 12
00007C00  66BB1B7C0000      mov ebx,0x7c1b
00007C06  B403              mov ah,0x3
00007C08  CD10              int 0x10
00007C0A  678A03            mov al,[ebx]
00007C0D  6643              inc ebx
00007C0F  3C00              cmp al,0x0
00007C11  7406              jz 0x7c19
00007C13  B40E              mov ah,0xe
00007C15  CD10              int 0x10
00007C17  EBF1              jmp short 0x7c0a
00007C19  EBFE              jmp short 0x7c19
00007C1B  48                dec ax
In the ndisasm output above we can see from the first disassembled instruction that msg is at address 0x7c1b. Remember this image will be loaded at address 0x7c00 in memory (the ORG directive in line 2 of the code told the assembler about this load address), so our msg string should be located at offset 0x1b from the top of the image.3
If we use a hex viewer to inspect our boot sector, we will find that the “Hello, boot sector!” string we statically defined at line 24 is indeed at offset 0x1b.
$ xxd -a hello.img
00000000: 66bb 1b7c 0000 b403 cd10 678a 0366 433c  f..|......g..fC<
00000010: 0074 06b4 0ecd 10eb f1eb fe48 656c 6c6f  .t.........Hello
00000020: 2c20 626f 6f74 2073 6563 746f 7221 0000  , boot sector!..
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
*
000001f0: 0000 0000 0000 0000 0000 0000 0000 55aa  ..............U.
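The same offset arithmetic can be checked with a few lines of Python. This is just a sketch: the bytes are the first three rows of the hex dump typed in by hand, and the variable names are made up for the example.

```python
LOAD_ADDRESS = 0x7C00  # where the previous stage loads the boot sector (the ORG value)
MSG_ADDRESS = 0x7C1B   # operand of the first instruction in the disassembly

# First 48 bytes of hello.img, copied from the hex dump.
image_head = bytes.fromhex(
    "66bb1b7c0000b403cd10678a0366433c"
    "007406b40ecd10ebf1ebfe48656c6c6f"
    "2c20626f6f7420736563746f72210000"
)

# An address in memory minus the load address gives the offset within the file.
file_offset = MSG_ADDRESS - LOAD_ADDRESS
end = image_head.index(b"\x00", file_offset)  # the string is null-terminated
message = image_head[file_offset:end].decode("ascii")

print(hex(file_offset))  # 0x1b
print(message)           # Hello, boot sector!
```

This also explains the odd-looking `dec ax` at 0x7C1B in the disassembly: byte 0x48 is the letter “H”, data that the disassembler is decoding as if it were an instruction.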
Following the string, the file is padded with zeros until the last two bytes at offset 0x1fe-0x1ff (510-511). The padding of zeros comes from the assembler directive found at line 27 and the 0x55 0xaa bytes come from lines 31-32. How did I know to perform this padding and insert this signature? Mostly due to trial and error. QEMU did not execute the boot sector without the padding and signature, and I realized the tutorials state that boot sectors need to be padded in this manner.
Now that you know how to make a boot sector that prints characters to the screen, perhaps you too can do something interesting with the 510 bytes of a boot sector that can contain arbitrary data. However if you would like to first see more examples of simple boot loaders, you can find more resources on writing boot sectors such as Shikhin Sethi’s “This OS is a Boot Sector” blog post and the “Creating a Bare Bones Bootloader” blog post by Joe Savage.
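The padding and signature rule is simple enough to express as a short Python sketch. The function names are made up, and the check only covers the two properties described above (512-byte size and the trailing 0x55 0xaa); real firmware may apply other checks that this does not model.

```python
SECTOR_SIZE = 512
SIGNATURE = b"\x55\xaa"


def make_boot_sector(code: bytes) -> bytes:
    """Zero-pad `code` to 510 bytes and append the signature, like lines 27-32."""
    room = SECTOR_SIZE - len(SIGNATURE)
    if len(code) > room:
        raise ValueError("code does not fit in the 510 bytes before the signature")
    return code + bytes(room - len(code)) + SIGNATURE


def looks_like_boot_sector(image: bytes) -> bool:
    """Check that the first sector is complete and ends with 0x55 0xaa."""
    return len(image) >= SECTOR_SIZE and image[510:512] == SIGNATURE


# 0xEB 0xFE is the `jmp short` to itself seen at 0x7C19 in the disassembly.
sector = make_boot_sector(b"\xeb\xfe")
print(len(sector), looks_like_boot_sector(sector))
```

The `bytes(n)` call produces `n` zero bytes, which plays the same role as the `times ... db 0` directive in the assembly source.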
To be continued…
The tianocore UEFI-reference implementation has been ported to the BeagleBoard, although only the last stages in its bootloader chain are UEFI-compliant; the first couple of stages are not. The MinnowBoard, a development board for embedded devices, has full UEFI support.