Nablas Blog

Modern CPUs are complicated beasts. They have peripherals, memory controllers, memory management units and a whole lot more that needs to be set up before facilities such as stdin and stdout are defined and usable. This hardware-dependent setup is done during the boot process. On most embedded systems it is done by the bootloader, which then hands the properly configured system off to the kernel. I'd like to explore how this works in detail.

When a CPU receives power, it is built to load the Program Counter (PC) from a predefined location in memory. The address of that location is called the power-on-reset-vector. It's called a vector because it points to the location containing the initial PC. The same procedure happens for any hardware interrupt, with the difference that every interrupt gets its own vector, so the instructions that get executed may differ between interrupt sources.

Interrupt vectors point to specific locations in the CPU's memory
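To make the idea of a vector concrete, here is a toy model in Python. It is only a sketch under stated assumptions: memory is a plain byte array, the vector table sits at address 0, each vector is a 32-bit little-endian address, and the interrupt numbers and handler addresses are made up for illustration. Real CPUs differ in where the table lives and in what exactly a table entry contains.

```python
import struct

VECTOR_TABLE_BASE = 0x0000  # assumed location of the vector table
VECTOR_SIZE = 4             # each vector holds one 32-bit address

# Hypothetical interrupt numbering for this toy model.
RESET, TIMER_IRQ, UART_IRQ = 0, 1, 2


def write_vector(memory, index, handler_address):
    """Store a handler address in slot `index` of the vector table."""
    offset = VECTOR_TABLE_BASE + index * VECTOR_SIZE
    struct.pack_into("<I", memory, offset, handler_address)


def load_pc(memory, index):
    """Return the PC value the CPU would load for interrupt `index`."""
    offset = VECTOR_TABLE_BASE + index * VECTOR_SIZE
    (pc,) = struct.unpack_from("<I", memory, offset)
    return pc


memory = bytearray(64 * 1024)
write_vector(memory, RESET, 0x0100)      # made-up bootloader entry point
write_vector(memory, TIMER_IRQ, 0x0200)  # made-up timer handler
write_vector(memory, UART_IRQ, 0x0300)   # made-up UART handler

print(hex(load_pc(memory, RESET)))  # prints 0x100
```

The reset case and the interrupt cases go through the same `load_pc` lookup; only the table index differs, which is the point made above.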

So, to eventually boot an operating system, the power-on-reset-vector must point to code that will load the desired OS. In our case this code is the bootloader.

Now, in many environments it is foreseen that third parties may gain physical access to the hardware. To ensure that the device does not boot a compromised image or load code that has been tampered with, many bootloaders verify images by checking their signature against a known public key flashed into non-rewritable memory, while also checking the serial number of the memory to protect against swapping out the chip containing the key. For the ARM Cortex-A9 in the ZYNQ, the related security technology is branded TrustZone.
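The shape of such a check can be sketched in a few lines of Python. Note the simplification: a real secure boot verifies an asymmetric signature against the public key, whereas this sketch swaps in a plain SHA-256 digest comparison, because that is all the Python standard library offers. `TRUSTED_DIGEST` stands in for a value burned into non-rewritable memory, and the image contents are made up.

```python
import hashlib
import hmac


def digest(image: bytes) -> bytes:
    """Return the SHA-256 digest of an image."""
    return hashlib.sha256(image).digest()


def verify_image(image: bytes, trusted_digest: bytes) -> bool:
    """Accept the image only if its digest matches the trusted one."""
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(digest(image), trusted_digest)


good_image = b"made-up kernel image contents"
TRUSTED_DIGEST = digest(good_image)  # stands in for a value stored in ROM

tampered_image = good_image + b"\x00"

print(verify_image(good_image, TRUSTED_DIGEST))      # prints True
print(verify_image(tampered_image, TRUSTED_DIGEST))  # prints False
```

A bootloader built this way would refuse to jump into any image for which `verify_image` returns `False`.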

Sometimes it is also useful to fetch the image from network-attached storage, which means that the bootloader must set up the network hardware and include a network stack.

To separate programs from each other, the CPU can run in various modes. In privileged mode, every CPU instruction may be issued, including those that change the virtualization behavior of the system. In unprivileged mode, only a limited instruction set is executable, ensuring that once a user program has been given control of the Program Counter, it cannot access memory outside the boundaries set up during privileged execution. The bootloader must prepare the CPU so that it is in the correct mode when the OS is booted. Processors like the venerable x86 require a somewhat complicated dance to enable all the modern features of the CPU.
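The difference between the two modes can be illustrated with a toy CPU model. This is a sketch, not a description of any real instruction set: the instruction names are hypothetical, and the model simply assumes that the CPU comes out of reset in privileged mode.

```python
# Hypothetical instruction names; real instruction sets differ.
PRIVILEGED_ONLY = {"set_page_table", "mask_interrupts"}


class PrivilegeError(Exception):
    """Raised when unprivileged code issues a privileged instruction."""


class ToyCPU:
    def __init__(self):
        # Assumption of this model: the CPU starts out privileged.
        self.privileged = True
        self.executed = []

    def execute(self, instruction):
        """Run one instruction, trapping if the current mode forbids it."""
        if instruction in PRIVILEGED_ONLY and not self.privileged:
            raise PrivilegeError(instruction)
        self.executed.append(instruction)

    def drop_privileges(self):
        """Switch to unprivileged mode before running user code."""
        self.privileged = False


cpu = ToyCPU()
cpu.execute("set_page_table")  # allowed: still privileged
cpu.drop_privileges()
cpu.execute("add")             # ordinary instructions still work
try:
    cpu.execute("set_page_table")
except PrivilegeError as err:
    print("trapped:", err)     # prints trapped: set_page_table
```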

Besides restricting the instructions that may be issued in different modes, a key component of virtualization is the memory management unit (MMU) together with the memory hierarchy. The MMU translates the memory addresses passed to load and store instructions into the actual physical addresses, and the caches supply cached values if available (or begin reading a cache line from off-chip RAM, stalling the CPU in the meantime). This system needs to be configured properly before an operating system can run.
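The address translation step can be sketched as a lookup in a page table. The model below makes several assumptions: a single-level table, 4 KiB pages, and a made-up mapping. Real MMUs typically use multi-level tables and also track permissions, which are left out here.

```python
PAGE_SIZE = 4096  # assumed page size of 4 KiB


class PageFault(Exception):
    """Raised when a virtual address has no mapping."""


def translate(page_table, virtual_address):
    """Translate a virtual address into a physical address."""
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    if page_number not in page_table:
        raise PageFault(hex(virtual_address))
    frame_number = page_table[page_number]
    return frame_number * PAGE_SIZE + offset


# Made-up mapping: virtual page -> physical frame.
page_table = {0: 5, 1: 9}

print(hex(translate(page_table, 0x1234)))  # prints 0x9234
```

The offset within the page (`0x234` here) passes through unchanged; only the page number is replaced by the frame number.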

In principle this could all be done by the operating system. However, as operating system images are of considerable size, they are not stored in directly accessible memory but loaded from an external storage device. For the operating system image to boot directly, the entire image would have to be readable by simple load instructions, meaning that it would need to be mapped into the memory space of the CPU at power-on. As this is impractical (and wastes valuable address space on smaller systems), the bootloader is loaded instead, from an on-chip memory that is directly mapped into the CPU's address space by the default configuration of the MMU.
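The core job this leaves to the bootloader, copying an image from a device that can only be read block by block into memory the CPU can address directly, might look roughly like this. The `BlockDevice` class, the 512-byte block size and the load address are all assumptions made for the sketch.

```python
BLOCK_SIZE = 512  # assumed block size of the storage device


class BlockDevice:
    """Toy storage device that can only be read one block at a time."""

    def __init__(self, data: bytes):
        self._data = data

    def read_block(self, index: int) -> bytes:
        start = index * BLOCK_SIZE
        block = self._data[start:start + BLOCK_SIZE]
        return block.ljust(BLOCK_SIZE, b"\x00")  # pad a short final block


def load_image(device, image_size, ram, load_address):
    """Copy `image_size` bytes from the device into RAM, block by block."""
    block_count = -(-image_size // BLOCK_SIZE)  # ceiling division
    for index in range(block_count):
        block = device.read_block(index)
        remaining = image_size - index * BLOCK_SIZE
        chunk = block[:min(BLOCK_SIZE, remaining)]
        destination = load_address + index * BLOCK_SIZE
        ram[destination:destination + len(chunk)] = chunk
    return load_address  # the address the bootloader would jump to


image = bytes(range(256)) * 5  # made-up 1280-byte "OS image"
ram = bytearray(8192)
entry_point = load_image(BlockDevice(image), len(image), ram, 0x1000)

print(hex(entry_point))  # prints 0x1000
```

Once the copy is done, the image is reachable by ordinary load instructions, and the bootloader can set the Program Counter to `entry_point`.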

The bootloader must be tailored to the specific chip and system. This means that every system either needs a custom bootloader, or a common bootloader needs to be flexible enough to adapt to a wide range of devices (especially in the embedded space).

Booting an embedded Linux