Module a0_boot

§BSB - A0: Free-Standing Binaries and Boot Process

In this optional assignment, you will learn how to set up a Rust embedded (bare metal) project from scratch. You will be guided through the project setup with the compiler and linker configs, boot assembly and the Rust entry point. Finally, we verify that the kernel works by printing to the CGA screen.

For the next assignments, you will get a template that hides most of the complexity of the boot process.

§Installing Rust

The first step for this assignment is to install the Rust toolchain. Unfortunately, the packages provided by apt do not support building our own bare-metal targets; thus, we must install it manually.

Refer to the README setup section for the install instructions.

§Project Setup

Usually, a Rust project contains a Cargo.toml with the project configuration and a src directory with a main.rs that contains the main function as the entry point. Just create such a project by running cargo init my-kernel. This generates a simple hello world example which can be built and executed with cargo run.

For the bare-metal kernel, we need a bit of extra configuration to tell the compiler that we want to build for our own target.

First, we cannot use Rust's full standard library, only a subset of it (the core crate), because the rest relies on system calls to an underlying OS, and we are building the OS ourselves. Add the following to the top of src/main.rs to disable the standard library.

#![no_std] // no standard library
#![no_main] // disable all Rust-level entry points
// ...

Next, we have to define our own compiler target. The following JSON file contains the system configuration for the Rust compiler.

build/i686-rstubs-kernel.json

{
    "arch": "x86",
    "cpu": "pentium",
    "data-layout": "e-m:e-p:32:32-p270:32:32-p271:32:32-p272:64:64-i128:128-f64:32:64-f80:32-n8:16:32-S128",
    "disable-redzone": true,
    "executables": true,
    "features": "-mmx,-sse,+soft-float",
    "linker-flavor": "ld.lld",
    "linker": "rust-lld",
    "llvm-target": "i686-unknown-none-elf",
    "os": "none",
    "panic-strategy": "abort",
    "pre-link-args": {
        "ld.lld": ["--script=build/link.ld"]
    },
    "target-c-int-width": "32",
    "target-endian": "little",
    "target-pointer-width": "32"
}

Building our target requires a nightly compiler, so we must tell cargo to use it.

rust-toolchain

nightly

Next, we instruct cargo to use this target configuration and to build the required parts of the standard library from source. The config also defines useful cargo aliases that make it simpler to run and debug our kernel.

.cargo/config.toml

[alias]
run-gdb = ["run", "--config", "target.i686-rstubs-kernel.runner = './scripts/qemu.sh -S -kernel'"]
gdb = ["run", "--config", "target.i686-rstubs-kernel.runner = './scripts/gdb.sh'"]

[build]
target = "build/i686-rstubs-kernel.json"

# Cargo run should execute the following script
[target.i686-rstubs-kernel]
runner = "./scripts/qemu.sh -kernel"

[unstable]
# build our own standard library
build-std = ["core", "compiler_builtins"]
build-std-features = ["compiler-builtins-mem"]

The following shell scripts are used to start our QEMU VM and attach a debugger.

scripts/qemu.sh

#!/bin/bash
qemu-system-i386 -smp 4 -m 1G -serial mon:stdio -gdb tcp::1234 -no-shutdown -no-reboot -d guest_errors "$@"

scripts/gdb.sh

#!/bin/bash
gdb "$1" -ex 'target remote :1234'

Note: Make both scripts executable! (chmod u+x scripts/qemu.sh scripts/gdb.sh)

§The Multiboot Loader

So we have a freestanding binary, but how do we boot it?

The QEMU VM and many bootloaders (like GRUB) support the Multiboot specification. Every kernel that implements this standard can be loaded by compliant bootloaders. The specification requires a Multiboot header that tells the bootloader how to configure the hardware (like video output) and how to load the kernel.

Offset	Field
0	Magic multiboot value
4	Flags
8	Checksum: -(Magic + Flags)
12-28	Not needed
32	Video mode
36	Video width
40	Video height
44	Video bit depth
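
The checksum at offset 8 is chosen so that the first three header fields sum to zero in 32-bit arithmetic. The following is a minimal Rust sketch of that calculation, using the magic and flags values from the header in src/start.s. The function names are hypothetical and only serve as an illustration; they are not part of the kernel.

```rust
/// Magic value that identifies a Multiboot header.
const MULTIBOOT_HEADER_MAGIC: u32 = 0x1BADB002;
/// Flag bit 2: the header carries video mode information.
const MULTIBOOT_HEADER_FLAGS: u32 = 4;

/// Computes the checksum field: -(magic + flags) in wrapping 32-bit arithmetic.
/// (Hypothetical helper, for illustration only.)
fn multiboot_checksum(magic: u32, flags: u32) -> u32 {
    0u32.wrapping_sub(magic.wrapping_add(flags))
}

/// Returns true if magic + flags + checksum is zero modulo 2^32.
/// (Hypothetical helper, for illustration only.)
fn header_is_valid(magic: u32, flags: u32, checksum: u32) -> bool {
    magic.wrapping_add(flags).wrapping_add(checksum) == 0
}
```

For the values above, multiboot_checksum(MULTIBOOT_HEADER_MAGIC, MULTIBOOT_HEADER_FLAGS) yields 0xE4524FFA, which is the value the assembler emits for the .long -(MAGIC + FLAGS) expression.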

A Multiboot bootloader parses the ELF headers and this special Multiboot header, copies the code and data to the desired memory addresses, configures some hardware, and finally jumps to the entry point address from the ELF header. This is where our code starts.

Unfortunately, we still have to do a few things before we can execute our Rust main function. Most importantly, we have to create a stack. The following assembler code implements the Multiboot header and a bit of startup code that initializes the stack (and x86 segmentation stuff, which will be explained in a later assignment).

src/start.s

# Symbols imported from Rust
.extern main
.extern INIT_STACK
.extern STACK_SIZE

# Multiboot Header
.section .multiboot, "a" # allocatable
.globl multiboot
multiboot:
    MULTIBOOT_HEADER_MAGIC   = 0x1BADB002
    MULTIBOOT_HEADER_FLAGS   = 4 # video mode
    .long MULTIBOOT_HEADER_MAGIC
    .long MULTIBOOT_HEADER_FLAGS
    .long -(MULTIBOOT_HEADER_MAGIC + MULTIBOOT_HEADER_FLAGS) # checksum
    .long 0, 0, 0, 0, 0 # Not needed for ELFs
    .long 0     # Video mode (0: Graphic, 1: Text)
    .long 1280  # Video width
    .long 1024  # Video height
    .long 32    # Video bit depth

# Startup (protected mode)
.section .inittext
.globl start
start:
    # Load GDT (segmentation will be explained later)
    lgdt GDT_PTR
    # Set kernel code segment
    ljmp 0x8, offset start_high

# Second stage
.section .text
start_high:
    # Set kernel data segments
    mov ax, 0x10
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov ss, ax

    # Set init stack
    lea esp, INIT_STACK
    add esp, STACK_SIZE
    # Call Rust function
    call main

# Static Data for segmentation
.section .data
GDT_PTR:
	.word GDT_END - GDT - 1
	.long GDT
GDT: # Global Descriptor Table for segmentation
	.long 0x00000000, 0x00000000 # 0x00 NULL
	.long 0x0000FFFF, 0x00CF9A00 # 0x08 Kernel code
	.long 0x0000FFFF, 0x00CF9200 # 0x10 Kernel data
GDT_END:
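
Segmentation is covered in a later assignment, but the two non-null GDT entries can already be read as plain bit fields. The following sketch assumes the standard 32-bit x86 segment descriptor layout and decodes the two .long values of each entry. The Descriptor type and the helper names are hypothetical and exist only for this illustration.

```rust
/// Decoded fields of a segment descriptor (hypothetical type, for illustration).
#[derive(Debug, PartialEq)]
struct Descriptor {
    base: u32,  // 32-bit segment base address
    limit: u32, // 20-bit segment limit
    access: u8, // present bit, privilege level, segment type
    flags: u8,  // 4 bits: granularity (G), default size (D/B), L, AVL
}

impl Descriptor {
    /// Bit 3 of the access byte distinguishes code (1) from data (0) segments.
    fn is_code_segment(&self) -> bool {
        self.access & 0x08 != 0
    }
}

/// Splits the low and high 32-bit words of a descriptor into its fields.
fn decode_descriptor(low: u32, high: u32) -> Descriptor {
    Descriptor {
        // Base bits 0-15 are in the low word, bits 16-23 and 24-31 in the high word.
        base: (low >> 16) | ((high & 0xFF) << 16) | (high & 0xFF00_0000),
        // Limit bits 0-15 are in the low word, bits 16-19 in the high word.
        limit: (low & 0xFFFF) | (high & 0x000F_0000),
        access: ((high >> 8) & 0xFF) as u8,
        flags: ((high >> 20) & 0xF) as u8,
    }
}

/// A selector holds the table index in its upper 13 bits.
fn selector_index(selector: u16) -> u16 {
    selector >> 3
}
```

Decoding both entries gives a base of 0 and a limit of 0xFFFFF with the granularity flag set, so each segment spans the full 4 GiB address space. The selectors 0x8 and 0x10 used in the startup code refer to table entries 1 and 2.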

The startup assembly requires three symbols from our Rust code: the main function, the STACK_SIZE value, and the INIT_STACK array that provides the stack memory. They have to be defined in src/main.rs accordingly.

#![no_std] // no standard library
#![no_main] // disable all Rust-level entry points

use core::arch::global_asm;
use core::panic::PanicInfo;

// Include the assembly code
global_asm!(include_str!("start.s"), options(raw));

const STACK_S: usize = 4 * 1024;
#[no_mangle]
pub static STACK_SIZE: usize = STACK_S;
#[no_mangle]
pub static mut INIT_STACK: [u8; STACK_S] = [0; STACK_S];

#[no_mangle]
pub extern "C" fn main() -> ! { // C ABI, because main is called from assembly
    loop {}
}

#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

Note: The main and panic functions return !, the never type, which means that they never return.
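
The startup code adds STACK_SIZE to the address of INIT_STACK because the x86 stack grows downward: a push first decrements esp and then stores the value, so esp has to start at the end of the array. A small host-side sketch of that address calculation follows. The function name is hypothetical, and the static here merely mirrors the one in src/main.rs.

```rust
const STACK_S: usize = 4 * 1024;
/// Stand-in for the kernel's INIT_STACK, used here only on the host.
static INIT_STACK: [u8; STACK_S] = [0; STACK_S];

/// Returns the address one past the last byte of the stack memory.
/// This corresponds to `lea esp, INIT_STACK` followed by `add esp, STACK_SIZE`.
/// (Hypothetical helper, for illustration only.)
fn initial_stack_pointer(stack: &[u8]) -> usize {
    stack.as_ptr() as usize + stack.len()
}
```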

And finally, we have to link all of our code together with the following linker script. This places the Multiboot header at the beginning and then puts the code and data in the correct sections.

build/link.ld

ENTRY(start)
OUTPUT_FORMAT(elf32-i386)

SECTIONS {
	. = 16M;  /* Load the kernel at 16MB */
	.boot : { /* Multiboot Header */
		KEEP( *(.multiboot) )
	}
	.text : { /* Executable code */
		*(.inittext)
		*(.text .text.*)
		*(.fini)
	}
	.data : { /* Static variables (non-zero) */
		*(.padata)
		*(.data .data.*)
		*(.rodata .rodata.*)
		*(.got .got.plt)
		*(.eh_frame .eh_fram)
		*(.jcr)
	}
	.bss : { /* Static variables (zeroed by multiboot) */
		*(.bss .bss.*)
		*(COMMON)
	}
}

Now you should have the following files:

.cargo/config.toml
build/
- i686-rstubs-kernel.json
- link.ld
scripts/
- gdb.sh
- qemu.sh
src/
- main.rs
- start.s
Cargo.toml
rust-toolchain

Try building and running the project with cargo run. If you see a QEMU window with some text (white on black), then everything works as expected. You can exit QEMU with Ctrl-a x or by closing its window.

To see if we reach the main function, start QEMU with cargo run-gdb and, in another terminal, the debugger with cargo gdb. Then, set a hardware breakpoint (hb main) and continue execution (c). Now, you should hit the breakpoint. You can also inspect the registers (including the stack pointer) with monitor info registers and print variables with p var. To exit the debugger, type quit.

For more information on debugging, see the README debugging section.

§Output on CGA Text Mode

To verify that the kernel really works, we can print to the Color Graphics Adapter screen. This is done by writing to the memory region at 0xB8000.

The CGA screen has a resolution of 80x25 characters, and each character cell consists of two bytes: the first one is an index into the CGA font, selecting the character, and the second byte specifies the foreground and background colors.

         0         80 Columns         79
       0 +---------------------------->+
         | char, attr, char, attr, ... |
 25 rows | char, attr, ...             |
         v ...                         |
      24 +-----------------------------+
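
The layout above translates into simple index arithmetic. The sketch below computes the byte offset of a cell and packs an attribute byte, assuming the foreground color sits in the low nibble and the background color in the high nibble. The constant and function names are hypothetical and only used for this illustration.

```rust
const COLUMNS: usize = 80;
const ROWS: usize = 25;

/// Byte offset of the character byte of cell (row, col) relative to 0xB8000.
/// The attribute byte follows at offset + 1. Returns None outside the screen.
fn cell_offset(row: usize, col: usize) -> Option<usize> {
    if row < ROWS && col < COLUMNS {
        Some((row * COLUMNS + col) * 2)
    } else {
        None
    }
}

/// Packs two 4-bit color numbers into one attribute byte
/// (assumed layout: background in bits 4-7, foreground in bits 0-3).
fn attribute(foreground: u8, background: u8) -> u8 {
    ((background & 0x0F) << 4) | (foreground & 0x0F)
}
```

With red as color 4 and blue as color 1, attribute(4, 1) gives 0x14, and the last cell of the screen starts at byte offset 3998.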

The CGA font is compatible with most ASCII characters, meaning you can print regular strings.

This example writes a red H on a blue background in the upper-left corner:

const BUFFER: *mut u8 = 0xb8000 as _;
unsafe {
    BUFFER.add(0).write_volatile(b'H');
    BUFFER.add(1).write_volatile(0x14); // red on blue
}

Try to print a (byte-) string like b"Hello, World!" to the screen.
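
One possible approach, sketched here against an ordinary byte slice so that it can be tried on the host: loop over the string and write one character and one attribute byte per cell. The function name write_bytes and its parameters are hypothetical. In the kernel, the two slice stores would become write_volatile calls on BUFFER, as in the example above.

```rust
const COLUMNS: usize = 80;
const ROWS: usize = 25;

/// Writes `text` with the attribute `attr` into `buffer`, starting at (row, col).
/// Stops at the end of the screen or of the buffer and returns the number of
/// cells written. (Hypothetical helper, for illustration only.)
fn write_bytes(buffer: &mut [u8], row: usize, col: usize, text: &[u8], attr: u8) -> usize {
    let start = row * COLUMNS + col;
    let mut written = 0;
    for (i, &byte) in text.iter().enumerate() {
        let cell = start + i;
        // Each cell occupies two bytes: character, then attribute.
        if cell >= COLUMNS * ROWS || cell * 2 + 1 >= buffer.len() {
            break;
        }
        buffer[cell * 2] = byte;
        buffer[cell * 2 + 1] = attr;
        written += 1;
    }
    written
}
```

A call such as write_bytes(&mut screen, 0, 0, b"Hello, World!", 0x14), with screen being a [u8; 80 * 25 * 2] array, fills the first 13 cells. Text that would run past the last cell is cut off instead of written out of bounds.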

To see the full range of characters and colors, try to iterate through the whole CGA buffer and write the cell index to the characters and attributes.

The CGA output is continued in the next assignment, where we will implement proper abstractions. You can skip ahead to the full explanation of the CGA output if you want.