§BSB - A0: Free-Standing Binaries and Boot Process
In this optional assignment, you will learn how to set up a Rust embedded (bare metal) project from scratch. You will be guided through the project setup with the compiler and linker configs, boot assembly and the Rust entry point. Finally, we verify that the kernel works by printing to the CGA screen.
For the next assignments, you will get a template that hides most of the complexity of the boot process.
§Installing Rust
The first step for this assignment is to install the Rust toolchain. Unfortunately, the packages provided by apt do not support building our own bare-metal targets; thus, we must install it manually.
Refer to the README setup section for the install instructions.
§Project Setup
Usually, a Rust project contains a Cargo.toml with the project configuration and a src directory with a main.rs that contains the main function as the entry point. Just create such a project by running cargo init my-kernel. This generates a simple hello world example which can be built and executed with cargo run.
For the bare-metal kernel, we need a bit of extra configuration to tell the compiler that we want to build for our own target.
First of all, we cannot use Rust's full standard library (only the core subset), because it relies on syscalls to an OS, and we are building our own OS. Add the following to the start of src/main.rs to disable it.
#![no_std] // no standard library
#![no_main] // disable all Rust-level entry points
// ...
Next, we have to define our own compiler target. The following JSON file contains the system configuration for the Rust compiler.
build/i686-rstubs-kernel.json
{
"arch": "x86",
"cpu": "pentium",
"data-layout": "e-m:e-p:32:32-p270:32:32-p271:32:32-p272:64:64-i128:128-f64:32:64-f80:32-n8:16:32-S128",
"disable-redzone": true,
"executables": true,
"features": "-mmx,-sse,+soft-float",
"linker-flavor": "ld.lld",
"linker": "rust-lld",
"llvm-target": "i686-unknown-none-elf",
"os": "none",
"panic-strategy": "abort",
"pre-link-args": {
"ld.lld": ["--script=build/link.ld"]
},
"target-c-int-width": "32",
"target-endian": "little",
"target-pointer-width": "32"
}
Building our target requires a nightly compiler, so we must tell cargo to use it.
rust-toolchain
nightly
Now we have to instruct cargo to use this target configuration and to build the required parts of the standard library (core and compiler_builtins) ourselves. The config also contains useful cargo aliases that make it simpler to execute and debug our kernel.
.cargo/config.toml
[alias]
run-gdb = ["run", "--config", "target.i686-rstubs-kernel.runner = './scripts/qemu.sh -S -kernel'"]
gdb = ["run", "--config", "target.i686-rstubs-kernel.runner = './scripts/gdb.sh'"]
[build]
target = "build/i686-rstubs-kernel.json"
# Cargo run should execute the following script
[target.i686-rstubs-kernel]
runner = "./scripts/qemu.sh -kernel"
[unstable]
# build our own standard library
build-std = ["core", "compiler_builtins"]
build-std-features = ["compiler-builtins-mem"]
The following shell scripts are used to start our QEMU VM and attach a debugger.
scripts/qemu.sh
#!/bin/bash
qemu-system-i386 -smp 4 -m 1G -serial mon:stdio -gdb tcp::1234 -no-shutdown -no-reboot -d guest_errors "$@"
scripts/gdb.sh
#!/bin/bash
gdb "$1" -ex 'target remote :1234'
Note: Make the qemu.sh and gdb.sh scripts executable! (chmod u+x scripts/qemu.sh scripts/gdb.sh)
§The Multiboot Loader
So we have a freestanding binary, but how do we boot it?
The QEMU VM and many bootloaders (like GRUB) support the Multiboot specification. Every kernel that implements this standard can be loaded by compliant bootloaders. This specification requires a specific Multiboot header to tell the bootloader how it should configure hardware (like video output) and load the kernel.
Offset | Field |
---|---|
0 | Magic multiboot value |
4 | Flags |
8 | Checksum: -(Magic + Flags) |
12-28 | Not needed |
32 | Video mode |
36 | Video width |
40 | Video height |
44 | Video bit depth |
Multiboot parses the ELF headers and this special Multiboot header, copies the code and data to the desired memory addresses, configures some hardware, and finally jumps to the entry point address from the ELF header. This is the part where our code starts.
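To make the header layout and the checksum rule concrete, here is a minimal host-side sketch in Rust. The struct, its new helper and the is_valid check are illustrative assumptions and not part of the assignment (the kernel defines the header in assembly); the field names follow the naming used in the Multiboot specification.

```rust
/// Multiboot (version 1) header, laid out as in the table above.
/// Illustrative only: the kernel itself defines this header in assembly.
#[repr(C)]
pub struct MultibootHeader {
    pub magic: u32,       // offset 0
    pub flags: u32,       // offset 4
    pub checksum: u32,    // offset 8
    pub unused: [u32; 5], // offsets 12-28, not needed for ELF kernels
    pub mode_type: u32,   // offset 32
    pub width: u32,       // offset 36
    pub height: u32,      // offset 40
    pub depth: u32,       // offset 44
}

pub const MULTIBOOT_HEADER_MAGIC: u32 = 0x1BAD_B002;

impl MultibootHeader {
    /// Builds a header whose checksum satisfies the rule
    /// magic + flags + checksum == 0 (in 32-bit wrapping arithmetic).
    pub fn new(flags: u32, mode_type: u32, width: u32, height: u32, depth: u32) -> Self {
        Self {
            magic: MULTIBOOT_HEADER_MAGIC,
            flags,
            checksum: 0u32.wrapping_sub(MULTIBOOT_HEADER_MAGIC.wrapping_add(flags)),
            unused: [0; 5],
            mode_type,
            width,
            height,
            depth,
        }
    }

    /// The check a bootloader performs before accepting the header.
    pub fn is_valid(&self) -> bool {
        self.magic == MULTIBOOT_HEADER_MAGIC
            && self.magic.wrapping_add(self.flags).wrapping_add(self.checksum) == 0
    }
}
```

With the values used in this assignment, MultibootHeader::new(4, 0, 1280, 1024, 32) yields a 48-byte header that passes is_valid.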
Unfortunately, we still have to do a few things before we can execute our Rust main function.
Most importantly, we have to create a stack.
The following assembler code implements the Multiboot header and a bit of startup code that initializes the stack (and x86 segmentation stuff, which will be explained in a later assignment).
src/start.s
# Symbols imported from Rust
.extern main
.extern INIT_STACK
.extern STACK_SIZE
# Multiboot Header
.section .multiboot, "a" # allocatable
.globl multiboot
multiboot:
MULTIBOOT_HEADER_MAGIC = 0x1BADB002
MULTIBOOT_HEADER_FLAGS = 4 # video mode
.long MULTIBOOT_HEADER_MAGIC
.long MULTIBOOT_HEADER_FLAGS
.long -(MULTIBOOT_HEADER_MAGIC + MULTIBOOT_HEADER_FLAGS) # checksum
.long 0, 0, 0, 0, 0 # Not needed for ELFs
.long 0 # Video mode (0: Graphic, 1: Text)
.long 1280 # Video width
.long 1024 # Video height
.long 32 # Video bit depth
# Startup (protected mode)
.section .inittext
.globl start
start:
# Load GDT (segmentation will be explained later)
lgdt GDT_PTR
# Set kernel code segment
ljmp 0x8, offset start_high
# Second stage
.section .text
start_high:
# Set kernel data segments
mov ax, 0x10
mov ds, ax
mov es, ax
mov fs, ax
mov gs, ax
mov ss, ax
# Set init stack
lea esp, INIT_STACK
add esp, STACK_SIZE
# Call Rust function
call main
# Static Data for segmentation
.section .data
GDT_PTR:
.word GDT_END - GDT - 1
.long GDT
GDT: # Global Descriptor Table for segmentation
.long 0x00000000, 0x00000000 # 0x00 NULL
.long 0x0000FFFF, 0x00CF9A00 # 0x08 Kernel code
.long 0x0000FFFF, 0x00CF9200 # 0x10 Kernel data
GDT_END:
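Segmentation is covered in a later assignment, so the GDT entries can be treated as opaque for now. For readers who want to see what the two 32-bit words of an entry encode, the following host-side sketch decodes them. The Descriptor struct and the decode_descriptor function are hypothetical helpers for illustration, not part of the kernel.

```rust
/// Decoded fields of one 8-byte x86 segment descriptor.
#[derive(Debug, PartialEq, Eq)]
pub struct Descriptor {
    pub base: u32,  // 32-bit start address of the segment
    pub limit: u32, // 20-bit segment limit
    pub access: u8, // present bit, privilege level and segment type
    pub flags: u8,  // granularity and operand size (4 bits)
}

/// Splits the low and high 32-bit words of a descriptor into its fields.
pub fn decode_descriptor(low: u32, high: u32) -> Descriptor {
    Descriptor {
        // base[15:0] is in the low word, base[23:16] and base[31:24] in the high word
        base: (low >> 16) | ((high & 0xFF) << 16) | (high & 0xFF00_0000),
        // limit[15:0] is in the low word, limit[19:16] sits in bits 16-19 of the high word
        limit: (low & 0xFFFF) | (high & 0x000F_0000),
        access: ((high >> 8) & 0xFF) as u8,
        flags: ((high >> 20) & 0xF) as u8,
    }
}
```

Decoding the kernel code entry (0x0000FFFF, 0x00CF9A00) gives base 0, limit 0xFFFFF, access 0x9A and flags 0xC. The data entry differs only in its access byte (0x92). With the granularity bit set, the limit counts 4 KiB units, so both segments span the full 4 GiB address space.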
The start assembly requires three symbols from our Rust code: our main function, the STACK_SIZE, and the INIT_STACK pointer.
They have to be created accordingly.
#![no_std] // no standard library
#![no_main] // disable all Rust-level entry points
use core::arch::global_asm;
use core::panic::PanicInfo;
// Include the assembly code
global_asm!(include_str!("start.s"), options(raw));
const STACK_S: usize = 4 * 1024;
#[no_mangle]
pub static STACK_SIZE: usize = STACK_S;
#[no_mangle]
pub static mut INIT_STACK: [u8; STACK_S] = [0; STACK_S];
#[no_mangle]
pub fn main() -> ! {
loop {}
}
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
loop {}
}
Note: The main and panic functions return !, which is the never type, meaning they just do not return at all.
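The startup code loads the address of INIT_STACK into esp and then adds STACK_SIZE. The reason is that the x86 stack grows toward lower addresses, so the stack pointer has to start at the end of the array, not at its beginning. A minimal host-side sketch of the same computation (the helper name initial_stack_pointer is hypothetical, and the static here is immutable only to keep the sketch free of unsafe code):

```rust
const STACK_S: usize = 4 * 1024;
static INIT_STACK: [u8; STACK_S] = [0; STACK_S];

/// Address one past the last byte of `stack`: the initial stack pointer
/// for a stack that grows toward lower addresses, as on x86.
pub fn initial_stack_pointer(stack: &[u8]) -> usize {
    stack.as_ptr() as usize + stack.len()
}
```

The first push then decrements the stack pointer and writes into the last bytes of the array.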
And finally, we have to link all of our code together with the following linker script. This places the Multiboot header at the beginning and then puts the code and data in the correct sections.
build/link.ld
ENTRY(start)
OUTPUT_FORMAT(elf32-i386)
SECTIONS {
. = 16M; /* Load the kernel at 16MB */
.boot : { /* Multiboot Header */
KEEP( *(.multiboot) )
}
.text : { /* Executable code */
*(.inittext)
*(.text .text.*)
*(.fini)
}
.data : { /* Static variables (non-zero) */
*(.padata)
*(.data .data.*)
*(.rodata .rodata.*)
*(.got .got.plt)
*(.eh_frame .eh_fram)
*(.jcr)
}
.bss : { /* Static variables (zeroed by multiboot) */
*(.bss .bss.*)
*(COMMON)
}
}
Now you should have the following files:
- .cargo/config.toml
- build/
  - i686-rstubs-kernel.json
  - link.ld
- scripts/
  - gdb.sh
  - qemu.sh
- src/
  - main.rs
  - start.s
- Cargo.toml
- rust-toolchain
Try building and running the project with cargo run.
If you see a QEMU window with some text (white on black), then everything works as expected.
You can exit QEMU with Ctrl-a x or close its window.
To see if we reach the main function, start QEMU with cargo run-gdb and the debugger (in another terminal) with cargo gdb.
Then, set a breakpoint (hb main) and continue execution (c).
Now, you should hit the breakpoint.
You can also inspect the stack and registers with monitor info registers and print variables with p var.
To exit the debugger, type quit.
For more information on debugging, see the README debugging section.
§Output on CGA Text Mode
To verify that the kernel really works, we can print to the Color Graphics Adapter screen.
This is done by writing to the memory region at 0xB8000.
The CGA screen has a resolution of 80x25 characters, and each character cell consists of two bytes: the first one is an index into the CGA font, selecting the character, and the second byte specifies the foreground and background colors.
        0         80 Columns         79
      0 +---------------------------->+
        | char, attr, char, attr, ... |
25 rows | char, attr, ...             |
        v ...                         |
     24 +-----------------------------+
The CGA font is compatible with most ASCII characters, meaning you can print regular strings.
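The cell layout translates directly into a small amount of index arithmetic. The following sketch shows one possible way to compute the byte offset of a cell and to pack an attribute byte. The names COLUMNS, ROWS, cell_offset and attribute are assumptions made for this illustration, not part of the assignment template.

```rust
pub const COLUMNS: usize = 80;
pub const ROWS: usize = 25;

/// Byte offset of the character byte of cell (`col`, `row`), relative to the
/// start of the buffer; the attribute byte follows at offset + 1.
/// Returns None if the position is outside the 80x25 screen.
pub fn cell_offset(col: usize, row: usize) -> Option<usize> {
    if col < COLUMNS && row < ROWS {
        Some(2 * (row * COLUMNS + col))
    } else {
        None
    }
}

/// Packs an attribute byte: foreground color in the low four bits,
/// background color in the high four bits. Depending on the hardware
/// configuration, the top bit may be interpreted as blink instead.
pub fn attribute(foreground: u8, background: u8) -> u8 {
    ((background & 0x0F) << 4) | (foreground & 0x0F)
}
```

With red as color 4 and blue as color 1, attribute(4, 1) gives 0x14, and the last cell of the screen, cell_offset(79, 24), starts at byte 3998.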
This example writes a red H on a blue background in the upper-left corner:
const BUFFER: *mut u8 = 0xb8000 as _;
unsafe {
BUFFER.add(0).write_volatile(b'H');
BUFFER.add(1).write_volatile(0x14); // red on blue
}
Try to print a (byte-) string like b"Hello, World!" to the screen.
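One possible approach, sketched here against an ordinary byte slice so that it can be tried on the host: walk over the buffer in two-byte cells and fill each with a character and an attribute. The function write_bytes is a hypothetical helper; in the kernel, the same loop would go through the raw pointer at 0xB8000 and use write_volatile, as in the example above.

```rust
/// Writes `text` with the given attribute into a CGA-style cell buffer,
/// starting at the first cell, and returns the number of cells written.
/// Text that does not fit into the buffer is cut off.
pub fn write_bytes(buffer: &mut [u8], text: &[u8], attr: u8) -> usize {
    let mut written = 0;
    for (cell, &ch) in buffer.chunks_exact_mut(2).zip(text) {
        cell[0] = ch; // character byte
        cell[1] = attr; // attribute byte
        written += 1;
    }
    written
}
```

For a simulated screen such as [0u8; 80 * 25 * 2], calling write_bytes(&mut screen, b"Hello, World!", 0x0F) fills the first 13 cells with white-on-black text.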
To see the full range of characters and colors, try to iterate through the whole CGA buffer and write the cell index to the characters and attributes.
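A sketch of this experiment, again written against a plain byte slice standing in for the CGA memory (the function name fill_with_cell_index is an assumption; the kernel version would write through the pointer at 0xB8000 with write_volatile):

```rust
/// Fills every two-byte cell with its own index, used both as the character
/// and as the attribute. The index is truncated to eight bits, so the
/// pattern repeats every 256 cells.
pub fn fill_with_cell_index(buffer: &mut [u8]) {
    for (index, cell) in buffer.chunks_exact_mut(2).enumerate() {
        let value = index as u8; // wraps around after 255
        cell[0] = value; // character: index into the CGA font
        cell[1] = value; // attribute: foreground and background colors
    }
}
```

Because the screen has 2000 cells, every one of the 256 character and attribute values shows up several times.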
The CGA output is continued in the next assignment, where we will implement proper abstractions. You can skip ahead to the full explanation of the CGA output if you want.