rstubs::assignments

Module b2_paging


§BST - A2: Paging

The goal of this task is to extend RStuBS with basic paging functionality, to isolate applications from each other, and to separate kernel and user code. We later want to be able to load an arbitrary number of user applications from an initial RAM disk (initrd).

In assignment 1, we started to isolate user applications from the kernel by limiting the available instructions (using x86 rings). This is, however, not enough: applications can still access and modify kernel memory and each other's code and data. In this assignment, we want to give each application its own address space, preventing it from accessing arbitrary data.

§Paging Basics

Theory...

Paging is a virtual memory mechanism, where the memory is split into evenly sized “pages” (usually 4 KiB). The pages can be mapped in any order into a virtual address space. Compared to segmentation, this mechanism is much more flexible and does not suffer from external fragmentation.

Below is an example of a virtual address space, which is the “view” a userspace application has of the memory. Usually, this “view” is entirely different from the real physical memory layout of the hardware.

The application can only access memory that is “mapped”, meaning that the corresponding “page tables” contain a pointer to a physical page. The page tables define the mapping policy (how virtual addresses are translated to physical addresses). These tables are created by the OS and used by the MMU (Memory Management Unit) of the CPU to translate memory accesses automatically.

For 32-bit x86, Intel uses two levels of page tables. The first table (called the page directory) points to a second level of tables that in turn point to physical pages. To find the physical address, the virtual address is split into three parts: the first part indexes into the page directory to find the corresponding page table, the second part indexes into the page table to find the physical page, and the last part is used as the offset into the physical page. This procedure is called a page table walk and is shown below.
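
The three-part split can be sketched in a few lines. This is a standalone illustration of the 10/10/12-bit layout used by 32-bit x86 paging; the function name is made up for this example:

```rust
/// Split a 32-bit virtual address into its page-table-walk components.
pub fn split_vaddr(vaddr: u32) -> (usize, usize, usize) {
    let pd_index = (vaddr >> 22) as usize;           // bits 31..22: page directory index
    let pt_index = ((vaddr >> 12) & 0x3ff) as usize; // bits 21..12: page table index
    let offset = (vaddr & 0xfff) as usize;           // bits 11..0: byte offset in the page
    (pd_index, pt_index, offset)
}
```

For example, the address 0x0200_1234 (just above the 32 MiB boundary) lands in page directory entry 8, page table entry 1, at offset 0x234.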

Because a page table walk takes a lot of time, the CPU has an extra cache for these translations, the TLB (translation lookaside buffer). Unfortunately, the TLB has to be managed manually: whenever you change the page tables, you have to invalidate the affected TLB entries yourself (e.g., with the invlpg instruction)!

§Questions

  • What is paging? How does it differ from segmentation?
  • Who implements/enforces the paging mechanism?
  • Who defines the policy?

§Physical Memory Management

To dynamically allocate memory for paging, we need a page frame allocator that manages the available physical memory.

Luckily, a (Multiboot-compliant) bootloader already provides the operating system with a list of all available memory ranges not occupied by devices. It is important to note that the memory already occupied by the kernel image and the Multiboot modules (we will come back to this) is not automatically excluded from this list but must be filtered out explicitly. Furthermore, the area up to 1 MiB should not be used, as many devices are mapped there.

  • Warning: The areas specified by Multiboot can overlap and be contradictory. In case of doubt, non-free (reserved) takes precedence.

Since parsing the Multiboot format is a bit quirky, we have already included a multiboot crate for this. The required multiboot pointer is stored in crate::MBOOT_PTR.

We want two separate allocators: one for kernel memory (below 32 MiB) and one for user memory. The Multiboot memory ranges can be used to initialize these kernel and user page allocators. All memory needed for the user applications should later be allocated from the user allocator. We recommend implementing the free-memory bookkeeping with a bitmap. The kernel allocator should be able to allocate up to 4 contiguous pages for the kernel stacks of threads; smaller stacks can lead to stack overflows.
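
A bitmap allocator along these lines could look as follows. This is a minimal sketch, not the RStuBS API: the names are invented, and the Vec-backed storage is for illustration only (in the kernel you would use a fixed-size array, since this allocator must exist before any heap does):

```rust
/// One bit per 4 KiB frame: 1 = used, 0 = free.
pub struct BitmapAlloc {
    bits: Vec<u64>,
    base: usize, // physical address of frame 0
}

impl BitmapAlloc {
    pub fn new(base: usize, frames: usize) -> Self {
        // Start with everything marked "used"; free ranges are added
        // explicitly from the Multiboot memory map.
        Self { bits: vec![u64::MAX; (frames + 63) / 64], base }
    }

    pub fn mark_free(&mut self, frame: usize) {
        self.bits[frame / 64] &= !(1 << (frame % 64));
    }

    /// Allocate `n` contiguous frames (e.g. up to 4 for kernel stacks)
    /// and return the physical address of the first one.
    pub fn alloc(&mut self, n: usize) -> Option<usize> {
        let mut run = 0;
        for frame in 0..self.bits.len() * 64 {
            let used = (self.bits[frame / 64] >> (frame % 64)) & 1 == 1;
            run = if used { 0 } else { run + 1 };
            if run == n {
                let start = frame + 1 - n;
                for f in start..=frame {
                    self.bits[f / 64] |= 1 << (f % 64);
                }
                return Some(self.base + start * 4096);
            }
        }
        None
    }
}
```

The linear scan is simple but O(frames); that is usually acceptable for an exercise kernel, and the bitmap itself stays tiny (one bit per frame).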

§Allocators in Rust

Rust allows defining custom allocators with the global_allocator attribute. Usually, one would implement a “slab” allocator for smaller allocations. However, to make things easier, you can use the page allocator directly, wasting a bit of memory for everything smaller than a page.

#[global_allocator]
static GLOBAL_ALLOC: YourKernelAlloc = YourKernelAlloc;

After doing this, you can use the alloc standard library crate, which contains a lot of useful data structures like Vec, BTreeMap, etc. Just add the following lines to the main.rs file (before the mod ...; statements).

#[macro_use]
extern crate alloc;

§Page Tables

We need sufficient abstractions for the page tables and their entries. These abstractions could also be put into crate::arch::paging, and you can reuse the Physical type in your page table entry bitfield.

Then, we can set up the kernel mapping and page tables. We also need functions to set a page directory in the CR3 register and to activate paging by updating CR0.

  • Detailed information on layout of the page table entries can be found in the Intel manual in Chapter 4.
  • We recommend first setting up only one mapping for the kernel area, handing it over to the CPU, and testing the correct functionality.
  • Pay attention to the IOAPIC and the LAPIC memory.
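
An entry abstraction could start out like the following sketch. The type and flag names are illustrative (not the RStuBS API); the bit positions are the ones documented in the Intel manual for 32-bit paging, with bits 31..12 holding the frame address:

```rust
/// A single 32-bit page table (or page directory) entry.
#[derive(Clone, Copy, PartialEq, Debug)]
#[repr(transparent)]
pub struct PageEntry(u32);

impl PageEntry {
    pub const PRESENT: u32 = 1 << 0;  // entry is valid
    pub const WRITABLE: u32 = 1 << 1; // writes allowed
    pub const USER: u32 = 1 << 2;     // if clear, only ring 0 may access

    pub fn new(frame: u32, flags: u32) -> Self {
        debug_assert_eq!(frame & 0xfff, 0, "frame must be 4 KiB aligned");
        Self(frame | flags)
    }
    pub fn frame(self) -> u32 {
        self.0 & !0xfff // mask off the flag bits
    }
    pub fn is_present(self) -> bool {
        self.0 & Self::PRESENT != 0
    }
}
```

Keeping USER clear on all kernel entries is what makes the identity-mapped kernel region inaccessible from ring 3 later on.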

§Virtual Memory Layout

A virtual address space for an application looks like this:

The memory below the 32 MiB kernel boundary should be identity-mapped in supervisor mode, so that it is only accessible from ring 0. This kernel region is shared between all applications.

The rest of the memory map, between 32 MiB and 4 GiB, contains the user memory. This range is different for every thread and not identity-mapped. The user program code begins directly at the 32 MiB boundary. The user stack lies at the end of the memory map at 0xffff_f000, growing downwards. We leave a single page unmapped behind it because Rust forbids references reaching the very end of the address space, as this breaks bounds checking and some optimizations (similar to accesses through a null pointer).

  • It makes sense to leave the first page (at address 0x0) permanently unmapped, to provoke a page fault when it is accessed.
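
The boundaries above can be collected as constants; the values are taken from the text, the names are illustrative:

```rust
pub const PAGE_SIZE: usize = 4096;
pub const KERNEL_END: usize = 32 << 20;        // 32 MiB kernel boundary
pub const USER_START: usize = 0x0200_0000;     // user code starts here
pub const USER_STACK_TOP: usize = 0xffff_f000; // user stack grows down from here
```

Note that USER_STACK_TOP is exactly one page below the 4 GiB mark, which is the guard page mentioned above.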

§Threads and Scheduler

The scheduler must now manage an arbitrary number of threads loaded from an initial RAM disk (initrd, see below). It is advisable to use a dynamic data structure (from Rust's alloc crate) to manage the threads.

The page directories and page tables should be set up for each application thread with the above-mentioned memory mapping. The kernel page tables can be shared between all page directories. If the dispatcher switches to another process in Scheduler::dispatch, the mapping of the next process must also become active.

If everything works, processes can access neither the memory areas of other processes nor the kernel area that is mapped into every address space. Syscalls and interrupts should work again, and the kernel can easily access the memory of the currently interrupted process.

§New System Calls for Memory Management

Two additional system calls are needed to allow user programs to manage memory dynamically.

pub fn map(len: usize) -> Result<&'static mut [u8], Error>;
pub fn unmap(mem: &'static mut [u8]) -> Result<(), Error>;

The map system call adds new memory to the user application. The requested memory must be large enough to hold the returned [u8] slice and may span multiple pages. For this, the operating system must find a suitable gap in the user address space and insert free pages there. The memory should initially be zeroed. If the allocation succeeds, map returns a reference to the new memory; otherwise, it returns a meaningful error. unmap does the inverse, freeing a previously mapped range. Remember to invalidate any TLB entries for freed pages.

Useful core library functions for slices are from_raw_parts_mut and as_ptr.
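
On the kernel side, the final step of map might look like this sketch: zero the freshly mapped region and hand it out as a slice. The function name is made up for this example, and base/len are assumed to come from your page-table code:

```rust
use core::slice;

/// # Safety
/// `base..base + len` must be mapped, writable, and not aliased anywhere
/// else for the rest of the program ('static).
unsafe fn region_to_slice(base: *mut u8, len: usize) -> &'static mut [u8] {
    base.write_bytes(0, len); // the new memory should initially be zeroed
    slice::from_raw_parts_mut(base, len)
}
```

The 'static lifetime is a promise the kernel has to keep: the pages must stay mapped until the application calls unmap (or exits).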

pub fn exit() -> !;

The exit system call terminates the current process and releases all associated resources. Under no circumstances should this system call return to the application. To validate memory release, the state of free memory management should be displayed. After all applications have ended, exactly as much free memory should be available as before the start of the first application.

§Separation of Kernel and User Code

To more easily isolate the kernel and applications, RStuBS should be compiled separately from the applications. For this, the applications must be moved to a separate crate. In addition, a syscall library should be created, which contains the syscall stubs for the applications. With the help of this library, each application can now be compiled individually without linking directly against the kernel. Additionally, another linker script is required that describes the structure of the executable file. This script should define a meaningful start address (0x0200_0000 for 32MiB); otherwise, it essentially resembles the kernel’s linker script (src/arch/i686/link.ld).

The directory structure might look like this:

rstubs/
    src/
        ...
    syscall/
        src/
            lib.rs
        Cargo.toml
    user/
        src/
            arch/
                link.ld
            bin/
                app1.rs
                app2.rs
        build.rs
        Cargo.toml
    build.rs
    Cargo.toml
  • You can create the new crates with cargo init user and cargo init --lib syscall.

Add a build.rs to the user code that selects the correct linker script (see the top-level build.rs), and add the syscall library as a dependency to user/Cargo.toml. Cargo compiles all Rust files in src/bin as separate binaries (similar to a main.rs). Consequently, every application must have its own start function, which should also be the first element in the text section of the linker script (just add *(.text.start) to the beginning of the text section). Additionally, we need a simple panic handler for each application.

// user/src/bin/foo.rs
#![no_std]
#![no_main]

use core::panic::PanicInfo;

#[no_mangle]
pub extern "C" fn start() -> ! {
    syscall::println!(dbg: "Hi");
    syscall::exit()
}

#[panic_handler]
fn panic(info: &PanicInfo) -> ! {
    syscall::println!(dbg: "{info}");
    syscall::exit();
}

The top-level rstubs crate now has two more dependencies: a build dependency on the user applications and a normal dependency on the syscall library, which contains all data structures shared between kernel and user.

# Cargo.toml
#...

# Tell rust-analyzer that the user bins are part of our project
[workspace]
members = ["user"]

[dependencies]
# ...
syscall = { path = "syscall" }

[build-dependencies]
user = { path = "user", artifact = "bin", target = "target" }

This build dependency tells cargo to build the application binaries and to give us paths to them in the top-level build script, where we can further strip and bundle them.

We aim to bundle the applications into a single initial RAM disk (initrd) that can be loaded with multiboot-compliant bootloaders (GRUB, QEMU…). This is a two-step process: first, we generate so-called “flat” binaries from the individual application ELFs with objcopy (or llvm-objcopy). These are complete memory images of the programs that can be executed directly (without an ELF loader at runtime).

objcopy -O binary --set-section-flags .bss=alloc,load,contents <app> <output>

Second, we want to bundle all applications into a single binary with the following format.

The app-lengths page contains an array with the sizes of all apps in pages. This header is then followed by the application code, which is padded so that every app starts at a multiple of 4 KiB.
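
Bundling along these lines can be sketched as follows. The u32 little-endian length encoding and the implicit zero terminator for unused header entries are assumptions for this example; the text only fixes the general layout:

```rust
const PAGE: usize = 4096;

/// Build an initrd image: one 4 KiB header page with the app sizes
/// (in pages), followed by each app padded to a page boundary.
fn build_initrd(apps: &[&[u8]]) -> Vec<u8> {
    let mut out = vec![0u8; PAGE]; // header page, unused entries stay zero
    for (i, app) in apps.iter().enumerate() {
        let pages = (app.len() + PAGE - 1) / PAGE; // round up to whole pages
        out[i * 4..i * 4 + 4].copy_from_slice(&(pages as u32).to_le_bytes());
        out.extend_from_slice(app);
        let padded = (out.len() + PAGE - 1) / PAGE * PAGE;
        out.resize(padded, 0); // pad so the next app starts page-aligned
    }
    out
}
```

Storing sizes in pages (rather than bytes) keeps the header simple, at the cost of the kernel not knowing the exact byte length of each app.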

These two steps should be added to the top-level build.rs:

fn main() {
    // ...

    // bundle user applications
    // - find all environment variables starting with `CARGO_BIN_FILE_USER_`
    // - flatten the referenced binaries with objcopy

    // create initrd
    // - create a header with the app lengths
    // - add the app code with the correct padding

    // store it into the build directory alongside the kernel
    let target = env::var("TARGET").unwrap();
    let profile = env::var("PROFILE").unwrap();
    let image_path = PathBuf::from_iter(["target", &target, &profile, "initrd"]);
    std::fs::write(image_path, output /* <- your byte buffer */ ).unwrap();
}
  • You can execute other tools from the build script with Command.
  • If you use “-” as the output argument for llvm-objcopy, the result is not written to a file but to stdout, which can be used directly in the build script.

On the kernel side, the multiboot crate helps us parse the multiboot data structures, including the provided modules, which contain the initrd. However, the initrd itself must be parsed manually, and the kernel should be able to load an arbitrary number of apps. To have more space in the kernel area, it is advisable to copy the initrd to a suitable location during initialization and to free the original initrd memory area.
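
Parsing the header could be sketched like this, assuming the same layout conventions as in the build step above (u32 page counts, a zero entry marking the end of the list); names and return type are illustrative:

```rust
const PAGE: usize = 4096;

/// Walk the header page of the initrd and return (offset, length in bytes)
/// for every bundled app.
fn parse_initrd(initrd: &[u8]) -> Vec<(usize, usize)> {
    let mut apps = Vec::new();
    let mut offset = PAGE; // the first app starts right after the header page
    for entry in initrd[..PAGE].chunks_exact(4) {
        let pages = u32::from_le_bytes(entry.try_into().unwrap()) as usize;
        if pages == 0 {
            break; // assumed terminator: unused header entries are zero
        }
        apps.push((offset, pages * PAGE));
        offset += pages * PAGE;
    }
    apps
}
```

With the offsets in hand, loading an app amounts to copying its page-aligned code into freshly allocated user pages and setting up its stack.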

§Checklist

  • Init allocators with free memory from multiboot
  • Set up kernel page tables
  • Load applications from the initrd (copy code to user pages and create stack)
  • Set up a user page directory for every thread (kernel page tables can be shared)
  • Activate paging before running the first app
  • Test: Try writing to kernel memory