The goal of this task is to extend StuBS with basic paging functionality, to isolate applications from each other, and to separate kernel and user code. We later want to be able to load an arbitrary number of user applications from an initial RAM disk (initrd).

In assignment 1, we started to isolate user applications from the kernel by limiting the available instructions (using x86 rings). This is, however, not enough. Applications can still access and modify the kernel memory and the code/data from each other. In this assignment, we want to give each application its own address space, preventing it from accessing arbitrary data.

Paging Basics

Paging is a virtual memory mechanism, where the memory is split into evenly sized "pages" (usually 4 KiB). The pages can be mapped in any order into a virtual address space. Compared to segmentation, this mechanism is much more flexible and does not suffer from external fragmentation.

Down below is an example for a virtual address space, which is the "view" an userspace application has of the memory. Usually, this "view" is entirely different from the real physical memory layout of the hardware.

The application can only access memory, that is "mapped", meaning that there is a pointer to a physical page in the corresponding "page tables". The page tables define the mapping policy (how virtual addresses are translated to physical addresses). These tables are created by the OS and used by the MMU (Memory Management Unit) of the CPU to automatically translate memory accesses.

For x86 32-bit, Intel uses two levels of page tables. The first page table (called page directory) points to a second level of tables, that then point to physical pages. To find the physical address, the virtual address is split into three parts. The first part indexes into the page directory to find the corresponding page table, the second part indexes into the page table to find the physical page and the last part is used as offset into the physical page. This is called a page table walk and shown below.

Because a page table walk takes a lot of time, there is an extra cache on the CPU that caches these translations, the TLB (translation lookaside buffer). Unfortunately, this TLB has to be managed manually. Whenever you change the page tables, you have to manually invalidate TLB entries!

Questions

What is paging? How does it differ from segmentation?
Who implements/enforces the paging mechanism?
Who defines the policy?

Physical Memory Management

To dynamically allocate memory for paging, we need a page frame allocator that manages the available physical memory.

Luckily, a (multiboot compliant) bootloader already provides the operating system with a list of all available memory ranges not occupied by devices. It is important to note that the memory already occupied by the kernel image and multiboot modules (we are coming back to this) are not automatically excluded from this list, but must be explicitly filtered out. The location of the kernel can be found by using the ___KERNEL_START___ and ___KERNEL_END___ symbols (see compiler/sections.ld for details). Furthermore, the area up to 1 MiB should not be used as many devices are mapped there.

Warning: The areas specified by Multiboot can overlap and be contradictory. In case of doubt, non-free (reserved) takes precedence.

Since parsing the Multiboot format is a bit quirky, we have provided you with helpers in the Multiboot namespace. See Multiboot::getMemoryMap() and related.

We want to have two separate allocators for kernel memory (below 32 MiB) and user memory. The multiboot memory ranges can be used to initialize these kernel and user page allocators. Every memory needed for the user applications should later be allocated from the user allocator. We recommend implementing the free memory management using a bitmap. However, you are free to implement other designs as well (linked lists, shadow page tables, ...). The kernel allocator should be able to allocate up to 4 contiguous pages for the kernel stacks of threads. Smaller stacks can lead to stack overflows.

Page Tables

We need sufficient abstractions for the page tables and their entries. Implement as many (sensible) abstractions as you like. It may also be useful to create a wrapper type for physical addresses so you don't confuse them with regular pointers.

Then, we can set up the kernel mapping and page tables. We also need functions to set a page directory in the CR3 register and activate paging by updating CR0.

Detailed information on layout of the page table entries can be found in the Intel manual in Chapter 4.
We recommend first setting up only one mapping for the kernel area, handing it over to the CPU, and testing the correct functionality.
Pay attention to the IOAPIC and the LAPIC memory.

Virtual Memory Layout

A virtual address space for an application looks like this:

The memory below the 32MiB kernel boundary should be identity-mapped in supervisor mode to only be accessible from ring 0. This kernel region is shared between all applications.

The second half of the memory map between 32MiB and 4GiB contains the user memory. This range is different for every thread and not identity-mapped. Directly at the 32MiB boundary is the beginning of the user program code. The user stack is at the end of the memory map at 0xffff_f000, growing downwards. We leave a single page free above the stack because C/C++ forbids accessing the end of the memory map, as this breaks bounds checking and some optimization (similar to the access to a null pointer).

It makes sense to permanently leave the first page (from address 0x0) unmaped (to provoke a page fault when dereferencing a nullptr).

Threads and Scheduler

The scheduler must now manage an arbitrary number of threads loaded from an initial RAM disk (initrd, see below).

The page directories and page tables should be set up for each application thread with the above-mentioned memory mapping. The kernel page tables can be shared between all page directories. If the dispatcher switches to another process in Dispatcher::go() or Dispatcher::dispatch(), the mapping of the next process must also become active.

If everything works, processes can neither access the memory areas of other processes nor the identity-mapped kernel area. Syscalls and interrupts should work again, and the kernel can easily access the memory of the currently interrupted process.

New System Calls for Memory Management

Two additional system calls are needed to allow user programs to manage memory dynamically.

void* map(size_t size);

int unmap(void* start, size_t size);

The map system call adds size new memory to the user application. For this, the operating system must find a suitable gap in the user address space and insert free pages there. The memory should initially be zeroed out. If the allocation succeeds, map should return a pointer to the new memory, and otherwise a meaningful error code. unmap does the inverse, freeing a previously mapped range. Remember to invalidate any TLB entries for freed pages. To simplify the allocator implementation, we restrict size to be multiples of the page size (4KiB) and start to be page-aligned.

void* exit()

The exit system call terminates the current process and releases all associated resources. Under no circumstances should this system call return to the application. To validate memory release, the state of free memory management should be displayed. After all applications have terminated, exactly as much free memory should be available as before the start of the first application.

Separation of Kernel and User Code

To more easily isolate the kernel and applications, StuBS should be compiled separately from the applications. For this, the applications must be moved to a separate folder/binary. In addition, a syscall library should be created, which contains the syscall stubs for the applications. With the help of this library, each application can now be compiled individually without linking directly against the kernel. Additionally, a linker script (sections.ld) is required that describes the structure of the executable file. This script should define a meaningful start address (0x0200_0000 for 32MiB); otherwise, it essentially resembles the kernel's linker script (kernel/compiler/sections.ld). In order to call the constructors of global objects in the application, you should link the supplied init.cc into each application. For easier development, you should automate the build/link process of applications with make.

The directory structure might look like this:

stubs/
    kernel/
        ...
    libsys/
        syscalls.{h/cc}
        shared_code.{h/cc}
        Makefile
    user/
        sections.ld
        app1/
            main.cc
        app2/
            main.cc
        init.cc
        Makefile

Since applications are now completely separated from the Thread class, they need their own entry point. Just as your average C applications we will utilize a main() function:

// your_app.cc

extern "C" void main() {

// Application code

}

We aim to bundle the applications into a single initial RAM disk (initrd) that can be loaded with multiboot-compliant bootloaders (GRUB, QEMU...). This is a two-step process: first, we generate so-called "flat" binaries from the individual application ELFs with objcopy (or llvm-objcopy). These are complete memory images of the programs that can be executed directly (without an ELF loader at runtime).

objcopy -O binary --set-section-flags .bss=alloc,load,contents <app> <output>

Secondly, we want to bundle all applications into a single binary with the following format.

The app lengths page contains a large array with the sizes of all apps in pages. This header is then followed by the application code, which is padded so that every app starts at a multiple of 4KiB. The supplied tool imgbuilder.cc takes over the task of building the image; interpreting the initrd must then done at runtime (by you). The memory location and size of the loaded initrd is provided in the Multiboot information (see Multiboot::getModule()). To have more space in the kernel area, it is advisable to copy the initrd to a suitable location during initialization, and free the initrd memory area.

Checklist

Init allocators with free memory from multiboot (exluding kernelimage and initrd memory)
Setup kernel page tables
Load applications from the initrd (copy code to user pages and create stack)
Setup a user page directory for every thread (kernel page tables can be shared)
Activate paging before running the first app
Test: Try writing to kernel memory