StuBS
|
The goal of this task is to extend StuBS with basic paging functionality, to isolate applications from each other, and to separate kernel and user code. We later want to be able to load an arbitrary number of user applications from an initial RAM disk (initrd).
In assignment 1, we started to isolate user applications from the kernel by limiting the available instructions (using x86 rings). This is, however, not enough. Applications can still access and modify the kernel memory and the code/data from each other. In this assignment, we want to give each application its own address space, preventing it from accessing arbitrary data.
Paging is a virtual memory mechanism, where the memory is split into evenly sized "pages" (usually 4 KiB). The pages can be mapped in any order into a virtual address space. Compared to segmentation, this mechanism is much more flexible and does not suffer from external fragmentation.
Down below is an example for a virtual address space, which is the "view" an userspace application has of the memory. Usually, this "view" is entirely different from the real physical memory layout of the hardware.
The application can only access memory, that is "mapped", meaning that there is a pointer to a physical page in the corresponding "page tables". The page tables define the mapping policy (how virtual addresses are translated to physical addresses). These tables are created by the OS and used by the MMU (Memory Management Unit) of the CPU to automatically translate memory accesses.
For x86 32-bit, Intel uses two levels of page tables. The first page table (called page directory) points to a second level of tables, that then point to physical pages. To find the physical address, the virtual address is split into three parts. The first part indexes into the page directory to find the corresponding page table, the second part indexes into the page table to find the physical page and the last part is used as offset into the physical page. This is called a page table walk and shown below.
Because a page table walk takes a lot of time, there is an extra cache on the CPU that caches these translations, the TLB (translation lookaside buffer). Unfortunately, this TLB has to be managed manually. Whenever you change the page tables, you have to manually invalidate TLB entries!
To dynamically allocate memory for paging, we need a page frame allocator that manages the available physical memory.
Luckily, a (multiboot compliant) bootloader already provides the operating system with a list of all available memory ranges not occupied by devices. It is important to note that the memory already occupied by the kernel image and multiboot modules (we are coming back to this) are not automatically excluded from this list, but must be explicitly filtered out. The location of the kernel can be found by using the ___KERNEL_START___
and ___KERNEL_END___
symbols (see compiler/sections.ld
for details). Furthermore, the area up to 1 MiB should not be used as many devices are mapped there.
Since parsing the Multiboot format is a bit quirky, we have provided you with helpers in the Multiboot
namespace. See Multiboot::getMemoryMap()
and related.
We want to have two separate allocators for kernel memory (below 32 MiB) and user memory. The multiboot memory ranges can be used to initialize these kernel and user page allocators. Every memory needed for the user applications should later be allocated from the user allocator. We recommend implementing the free memory management using a bitmap. However, you are free to implement other designs as well (linked lists, shadow page tables, ...). The kernel allocator should be able to allocate up to 4 contiguous pages for the kernel stacks of threads. Smaller stacks can lead to stack overflows.
We need sufficient abstractions for the page tables and their entries. Implement as many (sensible) abstractions as you like. It may also be useful to create a wrapper type for physical addresses so you don't confuse them with regular pointers.
Then, we can set up the kernel mapping and page tables. We also need functions to set a page directory in the CR3
register and activate paging by updating CR0
.
A virtual address space for an application looks like this:
The memory below the 32MiB kernel boundary should be identity-mapped in supervisor mode to only be accessible from ring 0. This kernel region is shared between all applications.
The second half of the memory map between 32MiB and 4GiB contains the user memory. This range is different for every thread and not identity-mapped. Directly at the 32MiB boundary is the beginning of the user program code. The user stack is at the end of the memory map at 0xffff_f000
, growing downwards. We leave a single page free above the stack because C/C++ forbids accessing the end of the memory map, as this breaks bounds checking and some optimization (similar to the access to a null pointer).
0x0
) unmaped (to provoke a page fault when dereferencing a nullptr
).The scheduler must now manage an arbitrary number of threads loaded from an initial RAM disk (initrd, see below).
The page directories and page tables should be set up for each application thread with the above-mentioned memory mapping. The kernel page tables can be shared between all page directories. If the dispatcher switches to another process in Dispatcher::go()
or Dispatcher::dispatch()
, the mapping of the next process must also become active.
If everything works, processes can neither access the memory areas of other processes nor the identity-mapped kernel area. Syscalls and interrupts should work again, and the kernel can easily access the memory of the currently interrupted process.
Two additional system calls are needed to allow user programs to manage memory dynamically.
The map
system call adds size
new memory to the user application. For this, the operating system must find a suitable gap in the user address space and insert free pages there. The memory should initially be zeroed out. If the allocation succeeds, map
should return a pointer to the new memory, and otherwise a meaningful error code. unmap
does the inverse, freeing a previously mapped range. Remember to invalidate any TLB entries for freed pages. To simplify the allocator implementation, we restrict size
to be multiples of the page size (4KiB) and start
to be page-aligned.
The exit
system call terminates the current process and releases all associated resources. Under no circumstances should this system call return to the application. To validate memory release, the state of free memory management should be displayed. After all applications have terminated, exactly as much free memory should be available as before the start of the first application.
To more easily isolate the kernel and applications, StuBS should be compiled separately from the applications. For this, the applications must be moved to a separate folder/binary. In addition, a syscall library should be created, which contains the syscall stubs for the applications. With the help of this library, each application can now be compiled individually without linking directly against the kernel. Additionally, a linker script (sections.ld
) is required that describes the structure of the executable file. This script should define a meaningful start address (0x0200_0000
for 32MiB); otherwise, it essentially resembles the kernel's linker script (kernel/compiler/sections.ld
). In order to call the constructors of global objects in the application, you should link the supplied init.cc
into each application. For easier development, you should automate the build/link process of applications with make
.
The directory structure might look like this:
Since applications are now completely separated from the Thread
class, they need their own entry point. Just as your average C applications we will utilize a main()
function:
We aim to bundle the applications into a single initial RAM disk (initrd) that can be loaded with multiboot-compliant bootloaders (GRUB, QEMU...). This is a two-step process: first, we generate so-called "flat" binaries from the individual application ELFs with objcopy
(or llvm-objcopy
). These are complete memory images of the programs that can be executed directly (without an ELF loader at runtime).
Secondly, we want to bundle all applications into a single binary with the following format.
The app lengths page contains a large array with the sizes of all apps in pages. This header is then followed by the application code, which is padded so that every app starts at a multiple of 4KiB. The supplied tool imgbuilder.cc
takes over the task of building the image; interpreting the initrd must then done at runtime (by you). The memory location and size of the loaded initrd
is provided in the Multiboot information (see Multiboot::getModule()
). To have more space in the kernel area, it is advisable to copy the initrd to a suitable location during initialization, and free the initrd memory area.