StuBS
|
In the last assignment, you implemented paging for memory isolation. Consequently, applications can no longer access each other's memory. Unfortunately, this also prevents them from sending any data to other applications.
We will change that in this assignment with safe and explicit interprocess communication. Additionally, we introduce the fork syscall, allowing applications to create new copies of themselves. The final task will be optimizing both IPC and fork with copy-on-write (COW).
Making this communication safe requires going through the kernel, which can reliably check the sent data and recipients. In RStuBS, the IPC mechanism follows the send-receive-reply
procedure:
![]()
The
send
function blocks until the correspondingreply
was executed, and similarly thereceive
function blocks until asend
for this process was issued. Thereply
function wakes up the sender does not block.
This mechanism allows sending data in both directions. The send function specifies two buffers: one for the message
to be sent to the recipient and a second mutable buffer where the response
should be stored. Accordingly, the receive function must specify an appropriate mutable buffer in which the message
is then copied. And lastly, the reply also takes another buffer for the response
, which is copied into the provided response
buffer from the send function.
We introduce four new syscalls that implement this send-receive-reply
procedure.
The fork syscall clones an application, including its address space and current instruction pointer. It returns twice: at the parent process, that called fork in the first place, and at the newly created child process. Other than UNIX, we do not create a process hierarchy in StuBS. Instead, fork just returns the pid of the other process: for the newly created process, this is the pid of the old process and vice versa.
And here is the corresponding syscall:
fork
with the appropriate pid. How?In the second half of this assignment, we want to reduce the number of (costly) copies for both IPC and fork. A common strategy to do so is the copy-on-write mechanism, which is implemented by most general-purpose OSs (Linux, Darwin, Windows). This mechanism shares the underlying data pages between processes and copies them lazily when a process modifies them for the first time. This hides most of the initial copy delay, and it turns out that, especially for fork, most of the memory does not have to be copied at all. Note that you can only COW whole pages. IPC buffers are not required to be page aligned/sized, so you might still have to copy parts at the start/end of the buffers.
The concrete implementation looks like this:
There are two ways to invalidate TLB entries: the first option is to flush all entries by writing to the CR3 register, and the alternative is the invlpg
instruction that invalidates a single entry.
There are a lot of data structure that can be used for storing the reference counters for each (COWed) page frame. Here we discuss a few examples, you can choose whatever you like. Generally we want to have a mapping from the physical page address to the corresponding reference counter.
Shadow Page Tables: Usually, page tables implement the mapping from virtual to physical address (including access rights). However, this tree data structure can also be used for other purposes. It can store different data instead of physical addresses, like reference counters. In this case, the page tables would not be given to the MMU. They are purely used as a map data structure (like a hash map). As a nice benefit, you probably could reuse the logic for creating and traversing them from your existing page directory/table abstractions. Also note, that you would use the physical address to walk the shadow page tables, not the virtual.
Typical Page Directory:
Shadow Page Directory:
Struct Page: Linux and other OSs commonly use a large array of entries for every physical page frame static FRAMES: [Frame; 1<<20]
. These arrays can be quickly indexed with physical_address / 4096
and store reference counters and other OS specific flags. The downside of this approach is, that these entries quickly become filled with a lot of unrelated data and difficult to maintain. Here you might need atomic ref counters, depending on your chosen implementation/semantics.
The following application is a stress test for your COW implementation. It checks the general functionality, including a few edge cases.
Here, we split our application into four pairs. Within each pair, the child sends a message (including its pid) to the parent, who waits for this message, modifies it, and sends it back to the child.
This test should print four lines of two identical letters similar to the following:
The order of lines could differ depending on the timing.
Finally, demonstrate the correctness of your implementation by printing the number of allocated page frames in the user and kernel area respectively:
main