Transparent Huge Pages for Virtual-Memory Objects

Figure: Multiple page sizes coexist within the same memory object. [Generated with AI]

Context

The rapid growth of main memory capacities poses significant challenges for both hardware engineers and software developers. In particular, efficiently provisioning large virtual address spaces requires substantial effort. Larger address spaces need additional page table levels, which increase address translation overhead. Hardware tries to hide the added latency with aggressive prefetching and complex cache hierarchies.

One strategy to alleviate pressure on the memory management unit is the introduction of huge and giant pages. These larger pages increase the granularity of the virtual-to-physical address mapping from 4 KiB to 2 MiB or 1 GiB, effectively omitting lower-level page table entries. As a result, fewer steps are required to traverse the page tables and translation lookaside buffer (TLB) coverage increases, leading to fewer TLB misses. Although this approach is effective from a hardware perspective, it demands sophisticated software support to realize its benefits. Using multiple page sizes simultaneously in a system requires robust fragmentation avoidance mechanisms. Moreover, the computational overhead of active memory compaction can easily outweigh the advantages of this approach.
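The arithmetic behind these numbers can be sketched in a few lines of C. The page sizes are the ones named above. The 48-bit virtual address width with 9 index bits per page table level is an assumption that matches common 4-level x86-64 paging, and the helper names are purely illustrative.

```c
#include <stdint.h>

/* Page sizes named in the text. */
#define PAGE_SIZE_4K (UINT64_C(1) << 12) /* 4 KiB */
#define PAGE_SIZE_2M (UINT64_C(1) << 21) /* 2 MiB */
#define PAGE_SIZE_1G (UINT64_C(1) << 30) /* 1 GiB */

/* Assumption: 48-bit virtual addresses, 9 index bits per level. */
#define VA_BITS        48u
#define BITS_PER_LEVEL 9u

/* Memory covered by a TLB holding `entries` entries of one page size. */
static uint64_t tlb_reach(uint64_t entries, uint64_t page_size)
{
    return entries * page_size;
}

/* Number of page table levels walked for a mapping of `page_size`. */
static unsigned walk_levels(uint64_t page_size)
{
    unsigned offset_bits = 0;

    /* The page offset consumes log2(page_size) address bits. */
    while ((UINT64_C(1) << offset_bits) < page_size)
        offset_bits++;

    /* Every remaining group of 9 bits is resolved by one level. */
    return (VA_BITS - offset_bits) / BITS_PER_LEVEL;
}
```

With a hypothetical 64-entry TLB, `tlb_reach` yields 256 KiB of coverage for 4 KiB pages but 128 MiB for 2 MiB pages, and `walk_levels` drops from 4 to 3 to 2 as the page size grows.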

In contrast, Morsels address the memory challenge from an orthogonal perspective. Rather than focusing on address translation overhead, we targeted the substantial bookkeeping overhead associated with managing vast amounts of memory as numerous small 4 KiB pages. Technically, a Morsel represents a subtree of the page table hierarchy that is managed as an indivisible object, enabling fast transfer between address spaces.

Problem

The Morsel concept has shown promising results in terms of memory management, but it still has some room for improvement regarding translation overhead. An initial implementation introduced huge and giant page support to Morsels, but this was done in an all-or-nothing manner.

While larger page sizes can reduce address translation overhead and enhance application performance, the actual benefits depend on the access pattern and data layout. For sparsely used memory ranges, internal fragmentation and the resulting memory waste can outweigh the potential performance gains. Therefore, it is essential to use huge pages only where they provide advantages, which often requires a finer-grained decision that cannot be made at the object level.
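To make the cost of internal fragmentation concrete, the following sketch computes how much physical memory is wasted when a sparsely touched range is backed by a single large page. The function name and the idea of counting touched 4 KiB pages are illustrative assumptions, not part of the Morsel implementation.

```c
#include <stdint.h>

#define BASE_PAGE_SIZE (UINT64_C(1) << 12) /* 4 KiB */
#define HUGE_PAGE_SIZE (UINT64_C(1) << 21) /* 2 MiB */

/*
 * Bytes allocated but never used when a range in which only
 * `touched_base_pages` 4 KiB pages are accessed is backed by one
 * page of `large_page_size` bytes.
 */
static uint64_t wasted_bytes(uint64_t large_page_size,
                             uint64_t touched_base_pages)
{
    uint64_t used = touched_base_pages * BASE_PAGE_SIZE;

    /* A fully used (or over-counted) range wastes nothing. */
    return used >= large_page_size ? 0 : large_page_size - used;
}
```

For example, if an application touches only 8 of the 512 base pages in a 2 MiB range, a huge page leaves 2,064,384 bytes (about 98 % of the page) unused.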

Goal

The primary objective of this thesis is to implement fine-grained page size selection after Morsel creation, enabling dynamic adjustment of page sizes based on application needs. To achieve this, we propose introducing a new system call that allows specifying the page size for a given memory range. If the virtual memory range is not yet backed by physical memory, the specified page size will be used for population when necessary. Otherwise, existing memory pages should be promoted or demoted depending on the requested page size.
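One possible shape of such an interface is sketched below. Everything here is hypothetical: the structure layout, the command number, the `morsel_` names, and the requirement that offset and length be aligned to the requested page size are assumptions made for illustration. Only the `_IOW` macro is the standard Linux helper for defining ioctl command numbers.

```c
#include <stdint.h>
#include <sys/ioctl.h>

/* Hypothetical page size selector: 0 = 4 KiB, 1 = 2 MiB, 2 = 1 GiB. */
enum morsel_page_order { MORSEL_PAGE_4K, MORSEL_PAGE_2M, MORSEL_PAGE_1G };

/* Hypothetical request: byte range within the Morsel plus page size. */
struct morsel_page_size_req {
    uint64_t offset;
    uint64_t length;
    uint32_t order;    /* enum morsel_page_order */
    uint32_t reserved; /* keeps the layout free of implicit padding */
};

/* Hypothetical command; type character and number are placeholders. */
#define MORSEL_IOC_SET_PAGE_SIZE _IOW('M', 0x42, struct morsel_page_size_req)

enum morsel_action {
    MORSEL_ACT_NONE,    /* already mapped with the requested size */
    MORSEL_ACT_RECORD,  /* not populated: remember size for later */
    MORSEL_ACT_PROMOTE, /* populated with smaller pages */
    MORSEL_ACT_DEMOTE   /* populated with larger pages */
};

static uint64_t morsel_page_bytes(enum morsel_page_order order)
{
    /* 12 offset bits for 4 KiB, 9 more bits per larger size. */
    return UINT64_C(1) << (12 + 9 * (unsigned)order);
}

/* Assumed rule: the range must consist of whole pages of that size. */
static int morsel_req_aligned(const struct morsel_page_size_req *req)
{
    uint64_t size = morsel_page_bytes((enum morsel_page_order)req->order);

    return req->length != 0 &&
           req->offset % size == 0 && req->length % size == 0;
}

/* The case distinction from the paragraph above. */
static enum morsel_action
morsel_choose_action(int populated, enum morsel_page_order current,
                     enum morsel_page_order requested)
{
    if (!populated)
        return MORSEL_ACT_RECORD;
    if (requested > current)
        return MORSEL_ACT_PROMOTE;
    if (requested < current)
        return MORSEL_ACT_DEMOTE;
    return MORSEL_ACT_NONE;
}
```

A caller would fill in the request and pass it to `ioctl()` on the Morsel's file descriptor. Whether misaligned ranges are rejected or rounded is a design decision left open here.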

To ensure seamless integration, this operation must work completely transparently, allowing other threads to concurrently access the Morsel object without interruption. Furthermore, to maintain the self-contained nature of Morsels, the requested page size should be encoded into the unused bits of the page table entries, eliminating the need for additional metadata.
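A minimal user-space sketch of this encoding is shown below. On x86-64, hardware ignores a handful of high bits in a present page table entry, but which of them are actually free in a given kernel is not specified here, so the bit positions (52 and 53), the hint values, and all function names are placeholders. The compare-and-exchange loop uses C11 atomics to illustrate how an update can avoid losing concurrent changes to other bits of the entry, such as hardware-set accessed and dirty flags. Kernel code would use its own atomic primitives instead.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumption: two software-available bits at positions 52 and 53. */
#define PTE_HINT_SHIFT 52
#define PTE_HINT_MASK  (UINT64_C(3) << PTE_HINT_SHIFT)

/* Hypothetical encoding of the requested page size. */
enum page_size_hint { HINT_NONE, HINT_4K, HINT_2M, HINT_1G };

/* Return `pte` with its hint field replaced, all other bits kept. */
static uint64_t pte_with_hint(uint64_t pte, enum page_size_hint hint)
{
    return (pte & ~PTE_HINT_MASK) | ((uint64_t)hint << PTE_HINT_SHIFT);
}

/* Extract the hint stored in `pte`. */
static enum page_size_hint pte_hint(uint64_t pte)
{
    return (enum page_size_hint)((pte & PTE_HINT_MASK) >> PTE_HINT_SHIFT);
}

/* Store a hint without overwriting concurrent updates to other bits. */
static void pte_store_hint(_Atomic uint64_t *ptep, enum page_size_hint hint)
{
    uint64_t old = atomic_load(ptep);

    /* On failure `old` is refreshed with the current value; retry. */
    while (!atomic_compare_exchange_weak(ptep, &old,
                                         pte_with_hint(old, hint)))
        ;
}
```

Because the hint lives inside the entry itself, transferring the page table subtree to another address space carries the requested page size along without any side table.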

The thesis will follow these steps:

  1. Getting started: Familiarize yourself with kernel development, set up a suitable development environment, and establish a functional test setup.
  2. Add system call: Add an ioctl command to the existing interface for page size specification.
  3. Implementation 1: Implement the ahead-of-time page size selection for non-populated memory ranges.
  4. Implementation 2: Extend the implementation with promotion/demotion of already memory-backed ranges.
  5. Evaluation: Evaluate the implementation with microbenchmarks that target the new system call in different settings and with application-level benchmarks that highlight the benefits of mixed page sizes.
  6. Optional: Extend the system call to automatically select the optimal page size based on sparsity and memory usage.
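For the optional last step, one conceivable starting point is a simple utilization threshold per 2 MiB range. The sketch below is only an illustration of that idea. The function name, the use of touched 4 KiB pages as the sparsity measure, and the threshold parameter are all assumptions, and a real policy would likely weigh further factors such as overall memory pressure.

```c
/* Number of 4 KiB base pages in one 2 MiB huge page. */
#define BASE_PAGES_PER_2M 512u

/*
 * Return 1 if a 2 MiB range in which `touched_base_pages` base pages
 * are in use reaches the utilization `min_percent` (0..100), i.e. a
 * huge page is considered worthwhile; return 0 otherwise.
 */
static int prefer_huge_page(unsigned touched_base_pages,
                            unsigned min_percent)
{
    /* Integer comparison of touched/512 >= min_percent/100. */
    return touched_base_pages * 100u >= BASE_PAGES_PER_2M * min_percent;
}
```

With a hypothetical threshold of 90 %, a range needs at least 461 touched base pages before it qualifies for a huge page.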

Topics: C, Linux kernel, huge pages, virtual memory, paging

References

Publications

DIMES Workshop
Morsels: Explicit Virtual Memory Objects
Alexander Halbuer, Christian Dietrich, Florian Rommel, Daniel Lohmann. Proceedings of the 1st Workshop on Disruptive Memory Systems. Association for Computing Machinery, 2023.
DOI: 10.1145/3609308.3625267

Huge Pages for Virtual-Memory Objects

Currently, Morsels only support 4-KiB pages, but with larger page sizes the management overhead could be further reduced and the average memory access could be sped up, due to faster page-table walks and an increased TLB coverage.

 
Type
Bachelor's thesis

 
Status
completed

 
Supervisors
Alexander Halbuer
Daniel Lohmann

 
Project
ParPerOS

 
Student
Marko Bolowski (submitted: 18 Mar 2024)