Inception of Virtual Memory

Understanding the computer hardware and how software is designed to manage the hardware is gonna be the most fascinating experience you will ever have as a software engineer or a programmer.

I am always curious about the operating system and I consider it one of the most complex software the mankind has ever produced. It was not built in a day but has been shaped with incredible genius pieces that stich together to mange the hardware and supervision the programs we ran on it. Day by day is is growing complex.

Today we will understand one such genius piece of the operating system called the Virtual Memory.

History

Directly jumping to the definition of Virtual Memory may not help us understand it’s elegance. But let’s see when it was introduced.

  • The first Unix release with virtual memory was the 3BSD implementation by researchers at Berkeley, released in 1979.
  • For Microsoft, Windows NT marked the introduction of virtual memory in 1992.
  • For the MAC, implementation of virtual memory began with the move from OS 9 to OS X in 2002.

Did you saw that? 1979!! Even the first implementation dates back to an era when the we did not even have the general purpose operating systems we have today. The Atlas Computer of 1962 was the first system with a virtual memory. The research papers on virtual memory dates back to 1950s.

Memory Challenges

The job of the operating system is to run as much processes as possible and each process needs memory to execute. But the physical memory is limited. So the operating system face challenges like:

  • Deciding how much memory to allocate to each process?
  • Avoid process A to access the memory of process B?
  • Manage fragmented memory (i.e. required memory is available but it’s not contagious).
  • Allocate memory to active processes but the required memory is more than the total available physical memory?

So there are special challenges when the load on the CPU increases, the process may run out of memory and it is subjected to corruption due to invasion of memory boundaries between processes.

Concept of Virtual Memory

The idea of virtual memory is simple. It tries to create virtual address space that doesn’t corresponds to real address in the physical memory. It is an abstraction created by the operating system to manage memory efficiently and with fewer errors.

Using this concept, the OS creates an illusion for the process that it has enough continuos memory blocks available even when the RAM is limited. This allows the OS to run large process or more process at the same time without running out of memory, making it easy to share memory between process and avoid memory faults.

The problems we above can be solved if we can somehow allocate each process it’s own memory (virtual) as and when required and then map those individual allocated memories to the real physical memory or the RAM. And that’s where the concept of virtualization comes into the picture.

Virtual memory is one of the great ideas in computer systems because it works automatically without the application programmer’s headache to understand the internals. Most of the programmers never need to worry about the concept of virtual memory but it is good to have knowledge and is essential for system programmers.

Understanding the Implementation

Consider we are having a single CPU. So when virtualization of memory was not there, and we want to run multiple processes, the OS manually creates the boundaries and will assign pieces of the physical memory to each process.
Now when the physical memory is fully occupied, the incoming process will not be executed and will get Out of memory error.

So now as stated above, suppose that we can allocate individual memory addresses to each process dynamically on the fly.

Process Memory Address (KB)
Process A 100 - 200
Process B 400 - 800
Process C 150 - 190
Process D 801 - 1024

But the catch here is that our RAM is only 512 KB! Now we now need a mechanism to map the 1024 KB of virtual memory to the 512 KB of RAM.

With virtualization, we have a new layer between the CPU and the RAM, with some additional hardware support called the Memory Management Unit also called the MMU. The MMU contains the Address Translator which converts the virtual address to the physical location. The OS will use the MMU components to carry out the mappings and there’s a supporting data structure dedicated to this which is called as Page Table.

The Page Table is the bridge between the Virtual Memory and the RAM.

The virtual memory island has it’s own terminologies which are actually similar to the physical memory. This is because when virtualization was pitched the semiconductor memory was not invented or coined.

Virtual Memory in Action

Note: The virtual memory is divided into chunks (ideally 4KB) called as pages. While the RAM is split into frames. For simplification let’s assume that the frames and pages are 1 KB each.

Taking the same scenario above, our RAM is of 512 KB and our virtual memory is 1024 KB. We can understand the working by looking at two distinct scenarios:

  1. Underload: Memory requirement is less than the available RAM
  2. Overload: Memory requirement is more that the available RAM

We will not encounter any issues when the CPU is underload. We will focus on overload scenario.

Page Table

The page table is the data structure that stores the mappings of the pages with the RAM and the Disk. It also has a bit field (0 or 1) to identify if the page is currently mapped to RAM or not.

Here’s how it will look like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
            PAGE TABLE (1024 ENTRIES)
--------------------------------------------------
| Virtual Page | Present | Physical Frame / Disk |
--------------------------------------------------
| VP0 | 1 | PF10 |
| VP1 | 1 | PF11 |
| VP2 | 1 | PF12 |
| VP3 | 1 | PF13 |
| ... | ... | ... |
| VP511 | 1 | PF300 |
---------------------------------------------------
| VP512 | 0 | DB 20 |
| VP513 | 0 | DB 31 |
| VP514 | 0 | DB 55 |
| ... | ... | ... |
| VP1023 | 0 | DB 100 |
--------------------------------------------------

In the above, VP is virtual page, PF is physical frame and DB is the disk block. You might know about the swap memory while disk partition of debian linux installation (like Ubuntu). This swap memory on the disk is where the OS keeps the excess Virtual Pages.

Page Resolution Steps

  1. When a virtual address is encountered, the CPU consults the MMU which talks to the Page Table to check if the VP is present in PF.
    • If yes then which PF it is?
    • If not, then where it is on the disk?
  2. Page Fault is triggered when a VP is not present in the RAM. Suppose a process is trying to access VP512 (from the diagram it’s not in RAM but is stored in the disk)
    • CPU raises an exception (PageFault), the OS handles this exception
    • The OS will try to evict unused VP from the RAM (let’s say VP1)
    • If VP1 was modified then it will be stored on the disk
    • The OS will load VP512 in the RAM and it will update the Page Table.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
            PAGE TABLE (1024 ENTRIES)
--------------------------------------------------
| Virtual Page | Present | Physical Frame / Disk |
--------------------------------------------------
| VP0 | 1 | PF10 |
| VP512 | 1 | PF11 | <- updated entry for VP512
| VP2 | 1 | PF12 |
| VP3 | 1 | PF13 |
| ... | ... | ... |
| VP511 | 1 | PF300 |
---------------------------------------------------
| VP1. | 0 | DB 20 |
| VP513 | 0 | DB 31 |
| VP514 | 0 | DB 55 |
| ... | ... | ... |
| VP1023 | 0 | DB 100 |
--------------------------------------------------

Page Replacement Algorithms

Page replacement algorithms are methods the operating system uses to decide which page to remove from RAM when RAM is full and a page fault occurs.

Here’s the most commonly known and used page replacement algorithms.

Algorithm Working Usages
FIFO Evict the oldest loaded page Rare
LRU Evict the least recently used page Too expensive
LRU Approximation Simulate LRU using reference bits, aging Common
Second Chance FIFO & reference bit for a second chance Common
CLOCK Circular list; evict first with ref=0 Very Common
Enhanced CLOCK CLOCK & dirty/reference bits Very Common
CLOCK-Pro Tracks hot & cold pages; advanced CLOCK Common in modern kernels
WSClock CLOCK & working set info Common in Linux Virtual Memory
LFU Evict least frequently used page Rare
MFU Evict most frequently used page Rare
Random Evict a random page Uncommon but simple
Optimal (OPT) Evict page used farthest in the future Not used

Conclusion

I have barely scratched the surface and there’s lot more to it like address translation, multi-level page table, swap memory, DMAs, and segmentation, locality, TLBs. For now I am taking a pause here but I would advice programmers, software engineers and CSE undergrads to study the operating systems in detail. For fun, profit and enlightenment.

In future I will try to add more to it especially multi-layer page tables and related case studies.

References