Virtual Memory (CSCI 444/544 Operating Systems, Fall 2008)
Agenda
• What is virtual memory
  – rationales
  – intuition
• Working mechanism of virtual memory
  – demand paging
  – page fault
• Virtual memory policies
  – page selection
  – page replacement
What is Virtual Memory
• OS goal: support processes when there is not enough physical memory
  – Single process with a very large address space
  – Multiple processes with combined address spaces
• Virtual memory: OS provides the illusion of more physical memory
  – User code should be independent of the amount of physical memory
  – Only part of the program needs to be in memory for execution
Virtual Memory
The logical address space can therefore be much larger than the physical address space
Rationales behind virtual memory:
- locality of reference
- memory hierarchy (backing store, i.e., disk)
Paging makes virtual memory possible at fine grain:
- demand paging: make a physical instance of a page in memory only when needed
Locality of Reference
Leverage locality of reference within processes
• Spatial: reference memory addresses near previously referenced addresses
• Temporal: reference memory addresses that have been referenced in the past
• Processes spend the majority of their time in a small portion of code
  – Estimate: 90% of time in 10% of code
Implication:
• A process only uses a small amount of its address space at any moment
• Only a small amount of the address space must be resident in physical memory
Backing Store
Leverage the memory hierarchy of the machine architecture
Each layer acts as a “backing store” for the layer above
• With virtual memory, the whole address space of each process has a copy in the backing store (i.e., disk)
• Consider that the whole program actually resides on disk; only part of it is cached in memory
• A valid-invalid bit is associated with each page table entry
  – 1 => in memory, 0 => not in memory or invalid logical page
Virtual Memory Intuition
Idea: OS keeps unreferenced pages on disk
• Slower, cheaper backing store than memory
A process can run even when not all of its pages are loaded into main memory
OS and hardware cooperate to provide the illusion of a large disk as fast as main memory
• Same behavior as if all of the address space were in main memory
• Hopefully with similar performance
Requirements:
• OS must have a mechanism to identify the location of each page of the address space, in memory or on disk
• OS must have a policy for determining which pages live in memory and which on disk
Demand Paging
Bring a page into memory only when it is needed
• Less I/O needed
• Less memory needed
• Faster response
• More user programs running
Page is needed => reference to it
• Invalid reference => abort
• Not in memory => bring into memory
  – valid-invalid bit = 0 => page fault
  – Some systems introduce an extra bit: present
    • present bit clear => page fault
Virtual Memory Translation
Hardware and OS cooperate to translate addresses
First, hardware checks the TLB for the virtual address
• If TLB hit, address translation is done; the page is in physical memory
If TLB miss...
• Hardware or OS walks the page tables
• If the valid-invalid (present) bit is 1, the page is in physical memory
If page fault (i.e., valid-invalid bit is 0 or present bit is clear):
• Trap into the OS
• OS selects a free frame, or a victim page in memory to replace
  – Write the victim page out to disk if modified (add a dirty bit to the PTE)
• OS reads the referenced page from disk into memory
• Page table is updated, present bit is set
• Process continues execution
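The translation and fault-handling steps above can be sketched in Python. This is an illustrative model, not real OS code: the TLB, page table, memory, and backing store are all simplified dicts, and the page contents are made up.

```python
# Sketch (not real OS code): address translation with a TLB, a page
# table with present bits, and a page-fault handler that loads the
# page from "disk". All structures are simplified, in-memory dicts.

DISK = {0: "code", 1: "data", 2: "heap"}   # backing store: page -> contents

tlb = {}                                   # page -> frame (small cache)
page_table = {}                            # page -> (frame, present_bit)
memory = {}                                # frame -> contents
next_frame = 0

def translate(page):
    """Return the frame holding `page`, faulting it in if needed."""
    global next_frame
    if page in tlb:                        # 1. TLB hit: translation done
        return tlb[page]
    entry = page_table.get(page)           # 2. TLB miss: walk the page table
    if entry is not None and entry[1] == 1:
        tlb[page] = entry[0]
        return entry[0]
    if page not in DISK:                   # invalid reference => abort
        raise MemoryError("invalid reference")
    frame = next_frame                     # 3. page fault: pick a free frame
    next_frame += 1
    memory[frame] = DISK[page]             # read the page in from disk
    page_table[page] = (frame, 1)          # update PTE, set present bit
    tlb[page] = frame
    return frame

translate(1)    # page fault: loads page 1 into frame 0
translate(1)    # TLB hit this time
```

A real MMU does step 1 and 2 in hardware; only the fault path traps into the OS.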
Page Fault
A reference to a page with the valid bit set to 0 traps to the OS => page fault
OS looks at the PCB to decide:
- invalid reference => abort
- just not in memory => get a free frame, swap the page into the frame, reset the tables
What if there is no free frame?
- evict a victim page in memory
Virtual Memory Benefits
Virtual memory allows other benefits during process creation:
- Copy-on-Write
- Memory-Mapped Files
Copy-on-Write
Copy-on-Write (COW) allows both parent and child processes to initially share the same pages in memory
If either process modifies a shared page, only then is the page copied
COW allows more efficient process creation as only modified pages are copied
Free pages are allocated from a pool of zeroed-out pages
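COW can be sketched as a small simulation at the page-table level. The page tables, frame numbers, and reference counts here are illustrative assumptions, not a real kernel's data structures.

```python
# Sketch: copy-on-write at the page-table level. A fork here copies
# only the page table (page -> frame), not the frames; the frame is
# copied the first time a shared page is written.

frames = {0: "parent code", 1: "parent data"}   # physical frames
refcount = {0: 1, 1: 1}                          # sharers per frame

def cow_fork(page_table):
    """Child shares all of the parent's frames."""
    for frame in page_table.values():
        refcount[frame] += 1
    return dict(page_table)                      # copy PTEs, not frames

def write(page_table, page, value):
    """Copy the frame first if it is still shared (copy-on-write)."""
    frame = page_table[page]
    if refcount[frame] > 1:                      # shared: copy now
        refcount[frame] -= 1
        new_frame = max(frames) + 1
        frames[new_frame] = frames[frame]        # duplicate contents
        refcount[new_frame] = 1
        page_table[page] = frame = new_frame
    frames[frame] = value

parent = {0: 0, 1: 1}            # page -> frame
child = cow_fork(parent)
write(child, 1, "child data")    # only now is frame 1 duplicated
```

After the write, the code page is still shared; only the written data page was copied.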
Memory-Mapped Files
Memory-mapped file I/O allows file I/O to be treated as routine memory access by mapping a disk block to a page in memory
A file is initially read using demand paging:
- A portion of the file is read from the file system into a physical page.
- Subsequent reads/writes to/from the file are then ordinary memory accesses.
Simplifies file access by treating file I/O through memory rather than read() and write() system calls
Also allows several processes to map the same file, allowing the pages in memory to be shared
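Python's standard mmap module gives a concrete example of memory-mapped file I/O; the temporary file and its contents below are made up for illustration.

```python
# Memory-mapped file I/O with Python's mmap module: reads and writes
# on the mapping are ordinary buffer accesses, backed by the file's
# pages rather than read()/write() system calls.

import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"hello, file")             # create some file contents

with mmap.mmap(fd, 0) as mapping:        # map the whole file
    first = mapping[0:5]                 # read: looks like a memory access
    mapping[0:5] = b"HELLO"              # write: updates the file's page

os.lseek(fd, 0, os.SEEK_SET)
data = os.read(fd, 11)                   # the file now reads b"HELLO, file"
os.close(fd)
os.remove(path)
```

The write through the mapping is visible through the ordinary file descriptor, since both name the same pages.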
Virtual Memory Policies
OS makes two decisions on a page fault
• Page selection
  – When should a page (or pages) on disk be brought into memory?
  – Two cases:
    • When a process starts, its code pages begin on disk
    • As the process runs, code and data pages may be moved to memory
• Page replacement
  – Which resident page (or pages) in memory should be thrown out to disk?
Goal: Minimize the number of page faults
• Page faults require milliseconds to handle (reading from disk)
• Implication: Plenty of time for the OS to make a good decision
Page Selection
When should a page be brought from disk into memory?
Demand paging: Load a page only when a reference is made to a location on that page (a page fault occurs)
• Intuition: Wait until the page absolutely must be in memory
• When a process starts: No pages are loaded in memory
  – a flurry of page faults
  – after a time, locality kicks in and reduces page faults to a very low level
• Pros: less work for the user
• Cons: pay the cost of a page fault for every newly accessed page
Pre-paging (anticipatory, pre-fetching): OS loads a page into memory before the page is referenced
Pre-Paging
• OS predicts future accesses and brings pages into memory ahead of time
  – Works well for some access patterns (e.g., sequential)
• Pros: May avoid page faults
• Cons: Ineffective if most of the extra pages brought in are never referenced
Combine demand paging or pre-paging with user-supplied hints about page references
• User specifies: may need this page in the future, don't need this page anymore, sequential access pattern, ...
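A rough sketch comparing demand paging against one-page-ahead pre-paging on a sequential access pattern. The fault-counting model (unlimited frames, each fault loads `ahead` extra pages) is a simplifying assumption of mine, not from the slides.

```python
# Sketch: count page faults for demand paging vs. simple pre-paging.
# Frames are unlimited here; only the fetch policy differs.

def demand_faults(refs):
    loaded, faults = set(), 0
    for page in refs:
        if page not in loaded:
            faults += 1                      # fetch exactly the faulting page
            loaded.add(page)
    return faults

def prepage_faults(refs, ahead=1):
    loaded, faults = set(), 0
    for page in refs:
        if page not in loaded:
            faults += 1                      # one fault...
            for p in range(page, page + ahead + 1):
                loaded.add(p)                # ...brings in extra pages too
    return faults

sequential = list(range(10))
demand_faults(sequential)     # 10 faults: one per page
prepage_faults(sequential)    # 5 faults: each fault loads two pages
```

On a random access pattern the pre-fetched pages would mostly go unused, which is exactly the "Cons" above.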
What happens if no free frame is available?
Page replacement: find some page in memory that is not really in use and swap it out
• Algorithms
  – FIFO
  – Optimal
  – Least Recently Used (LRU)
• Performance: want an algorithm that results in the minimum number of page faults
Page Replacement
Which page in main memory should be selected as the victim?
• Write the victim page out to disk if modified (dirty bit set)
• If the victim page is not modified (clean), just discard it
A dirty bit for each page:
• Indicates whether the page has been changed since it was last loaded from the backing store
• Indicates whether a swap-out is necessary for the victim page
Page Replacement Algorithms
Page replacement algorithm: the algorithm that picks the victim page
Metrics:
• Low page-fault rate
• Implementation cost/feasibility
For the page-fault rate, we evaluate an algorithm
• by running it on a particular string of memory references (a reference string)
• and computing the number of page faults on that string
First-In-First-Out (FIFO)
FIFO: Replace the page that has been in memory the longest
• Intuition: First referenced a long time ago, done with it now
• Advantages:
  – Fair: all pages receive equal residency
  – Easy to implement (circular buffer)
• Disadvantage: Some pages may always be needed
A Case of FIFO
Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
3 frames (3 pages can be in memory at a time per process)
4 frames
FIFO Replacement – Belady’s Anomaly
more frames => more page faults
[Figure: FIFO replacement traces for the reference string above. With 3 frames: 9 page faults. With 4 frames: 10 page faults.]
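The FIFO case above can be reproduced with a short simulation; running it with 3 and then 4 frames shows Belady's anomaly on this reference string.

```python
# FIFO page replacement on the slide's reference string. The extra
# frame *increases* the number of page faults (Belady's anomaly).

from collections import deque

def fifo_faults(refs, nframes):
    frames, queue, faults = set(), deque(), 0
    for page in refs:
        if page in frames:
            continue                         # hit: FIFO order unchanged
        faults += 1
        if len(frames) == nframes:
            frames.remove(queue.popleft())   # evict the oldest resident page
        frames.add(page)
        queue.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
fifo_faults(refs, 3)    # 9 page faults
fifo_faults(refs, 4)    # 10 page faults: Belady's anomaly
```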
Optimal Algorithm
Replace the page that will not be used for the longest period of time in the future
• Advantage: Guaranteed to minimize the number of page faults
• Disadvantage: Requires that the OS predict the future
  – Not practical, but good for comparison
  – Used for measuring how well your algorithm performs
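OPT can be sketched as an offline simulation: since it needs the entire reference string in advance, it is only usable as a baseline to compare other algorithms against.

```python
# Sketch of the optimal (OPT) algorithm: on a fault with no free
# frame, evict the resident page whose next use lies farthest in
# the future (or that is never used again).

def opt_faults(refs, nframes):
    frames, faults = set(), 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == nframes:
            future = refs[i + 1:]
            # victim = page not referenced for the longest time ahead
            victim = max(frames, key=lambda p: future.index(p)
                         if p in future else len(future) + 1)
            frames.remove(victim)
        frames.add(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
opt_faults(refs, 3)    # 7 page faults: the minimum possible here
```

Compare with FIFO's 9 faults on the same string and 3 frames.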
Least Recently Used (LRU) Algorithm
Replace the page not used for the longest time in the past
• Intuition: Use the past to predict the future
  – past experience is a decent predictor of future behavior
• Advantages:
  – With locality, LRU approximates OPT
• Disadvantages:
  – Harder to implement; must track which pages have been accessed
  – Does not handle all workloads well
A Case of LRU
Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
Counter implementation
• Every page entry has a counter; every time the page is referenced through this entry, copy the clock into the counter
• When a page needs to be replaced, look at the counters to determine which one to replace
[Figure: LRU replacement trace for the reference string above.]
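LRU on the same reference string can be simulated by keeping pages ordered by recency of use; a minimal sketch:

```python
# LRU page replacement: the victim is always the least recently
# used resident page.

def lru_faults(refs, nframes):
    recency, faults = [], 0        # recency[0] is the least recently used
    for page in refs:
        if page in recency:
            recency.remove(page)   # hit: page moves to most-recent end
        else:
            faults += 1
            if len(recency) == nframes:
                recency.pop(0)     # evict the least recently used page
        recency.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
lru_faults(refs, 4)    # 8 page faults
```

With 4 frames LRU takes 8 faults where FIFO takes 10: no anomaly, and closer to OPT.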
Stack Implementation of LRU
Stack implementation – keep a stack of page numbers in doubly linked form:
• Page referenced: move it to the top
• Always replace at the bottom of the stack
• No search needed for replacement
Implementing LRU
Software Perfect LRU (stack-based)
• OS maintains a list of physical pages ordered by reference time
• When a page is referenced: move the page to the front of the list
• When a victim is needed: pick the page at the back of the list
• Trade-off: Slow on memory reference, fast on replacement
Hardware Perfect LRU (clock-based)
• Associate a register with each page
• When a page is referenced: store the system clock in the register
• When a victim is needed: scan through the registers to find the oldest clock
• Trade-off: Fast on memory reference, slow on replacement (especially as the size of memory grows)
In practice, do not implement Perfect LRU
• LRU is an approximation anyway, so approximate more
• Goal: Find an old page, but not necessarily the very oldest
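The stack scheme maps naturally onto Python's OrderedDict, which internally keeps a doubly linked list and so gives O(1) move-to-top and O(1) eviction from the bottom; a minimal sketch (the class and method names are mine):

```python
# "Perfect LRU" in the stack/linked-list style: referenced pages
# move to the hot end; the victim is popped from the cold end with
# no search.

from collections import OrderedDict

class LRU:
    def __init__(self, nframes):
        self.nframes = nframes
        self.pages = OrderedDict()          # oldest first, newest last

    def reference(self, page):
        """Touch `page`; return True if the reference faulted."""
        if page in self.pages:
            self.pages.move_to_end(page)    # move to "top of the stack"
            return False
        if len(self.pages) == self.nframes:
            self.pages.popitem(last=False)  # evict from the "bottom"
        self.pages[page] = True
        return True

lru = LRU(3)
faults = sum(lru.reference(p) for p in [1, 2, 3, 1, 4, 1])  # 4 faults
```

This illustrates the trade-off on the slide: every reference pays a small bookkeeping cost so that replacement is instant.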
LRU Approximation Algorithms
Reference bit
• With each page, associate a bit, initially = 0
• When the page is referenced, the bit is set to 1
• Replace a page whose bit is 0 (if one exists); we do not know the order, however
Second chance
• Needs the reference bit
• Circular replacement
  – Logically arrange all physical frames in a big circle (clock)
• If the page to be replaced (in clock order) has reference bit = 1:
  – set the reference bit to 0
  – leave the page in memory
  – consider the next page (in clock order), subject to the same rules
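The second-chance (clock) scheme can be sketched as follows; the circle of frames is modeled with a plain list and a hand index.

```python
# Second-chance (clock) replacement: the hand sweeps the circle,
# clearing reference bits, until it finds a page with bit 0, which
# becomes the victim.

def clock_faults(refs, nframes):
    frames = [None] * nframes                # pages arranged in a circle
    refbit = [0] * nframes
    hand, faults = 0, 0
    for page in refs:
        if page in frames:
            refbit[frames.index(page)] = 1   # referenced: set the bit
            continue
        faults += 1
        while refbit[hand] == 1:             # give the page a second chance
            refbit[hand] = 0
            hand = (hand + 1) % nframes
        frames[hand] = page                  # victim found: replace it
        refbit[hand] = 1
        hand = (hand + 1) % nframes
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
clock_faults(refs, 3)    # 9 page faults
```

When every resident page has its bit set, the hand sweeps the whole circle and the algorithm degenerates to FIFO, as it does on this string.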
Page Replacement Comparison
Add more physical memory; what happens to performance?
• LRU, OPT: With more memory, guaranteed to have fewer (or the same number of) page faults
  – The pages resident at a smaller memory size are guaranteed to be a subset of those resident at a larger size
• FIFO: With more memory, usually fewer page faults
  – Belady's anomaly: may actually have more page faults!