Saturday 30 January 2016

Operating System - Memory Management PAGING SEGMENTATION THRASHING

Memory management is the functionality of an operating system that handles or manages primary memory. Memory management keeps track of each and every memory location, whether it is allocated to some process or free. It decides how much memory is to be allocated to each process and which process will get memory at what time. It tracks whenever some memory gets freed or unallocated and updates the status accordingly.
Memory management provides protection by using two registers, a base register and a limit register. The base register holds the smallest legal physical memory address and the limit register specifies the size of the range. For example, if the base register holds 300000 and the limit register holds 120900, then the program can legally access all addresses from 300000 through 420899.
The binding of instructions and data to memory addresses can be done in the following ways:
  • Compile time -- When it is known at compile time where the process will reside, compile-time binding is used to generate absolute code.
  • Load time -- When it is not known at compile time where the process will reside in memory, the compiler generates relocatable code.
  • Execution time -- If the process can be moved during its execution from one memory segment to another, then binding must be delayed until run time.

Dynamic Loading

In dynamic loading, a routine of a program is not loaded until it is called by the program. All routines are kept on disk in a relocatable load format. The main program is loaded into memory and executed. Other routines, methods, or modules are loaded on request. Dynamic loading gives better memory-space utilization, and unused routines are never loaded.

Dynamic Linking

Linking is the process of collecting and combining various modules of code and data into an executable file that can be loaded into memory and executed. The operating system can link system-level libraries into a program. When it combines the libraries at load time, the linking is called static linking, and when this linking is done at execution time, it is called dynamic linking.
In static linking, libraries are linked at compile time, so the program code size becomes bigger, whereas in dynamic linking libraries are linked at execution time, so the program code size remains smaller.

Logical versus Physical Address Space

An address generated by the CPU is a logical address, whereas an address actually seen by the memory unit is a physical address. A logical address is also known as a virtual address.
Virtual and physical addresses are the same in compile-time and load-time address-binding schemes; they differ in the execution-time address-binding scheme.
The set of all logical addresses generated by a program is referred to as a logical address space. The set of all physical addresses corresponding to these logical addresses is referred to as a physical address space.
The run-time mapping from virtual to physical addresses is done by the memory management unit (MMU), which is a hardware device. The MMU uses the following mechanism to convert a virtual address to a physical address:
  • The value in the base register is added to every address generated by a user process, which is treated as an offset at the time it is sent to memory. For example, if the base register value is 10000, then an attempt by the user to use address location 100 will be dynamically relocated to location 10100.
  • The user program deals with virtual addresses; it never sees the real physical addresses.
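A minimal sketch of this relocation-and-protection mechanism in C; the register values are illustrative only, not from any real hardware:

    /* Minimal sketch of MMU-style relocation and protection.
     * Register values are illustrative only. */
    #include <stdio.h>
    #include <stdlib.h>

    #define BASE_REGISTER  300000UL   /* smallest legal physical address */
    #define LIMIT_REGISTER 120900UL   /* size of the legal range */

    /* Relocate a virtual address; trap (here: exit) if it is out of range. */
    unsigned long translate(unsigned long virtual_addr) {
        if (virtual_addr >= LIMIT_REGISTER) {
            fprintf(stderr, "trap: address %lu out of range\n", virtual_addr);
            exit(EXIT_FAILURE);
        }
        return BASE_REGISTER + virtual_addr;   /* base + offset */
    }

    int main(void) {
        printf("virtual 100 -> physical %lu\n", translate(100));  /* 300100 */
        return 0;
    }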

Swapping

Swapping is a mechanism in which a process can be swapped temporarily out of main memory to a backing store, and then brought back into memory for continued execution.
The backing store is usually a hard disk drive or other secondary storage that is fast in access and large enough to accommodate copies of all memory images for all users. It must be capable of providing direct access to these memory images.
The major time-consuming part of swapping is transfer time. Total transfer time is directly proportional to the amount of memory swapped. Assume the user process is 100KB in size and the backing store is a standard hard disk with a transfer rate of 1MB (1000KB) per second. The actual transfer of the 100KB process to or from memory will take
100KB / 1000KB per second
= 1/10 second
= 100 milliseconds
[Figure: Process Swapping]

Memory Allocation

Main memory usually has two partitions:
  • Low Memory -- The operating system resides in this memory.
  • High Memory -- User processes are held in this memory.
The operating system uses the following memory allocation mechanisms:
  • Single-partition allocation -- The relocation-register scheme is used to protect user processes from each other, and from changing operating-system code and data. The relocation register contains the value of the smallest physical address; the limit register contains the range of logical addresses. Each logical address must be less than the limit register.
  • Multiple-partition allocation -- Main memory is divided into a number of fixed-sized partitions, where each partition contains at most one process. When a partition is free, a process is selected from the input queue and loaded into it. When the process terminates, the partition becomes available for another process.
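A small sketch of multiple-partition allocation with a first-fit search; the partition sizes and the process queue below are invented for illustration:

    /* Sketch of fixed multiple-partition allocation: each free partition
     * holds at most one process; sizes are invented for illustration. */
    #include <stdio.h>

    #define NPART 4

    int partition_size[NPART] = {100, 200, 300, 600};  /* in KB */
    int occupied[NPART]       = {0, 0, 0, 0};          /* 0 = free */

    /* Load a process into the first free partition large enough for it;
     * return the partition index, or -1 if none is available. */
    int load_process(int size_kb) {
        for (int i = 0; i < NPART; i++) {
            if (!occupied[i] && partition_size[i] >= size_kb) {
                occupied[i] = 1;
                return i;
            }
        }
        return -1;   /* process waits in the input queue */
    }

    int main(void) {
        int queue[] = {212, 417, 112, 426};   /* process sizes in KB */
        for (int i = 0; i < 4; i++)           /* 212->2, 417->3, 112->1, 426 waits */
            printf("process %dKB -> partition %d\n",
                   queue[i], load_process(queue[i]));
        return 0;
    }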

Fragmentation

As processes are loaded into and removed from memory, the free memory space is broken into little pieces. After some time, processes cannot be allocated to memory blocks because the blocks are too small, and the memory blocks remain unused. This problem is known as fragmentation.
Fragmentation is of two types:
  • External fragmentation -- Total memory space is enough to satisfy a request or to hold a process, but it is not contiguous, so it cannot be used.
  • Internal fragmentation -- The memory block assigned to a process is bigger than requested; some portion of it is left unused and cannot be used by another process.
External fragmentation can be reduced by compaction: shuffling the memory contents to place all free memory together in one large block. To make compaction feasible, relocation should be dynamic.
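A toy sketch of compaction, sliding all allocated blocks toward low memory so the free space coalesces into one block; the memory layout is invented:

    /* Toy compaction: slide allocated blocks toward address 0 so all
     * free memory coalesces into one block at the high end. */
    #include <stdio.h>

    struct block { int base, size; };     /* an allocated block */

    void compact(struct block blocks[], int n) {
        int next_free = 0;                /* next compacted base address */
        for (int i = 0; i < n; i++) {
            blocks[i].base = next_free;   /* relocate: needs dynamic relocation */
            next_free += blocks[i].size;
        }
    }

    int main(void) {
        struct block b[] = {{0, 100}, {300, 50}, {600, 200}};  /* holes between */
        compact(b, 3);                    /* new bases: 0, 100, 150 */
        for (int i = 0; i < 3; i++)
            printf("block %d: base=%d size=%d\n", i, b[i].base, b[i].size);
        return 0;
    }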

Paging

External fragmentation is avoided by using the paging technique. Paging is a technique in which physical memory is broken into fixed-sized blocks called frames and logical memory is broken into blocks of the same size called pages (the size is a power of 2, typically between 512 bytes and 8192 bytes). When a process is to be executed, its pages are loaded into any available memory frames.
The logical address space of a process can be non-contiguous, and a process is allocated physical memory whenever a free memory frame is available. The operating system keeps track of all free frames; it needs n free frames to run a program of n pages.
The address generated by the CPU is divided into:
  • Page number (p) -- the page number is used as an index into a page table, which contains the base address (frame) of each page in physical memory.
  • Page offset (d) -- the page offset is combined with the base address to define the physical memory address.
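A minimal sketch of this split and lookup, assuming 4KB pages and an invented page table:

    /* Sketch of paged address translation with 4KB pages.
     * The page table contents are invented for illustration. */
    #include <stdio.h>

    #define PAGE_SIZE   4096u                 /* power of 2 */
    #define OFFSET_MASK (PAGE_SIZE - 1)

    unsigned page_table[] = {5, 9, 2, 7};     /* page -> frame, made up */

    unsigned translate(unsigned virtual_addr) {
        unsigned p = virtual_addr / PAGE_SIZE;      /* page number: table index */
        unsigned d = virtual_addr & OFFSET_MASK;    /* page offset */
        unsigned frame = page_table[p];
        return frame * PAGE_SIZE + d;               /* frame base + offset */
    }

    int main(void) {
        unsigned va = 2 * PAGE_SIZE + 123;    /* page 2, offset 123 */
        printf("virtual %u -> physical %u\n", va, translate(va));  /* 8315 */
        return 0;
    }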
The following figure shows the page table architecture.
[Figure: Paging Example]

Segmentation

Segmentation is a technique to break memory into logical pieces, where each piece represents a group of related information. For example, there are code and data segments for each process, a data segment for the operating system, and so on. Segmentation can be implemented with or without paging.
Unlike pages, segments have varying sizes, which eliminates internal fragmentation. External fragmentation still exists, but to a lesser extent.
The address generated by the CPU is divided into:
  • Segment number (s) -- the segment number is used as an index into a segment table, which contains the base address of each segment in physical memory and the limit of the segment.
  • Segment offset (o) -- the segment offset is first checked against the limit and then combined with the base address to define the physical memory address.
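A minimal sketch of this lookup, with an invented segment table of (base, limit) pairs:

    /* Sketch of segmented address translation; the segment table
     * entries are invented for illustration. */
    #include <stdio.h>
    #include <stdlib.h>

    struct segment { unsigned base, limit; };

    struct segment seg_table[] = { {1400, 1000}, {6300, 400}, {4300, 1100} };

    unsigned translate(unsigned s, unsigned offset) {
        if (offset >= seg_table[s].limit) {   /* check offset against limit */
            fprintf(stderr, "trap: segment %u offset %u out of bounds\n", s, offset);
            exit(EXIT_FAILURE);
        }
        return seg_table[s].base + offset;    /* base + offset */
    }

    int main(void) {
        printf("seg 2, offset 53 -> physical %u\n", translate(2, 53));  /* 4353 */
        return 0;
    }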

[Figure: Segmentation Example]

Page Faults

  • What happens when a process references a page that is in the backing store?
    • For pages in the backing store, clear the exists bit in the page table entries.
    • If exists is not set, then a reference to the page causes a trap to the operating system.
    • These traps are called page faults.
    • To handle a page fault, the operating system (a toy sketch in C follows this list):
      • Finds a free page frame in memory
      • Reads the page in from backing store to the page frame
      • Updates the page table entry, setting exists
      • Resumes execution of the thread
  • How does the OS figure out which page generated the fault?
    • x86: hardware saves the virtual address that caused the fault (CR2 register)
    • On some platforms the OS gets only the address of the faulting instruction; it must simulate the instruction and try every address to find the one that generated the fault.
  • Restarting process execution after a page fault is tricky, since the fault may have occurred in the middle of an instruction.
    • If instructions are idempotent, just restart the faulting instruction (hardware saves instruction address during page fault).
    • Non-idempotent instructions are more difficult to restart, e.g. one whose addressing mode updates a register as a side effect:
      MOV +(SP), R2
      (here SP is modified as a side effect, so naively re-executing the instruction after a fault would apply the side effect twice).
    • Without hardware support it may be impossible to resume a process safely after a page fault. Hardware must keep track of side effects:
      • Undo all side effects during a page fault?
      • Save info about side effects, use it to restart instruction "in the middle"
    • Instruction set design for restartability:
      • Simple instructions are easier to restart.
      • Delay side effects until after all memory references.
      • Difficult to add restartability after the fact.
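A runnable toy version of the fault-handling steps above; the "disk" and "memory" are just arrays, and all names are invented stand-ins, not a real kernel API:

    /* Toy page-fault handler: the backing store and physical memory are
     * simulated with arrays; names and sizes are invented. */
    #include <stdio.h>
    #include <string.h>

    #define NFRAMES 4
    #define NPAGES  8
    #define PAGE_SZ 16

    struct pte { int frame; int exists; };   /* page table entry */

    char memory[NFRAMES][PAGE_SZ];           /* physical page frames */
    char backing_store[NPAGES][PAGE_SZ];     /* pages on "disk" */
    int  frame_used[NFRAMES];
    struct pte page_table[NPAGES];

    int find_free_frame(void) {              /* step 1: find a free frame */
        for (int f = 0; f < NFRAMES; f++)
            if (!frame_used[f]) { frame_used[f] = 1; return f; }
        return -1;                           /* full: would run page replacement */
    }

    void handle_page_fault(int page) {
        int f = find_free_frame();
        if (f < 0) return;                   /* replacement not shown here */
        memcpy(memory[f], backing_store[page], PAGE_SZ);  /* step 2: read page in */
        page_table[page].frame  = f;         /* step 3: update the PTE... */
        page_table[page].exists = 1;         /* ...and set the exists bit */
        /* step 4: the faulting thread would now be resumed to retry */
    }

    int main(void) {
        strcpy(backing_store[3], "hello");
        handle_page_fault(3);                /* simulate a fault on page 3 */
        printf("page 3 -> frame %d: %s\n",
               page_table[3].frame, memory[page_table[3].frame]);
        return 0;
    }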

Page Fetching

  • Once the basic page fault mechanism is working, the OS has two scheduling decisions to make:
    • Page fetching: when to bring pages into memory.
    • Page replacement: which pages to throw out of memory.
  • Overall goal: make physical memory look larger than it is.
    • Locality: most programs spend most of their time using a small fraction of their code and data.
    • Keep in memory the information that is being used.
    • Keep unused information on disk.
    • Ideally: paging produces a memory system with the performance of main memory and the cost/capacity of disk!
  • Approaches to page fetching:
    • Demand fetching: start process with no pages loaded, don't fetch a page until it is referenced.
    • Prefetching: try to predict when pages will be needed and load them ahead of time to avoid page faults.
      • Requires predicting the future, so hard to do.
      • One approach: when taking a page fault, read many pages instead of just one (wins if program accesses memory sequentially).
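A toy version of that read-ahead idea; the page count and read-ahead window are invented for illustration:

    /* Toy read-ahead: on a fault, fetch the demanded page plus the next
     * few sequential pages; wins if the program accesses memory sequentially. */
    #include <stdio.h>

    #define NPAGES    16
    #define READAHEAD 3

    int resident[NPAGES];                  /* 1 if the page is in memory */

    void fetch_page(int p) { resident[p] = 1; printf("fetched page %d\n", p); }

    void fault_with_readahead(int page) {
        fetch_page(page);                  /* the demanded page */
        for (int i = 1; i <= READAHEAD; i++)
            if (page + i < NPAGES && !resident[page + i])
                fetch_page(page + i);      /* prefetch speculatively */
    }

    int main(void) {
        fault_with_readahead(4);           /* fetches pages 4, 5, 6, 7 */
        return 0;
    }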

Page Replacement

  • Once all of memory is in use, will need to throw out one page each time there is a page fault.
  • Random: pick any page at random (works surprisingly well!)
  • FIFO: throw out the page that has been in memory longest.
  • MIN: throw out the page that will not be used for the longest time. Optimal, but it requires us to predict the future.
  • Least Recently Used (LRU): use the past to predict the future.
  • Strange but true: for some replacement policies, such as FIFO, adding more memory can sometimes cause paging performance to get worse. This is called "Belady's Anomaly" after Les Belady, who was the first person to notice it.
  • Implementing LRU: need hardware support to keep track of which pages have been used recently.
    • Perfect LRU?
      • Keep a register for each page, store system clock into that register on each memory reference.
      • To choose a page for replacement, scan through all pages to find the one with the oldest clock value.
      • Hardware costs would have been prohibitive in the early days of paging; also, expensive to scan all pages during replacement.
    • In practice nobody implements perfect LRU. Instead, we settle for an approximation that is efficient. Just find an old page, not necessarily the oldest.
  • Clock algorithm (called second chance algorithm in Silberschatz et al.): keep a reference bit for each page frame; hardware sets the reference bit whenever a page is read or written. To choose a page for replacement (a toy sketch follows this list):
    • Start with FIFO approach: cycle through pages in order circularly.
    • If the next page has been referenced, then don't replace it; just clear the reference bit and continue to the next page.
    • If the page has not been referenced since the last time we checked it, then replace that page.
  • Dirty bit: one bit for each page frame, set by hardware whenever the page is modified. If a dirty page is replaced, it must be written to disk before its page frame is reused.
  • The clock algorithm typically gives additional preference to dirty pages. For example, if the reference bit for a page is clear but the dirty bit is set, don't replace the page now; instead, clear the dirty bit and start writing the page to disk.
  • Free page pool: some systems keep a small list of clean pages that are available immediately for replacement.
    • During replacement, take the page that has been in the free pool the longest, then run the replacement algorithm to add a new page to the free pool.
    • Pages in the free pool have their exists bit off, so any reference to those pages causes a page fault.
    • If a page fault occurs for a page in the free pool, remove it from the free pool and put it back in service; much faster than reading from disk.
    • Provides an extra opportunity for recovery if we make a poor page replacement decision.
  • How to implement page replacement when there are multiple processes running in the system?
    • Global replacement: all pages from all processes are lumped into a single replacement pool. Each process competes with all the other processes for page frames; one "pig" process can slow the entire system down.
    • Per-process replacement: each process has a separate pool of pages. A page fault in one process can only replace one of that process's frames. This prevents interference from other processes.
    • Unfortunately, per-process replacement creates a new scheduling dilemma: how many page frames to allocate to each process? If this decision is made incorrectly, it can result in inefficient memory usage.
    • Most systems use global replacement.
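A toy sketch of the clock algorithm over a fixed set of frames; the resident pages and reference bits below are invented (real reference bits are set by hardware):

    /* Toy clock (second-chance) replacement over a fixed set of frames.
     * The reference bits would be set by hardware; here we fake them. */
    #include <stdio.h>

    #define NFRAMES 4

    int page_in_frame[NFRAMES] = {10, 11, 12, 13};  /* resident pages, made up */
    int ref_bit[NFRAMES]       = {1, 0, 1, 1};      /* "hardware" reference bits */
    int hand = 0;                                   /* clock hand */

    /* Cycle through frames circularly; skip (and clear) referenced pages,
     * and evict the first page whose reference bit is already clear. */
    int choose_victim(void) {
        for (;;) {
            if (ref_bit[hand]) {
                ref_bit[hand] = 0;                  /* give it a second chance */
                hand = (hand + 1) % NFRAMES;
            } else {
                int victim = hand;
                hand = (hand + 1) % NFRAMES;
                return victim;
            }
        }
    }

    int main(void) {
        int v = choose_victim();                    /* frame 1 with these bits */
        printf("evict page %d from frame %d\n", page_in_frame[v], v);
        return 0;
    }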

Thrashing

  • Normally, if a thread takes a page fault and must wait for the page to be read from disk, the operating system runs a different thread while the I/O is occurring. Thus page faults are "free"?
  • What happens if memory gets overcommitted?
    • Suppose the pages being actively used by the current threads don't all fit in physical memory.
    • Each page fault causes one of the active pages to be moved to disk, so another page fault will occur soon.
    • The system will spend all its time reading and writing pages, and won't get much work done.
    • This situation is called thrashing; it was a serious problem in early demand paging systems.
  • How to deal with thrashing?
    • If a single process is too large for memory, there is nothing the OS can do. That process will simply thrash.
    • If the problem arises because of the sum of several processes:
      • Figure out how much memory each process needs.
      • Change scheduling priorities to run processes in groups that fit comfortably in memory: must shed load.
  • Working Sets: conceptual model proposed by Peter Denning to prevent thrashing.
    • Informal definition: the collection of pages a process is using actively, and which must thus be memory-resident to prevent this process from thrashing.
    • If the sum of all working sets of all runnable threads exceeds the size of memory, then stop running some of the threads for a while.
    • Divide runnable threads into two groups: active and inactive:
      • When a process is active its entire working set must always be in memory: never execute a thread whose working set is not resident.
      • When a process becomes inactive, its working set can migrate to disk.
      • Threads from inactive processes are never scheduled for execution.
      • The collection of active processes is called the balance set.
      • The system must have a mechanism for gradually moving processes into and out of the balance set.
      • As working sets change, the balance set must be adjusted.
  • How to compute working sets?
    • Denning proposed a working set parameter T: all pages referenced in the last T seconds comprise the working set.
    • Can extend the clock algorithm to keep an idle time for each page.
    • Pages with idle times less than T are in the working set.
  • Difficult questions for the working set approach:
    • How long should T be (typically minutes)?
    • How to handle changes in working sets?
    • How to manage the balance set?
    • How to account for memory shared between processes?
  • Page Fault Frequency: another approach to preventing thrashing (a toy sketch follows this list).
    • Per-process replacement; at any given time, each process is allocated a fixed number of physical page frames.
    • Monitor the rate at which page faults are occurring for each process.
    • If the rate gets too high for a process, assume that its memory is overcommitted; increase the size of its memory pool.
    • If the rate gets too low for a process, assume that its memory pool can be reduced in size.
    • If the sum of all memory pools doesn't fit in memory, deactivate some processes.
  • In practice, today's operating systems don't worry much about thrashing:
    • With personal computers, users can notice thrashing and handle it themselves:
      • Typically, just buy more memory
      • Or, manage balance set by hand
    • Thrashing was a bigger issue for timesharing machines with dozens or hundreds of users:
      • Why should I stop my processes just so you can make progress?
      • System had to handle thrashing automatically.
    • Technology changes make it unreasonable to operate machines in a range where memory is even slightly overcommitted; better to just buy more memory.
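A toy sketch of the page-fault-frequency feedback loop described above; the thresholds and numbers are invented for illustration:

    /* Toy page-fault-frequency controller: grow a process's frame pool
     * when its fault rate is high, shrink it when low. Thresholds are
     * invented for illustration. */
    #include <stdio.h>

    #define HIGH_RATE 10.0   /* faults/sec above which we add frames */
    #define LOW_RATE   1.0   /* faults/sec below which we reclaim frames */

    int adjust_pool(int frames, double fault_rate) {
        if (fault_rate > HIGH_RATE)
            return frames + 1;           /* memory overcommitted: grow pool */
        if (fault_rate < LOW_RATE && frames > 1)
            return frames - 1;           /* pool larger than needed: shrink */
        return frames;                   /* rate acceptable: leave alone */
    }

    int main(void) {
        int frames = 8;
        frames = adjust_pool(frames, 25.0);  /* high rate -> 9 frames */
        frames = adjust_pool(frames, 0.2);   /* low rate  -> 8 frames */
        printf("frame pool size: %d\n", frames);
        return 0;
    }

If the pools still don't fit in memory after this adjustment, some processes must be deactivated, as noted above.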
