Virtual Memory in Operating Systems and Its Implementation Code

Virtual memory is a technique used by virtually all modern operating systems.

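The basic idea of virtual memory is that each process has its own independent logical address space, which is divided into equally sized blocks called pages. Each page is a contiguous range of addresses. From the process's point of view there appears to be a large amount of memory: some of its pages correspond to a block of physical memory (called a page frame, normally the same size as a page), while pages that are not loaded in memory reside on disk. Introducing per-process logical addresses separates the process address space from the actual storage space, which makes memory management more flexible.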
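To make the page and page frame idea concrete, here is a minimal sketch of how an address splits into a page number and an in-page offset. The 32-bit address width, the 4 KiB page size, and the helper names (`vpn_of`, `offset_of`, `make_paddr`) are assumptions chosen for illustration, not part of any particular kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: 32-bit addresses and 4 KiB pages. */
#define PAGE_SHIFT 12u
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define PAGE_MASK  (PAGE_SIZE - 1u)

/* Virtual page number: the address with the in-page offset stripped. */
static inline uint32_t vpn_of(uint32_t vaddr)
{
    return vaddr >> PAGE_SHIFT;
}

/* Offset of the address inside its page. */
static inline uint32_t offset_of(uint32_t vaddr)
{
    return vaddr & PAGE_MASK;
}

/* Rebuild a physical address from a frame number and an in-page offset. */
static inline uint32_t make_paddr(uint32_t pfn, uint32_t offset)
{
    assert(offset < PAGE_SIZE);
    return (pfn << PAGE_SHIFT) | offset;
}
```

Because a page and a page frame have the same size, the offset is carried over unchanged; only the page number has to be translated into a frame number.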

The two basic concepts, address space and storage space, are defined as follows:

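Address space: the target program produced by compiling a source program lives within a bounded range of addresses, and this range is called the address space. The address space is the set of logical addresses.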

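Storage space: the set of physical units in main memory that store information. The numbers of these units are called physical addresses, so the storage space is the set of physical addresses.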

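Three management schemes derive from this:
paged memory management, segmented memory management, and combined segmented-paged memory management. This article focuses on paging.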

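In a paged system, when a process is created the operating system allocates page frames for all of the process's pages, and when the process is destroyed it reclaims every page frame that was allocated to it. If processes are allowed to request memory dynamically while running, the operating system must also allocate physical page frames for those requests. To do all this, the operating system has to keep track of which page frames in system memory are actually in use. On a process switch it must also correctly switch the mapping from process address space to physical memory between the two processes. To understand how the operating system meets these requirements, we first need to understand page tables. The original post illustrates this with a figure reproduced from 51CTO.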

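The entries in a page table are called page table entries (PTEs). Each page table entry records the mapping from one range of virtual addresses (a page) to physical addresses (a page frame).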
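As a rough illustration of what a page table entry holds and how a lookup uses it, the sketch below translates an address through a flat, single-level table. The `struct pte` layout, the `valid` flag, and the `translate` function are hypothetical and simplified; real entries also carry permission and status bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                        /* assumed 4 KiB pages */
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1u)

/* Hypothetical page table entry: a frame number plus a valid bit. */
struct pte {
    uint32_t pfn;    /* physical frame number */
    bool     valid;  /* true if the page is resident in memory */
};

/*
 * Translate vaddr through a flat table with npages entries.
 * Returns true and stores the physical address in *paddr on success;
 * returns false when the page is unmapped (a page fault in a real OS).
 */
static bool translate(const struct pte *table, size_t npages,
                      uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;

    assert(table != NULL && paddr != NULL);
    if (vpn >= npages || !table[vpn].valid) {
        return false;
    }
    *paddr = (table[vpn].pfn << PAGE_SHIFT) | (vaddr & PAGE_MASK);
    return true;
}
```

The virtual page number is used directly as an index, which is why a flat table needs one entry per virtual page whether or not the page is ever used.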

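Because the page table is itself stored in memory, every memory read a program performs requires at least two memory accesses: one to fetch the page table entry and one to fetch the data. Compared with the single access needed when no MMU is used (MMU stands for Memory Management Unit, a hardware logic unit integrated into the CPU whose main job is to translate virtual addresses to physical addresses for the CPU, giving software a hardware-level memory protection mechanism), this greatly reduces efficiency, and the slowdown is even more noticeable when the memory itself is slow. How to keep the benefits of the MMU while minimizing this overhead therefore becomes a pressing problem.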

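This is where the TLB comes in. What is the TLB? It is the translation lookaside buffer, in practice a small set of relocation registers inside the MMU that temporarily hold translation data. Since the TLB is essentially a set of registers, it is easy to see that accessing it is much faster than accessing the page table in memory. If the entire page table could be kept in the TLB, the access-efficiency problem would be solved.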

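However, because of manufacturing cost and other constraints, keeping every page table entry in the TLB is practically impossible. The best we can do is keep the most frequently used page table entries in the limited-capacity TLB, which improves the MMU's efficiency to some extent.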

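The theoretical basis for why this works is the principle of locality of memory references: while a program runs, it is more likely to access code and data close to what it accessed recently. In theory, then, the TLB holds most of the page table entries needed during the current period of execution, so it can greatly improve the MMU's efficiency.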
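The caching behaviour described above can be sketched with a toy software model of a TLB. Everything here is an assumption made for illustration: the four-entry size, the round-robin replacement policy, and the names `tlb_lookup` and `tlb_insert`. Real TLBs are hardware structures that search all entries in parallel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 4u   /* deliberately tiny; real TLBs hold more entries */

struct tlb_entry {
    uint32_t vpn;    /* virtual page number */
    uint32_t pfn;    /* physical frame number */
    bool     valid;
};

struct tlb {
    struct tlb_entry slot[TLB_ENTRIES];
    unsigned next_victim;   /* round-robin replacement cursor */
    unsigned hits, misses;  /* counters that make locality visible */
};

/* Look up vpn; on a hit, store the frame number in *pfn and return true. */
static bool tlb_lookup(struct tlb *t, uint32_t vpn, uint32_t *pfn)
{
    assert(t != NULL && pfn != NULL);
    for (unsigned i = 0; i < TLB_ENTRIES; i++) {
        if (t->slot[i].valid && t->slot[i].vpn == vpn) {
            *pfn = t->slot[i].pfn;
            t->hits++;
            return true;
        }
    }
    t->misses++;
    return false;
}

/* After a miss, cache the translation that was read from the page table. */
static void tlb_insert(struct tlb *t, uint32_t vpn, uint32_t pfn)
{
    struct tlb_entry *e;

    assert(t != NULL);
    e = &t->slot[t->next_victim];
    e->vpn = vpn;
    e->pfn = pfn;
    e->valid = true;
    t->next_victim = (t->next_victim + 1u) % TLB_ENTRIES;
}
```

A program that keeps touching the same few pages hits in the TLB almost every time after the first miss per page, which is exactly the effect locality predicts. Once the working set exceeds the number of entries, older translations are evicted and misses return.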

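Here we use a two-level page table. With a two-level page table the MMU translates in two steps: the virtual address first indexes an entry in the first table, that entry is then used to search the second table, and only then is the physical address determined. The first table is called the first-level page table and the second is called the second-level page table. The main purpose of the two-level lookup is to reduce the memory occupied by the page tables themselves; the drawback is that it further reduces addressing efficiency.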
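A two-level lookup can be sketched as follows. The 10/10/12 split of a 32-bit address, the structure names, and the convention that a frame address of 0 means "not mapped" are assumptions for this example, not the layout used in the repository.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12u      /* low 12 bits: offset inside the page */
#define PAGE_MASK  0xfffu
#define L2_BITS    10u      /* middle 10 bits: second-level index */
#define L1_ENTRIES 1024u    /* top 10 bits: first-level index */
#define L2_ENTRIES 1024u

struct l2_table {
    uint32_t frame[L2_ENTRIES];       /* frame base address; 0 = not mapped */
};

struct l1_table {
    struct l2_table *l2[L1_ENTRIES];  /* NULL = no second-level table here */
};

/* Walk both levels; return false if either level has no mapping. */
static bool walk(const struct l1_table *l1, uint32_t vaddr, uint32_t *paddr)
{
    uint32_t i1 = vaddr >> (PAGE_SHIFT + L2_BITS);
    uint32_t i2 = (vaddr >> PAGE_SHIFT) & (L2_ENTRIES - 1u);
    const struct l2_table *l2;

    assert(l1 != NULL && paddr != NULL);
    l2 = l1->l2[i1];
    if (l2 == NULL || l2->frame[i2] == 0) {
        return false;
    }
    *paddr = l2->frame[i2] | (vaddr & PAGE_MASK);
    return true;
}
```

The space saving comes from the `NULL` pointers: a second-level table only has to exist for 4 MiB regions that actually contain mapped pages. The cost is the extra memory reference for the first-level entry on every walk.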

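With the background out of the way, let's get to the substance: implementing VM on OS161, a teaching operating system developed at Harvard University. OS161 runs on MIPS-I hardware.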

The code is on GitHub: https://github.com/tian-jiang/OS161-VirtualMemory

First, a piece of code from arch/mips/include/vm.h, where the memory layout is defined:

/*
 * MIPS-I hardwired memory layout:
 *    0xc0000000 - 0xffffffff   kseg2 (kernel, tlb-mapped)
 *    0xa0000000 - 0xbfffffff   kseg1 (kernel, unmapped, uncached)
 *    0x80000000 - 0x9fffffff   kseg0 (kernel, unmapped, cached)
 *    0x00000000 - 0x7fffffff   kuseg (user, tlb-mapped)
 *
 * (mips32 is a little different)
 */

#define MIPS_KUSEG  0x00000000
#define MIPS_KSEG0  0x80000000
#define MIPS_KSEG1  0xa0000000
#define MIPS_KSEG2  0xc0000000

[Figure: memory layout in OS161]
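The segment table above can be read as a simple classification rule, and because kseg0 is unmapped, a kernel address in that segment differs from its physical address only by the fixed base `MIPS_KSEG0`. The sketch below illustrates both points. The `MIPS_*` constants are the ones from the listing; `segment_of`, `paddr_to_kseg0`, and `kseg0_to_paddr` are hypothetical helper names introduced for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MIPS_KUSEG  0x00000000u
#define MIPS_KSEG0  0x80000000u
#define MIPS_KSEG1  0xa0000000u
#define MIPS_KSEG2  0xc0000000u

enum segment { SEG_KUSEG, SEG_KSEG0, SEG_KSEG1, SEG_KSEG2 };

/* Classify a 32-bit virtual address by the segment it falls in. */
static enum segment segment_of(uint32_t vaddr)
{
    if (vaddr >= MIPS_KSEG2) return SEG_KSEG2;
    if (vaddr >= MIPS_KSEG1) return SEG_KSEG1;
    if (vaddr >= MIPS_KSEG0) return SEG_KSEG0;
    return SEG_KUSEG;
}

/* kseg0 is a fixed window onto physical memory: add the segment base. */
static uint32_t paddr_to_kseg0(uint32_t paddr)
{
    assert(paddr < MIPS_KSEG1 - MIPS_KSEG0);  /* window covers 512 MiB */
    return paddr + MIPS_KSEG0;
}

/* The reverse direction: subtract the segment base. */
static uint32_t kseg0_to_paddr(uint32_t vaddr)
{
    assert(segment_of(vaddr) == SEG_KSEG0);
    return vaddr - MIPS_KSEG0;
}
```

No TLB entry is involved in either conversion, which is why the kernel can touch physical memory through kseg0 before any virtual memory structures have been set up.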

Let's start from the beginning, in main.c:

    /* Early initialization. */
    ram_bootstrap();
    .......

    /* Late phase of initialization. */
    vm_bootstrap();
    ........

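When the operating system boots, it calls ram_bootstrap() and vm_bootstrap() to start the VM management module. Where are these two functions declared and defined? Let's look at the following code.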

include/vm.h and arch/mips/include/vm.h

/* Initialization function */
void vm_bootstrap(void);

/*
 * Interface to the low-level module that looks after the amount of
 * physical memory we have.
 *
 * ram_getsize returns the lowest valid physical address, and one past
 * the highest valid physical address. (Both are page-aligned.) This
 * is the memory that is available for use during operation, and
 * excludes the memory the kernel is loaded into and memory that is
 * grabbed in the very early stages of bootup.
 *
 * ram_stealmem can be used before ram_getsize is called to allocate
 * memory that cannot be freed later. This is intended for use early
 * in bootup before VM initialization is complete.
 */

void ram_bootstrap(void);
paddr_t ram_stealmem(unsigned long npages);
void ram_getsize(paddr_t *lo, paddr_t *hi);

Both functions are declared here. So what do they actually do?

vaddr_t firstfree;   /* first free virtual address; set by start.S */

static paddr_t firstpaddr;  /* address of first free physical page */
static paddr_t lastpaddr;   /* one past end of last free physical page */

/*
 * Called very early in system boot to figure out how much physical
 * RAM is available.
 */
void
ram_bootstrap(void)
{
    size_t ramsize;

    /* Get size of RAM. */
    ramsize = mainbus_ramsize();

    /*
     * This is the same as the last physical address, as long as
     * we have less than 508 megabytes of memory. If we had more,
     * various annoying properties of the MIPS architecture would
     * force the RAM to be discontiguous. This is not a case we
     * are going to worry about.
     */
    if (ramsize > 508*1024*1024) {
        ramsize = 508*1024*1024;
    }

    lastpaddr = ramsize;

    /*
     * Get first free virtual address from where start.S saved it.
     * Convert to physical address.
     */
    firstpaddr = firstfree - MIPS_KSEG0;

    kprintf("%uk physical memory available\n",
        (lastpaddr-firstpaddr)/1024);
}

/*
 * Initialise the frame table
 */
void
vm_bootstrap(void)
{
    frametable_bootstrap();
}

/*
 * Keep these variables static so that other files cannot access them.
 */
static struct frame_table_entry *frame_table;
static paddr_t frametop, freeframe;

/*
 * initialise frame table
 */
void
frametable_bootstrap(void)
{
    struct frame_table_entry *p;
    paddr_t firsta, lasta, paddr;
    unsigned long framenum, entry_num, frame_table_size, i;

    // get the useable range of physical memory
    ram_getsize(&firsta, &lasta);
    KASSERT((firsta & PAGE_FRAME) == firsta);
    KASSERT((lasta & PAGE_FRAME) == lasta);

    framenum = (lasta - firsta) / PAGE_SIZE;

    // calculate the size of the whole framemap
    frame_table_size = framenum * sizeof(struct frame_table_entry);
    frame_table_size = ROUNDUP(frame_table_size, PAGE_SIZE);
    entry_num = frame_table_size / PAGE_SIZE;
    KASSERT((frame_table_size & PAGE_FRAME) == frame_table_size);

    frametop = firsta;
    freeframe = firsta + frame_table_size;

    if (freeframe >= lasta) {
        // Should not happen in practice: the frame table would use up all free memory
        panic("vm: framemap consume physical memory?\n");
    }

    // keep the frame state in the top of the useable range of physical memory
    // the free frame page address started from the end of the frame map
    frame_table = (struct frame_table_entry *) PADDR_TO_KVADDR(firsta);

    // Initialise the frame list. Each entry corresponds to one frame
    // and stores the address of the next free frame.
    // A next-frame address of zero means the current frame is allocated.
    p = frame_table;
    for (i = 0; i < framenum-1; i++) {
        if (i < entry_num) {
            p->next_freeframe = 0;
            p += 1;
            continue;
        }
        paddr = frametop + (i+1) * PAGE_SIZE;
        p->next_freeframe = paddr;
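        p += 1;
    }
    // The loop stops one entry short; give the last frame a defined value
    // so that it terminates the free list instead of holding garbage.
    p->next_freeframe = 0;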
}

struct frame_table_entry {
    // address of next free frame
    size_t          next_freeframe;
};

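ram_bootstrap is called during system initialization to find out how much physical memory is available. vm_bootstrap simply calls frametable_bootstrap(), which divides the usable physical memory into pages of 4 KB each and keeps a linked list of free pages in memory, stored at the start of the free memory. Before the list can be stored, the code has to work out how much space the frame table itself needs. So the first part of the code computes the size of the frame table, and the rest initializes the frame table as a linked list. Since every frame is free at initialization, apart from the frames occupied by the table itself, which are marked as allocated, each entry simply points to the address of the next page.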
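Once the frame table is set up as a free list, handing out and returning frames amounts to popping and pushing the head of that list. The sketch below is a self-contained simulation of that idea. It is not taken from the repository: the eight-frame pool and the names `pool_init`, `frame_alloc`, and `frame_free` are assumptions, and only the `next_freeframe` field and the "0 means allocated" convention follow the code above.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define NFRAMES   8u        /* tiny pool, for illustration only */

struct frame_table_entry {
    uint32_t next_freeframe;  /* next free frame address; 0 = allocated or end */
};

static struct frame_table_entry table[NFRAMES];
static uint32_t frametop;   /* physical address of frame 0 */
static uint32_t freeframe;  /* head of the free list; 0 = out of memory */

/* Build the list; the first `reserved` frames are marked as allocated. */
static void pool_init(uint32_t base, unsigned reserved)
{
    assert(base != 0 && base % PAGE_SIZE == 0 && reserved < NFRAMES);
    frametop = base;
    freeframe = base + reserved * PAGE_SIZE;
    for (unsigned i = 0; i < NFRAMES; i++) {
        int links = (i >= reserved) && (i + 1 < NFRAMES);
        table[i].next_freeframe = links ? base + (i + 1) * PAGE_SIZE : 0;
    }
}

/* Pop the head of the free list; returns 0 when no frame is left. */
static uint32_t frame_alloc(void)
{
    uint32_t paddr = freeframe;

    if (paddr == 0) {
        return 0;
    }
    struct frame_table_entry *e = &table[(paddr - frametop) / PAGE_SIZE];
    freeframe = e->next_freeframe;
    e->next_freeframe = 0;   /* mark as allocated */
    return paddr;
}

/* Push a frame back onto the head of the free list. */
static void frame_free(uint32_t paddr)
{
    assert(paddr >= frametop && (paddr - frametop) / PAGE_SIZE < NFRAMES);
    table[(paddr - frametop) / PAGE_SIZE].next_freeframe = freeframe;
    freeframe = paddr;
}
```

Both operations run in constant time. One limitation of the "0 means allocated" convention is that the last frame on the free list is indistinguishable from an allocated one by its entry alone, so a kernel using this scheme also needs the list head to tell them apart.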

That completes the operating system's VM initialization. So how is the VM actually used? See below.

Time: 2024-08-29 06:23:09
