CSAPP Virtual Memory


what is virtual memory? why we need it?Today, I want to talk about it and record somethings in this blog for comprehending batter.

First and foremost,What is Virtual Memory?

What is Virtual Memory(Basic Concept Of Virtual Memory)

It is generally know that most basic operation of modern computer only consist of getting command and executing. Memory just a liner space store the code and data. When we use the computer and run a program in the machine. Some code and data of the program is moved to memory form disk. If the address of the code line is fixed, We must to move the code and data to a fixed space of memory. It is terrible,no doubt.

In the model Operating System, we can run a lot of program, and the memory of every program is individual. The processes run on machine have a Virtual Space and the Virtual Space seems infinite and exclusive. This is what the Virtual memory do in model Operating system.

So now, We know that Virtual Memory is the tools in Operating System to help we to run multiple program.

Physical and Virtual Addressing

Physical Addressing

The main memory is organized as an array of M contiguous byte-size cells. Each cell has a unique address known as Physical Address. The most natural way for a CPU to access memory would be to use Physical Addressing

Virtual Addressing

With virtual addressing, the CPU accesses memory by virtual address(VA), which is converted to the physical address before being sent to main memory (address translation) .

Address translation is a collaborative work which is done by the Hardware and Operating System.

The hardware called the memory management unit(MMU). MMU lookup table stored in main memory whose contents are managed by operating System.

Why we need it.()

In general, we need the Virtual memory for three reasons:

  1. make the main memory efficient by treating it as cache for an address space stored on disk.(Caching)

  2. It simplifies memory management by providing each process with a uniform address space.(Management)

  3. It protects the address space of each process from corruption by other processes.(Protection)


VM as Tool for Caching


At first, we think of memory as a kind of cache between disk and cpu. The CPU can run by get command and data from disk, but the access time of disk is very large. We need the DRAM to cache the code and data.

We partition the data as the transfer units between the disk and main memory.

The transfer units known as Virtual Pages(VPs) in Virtual Memory.(Virtual Memory stored in disk)

The transfer units known as Physical pages(PPs) or Page frames.(Caching in Main memory)

This concept is similar as SRAM Caching. But the DRAM Caching need bigger Pages is transfer units because the First Access Time of disk is so large. The cost of reading the first byte from disk sector is about 100000 times slower than reading successive bytes in sector.

Page Table

When we access an address of Virtual Memory, we want to know if it cached. If the answer is yes, we need to get the Physical address of the Virtual address and access DRAM-CACHE to get the data. So we need a Table(Page Table) to record the information about if the VM is cached and the corresponding address(Physical Address).

The structure of page table like this.

When we access the address X(also in the X-th page), we just to find the X-th Item of the table and check the information to decide next step.

If the valid is 1, the machine can get the PA and access the DRAM-CACHE.

If the valid is 0, the machine will run a Page Fault handler in System kernel, which selects a victim page. If the victim has been modified, then the kernel copies it back to disk. In either case, the kernel modifies the page table Entity for the victim page to reflect the fact that victim is no longer cached in DRAM-CACHE. Next, kernel copies the VP we accessing to DRAM-CACHE and update the Page Table.(If the Cache Space is enough, we don’t need to select the victim)

VM as Tool for Memory Management

Separate Virtual Space for every process has a profound impact on the Memory Management. VM simplifies linking and loading, the sharing of code and data, and allocating memory to applications.

  • Simplifying Linking

    A separate and virtual address space allow each process to use the same basic format for its memory image, regardless of where the code and data actually reside in physical memory.

  • Simplifying Loading

    Linux Loader just allocates virtual pages for the code and data segments. Then Loader points the Address of Page Table for the process.

    Loader never actually copies any data from disk into memory cache. The data are paged in automatically and on demand by the virtual memory system the first time each page is referenced.

  • Simplifying Sharing

  • Simplifying Allocating

When a program running in a user process requests additional heap space, operating system just allocates some virtual page for process. And operating system don’t need to decide where the page should be cached in Memory.

VM as Tool for Memory Protection

It’s easy to prohibit that process access private data of another process for the separate address space of VM system.

Program accessing of memory must via the address translation, so we just need to add some flag into page Table Entity that the protection system could be built up.

Address Translation

There are the details about address translation

This figure show How the address translation work.

VPO is identical to the VPO, both they are compose of p bit . When we access a VP, CPU find the correct Page Table Entity and the Page Table Start Address is stored in PTBR(A register in CPU).

If the Valid is 1, that means page hit.

  1. Processor access a Virtual Address and The VP sends to MMU

  2. MMU requests the correct PTE from the Main Memory/Cache

  3. The cache/Memory return PTE

  4. MMU get correct Physical Address and sends it to the cache/Memory

  5. cache/Memory return data.

If the Vaild is 0.

  1. -3. same as above

  2. The MMU triggers an exception(Page Fault Exception)

  3. The fault handler identifies a victim page in Physical Memory,and if the page has been modified, pages it out to disk.

  4. The handler page in the new page and update Page Table.

  5. The handler returns to the original process and re-execute the instruction.

Cache the Page Table Entity (TLB)

Each time when we access the memory looking for data or some other things, we must access Physical Memory two times(for Getting PTE and Getting Data).So someone proposes a idea that we can cache the PTE into a faster place then we would get faster when transferring address. This is the origin of TLB.


Save Memory(Multi-Leave Page Tables)

if we had a 32-bit address space, 4 KB pages, and a 4-byte PTE, then we would need a 4 MB page table resident in memory at all times, even if the application referenced only a small chunk of the virtual address space.

The multi-Leave Page Table solve the problem.

It’s easy to understand that we just create the page table when we use the corresponding address. So we can save a lot of space but the structure is more complex. And we must to access k PTEs. It seems expecsive but TLB come to the rescue here by caching PETs.

Case : Intel Core I7/Linux Memory System.

CSAPP-12 Concurrent Programming 并发编程 读书笔记

CSAPP Concurrent Programming



  • 进程

  • IO多路复用

  • 线程

这三种方法各有优缺点。 对于进程来说,进程拥有独立的地址空间,但是进程之间通信需要用IPC,较为不便 IO多路复用其实只有一个进程,但是在这一个进程里面,拥有一个状态机,根据不同的IO到达的状态,执行不同的逻辑,从而实现并发。IO多路复用里面的控制流不会被系统调度,而且自身根据状态调度。 线程结合了IO多路复用和进程的特点,一个进程可以包含多个线程,这多个线程共享同一个地址空间,这方便了线程之间的通信,同时线程也可以像进程一样被系统调度。




这显然是不能接受的,所以我们要实现“并发”,让服务器可以在一段时间内,同时照顾着多个顾客。注意这里我们用的是一段时间来描述,因为并发不要求在一个时刻同时服务多个顾客,而只要在一段时间内同时照顾到这多个顾客即可。对应的,如果使用一个时刻这个词的话,我们的描述应该是“并行”,也就是 parallel ,对应的可能有多个服务员,来实现并行。





  • 优点在于

    • 独立的地址空间,可以消除不同进程之间的奇奇怪怪的冲突。

  • 缺点在于

    • 需要使用IPC进行进程间通信,这通常来说是比较慢速的。











事实上,现在很多的高性能服务器,例如NodeJs Nginx等,都使用事件驱动的编程方法,内涵思想就是IO多路复用,为什么呢?主要就是因为他具有巨大的性能优势。



  • 速度快

  • 因为在单个进程中,数据易于共享。

  • 易于调试


  • 写代码复杂

  • 进程可能阻塞

  • 无法利用多核处理器

考虑下第二个缺点,进程可能阻塞,这是什么意思呢?我们知道事件驱动模型、IO多路复用的服务器是通常是运行在单个进程上的。这意味着一旦阻塞,这个进程上的所有客户都将被搁置。 假如有一个恶意的顾客,来到饭馆举手,把服务员钓来之后,一直支支吾吾不说清楚,这是不是耽误了其他顾客呢?














Nginx vs. Apache







但在一个进程中的线程全部共享这个进程的上下文,这包括read-only text(CODE区)、可读写数据、堆、共享库的代码和数据区、打开的文件





  • 全局变量


  • 局部变量


  • 局部静态变量






主要分为Load Update Save三个步骤,那么这就不是一个整体的操作了,显然在这种分散操作的情况下,就容易出现共享变量出错。




很简单,只要让cnt++变成原子操作,即当某个线程进入L U S这个步骤之后,其他的线程不得干扰,必须等待该线程操作完成。

















CSAPP-Lab:Bomb Lab



  1. Data Lab
  2. Bomb Lab
  3. Attack Lab
  4. Buffer Lab (IA32)
  5. Architecture Lab
  6. Architecture Lab (Y86)
  7. Cache Lab
  8. Performance Lab
  9. Shell Lab
  10. Malloc Lab
  11. Proxy Lab


  • A “binary bomb” is a program provided to students as an object code file. When run, it prompts the user to type in 6 different strings. If any of these is incorrect, the bomb “explodes,” printing an error message and logging the event on a grading server. Students must “defuse” their own unique bomb by disassembling and reverse engineering the program to determine what the 6 strings should be. The lab teaches students to understand assembly language, and also forces them to learn how to use a debugger. It’s also great fun. A legendary lab among the CMU undergrads.Here’s a Linux/x86-64 binary bomb that you can try out for yourself. The feature that notifies the grading server has been disabled, so feel free to explode this bomb with impunity. If you’re an instructor with a CS:APP account, then you can download the solution.



CSAPP-Lab:Data Lab


本帖为第一个实验,Data Lab。



  1. Data Lab
  2. Bomb Lab
  3. Attack Lab
  4. Buffer Lab (IA32)
  5. Architecture Lab
  6. Architecture Lab (Y86)
  7. Cache Lab
  8. Performance Lab
  9. Shell Lab
  10. Malloc Lab
  11. Proxy Lab


实验文件解压后 有一个datalab-handout文件夹,里面有很多c文件,我们只需要在 bits.c里面编写程序,然后运行评测程序即可。



./btest -f [函数名]

./btest -f [函数名] -1 [参数1] -2 [参数2] -3……



 * bitXor - x^y using only ~ and & 
 *   Example: bitXor(4, 5) = 1
 *   Legal ops: ~ &
 *   Max ops: 14
 *   Rating: 1
* bit异或
* 要求只用~和&完成异或操作
* 思路:第一题比较简单,任何逻辑操作都可以用~和&表达,异或也是,
* 去查一下亦或的等价表达式,用摩根公式简化一下即可 */ int bitXor(int x, int y) { int a = ~(~x & y); int b = ~( x & ~y); int c = ~( a & b); return c; } /* * tmin - return minimum two's complement integer * Legal ops: ! ~ & ^ | + << >> * Max ops: 4 * Rating: 1
* 返回补码表达的最小值
* 根据补码的知识,补码的最小值就是1000000……000B */ int tmin(void) { return 1<<31; } //2 /* * isTmax - returns 1 if x is the maximum, two's complement number, * and 0 otherwise * Legal ops: ! ~ & ^ | + * Max ops: 10 * Rating: 1
* 检验参数x是否是补码最大值
* 思路,补码最大值是0111……1111 用~(1<<31))得到0111……111 然后和x进行一下亦或操作即可,如果完全一直则得0,取反为1 */ int isTmax(int x) { return !(x^(~(1<<31))); } /* * allOddBits - return 1 if all odd-numbered bits in word set to 1 * where bits are numbered from 0 (least significant) to 31 (most significant) * Examples allOddBits(0xFFFFFFFD) = 0, allOddBits(0xAAAAAAAA) = 1 * Legal ops: ! ~ & ^ | + << >> * Max ops: 12 * Rating: 2 */ int allOddBits(int x) { int nX = ~x; return !( (nX&0xAA) | (nX&(0xAA<<8)) | (nX&(0xAA<<16)) | (nX&(0xAA<<24))); } /* * negate - return -x * Example: negate(1) = -1. * Legal ops: ! ~ & ^ | + << >> * Max ops: 5 * Rating: 2 */ int negate(int x) { return (~x)+1; } //3 /* * isAsciiDigit - return 1 if 0x30 <= x <= 0x39 (ASCII codes for characters '0' to '9') * Example: isAsciiDigit(0x35) = 1. * isAsciiDigit(0x3a) = 0. * isAsciiDigit(0x05) = 0. * Legal ops: ! ~ & ^ | + << >> * Max ops: 15 * Rating: 3 */ int isAsciiDigit(int x) { int a = !(!(x>>8));//验证高24位是否全为0(符合要求), int b = !(!(0x30^(x&0xF0))); int c1 = (0x08&x)>>3; int c2 = (0x04&x)>>2; int c3 = (0x02&x)>>1; int c = (c1 & c2) | (c1 & c3);//如果是0表示 低四位符合要求 // printf("x:0x%x a:0x%x b:0x%x c:0x%x c1:0x%x c2:0x%x c3:0x%x\n",x,a,b,c,c1,c2,c3); return !(a | b | c); } /* * conditional - same as x ? y : z * Example: conditional(2,4,5) = 4 * Legal ops: ! ~ & ^ | + << >> * Max ops: 16 * Rating: 3 */ int conditional(int x, int y, int z) { int a = ((!(!x))<<31)>>31; return (y&a) | (z&~a); } /* * isLessOrEqual - if x <= y then return 1, else return 0 * Example: isLessOrEqual(4,5) = 1. * Legal ops: ! ~ & ^ | + << >> * Max ops: 24 * Rating: 3 */ int isLessOrEqual(int x, int y) { int x_sign = x>>31&1; int y_sign = y>>31&1; int a = x_sign & (!y_sign); int b = !(x_sign^y_sign); int c = b & ((x+~y)>>31 & 1); return a | c; } //4 /* * logicalNeg - implement the ! operator, using all of * the legal operators except ! * Examples: logicalNeg(3) = 0, logicalNeg(0) = 1 * Legal ops: ~ & ^ | + << >> * Max ops: 12 * Rating: 4 */ int logicalNeg(int x) { int x_sign = (x>>31) & 1; int myx = x&(~(1<<31)); int a = (myx+~0)>>31; return (a & ((~x_sign)&1)); } /* howManyBits - return the minimum number of bits required to represent x in * two's complement * Examples: howManyBits(12) = 5 * howManyBits(298) = 10 * howManyBits(-5) = 4 * howManyBits(0) = 1 * howManyBits(-1) = 1 * howManyBits(0x80000000) = 32 * Legal ops: ! ~ & ^ | + << >> * Max ops: 90 * Rating: 4 */ int howManyBits(int x) { int b16,b8,b4,b2,b1,b0; int sign=x>>31; x = (sign&~x)|(~sign&x);//如果x为正则不变,否则按位取反(这样好找最高位为1的,原来是最高位为0的,这样也将符号位去掉了) // 不断缩小范围 b16 = !!(x>>16)<<4;//高十六位是否有1 x = x>>b16;//如果有(至少需要16位),则将原数右移16位 b8 = !!(x>>8)<<3;//剩余位高8位是否有1 x = x>>b8;//如果有(至少需要16+8=24位),则右移8位 b4 = !!(x>>4)<<2;//同理 x = x>>b4; b2 = !!(x>>2)<<1; x = x>>b2; b1 = !!(x>>1); x = x>>b1; b0 = x; return b16+b8+b4+b2+b1+b0+1;//+1表示加上符号位 }