




  • On the stack, temporal allocation locality (allocations made close together in time) implies spatial locality (storage that is close together in space). In turn, when temporal allocation locality implies temporal access locality (objects allocated together are accessed together), the sequential stack storage tends to perform better with respect to CPU caches and operating system paging systems.

  • Memory density on the stack tends to be higher than on the heap because of the reference type overhead (discussed later in this chapter). Higher memory density often leads to better performance, e.g., because more objects fit in the CPU cache.

  • Thread stacks tend to be fairly small – the default maximum stack size on Windows is 1MB, and most threads tend to actually use only a few stack pages. On modern systems, the stacks of all application threads can fit into the CPU cache, making typical stack object access extremely fast. (Entire heaps, on the other hand, rarely fit into CPU caches.)
    (from Pro .NET Performance)
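
  The locality and density points can be made concrete with a small experiment. The sketch below is only an analogy: it uses C rather than .NET, and the bookkeeping a C allocator adds to each block is not the same thing as the .NET reference type overhead the quote mentions. `Vec4`, `contiguous_span`, and `heap_span` are hypothetical names introduced for this illustration. The idea is to measure how many bytes of address space a group of objects covers when they sit back to back in a stack frame, versus when each one is allocated separately on the heap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ITEMS 8

/* Hypothetical 16-byte payload, used only for this illustration. */
typedef struct {
    int32_t x, y, z, w;
} Vec4;

/* Bytes spanned by n objects stored back to back in a stack frame:
 * the distance from the start of the first to the end of the n-th. */
size_t contiguous_span(size_t n) {
    Vec4 items[MAX_ITEMS];   /* automatic storage: one block in this frame */
    assert(n > 0 && n <= MAX_ITEMS);
    return (size_t)((char *)(items + n) - (char *)items);
}

/* Bytes spanned by n objects that are each allocated separately with
 * malloc: from the lowest start address to the highest end address.
 * Returns 0 if any allocation fails. */
size_t heap_span(size_t n) {
    Vec4 *ptrs[MAX_ITEMS];
    uintptr_t lo = UINTPTR_MAX, hi = 0;
    size_t i, got = 0;

    assert(n > 0 && n <= MAX_ITEMS);
    for (i = 0; i < n; i++) {
        ptrs[i] = malloc(sizeof(Vec4));
        if (ptrs[i] == NULL) {
            break;
        }
        got++;
        /* Compare addresses as integers; the blocks are unrelated objects. */
        uintptr_t start = (uintptr_t)ptrs[i];
        uintptr_t end = start + sizeof(Vec4);
        if (start < lo) lo = start;
        if (end > hi) hi = end;
    }
    for (i = 0; i < got; i++) {
        free(ptrs[i]);
    }
    return got == n ? (size_t)(hi - lo) : 0;
}
```

  `contiguous_span(n)` is always exactly `n * sizeof(Vec4)`. `heap_span(n)` depends on the allocator, but it can never be smaller, because the blocks do not overlap; with most allocators it is larger, since each block carries alignment padding or bookkeeping data. The same payload therefore touches more cache lines when it lives in separate heap blocks.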


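  1. Stack allocations that are close together in time are also close together in space. CPU caches are designed around the principle of locality, so this layout is cache-friendly.
  2. Memory density on the stack is higher than on the heap, which is also cache-friendly.
  3. Thread stacks are usually small, so they can often fit entirely in the CPU cache.
  4. In addition, heap memory has to be requested through malloc, which must search for a suitable free block and may trigger a system call. Stack memory is allocated automatically, so the stack is also faster than the heap when it comes to the allocation itself.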
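
  Point 4 can be illustrated with a toy allocator that behaves the way a stack does. This is a minimal sketch, not how any particular runtime or compiler implements its stack: `Arena`, `arena_init`, `arena_alloc`, `arena_mark`, and `arena_release` are hypothetical names, and the 1024-byte capacity and 16-byte rounding are arbitrary choices. Allocation is just a bounds check and an addition, with no search for a free block and no system call, and releasing everything allocated since a given point is a single assignment, much like returning from a function.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 1024

/* Hypothetical fixed-size arena that hands out storage the way a stack
 * frame does: allocating only advances an offset. */
typedef struct {
    unsigned char buf[ARENA_SIZE];
    size_t top;              /* offset of the first free byte */
} Arena;

void arena_init(Arena *a) {
    a->top = 0;
}

/* Returns size bytes from the arena, or NULL if they do not fit.
 * Offsets are rounded up to a multiple of 16; this sketch does not
 * guarantee that the buffer itself starts on a 16-byte boundary. */
void *arena_alloc(Arena *a, size_t size) {
    size_t start = (a->top + 15u) & ~(size_t)15u;
    if (start > ARENA_SIZE || size > ARENA_SIZE - start) {
        return NULL;
    }
    a->top = start + size;
    return a->buf + start;
}

/* Remembers the current top, like the stack pointer on function entry. */
size_t arena_mark(const Arena *a) {
    return a->top;
}

/* Frees everything allocated after mark in one step, like a return. */
void arena_release(Arena *a, size_t mark) {
    assert(mark <= a->top);
    a->top = mark;
}
```

  A general-purpose `malloc` cannot be this simple, because blocks may be freed in any order and the allocator has to track and reuse the resulting holes. The stack avoids that work by only ever releasing the most recent allocations first.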