Wang Jialin on Spark Performance Optimization, Season 10: A World-Exclusive Look Inside Spark Unified Memory Management!

Contents:

1. Problems with traditional Spark memory management;

2. Spark unified memory management;

3. Outlook;

==========Problems with traditional Spark memory management============

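Spark memory is divided into three parts:

Execution: shuffles, joins, sorts, aggregations, and so on. Its share of the heap is set by spark.shuffle.memoryFraction, which defaults to 0.2;

Storage: persisted (cached) data, large task results, torrent-type broadcasts, and so on. Its share of the heap is set by spark.storage.memoryFraction, which defaults to 0.6;

Other: program objects, metadata, and code. By default this gets the remaining 0.2.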

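On top of these shares there is a safety factor (safetyFraction), so Execution and Storage can use only part of their configured memory. spark.shuffle.safetyFraction defaults to 0.8 and spark.storage.safetyFraction defaults to 0.9, so by default Execution can actually use 0.2 × 0.8 = 0.16 of the heap and Storage 0.6 × 0.9 = 0.54.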
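The arithmetic above can be sketched in a few lines of Scala. This is only an illustration, not Spark code: `LegacyMemorySketch` and `usableBytes` are hypothetical names, the 10 GB heap is made up, and the fractions passed in `main` are assumed Spark 1.5 defaults (0.8 for spark.shuffle.safetyFraction, 0.9 for spark.storage.safetyFraction). Substitute whatever values your deployment actually uses.

```scala
// Illustrative sketch of the pre-1.6 static sizing; these names are not Spark APIs.
object LegacyMemorySketch {

  // Usable bytes of one region = heap * memoryFraction * safetyFraction.
  def usableBytes(heapBytes: Long, memoryFraction: Double, safetyFraction: Double): Long = {
    require(heapBytes >= 0, "heap size must be non-negative")
    (heapBytes * memoryFraction * safetyFraction).toLong
  }

  def main(args: Array[String]): Unit = {
    val heap = 10L * 1024 * 1024 * 1024 // assumption: a 10 GB executor heap

    // Assumed defaults: spark.shuffle.memoryFraction = 0.2, spark.shuffle.safetyFraction = 0.8
    val execution = usableBytes(heap, 0.2, 0.8)
    // Assumed defaults: spark.storage.memoryFraction = 0.6, spark.storage.safetyFraction = 0.9
    val storage = usableBytes(heap, 0.6, 0.9)

    println(f"execution: ${execution / 1024.0 / 1024.0}%.1f MB")
    println(f"storage:   ${storage / 1024.0 / 1024.0}%.1f MB")
  }
}
```

Because both regions are fixed fractions of the heap, whatever one side leaves unused cannot be picked up by the other, which is the weakness the rest of this section discusses.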

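If an individual machine is weak, data in Execution keeps spilling to disk and shuffle becomes very slow. The stronger each machine, the better the result, so when building a cluster, push each machine (its memory in particular) as far as it will go rather than simply competing on the number of machines.

If there are few large task results, Storage needs less room and the Execution share can be raised accordingly.

Likewise, if the job does not cache anything, the Storage region is a very large waste of memory.

The traditional memory allocation model therefore demands a high level of skill from Spark engineers.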

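A good illustration:

Consider how memory is allocated on the Execution side. Allocation involves ShuffleMemoryManager, TaskMemoryManager, and ExecutorMemoryManager.

A single task, when it arrives, may fill up the Executor's memory.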

==========Spark unified memory management============

Spark 1.6 introduces UnifiedMemoryManager, which always sets aside 300 MB of the heap as reserved system memory.

/**
 * A [[MemoryManager]] that enforces a soft boundary between execution and storage such that
 * either side can borrow memory from the other.
 *
 * The region shared between execution and storage is a fraction of (the total heap space - 300MB)
 * configurable through `spark.memory.fraction` (default 0.75). The position of the boundary
 * within this space is further determined by `spark.memory.storageFraction` (default 0.5).
 * This means the size of the storage region is 0.75 * 0.5 = 0.375 of the heap space by default.
 *
 * Storage can borrow as much execution memory as is free until execution reclaims its space.
 * When this happens, cached blocks will be evicted from memory until sufficient borrowed
 * memory is released to satisfy the execution memory request.
 *
 * Similarly, execution can borrow as much storage memory as is free. However, execution
 * memory is *never* evicted by storage due to the complexities involved in implementing this.
 * The implication is that attempts to cache blocks may fail if execution has already eaten
 * up most of the storage space, in which case the new blocks will be evicted immediately
 * according to their respective storage levels.
 *
 * @param storageRegionSize Size of the storage region, in bytes.
 *                          This region is not statically reserved; execution can borrow from
 *                          it if necessary. Cached blocks can be evicted only if actual
 *                          storage memory usage exceeds this region.
 */

object UnifiedMemoryManager {

  // Set aside a fixed amount of memory for non-storage, non-execution purposes.
  // This serves a function similar to `spark.memory.fraction`, but guarantees that we reserve
  // sufficient memory for the system even for small heaps. E.g. if we have a 1GB JVM, then
  // the memory used for execution and storage will be (1024 - 300) * 0.75 = 543MB by default.
  private val RESERVED_SYSTEM_MEMORY_BYTES = 300 * 1024 * 1024

  // ... (the rest of the object is omitted from this excerpt)
}
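To make the sizing in the source comments concrete, here is a small sketch that recomputes it. It is not Spark code: `UnifiedSizingSketch`, `sharedBytes`, and `storageRegionBytes` are hypothetical names, and the defaults (0.75 and 0.5) are the ones quoted in the Scaladoc above.

```scala
// Illustrative sketch of the Spark 1.6 unified sizing; these names are not Spark APIs.
object UnifiedSizingSketch {

  // Mirrors the 300 MB constant quoted from UnifiedMemoryManager.
  val ReservedBytes: Long = 300L * 1024 * 1024

  // Memory shared by execution and storage: (heap - 300MB) * spark.memory.fraction.
  def sharedBytes(heapBytes: Long, memoryFraction: Double = 0.75): Long = {
    require(heapBytes >= ReservedBytes, "heap must be at least the reserved 300 MB")
    ((heapBytes - ReservedBytes) * memoryFraction).toLong
  }

  // Soft storage region inside the shared memory: shared * spark.memory.storageFraction.
  def storageRegionBytes(heapBytes: Long,
                         memoryFraction: Double = 0.75,
                         storageFraction: Double = 0.5): Long =
    (sharedBytes(heapBytes, memoryFraction) * storageFraction).toLong

  def main(args: Array[String]): Unit = {
    val heap = 1024L * 1024 * 1024 // the 1 GB JVM used as the example in the source comment
    val mb = 1024.0 * 1024.0
    println(f"shared:         ${sharedBytes(heap) / mb}%.1f MB")
    println(f"storage region: ${storageRegionBytes(heap) / mb}%.1f MB")
  }
}
```

For the 1 GB heap, the shared figure matches the 543 MB worked out in the source comment, and the storage region is half of that.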

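Under unified memory management, when execution runs short of memory it borrows from storage, taking as much as storage has free.

When storage runs short of memory, it can likewise borrow whatever execution has free, but it never forces execution to release memory that execution is already using.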
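The asymmetric borrowing rule can be illustrated with a toy model. This is a deliberately simplified sketch and not Spark's UnifiedMemoryManager: the class `SoftBoundaryPool` and its methods are hypothetical, sizes are abstract units, and it ignores details such as storage evicting its own older blocks to make room for new ones.

```scala
// Toy model of the soft boundary between execution and storage; not a Spark API.
final class SoftBoundaryPool(val total: Long, val storageRegion: Long) {
  require(total >= 0 && storageRegion >= 0 && storageRegion <= total)

  private var executionUsed = 0L
  private var storageUsed = 0L

  def executionMemoryUsed: Long = executionUsed
  def storageMemoryUsed: Long = storageUsed
  private def free: Long = total - executionUsed - storageUsed

  // Execution takes free memory first. If that is not enough, it evicts cached data,
  // but only the part of storage usage that exceeds the storage region (memory that
  // storage had borrowed). Returns the number of units actually granted.
  def acquireExecution(units: Long): Long = {
    require(units >= 0)
    if (units > free) {
      val evictable = math.max(0L, storageUsed - storageRegion)
      storageUsed -= math.min(evictable, units - free)
    }
    val granted = math.min(units, free)
    executionUsed += granted
    granted
  }

  // Storage may use only memory that is currently free; it never evicts execution.
  def acquireStorage(units: Long): Boolean = {
    require(units >= 0)
    if (units <= free) { storageUsed += units; true } else false
  }
}
```

As a usage example, take `new SoftBoundaryPool(100, 50)`. If storage first caches 80 units, a later execution request for 40 units evicts 20 units of cached data and is granted in full. A storage request made while execution holds the memory it needs is simply refused.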

Wang Jialin's contact card:

"China's No. 1 Spark expert"

Sina Weibo: http://weibo.com/ilovepains

WeChat official account: DT_Spark

Blog: http://blog.sina.com.cn/ilovepains

Mobile: 18610086859

QQ: 1740415547


Time: 2024-10-16 08:38:18
