Contents:
1. Problems with legacy Spark memory management;
2. Unified memory management in Spark;
3. Outlook;
==========Problems with legacy Spark memory management============
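Spark memory is divided into three parts:
Execution: shuffles, joins, sorts, aggregations, and so on; sized by spark.shuffle.memoryFraction, which defaults to 0.2;
Storage: persist (cache), large task results, Torrent-type broadcast, and so on; sized by spark.storage.memoryFraction, which defaults to 0.6;
Other: program objects, metadata, code; 0.2 by default.
A safety fraction is applied on top of these shares, so each region can use only part of its configured memory. spark.shuffle.safetyFraction defaults to 0.8, so Execution actually gets 0.2 * 0.8 = 0.16 of the heap. spark.storage.safetyFraction defaults to 0.9, so Storage actually gets 0.6 * 0.9 = 0.54 of the heap.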
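The arithmetic of the legacy split can be sketched in a few lines of Scala. This is only an illustration of the formula, not Spark's own code; the object name LegacyMemoryModel and the method usableBytes are hypothetical.

```scala
// Sketch of the legacy (pre-1.6) static split. Names are hypothetical.
object LegacyMemoryModel {
  /** Bytes a region may actually use: heap * fraction * safetyFraction.
    * `fraction` plays the role of spark.shuffle.memoryFraction or
    * spark.storage.memoryFraction; `safetyFraction` is the matching
    * safety margin. The result is rounded to the nearest byte. */
  def usableBytes(heapBytes: Long, fraction: Double, safetyFraction: Double): Long =
    math.round(heapBytes * fraction * safetyFraction)
}
```

With the Execution defaults from the notes (fraction 0.2, safety fraction 0.8), LegacyMemoryModel.usableBytes(heap, 0.2, 0.8) comes to 16% of the heap. Storage goes through the same formula with its own fraction and safety fraction.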
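If an individual machine is weak, data in Execution keeps spilling to disk and shuffle becomes slow. The stronger each machine is, the better the result, so when building a cluster, push each machine (its memory in particular) as far as possible rather than simply adding more machines.
If there are few large task results, the Execution share can be raised accordingly.
Likewise, if the application does not cache anything, the Storage share is largely wasted memory.
The legacy allocation model therefore demands a lot of Spark expertise from whoever tunes it.
A good illustration:
Consider how memory is allocated from the Execution side. Allocation involves ShuffleMemoryManager, TaskMemoryManager, and ExecutorMemoryManager.
A single incoming task may take up all of the executor's memory.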
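==========Unified memory management in Spark============
UnifiedMemoryManager: Spark 1.6 sets aside a fixed 300MB of the heap for the system, so that enough memory remains for non-storage, non-execution purposes even on small heaps.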
/**
* A [[MemoryManager]] that enforces a soft boundary between execution and storage such that
* either side can borrow memory from the other.
*
* The region shared between execution and storage is a fraction of (the total heap space - 300MB)
* configurable through `spark.memory.fraction` (default 0.75). The position of the boundary
* within this space is further determined by `spark.memory.storageFraction` (default 0.5).
* This means the size of the storage region is 0.75 * 0.5 = 0.375 of the heap space by default.
*
* Storage can borrow as much execution memory as is free until execution reclaims its space.
* When this happens, cached blocks will be evicted from memory until sufficient borrowed
* memory is released to satisfy the execution memory request.
*
* Similarly, execution can borrow as much storage memory as is free. However, execution
* memory is *never* evicted by storage due to the complexities involved in implementing this.
* The implication is that attempts to cache blocks may fail if execution has already eaten
* up most of the storage space, in which case the new blocks will be evicted immediately
* according to their respective storage levels.
*
* @param storageRegionSize Size of the storage region, in bytes.
* This region is not statically reserved; execution can borrow from
* it if necessary. Cached blocks can be evicted only if actual
* storage memory usage exceeds this region.
*/
object UnifiedMemoryManager {
// Set aside a fixed amount of memory for non-storage, non-execution purposes.
// This serves a function similar to `spark.memory.fraction`, but guarantees that we reserve
// sufficient memory for the system even for small heaps. E.g. if we have a 1GB JVM, then
// the memory used for execution and storage will be (1024 - 300) * 0.75 = 543MB by default.
private val RESERVED_SYSTEM_MEMORY_BYTES = 300 * 1024 * 1024
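The sizing arithmetic described in the comment above can be sketched as follows. This is a minimal illustration under the Spark 1.6 defaults quoted in the comment (spark.memory.fraction = 0.75, spark.memory.storageFraction = 0.5); the object UnifiedSizing and its method regions are hypothetical names, not part of Spark.

```scala
// Sketch of the unified-region arithmetic. Names are hypothetical.
object UnifiedSizing {
  // Fixed reservation for the system, as in the excerpt above.
  val ReservedSystemMemoryBytes: Long = 300L * 1024 * 1024

  /** Returns (unified region bytes, storage region bytes) for a given heap.
    * The unified region is shared by execution and storage; the storage
    * region is the part of it below the soft boundary. */
  def regions(heapBytes: Long,
              memoryFraction: Double = 0.75,
              storageFraction: Double = 0.5): (Long, Long) = {
    val usable = heapBytes - ReservedSystemMemoryBytes
    val unified = (usable * memoryFraction).toLong
    val storage = (unified * storageFraction).toLong
    (unified, storage)
  }
}
```

For a 1GB heap this gives a unified region of (1024 - 300) * 0.75 = 543MB, matching the figure in the comment, with half of it (271.5MB) as the storage region. Note that the 0.375 share applies to the heap minus the 300MB reservation.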
With unified memory management, when execution runs short of memory it borrows from storage, taking as much as is free.
When storage runs short of memory, however, execution is never forced to release what it holds.
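The asymmetric borrowing rule can be illustrated with a toy model. This is a deliberately simplified sketch, not Spark's UnifiedMemoryManager: the class ToyUnifiedPool and its methods are hypothetical, it tracks only byte counts, and it assumes a storage request simply fails when there is not enough free memory (the real manager may also evict other cached blocks).

```scala
// Toy model of the soft boundary between execution and storage.
// Hypothetical names; byte counters only, no real blocks or tasks.
final class ToyUnifiedPool(val total: Long, val storageRegion: Long) {
  private var executionUsed = 0L
  private var storageUsed = 0L

  def executionInUse: Long = executionUsed
  def storageInUse: Long = storageUsed

  /** Execution takes free memory first. If that is not enough, it evicts
    * cached data, but only the part of storage above storageRegion.
    * Returns the number of bytes actually granted. */
  def acquireExecution(bytes: Long): Long = {
    val free = total - executionUsed - storageUsed
    val evictable = math.max(0L, storageUsed - storageRegion)
    val shortfall = math.max(0L, bytes - free)
    val evicted = math.min(shortfall, evictable)
    storageUsed -= evicted
    val granted = math.min(bytes, free + evicted)
    executionUsed += granted
    granted
  }

  /** Storage may use any free memory, including execution's idle share,
    * but it never evicts execution memory. */
  def acquireStorage(bytes: Long): Boolean = {
    val free = total - executionUsed - storageUsed
    if (bytes <= free) { storageUsed += bytes; true } else false
  }
}
```

For example, in a pool of 1000 bytes with a storage region of 500, storage can first cache 800 bytes. A later execution request for 400 bytes is granted in full by evicting 200 bytes of cached data. Once storage is back down to its 500-byte region, further execution requests cannot evict any more, and a new storage request fails while execution holds the rest.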
Instructor: Wang Jialin (Sina Weibo: http://weibo.com/ilovepains)