Running "spark-shell --master yarn --deploy-mode client" fails: virtual memory limit exceeded

On a Hadoop 2.7.2 cluster, run the following command:

spark-shell  --master yarn --deploy-mode client

The shell fails with the following error:

org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.

Checking the state of the launched application on the YARN Web UI, the log shows:

Container [pid=28920,containerID=container_1389136889967_0001_01_000121] is running beyond virtual memory limits. Current
usage: 1.2 GB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
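If log aggregation is enabled, the same container diagnostics can also be pulled from the command line instead of the Web UI; a minimal sketch, using the application ID taken from the container name above:

    # fetch the aggregated logs for the failed application and locate the memory-limit message
    yarn logs -applicationId application_1389136889967_0001 | grep -B 1 -A 1 "beyond virtual memory"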

This happens because the container's virtual memory usage exceeded the configured limit; it can be worked around by changing the YARN configuration. With the default ratio of 2.1, a container granted 1 GB of physical memory may use at most 1 GB × 2.1 = 2.1 GB of virtual memory; the container above used 2.2 GB, so YARN killed it.

YARN enforces a check on the ratio of virtual to physical memory used by each container. The issue is not simply that the machine lacks sufficient physical memory; the container is killed because its virtual memory usage is higher than expected for the physical memory it was granted.

Note: this is especially common on CentOS/RHEL 6 due to its aggressive allocation of virtual memory.

It can be resolved by either:

  1. Disabling the virtual memory check by setting yarn.nodemanager.vmem-check-enabled to false; or
  2. Increasing the virtual-to-physical memory ratio by setting yarn.nodemanager.vmem-pmem-ratio to a higher value (the default is 2.1).

Add the following properties to yarn-site.xml:

    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
        <description>Whether virtual memory limits will be enforced for containers</description>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-pmem-ratio</name>
        <value>4</value>
        <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
    </property>

Then restart YARN so the NodeManagers pick up the new settings.
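A minimal restart sketch, assuming a typical Hadoop 2.7.2 layout where HADOOP_HOME points at the installation and the updated yarn-site.xml has already been copied to every NodeManager host:

    # stop and start the ResourceManager and all NodeManagers so the new settings take effect
    $HADOOP_HOME/sbin/stop-yarn.sh
    $HADOOP_HOME/sbin/start-yarn.sh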

References:

http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/

http://blog.chinaunix.net/uid-28311809-id-4383551.html

http://stackoverflow.com/questions/21005643/container-is-running-beyond-memory-limits
