SparkR error: Cannot run program "Rscript"

> sc <- sparkR.init()
Re-using existing Spark Context. Please stop SparkR with sparkR.stop() or restart R to create a new Spark Context
> sqlContext <- sparkRSQL.init(sc)
> df <- createDataFrame(sqlContext, faithful)
17/03/01 15:05:56 INFO SparkContext: Starting job: collectPartitions at NativeMethodAccessorImpl.java:-2
17/03/01 15:05:56 INFO DAGScheduler: Got job 0 (collectPartitions at NativeMethodAccessorImpl.java:-2) with 1 output partitions
17/03/01 15:05:56 INFO DAGScheduler: Final stage: ResultStage 0 (collectPartitions at NativeMethodAccessorImpl.java:-2)
17/03/01 15:05:56 INFO DAGScheduler: Parents of final stage: List()
17/03/01 15:05:56 INFO DAGScheduler: Missing parents: List()
17/03/01 15:05:56 INFO DAGScheduler: Submitting ResultStage 0 (ParallelCollectionRDD[0] at parallelize at RRDD.scala:460), which has no missing parents
17/03/01 15:05:56 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1280.0 B, free 1280.0 B)
17/03/01 15:05:56 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 854.0 B, free 2.1 KB)
17/03/01 15:05:56 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.16.31.137:49150 (size: 854.0 B, free: 511.5 MB)
17/03/01 15:05:56 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
17/03/01 15:05:56 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (ParallelCollectionRDD[0] at parallelize at RRDD.scala:460)
17/03/01 15:05:56 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
17/03/01 15:05:56 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, test3, partition 0,PROCESS_LOCAL, 12976 bytes)
17/03/01 15:05:56 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on test3:50531 (size: 854.0 B, free: 511.5 MB)
17/03/01 15:05:56 INFO DAGScheduler: ResultStage 0 (collectPartitions at NativeMethodAccessorImpl.java:-2) finished in 0.396 s
17/03/01 15:05:56 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 389 ms on test3 (1/1)
17/03/01 15:05:56 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/03/01 15:05:56 INFO DAGScheduler: Job 0 finished: collectPartitions at NativeMethodAccessorImpl.java:-2, took 0.526915 s
> showDF(df)
17/03/01 15:06:02 INFO SparkContext: Starting job: showString at NativeMethodAccessorImpl.java:-2
17/03/01 15:06:02 INFO DAGScheduler: Got job 1 (showString at NativeMethodAccessorImpl.java:-2) with 1 output partitions
17/03/01 15:06:02 INFO DAGScheduler: Final stage: ResultStage 1 (showString at NativeMethodAccessorImpl.java:-2)
17/03/01 15:06:02 INFO DAGScheduler: Parents of final stage: List()
17/03/01 15:06:02 INFO DAGScheduler: Missing parents: List()
17/03/01 15:06:02 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[4] at showString at NativeMethodAccessorImpl.java:-2), which has no missing parents
17/03/01 15:06:02 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 8.7 KB, free 10.8 KB)
17/03/01 15:06:02 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 3.5 KB, free 14.4 KB)
17/03/01 15:06:02 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 172.16.31.137:49150 (size: 3.5 KB, free: 511.5 MB)
17/03/01 15:06:02 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
17/03/01 15:06:02 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[4] at showString at NativeMethodAccessorImpl.java:-2)
17/03/01 15:06:02 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
17/03/01 15:06:02 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, test2, partition 0,PROCESS_LOCAL, 12976 bytes)
17/03/01 15:06:03 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on test2:57552 (size: 3.5 KB, free: 511.5 MB)
17/03/01 15:06:04 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1, test2): java.io.IOException: Cannot run program "Rscript": error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
    at org.apache.spark.api.r.RRDD$.createRProcess(RRDD.scala:413)
    at org.apache.spark.api.r.RRDD$.createRWorker(RRDD.scala:429)
    at org.apache.spark.api.r.BaseRRDD.compute(RRDD.scala:63)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
    at java.lang.ProcessImpl.start(ProcessImpl.java:130)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
    ... 20 more

17/03/01 15:06:04 INFO TaskSetManager: Starting task 0.1 in stage 1.0 (TID 2, test2, partition 0,PROCESS_LOCAL, 12976 bytes)
17/03/01 15:06:04 INFO TaskSetManager: Lost task 0.1 in stage 1.0 (TID 2) on executor test2: java.io.IOException (Cannot run program "Rscript": error=2, No such file or directory) [duplicate 1]
17/03/01 15:06:04 INFO TaskSetManager: Starting task 0.2 in stage 1.0 (TID 3, test3, partition 0,PROCESS_LOCAL, 12976 bytes)
17/03/01 15:06:04 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on test3:50531 (size: 3.5 KB, free: 511.5 MB)
17/03/01 15:06:04 INFO TaskSetManager: Lost task 0.2 in stage 1.0 (TID 3) on executor test3: java.io.IOException (Cannot run program "Rscript": error=2, No such file or directory) [duplicate 2]
17/03/01 15:06:04 INFO TaskSetManager: Starting task 0.3 in stage 1.0 (TID 4, test3, partition 0,PROCESS_LOCAL, 12976 bytes)
17/03/01 15:06:04 INFO TaskSetManager: Lost task 0.3 in stage 1.0 (TID 4) on executor test3: java.io.IOException (Cannot run program "Rscript": error=2, No such file or directory) [duplicate 3]
17/03/01 15:06:04 ERROR TaskSetManager: Task 0 in stage 1.0 failed 4 times; aborting job
17/03/01 15:06:04 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
17/03/01 15:06:04 INFO TaskSchedulerImpl: Cancelling stage 1
17/03/01 15:06:04 INFO DAGScheduler: ResultStage 1 (showString at NativeMethodAccessorImpl.java:-2) failed in 2.007 s
17/03/01 15:06:04 INFO DAGScheduler: Job 1 failed: showString at NativeMethodAccessorImpl.java:-2, took 2.027519 s
17/03/01 15:06:04 ERROR RBackendHandler: showString on 15 failed
Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 4, test3): java.io.IOException: Cannot run program "Rscript": error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
    at org.apache.spark.api.r.RRDD$.createRProcess(RRDD.scala:413)
    at org.apache.spark.api.r.RRDD$.createRWorker(RRDD.scala:429)
    at org.apache.spark.api.r.BaseRRDD.compute(RRDD.scala:63)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.R
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 4, test3): java.io.IOException: Cannot run program "Rscript": error=2, No such file or directory

The key is this error line. In SparkR, you can define a DataFrame object whose class is:

class(df)
[1] "DataFrame"
attr(,"package")
[1] "SparkR"
Such an object can still be inspected with class, names, and show.

Calling showDF or head, however, raises the error above: the data itself cannot be read.
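
The split is explained by the stack trace: class, names, and show only touch schema metadata on the driver, while showDF and head run a Spark job whose tasks spawn an Rscript worker process on each executor (RRDD$.createRProcess in the trace). A minimal contrast, in the same session, with no new assumptions:

class(df)    # driver-side metadata; works
names(df)    # column names from the schema; works
showDF(df)   # runs a job that needs an Rscript worker on each executor; fails
head(df)     # also materializes rows through an R worker; fails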

Focusing on the key error line, it is clear that the other worker nodes (test2 and test3 in the log) do not have

Rscript

The fix: log in to each of the other machines and copy (or symlink) Rscript into /usr/bin.
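
A sketch of that fix as a shell loop (the R install path below is an assumption; check where Rscript actually lives on a working node first, e.g. with which Rscript):

for host in test2 test3; do   # the worker hostnames from the log above
  ssh "$host" 'command -v Rscript || sudo ln -s /usr/local/lib64/R/bin/Rscript /usr/bin/Rscript'   # assumed R path
done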

Alternatively, run on a single node: omit the --master option when launching, so the job only runs where Rscript exists:

sparkR --driver-class-path /data1/mysql-connector-java-5.1.18.jar
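
Equivalently, local mode can be requested explicitly; --master local[*] is a standard spark-submit/sparkR option and keeps all tasks on the driver machine, where Rscript is known to work:

sparkR --master local[*] --driver-class-path /data1/mysql-connector-java-5.1.18.jar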
