Spark error: Caused by: java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to [Lscala.collection.immutable.Map;

Background

I wrote a UDF that checks whether two columns are equal. The two columns have the same schema, but the structure is fairly complex:

root
 |-- list: array (nullable = true)
 |    |-- element: map (containsNull = true)
 |    |    |-- key: string
 |    |    |-- value: array (valueContainsNull = true)
 |    |    |    |-- element: struct (containsNull = true)
 |    |    |    |    |-- Date: integer (nullable = true)
 |    |    |    |    |-- Name: string (nullable = true)
 |-- list2: array (nullable = true)
 |    |-- element: map (containsNull = true)
 |    |    |-- key: string
 |    |    |-- value: array (valueContainsNull = true)
 |    |    |    |-- element: struct (containsNull = true)
 |    |    |    |    |-- Date: integer (nullable = true)
 |    |    |    |    |-- Name: string (nullable = true)
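For context, a DataFrame with this schema can be built directly from Scala collections. The following is a rough sketch (the column contents and sample values are made up purely for illustration), assuming a spark-shell session where the spark session and its implicits are available:

// A hypothetical way to reproduce a DataFrame with the schema above
// (the sample values are placeholders, not the author's data).
case class AppList(Date: Int, Name: String)

import spark.implicits._

val df = Seq(
  (Seq(Map("some.app" -> Seq(AppList(20181001, "app1")))),
   Seq(Map("some.app" -> Seq(AppList(20181001, "app1")))))
).toDF("list", "list2")

df.printSchema()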

In other words, the Array contains Maps, and each Map in turn contains another Array, so the values have to be compared level by level. The UDF I wrote is as follows:

case class AppList(Date: Int, Name: String)

// Compare two maps entry by entry; a missing key raises an exception,
// which the catch block turns into false.
def isMapEqual(map1: Map[String, Array[AppList]], map2: Map[String, Array[AppList]]): Boolean = {
  try {
    if (map1.size != map2.size) {
      return false
    } else {
      for (x <- map1.keys) {
        if (map1(x) != map2(x)) {
          return false
        }
      }
      return true
    }
  } catch {
    case e: Exception => false
  }
}

// Compare the two outer arrays; only the first element is compared,
// and two empty arrays are treated as unequal.
def isListEqual(list1: Array[Map[String, Array[AppList]]], list2: Array[Map[String, Array[AppList]]]): Boolean = {
  try {
    if (list1.length != list2.length) {
      return false
    } else if (list1.length == 0 || list2.length == 0) {
      return false
    } else {
      return isMapEqual(list1(0), list2(0))
    }
  } catch {
    case e: Exception => false
  }
}

val isColumnEqual = udf((list1: Array[Map[String, Array[AppList]]], list2: Array[Map[String, Array[AppList]]]) => {
  isListEqual(list1, list2)
})

I then pasted the code into spark-shell and ran the following:

val dat = df.withColumn("equal", isColumnEqual($"list", $"list2"))
dat.show()

This produced the following error:

Caused by: org.apache.spark.SparkException: Failed to execute user defined function($anonfun$1: (array<map<string,array<struct<Date:int,Name:string>>>>, array<map<string,array<struct<Date:int,Name:string>>>>) => boolean)
  at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
  at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
  at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:231)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225)
  at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
  at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
  at org.apache.spark.scheduler.Task.run(Task.scala:99)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to [Lscala.collection.immutable.Map;
  at $anonfun$1.apply(<console>:42)
  ... 16 more
Solution

The solution, of course, was to Google it…

I came across an answer saying that simply changing Array to Seq fixes it. A bit embarrassing, but I tried it and it did indeed work.
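Concretely, that means declaring every collection parameter of the UDF and its helpers as Seq instead of Array. A sketch of the corrected version, keeping the original comparison logic, might look like this:

// Same comparison logic as before, but every collection parameter is a Seq,
// which matches the WrappedArray values Spark actually passes into the UDF.
def isMapEqual(map1: Map[String, Seq[AppList]], map2: Map[String, Seq[AppList]]): Boolean = {
  try {
    if (map1.size != map2.size) {
      false
    } else {
      // Guard against missing keys instead of relying on an exception.
      map1.keys.forall(k => map2.contains(k) && map1(k) == map2(k))
    }
  } catch {
    case e: Exception => false
  }
}

def isListEqual(list1: Seq[Map[String, Seq[AppList]]], list2: Seq[Map[String, Seq[AppList]]]): Boolean = {
  try {
    if (list1.length != list2.length) {
      false
    } else if (list1.isEmpty || list2.isEmpty) {
      false
    } else {
      // As in the original, only the first element is compared.
      isMapEqual(list1(0), list2(0))
    }
  } catch {
    case e: Exception => false
  }
}

// Note: the nested structs reach the UDF as Row objects rather than AppList instances;
// this still works here because the elements are only compared for equality,
// never accessed through AppList fields.
val isColumnEqual = udf((list1: Seq[Map[String, Seq[AppList]]], list2: Seq[Map[String, Seq[AppList]]]) => {
  isListEqual(list1, list2)
})

With these definitions, the same withColumn call from above runs without the ClassCastException.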

Root cause

The explanation given there is:

So it looks like the ArrayType on Dataframe "idDF" is really a WrappedArray and not an Array - So the function call to "filterMapKeysWithSet" failed as it expected an Array but got a WrappedArray/Seq instead (which doesn't implicitly convert to Array in Scala 2.8 and above).

In other words, what Spark passes in is not a native Scala Array but a wrapped one (a WrappedArray, i.e. a Seq). (If anything here is wrong, please point it out; I've barely written any Scala, so I'm a little nervous.)
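This is easy to confirm in a plain Scala 2.x REPL without Spark (a quick illustration, not part of the original post): a WrappedArray is a Seq that wraps an Array, but it is not itself an Array, so casting it to one fails.

val wrapped: Seq[Int] = Array(1, 2, 3)  // implicit conversion wraps the Array in a WrappedArray

wrapped.getClass.getName                // scala.collection.mutable.WrappedArray$ofInt
wrapped.isInstanceOf[Array[_]]          // false: it is a Seq, not an Array
// wrapped.asInstanceOf[Array[Int]]     // would throw the same kind of ClassCastException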

