Migrating Hive SQL to Spark 3.0: Spark SQL fails with "Cannot safely cast '<column>': StringType to IntegerType"

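1. Problem

The same statement runs fine as Hive SQL, but on Spark 3.0 it fails with an error like the one quoted in the title.

Looking at the Spark 3.0 source, a new configuration entry has been added: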

  val STORE_ASSIGNMENT_POLICY =
    buildConf("spark.sql.storeAssignmentPolicy")
      .doc("When inserting a value into a column with different data type, Spark will perform " +
        "type coercion. Currently, we support 3 policies for the type coercion rules: ANSI, " +
        "legacy and strict. With ANSI policy, Spark performs the type coercion as per ANSI SQL. " +
        "In practice, the behavior is mostly the same as PostgreSQL. " +
        "It disallows certain unreasonable type conversions such as converting " +
        "`string` to `int` or `double` to `boolean`. " +
        "With legacy policy, Spark allows the type coercion as long as it is a valid `Cast`, " +
        "which is very loose. e.g. converting `string` to `int` or `double` to `boolean` is " +
        "allowed. It is also the only behavior in Spark 2.x and it is compatible with Hive. " +
        "With strict policy, Spark doesn't allow any possible precision loss or data truncation " +
        "in type coercion, e.g. converting `double` to `int` or `decimal` to `double` is " +
        "not allowed."
      )
      .stringConf
      .transform(_.toUpperCase(Locale.ROOT))
      .checkValues(StoreAssignmentPolicy.values.map(_.toString))
      .createWithDefault(StoreAssignmentPolicy.ANSI.toString)

The setting accepts three values:

  object StoreAssignmentPolicy extends Enumeration {
    val ANSI, LEGACY, STRICT = Value
  }

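With the ANSI policy, Spark performs type coercion according to ANSI SQL. In practice the behavior is mostly the same as PostgreSQL.

It disallows certain unreasonable type conversions, such as converting `string` to `int` or `double` to `boolean`.

With the LEGACY policy, Spark allows the type coercion as long as it is a valid `Cast`. This is the only behavior in Spark 2.x, and it is compatible with Hive.

With the STRICT policy, Spark does not allow any possible precision loss or data truncation, e.g. converting `double` to `int` or `decimal` to `double`.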
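To make the difference concrete, here is a minimal sketch of how the default ANSI policy reacts when a `string` column is inserted into an `int` column, and how an explicit `CAST` in the query avoids the implicit store-assignment check. The table and view names (`demo_target`, `demo_source`), the local master, and the temporary warehouse directory are assumptions made for illustration; they are not from the original job.

```scala
import java.nio.file.Files

import org.apache.spark.sql.{AnalysisException, SparkSession}

object AnsiStoreAssignmentSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical local session; a fresh temp warehouse keeps the sketch re-runnable.
    val spark = SparkSession.builder()
      .appName("ansi-store-assignment-sketch")
      .master("local[*]")
      .config("spark.sql.warehouse.dir", Files.createTempDirectory("warehouse").toString)
      .getOrCreate()

    // Illustrative tables: the target column is INT, the source column is STRING.
    spark.sql("CREATE TABLE demo_target (id INT) USING parquet")
    spark.sql("CREATE OR REPLACE TEMPORARY VIEW demo_source AS SELECT '1' AS id")

    // With the default ANSI policy, the implicit string-to-int assignment
    // is rejected while the statement is being analyzed.
    try {
      spark.sql("INSERT INTO demo_target SELECT id FROM demo_source")
    } catch {
      case e: AnalysisException =>
        println(s"Rejected under the default policy: ${e.getMessage}")
    }

    // An explicit CAST makes the query output an INT already,
    // so no implicit store-assignment cast is needed.
    spark.sql("INSERT INTO demo_target SELECT CAST(id AS INT) FROM demo_source")
    spark.table("demo_target").show()

    spark.stop()
  }
}
```

Adding explicit casts keeps the stricter default in place, at the cost of touching every affected statement; whether that is practical depends on how many migrated queries are involved.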

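The default is ANSI, which is why inserting a `string` value into an `int` column is now rejected. To restore the Spark 2.x, Hive-compatible behavior, we add the following setting:

spark.sql.storeAssignmentPolicy=LEGACY

After that, the job runs normally.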
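As a rough sketch of where this setting can go, the snippet below applies it when building the session and, alternatively, through a SQL `SET` statement inside a running session. The application name, local master, temporary warehouse directory, and the `demo_target` table are assumptions for illustration only. The same key can also be passed on the command line with `--conf` or placed in `spark-defaults.conf`.

```scala
import java.nio.file.Files

import org.apache.spark.sql.SparkSession

object LegacyStoreAssignmentSketch {
  def main(args: Array[String]): Unit = {
    // Option 1: set the policy when the session is created.
    val spark = SparkSession.builder()
      .appName("legacy-store-assignment-sketch") // hypothetical name
      .master("local[*]")                        // assumption: local test run
      .config("spark.sql.warehouse.dir", Files.createTempDirectory("warehouse").toString)
      .config("spark.sql.storeAssignmentPolicy", "LEGACY")
      .getOrCreate()

    // Option 2: set it for the current session from SQL,
    // for example at the top of a migrated script.
    spark.sql("SET spark.sql.storeAssignmentPolicy=LEGACY")

    // Read the value back to confirm what the session is using.
    println(spark.conf.get("spark.sql.storeAssignmentPolicy"))

    // Illustrative check: a string value written to an INT column is now
    // accepted and cast implicitly, as in Spark 2.x.
    spark.sql("CREATE TABLE demo_target (id INT) USING parquet")
    spark.sql("INSERT INTO demo_target SELECT '1' AS id")
    spark.table("demo_target").show()

    spark.stop()
  }
}
```

Keep in mind that LEGACY is, in the words of the configuration's own description, very loose: any valid `Cast` is accepted, so type mismatches that ANSI would flag pass silently.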

Original article: https://www.cnblogs.com/songchaolin/p/12098618.html

Time: 2024-09-29 21:31:03
