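Migrating Hive SQL to Spark 3.0: Spark SQL fails with "Cannot safely cast '<column>': StringType to IntegerType"

1. Problem

The statement runs fine as Hive SQL, but under Spark 3.0 it fails with an error like the one in the title.

Looking at the Spark 3.0 source, a new configuration entry has been added: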

  val STORE_ASSIGNMENT_POLICY =
    buildConf("spark.sql.storeAssignmentPolicy")
      .doc("When inserting a value into a column with different data type, Spark will perform " +
        "type coercion. Currently, we support 3 policies for the type coercion rules: ANSI, " +
        "legacy and strict. With ANSI policy, Spark performs the type coercion as per ANSI SQL. " +
        "In practice, the behavior is mostly the same as PostgreSQL. " +
        "It disallows certain unreasonable type conversions such as converting " +
        "`string` to `int` or `double` to `boolean`. " +
        "With legacy policy, Spark allows the type coercion as long as it is a valid `Cast`, " +
        "which is very loose. e.g. converting `string` to `int` or `double` to `boolean` is " +
        "allowed. It is also the only behavior in Spark 2.x and it is compatible with Hive. " +
        "With strict policy, Spark doesn't allow any possible precision loss or data truncation " +
        "in type coercion, e.g. converting `double` to `int` or `decimal` to `double` is " +
        "not allowed."
      )
      .stringConf
      .transform(_.toUpperCase(Locale.ROOT))
      .checkValues(StoreAssignmentPolicy.values.map(_.toString))
      .createWithDefault(StoreAssignmentPolicy.ANSI.toString)

The setting accepts three values:

  object StoreAssignmentPolicy extends Enumeration {
    val ANSI, LEGACY, STRICT = Value
  }

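With the ANSI policy, Spark performs type coercion as per ANSI SQL. In practice, the behavior is mostly the same as PostgreSQL.

It disallows certain unreasonable type conversions, such as converting `string` to `int` or `double` to `boolean`.

With the LEGACY policy, Spark allows the type coercion as long as it is a valid `Cast`. This is also the only behavior in Spark 2.x, and it is compatible with Hive.

With the STRICT policy, Spark does not allow any possible precision loss or data truncation.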
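To make the difference concrete, here is a minimal sketch of a write that the ANSI policy is expected to reject. It assumes a local Spark 3.0 session with default settings; the table name `demo_target` and the object name are hypothetical and only serve the illustration.

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}

// Hypothetical reproduction; `demo_target` is a made-up table name.
object StoreAssignmentRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("store-assignment-repro")
      .master("local[1]")
      .getOrCreate()

    spark.sql("CREATE TABLE demo_target (id INT) USING parquet")

    try {
      // A string value written into an INT column. Under the default
      // ANSI policy this is expected to be rejected during analysis.
      spark.sql("INSERT INTO demo_target SELECT '1' AS id")
    } catch {
      case e: AnalysisException => println(s"Rejected: ${e.getMessage}")
    } finally {
      spark.sql("DROP TABLE IF EXISTS demo_target")
      spark.stop()
    }
  }
}
```

Hive and Spark 2.x would accept the same insert, because they apply the loose `Cast`-based behavior that Spark 3.0 now calls LEGACY.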

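The default is ANSI, which is why an insert that Hive accepted is now rejected. So we add the following configuration: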

spark.sql.storeAssignmentPolicy=LEGACY

After that, the job runs normally.
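The setting can be supplied in more than one place. The sketch below shows two options from Scala, assuming the job builds its own `SparkSession`; the object and application names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example object; only the configuration key and value
// come from the Spark source quoted above.
object LegacyStoreAssignment {
  def main(args: Array[String]): Unit = {
    // Option 1: set the policy when the session is built.
    val spark = SparkSession.builder()
      .appName("legacy-store-assignment")
      .master("local[1]")
      .config("spark.sql.storeAssignmentPolicy", "LEGACY")
      .getOrCreate()

    // Option 2: set it at runtime with a SQL SET statement,
    // which is convenient inside a migrated Hive SQL script.
    spark.sql("SET spark.sql.storeAssignmentPolicy=LEGACY")

    // Print the effective value to confirm it was picked up.
    println(spark.conf.get("spark.sql.storeAssignmentPolicy"))

    spark.stop()
  }
}
```

When the job is launched from the command line, passing `--conf spark.sql.storeAssignmentPolicy=LEGACY` to `spark-submit` or `spark-sql` should have the same effect without touching the code.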

Original article: https://www.cnblogs.com/songchaolin/p/12098618.html

Time: 2024-09-29 21:31:03
