0015 - How to Use Sentry to Manage Hive External Table Permissions


1. Purpose of This Document

This document describes how to manage Hive external table permissions with Sentry, based on the following assumptions:

1. Operating system: RedHat 6.5

2. CM version: CM 5.11.1

3. Kerberos and Sentry are enabled on the cluster

4. Operations are performed as the ec2-user user, which has sudo privileges

2. Prerequisites

2.1 Create the Parent Directory for External Table Data

1. Log in to Kerberos as the hive user

[[email protected] 1874-hive-HIVESERVER2]# kinit -kt hive.keytab hive/[email protected]
[[email protected] 1874-hive-HIVESERVER2]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: hive/[email protected]

Valid starting     Expires            Service principal
09/01/17 11:10:54  09/02/17 11:10:54  krbtgt/[email protected]
        renew until 09/06/17 11:10:54
[[email protected] 1874-hive-HIVESERVER2]# 

2. Create the HDFS directory

Use the following commands to create the Hive external table data directory /extwarehouse under the HDFS root, then make it owned by hive:hive with mode 771 (full access for the hive user and group, execute-only traversal for everyone else):

[[email protected] ec2-user]# hadoop fs -mkdir /extwarehouse
[[email protected] ec2-user]# hadoop fs -ls /
drwxr-xr-x   - hive   supergroup          0 2017-09-01 11:27 /extwarehouse
drwxrwxrwx   - user_r supergroup          0 2017-08-23 03:23 /fayson
drwx------   - hbase  hbase               0 2017-09-01 02:59 /hbase
drwxrwxrwt   - hdfs   supergroup          0 2017-08-31 06:18 /tmp
drwxrwxrwx   - hdfs   supergroup          0 2017-08-30 03:48 /user
[[email protected] ec2-user]# hadoop fs -chown hive:hive /extwarehouse
[[email protected] ec2-user]# hadoop fs -chmod 771 /extwarehouse
[[email protected] ec2-user]# hadoop fs -ls /
drwxrwx--x   - hive   hive                0 2017-09-01 11:27 /extwarehouse
drwxrwxrwx   - user_r supergroup          0 2017-08-23 03:23 /fayson
drwx------   - hbase  hbase               0 2017-09-01 02:59 /hbase
drwxrwxrwt   - hdfs   supergroup          0 2017-08-31 06:18 /tmp
drwxrwxrwx   - hdfs   supergroup          0 2017-08-30 03:48 /user
[[email protected] ec2-user]# 

2.2 Configure ACL Synchronization for the External Table Data Parent Directory

1. Make sure Sentry is enabled for the HDFS service and ACL synchronization is turned on

2. Add the Hive external table data directory created in section 2.1 to the Sentry synchronization path prefixes

3. Once configuration is complete, restart the affected services (a sketch of the settings involved follows this list).
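
The original post showed these steps as Cloudera Manager screenshots. As a rough sketch (the field labels below are quoted from memory of the CDH 5.x HDFS service configuration and should be treated as assumptions), the settings involved are:

# On the HDFS service in Cloudera Manager:
#   "Enable Access Control Lists"          -> checked (dfs.namenode.acls.enabled=true)
#   "Enable Sentry Synchronization"        -> checked
#   "Sentry Synchronization Path Prefixes" -> add /extwarehouse
# Save the changes, then restart HDFS and its dependent services.

# Optional sanity check from a gateway node (this reads the client-side
# config, so the value may be absent if it is only set server-side):
hdfs getconf -confKey dfs.namenode.acls.enabled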

3. Create a Hive External Table

1. Connect to HiveServer2 with the beeline command line and create the external table

Table DDL:

create external table if not exists student(
        name string,
        age int,
        addr string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/extwarehouse/student';

Terminal session:

[[email protected] 1874-hive-HIVESERVER2]# beeline
Beeline version 1.1.0-cdh5.11.1 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000/;principal=hive/[email protected]
...
0: jdbc:hive2://localhost:10000/> create external table if not exists student(
. . . . . . . . . . . . . . . . >         name string,
. . . . . . . . . . . . . . . . >         age int,
. . . . . . . . . . . . . . . . >         addr string
. . . . . . . . . . . . . . . . > )
. . . . . . . . . . . . . . . . > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
. . . . . . . . . . . . . . . . > LOCATION '/extwarehouse/student';
...
INFO  : OK
No rows affected (0.236 seconds)
0: jdbc:hive2://localhost:10000/> 

2. Load data into the student table

Prepare the test data:

[[email protected] student]# pwd
/home/ec2-user/student
[[email protected] student]# ll
total 4
-rw-r--r-- 1 root root 39 Sep  1 11:37 student.txt
[[email protected] student]# cat student.txt
zhangsan,18,guangzhou
lisi,20,shenzhen
[[email protected] student]# 

Put the student.txt file into the /tmp/student directory on HDFS:

[[email protected] student]# hadoop fs -mkdir /tmp/student
[[email protected] student]# ll
total 4
-rw-r--r-- 1 hive hive 39 Sep  1 11:37 student.txt
[[email protected] student]# hadoop fs -put student.txt /tmp/student
[[email protected] student]# hadoop fs -ls /tmp/student
Found 1 items
-rw-r--r--   3 hive supergroup         39 2017-09-01 11:57 /tmp/student/student.txt
[[email protected] student]# 

In beeline, load the data into the student table. Note that LOAD DATA INPATH moves (rather than copies) the files from /tmp/student into the table's LOCATION, /extwarehouse/student:

0: jdbc:hive2://localhost:10000/> load data inpath '/tmp/student' into table student;
...
INFO  : Table default.student stats: [numFiles=1, totalSize=39]
INFO  : Completed executing command(queryId=hive_20170901115858_5a76aa76-1b24-40ce-8254-42991856c05b); Time taken: 0.263 seconds
INFO  : OK
No rows affected (0.41 seconds)
0: jdbc:hive2://localhost:10000/> 

After the load completes, query the table:

0: jdbc:hive2://localhost:10000/> select * from student;
...
INFO  : OK
+---------------+--------------+---------------+--+
| student.name  | student.age  | student.addr  |
+---------------+--------------+---------------+--+
| zhangsan      | 18           | guangzhou     |
| lisi          | 20           | shenzhen      |
+---------------+--------------+---------------+--+
2 rows selected (0.288 seconds)
0: jdbc:hive2://localhost:10000/> 

4. Query as the fayson User from beeline and impala-shell

Initialize a Kerberos ticket with the fayson user's principal:

[[email protected] cdh-shell-master]$ kinit fayson
Password for [email protected]:
[[email protected] cdh-shell-master]$ klist
Ticket cache: FILE:/tmp/krb5cc_500
Default principal: [email protected]

Valid starting     Expires            Service principal
09/01/17 12:27:39  09/02/17 12:27:39  krbtgt/[email protected]
        renew until 09/08/17 12:27:39
[[email protected] cdh-shell-master]$ 

4.1 Access the HDFS Directory

[[email protected] ~]$ hadoop fs -ls /extwarehouse/student
ls: Permission denied: user=fayson, access=READ_EXECUTE, inode="/extwarehouse/student":hive:hive:drwxrwx--x
[[email protected] ~]$ 

4.2 Query from beeline

[[email protected] ~]$ beeline
Beeline version 1.1.0-cdh5.11.1 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000/;principal=hive/[email protected]
...
INFO  : OK
+-----------+--+
| tab_name  |
+-----------+--+
+-----------+--+
No rows selected (0.295 seconds)
0: jdbc:hive2://localhost:10000/> select * from student;
Error: Error while compiling statement: FAILED: SemanticException No valid privileges
 User fayson does not have privileges for QUERY
 The required privileges: Server=server1->Db=default->Table=student->Column=addr->action=select; (state=42000,code=40000)
0: jdbc:hive2://localhost:10000/> 

4.3 Query from impala-shell

[[email protected] cdh-shell-master]$ impala-shell
...
[Not connected] > connect ip-172-31-10-156.ap-southeast-1.compute.internal:21000;
Connected to ip-172-31-10-156.ap-southeast-1.compute.internal:21000
Server version: impalad version 2.8.0-cdh5.11.1 RELEASE (build 3382c1c488dff12d5ca8d049d2b59babee605b4e)
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] > show tables;
Query: show tables
ERROR: AuthorizationException: User '[email protected]' does not have privileges to access: default.*

[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] > select * from student;
Query: select * from student
Query submitted at: 2017-09-01 12:33:06 (Coordinator: http://ip-172-31-10-156.ap-southeast-1.compute.internal:25000)
ERROR: AuthorizationException: User '[email protected]' does not have privileges to execute 'SELECT' on: default.student

[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] > 

4.4 Test Summary

For the external table created by the hive user, before fayson is granted read permission on the student table, fayson cannot access the table's HDFS data directory (/extwarehouse/student) and cannot query the student table from either beeline or impala-shell.

5. Grant the fayson User Read Permission on the student Table

Note: all of the following operations are performed as a Hive administrator (the grants are verified at the end of this section).

1. Create the student_read role

0: jdbc:hive2://localhost:10000/> create role student_read;
...
INFO  : Executing command(queryId=hive_20170901124848_927878ba-0217-4a32-a508-bf29fed67be8): create role student_read
...
INFO  : OK
No rows affected (0.104 seconds)
0: jdbc:hive2://localhost:10000/> 

2. Grant SELECT on the student table to the student_read role

0: jdbc:hive2://localhost:10000/> grant select on table student to role student_read;
...
INFO  : Executing command(queryId=hive_20170901125252_8702d99d-d8eb-424e-929d-5df352828e2c): grant select on table student to role student_read
...
INFO  : OK
No rows affected (0.111 seconds)
0: jdbc:hive2://localhost:10000/> 

3. Grant the student_read role to the fayson group

0: jdbc:hive2://localhost:10000/> grant role student_read to group fayson;
...
INFO  : Executing command(queryId=hive_20170901125454_5f27a87e-2f63-46d9-9cce-6f346a0c415c): grant role student_read to group fayson
...
INFO  : OK
No rows affected (0.122 seconds)
0: jdbc:hive2://localhost:10000/> 
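
As a verification step, a Hive administrator can confirm the grants in beeline. SHOW ROLE GRANT GROUP and SHOW GRANT ROLE are standard Sentry statements, but the session below is a sketch rather than output captured from the original cluster:

0: jdbc:hive2://localhost:10000/> show role grant group fayson;
0: jdbc:hive2://localhost:10000/> show grant role student_read;

The first statement should list student_read, and the second should show the SELECT privilege on default.student.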

6. Test Again

Log in to Kerberos as the fayson user (the same kinit as in section 4).

6.1 Access the HDFS Directory

Access the HDFS directory backing the student table, /extwarehouse/student:

[[email protected] ~]$ hadoop fs -ls /extwarehouse/student
Found 1 items
-rwxrwx--x+  3 hive hive         39 2017-09-01 14:42 /extwarehouse/student/student.txt
[[email protected] ~]$ 
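
The "+" suffix on the permission bits marks an extended ACL, which is how the Sentry grant is materialized in HDFS. A getfacl call makes the synchronized entry visible; the output below is illustrative (an assumption, not captured from the original cluster):

[[email protected] ~]$ hadoop fs -getfacl /extwarehouse/student
# file: /extwarehouse/student
# owner: hive
# group: hive
user::rwx
group::---
group:hive:rwx
group:fayson:r-x
mask::rwx
other::--x

Here the group:fayson entry corresponds to the SELECT privilege granted through the student_read role.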

6.2 Query the student Table from beeline

[[email protected] ~]$ klist
Ticket cache: FILE:/tmp/krb5cc_500
Default principal: [email protected]

Valid starting     Expires            Service principal
09/01/17 12:58:59  09/02/17 12:58:59  krbtgt/[email protected]
        renew until 09/08/17 12:58:59
[[email protected] ~]$
[[email protected] ~]$ beeline
Beeline version 1.1.0-cdh5.11.1 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000/;principal=hive/[email protected]
...
INFO  : OK
+-----------+--+
| tab_name  |
+-----------+--+
| student   |
+-----------+--+
1 row selected (0.294 seconds)
0: jdbc:hive2://localhost:10000/> select * from student;
...
INFO  : OK
+---------------+--------------+---------------+--+
| student.name  | student.age  | student.addr  |
+---------------+--------------+---------------+--+
| zhangsan      | 18           | guangzhou     |
| lisi          | 20           | shenzhen      |
+---------------+--------------+---------------+--+
2 rows selected (0.241 seconds)
0: jdbc:hive2://localhost:10000/> 

6.3 Query the student Table from impala-shell

[[email protected] cdh-shell-master]$ klist
Ticket cache: FILE:/tmp/krb5cc_500
Default principal: [email protected]

Valid starting     Expires            Service principal
09/01/17 12:58:59  09/02/17 12:58:59  krbtgt/[email protected]
        renew until 09/08/17 12:58:59
[[email protected] cdh-shell-master]$ impala-shell
...
[Not connected] > connect ip-172-31-10-156.ap-southeast-1.compute.internal:21000;
Connected to ip-172-31-10-156.ap-southeast-1.compute.internal:21000
Server version: impalad version 2.8.0-cdh5.11.1 RELEASE (build 3382c1c488dff12d5ca8d049d2b59babee605b4e)
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] > show tables;
Query: show tables
+---------+
| name    |
+---------+
| student |
+---------+
Fetched 1 row(s) in 0.02s
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] > select * from student;
...
+----------+-----+-----------+
| name     | age | addr      |
+----------+-----+-----------+
| zhangsan | 18  | guangzhou |
| lisi     | 20  | shenzhen  |
+----------+-----+-----------+
Fetched 2 row(s) in 0.13s
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] > 

6.4 Test Summary

For the external table created by the hive user, once fayson has been granted read permission on the student table, fayson can access the table's HDFS data directory (/extwarehouse/student) and can query the student table from both beeline and impala-shell.

7. Summary: Managing Hive External Table Permissions with Sentry

Once ACL synchronization is enabled for the parent directory of external table data, there is no need to maintain HDFS permissions on the external table data directories separately: Sentry grants and revokes are reflected in HDFS automatically.
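
Correspondingly (a hedged sketch, not part of the original test), revoking the role from the group should remove both the table privilege and, through ACL synchronization, the HDFS access in a single step, with no hadoop fs -chmod or -setfacl housekeeping:

0: jdbc:hive2://localhost:10000/> revoke role student_read from group fayson;

After this, fayson's select on student and hadoop fs -ls /extwarehouse/student should both fail with permission errors again.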

Reference:

https://www.cloudera.com/documentation/enterprise/latest/topics/sg_hdfs_sentry_sync.html
