0002-Installing CDH 5.10 and Kudu 1.2 on CentOS 7.2

Fayson's GitHub: https://github.com/fayson/cdhproject

Follow the WeChat official account "Hadoop实操" (ID: gh_c4c535955d0f), or scan the QR code at the end of this post.

1. Overview

This document describes deploying CDH Enterprise on the CentOS 7.2 operating system. Installing a Cloudera enterprise data hub consists of five steps:

1. Configure the cluster servers: install the operating system, disable the firewall, synchronize the server clocks, etc.
2. Install the external database.
3. Install Cloudera Manager.
4. Install the CDH cluster.
5. Check cluster integrity: verify that HDFS, MapReduce, Hive, etc. run correctly.

This document focuses on installing Cloudera Manager and CDH, and is based on the following assumptions:

1. Operating system version: CentOS 7.2
2. MariaDB version: 10.2.1
3. CM version: CM 5.10.0
4. CDH version: CDH 5.10.0
5. The cluster is deployed as the ec2-user account.
6. You have already downloaded the CDH and CM installation packages.

2. Prerequisites

2.1. hostname and hosts configuration

The nodes in the cluster communicate with each other over static IP addresses. Map the IP addresses to hostnames in /etc/hosts, and set each node's hostname in /etc/hostname.

Using the cm node (172.31.2.159) as an example:

  • hostname configuration

The /etc/hostname file looks like:

ip-172-31-2-159 

Alternatively, make the change take effect immediately with:

[email protected] ~$ sudo hostnamectl set-hostname ip-172-31-2-159

<font face="微软雅黑" size=4 color=red>Note: the way the hostname is modified here differs from RHEL 6.</font>

  • hosts configuration

The /etc/hosts file looks like:

172.31.2.159 ip-172-31-2-159
172.31.12.108 ip-172-31-12-108
172.31.5.236 ip-172-31-5-236
172.31.7.96 ip-172-31-7-96

Apply the same two configuration steps on the other nodes in the cluster.

2.2. Disable SELinux

Run sudo setenforce 0 on all nodes; here the command is batched over ssh with a shell script:

[email protected] ~$ sh ssh_do_all.sh node.list "sudo setenforce 0" 
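The ssh_do_all.sh helper used throughout this post is never shown. A minimal sketch of what such a batch runner might look like (the `ssh_do_all` function name and the DRY_RUN switch are illustrative additions, not the original script):

```shell
# Minimal sketch of a batch-ssh helper like the post's ssh_do_all.sh:
# run one command on every host listed in a node file.
# DRY_RUN=1 prints the ssh invocations instead of executing them.
ssh_do_all() {
    node_list="$1"
    cmd="$2"
    while read -r host; do
        [ -z "$host" ] && continue   # skip blank lines
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh $host $cmd"
        else
            ssh -o StrictHostKeyChecking=no "$host" "$cmd"
        fi
    done < "$node_list"
}
```

With a node.list containing one hostname per line, `DRY_RUN=1 ssh_do_all node.list "sudo setenforce 0"` prints the commands that would run on each host.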

On all cluster nodes, also modify /etc/selinux/config as follows:

SELINUX=disabled
SELINUXTYPE=targeted
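The persistent half of the change can be scripted as well. A sketch using sed on a scratch copy of the file so it runs anywhere; on a real node the target would be /etc/selinux/config, pushed to every host with the batch helper:

```shell
# Sketch: rewrite the SELINUX= line to disable SELinux persistently.
# /tmp/selinux.config stands in for /etc/selinux/config here.
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > /tmp/selinux.config
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /tmp/selinux.config
grep '^SELINUX=' /tmp/selinux.config   # -> SELINUX=disabled
```

Note that the `^SELINUX=` anchor leaves the SELINUXTYPE line untouched.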

2.3. Disable the firewall

Run sudo systemctl stop firewalld on all cluster nodes; here the commands are batched over ssh as follows:

[[email protected] ~]$ sh ssh_do_all.sh node.list "sudo systemctl stop firewalld"
[[email protected] ~]$ sh ssh_do_all.sh node.list "sudo systemctl disable firewalld"
[[email protected] ~]$ sh ssh_do_all.sh node.list "sudo systemctl status firewalld"

2.4. Cluster clock synchronization

CentOS 7.2 ships with chrony installed by default. Configure chrony clock synchronization with the cm node (172.31.2.159) acting as the local chrony server and the other three servers synchronizing with it. Configuration fragments:

  • 172.31.2.159 synchronizes with itself

    [[email protected] ~]$ sudo vim /etc/chrony.conf
    server ip-172-31-2-159 iburst
    #keyfile=/etc/chrony.keys
  • On the other cluster nodes, add the following below the commented pool servers:
[[email protected] ~]$ sudo vim /etc/chrony.conf
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
server ip-172-31-2-159 iburst
#keyfile=/etc/chrony.keys
  • Restart the chronyd service on all machines
[[email protected] ~]$ sh ssh_do_all.sh node.list "sudo systemctl restart chronyd"

  • Verify clock synchronization by running chronyc sources on all nodes; here batched with the script:
[[email protected] ~]$ sh ssh_do_all.sh node.list "chronyc sources"

2.5. Configure the OS repo

  • Mount the OS iso file

    [[email protected] ~]$ sudo mkdir /media/DVD1
    [[email protected] ~]$ sudo mount -o loop CentOS-7-x86_64-DVD-1611.iso /media/DVD1/

  • Configure the OS repo
    [[email protected] ~]$ sudo vim /etc/yum.repos.d/local_os.repo
    [local_iso]
    name=CentOS-$releasever - Media
    baseurl=file:///media/DVD1
    gpgcheck=0
    enabled=1
    [[email protected] ~]$ sudo yum repolist

2.6. Install the http service

  • Install the httpd service
[[email protected] ~]$ sudo yum -y install httpd
  • Start or stop the httpd service
[[email protected] ~]$ sudo systemctl start httpd
[[email protected] ~]$ sudo systemctl stop httpd
  • After installing httpd, rebuild the OS repo over http so that the other servers can access it too

    [[email protected] ~]$ sudo mkdir /var/www/html/iso
    [[email protected] ~]$ sudo scp -r /media/DVD1/* /var/www/html/iso/
    [[email protected] ~]$ sudo vim /etc/yum.repos.d/os.repo
    [osrepo]
    name=os_repo
    baseurl=http://172.31.2.159/iso/
    enabled=true
    gpgcheck=false
    [[email protected] ~]$ sudo yum repolist

2.7. Install MariaDB

  • CentOS 7 ships MariaDB 5.5.52 by default; version 10.2.1 is used here instead. Download the rpm packages from the MariaDB site:

    MariaDB-10.2.1-centos7-x86_64-client.rpm
    MariaDB-10.2.1-centos7-x86_64-common.rpm
    MariaDB-10.2.1-centos7-x86_64-compat.rpm
    MariaDB-10.2.1-centos7-x86_64-server.rpm

Download the packages into a single local directory and run createrepo there to generate the rpm metadata.

Using the httpd (Apache) service set up earlier, move the mariadb10.2.1 directory to /var/www/html so that the rpm packages can be accessed over HTTP.

[[email protected] ~]$ sudo mv mariadb10.2.1 /var/www/html/

Install the MariaDB dependencies

[[email protected] ~]$ sudo yum -y install libaio perl perl-DBI perl-Module-Pluggable perl-Pod-Escapes perl-Pod-Simple perl-libs perl-version

Create the local repo

[[email protected] ~]$ sudo vim /etc/yum.repos.d/mariadb.repo
[mariadb]
name = MariaDB
baseurl = http://172.31.2.159/mariadb10.2.1
enabled = true
gpgcheck = false
[[email protected] ~]$ sudo yum repolist  
  • Install MariaDB

    [[email protected] ~]$ sudo yum -y install MariaDB-server MariaDB-client
  • Start and configure MariaDB
[[email protected] ~]$ sudo systemctl start mariadb

[[email protected] ~]$ sudo /usr/bin/mysql_secure_installation

NOTE: RUNNING ALL PARTS OF THIS SCRIPT IS RECOMMENDED FOR ALL MariaDB
      SERVERS IN PRODUCTION USE!  PLEASE READ EACH STEP CAREFULLY!

In order to log into MariaDB to secure it, we'll need the current
password for the root user.  If you've just installed MariaDB, and
you haven't set the root password yet, the password will be blank,
so you should just press enter here.

Enter current password for root (enter for none):
OK, successfully used password, moving on...

Setting the root password ensures that nobody can log into the MariaDB
root user without the proper authorisation.

Set root password? [Y/n] Y
New password:
Re-enter new password:
Password updated successfully!
Reloading privilege tables..
 ... Success!

By default, a MariaDB installation has an anonymous user, allowing anyone
to log into MariaDB without having to have a user account created for
them.  This is intended only for testing, and to make the installation
go a bit smoother.  You should remove them before moving into a
production environment.

Remove anonymous users? [Y/n] Y
 ... Success!

Normally, root should only be allowed to connect from 'localhost'.  This
ensures that someone cannot guess at the root password from the network.

Disallow root login remotely? [Y/n] n
 ... skipping.

By default, MariaDB comes with a database named 'test' that anyone can
access.  This is also intended only for testing, and should be removed
before moving into a production environment.

Remove test database and access to it? [Y/n] Y
 - Dropping test database...
 ... Success!
 - Removing privileges on test database...
 ... Success!

Reloading the privilege tables will ensure that all changes made so far
will take effect immediately.

Reload privilege tables now? [Y/n] Y
 ... Success!

Cleaning up...

All done!  If you've completed all of the above steps, your MariaDB
installation should now be secure.

Thanks for using MariaDB!
  • Create the databases needed by CM, Hive, and the other services
[[email protected] ~]$ mysql -uroot -p
Enter password:
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.2.1-MariaDB MariaDB Server
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]>

create database metastore default character set utf8;
CREATE USER 'hive'@'%' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'%';
FLUSH PRIVILEGES;

create database cm default character set utf8;
CREATE USER 'cm'@'%' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON cm.* TO 'cm'@'%';
FLUSH PRIVILEGES;

create database am default character set utf8;
CREATE USER 'am'@'%' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON am.* TO 'am'@'%';
FLUSH PRIVILEGES;

create database rm default character set utf8;
CREATE USER 'rm'@'%' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON rm.* TO 'rm'@'%';
FLUSH PRIVILEGES;
  • Install the jdbc driver
[[email protected] ~]$ sudo mkdir -p /usr/share/java/
[[email protected] ~]$ sudo mv mysql-connector-java-5.1.37.jar /usr/share/java/
[[email protected] java]$ cd /usr/share/java
[[email protected] java]$ sudo ln -s mysql-connector-java-5.1.37.jar mysql-connector-java.jar
[[email protected] java]$ ll
total 964
-rw-r--r--. 1 root root 985600 Oct  6  2015 mysql-connector-java-5.1.37.jar
lrwxrwxrwx. 1 root root     31 Mar 29 14:37 mysql-connector-java.jar -> mysql-connector-java-5.1.37.jar

3. Cloudera Manager Installation

3.1. Configure the local repo source

Download the 7 rpm packages that Cloudera Manager needs into a single local directory, then run createrepo there to generate the rpm metadata.

[[email protected] cm]$ ls
cloudera-manager-agent-5.10.0-1.cm5100.p0.85.el7.x86_64.rpm
cloudera-manager-daemons-5.10.0-1.cm5100.p0.85.el7.x86_64.rpm
cloudera-manager-server-5.10.0-1.cm5100.p0.85.el7.x86_64.rpm
cloudera-manager-server-db-2-5.10.0-1.cm5100.p0.85.el7.x86_64.rpm
enterprise-debuginfo-5.10.0-1.cm5100.p0.85.el7.x86_64.rpm
jdk-6u31-linux-amd64.rpm
oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
[[email protected] cm]$ sudo createrepo .
Spawning worker 0 with 1 pkgs
Spawning worker 1 with 1 pkgs
Spawning worker 2 with 1 pkgs
Spawning worker 3 with 1 pkgs
Spawning worker 4 with 1 pkgs
Spawning worker 5 with 1 pkgs
Spawning worker 6 with 1 pkgs
Spawning worker 7 with 0 pkgs
Workers Finished
Saving Primary metadata
Saving file lists metadata
Saving other metadata
Generating sqlite DBs
Sqlite DBs complete
  • Configure the web server

    Using the httpd (Apache) service set up earlier, move the cdh5.10.0 and cm5.10.0 directories to /var/www/html so that the rpm packages can be accessed over HTTP.

    [[email protected] ~]$ sudo mv cdh5.10.0/ cm5.10.0/ /var/www/html/


  • Create the Cloudera Manager repo source
    [[email protected] ~]$ sudo vim /etc/yum.repos.d/cm.repo
    [cmrepo]
    name = cm_repo
    baseurl = http://172.31.2.159/cm5.10.0
    enabled = true
    gpgcheck = false
    [[email protected] yum.repos.d]$ sudo yum repolist
  • Install the JDK
    [[email protected] ~]$ sudo yum -y install oracle-j2sdk1.7-1.7.0+update67-1

3.2. Install Cloudera Manager Server

  • Install Cloudera Manager Server via yum

    [[email protected] ~]$ sudo yum -y install cloudera-manager-server
  • Initialize the database
    [[email protected] ~]$ sudo /usr/share/cmf/schema/scm_prepare_database.sh mysql cm cm password
    JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
    Verifying that we can write to /etc/cloudera-scm-server
    Creating SCM configuration file in /etc/cloudera-scm-server
    Executing:  /usr/java/jdk1.7.0_67-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/
    oracle-connector-java.jar:/usr/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
    [                          main] DbCommandExecutor              INFO  Successfully connected to database.
    All done, your SCM database is configured correctly!
  • Start Cloudera Manager Server
    [[email protected] ~]$ sudo systemctl start cloudera-scm-server
  • Check that the port is listening
    [[email protected] ~]$ sudo netstat -lnpt | grep 7180
    tcp        0      0 0.0.0.0:7180            0.0.0.0:*               LISTEN      6890/java  
  • Log in to CM at http://172.31.2.159:7180/cmf/login

4. CDH Installation

4.1. CDH cluster installation wizard

1.Log in to CM as admin/admin
2.Accept the license agreement and click Continue

3.Select the 60-day trial and click Continue

4.Click "Continue"

5.Enter the host IPs or names, click Search, and once the hosts are found click Continue

6.Click "Continue"
7.For the parcel repository, click "More Options", remove all the other addresses with "-", enter http://172.31.2.159/cdh5.10.0/, and click "Save Changes"

8.Select "Custom Repository" and enter the http address of the cm repo

9.Click "Continue" to proceed to installing the JDK

10.Click "Continue" to proceed, keeping the default multi-user mode

11.Click "Continue" to proceed to configuring the ssh account and password

12.Click "Continue" to proceed with installing the Cloudera Manager packages on all nodes

13.Click "Continue" to proceed with installing CDH on all nodes

14.Click "Continue" to proceed to the host inspector; make sure all checks pass


Click Finish to enter the service installation wizard.

4.2. Cluster setup wizard

1.Select the services to install

2.Click "Continue" to assign the cluster roles

3.Click "Continue" to proceed to testing the database connections

4.Once the test succeeds, click "Continue" to proceed to the directory settings; the defaults are used here, adjust the directories to your actual environment

5.Click "Continue" to start each service

6.Installation succeeded

7.After a successful installation, the home administration page appears

5. Kudu Installation

Starting with CDH 5.10, Kudu 1.2 is packaged and integrated into CDH and officially supported by Cloudera. From this release Kudu is much simpler to install than before: the separate Impala_Kudu build is no longer needed, and once Kudu is installed, Impala can operate on Kudu directly.

The following steps assume Kudu 1.2 is installed and deployed through Cloudera Manager.

5.1. Install the csd file

1.Download the csd file

[[email protected] ~]# wget http://archive.cloudera.com/kudu/csd/KUDU-5.10.0.jar

2.Move the downloaded jar file to the /opt/cloudera/csd directory

[[email protected] ~]# mv KUDU-5.10.0.jar /opt/cloudera/csd

3.Fix the file ownership and permissions

[[email protected] ~]# chown cloudera-scm:cloudera-scm /opt/cloudera/csd/KUDU-5.10.0.jar
[[email protected] ~]# chmod 644 /opt/cloudera/csd/KUDU-5.10.0.jar

4.Restart the Cloudera Manager service

[[email protected] ~]# systemctl restart cloudera-scm-server

5.2. Install the Kudu service

1.Download the Parcel packages the Kudu service needs

[[email protected] ~]# wget http://archive.cloudera.com/kudu/parcels/5.10/KUDU-1.2.0-1.cdh5.10.1.p0.66-el7.parcel
[[email protected] ~]# wget http://archive.cloudera.com/kudu/parcels/5.10/KUDU-1.2.0-1.cdh5.10.1.p0.66-el7.parcel.sha1
[[email protected] ~]# wget http://archive.cloudera.com/kudu/parcels/5.10/manifest.json

2.Publish the Kudu Parcel packages via the http service

[[email protected] ~]# mkdir kudu1.2
[[email protected] ~]# mv KUDU-1.2.0-1.cdh5.10.1.p0.66-el7.parcel* kudu1.2/
[[email protected] ~]# mv manifest.json kudu1.2
[[email protected] ~]# mv kudu1.2/ /var/www/html/
[[email protected] ~]# systemctl start httpd
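Parcels are integrity-checked against their .sha1 companion files, so a quick local check before distributing can catch a truncated download. A sketch (`verify_parcel` is an illustrative helper, not part of the original post):

```shell
# Sketch: verify a downloaded parcel against its .sha1 companion file
# before serving it to the cluster.
verify_parcel() {
    parcel="$1"
    shafile="$2"
    expected=$(awk '{print $1}' "$shafile")        # .sha1 holds the hash
    actual=$(sha1sum "$parcel" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK"
    else
        echo "MISMATCH"
    fi
}

# e.g. verify_parcel KUDU-1.2.0-1.cdh5.10.1.p0.66-el7.parcel \
#                    KUDU-1.2.0-1.cdh5.10.1.p0.66-el7.parcel.sha1
```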

3.Check over http that the Kudu files are served correctly:

4.In the CM UI, configure the Kudu Parcel address, then download, distribute, and activate Kudu.

5.Install Kudu 1.2 through CM

Add the Kudu service

Select the Master and the Tablet Servers

Configure the corresponding directories. <font face="微软雅黑" size=4 color=red>Note: for both the Master and the Tablet Servers there may be multiple data directories (fs_data_dir), depending on your environment, to increase concurrent reads and writes and thus improve Kudu performance.</font>

Start the Kudu service

Installation complete

5.3. Configure Impala

In CDH 5.10, once Kudu 1.2 is installed, Impala can run SQL against Kudu directly by default. However, to avoid having to add the kudu master address in TBLPROPERTIES every time a table is created, it is recommended to set the Kudu Master address in Impala's advanced configuration:

--kudu_master_hosts=ip-172-31-2-159:7051
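Without that setting, every Kudu DDL statement has to name the master itself through the kudu.master_addresses table property. An illustrative example (the table name and schema are made up for demonstration):

```sql
-- Illustrative only: CREATE TABLE when the Kudu master is not set
-- globally via --kudu_master_hosts.
CREATE TABLE kudu_demo (
  id BIGINT,
  name STRING,
  PRIMARY KEY(id)
)
PARTITION BY HASH PARTITIONS 4
STORED AS KUDU
TBLPROPERTIES('kudu.master_addresses' = 'ip-172-31-2-159:7051');
```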

6. Quick Service Validation

6.1. HDFS validation (mkdir + put + cat + get)

[[email protected] ~]# hadoop fs -mkdir -p /lilei/test_table
[[email protected] ~]# cat > a.txt
1#2
c#d
我#你^C
[[email protected] ~]#
[[email protected] ~]#
[[email protected] ~]#
[[email protected] ~]# hadoop fs -put a.txt /lilei/test_table
[[email protected] ~]# hadoop fs -cat /lilei/test_table/a.txt
1#2
c#d
[[email protected] ~]# rm -rf a.txt
[[email protected] ~]#
[[email protected] ~]# hadoop fs -get /lilei/test_table/a.txt
[[email protected] ~]#
[[email protected] ~]# cat a.txt
1#2
c#d

6.2. Hive validation

[[email protected] ~]# hive

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/jars/hive-common-1.1.0-cdh5.10.0.jar!/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive> create external table test_table
    > (
    > s1 string,
    > s2 string
    > )
    > row format delimited fields terminated by '#'
    > stored as textfile location '/lilei/test_table';
OK
Time taken: 0.631 seconds
hive> select * from test_table;
OK
1   2
c   d
Time taken: 0.36 seconds, Fetched: 2 row(s)
hive> select count(*) from test_table;
Query ID = root_20170404013939_69844998-4456-4bc1-9da5-53ea91342e43
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1491283979906_0005, Tracking URL = http://ip-172-31-2-159:8088/proxy/application_1491283979906_0005/
Kill Command = /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop/bin/hadoop job  -kill job_1491283979906_0005
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2017-04-04 01:39:25,425 Stage-1 map = 0%,  reduce = 0%
2017-04-04 01:39:31,689 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.02 sec
2017-04-04 01:39:36,851 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 2.34 sec
MapReduce Total cumulative CPU time: 2 seconds 340 msec
Ended Job = job_1491283979906_0005
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 2.34 sec   HDFS Read: 6501 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 340 msec
OK
2
Time taken: 21.56 seconds, Fetched: 1 row(s)

6.3. MapReduce validation

[[email protected] ~]# hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-examples.jar pi 5 5
Number of Maps  = 5
Samples per Map = 5
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Starting Job
17/04/04 01:38:15 INFO client.RMProxy: Connecting to ResourceManager at ip-172-31-2-159/172.31.2.159:8032
17/04/04 01:38:15 INFO mapreduce.JobSubmissionFiles: Permissions on staging directory /user/root/.staging are incorrect: rwxrwxrwx. Fixing permissions to correct value rwx------
17/04/04 01:38:15 INFO input.FileInputFormat: Total input paths to process : 5
17/04/04 01:38:15 INFO mapreduce.JobSubmitter: number of splits:5
17/04/04 01:38:15 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1491283979906_0004
17/04/04 01:38:16 INFO impl.YarnClientImpl: Submitted application application_1491283979906_0004
17/04/04 01:38:16 INFO mapreduce.Job: The url to track the job: http://ip-172-31-2-159:8088/proxy/application_1491283979906_0004/
17/04/04 01:38:16 INFO mapreduce.Job: Running job: job_1491283979906_0004
17/04/04 01:38:21 INFO mapreduce.Job: Job job_1491283979906_0004 running in uber mode : false
17/04/04 01:38:21 INFO mapreduce.Job:  map 0% reduce 0%
17/04/04 01:38:26 INFO mapreduce.Job:  map 100% reduce 0%
17/04/04 01:38:32 INFO mapreduce.Job:  map 100% reduce 100%
17/04/04 01:38:32 INFO mapreduce.Job: Job job_1491283979906_0004 completed successfully
17/04/04 01:38:32 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=64
        FILE: Number of bytes written=749758
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=1350
        HDFS: Number of bytes written=215
        HDFS: Number of read operations=23
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=3
    Job Counters
        Launched map tasks=5
        Launched reduce tasks=1
        Data-local map tasks=5
        Total time spent by all maps in occupied slots (ms)=16111
        Total time spent by all reduces in occupied slots (ms)=2872
        Total time spent by all map tasks (ms)=16111
        Total time spent by all reduce tasks (ms)=2872
        Total vcore-seconds taken by all map tasks=16111
        Total vcore-seconds taken by all reduce tasks=2872
        Total megabyte-seconds taken by all map tasks=16497664
        Total megabyte-seconds taken by all reduce tasks=2940928
    Map-Reduce Framework
        Map input records=5
        Map output records=10
        Map output bytes=90
        Map output materialized bytes=167
        Input split bytes=760
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=167
        Reduce input records=10
        Reduce output records=0
        Spilled Records=20
        Shuffled Maps =5
        Failed Shuffles=0
        Merged Map outputs=5
        GC time elapsed (ms)=213
        CPU time spent (ms)=3320
        Physical memory (bytes) snapshot=2817884160
        Virtual memory (bytes) snapshot=9621606400
        Total committed heap usage (bytes)=2991587328
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=590
    File Output Format Counters
        Bytes Written=97
Job Finished in 17.145 seconds
Estimated value of Pi is 3.68000000000000000000

6.4. Impala validation

[[email protected] ~]# impala-shell -i ip-172-31-7-96
Starting Impala Shell without Kerberos authentication
Connected to ip-172-31-7-96:21000
Server version: impalad version 2.7.0-cdh5.10.0 RELEASE (build 785a073cd07e2540d521ecebb8b38161ccbd2aa2)
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v2.7.0-cdh5.10.0 (785a073) built on Fri Jan 20 12:03:56 PST 2017)

Run the PROFILE command after a query has finished to see a comprehensive summary
of all the performance and diagnostic information that Impala gathered for that
query. Be warned, it can be very long!
***********************************************************************************
[ip-172-31-7-96:21000] > show tables;
Query: show tables
+------------+
| name       |
+------------+
| test_table |
+------------+
Fetched 1 row(s) in 0.20s
[ip-172-31-7-96:21000] > select * from test_table;
Query: select * from test_table
Query submitted at: 2017-04-04 01:41:56 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=c4a06bd46f9106b:4a69f04800000000
+----+----+
| s1 | s2 |
+----+----+
| 1  | 2  |
| c  | d  |
+----+----+
Fetched 2 row(s) in 3.73s
[ip-172-31-7-96:21000] > select count(*) from test_table;
Query: select count(*) from test_table
Query submitted at: 2017-04-04 01:42:06 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=2a415724696f7414:1f9113ea00000000
+----------+
| count(*) |
+----------+
| 2        |
+----------+
Fetched 1 row(s) in 0.15s

6.5. Spark validation

[[email protected] ~]# spark-shell
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
      /_/

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_67)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc (master = yarn-client, app id = application_1491283979906_0006).
17/04/04 01:43:26 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.1.0
17/04/04 01:43:27 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
SQL context available as sqlContext.

scala> var textFile=sc.textFile("hdfs://ip-172-31-2-159:8020/lilei/test_table/a.txt")
textFile: org.apache.spark.rdd.RDD[String] = hdfs://ip-172-31-2-159:8020/lilei/test_table/a.txt MapPartitionsRDD[1] at textFile at <console>:27

scala> 

scala> textFile.count()
res0: Long = 2

6.6. Kudu validation

[[email protected] ~]# impala-shell -i ip-172-31-7-96
Starting Impala Shell without Kerberos authentication
Connected to ip-172-31-7-96:21000
Server version: impalad version 2.7.0-cdh5.10.0 RELEASE (build 785a073cd07e2540d521ecebb8b38161ccbd2aa2)
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v2.7.0-cdh5.10.0 (785a073) built on Fri Jan 20 12:03:56 PST 2017)

Every command must be terminated by a ';'.
***********************************************************************************
[ip-172-31-7-96:21000] > CREATE TABLE my_first_table
                       > (
                       >   id BIGINT,
                       >   name STRING,
                       >   PRIMARY KEY(id)
                       > )
                       > PARTITION BY HASH PARTITIONS 16
                       > STORED AS KUDU;
Query: create TABLE my_first_table
(
  id BIGINT,
  name STRING,
  PRIMARY KEY(id)
)
PARTITION BY HASH PARTITIONS 16
STORED AS KUDU

Fetched 0 row(s) in 1.35s
[ip-172-31-7-96:21000] > INSERT INTO my_first_table VALUES (99, "sarah");
Query: insert INTO my_first_table VALUES (99, "sarah")
Query submitted at: 2017-04-04 01:46:08 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=824ce0b3765c6b91:5ea8dd7c00000000
Modified 1 row(s), 0 row error(s) in 3.37s
[ip-172-31-7-96:21000] >
[ip-172-31-7-96:21000] > INSERT INTO my_first_table VALUES (1, "john"), (2, "jane"), (3, "jim");
Query: insert INTO my_first_table VALUES (1, "john"), (2, "jane"), (3, "jim")
Query submitted at: 2017-04-04 01:46:13 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=a645259c3b8ae7cd:e446e15500000000
Modified 3 row(s), 0 row error(s) in 0.11s
[ip-172-31-7-96:21000] > select * from my_first_table;
Query: select * from my_first_table
Query submitted at: 2017-04-04 01:46:19 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=f44021589ff0d94d:8d30568200000000
+----+-------+
| id | name  |
+----+-------+
| 2  | jane  |
| 3  | jim   |
| 1  | john  |
| 99 | sarah |
+----+-------+
Fetched 4 row(s) in 0.55s
[ip-172-31-7-96:21000] > delete from my_first_table where id =99;
Query: delete from my_first_table where id =99
Query submitted at: 2017-04-04 01:46:56 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=814090b100fdf0b4:1b516fe400000000
Modified 1 row(s), 0 row error(s) in 0.15s
[ip-172-31-7-96:21000] >
[ip-172-31-7-96:21000] > select * from my_first_table;
Query: select * from my_first_table
Query submitted at: 2017-04-04 01:46:57 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=724aa3f84cedb109:a679bf0200000000
+----+------+
| id | name |
+----+------+
| 2  | jane |
| 3  | jim  |
| 1  | john |
+----+------+
Fetched 3 row(s) in 0.15s
[ip-172-31-7-96:21000] > INSERT INTO my_first_table VALUES (99, "sarah");
Query: insert INTO my_first_table VALUES (99, "sarah")
Query submitted at: 2017-04-04 01:47:32 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=6244b3c6d33b443e:f43c857300000000
Modified 1 row(s), 0 row error(s) in 0.11s
[ip-172-31-7-96:21000] >
[ip-172-31-7-96:21000] > update my_first_table set name='lilei' where id=99;
Query: update my_first_table set name='lilei' where id=99
Query submitted at: 2017-04-04 01:47:32 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=8f4ab0dd3c19f9df:b2c7bdfa00000000
Modified 1 row(s), 0 row error(s) in 0.13s
[ip-172-31-7-96:21000] > select * from my_first_table;
Query: select * from my_first_table
Query submitted at: 2017-04-04 01:47:34 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=6542579c8bd5b6ad:af68f50800000000
+----+-------+
| id | name  |
+----+-------+
| 2  | jane  |
| 3  | jim   |
| 1  | john  |
| 99 | lilei |
+----+-------+
Fetched 4 row(s) in 0.15s
[ip-172-31-7-96:21000] > upsert  into my_first_table values(1, "john"), (4, "tom"), (99, "lilei1");
Query: upsert into my_first_table values(1, "john"), (4, "tom"), (99, "lilei1")
Query submitted at: 2017-04-04 01:48:52 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=694fc7ac2bc71d21:947f1fa200000000
Modified 3 row(s), 0 row error(s) in 0.11s
[ip-172-31-7-96:21000] >
[ip-172-31-7-96:21000] > select * from my_first_table;
Query: select * from my_first_table
Query submitted at: 2017-04-04 01:48:52 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=a64e0ee707762b6b:69248a6c00000000
+----+--------+
| id | name   |
+----+--------+
| 2  | jane   |
| 3  | jim    |
| 1  | john   |
| 99 | lilei1 |
| 4  | tom    |
+----+--------+
Fetched 5 row(s) in 0.16s

"To ordain conscience for Heaven and Earth, to secure life and fortune for the people, to continue lost teachings for past sages, to establish peace for all future generations."

Follow Hadoop实操 to be the first to get more hands-on Hadoop content; forwarding and sharing are welcome.

This is an original article. Reposting is welcome; when reposting, please credit the WeChat official account Hadoop实操.

Original post: http://blog.51cto.com/14049791/2326208
