The three Hadoop deployment modes

Prerequisites on every host ( a short setup sketch follows below ):
Time sync: date
Firewall: iptables
Name resolution: hosts

Master: 192.168.2.149
Nodes:  192.168.2.150
        192.168.2.125
        192.168.2.126

rsync and ssh
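A minimal prep sketch for each host, assuming a RHEL/CentOS 6-era system ( the time source and firewall policy here are assumptions, not part of the original notes ):

    date                                   # confirm the clocks agree across hosts ( or sync them with ntpdate )
    service iptables stop                  # open the firewall for this lab setup
    chkconfig iptables off
    echo "192.168.2.149  server149.example.com" >> /etc/hosts    # repeat for every master/node pair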

Website: http://hadoop.apache.org/

There are three modes: standalone ( single node )
      pseudo-distributed ( used for testing )
      fully distributed

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Basic installation and standalone mode
    (1)Download  lftp i
            get hadoop-1.2.1.tar.gz  jdk-6u32-linux-x64.bin
    (2)Java: *sh jdk-6u32-linux-x64.bin
             Result: Java(TM) SE Development Kit 6 successfully installed.
                 Product Registration is FREE and includes many benefits:
                 * Notification of new versions, patches, and updates
                 * Special offers on Oracle products, services and training
                 * Access to early releases and documentation

Product and system data will be collected. If your configuration
                 supports a browser, the JDK Product Registration form will
                 be presented. If you do not register, none of this information
                 will be saved. You may also register your JDK later by
                 opening the register.html file (located in the JDK installation
                 directory) in a browser.

For more information on what data Registration collects and
                 how it is managed and used, see:
                 http://java.sun.com/javase/registration/JDKRegistrationPrivacy.html

Press Enter to continue.....
                 Done.
             Check: ls
                 Result: hadoop-1.2.1.tar.gz  jdk1.6.0_32 ( this directory is new )  jdk-6u32-linux-x64.bin
       Extract: tar zfx hadoop-1.2.1.tar.gz
       Move: mv jdk1.6.0_32/ hadoop-1.2.1/jdk  ( keeps the JDK inside the Hadoop tree so the whole service can be relocated in one step )
       Link: ln -s hadoop-1.2.1 hadoop ( makes later upgrades easier )
   (3)Configure  vim hadoop/conf/hadoop-env.sh
            Content: export JAVA_HOME=/root/hadoop/jdk  ( line 9; the directory where Java lives )
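The same edit can be made non-interactively; a small sketch, assuming GNU sed and that the stock hadoop-env.sh still carries the commented-out JAVA_HOME line:

    sed -i 's|^#\? *export JAVA_HOME=.*|export JAVA_HOME=/root/hadoop/jdk|' /root/hadoop/conf/hadoop-env.sh
    grep -n JAVA_HOME /root/hadoop/conf/hadoop-env.sh    # confirm the change landed on the expected line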
   (4)Test
       1. Input  *mkdir hadoop/input
             *cp /root/hadoop/conf/*.xml /root/hadoop/input/
             *ls /root/hadoop/input/
             Result: capacity-scheduler.xml  fair-scheduler.xml  hdfs-site.xml          mapred-site.xml
                  core-site.xml           hadoop-policy.xml   mapred-queue-acls.xml
       2. Output  *cd /root/hadoop
             *bin/hadoop jar hadoop-examples-1.2.1.jar   ( list the example programs it supports )
             Result: An example program must be given as the first argument.
                 Valid program names are:
                 aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
                 aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
                 dbcount: An example job that count the pageview counts from a database.
                  grep: A map/reduce program that counts the matches of a regex in the input. ( filtering )
                 join: A job that effects a join over sorted, equally partitioned datasets
                 multifilewc: A job that counts words from several files.
                 pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
                 pi: A map/reduce program that estimates Pi using monte-carlo method.
                 randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
                 randomwriter: A map/reduce program that writes 10GB of random data per node.
                 secondarysort: An example defining a secondary sort to the reduce.
                 sleep: A job that sleeps at each map and reduce task.
                 sort: A map/reduce program that sorts the data written by the random writer.
                 sudoku: A sudoku solver.
                 teragen: Generate data for the terasort
                 terasort: Run the terasort
                 teravalidate: Checking results of terasort
                 wordcount: A map/reduce program that counts the words in the input files.

*bin/hadoop jar hadoop-examples-1.2.1.jar grep input/ output 'dfs[a-z.]+'   ( filter the strings starting with dfs from input into output; the output directory is created automatically )
            Result: 14/08/05 09:50:28 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0
                 14/08/05 09:50:28 INFO mapred.JobClient:     Map output records=1

*ls /root/hadoop/output/
            Result: part-00000  _SUCCESS
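To read the matches themselves, the part file in the local output directory can be printed directly ( a quick check, assuming the single reducer output seen above ):

    cat /root/hadoop/output/part-00000    # each line: count <TAB> matched string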

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2. Pseudo-distributed mode

/root/hadoop/bin/stop-all.sh

(1)ssh ( set up passwordless authentication )
      *ssh-keygen  ( generate the key pair; just press Enter at every prompt )
           Result: Generating public/private rsa key pair.
                Enter file in which to save the key (/root/.ssh/id_rsa): Enter passphrase (empty for no passphrase):
                Enter same passphrase again:
                Your identification has been saved in /root/.ssh/id_rsa.
                Your public key has been saved in /root/.ssh/id_rsa.pub.
                The key fingerprint is:
                07:f2:66:76:2b:59:76:29:c9:d9:b1:50:0f:5a:e9:2d root@server149.example.com
                The key‘s randomart image is:
                +--[ RSA 2048]----+
                |            +.   |
                |           +.o   |
                |      . . o....  |
                |       o o =E+.  |
                |        S X =.   |
                |       + * +     |
                |        o .      |
                |         .       |
                |                 |
                +-----------------+
          *ssh-copy-id server149.example.com
           Result: The authenticity of host 'server149.example.com (192.168.2.149)' can't be established.
                RSA key fingerprint is 66:9e:7c:f4:48:10:e3:20:59:1e:e9:44:35:32:42:14.
                Are you sure you want to continue connecting (yes/no)? yes                                      ***
                Warning: Permanently added 'server149.example.com,192.168.2.149' (RSA) to the list of known hosts.
                root@server149.example.com's password:                                                   ( enter the password )
                Now try logging into the machine, with "ssh 'server149.example.com'", and check in:
                .ssh/authorized_keys
                to make sure we haven't added extra keys that you weren't expecting.
         
          *ssh-copy-id localhost
          Result: The authenticity of host 'localhost (::1)' can't be established.
               RSA key fingerprint is 66:9e:7c:f4:48:10:e3:20:59:1e:e9:44:35:32:42:14.
               Are you sure you want to continue connecting (yes/no)? yes                                ***
               Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
               Now try logging into the machine, with "ssh 'localhost'", and check in:
               .ssh/authorized_keys
               to make sure we haven't added extra keys that you weren't expecting.

Test: ssh server149.example.com
              Result: Last login: Tue Aug  5 08:44:19 2014 from 192.168.2.1
          Leave: logout
              Result: Connection to server149.example.com closed.

       Reference: http://hadoop.apache.org/docs/r1.2.1/single_node_setup.html
   (2)Configuration files *vim /root/hadoop/conf/core-site.xml
               Content: <configuration>
                       <property>
                           <name>fs.default.name</name>
                                <value>hdfs://server149.example.com:9000</value>
                       </property>
                    </configuration>

*vim /root/hadoop/conf/hdfs-site.xml
               Content: <configuration>
                       <property>
                          <name>dfs.replication</name>
                               <value>1</value>
                      </property>
                   </configuration>
   
               *vim /root/hadoop/conf/mapred-site.xml
               Content: <configuration>
                         <property>
                             <name>mapred.job.tracker</name>
                                 <value>server149.example.com:9001</value>
                        </property>
                    </configuration>

(3)Name resolution  vim /etc/hosts
             Content: 192.168.2.149    server149.example.com

(4)Format and start the services
        Format: */root/hadoop/bin/hadoop namenode -format
            Result: 14/08/05 10:22:08 INFO namenode.NameNode: STARTUP_MSG:
                /************************************************************
                STARTUP_MSG: Starting NameNode
                STARTUP_MSG:   host = server149.example.com/192.168.2.149
                STARTUP_MSG:   args = [-format]
                STARTUP_MSG:   version = 1.2.1
                STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by ‘mattf‘ on Mon Jul 22 15:23:09 PDT 2013
                STARTUP_MSG:   java = 1.6.0_32
                ************************************************************/
                14/08/05 10:22:08 INFO util.GSet: Computing capacity for map BlocksMap
                14/08/05 10:22:08 INFO util.GSet: VM type       = 64-bit
                14/08/05 10:22:08 INFO util.GSet: 2.0% max memory = 1013645312
                14/08/05 10:22:08 INFO util.GSet: capacity      = 2^21 = 2097152 entries
                14/08/05 10:22:08 INFO util.GSet: recommended=2097152, actual=2097152
                14/08/05 10:22:08 INFO namenode.FSNamesystem: fsOwner=root
                14/08/05 10:22:08 INFO namenode.FSNamesystem: supergroup=supergroup
                14/08/05 10:22:08 INFO namenode.FSNamesystem: isPermissionEnabled=true
                14/08/05 10:22:08 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
                14/08/05 10:22:08 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
                14/08/05 10:22:08 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
                14/08/05 10:22:08 INFO namenode.NameNode: Caching file names occuring more than 10 times
                14/08/05 10:22:09 INFO common.Storage: Image file /tmp/hadoop-root/dfs/name/current/fsimage of size 110 bytes saved in 0 seconds.
                14/08/05 10:22:09 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/tmp/hadoop-root/dfs/name/current/edits
                14/08/05 10:22:09 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/tmp/hadoop-root/dfs/name/current/edits
                14/08/05 10:22:09 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
                14/08/05 10:22:09 INFO namenode.NameNode: SHUTDOWN_MSG:
                /************************************************************
                SHUTDOWN_MSG: Shutting down NameNode at server149.example.com/192.168.2.149
                ************************************************************
            */root/hadoop/bin/start-all.sh
            Result: namenode running as process 2504. Stop it first.
                 localhost: starting datanode, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-datanode-server149.example.com.out
                 localhost: starting secondarynamenode, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-secondarynamenode-server149.example.com.out
                 jobtracker running as process 2674. Stop it first.
                 localhost: starting tasktracker, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-tasktracker-server149.example.com.out

(5)Check  /root/hadoop/jdk/bin/jps
            Result: 3183 TaskTracker
                 2674 JobTracker
                 3035 SecondaryNameNode
                 2932 DataNode
                 2504 NameNode
                 3276 Jps

3. Test
  (1)Web monitoring  http://192.168.2.149:50070/dfshealth.jsp
               http://192.168.2.149:50030/jobtracker.jsp
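A quick way to confirm both web UIs respond without opening a browser ( a sketch using the same URLs as above ):

    curl -s -o /dev/null -w "%{http_code}\n" http://192.168.2.149:50070/dfshealth.jsp    # NameNode UI, expect 200
    curl -s -o /dev/null -w "%{http_code}\n" http://192.168.2.149:50030/jobtracker.jsp   # JobTracker UI, expect 200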
  (2)Test    rm -fr /root/hadoop/output/
        *Create a directory: test
        /root/hadoop/bin/hadoop fs -mkdir test
         Check: /root/hadoop/bin/hadoop fs -ls
              Result: Found 1 items
                   drwxr-xr-x   - root supergroup          0 2014-08-05 11:13 /user/root/test
         *Upload: into the test directory
        /root/hadoop/bin/hadoop fs -put /root/hadoop/conf/*.xml test
         Check: /root/hadoop/bin/hadoop fs -ls test
              Result: Found 7 items
                   -rw-r--r--   1 root supergroup       7457 2014-08-05 11:15 /user/root/test/capacity-scheduler.xml
                   -rw-r--r--   1 root supergroup        348 2014-08-05 11:15 /user/root/test/core-site.xml
                   -rw-r--r--   1 root supergroup        327 2014-08-05 11:15 /user/root/test/fair-scheduler.xml
                   -rw-r--r--   1 root supergroup       4644 2014-08-05 11:15 /user/root/test/hadoop-policy.xml
                   -rw-r--r--   1 root supergroup        316 2014-08-05 11:15 /user/root/test/hdfs-site.xml
                   -rw-r--r--   1 root supergroup       2033 2014-08-05 11:15 /user/root/test/mapred-queue-acls.xml
                   -rw-r--r--   1 root supergroup        344 2014-08-05 11:15 /user/root/test/mapred-site.xml
        *Output: cd /root/hadoop
             bin/hadoop jar hadoop-examples-1.2.1.jar grep test output 'dfs[a-z.]+'
             Result: 14/08/05 11:16:57 INFO util.NativeCodeLoader: Loaded the native-hadoop library
                  14/08/05 11:16:57 WARN snappy.LoadSnappy: Snappy native library not loaded
                  14/08/05 11:16:57 INFO mapred.FileInputFormat: Total input paths to process : 7
                  14/08/05 11:16:58 INFO mapred.JobClient: Running job: job_201408051022_0001
                  14/08/05 11:16:59 INFO mapred.JobClient:  map 0% reduce 0%  ......
                  14/08/05 11:23:34 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1445437440
                  14/08/05 11:23:34 INFO mapred.JobClient:     Map output records=2
        *Check: /root/hadoop/bin/hadoop fs -ls
             Result: Found 2 items
                  drwxr-xr-x   - root supergroup          0 2014-08-05 11:23 /user/root/output
                  drwxr-xr-x   - root supergroup          0 2014-08-05 11:15 /user/root/test

/root/hadoop/bin/hadoop fs -cat output/*
              Result: 1    dfs.replication
                   1    dfsadmin
                   cat: File does not exist: /user/root/output/_logs

*Download: /root/hadoop/bin/hadoop fs -get output test ( fetch the results to the local filesystem )
              Check: ll -d /root/hadoop/test/
                  Result: drwxr-xr-x 3 root root 4096 Aug  5 11:27 /root/hadoop/test/

cat /root/hadoop/test/*
                  Result: cat: /root/hadoop/test/_logs: Is a directory
                      1    dfs.replication
                      1    dfsadmin
        *Delete: rm -fr /root/hadoop/test/
              /root/hadoop/bin/hadoop fs -rmr output
              Result: Deleted hdfs://server149.example.com:9000/user/root/output
              Check: /root/hadoop/bin/hadoop fs -ls
                  Result: Found 1 items
                       drwxr-xr-x   - root supergroup          0 2014-08-05 11:15 /user/root/test

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                     master                      slaves
HDFS                 namenode                    datanode
MapReduce            jobtracker                  tasktracker

MFS:  data storage only
HDFS: storage plus computation.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

3. Fully distributed mode ( services on separate hosts; nodes added )

1. Master ( 149 )
  (1)Stop the services  */root/hadoop/bin/stop-all.sh
              Result: stopping jobtracker
                   localhost: stopping tasktracker
                   stopping namenode
                   localhost: stopping datanode
                   localhost: stopping secondarynamenode
             */root/hadoop/jdk/bin/jps
             Result: 8675 Jps
  (2)Name resolution   vim /etc/hosts ( needed on every host )
             Content: 192.168.2.125   server125.example.com
                  192.168.2.150   server150.example.com
                  192.168.2.149   server149.example.com
  (3)ssh    ssh-copy-id server150.example.com
             ssh-copy-id server125.example.com

(4)Configuration files  vim /root/hadoop/conf/hdfs-site.xml
             Content: <value>2</value>  ( line 9; raise dfs.replication from 1 to 2 )
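For reference, hdfs-site.xml then reads as follows ( the same block shown in the pseudo-distributed setup, with the value raised to 2 ):

              <configuration>
                <property>
                   <name>dfs.replication</name>
                        <value>2</value>
                </property>
              </configuration>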

vim /root/hadoop/conf/masters
             Content: server149.example.com

             vim /root/hadoop/conf/slaves
             Content: server150.example.com
                  server125.example.com
  (5)Copy to the new nodes ( the new nodes then create their symlinks )
             scp -r /root/hadoop-1.2.1 server150.example.com:
             scp -r /root/hadoop-1.2.1 server125.example.com:
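Since rsync is listed among the prerequisites, the copy can also be done incrementally ( a sketch; the tree is assumed to land in root's home directory, as with scp above ):

             rsync -a /root/hadoop-1.2.1 server150.example.com:/root/
             rsync -a /root/hadoop-1.2.1 server125.example.com:/root/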
  (6)Format  /root/hadoop/bin/hadoop namenode -format
            Result: 14/08/05 10:22:08 INFO namenode.NameNode: STARTUP_MSG:
                /************************************************************
                STARTUP_MSG: Starting NameNode
                STARTUP_MSG:   host = server149.example.com/192.168.2.149
                STARTUP_MSG:   args = [-format]
                STARTUP_MSG:   version = 1.2.1
                STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by ‘mattf‘ on Mon Jul 22 15:23:09 PDT 2013
                STARTUP_MSG:   java = 1.6.0_32
                ************************************************************/
                14/08/05 10:22:08 INFO util.GSet: Computing capacity for map BlocksMap
                14/08/05 10:22:08 INFO util.GSet: VM type       = 64-bit
                14/08/05 10:22:08 INFO util.GSet: 2.0% max memory = 1013645312
                14/08/05 10:22:08 INFO util.GSet: capacity      = 2^21 = 2097152 entries
                14/08/05 10:22:08 INFO util.GSet: recommended=2097152, actual=2097152
                14/08/05 10:22:08 INFO namenode.FSNamesystem: fsOwner=root
                14/08/05 10:22:08 INFO namenode.FSNamesystem: supergroup=supergroup
                14/08/05 10:22:08 INFO namenode.FSNamesystem: isPermissionEnabled=true
                14/08/05 10:22:08 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
                14/08/05 10:22:08 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
                14/08/05 10:22:08 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
                14/08/05 10:22:08 INFO namenode.NameNode: Caching file names occuring more than 10 times
                14/08/05 10:22:09 INFO common.Storage: Image file /tmp/hadoop-root/dfs/name/current/fsimage of size 110 bytes saved in 0 seconds.
                14/08/05 10:22:09 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/tmp/hadoop-root/dfs/name/current/edits
                14/08/05 10:22:09 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/tmp/hadoop-root/dfs/name/current/edits
                14/08/05 10:22:09 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
                14/08/05 10:22:09 INFO namenode.NameNode: SHUTDOWN_MSG:
                /************************************************************
                SHUTDOWN_MSG: Shutting down NameNode at server149.example.com/192.168.2.149
                ************************************************************

(7)Start the services  /root/hadoop/bin/start-all.sh
             Result: starting namenode, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-namenode-server149.example.com.out
                  server125.example.com: starting datanode, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-datanode-server125.example.com.out
                  server150.example.com: starting datanode, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-datanode-server150.example.com.out
                  server149.example.com: starting secondarynamenode, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-secondarynamenode-server149.example.com.out
                  starting jobtracker, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-jobtracker-server149.example.com.out
                  server150.example.com: starting tasktracker, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-tasktracker-server150.example.com.out
                  server125.example.com: starting tasktracker, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-tasktracker-server125.example.com.out

(8)Check the services  /root/hadoop/jdk/bin/jps
             Result: 5721 JobTracker
                  5643 SecondaryNameNode
                  5832 Jps
                  5479 NameNode

(9)Test   */root/hadoop/bin/hadoop fs -put /root/hadoop/conf/ test
             */root/hadoop/bin/hadoop fs -ls
             Result: Found 1 items
                  drwxr-xr-x   - root supergroup          0 2014-08-05 11:54 /user/root/test
             */root/hadoop/bin/hadoop fs -ls test
             Result: Found 17 items
                  -rw-r--r--   2 root supergroup       7457 2014-08-05 11:54 /user/root/test/capacity-scheduler.xml
                  -rw-r--r--   2 root supergroup       1095 2014-08-05 11:54 /user/root/test/configuration.xsl
                  -rw-r--r--   2 root supergroup        348 2014-08-05 11:54 /user/root/test/core-site.xml
                  -rw-r--r--   2 root supergroup        327 2014-08-05 11:54 /user/root/test/fair-scheduler.xml
                  -rw-r--r--   2 root supergroup       2428 2014-08-05 11:54 /user/root/test/hadoop-env.sh
                  -rw-r--r--   2 root supergroup       2052 2014-08-05 11:54 /user/root/test/hadoop-metrics2.properties
                  -rw-r--r--   2 root supergroup       4644 2014-08-05 11:54 /user/root/test/hadoop-policy.xml
                  -rw-r--r--   2 root supergroup        316 2014-08-05 11:54 /user/root/test/hdfs-site.xml
                  -rw-r--r--   2 root supergroup       5018 2014-08-05 11:54 /user/root/test/log4j.properties
                  -rw-r--r--   2 root supergroup       2033 2014-08-05 11:54 /user/root/test/mapred-queue-acls.xml
                  -rw-r--r--   2 root supergroup        344 2014-08-05 11:54 /user/root/test/mapred-site.xml
                  -rw-r--r--   2 root supergroup         22 2014-08-05 11:54 /user/root/test/masters
                  -rw-r--r--   2 root supergroup         44 2014-08-05 11:54 /user/root/test/slaves
                  -rw-r--r--   2 root supergroup       2042 2014-08-05 11:54 /user/root/test/ssl-client.xml.example
                  -rw-r--r--   2 root supergroup       1994 2014-08-05 11:54 /user/root/test/ssl-server.xml.example
                  -rw-r--r--   2 root supergroup       3890 2014-08-05 11:54 /user/root/test/task-log4j.properties
                  -rw-r--r--   2 root supergroup        382 2014-08-05 11:54 /user/root/test/taskcontroller.cfg
             */root/hadoop/bin/hadoop jar /root/hadoop/hadoop-examples-1.2.1.jar wordcount test output
             */root/hadoop/bin/hadoop fs -cat output/*
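The full word count is long; a hedged way to peek at the most frequent tokens and then clean up ( the part-file glob skips the _logs directory seen earlier ):

             */root/hadoop/bin/hadoop fs -cat output/part* | sort -k2 -nr | head -20    # 20 most frequent tokens
             */root/hadoop/bin/hadoop fs -rmr output                                    # remove the result before re-running the job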

2. New node ( 150 )
  (1)Name resolution   vim /etc/hosts ( needed on every host )
             Content: 192.168.2.125   server125.example.com
                  192.168.2.150   server150.example.com
                  192.168.2.149   server149.example.com
  (2)Links     ln -s hadoop-1.2.1/ hadoop
               ln -s /root/hadoop/jdk/bin/jps /usr/local/sbin/
  (3)Check the services /root/hadoop/jdk/bin/jps
              Result: 1459 Jps
                   1368 TaskTracker
                   1300 DataNode

3. New node ( 125 )
  (1)Name resolution   vim /etc/hosts ( needed on every host )
             Content: 192.168.2.125   server125.example.com
                  192.168.2.150   server150.example.com
                  192.168.2.149   server149.example.com
  (2)Links     ln -s hadoop-1.2.1/ hadoop
               ln -s /root/hadoop/jdk/bin/jps /usr/local/sbin/
  (3)Check the services /root/hadoop/jdk/bin/jps
              Result: 1291 DataNode
                   1355 TaskTracker
                   1448 Jps

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Adding a node online

When running as a non-superuser, an identically named user must already exist on every host and its uid must be the same.

1. Master: 192.168.2.149
   (1)Node configuration  vim /root/hadoop/conf/slaves
               Content: server126.example.com
   (2)Name resolution    vim /etc/hosts
               Content: 192.168.2.126    server126.example.com
                    192.168.2.149    server149.example.com
                    192.168.2.150    server150.example.com
                    192.168.2.125   server125.example.com
   (3)ssh     ssh-copy-id server126.example.com
               ssh server126.example.com ( should log in directly with no password prompt )
   (4)Copy hadoop ( after copying, create the symlinks on the new node; see 2.(1) and 2.(2) )
               scp -r /root/hadoop-1.2.1 server126.example.com:
   (5)Bulk data  *dd if=/dev/zero of=/root/hadoop/data1.file bs=1M count=500
               Result: 500+0 records in
                    500+0 records out
                    524288000 bytes (524 MB) copied, 43.1638 s, 12.1 MB/s
              *dd if=/dev/zero of=/root/hadoop/data2.file bs=1M count=500
              *dd if=/dev/zero of=/root/hadoop/data3.file bs=1M count=500
   (6)Upload the data ( after the upload, start the daemons on the new node and check the data; see 2.(3) and 2.(4) )
               /root/hadoop/bin/hadoop fs -mkdir data
               /root/hadoop/bin/hadoop fs -put /root/hadoop/data{1,2,3}.file data
   (7)Rebalance the data  /root/hadoop/bin/start-balancer.sh
               Result: starting balancer, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-balancer-server149.example.com.out
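The balancer also accepts a utilization threshold in percent; a hedged example using the standard Hadoop 1.x option:

               /root/hadoop/bin/start-balancer.sh -threshold 5    # stop once every datanode is within 5% of the cluster average usage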

2. New node: 192.168.2.126
  (1)Links    ln -s hadoop-1.2.1/ hadoop
              ln -s /root/hadoop/jdk/bin/jps /usr/local/sbin/
  (2)Name resolution    vim /etc/hosts
             Content: 192.168.2.126    server126.example.com
                  192.168.2.149    server149.example.com
  (3)Start the daemons  */root/hadoop/bin/hadoop-daemon.sh start datanode
              Result: starting datanode, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-datanode-server126.example.com.out
              
              */root/hadoop/bin/hadoop-daemon.sh start tasktracker
              Result: starting tasktracker, logging to /root/hadoop-1.2.1/libexec/../logs/hadoop-root-tasktracker-server126.example.com.out

*jps
              Result: 1714 TaskTracker
                   1783 Jps
                   1631 DataNode
  (4)Check the data  /root/hadoop/bin/hadoop dfsadmin -report
              Result: Configured Capacity: 15568306176 (14.5 GB)
                   Present Capacity: 10721746944 (9.99 GB)
                   DFS Remaining: 7550484480 (7.03 GB)
                   DFS Used: 3171262464 (2.95 GB)
                   DFS Used%: 29.58%
                   Under replicated blocks: 0
                   Blocks with corrupt replicas: 0
                   Missing blocks: 0

-------------------------------------------------
                   Datanodes available: 3 (3 total, 0 dead)

Name: 192.168.2.125:50010
                   Decommission Status : Normal
                   Configured Capacity: 5189435392 (4.83 GB)
                   DFS Used: 1137459200 (1.06 GB)
                   Non DFS Used: 1615728640 (1.5 GB)
                   DFS Remaining: 2436247552(2.27 GB)
                   DFS Used%: 21.92%
                   DFS Remaining%: 46.95%
                   Last contact: Tue Aug 05 15:36:01 CST 2014

Name: 192.168.2.126:50010
                   Decommission Status : Normal
                   Configured Capacity: 5189435392 (4.83 GB)
                   DFS Used: 651132928 (620.97 MB)
                   Non DFS Used: 1615208448 (1.5 GB)
                   DFS Remaining: 2923094016(2.72 GB)
                   DFS Used%: 12.55%              *****
                   DFS Remaining%: 56.33%
                   Last contact: Tue Aug 05 15:36:01 CST 2014

Name: 192.168.2.150:50010
                   Decommission Status : Normal
                   Configured Capacity: 5189435392 (4.83 GB)
                   DFS Used: 1382670336 (1.29 GB)
                   Non DFS Used: 1615659008 (1.5 GB)
                   DFS Remaining: 2191106048(2.04 GB)
                   DFS Used%: 26.64%
                   DFS Remaining%: 42.22%
                   Last contact: Tue Aug 05 15:36:00 CST 2014

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Removing a node online
   (1)Modify the configuration  *vim /root/hadoop/conf/mapred-site.xml
               Content: <configuration>
                      <property>
                          <name>mapred.job.tracker</name>
                               <value>server149.example.com:9001</value>
                      </property>
    
                      <property>         ( add this block )
                          <name>dfs.hosts.exclude</name>
                               <value>/root/hadoop/conf/exclude-host</value>
                      </property>
                   </configuration>

*vim /root/hadoop/conf/exclude-host
               Content: server150.example.com ( the node to be removed )
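Note that dfs.hosts.exclude is read by the NameNode, so it is commonly declared in hdfs-site.xml instead; a sketch of that variant, keeping the replication setting from earlier and the same exclude file:

               <configuration>
                 <property>
                    <name>dfs.replication</name>
                         <value>2</value>
                 </property>

                 <property>
                    <name>dfs.hosts.exclude</name>
                         <value>/root/hadoop/conf/exclude-host</value>
                 </property>
               </configuration>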

(2)Refresh the nodes  /root/hadoop/bin/hadoop dfsadmin -refreshNodes
   (3)Check     /root/hadoop/bin/hadoop dfsadmin -report
               Result: Configured Capacity: 15568306176 (14.5 GB)
                    Present Capacity: 10594420391 (9.87 GB)
                    DFS Remaining: 7300513792 (6.8 GB)
                    DFS Used: 3293906599 (3.07 GB)
                    DFS Used%: 31.09%
Under replicated blocks: 20
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 3 (3 total, 0 dead)

Name: 192.168.2.125:50010
Decommission Status : Normal
Configured Capacity: 5189435392 (4.83 GB)
DFS Used: 1137459200 (1.06 GB)
Non DFS Used: 1674080256 (1.56 GB)
DFS Remaining: 2377895936(2.21 GB)
DFS Used%: 21.92%
DFS Remaining%: 45.82%
Last contact: Tue Aug 05 15:48:43 CST 2014

Name: 192.168.2.126:50010
Decommission Status : Normal
Configured Capacity: 5189435392 (4.83 GB)
DFS Used: 773777063 (737.93 MB)
Non DFS Used: 1684134233 (1.57 GB)
DFS Remaining: 2731524096(2.54 GB)
DFS Used%: 14.91%
DFS Remaining%: 52.64%
Last contact: Tue Aug 05 15:48:43 CST 2014

Name: 192.168.2.150:50010
Decommission Status : Decommission in progress   ***    ( changes to Decommissioned once block replication completes )
Configured Capacity: 5189435392 (4.83 GB)
DFS Used: 1382670336 (1.29 GB)
Non DFS Used: 1615671296 (1.5 GB)
DFS Remaining: 2191093760(2.04 GB)
DFS Used%: 26.64%
DFS Remaining%: 42.22%
Last contact: Tue Aug 05 15:48:42 CST 2014
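Once the status reads Decommissioned, the node's daemons can be stopped and the host dropped from conf/slaves ( a hedged sketch of the follow-up, not part of the original run ):

    # on server150, the decommissioned node
    /root/hadoop/bin/hadoop-daemon.sh stop tasktracker
    /root/hadoop/bin/hadoop-daemon.sh stop datanode

    # on the master: delete server150.example.com from /root/hadoop/conf/slaves, then re-check
    /root/hadoop/bin/hadoop dfsadmin -report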

