1 Building Hadoop 2.7.1 on CentOS 6.5

Host setup:

    sudo yum install gcc gcc-c++
    sudo yum install ncurses-devel
    sudo yum -y install lzo-devel zlib-devel autoconf automake libtool cmake openssl-devel

Build:

    mvn clean package -Pdist,native -DskipTests -Dtar

(The native build additionally needs a JDK, Maven, and protoc 2.5.0 on the PATH.)

2 Configuring Hadoop 2.7.1

1) core-site.xml (fs.defaultFS sets the HDFS address, i.e. the DFS Master port)
2) hdfs-site.xml
3) mapred-site.xml
4) yarn-site.xml

Minimal sketches of these four files appear after the command reference below.

3 Connecting to HDFS from Eclipse

Set the DFS Master port to 8020, i.e. the port configured in hdfs://hd1:8020.
(In the Hadoop 1 plugin, the left field was the job tracker port and the right one the HDFS port.)

Browsing the file system with bin/hadoop: hdfs dfs is equivalent to hadoop fs.

[hadoop@hd1 hadoop-2.7.1]$ bin/hdfs dfs
Usage: hadoop fs [generic options]
    [-appendToFile <localsrc> ... <dst>]
    [-cat [-ignoreCrc] <src> ...]
    [-checksum <src> ...]

    [-chgrp [-R] GROUP PATH...]                          ## change the group a file belongs to
    [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]   ## change the permission bits of a file
    [-chown [-R] [OWNER][:[GROUP]] PATH...]              ## change the owner of a file

    [-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>]
    [-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]

    [-moveFromLocal <localsrc> ... <dst>]
    [-moveToLocal <src> <localdst>]

    [-count [-q] [-h] <path> ...]

    [-createSnapshot <snapshotDir> [<snapshotName>]]
    [-deleteSnapshot <snapshotDir> <snapshotName>]
    [-renameSnapshot <snapshotDir> <oldName> <newName>]

    [-df [-h] [<path> ...]]
    [-du [-s] [-h] <path> ...]
    [-expunge]

    [-find <path> ... <expression> ...]

    [-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
    [-put [-f] [-p] [-l] <localsrc> ... <dst>]

    [-getmerge [-nl] <src> <localdst>]

    [-help [cmd ...]]

    [-ls [-d] [-h] [-R] [<path> ...]]
    [-mkdir [-p] <path> ...]

    [-mv <src> ... <dst>]
    [-cp [-f] [-p | -p[topax]] <src> ... <dst>]

    [-rm [-f] [-r|-R] [-skipTrash] <src> ...]
    [-rmdir [--ignore-fail-on-non-empty] <dir> ...]

    [-getfacl [-R] <path>]
    [-getfattr [-R] {-n name | -d} [-e en] <path>]

    [-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
    [-setfattr {-n name [-v value] | -x name} <path>]

    [-setrep [-R] [-w] <rep> <path> ...]
    [-stat [format] <path> ...]
    [-tail [-f] <file>]
    [-test -[defsz] <path>]
    [-text [-ignoreCrc] <src> ...]
    [-touchz <path> ...]
    [-truncate [-w] <length> <path> ...]
    [-usage [cmd ...]]

Generic options supported are
-conf <configuration file>
        specify an application configuration file
-D <property=value>
        use value for given property
-fs <local|namenode:port>
        specify a namenode
-jt <local|resourcemanager:port>
        specify a ResourceManager
-files <comma separated list of files>
        specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>
        specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>
        specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
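As promised in section 2, here is a minimal sketch of the four config files for a single-node setup. Only the hdfs://hd1:8020 address is taken from this post; the replication factor and the MapReduce-on-YARN wiring are common defaults, not necessarily what the original setup used (the run log below actually goes through the local job runner):

    <!-- core-site.xml: fs.defaultFS is the address the Eclipse DFS Master port must match -->
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hd1:8020</value>
      </property>
    </configuration>

    <!-- hdfs-site.xml: replication of 1 assumed for a single-node cluster -->
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>

    <!-- mapred-site.xml: run MapReduce on YARN (assumed; the post does not show its choice) -->
    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
    </configuration>

    <!-- yarn-site.xml: shuffle service required by MapReduce on YARN -->
    <configuration>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
    </configuration>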
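A few typical calls from the reference above, staging the input that the WordCount run below reads. The HDFS path /input/file_test.txt comes from the run log; the local file name is assumed to match:

    bin/hdfs dfs -mkdir -p /input              # create the input directory
    bin/hdfs dfs -put file_test.txt /input     # upload the local file to HDFS
    bin/hdfs dfs -cat /input/file_test.txt     # print it back to verify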
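WordCount example

The post shows only the run output, so the code below is a sketch of the classic Hadoop WordCount rather than the author's exact class. The input and output paths are taken from the log (hdfs://hd1:8020/input/file_test.txt and /output/count), and the combiner is set because the log's Combine counters are non-zero:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Mapper: emit (word, 1) for every token in the input line.
      public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
          }
        }
      }

      // Reducer (also used as combiner): sum the counts per word.
      public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);  // matches the log's Combine input/output counters
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Paths as seen in the run log below.
        FileInputFormat.addInputPath(job, new Path("hdfs://hd1:8020/input/file_test.txt"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://hd1:8020/output/count"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }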
Run output:

INFO - session.id is deprecated. Instead, use dfs.metrics.session-id
INFO - Initializing JVM Metrics with processName=JobTracker, sessionId=
WARN - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
INFO - Total input paths to process : 1
INFO - number of splits:1
INFO - Submitting tokens for job: job_local498662469_0001
INFO - The url to track the job: http://localhost:8080/
INFO - Running job: job_local498662469_0001
INFO - OutputCommitter set in config null
INFO - File Output Committer Algorithm version is 1
INFO - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
INFO - Waiting for map tasks
INFO - Starting task: attempt_local498662469_0001_m_000000_0
INFO - File Output Committer Algorithm version is 1
INFO - Using ResourceCalculatorProcessTree : [ ]
INFO - Processing split: hdfs://hd1:8020/input/file_test.txt:0+23
INFO - (EQUATOR) 0 kvi 26214396(104857584)
INFO - mapreduce.task.io.sort.mb: 100
INFO - soft limit at 83886080
INFO - bufstart = 0; bufvoid = 104857600
INFO - kvstart = 26214396; length = 6553600
INFO - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
INFO -
INFO - Starting flush of map output
INFO - Spilling map output
INFO - bufstart = 0; bufend = 39; bufvoid = 104857600
INFO - kvstart = 26214396(104857584); kvend = 26214384(104857536); length = 13/6553600
INFO - Finished spill 0
INFO - Task:attempt_local498662469_0001_m_000000_0 is done. And is in the process of committing
INFO - map
INFO - Task 'attempt_local498662469_0001_m_000000_0' done.
INFO - Finishing task: attempt_local498662469_0001_m_000000_0
INFO - map task executor complete.
INFO - Waiting for reduce tasks
INFO - Starting task: attempt_local498662469_0001_r_000000_0
INFO - File Output Committer Algorithm version is 1
INFO - Using ResourceCalculatorProcessTree : [ ]
INFO - Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@...
INFO - MergerManager: memoryLimit=623902720, maxSingleShuffleLimit=155975680, mergeThreshold=411775808, ioSortFactor=10, memToMemMergeOutputsThreshold=10
INFO - attempt_local498662469_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
INFO - localfetcher#1 about to shuffle output of map attempt_local498662469_0001_m_000000_0 decomp: 37 len: 41 to MEMORY
INFO - Read 37 bytes from map-output for attempt_local498662469_0001_m_000000_0
INFO - closeInMemoryFile -> map-output of size: 37, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->37
INFO - EventFetcher is interrupted.. Returning
INFO - 1 / 1 copied.
INFO - finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
INFO - Merging 1 sorted segments
INFO - Down to the last merge-pass, with 1 segments left of total size: 29 bytes
INFO - Merged 1 segments, 37 bytes to disk to satisfy reduce memory limit
INFO - Merging 1 files, 41 bytes from disk
INFO - Merging 0 segments, 0 bytes from memory into reduce
INFO - Merging 1 sorted segments
INFO - Down to the last merge-pass, with 1 segments left of total size: 29 bytes
INFO - 1 / 1 copied.
INFO - mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
INFO - Task:attempt_local498662469_0001_r_000000_0 is done. And is in the process of committing
INFO - 1 / 1 copied.
INFO - Task attempt_local498662469_0001_r_000000_0 is allowed to commit now
INFO - Saved output of task 'attempt_local498662469_0001_r_000000_0' to hdfs://hd1:8020/output/count/_temporary/0/task_local498662469_0001_r_000000
INFO - reduce > reduce
INFO - Task 'attempt_local498662469_0001_r_000000_0' done.
INFO - Finishing task: attempt_local498662469_0001_r_000000_0
INFO - reduce task executor complete.
INFO - Job job_local498662469_0001 running in uber mode : false
INFO - map 100% reduce 100%
INFO - Job job_local498662469_0001 completed successfully
INFO - Counters: 35

    File System Counters
        FILE: Number of bytes read=446
        FILE: Number of bytes written=552703
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=46
        HDFS: Number of bytes written=23
        HDFS: Number of read operations=13
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=4

    Map-Reduce Framework
        Map input records=3
        Map output records=4
        Map output bytes=39
        Map output materialized bytes=41
        Input split bytes=100
        Combine input records=4
        Combine output records=3
        Reduce input groups=3
        Reduce shuffle bytes=41
        Reduce input records=3
        Reduce output records=3
        Spilled Records=6
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=38
        Total committed heap usage (bytes)=457703424

    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0

    File Input Format Counters
        Bytes Read=23

    File Output Format Counters
        Bytes Written=23
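Once the job completes, the counts can be read back from HDFS. The /output/count path comes from the log above; part-r-00000 is the default output file name for the single reduce task:

    bin/hdfs dfs -ls /output/count                     # _SUCCESS marker plus part file(s)
    bin/hdfs dfs -cat /output/count/part-r-00000       # one "word<TAB>count" line per word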