Official documentation for the HDFS sink parameters: http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
Note the file format: the default is fileType = SequenceFile, a Hadoop binary format; set fileType = DataStream to write plain text that can be read directly. (How to consume the SequenceFile output is not covered here.)
Configuration file: hdfs.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
# Describe the sink
# Use a channel which buffers events in memory
# Bind the source and sink to the channel
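The comment placeholders above elide the actual source, sink, and channel definitions. A minimal sketch of what they might contain, assuming a spooling-directory source watching the logs directory used below; the HDFS path (hdfs://node4:9000/flume/events) and the channel capacities are assumptions, not values from the original note:

```
# Source: watch a local spooling directory (path from the log-generation step below)
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /usr/local/hadoop/apache-flume-1.6.0-bin/logs

# Sink: write plain text to HDFS (the path here is an assumed example)
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://node4:9000/flume/events
a1.sinks.k1.hdfs.fileType = DataStream

# Channel: buffer events in memory (capacities are example values)
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```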
Start Hadoop.
Start Flume:
./flume-ng agent -c . -f /usr/local/hadoop/apache-flume-1.6.0-bin/conf/hdfs.conf -n a1 -Dflume.root.logger=INFO,console
Generate log files in the monitored directory (each loop iteration creates one new single-line file):
for i in {1000..2000}; do echo "test line $i" >> /usr/local/hadoop/apache-flume-1.6.0-bin/logs/spool_text$i.log; done;
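The loop above drops 1001 single-line files into the spooling directory; a spooling-directory source picks up each file as a whole once it appears. A smaller self-contained variant, using a temporary scratch directory instead of the real Flume spool path, shows what the directory ends up containing:

```shell
# Scratch directory standing in for the Flume spooling directory
tmpdir=$(mktemp -d)

# Each iteration writes one new single-line file, mirroring the loop above
for i in {1..5}; do
  echo "test line $i" > "$tmpdir/spool_text$i.log"
done

ls "$tmpdir" | wc -l            # number of files created (5 here)
cat "$tmpdir/spool_text3.log"   # contents of one generated file
```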
Check the result in the HDFS web UI: http://node4:50070