I. netcat source + memory channel + logger sink
1. Modify the configuration
1) Edit the flume-env.sh file under $FLUME_HOME/conf and set the JDK path:
export JAVA_HOME=/opt/modules/jdk1.7.0_67
2) Under $FLUME_HOME/conf, create an agent subdirectory and, inside it, a new file netcat-memory-logger.conf with the following content:
# netcat-memory-logger
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = beifeng-hadoop-02
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
2. Start Flume and test
1) Start the agent
bin/flume-ng agent -n a1 -c conf/ -f conf/agent/netcat-memory-logger.conf -Dflume.root.logger=INFO,console
2) Test
nc beifeng-hadoop-02 44444
Type any string and watch the agent's console log; each line you enter should appear as a logged event.
The test uses the Linux nc command. If it is not available, install netcat first:
sudo yum -y install nc
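As a quick sketch, several test events can also be pushed in one shot instead of typing them interactively (hostname and port are the ones configured above; the agent from step 2 must be running):

```shell
# Send three events to the running netcat source in a single pipe.
# Each newline-terminated line becomes one Flume event; by default the
# netcat source answers "OK" on the connection for each accepted line.
printf 'event-1\nevent-2\nevent-3\n' | nc beifeng-hadoop-02 44444
```

Each of the three lines should then show up as a separate event in the agent's console log.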
II. avro source + file channel + hdfs sink
1. Add the configuration
In the agent subdirectory under $FLUME_HOME/conf, create a new file avro-file-hdfs.conf with the following content:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.bind = beifeng-hadoop-02
a1.sources.r1.port = 4141

# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://beifeng-hadoop-02:9000/flume/events/%Y-%m-%d
# default file prefix: FlumeData
a1.sinks.k1.hdfs.filePrefix = FlumeData
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollCount = 0
# Roll by size only. In production this is usually set close to the HDFS
# block size (e.g. 120-125 MB for a 128 MB block); 10 KB here for testing.
a1.sinks.k1.hdfs.rollSize = 10240
a1.sinks.k1.hdfs.fileType = DataStream
#a1.sinks.k1.hdfs.round = true
#a1.sinks.k1.hdfs.roundValue = 10
#a1.sinks.k1.hdfs.roundUnit = minute

# Use a file channel, which persists events on local disk
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /opt/modules/cdh/apache-flume-1.5.0-cdh5.3.6-bin/checkpoint
a1.channels.c1.dataDirs = /opt/modules/cdh/apache-flume-1.5.0-cdh5.3.6-bin/data

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
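The commented round settings in the config above can be enabled to bucket output into fixed time windows rather than one directory per day. A sketch (the %H%M escape and 10-minute window are illustrative values):

```
a1.sinks.k1.hdfs.path = hdfs://beifeng-hadoop-02:9000/flume/events/%Y-%m-%d/%H%M
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
```

With round enabled, the timestamp used for the path escapes is rounded down to the nearest 10 minutes, so all events in a window land in the same directory (e.g. events from 21:13 go into the 2110 bucket).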
2. Start and test
1) Start the Flume agent
bin/flume-ng agent -n a1 -c conf/ -f conf/agent/avro-file-hdfs.conf -Dflume.root.logger=INFO,console
2) Test with Flume's bundled avro-client
bin/flume-ng avro-client --host beifeng-hadoop-02 --port 4141 --filename /home/beifeng/order_info.txt
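If the test works, the sink should have written rolled files into the dated directory. A sketch for verifying the output (assumes the hdfs client is on the PATH and the agent's clock sets the path, since useLocalTimeStamp is enabled):

```shell
# List today's output directory; the date format matches the
# %Y-%m-%d escape in hdfs.path above.
hdfs dfs -ls /flume/events/$(date +%Y-%m-%d)

# Print the contents of all rolled files (plain text, fileType = DataStream)
hdfs dfs -cat "/flume/events/$(date +%Y-%m-%d)/FlumeData.*"
```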
Date: 2024-10-19 21:13:29