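Before setting up, synchronize the clocks on all nodes and turn off the firewall. The Flume package used here is version 1.6.0.
There are two ways to configure the service.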
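Method 1. Steps:
1. Upload the Flume package to node1 and extract it to the root directory.
2. Rename the directory: mv apache-flume-1.6.0-bin /home/install/flume-1.6
3. Enter the flume-1.6 directory and create a file named test1 with vi test1. Open
https://flume.apache.org/FlumeUserGuide.html, copy the example configuration that defines the source, channel, and sink, and paste it into test1.
Change localhost in that configuration to node1.
4. Start Flume from the flume-1.6 directory: bin/flume-ng agent --conf conf --conf-file test1 --name a1 -Dflume.root.logger=INFO,console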
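5. On node2, run: telnet node1 44444
Type any text, such as hello, and check whether node1 receives it.
To leave telnet, press Ctrl+] and then type quit. Press Ctrl+C to stop the running process.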
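Steps 3 and 4 can be scripted instead of pasted by hand. The sketch below writes the single-node example from the Flume User Guide with the bind address already changed. The function name write_test1 and its arguments are made up for illustration; the property values are the ones in the User Guide example, so compare them with the guide for your Flume version before relying on them.

```shell
#!/usr/bin/env bash
# write_test1 FILE HOST   (hypothetical helper, not part of Flume)
# Writes the User Guide's single-node example to FILE, with the netcat
# source bound to HOST instead of localhost.
write_test1() {
  local file=$1 host=${2:-node1}
  cat > "$file" <<EOF
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = ${host}
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF
}

# Usage on node1, from the flume-1.6 directory:
#   write_test1 test1 node1
#   bin/flume-ng agent --conf conf --conf-file test1 --name a1 -Dflume.root.logger=INFO,console
```

With the logger sink, whatever is typed in the telnet session on node2 should show up in the console output of the agent on node1.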
Method 2. Steps:
1. Start the ZooKeeper service first with zkServer.sh start, then start HDFS with start-dfs.sh.
2. On node1, in the flume-1.6 directory, create test2 (vi test2) with the following contents:
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type =spooldir
a1.sources.r1.spoolDir=/opt/flume
# Describe the sink
a1.sinks.k1.type =avro
a1.sinks.k1.hostname=node2
a1.sinks.k1.port=55555
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 100000
a1.channels.c1.transactionCapacity = 1000
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
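A common mistake in files like this is binding a source or sink to a channel name that was never declared. The sketch below is one possible check, not a Flume tool: check_channels is a hypothetical helper that reads a properties file and reports any channel referenced under sources or sinks that is missing from the agent's channels line.

```shell
#!/usr/bin/env bash
# check_channels FILE [AGENT]   (hypothetical helper, not part of Flume)
# Prints every channel that a source or sink refers to but that is not
# listed in "<agent>.channels". Returns non-zero if any are found.
check_channels() {
  local file=$1 agent=${2:-a1}
  awk -v agent="$agent" '
    /^[ \t]*#/ { next }                      # skip comment lines
    {
      eq = index($0, "=")
      if (eq == 0) next                      # not a key=value line
      key = substr($0, 1, eq - 1)
      val = substr($0, eq + 1)
      gsub(/[ \t]/, "", key)
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      if (key == agent ".channels") {
        n = split(val, names, /[ \t]+/)
        for (i = 1; i <= n; i++) declared[names[i]] = 1
      } else if (key ~ ("^" agent "\\.sources\\.[^.]+\\.channels$") ||
                 key ~ ("^" agent "\\.sinks\\.[^.]+\\.channel$")) {
        n = split(val, names, /[ \t]+/)
        for (i = 1; i <= n; i++) used[names[i]] = key
      }
    }
    END {
      bad = 0
      for (c in used) if (!(c in declared)) {
        printf "%s refers to undeclared channel %s\n", used[c], c
        bad = 1
      }
      exit bad
    }
  ' "$file"
}

# Usage, from the flume-1.6 directory:
#   check_channels test2 a1 && echo "channel bindings look consistent"
```

The check only looks at channel names. It does not validate source or sink types, ports, or paths.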
On node2 (which acts as the server and receives the events over Avro), the configuration is:
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type =avro
a1.sources.r1.bind=node2
a1.sources.r1.port=55555
# Describe the sink
a1.sinks.k1.type =hdfs
a1.sinks.k1.hdfs.path=hdfs://xiaotian/flume/%Y-%m-%d/
a1.sinks.k1.hdfs.filePrefix=%H%M
a1.sinks.k1.hdfs.rollSize=0
a1.sinks.k1.hdfs.rollCount=0
a1.sinks.k1.hdfs.rollInterval=60
a1.sinks.k1.hdfs.idleTimeout=1
a1.sinks.k1.hdfs.fileType=DataStream
a1.sinks.k1.hdfs.round=true
a1.sinks.k1.hdfs.roundValue=1
a1.sinks.k1.hdfs.roundUnit=minute
a1.sinks.k1.hdfs.useLocalTimeStamp=true
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 100000
a1.channels.c1.transactionCapacity = 1000
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
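The escape sequences in hdfs.path and hdfs.filePrefix use the same letters as the date command, so the target location for a given moment can be previewed from the shell. This is only a sketch: hdfs_target is a hypothetical helper, it needs GNU date, and it assumes the shell's time zone matches the one on node2, since useLocalTimeStamp=true makes the sink use node2's local time. Rounding down to 1 minute does not change the %H%M value.

```shell
#!/usr/bin/env bash
# hdfs_target EPOCH_SECONDS   (hypothetical helper, not part of Flume)
# Prints the directory plus file prefix that %Y-%m-%d and %H%M in the
# sink settings would expand to for the given time.
# Requires GNU date (-d @SECONDS); the result depends on the current TZ.
hdfs_target() {
  local epoch=$1
  date -d "@${epoch}" +"hdfs://xiaotian/flume/%Y-%m-%d/%H%M"
}

# Usage: preview where an event arriving right now would be written.
#   hdfs_target "$(date +%s)"
```

Flume appends its own counter and suffix after the prefix, so the real file names are longer than what this prints.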
3. Go to the conf directory under flume-1.6, rename the environment template, and edit it:
mv flume-env.sh.template flume-env.sh
vi flume-env.sh
Add this line: JAVA_OPTS="-Xms100m -Xmx300m"
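Step 3 can also be done without opening an editor. The following is a minimal sketch that can be run more than once without duplicating the line; setup_flume_env is a hypothetical helper name, and the conf directory is passed in as an argument.

```shell
#!/usr/bin/env bash
# setup_flume_env CONF_DIR   (hypothetical helper, not part of Flume)
# Renames flume-env.sh.template to flume-env.sh if that has not been
# done yet, then appends the JAVA_OPTS line unless it is already there.
setup_flume_env() {
  local conf_dir=$1
  local line='JAVA_OPTS="-Xms100m -Xmx300m"'
  if [ ! -f "$conf_dir/flume-env.sh" ]; then
    mv "$conf_dir/flume-env.sh.template" "$conf_dir/flume-env.sh"
  fi
  # -x matches the whole line, -F treats it as a fixed string
  grep -qxF "$line" "$conf_dir/flume-env.sh" || echo "$line" >> "$conf_dir/flume-env.sh"
}

# Usage:
#   setup_flume_env /home/install/flume-1.6/conf
```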
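4. Create the /opt/flume directory (the spoolDir of the source on node1), then start Flume. Start node2 first (the agent whose source was changed to avro) so that the avro sink on node1 has something to connect to, then start node1:
bin/flume-ng agent --conf conf --conf-file test2 --name a1 -Dflume.root.logger=INFO,console
After starting node1, you can check on that node whether the process came up: netstat -ntpl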
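Once both agents are running, dropping a file into /opt/flume on node1 should send its lines to HDFS. The Flume User Guide warns that the spooling directory source expects files to be complete and unchanged once they appear, and file names not to be reused. One way to respect that is to stage the file elsewhere and rename it into place. This is a sketch under assumptions: spool_file is a hypothetical helper and the staging directory is an invented path; the rename is only atomic when both directories are on the same filesystem.

```shell
#!/usr/bin/env bash
# spool_file SRC STAGING_DIR SPOOL_DIR   (hypothetical helper, not part of Flume)
# Copies SRC into STAGING_DIR, then renames it into SPOOL_DIR so the
# spooldir source only ever sees a finished file. Prints the final path.
spool_file() {
  local src=$1 staging=$2 spool=$3
  local name
  # Timestamp suffix so that a file name is not reused across runs
  name="$(basename "$src").$(date +%s)"
  mkdir -p "$staging" "$spool"
  cp "$src" "$staging/$name"
  mv "$staging/$name" "$spool/$name"
  echo "$spool/$name"
}

# Usage on node1 (the staging path is an assumption):
#   spool_file /tmp/sample.log /opt/flume-staging /opt/flume
```

After a file has been fully read, the source renames it in place with a .COMPLETED suffix by default, which is a quick way to confirm that node1 picked it up.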