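Note: I originally set out to study Flume as an open-source log collection system, but its configuration turned out to be cumbersome. Searching online, I found fluentd, another open-source log collector that is much simpler to configure and performs well, so I switched to studying it instead. The official site is fluentd.org; it supports 300+ plugins, which is a good sign.
fluentd talks to HDFS through Hadoop's WebHDFS, so when configuring fluentd you must first make sure that WebHDFS is reachable and that data can be written to HDFS through it.
Architecture diagram: (image not available)
For WebHDFS configuration and testing, see this article: http://shineforever.blog.51cto.com/1429204/1585942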
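Before touching fluentd, it can help to confirm from the fluentd host that the NameNode's WebHDFS endpoint answers. The sketch below only builds the REST URL. The helper name webhdfs_url is hypothetical, and the host and port (node1.test.com:50070) are the ones used in the fluentd configuration later in this post. The curl calls are left as comments because they need a live cluster.

```shell
# Hypothetical helper: build a WebHDFS REST URL.
# Host and port match the fluentd config used later in this post; adjust to your cluster.
NAMENODE_HOST="node1.test.com"
NAMENODE_PORT="50070"

webhdfs_url() {
    # $1 = absolute HDFS path, $2 = WebHDFS operation (e.g. LISTSTATUS)
    printf 'http://%s:%s/webhdfs/v1%s?op=%s\n' \
        "$NAMENODE_HOST" "$NAMENODE_PORT" "$1" "$2"
}

# Prints the URL only; no network access happens here.
webhdfs_url / LISTSTATUS

# Against a live cluster:
#   curl -i "$(webhdfs_url / LISTSTATUS)"     # read check, expect HTTP 200 and JSON
# A write is a two-step PUT: the NameNode replies 307 with a Location header
# pointing at a DataNode, and the file body is then sent to that URL:
#   curl -i -X PUT "$(webhdfs_url /tmp/webhdfs-check CREATE)"
#   curl -i -X PUT -T /etc/hosts "<Location URL from the previous response>"
# The write may be rejected with a permission error if the target directory
# is not writable for the user WebHDFS runs the request as.
```

If the read check fails, fix WebHDFS first; fluentd will not be able to write either.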
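Overview of the installation environment:
1) fluentd and the Hadoop NameNode are installed on the same physical machine;
2) OS: RHEL 5.7, 64-bit
3) Hadoop version: 1.2.1
4) JDK: 1.7.0_67
5) Ruby version: ruby 2.1.2p95
1. Preparation: install Ruby, because fluentd is written in Ruby. First install the build dependencies: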
yum install openssl-devel zlib-devel gcc gcc-c++ make autoconf readline-devel curl-devel expat-devel gettext-devel
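Remove the Ruby packages that ship with the system:
yum erase ruby ruby-libs ruby-mode ruby-rdoc ruby-irb ruby-ri ruby-docs
Install Ruby from source:
wget -c http://cache.ruby-lang.org/pub/ruby/2.1/ruby-2.1.2.tar.gz
Then unpack the archive, compile it, install Ruby under /usr/local/ruby, and add it to the environment variables in your profile.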
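The unpack, compile and install steps can be sketched as follows. This is a minimal outline, not the exact commands used for this post: the install prefix comes from the text, while the run and build_ruby helpers and the dry-run switch are assumptions added so the script only prints its plan by default.

```shell
# Sketch of the source build described above.
# PREFIX comes from the post; the helper names are hypothetical.
RUBY_VERSION="2.1.2"
PREFIX="/usr/local/ruby"
DRY_RUN="1"   # 1 = only print the commands, 0 = actually execute them

run() {
    # Echo each command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

build_ruby() {
    # Expects ruby-2.1.2.tar.gz (downloaded with wget above) in the current directory.
    run tar xzf "ruby-${RUBY_VERSION}.tar.gz" &&
    run cd "ruby-${RUBY_VERSION}" &&
    run ./configure --prefix="${PREFIX}" &&
    run make &&
    run make install
}

build_ruby

# Afterwards, put the new ruby on PATH, for example by appending this line
# to /etc/profile and logging in again:
#   export PATH=/usr/local/ruby/bin:$PATH
```

Set DRY_RUN to 0 once the printed plan looks right for your machine.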
Test Ruby:
# ruby -v
ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-linux]
Output like the above means Ruby was installed successfully.
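2. Installing fluentd:
fluentd can be installed in three ways: from source, as a gem, or as an rpm.
This post uses the rpm route. The official documentation already provides a script for it, so just run it:
curl -L http://toolbelt.treasuredata.com/sh/install-redhat-td-agent2.sh | sh
After a successful install, the service is started with: /etc/init.d/td-agent start
The configuration directory is: /etc/td-agent/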
# cd /etc/td-agent/
# pwd
/etc/td-agent
# ls
logrotate.d  plugin  prelink.conf.d  td-agent.conf
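3. Install the fluentd plugin fluent-plugin-webhdfs with gem
1) The default Ruby gem source is blocked by the firewall in mainland China, so switch gem to a mirror. The transcript below removes and then re-adds the mirror entry; to drop the default source itself, the equivalent command would be td-agent-gem source --remove https://rubygems.org/
# td-agent-gem source --remove https://ruby.taobao.org/
https://ruby.taobao.org/ removed from sources
# td-agent-gem source -a https://ruby.taobao.org/
https://ruby.taobao.org/ added to sources
2) Install the plugin: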
td-agent-gem install fluent-plugin-webhdfs
List the installed gems:
td-agent-gem list
*** LOCAL GEMS ***
bigdecimal (1.2.4)
bundler (1.7.7)
cool.io (1.2.4)
fluent-mixin-config-placeholders (0.3.0)
fluent-mixin-plaintextformatter (0.2.6)
fluent-plugin-webhdfs (0.4.1)
fluentd (0.12.0.pre.2)
http_parser.rb (0.6.0)
io-console (0.4.2)
json (1.8.1)
ltsv (0.1.0)
minitest (4.7.5)
msgpack (0.5.9)
psych (2.0.5)
rake (10.1.0)
rdoc (4.1.0)
sigdump (0.2.2)
string-scrub (0.0.5)
test-unit (2.1.2.0)
thread_safe (0.3.4)
tzinfo (1.2.2)
tzinfo-data (1.2014.10)
uuidtools (2.1.5)
webhdfs (0.6.0)
yajl-ruby (1.2.1)
4) Configure fluentd to load the fluent-plugin-webhdfs module.
Open the configuration file:
vim /etc/td-agent/td-agent.conf
and add the following block:
<match hdfs.*.*>
  type webhdfs
  host node1.test.com
  port 50070
  path /log/%Y%m%d_%H/access.log.${hostname}
  flush_interval 1s
</match>
Restart the td-agent service.
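To see what the path setting in the match block turns into, the time placeholders can be expanded by hand. This is only an illustration using GNU date, which understands the same %Y, %m, %d and %H codes; it is not how the plugin is implemented. The function name hdfs_path_for is hypothetical, and the timestamp and hostname are taken from the log excerpts in the error section of this post.

```shell
# Illustration only: expand the strftime-style part of
#   path /log/%Y%m%d_%H/access.log.${hostname}
# for a given event time and host (requires GNU date for -d).
hdfs_path_for() {
    # $1 = event time, $2 = value substituted for ${hostname}
    date -d "$1" "+/log/%Y%m%d_%H/access.log.$2"
}

hdfs_path_for "2014-12-03 15:56:12" "node1.test.com"
# prints /log/20141203_15/access.log.node1.test.com
```

Events from the same hour and the same host therefore end up under the same HDFS path, which is the path that shows up in the td-agent warnings when a write fails.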
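5) Prepare HDFS:
Create the log directory:
hadoop fs -mkdir /log
Set the log directory's permissions to 777. Without this, no data gets written; the official documentation does not mention it, and it took me a long time of testing to find out!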
hadoop fs -chmod 777 /log
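6) Restart the td-agent service again and start testing. The command below posts a test event to fluentd's HTTP input on port 8888:
curl -X POST -d 'json={"json":"message"}' http://172.16.41.151:8888/hdfs.access.test
You should now see the files in HDFS change!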
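The URL path of the test request (hdfs.access.test) becomes the event's tag, and the tag decides whether the webhdfs match block picks the event up. The sketch below checks a few tags against the hdfs.*.* pattern. The function name tag_matches_hdfs is hypothetical, and the regular expression is my translation of the pattern, relying on the fluentd rule that * matches exactly one dot-separated tag part.

```shell
# Sketch: would a tag be selected by <match hdfs.*.*>?
# In fluentd match patterns, * matches exactly one tag part,
# so the pattern corresponds to the regular expression used here.
tag_matches_hdfs() {
    printf '%s\n' "$1" | grep -Eq '^hdfs\.[^.]+\.[^.]+$'
}

for tag in hdfs.access.test hdfs.access other.access.test; do
    if tag_matches_hdfs "$tag"; then
        echo "$tag: matched"
    else
        echo "$tag: not matched"
    fi
done
```

Only hdfs.access.test is reported as matched. If a test event never reaches HDFS and td-agent logs no error, a tag with too few or too many parts is one thing worth checking.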
Errors encountered during installation and configuration:
1)
2014-12-03 15:56:12 +0800 [warn]: failed to communicate hdfs cluster, path: /log/20141203_15/access.log.node1.test.com
2014-12-03 15:56:12 +0800 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-03 15:56:28 +0800 error_class="WebHDFS::ClientError" error="{\"RemoteException\":{\"exception\":\"IllegalArgumentException\",\"javaClassName\":\"java.lang.IllegalArgumentException\",\"message\":\"n must be positive\"}}" instance=23456251808160
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:313:in `request'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:231:in `operate_requests'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:45:in `create'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:189:in `rescue in send_data'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:186:in `send_data'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:205:in `write'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:296:in `write_chunk'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:276:in `pop'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:311:in `try_flush'
2014-12-03 15:56:12 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:132:in `run'
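If you see the output above, something is wrong with your HDFS file system itself (for example, it cannot accept writes). Test separately whether HDFS is running normally.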
2)
2014-12-04 14:44:55 +0800 [warn]: failed to communicate hdfs cluster, path: /log/20141204_14/access.log.node1.test.com
2014-12-04 14:44:55 +0800 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-04 14:45:30 +0800 error_class="WebHDFS::IOError" error="{\"RemoteException\":{\"exception\":\"AccessControlException\",\"javaClassName\":\"org.apache.hadoop.security.AccessControlException\",\"message\":\"org.apache.hadoop.security.AccessControlException: Permission denied: user=webuser, access=WRITE, inode=\\\"\\\":hadoop:supergroup:rwxr-xr-x\"}}" instance=23456251808060
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:317:in `request'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:242:in `operate_requests'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:45:in `create'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:189:in `rescue in send_data'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:186:in `send_data'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:205:in `write'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:296:in `write_chunk'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:276:in `pop'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:311:in `try_flush'
2014-12-04 14:44:55 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:132:in `run'
2014-12-04 14:45:31 +0800 [warn]: failed to communicate hdfs cluster, path: /log/20141204_14/access.log.node1.test.com
2014-12-04 14:45:31 +0800 [warn]: temporarily failed to flush the buffer. next_retry=2014-12-04 14:46:26 +0800 error_class="WebHDFS::IOError" error="{\"RemoteException\":{\"exception\":\"AccessControlException\",\"javaClassName\":\"org.apache.hadoop.security.AccessControlException\",\"message\":\"org.apache.hadoop.security.AccessControlException: Permission denied: user=webuser, access=WRITE, inode=\\\"\\\":hadoop:supergroup:rwxr-xr-x\"}}" instance=23456251808060
2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:317:in `request'
2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:242:in `operate_requests'
2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/webhdfs-0.5.5/lib/webhdfs/client_v1.rb:45:in `create'
2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:189:in `rescue in send_data'
2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:186:in `send_data'
2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-webhdfs-0.3.1/lib/fluent/plugin/out_webhdfs.rb:205:in `write'
2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:296:in `write_chunk'
2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/buffer.rb:276:in `pop'
2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:311:in `try_flush'
2014-12-04 14:45:31 +0800 [warn]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/output.rb:132:in `run'
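This usually means the HDFS permissions are not set up correctly. Run chmod 777 on the HDFS directory that stores the logs and the problem goes away!
Once logs are being written to HDFS normally, the log shows: 2014-12-04 14:48:40 +0800 [warn]: retry succeeded. instance=23456251808060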
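The two failures and the success message can be told apart mechanically by looking at the exception name inside the escaped JSON. The helper below is a rough sketch along those lines, not part of fluentd: triage_line is a hypothetical name, the wording of its output is mine, and the log file path in the usage comment assumes the td-agent default location.

```shell
# Hypothetical triage helper for td-agent log lines like the ones above.
triage_line() {
    # Drop the backslashes that escape the embedded JSON.
    line=$(printf '%s\n' "$1" | tr -d '\\')
    case "$line" in
        *"retry succeeded"*)
            echo "ok: buffer flushed to HDFS"
            return 0 ;;
    esac
    # Pull out the value of the "exception" key, if present.
    exception=$(printf '%s\n' "$line" |
        sed -n 's/.*"exception":"\([A-Za-z]*\)".*/\1/p')
    case "$exception" in
        AccessControlException)
            user=$(printf '%s\n' "$line" | sed -n 's/.*user=\([^,]*\),.*/\1/p')
            echo "permission: user '$user' cannot write to the target directory" ;;
        IllegalArgumentException)
            echo "hdfs: the cluster rejected the write; test HDFS on its own" ;;
        "")
            echo "other: no RemoteException found" ;;
        *)
            echo "other: $exception" ;;
    esac
}

triage_line '2014-12-04 14:48:40 +0800 [warn]: retry succeeded. instance=23456251808060'

# Usage against the live log (path assumed to be the td-agent default):
#   grep -E 'failed to flush|retry succeeded' /var/log/td-agent/td-agent.log |
#       while IFS= read -r l; do triage_line "$l"; done
```

For the second error in this post the helper reports user 'webuser', which is the account that needs write access to /log.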