Using Kafka as a producer to write data to HDFS (single node)

Key: check the official user guide (the Kafka source used below is documented in the Flume user guide).

# Declare the agent's source, channel, and sink
agent.sources = kafkaSource
agent.channels = memoryChannel
agent.sinks = hdfsSink

# Kafka source: the 0.8-era consumer that discovers brokers through
# ZooKeeper (zookeeperConnect) rather than bootstrap.servers
agent.sources.kafkaSource.type = org.apache.flume.source.kafka.KafkaSource
agent.sources.kafkaSource.zookeeperConnect = 192.168.57.11:2181
agent.sources.kafkaSource.topic = test_pan
agent.sources.kafkaSource.groupId = test-consumer-group
agent.sources.kafkaSource.kafka.consumer.timeout.ms = 100

# Memory channel: small capacities are enough for a single-node test
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 100
agent.channels.memoryChannel.transactionCapacity = 100

# HDFS sink: write events as plain text in an uncompressed stream
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.hdfs.path = hdfs://beicai/test/pan
agent.sinks.hdfsSink.hdfs.writeFormat = Text
agent.sinks.hdfsSink.hdfs.fileType = DataStream

# Roll to a new file every 1024 bytes or every 60 seconds;
# rollCount = 0 disables rolling by event count
agent.sinks.hdfsSink.hdfs.rollSize = 1024
agent.sinks.hdfsSink.hdfs.rollCount = 0
agent.sinks.hdfsSink.hdfs.rollInterval = 60

# Finished files are named test.<timestamp>.data
agent.sinks.hdfsSink.hdfs.filePrefix = test
agent.sinks.hdfsSink.hdfs.fileSuffix = .data

# Files still being written get a leading underscore and no suffix, so
# downstream jobs can skip in-progress files (the property names are
# inUsePrefix/inUseSuffix)
agent.sinks.hdfsSink.hdfs.inUsePrefix = _
agent.sinks.hdfsSink.hdfs.inUseSuffix =

# Wire the source and sink to the channel
agent.sources.kafkaSource.channels = memoryChannel
agent.sinks.hdfsSink.channel = memoryChannel
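
To try this end to end, the sketch below shows one way to drive the pipeline. It assumes the config above is saved as conf/kafka-to-hdfs.conf, that the broker registered in the 192.168.57.11:2181 ZooKeeper listens on port 9092, and standard Kafka/Flume install layouts; none of these file names or ports appear in the original post, so adjust them to your environment.

# 1. Create the topic the source consumes from (0.8-style, via ZooKeeper)
bin/kafka-topics.sh --create --zookeeper 192.168.57.11:2181 \
  --replication-factor 1 --partitions 1 --topic test_pan

# 2. Start the Flume agent; --name must match the "agent" prefix used above
bin/flume-ng agent --conf conf --conf-file conf/kafka-to-hdfs.conf \
  --name agent -Dflume.root.logger=INFO,console

# 3. In another shell, type a few test lines into the console producer
bin/kafka-console-producer.sh --broker-list 192.168.57.11:9092 --topic test_pan

# 4. After a roll (60 s or 1024 bytes), the data appears on HDFS
hdfs dfs -ls hdfs://beicai/test/pan

While a file is open it is listed with the leading underscore from inUsePrefix (for example _test.1540000000000.data) and is renamed to its final test.<timestamp>.data name once rolled.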

Original source: https://www.cnblogs.com/pingzizhuanshu/p/9102602.html
