ranger kafka - Authorizing Kafka access over non-authenticated channel via Ranger

This section answers questions you are likely to encounter when authorizing access to Kafka over a non-authenticated channel. This Kafka feature is available in HDP releases 2.3.4 (Dal-M20) or later.

Can I authorize access to Kafka over a non-secure channel via Ranger?

Yes, you can control access by IP address.

Can I authorize access to Kafka over a non-secure channel by user/user-groups?

No, you can’t use user/group-based policies to authorize Kafka access over a non-secure channel. This is because it isn’t possible to assert the client’s identity over a non-secure channel.

What is the recommended way to set up policies when trying to control access to Kafka over a non-secure channel?

Ensure that all broker nodes have Kafka Admin access. This is a mandatory step. If you skip it, your cluster won’t work properly.

  • Identify the nodes where brokers are running.
  • Create a policy where the resource is * (i.e. all topics) and grant the Kafka Admin access type to the public user group. Specify the IP addresses of all the brokers as the ip-range policy condition on the policy item.
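The broker policy described above can be sketched as a Ranger policy document. This is a minimal illustration, not an official template: the field names follow Ranger's policy JSON model as commonly exported, the access type `kafka_admin` and condition type `ip-range` are the identifiers assumed for the Kafka service definition, and the service name, policy name, and IP addresses are hypothetical. Verify the names against your Ranger version before using them.

```python
import json


def broker_admin_policy(service_name, broker_ips):
    """Build a policy dict granting Kafka Admin on all topics to the broker IPs."""
    return {
        "service": service_name,        # hypothetical Ranger Kafka service name
        "name": "kafka-brokers-admin",  # hypothetical policy name
        "isEnabled": True,
        "resources": {"topic": {"values": ["*"]}},  # * means all topics
        "policyItems": [
            {
                # public covers every user, including ANONYMOUS
                "groups": ["public"],
                "accesses": [{"type": "kafka_admin", "isAllowed": True}],
                # the grant applies only to requests from these addresses
                "conditions": [{"type": "ip-range", "values": list(broker_ips)}],
            }
        ],
    }


# Hypothetical broker addresses, for illustration only.
policy = broker_admin_policy("cl1_kafka", ["10.0.0.11", "10.0.0.12", "10.0.0.13"])
print(json.dumps(policy, indent=2))
```

The same shape works for publisher and consumer policies: replace the topic values with the topic names, the access type with the one needed, and the condition values with the client machines' addresses.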

Ensure publishers have appropriate access.

  • Identify the IP addresses of all nodes where publishers will run, along with their respective topics.
  • Create a policy and specify the respective topic name(s) as the policy resource. Note that you can specify multiple topics or even regular expressions in topic names.
  • Grant the Publish access type to the public user group. Specify the IP addresses of the machines where those publishers will run as the ip-range policy condition on the policy item.

Ensure consumers have appropriate access. The process is the same as for publishers, except that the access type is Consume instead of Publish.

Why do we have to specify the public user group on all policy items created for authorizing Kafka access over a non-secure channel?

  • Kafka can’t assert the identity of the client user over a non-secure channel. Thus, Kafka treats all users of such connections as an anonymous user (a special user literally named ANONYMOUS).
  • Ranger’s public user group is a means to model all users which, of course, includes this anonymous user (ANONYMOUS).

What are the specific things to watch out for when setting up authorization for accessing Kafka over a non-secure channel?

  • Make sure that all broker IPs have Kafka Admin access to all topics, i.e. *.
  • Make sure that no publishers or consumers that need access control are running on broker nodes. Since broker IPs have open access, it isn’t possible to control access for clients running on those nodes.
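To see why clients on broker nodes cannot be restricted, here is a simplified model of how an IP-conditioned policy item is evaluated. It is a sketch for intuition, not Ranger's actual evaluator: the matching rule (exact address, or prefix when the pattern ends in `*`), the assumption that Kafka Admin implies the other access types, and all addresses are illustrative assumptions.

```python
# Assumption: Kafka Admin implies the other access types in this model.
IMPLIED = {"kafka_admin": {"kafka_admin", "publish", "consume", "configure", "describe"}}


def ip_matches(client_ip, pattern):
    """Exact match, or prefix match when the pattern ends in '*' (simplified)."""
    if pattern.endswith("*"):
        return client_ip.startswith(pattern[:-1])
    return client_ip == pattern


def is_allowed(client_ip, access, policy_items):
    """True if any item grants `access` to a client connecting from `client_ip`.

    Every client is ANONYMOUS and in group public, so only the IP differs.
    """
    for item in policy_items:
        granted = set()
        for a in item["accesses"]:
            granted |= IMPLIED.get(a, {a})
        if access in granted and any(ip_matches(client_ip, p) for p in item["ip_range"]):
            return True
    return False


# Hypothetical setup: one broker node and a publisher subnet.
policy_items = [
    {"accesses": ["kafka_admin"], "ip_range": ["10.0.0.11"]},  # broker node
    {"accesses": ["publish"], "ip_range": ["10.0.1.*"]},       # publisher nodes
]

print(is_allowed("10.0.0.11", "consume", policy_items))  # client on the broker node
print(is_allowed("10.0.1.25", "consume", policy_items))  # client on a publisher node
```

In this model any process on 10.0.0.11 is allowed to consume, because it is indistinguishable from the broker itself, while a process on a publisher node is limited to Publish.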

I have the policies as specified above; however, I am still not able to consume over a non-authenticated channel using the bin/kafka-console-consumer.sh script that is part of the Kafka distribution! The consumer hangs and gives the error message “No brokers found in ZK.” What gives?

  • Ensure that /etc/kafka/conf/kafka_client_jaas.conf does not contain a section that specifies serviceName="zookeeper". This is typically the Client section.
  • Ensure that you are not passing the --security-protocol PLAINTEXTSASL argument to the consumer. Either specify --security-protocol PLAINTEXT or leave --security-protocol unspecified, since its default value is PLAINTEXT.
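The first check can be automated with a small script. The sketch below scans JAAS text for sections that set serviceName="zookeeper"; the sample content is a hypothetical file in the usual Krb5LoginModule layout, and the regular expression assumes simple, non-nested `Name { ... };` sections.

```python
import re

# Hypothetical JAAS content; in practice, read /etc/kafka/conf/kafka_client_jaas.conf.
SAMPLE_JAAS = '''
KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useTicketCache=true
  serviceName="kafka";
};
Client {
  com.sun.security.auth.module.Krb5LoginModule required
  useTicketCache=true
  serviceName="zookeeper";
};
'''


def zookeeper_sections(jaas_text):
    """Return the names of JAAS sections whose body sets serviceName="zookeeper"."""
    sections = re.findall(r'(\w+)\s*\{(.*?)\}\s*;', jaas_text, flags=re.DOTALL)
    return [name for name, body in sections
            if re.search(r'serviceName\s*=\s*"zookeeper"', body)]


print(zookeeper_sections(SAMPLE_JAAS))  # ['Client']
```

An empty list means the file has no ZooKeeper section that would push the console consumer into secure mode.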

I can’t edit the /etc/kafka/conf/kafka_client_jaas.conf file! What should I do to consume Kafka messages over a non-authenticated channel?

  • In that case, run kinit to obtain a valid Kerberos ticket.
  • That ticket will be used to authenticate you to ZooKeeper. After that you should be able to consume messages from Kafka over the non-authenticated channel. The connection to the Kafka brokers still happens over the non-authenticated channel and is authorized as user ANONYMOUS.

Why do I need to edit the /etc/kafka/conf/kafka_client_jaas.conf file?

The presence of a Client block for service zookeeper in /etc/kafka/conf/kafka_client_jaas.conf causes the console consumer to connect to ZooKeeper in secure mode. To do so it needs a Kerberos ticket, which won’t exist in simple auth mode, so the connection fails.

Authorizing topic creation

This section describes the issues you might encounter while trying to authorize topic creation in Kafka using Ranger.

Can I authorize topic creation via Ranger?

Yes, but only if the topic is being auto-created by consumers or producers.

What is the recommended policy setup to authorize topic auto-creation for producers or consumers?

  • Create a policy where the resource is all topics, i.e. *.
  • For producers, create a policy item under this policy that grants both Publish and Configure permissions to the relevant users/user groups.
  • For consumers, create a policy item under this policy that grants both Consume and Configure permissions to the relevant users/user groups.
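The setup above amounts to one policy on all topics with two policy items. A minimal sketch follows, assuming Ranger's policy JSON field names and the access type identifiers `publish`, `consume`, and `configure`; the service name, policy name, and group names are hypothetical.

```python
import json


def auto_create_policy(service_name, producer_groups, consumer_groups):
    """Build one policy on all topics with separate producer and consumer items."""

    def item(groups, access_types):
        return {
            "groups": list(groups),
            "accesses": [{"type": t, "isAllowed": True} for t in access_types],
        }

    return {
        "service": service_name,            # hypothetical Ranger Kafka service name
        "name": "kafka-topic-auto-create",  # hypothetical policy name
        "isEnabled": True,
        "resources": {"topic": {"values": ["*"]}},  # creation needs all topics
        "policyItems": [
            item(producer_groups, ["publish", "configure"]),
            item(consumer_groups, ["consume", "configure"]),
        ],
    }


# Hypothetical group names, for illustration only.
policy = auto_create_policy("cl1_kafka", ["etl-writers"], ["report-readers"])
print(json.dumps(policy, indent=2))
```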

Can I authorize topic auto-creation for producers or consumers that connect over a non-authenticated channel?

  • Yes, create a policy similar to the one for a secure producer.
  • Either add the public user group to the policy item or specify an IP-address-based custom condition.
  • Refer to the FAQ about authorizing Kafka access over a non-authenticated channel for additional details and rationale.

Why do I have to grant create access to all topics (via *) to allow for auto-creation to work for producers and/or consumers?

Topic creation is currently a cluster-level privilege. Thus it requires access privileges over all topics in a cluster, i.e. *.

I want to allow topic auto-creation for any topic that starts with finance, e.g. finance_1, finance_2, etc., for users that are part of the Finance user group. But I don’t want them to be able to auto-create topics that start with other strings, say, marketing_123. Can I model this sort of authorization in the Ranger Kafka plugin?

  • No, because topic creation in Kafka is currently a cluster-level permission, i.e. it applies to all topics.
  • There is a pending proposal for hierarchical topics in Kafka which, if and when it is implemented, could help with that use case.

I am using the Kafka-supplied console consumer to test topic auto-creation by a consumer, but it is not working. Shouldn’t the new topic get auto-created the moment I start up the consumer? I have verified the recommended policy setup as indicated above! What gives?

Make sure that you specify the following two arguments to the console consumer.

  • --new-consumer
  • --bootstrap-server <broker-host:port>: Any single broker host/port will do.
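As an illustration, the snippet below assembles such an invocation. It assumes the broker option is spelled `--bootstrap-server`, as in the Kafka 0.9 console consumer; the broker host, port, and topic name are hypothetical.

```python
import shlex


def console_consumer_command(bootstrap_server, topic):
    """Return the argument list for a new-consumer console consumer run."""
    return [
        "bin/kafka-console-consumer.sh",
        "--new-consumer",                        # use the new consumer API
        "--bootstrap-server", bootstrap_server,  # any single broker host:port
        "--topic", topic,
    ]


# Hypothetical broker address and topic name.
cmd = console_consumer_command("broker1.example.com:6667", "finance_1")
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd)` from the Kafka installation directory; `shlex.join` only renders it for display.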

The most common way of creating a topic is the bin/kafka-topics.sh script that is part of the Kafka distribution. Can I authorize topic creation via that mechanism?

No.

Why can’t I authorize topic creation done via the bin/kafka-topics.sh script?

  • This script talks directly to ZooKeeper. Hence, the policies of the Kafka plugin don’t come into the picture.
  • The script adds entries to ZooKeeper nodes; watchers inside the brokers monitor those nodes and create the topics.

So what are my options to authorize topic creation via the bin/kafka-topics.sh script?

  • Since the script interacts directly with ZooKeeper, this is best controlled via ZooKeeper ACLs.

Is there a Ranger plugin for ZooKeeper?

Not yet.

Where can I learn more about Kafka’s support for publish/consume over a non-authenticated channel?

Please refer to KAFKA-1809, which implemented the multiple listeners design.

Original source: https://www.cnblogs.com/felixzh/p/12259436.html

Time: 2024-10-07 06:49:43
