Hadoop Ecosystem related ports

This article summarizes the default ports used by components of the Hadoop ecosystem, including HDFS, MapReduce, HBase, Hive, Spark, WebHCat, Impala, Alluxio, and Sqoop. It will be updated over time.
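
Since most of the entries below are plain TCP listeners, a quick way to confirm that a service is actually listening on its documented port is a socket probe. The following is a minimal Python sketch; the hostnames and port list are placeholders that should be replaced with values from the tables below.

```python
import socket

def check_port(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder host/port pairs taken from the tables below; adjust to your cluster.
for host, port in [("namenode-1", 50070), ("namenode-1", 8020), ("datanode-1", 50010)]:
    status = "open" if check_port(host, port) else "unreachable"
    print(f"{host}:{port} is {status}")
```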

HDFS Ports:

| Service | Servers | Default Ports Used | Protocol | Description | Need End User Access? | Configuration Parameters |
| --- | --- | --- | --- | --- | --- | --- |
| NameNode WebUI | Master Nodes (NameNode and any back-up NameNodes) | 50070 | http | Web UI to look at the current status of HDFS and explore the file system | Yes (Typically admins, Dev/Support teams) | dfs.http.address |
| NameNode WebUI | Master Nodes (NameNode and any back-up NameNodes) | 50470 | https | Secure http service | Yes (Typically admins, Dev/Support teams) | dfs.https.address |
| NameNode metadata service | Master Nodes (NameNode and any back-up NameNodes) | 8020/9000 | IPC | File system metadata operations | Yes (All clients that directly interact with HDFS) | Embedded in URI specified by fs.default.name |
| DataNode | All Slave Nodes | 50075 | http | DataNode WebUI to access status, logs, etc. | Yes (Typically admins, Dev/Support teams) | dfs.datanode.http.address |
| DataNode | All Slave Nodes | 50475 | https | Secure http service | Yes (Typically admins, Dev/Support teams) | dfs.datanode.https.address |
| DataNode | All Slave Nodes | 50010 | | Data transfer | | dfs.datanode.address |
| DataNode | All Slave Nodes | 50020 | IPC | Metadata operations | No | dfs.datanode.ipc.address |
| Secondary NameNode | Secondary NameNode and any backup Secondary NameNode | 50090 | http | Checkpoint for NameNode metadata | No | dfs.secondary.http.address |
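
As a usage example, the NameNode web port (50070 by default, per dfs.http.address) also serves a JMX servlet at /jmx that returns HDFS metrics as JSON. The sketch below assumes the `requests` package is installed and that `namenode-1` is a placeholder hostname.

```python
import requests

# Query the NameNode's JMX servlet on the dfs.http.address port (default 50070).
# "namenode-1" is a placeholder; substitute your NameNode host.
resp = requests.get("http://namenode-1:50070/jmx", timeout=10)
resp.raise_for_status()

# Print the name of each exposed MBean; individual beans carry HDFS capacity,
# block, and DataNode metrics.
for bean in resp.json().get("beans", []):
    print(bean.get("name"))
```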

MapReduce Ports:

| Service | Servers | Default Ports Used | Protocol | Description | Need End User Access? | Configuration Parameters |
| --- | --- | --- | --- | --- | --- | --- |
| JobTracker WebUI | Master Nodes (JobTracker node and any back-up JobTracker node) | 50030 | http | Web UI for JobTracker | Yes | mapred.job.tracker.http.address |
| JobTracker | Master Nodes (JobTracker node) | 8021 | IPC | For job submissions | Yes (All clients that need to submit MapReduce jobs, including Hive, Hive server, Pig) | Embedded in URI specified by mapred.job.tracker |
| TaskTracker Web UI and Shuffle | All Slave Nodes | 50060 | http | TaskTracker Web UI to access status, logs, etc. | Yes (Typically admins, Dev/Support teams) | mapred.task.tracker.http.address |
| History Server WebUI | | 51111 | http | Web UI for Job History | Yes | mapreduce.history.server.http.address |
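
The MapReduce web UIs above are plain HTTP services, so a simple status probe is enough to verify them. A minimal sketch using `requests`; the hostnames are placeholders.

```python
import requests

# Placeholder hosts; the ports come from the MapReduce table above.
ui_endpoints = {
    "JobTracker WebUI": "http://jobtracker-1:50030/",
    "TaskTracker WebUI": "http://tasktracker-1:50060/",
    "History Server WebUI": "http://jobtracker-1:51111/",
}

for name, url in ui_endpoints.items():
    try:
        code = requests.get(url, timeout=5).status_code
        print(f"{name}: HTTP {code}")
    except requests.RequestException as exc:
        print(f"{name}: unreachable ({exc})")
```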

HBase Ports:

| Service | Servers | Default Ports Used | Protocol | Description | Need End User Access? | Configuration Parameters |
| --- | --- | --- | --- | --- | --- | --- |
| HMaster | Master Nodes (HBase Master node and any back-up HBase Master node) | 60000 | | | Yes | hbase.master.port |
| HMaster Info Web UI | Master Nodes (HBase Master node and back-up HBase Master node, if any) | 60010 | http | The port for the HBase Master web UI. Set to -1 if you do not want the info server to run. | Yes | hbase.master.info.port |
| Region Server | All Slave Nodes | 60020 | | | Yes (Typically admins, dev/support teams) | hbase.regionserver.port |
| Region Server | All Slave Nodes | 60030 | http | | Yes (Typically admins, dev/support teams) | hbase.regionserver.info.port |
| ZooKeeper | All ZooKeeper Nodes | 2888 | | Port used by ZooKeeper peers to talk to each other. | No | hbase.zookeeper.peerport |
| ZooKeeper | All ZooKeeper Nodes | 3888 | | Port used by ZooKeeper peers for leader election. | No | hbase.zookeeper.leaderport |
| ZooKeeper | All ZooKeeper Nodes | 2181 | | Property from ZooKeeper's config zoo.cfg. The port at which clients will connect. | | hbase.zookeeper.property.clientPort |
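
HBase keeps its coordination data in ZooKeeper under a znode (/hbase by default), so the client port 2181 can be exercised directly. A minimal sketch assuming the `kazoo` package is installed and `zookeeper-1` is a placeholder hostname.

```python
from kazoo.client import KazooClient

# "zookeeper-1" is a placeholder; 2181 is hbase.zookeeper.property.clientPort.
zk = KazooClient(hosts="zookeeper-1:2181")
zk.start()
try:
    # /hbase is the default value of zookeeper.znode.parent.
    children = zk.get_children("/hbase")
    print("HBase znodes:", children)
finally:
    zk.stop()
    zk.close()
```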

Hive Ports:

| Service | Servers | Default Ports Used | Protocol | Description | Need End User Access? | Configuration Parameters |
| --- | --- | --- | --- | --- | --- | --- |
| Hive Server2 | Hive Server machine (usually a utility machine) | 10000 | thrift | Service for programmatically (Thrift/JDBC) connecting to Hive | Yes (Clients that need to connect to Hive either programmatically or through UI SQL tools that use JDBC) | ENV variable HIVE_PORT |
| Hive Metastore | | 9083 | thrift | | Yes (Clients that run Hive, Pig and potentially M/R jobs that use HCatalog) | hive.metastore.uris |
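
For the HiveServer2 Thrift port (10000), a JDBC/ODBC-capable tool or a Thrift client can connect directly. A minimal Python sketch assuming the `pyhive` package is installed; the hostname, username, and database are placeholders.

```python
from pyhive import hive

# "hiveserver-1" and "hadoop" are placeholders; 10000 is the default HiveServer2 port.
conn = hive.connect(host="hiveserver-1", port=10000, username="hadoop", database="default")
try:
    cursor = conn.cursor()
    cursor.execute("SHOW TABLES")
    for (table_name,) in cursor.fetchall():
        print(table_name)
finally:
    conn.close()
```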

WebHCat Ports:

| Service | Servers | Default Ports Used | Protocol | Description | Need End User Access? |
| --- | --- | --- | --- | --- | --- |
| WebHCat Server | Any utility machine | 50111 | http | Web API on top of HCatalog and other Hadoop services | Yes |
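
WebHCat is a REST API, so its port can be verified with a plain HTTP call; the server exposes a status resource under /templeton/v1. A sketch using `requests`, with a placeholder hostname.

```python
import requests

# "webhcat-1" is a placeholder; 50111 is the default WebHCat (Templeton) port.
resp = requests.get("http://webhcat-1:50111/templeton/v1/status", timeout=10)
resp.raise_for_status()

# A healthy server is expected to report {"status": "ok", "version": "v1"}.
print(resp.json())
```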

Spark Ports:

| Service | Servers | Default Ports Used | Description |
| --- | --- | --- | --- |
| Spark standalone master | Node running the Spark master | 7077 | Port that workers and client applications connect to (master URL spark://host:7077); the web interfaces for monitoring and troubleshooting default to 8080 on the master and 4040 for each running application |
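
Port 7077 is what applications point at when they target a standalone Spark master; the web UIs live on separate ports. A minimal sketch assuming `pyspark` is installed and `spark-master-1` is a placeholder hostname.

```python
from pyspark.sql import SparkSession

# spark://spark-master-1:7077 is the standalone master URL; the hostname is a placeholder.
spark = (
    SparkSession.builder
    .master("spark://spark-master-1:7077")
    .appName("port-smoke-test")
    .getOrCreate()
)

# Run a trivial job to confirm the cluster accepts work, then shut down.
print(spark.range(100).count())
spark.stop()
```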

Impala Ports:

| Service | Servers | Default Ports Used | Description |
| --- | --- | --- | --- |
| Impala Daemon | Nodes running the Impala daemon | 21000 | Used by impala-shell to transmit commands and receive results |
| Impala Daemon | Nodes running the Impala daemon | 21050 | Used by applications connecting through JDBC |
| Impala Daemon | Nodes running the Impala daemon | 25000 | Impala web interface for monitoring and troubleshooting |
| Impala StateStore Daemon | Nodes running the Impala StateStore daemon | 25010 | StateStore web interface for monitoring and troubleshooting |
| Impala Catalog Daemon | Nodes running the Impala Catalog daemon | 25020 | Catalog service web interface for monitoring and troubleshooting |
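
Port 21050 speaks the HiveServer2 protocol, which is what JDBC/ODBC drivers and the Python `impyla` package use. A minimal sketch with a placeholder hostname.

```python
from impala.dbapi import connect

# "impalad-1" is a placeholder; 21050 is the Impala daemon's HiveServer2/JDBC port.
conn = connect(host="impalad-1", port=21050)
try:
    cursor = conn.cursor()
    cursor.execute("SHOW DATABASES")
    for row in cursor.fetchall():
        print(row)
finally:
    conn.close()
```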

Alluxio Ports:

| Service | Servers | Default Ports Used | Protocol | Description | Need End User Access? |
| --- | --- | --- | --- | --- | --- |
| Alluxio Web GUI | Any utility machine | 19999 | http | Web GUI to check Alluxio status | Yes |
| Alluxio API | Any utility machine | 19998 | TCP | API to access data on Alluxio | No |

Sqoop Ports:

| Service | Servers | Default Ports Used | Description |
| --- | --- | --- | --- |
| Sqoop server | Nodes running Sqoop | 12000 | Used by the Sqoop client to access the Sqoop server |
