Hadoop Ecosystem related ports

This article summarizes the default ports used by components of the Hadoop ecosystem, including HDFS, MapReduce, HBase, Hive, Spark, WebHCat, Impala, Alluxio, and Sqoop. It will be updated over time.
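
Since most of the entries below are plain TCP listeners, a quick way to confirm that a service is actually listening on its documented port is a socket probe. The following is a minimal Python sketch; the hostnames and port list are placeholders that should be replaced with values from the tables below.

```python
import socket

def check_port(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder host/port pairs taken from the tables below; adjust to your cluster.
for host, port in [("namenode-1", 50070), ("namenode-1", 8020), ("datanode-1", 50010)]:
    status = "open" if check_port(host, port) else "unreachable"
    print(f"{host}:{port} is {status}")
```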

HDFS Ports:

| Service | Servers | Default Ports Used | Protocol | Description | Need End User Access? | Configuration Parameters |
| --- | --- | --- | --- | --- | --- | --- |
| NameNode WebUI | Master Nodes (NameNode and any back-up NameNodes) | 50070 | http | Web UI to look at the current status of HDFS and explore the file system | Yes (Typically admins, Dev/Support teams) | dfs.http.address |
| NameNode WebUI | Master Nodes (NameNode and any back-up NameNodes) | 50470 | https | Secure http service | Yes (Typically admins, Dev/Support teams) | dfs.https.address |
| NameNode metadata service | Master Nodes (NameNode and any back-up NameNodes) | 8020/9000 | IPC | File system metadata operations | Yes (All clients that directly interact with HDFS) | Embedded in URI specified by fs.default.name |
| DataNode | All Slave Nodes | 50075 | http | DataNode WebUI to access status, logs, etc. | Yes (Typically admins, Dev/Support teams) | dfs.datanode.http.address |
| DataNode | All Slave Nodes | 50475 | https | Secure http service | Yes (Typically admins, Dev/Support teams) | dfs.datanode.https.address |
| DataNode | All Slave Nodes | 50010 | | Data transfer | | dfs.datanode.address |
| DataNode | All Slave Nodes | 50020 | IPC | Metadata operations | No | dfs.datanode.ipc.address |
| Secondary NameNode | Secondary NameNode and any backup Secondary NameNode | 50090 | http | Checkpoint for NameNode metadata | No | dfs.secondary.http.address |
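
As a usage example, the NameNode web port (50070 by default, per dfs.http.address) also serves a JMX servlet at /jmx that returns HDFS metrics as JSON. The sketch below assumes the `requests` package is installed and that `namenode-1` is a placeholder hostname.

```python
import requests

# Query the NameNode's JMX servlet on the dfs.http.address port (default 50070).
# "namenode-1" is a placeholder; substitute your NameNode host.
resp = requests.get("http://namenode-1:50070/jmx", timeout=10)
resp.raise_for_status()

# Print the name of each exposed MBean; individual beans carry HDFS capacity,
# block, and DataNode metrics.
for bean in resp.json().get("beans", []):
    print(bean.get("name"))
```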

MapReduce Ports:

| Service | Servers | Default Ports Used | Protocol | Description | Need End User Access? | Configuration Parameters |
| --- | --- | --- | --- | --- | --- | --- |
| JobTracker WebUI | Master Nodes (JobTracker node and any back-up JobTracker node) | 50030 | http | Web UI for JobTracker | Yes | mapred.job.tracker.http.address |
| JobTracker | Master Nodes (JobTracker node) | 8021 | IPC | For job submissions | Yes (All clients that need to submit MapReduce jobs, including Hive, Hive server, Pig) | Embedded in URI specified by mapred.job.tracker |
| TaskTracker Web UI and Shuffle | All Slave Nodes | 50060 | http | TaskTracker Web UI to access status, logs, etc. | Yes (Typically admins, Dev/Support teams) | mapred.task.tracker.http.address |
| History Server WebUI | | 51111 | http | Web UI for Job History | Yes | mapreduce.history.server.http.address |
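
The MapReduce web UIs above are plain HTTP services, so a simple status probe is enough to verify them. A minimal sketch using `requests`; the hostnames are placeholders.

```python
import requests

# Placeholder hosts; the ports come from the MapReduce table above.
ui_endpoints = {
    "JobTracker WebUI": "http://jobtracker-1:50030/",
    "TaskTracker WebUI": "http://tasktracker-1:50060/",
    "History Server WebUI": "http://jobtracker-1:51111/",
}

for name, url in ui_endpoints.items():
    try:
        code = requests.get(url, timeout=5).status_code
        print(f"{name}: HTTP {code}")
    except requests.RequestException as exc:
        print(f"{name}: unreachable ({exc})")
```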

HBase Ports:

| Service | Servers | Default Ports Used | Protocol | Description | Need End User Access? | Configuration Parameters |
| --- | --- | --- | --- | --- | --- | --- |
| HMaster | Master Nodes (HBase Master node and any back-up HBase Master node) | 60000 | | | Yes | hbase.master.port |
| HMaster Info Web UI | Master Nodes (HBase Master node and back-up HBase Master node, if any) | 60010 | http | The port for the HBase Master web UI. Set to -1 if you do not want the info server to run. | Yes | hbase.master.info.port |
| Region Server | All Slave Nodes | 60020 | | | Yes (Typically admins, dev/support teams) | hbase.regionserver.port |
| Region Server | All Slave Nodes | 60030 | http | | Yes (Typically admins, dev/support teams) | hbase.regionserver.info.port |
| ZooKeeper | All ZooKeeper Nodes | 2888 | | Port used by ZooKeeper peers to talk to each other. | No | hbase.zookeeper.peerport |
| ZooKeeper | All ZooKeeper Nodes | 3888 | | Port used by ZooKeeper peers for leader election. | No | hbase.zookeeper.leaderport |
| ZooKeeper | All ZooKeeper Nodes | 2181 | | Property from ZooKeeper's config zoo.cfg. The port at which clients will connect. | | hbase.zookeeper.property.clientPort |
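
HBase keeps its coordination data in ZooKeeper under a znode (/hbase by default), so the client port 2181 can be exercised directly. A minimal sketch assuming the `kazoo` package is installed and `zookeeper-1` is a placeholder hostname.

```python
from kazoo.client import KazooClient

# "zookeeper-1" is a placeholder; 2181 is hbase.zookeeper.property.clientPort.
zk = KazooClient(hosts="zookeeper-1:2181")
zk.start()
try:
    # /hbase is the default value of zookeeper.znode.parent.
    children = zk.get_children("/hbase")
    print("HBase znodes:", children)
finally:
    zk.stop()
    zk.close()
```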

Hive Ports:

| Service | Servers | Default Ports Used | Protocol | Description | Need End User Access? | Configuration Parameters |
| --- | --- | --- | --- | --- | --- | --- |
| Hive Server2 | Hive Server machine (usually a utility machine) | 10000 | thrift | Service for programmatically (Thrift/JDBC) connecting to Hive | Yes (Clients that need to connect to Hive either programmatically or through UI SQL tools that use JDBC) | ENV variable HIVE_PORT |
| Hive Metastore | | 9083 | thrift | | Yes (Clients that run Hive, Pig and potentially M/R jobs that use HCatalog) | hive.metastore.uris |
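
For the HiveServer2 Thrift port (10000), a JDBC/ODBC-capable tool or a Thrift client can connect directly. A minimal Python sketch assuming the `pyhive` package is installed; the hostname, username, and database are placeholders.

```python
from pyhive import hive

# "hiveserver-1" and "hadoop" are placeholders; 10000 is the default HiveServer2 port.
conn = hive.connect(host="hiveserver-1", port=10000, username="hadoop", database="default")
try:
    cursor = conn.cursor()
    cursor.execute("SHOW TABLES")
    for (table_name,) in cursor.fetchall():
        print(table_name)
finally:
    conn.close()
```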

WebHCat Ports:

| Service | Servers | Default Ports Used | Protocol | Description | Need End User Access? |
| --- | --- | --- | --- | --- | --- |
| WebHCat Server | Any utility machine | 50111 | http | Web API on top of HCatalog and other Hadoop services | Yes |
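
WebHCat is a REST API, so its port can be verified with a plain HTTP call; the server exposes a status resource under /templeton/v1. A sketch using `requests`, with a placeholder hostname.

```python
import requests

# "webhcat-1" is a placeholder; 50111 is the default WebHCat (Templeton) port.
resp = requests.get("http://webhcat-1:50111/templeton/v1/status", timeout=10)
resp.raise_for_status()

# A healthy server is expected to report {"status": "ok", "version": "v1"}.
print(resp.json())
```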

Spark Ports:

| Service | Servers | Default Ports Used | Description |
| --- | --- | --- | --- |
| Spark standalone master | Node running the Spark master | 7077 | Port that workers and client applications connect to (master URL spark://host:7077); the web interfaces for monitoring and troubleshooting default to 8080 on the master and 4040 for each running application |
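
Port 7077 is what applications point at when they target a standalone Spark master; the web UIs live on separate ports. A minimal sketch assuming `pyspark` is installed and `spark-master-1` is a placeholder hostname.

```python
from pyspark.sql import SparkSession

# spark://spark-master-1:7077 is the standalone master URL; the hostname is a placeholder.
spark = (
    SparkSession.builder
    .master("spark://spark-master-1:7077")
    .appName("port-smoke-test")
    .getOrCreate()
)

# Run a trivial job to confirm the cluster accepts work, then shut down.
print(spark.range(100).count())
spark.stop()
```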

Impala Ports:

| Service | Servers | Default Ports Used | Description |
| --- | --- | --- | --- |
| Impala Daemon | Nodes running the Impala daemon | 21000 | Used by impala-shell to transmit commands and receive results |
| Impala Daemon | Nodes running the Impala daemon | 21050 | Used by applications connecting through JDBC |
| Impala Daemon | Nodes running the Impala daemon | 25000 | Impala web interface for monitoring and troubleshooting |
| Impala StateStore Daemon | Nodes running the Impala StateStore daemon | 25010 | StateStore web interface for monitoring and troubleshooting |
| Impala Catalog Daemon | Nodes running the Impala Catalog daemon | 25020 | Catalog service web interface for monitoring and troubleshooting |
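
Port 21050 speaks the HiveServer2 protocol, which is what JDBC/ODBC drivers and the Python `impyla` package use. A minimal sketch with a placeholder hostname.

```python
from impala.dbapi import connect

# "impalad-1" is a placeholder; 21050 is the Impala daemon's HiveServer2/JDBC port.
conn = connect(host="impalad-1", port=21050)
try:
    cursor = conn.cursor()
    cursor.execute("SHOW DATABASES")
    for row in cursor.fetchall():
        print(row)
finally:
    conn.close()
```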

Alluxio Ports:

| Service | Servers | Default Ports Used | Protocol | Description | Need End User Access? |
| --- | --- | --- | --- | --- | --- |
| Alluxio Web GUI | Any utility machine | 19999 | http | Web GUI to check Alluxio status | Yes |
| Alluxio API | Any utility machine | 19998 | TCP | API to access data on Alluxio | No |

Sqoop Ports:

| Service | Servers | Default Ports Used | Description |
| --- | --- | --- | --- |
| Sqoop server | Nodes running Sqoop | 12000 | Used by the Sqoop client to access the Sqoop server |
