Part One: HDFS User Guide

1. Key features of HDFS

  • Hadoop, including HDFS, is well suited for distributed storage and distributed processing using commodity hardware. It is fault tolerant, scalable, and extremely simple to expand. MapReduce, well known for its simplicity and applicability to a large set of distributed applications, is an integral part of Hadoop. (Distributed storage.)
  • HDFS is highly configurable, with a default configuration well suited for many installations. Most of the time, the configuration needs to be tuned only for very large clusters. (Sensible defaults.)
  • Hadoop is written in Java and is supported on all major platforms. (Platform portability.)
  • Hadoop supports shell-like commands to interact with HDFS directly. (Shell-like operation.)
  • The NameNode and DataNodes have built-in web servers that make it easy to check the current status of the cluster. (Built-in web servers for inspecting the cluster.)
  • New features and improvements are regularly implemented in HDFS. The following is a subset of useful features in HDFS:
    • File permissions and authentication.
    • Rack awareness: to take a node's physical location into account while scheduling tasks and allocating storage.
    • Safemode: an administrative mode for maintenance.
    • fsck: a utility to diagnose health of the file system, to find missing files or blocks.
    • fetchdt: a utility to fetch a DelegationToken and store it in a file on the local system.
    • Balancer: a tool to balance the cluster when the data is unevenly distributed among DataNodes.
    • Upgrade and rollback: after a software upgrade, it is possible to roll back to the HDFS state before the upgrade in case of unexpected problems.
    • Secondary NameNode: performs periodic checkpoints of the namespace and helps keep the size of the file containing the log of HDFS modifications within certain limits at the NameNode.
    • Checkpoint node: performs periodic checkpoints of the namespace and helps minimize the size of the log stored at the NameNode containing changes to HDFS. Replaces the role previously filled by the Secondary NameNode, though it is not yet battle hardened. The NameNode allows multiple Checkpoint nodes simultaneously, as long as there are no Backup nodes registered with the system.
    • Backup node: an extension to the Checkpoint node. In addition to checkpointing, it also receives a stream of edits from the NameNode and maintains its own in-memory copy of the namespace, which is always in sync with the active NameNode namespace state. Only one Backup node may be registered with the NameNode at once.

      Source: http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html

2. Web UI

The NameNode web UI listens on port 50070 by default.

3. Basic HDFS administration commands

bin/hdfs dfsadmin -<option>

  • -report: reports basic statistics of HDFS. Some of this information is also available on the NameNode front page.
  • -safemode: though usually not required, an administrator can manually enter or leave Safemode.
  • -finalizeUpgrade: removes the previous backup of the cluster made during the last upgrade.
  • -refreshNodes: Updates the namenode with the set of datanodes allowed to connect to the namenode. Namenodes re-read datanode hostnames in the files defined by dfs.hosts and dfs.hosts.exclude. Hosts defined in dfs.hosts are the datanodes that are part of the cluster. If there are entries in dfs.hosts, only the hosts in it are allowed to register with the namenode. Entries in dfs.hosts.exclude are datanodes that need to be decommissioned. Datanodes complete decommissioning when all the replicas from them are replicated to other datanodes. Decommissioned nodes are not automatically shut down and are not chosen for writing new replicas.
  • -printTopology: Print the topology of the cluster. Display a tree of racks and datanodes attached to the racks as viewed by the NameNode.
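The options above can be sketched as typical invocations. These assume a configured Hadoop client and a running cluster with suitable permissions; output is omitted:

```shell
# Report capacity, usage, and per-DataNode status.
bin/hdfs dfsadmin -report

# Query (or enter/leave) safemode manually.
bin/hdfs dfsadmin -safemode get

# Re-read dfs.hosts / dfs.hosts.exclude, e.g. when decommissioning nodes.
bin/hdfs dfsadmin -refreshNodes

# Print the rack/DataNode tree as the NameNode sees it.
bin/hdfs dfsadmin -printTopology
```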

4. Secondary NameNode

The NameNode records modifications to the file system as an append-only log on its local file system. When the NameNode starts, it first reads the HDFS state from an image file (fsimage), then applies the modifications from the edit log to that image, and then opens a new log file to receive new modifications. Because the NameNode merges the image and the log only at startup, the log can grow very large, and the next startup can take a long time because there is so much to merge.

The Secondary NameNode periodically merges the edit log from the NameNode, keeping the log size within a limit. It usually does not run on the same machine as the primary NameNode, but its machine should have the same specifications as the NameNode's.

The checkpoint process on the Secondary NameNode is controlled by two parameters:

  • dfs.namenode.checkpoint.period, set to 1 hour by default, specifies the maximum delay between two consecutive checkpoints, and
  • dfs.namenode.checkpoint.txns, set to 1 million by default, defines the number of uncheckpointed transactions on the NameNode which will force an urgent checkpoint, even if the checkpoint period has not been reached.

dfs.namenode.checkpoint.period: the maximum interval between two consecutive checkpoints.

dfs.namenode.checkpoint.txns: run a checkpoint once this many uncheckpointed transactions have accumulated, even if the period above has not elapsed; the default is 1 million (for example, if 1 million edits accumulate within 10 minutes, a checkpoint runs after those 10 minutes rather than waiting the full hour).
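The interaction of the two parameters can be sketched as a toy trigger rule (this is an illustration, not Hadoop's actual implementation): a checkpoint fires when either the period has elapsed or the transaction threshold is reached.

```python
# Toy sketch of the checkpoint trigger rule described above.
CHECKPOINT_PERIOD_SECS = 3600   # dfs.namenode.checkpoint.period (1 hour)
CHECKPOINT_TXNS = 1_000_000     # dfs.namenode.checkpoint.txns

def should_checkpoint(secs_since_last: int, uncheckpointed_txns: int) -> bool:
    """Return True if a checkpoint should run now: either the period has
    elapsed, or enough uncheckpointed transactions have accumulated."""
    return (secs_since_last >= CHECKPOINT_PERIOD_SECS
            or uncheckpointed_txns >= CHECKPOINT_TXNS)
```

So `should_checkpoint(600, 1_000_000)` is true even though only 10 minutes have passed, matching the example above.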

5. Checkpoint node

Very similar to the Secondary NameNode. The difference is that the Checkpoint node downloads the HDFS state image and the edit log from the NameNode, merges them locally, and then uploads the merged image back to the running NameNode.

dfs.namenode.backup.address       the node's address (host:port)

dfs.namenode.backup.http-address  the node's HTTP address (host:port)

dfs.namenode.checkpoint.period and dfs.namenode.checkpoint.txns control checkpointing in the same way as for the Secondary NameNode.

The Checkpoint node and the Secondary NameNode are essentially the same thing under different names; the Checkpoint node is the newer replacement for the Secondary NameNode role.
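As a configuration sketch, the two keys above go in hdfs-site.xml; the host name here is a placeholder (the ports shown are the documented defaults):

```xml
<!-- hdfs-site.xml fragment; "checkpoint-host" is a placeholder -->
<property>
  <name>dfs.namenode.backup.address</name>
  <value>checkpoint-host:50100</value>
</property>
<property>
  <name>dfs.namenode.backup.http-address</name>
  <value>checkpoint-host:50105</value>
</property>
```

The Checkpoint node itself is then started with bin/hdfs namenode -checkpoint.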

6. Backup node

The Backup node provides the same functionality as the Checkpoint node, but it additionally receives a real-time stream of namespace edits from the NameNode and applies them to its own local copy (note: the running NameNode does not merge edits into its on-disk image; it merges only at restart). The Backup node is therefore a complete, real-time backup of the NameNode namespace.

Currently a cluster can have only one Backup node; multiple Backup nodes may be supported in the future. Once a Backup node is present, no Checkpoint node can register with the cluster. The Backup node uses the same configuration keys as the Checkpoint node (dfs.namenode.backup.address and dfs.namenode.backup.http-address) and is started with bin/hdfs namenode -backup.

7. Import checkpoint

If the image and edit log files are lost, the latest checkpoint can be imported from a Checkpoint node. Three things are required:

dfs.namenode.name.dir: the NameNode metadata directory (it must be empty)

dfs.namenode.checkpoint.dir: the directory containing the image uploaded by the Checkpoint node

and starting the NameNode with the -importCheckpoint option.
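The steps above can be sketched as shell commands. The directory paths are placeholders for whatever hdfs-site.xml actually specifies, and this requires a configured Hadoop installation:

```shell
# 1. dfs.namenode.name.dir must point at an empty directory.
mkdir -p /data/dfs/name

# 2. dfs.namenode.checkpoint.dir must contain the uploaded checkpoint image.
ls /data/dfs/namesecondary/current

# 3. Start the NameNode; it loads the image from the checkpoint directory,
#    saves it into dfs.namenode.name.dir, and then serves as usual.
bin/hdfs namenode -importCheckpoint
```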

8. Balancer

Data in HDFS may not always be placed uniformly across the cluster. The block placement policy has to weigh the following considerations:

  • Policy to keep one of the replicas of a block on the same node as the node that is writing the block.
  • Need to spread different replicas of a block across the racks so that the cluster can survive the loss of a whole rack.
  • One of the replicas is usually placed on the same rack as the node writing to the file so that cross-rack network I/O is reduced.
  • Spread HDFS data uniformly across the DataNodes in the cluster.

    Source: http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Balancer
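The balancer (run as bin/hdfs balancer -threshold 10) moves blocks from over-utilized DataNodes to under-utilized ones. A toy sketch of its classification rule (an illustration, not Hadoop's actual code): a node is over- or under-utilized when its utilization (used/capacity) differs from the cluster-wide average by more than the threshold percentage.

```python
# Toy classification of DataNodes by the balancer's threshold rule.
def classify_nodes(used_per_node, capacity_per_node, threshold_pct=10.0):
    """Return {'over': [...], 'under': [...], 'balanced': [...]} of node indices."""
    avg = 100.0 * sum(used_per_node) / sum(capacity_per_node)
    result = {"over": [], "under": [], "balanced": []}
    for i, (used, cap) in enumerate(zip(used_per_node, capacity_per_node)):
        util = 100.0 * used / cap
        if util > avg + threshold_pct:
            result["over"].append(i)       # source of block moves
        elif util < avg - threshold_pct:
            result["under"].append(i)      # target of block moves
        else:
            result["balanced"].append(i)
    return result
```

For example, with two equal-capacity nodes at 90% and 10% utilization and a 10% threshold, the first is over-utilized and the second under-utilized.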

9. Rack awareness (skipped)

10. Safemode

When the cluster restarts, the NameNode reads the state image and the edit log, and then waits for the DataNodes to report their blocks, so it does not open the cluster for service immediately. During this time the NameNode is in safemode and the cluster is read-only. Once the DataNodes have reported enough blocks, the NameNode leaves safemode automatically and the cluster opens. Safemode can also be entered and left manually.
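The manual safemode controls mentioned above look like this (requires a running cluster):

```shell
# Query the current safemode state.
bin/hdfs dfsadmin -safemode get

# Force the NameNode into safemode (cluster becomes read-only).
bin/hdfs dfsadmin -safemode enter

# Leave safemode manually.
bin/hdfs dfsadmin -safemode leave

# Block until the NameNode leaves safemode (useful in scripts).
bin/hdfs dfsadmin -safemode wait
```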

11. fsck

The fsck command checks files (and file blocks) for inconsistencies. Unlike a traditional fsck, it does not repair the errors it finds, and by default it does not check files that are open for write. fsck is not a Hadoop shell command; it is run as bin/hdfs fsck.
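Typical invocations, as a sketch (they require a running cluster; the directory path is an example):

```shell
# Check the health of the entire namespace.
bin/hdfs fsck /

# List files, their blocks, and block locations under a directory.
bin/hdfs fsck /user/alice -files -blocks -locations

# Include files currently open for write (skipped by default).
bin/hdfs fsck / -openforwrite
```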

12. fetchdt

HDFS supports the fetchdt command to fetch a delegation token and store it in a file on the local file system. The token can later be used by a client from a non-secure environment to connect to a secure server such as the NameNode. (Details skipped.)

13. Recovery mode

Recovery mode. If the only copy of the NameNode metadata is lost or corrupted, recovery mode can salvage part of the data. Start the NameNode with namenode -recover and follow the interactive prompts; the -force option makes HDFS pick the default choice automatically instead of prompting.
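As a sketch, recovery mode is entered via the namenode command (this requires a configured installation; it operates on the metadata directories named in hdfs-site.xml):

```shell
# Interactive recovery: prompts for how to handle corrupt or missing
# metadata as it scans the edit log.
bin/hdfs namenode -recover

# Non-interactive: always pick the first (default) choice at each prompt.
bin/hdfs namenode -recover -force
```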

14. Upgrade and rollback

Upgrade and rollback. Skipped.

15. File permissions and security

File permissions and security. HDFS file permissions are similar to those in Linux. The user who starts the NameNode is treated as the HDFS superuser.

16. Scalability

HDFS can support clusters of thousands of nodes. Each cluster has only one NameNode, so the NameNode's memory becomes the limiting factor on cluster size.


From WizNote (Wiz)

Date: 2024-10-10 14:52:14
