Experience with NameNode backup and restore: checkpoint

Hadoop version: Hadoop 2.2.0.2.0.6.0-0009

We can do this by setting up a Secondary NameNode, a Checkpoint node, or a Backup node.

Example:

Assume you have a Secondary NameNode.

1. Check the Secondary NameNode checkpoint status, using these settings:

dfs.namenode.secondary.http-address in %HADOOP_CONF_DIR%/hdfs-site.xml

dfs.namenode.checkpoint.dir in %HADOOP_CONF_DIR%/hdfs-site.xml

dfs.namenode.checkpoint.edits.dir in %HADOOP_CONF_DIR%/hdfs-site.xml

dfs.namenode.checkpoint.period in %HADOOP_CONF_DIR%/hdfs-site.xml
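
To see which of these settings are actually present in hdfs-site.xml, a small script can help. The following is a minimal Python sketch, not part of the original procedure; the function name and the sample values are made up for illustration, and only the four property names come from this post.

```python
import xml.etree.ElementTree as ET

# Property names taken from step 1 of this post.
CHECKPOINT_KEYS = (
    "dfs.namenode.secondary.http-address",
    "dfs.namenode.checkpoint.dir",
    "dfs.namenode.checkpoint.edits.dir",
    "dfs.namenode.checkpoint.period",
)


def read_checkpoint_settings(xml_text):
    """Return {name: value} for the checkpoint properties found in the XML text.

    Properties that are not set in the file are left out of the result.
    """
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in CHECKPOINT_KEYS:
            found[name] = (prop.findtext("value") or "").strip()
    return found


# Hypothetical sample content; the values are illustrative only.
SAMPLE = """<configuration>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file:///c:/hdpdata/hdfs/snn</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.period</name>
    <value>3600</value>
  </property>
</configuration>"""

settings = read_checkpoint_settings(SAMPLE)
for key in CHECKPOINT_KEYS:
    print(key, "=", settings.get(key, "(not set in file)"))
```

To use it on a real cluster, pass in the contents of %HADOOP_CONF_DIR%/hdfs-site.xml instead of SAMPLE. A property that is missing from the file is not necessarily an error, because Hadoop then uses its built-in default.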

2. Back up an up-to-date checkpoint by hand:

On the Secondary NameNode, stop the Hadoop secondary namenode service.

Run cmd.exe as user hadoop (or as another user with full permissions):

  runas /user:hadoop cmd.exe

You need the password of user hadoop.

Force an up-to-date checkpoint:

  cmd>%hadoop_home%/bin/hadoop secondarynamenode -checkpoint force

Start the Hadoop secondary namenode service again and check the Secondary NameNode checkpoint status (see step 1).
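
One way to confirm that the forced checkpoint really produced a fresh image is to look at the newest fsimage file in the checkpoint directory. This is a minimal Python sketch, assuming the Hadoop 2.x layout in which images are stored as fsimage_<txid> files in the current subfolder of dfs.namenode.checkpoint.dir; the function names are made up for this illustration.

```python
import os
import re
import time

# Hadoop 2.x names checkpoint images "fsimage_<transaction id>". The matching
# "fsimage_<txid>.md5" checksum files do not match this pattern and are skipped.
FSIMAGE_RE = re.compile(r"^fsimage_(\d+)$")


def newest_fsimage(current_dir):
    """Return (txid, path) of the fsimage with the highest transaction id, or None."""
    best = None
    for name in os.listdir(current_dir):
        match = FSIMAGE_RE.match(name)
        if match:
            txid = int(match.group(1))
            if best is None or txid > best[0]:
                best = (txid, os.path.join(current_dir, name))
    return best


def checkpoint_age_seconds(current_dir, now=None):
    """Return the age in seconds of the newest fsimage, or None if there is none."""
    newest = newest_fsimage(current_dir)
    if newest is None:
        return None
    now = time.time() if now is None else now
    return now - os.path.getmtime(newest[1])
```

Right after a forced checkpoint, checkpoint_age_seconds pointed at the current subfolder of the checkpoint dir should return a small number. A value much larger than dfs.namenode.checkpoint.period suggests that checkpoints are not being written.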

3. Stop the NameNode services or reboot the NameNode (if the Hadoop services are set to manual startup, they all stay stopped after a reboot).

For testing, I first backed up my dfs.namenode.name.dir (i.e. C:\hdpdata\hdfs\nn) so that I can use it in my next test (restoring from a backup of the NameNode directory).

Delete all files in C:\hdpdata\hdfs\nn.

Open dfs.namenode.checkpoint.dir (see %HADOOP_CONF_DIR%/hdfs-site.xml) on the Secondary NameNode (i.e. c:\hdpdata\hdfs\snn).

Copy all checkpoint files (except the lock file) from this folder to the checkpoint dir on your NameNode (dfs.namenode.checkpoint.dir, the same path as on the Secondary NameNode).

Make sure the NameNode's checkpoint dir is empty before you copy!
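
The copy step can also be scripted. This is a minimal Python sketch under two assumptions: the lock file is named in_use.lock (the name Hadoop uses in its storage directories), and both folders are reachable from the machine running the script. The function name is hypothetical.

```python
import os
import shutil

# Assumed lock file name; Hadoop keeps a file with this name in a storage directory.
LOCK_FILE = "in_use.lock"


def copy_checkpoint(src_dir, dst_dir):
    """Copy the checkpoint folder src_dir to dst_dir, skipping the lock file.

    Refuses to run if dst_dir already contains anything, so an existing
    checkpoint dir on the NameNode is never mixed with the copied files.
    """
    if os.path.isdir(dst_dir):
        if os.listdir(dst_dir):
            raise RuntimeError("destination is not empty: " + dst_dir)
        os.rmdir(dst_dir)  # copytree creates the destination folder itself
    shutil.copytree(src_dir, dst_dir, ignore=shutil.ignore_patterns(LOCK_FILE))
```

Raising an error on a non-empty destination is a deliberate choice here: leftover files from an older checkpoint could otherwise be imported together with the new ones.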

4. Restore from the checkpoint dir

Run cmd.exe as user hadoop (or as another user with full permissions):

  runas /user:hadoop cmd.exe

You need the password of user hadoop.

Use this command to start the NameNode and import the checkpoint from the checkpoint dir:

  cmd>%hadoop_home%/bin/hdfs namenode -importCheckpoint

Use Ctrl+C to stop the service once the import has completed, then delete the checkpoint dir on your NameNode (dfs.namenode.checkpoint.dir, the same path as on the Secondary NameNode).

Start the services with this command:

  cmd>start_local_hdp_services.cmd

Leave safe mode:

  cmd>%hadoop_home%/bin/hdfs dfsadmin -safemode leave

Balance your HDFS:

  cmd>%hadoop_home%/bin/hdfs balancer -threshold 5

5. Confirm that your Hadoop services were restored successfully.

Open http://namenode:50070/ to check whether there are any missing blocks. If there are, find out which files they belong to and where they are.
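
As an alternative to the web page, the same check can be made from the command line with hdfs fsck, which this post does not otherwise use. This is a minimal Python sketch that reads saved fsck output and looks for the summary line; it assumes the summary has the form "The filesystem under path '/' is HEALTHY" (or "is CORRUPT"), and the function name is made up for illustration.

```python
def fsck_reports_healthy(fsck_output):
    """Return True if the fsck summary line says the checked path is HEALTHY.

    Raises ValueError if the text contains no summary line, for example
    because fsck failed before it finished.
    """
    result = None
    for line in fsck_output.splitlines():
        if line.startswith("The filesystem under path"):
            # Keep the last summary line in case the text holds several runs.
            result = line.rstrip().endswith("is HEALTHY")
    if result is None:
        raise ValueError("no fsck summary line found")
    return result
```

The input could come from redirecting the output of hdfs fsck / into a file and reading that file. If the result is False, the fsck output itself lists the affected files.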

Restoring from the Secondary NameNode is not a real-time restore, so changes made after the last checkpoint, for example by your most recent jobs, may be lost. That is expected; just delete the files with missing blocks.

Tip: If you need to restore from a real-time backup, configure multiple NameNode directories (several paths in dfs.namenode.name.dir). See the next post...

Time: 2024-10-06 06:06:50
