HDFS Admin Commands
While the dfs module for bin/hadoop provides common file and directory manipulation commands, they all work with objects within the file system. The dfsadmin module
manipulates or queries the file system as a whole. The operation of the
commands in this module is described in this section.
Getting overall status: A brief status report for HDFS can be retrieved with bin/hadoop dfsadmin -report. This returns basic information about the overall health of the HDFS cluster, as well as some per-server metrics.
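As a rough illustration of using the report for routine monitoring, the sketch below saves each report to a timestamped file so that successive reports can be compared. The save_report helper and the HADOOP variable are hypothetical names introduced here; only the dfsadmin -report command itself comes from the text above.

```shell
# Assumption: HADOOP points at the hadoop launcher; the default matches the
# bin/hadoop path used in this section.
HADOOP="${HADOOP:-bin/hadoop}"

# save_report DIR
# Hypothetical helper (not part of Hadoop): writes the current dfsadmin
# report to a timestamped file in DIR and prints the path of that file.
save_report() {
  local dir="$1"
  local out="$dir/dfs-report-$(date +%Y%m%d-%H%M%S).txt"
  mkdir -p "$dir" || return 1
  "$HADOOP" dfsadmin -report > "$out" || return 1
  echo "$out"
}

# Example (run from the Hadoop install directory):
#   save_report /var/tmp/hdfs-reports
```

Keeping the launcher path in a variable is only a convenience for installations where bin/hadoop lives somewhere else.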
More involved status: If you need more detail about the state of the NameNode's metadata, the command bin/hadoop dfsadmin -metasave filename will record this information in filename.
The metasave command will enumerate lists of blocks which are
under-replicated, in the process of being replicated, and scheduled for
deletion. NB: The help for this command states that it "saves NameNode's
primary data structures," but this is a misnomer; the NameNode's state
cannot be restored from this information. However, it will provide good
information about how the NameNode is managing HDFS's blocks.
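A minimal sketch of taking a dated metasave dump follows. The metasave_snapshot helper and the HADOOP variable are hypothetical names introduced for illustration. As far as I know, the file is written on the NameNode host under the directory named by the hadoop.log.dir property rather than in the caller's working directory, but treat that as an assumption and check your version's help text.

```shell
# Assumption: HADOOP points at the hadoop launcher used in this section.
HADOOP="${HADOOP:-bin/hadoop}"

# metasave_snapshot [PREFIX]
# Hypothetical helper: asks the NameNode to dump its block-management state
# to a file named PREFIX-<timestamp>.txt and prints that file name.
metasave_snapshot() {
  local name="${1:-metasave}-$(date +%Y%m%d-%H%M%S).txt"
  # Send the command's own output to stderr so that stdout carries
  # only the generated file name.
  "$HADOOP" dfsadmin -metasave "$name" >&2 || return 1
  echo "$name"
}

# Example:
#   metasave_snapshot blocks
```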
Safemode: Safemode is an HDFS state in which the file system is
mounted read-only; no replication is performed, nor can files be created
or deleted. This is automatically entered as the NameNode starts, to
allow all DataNodes time to check in with the NameNode and announce
which blocks they hold, before the NameNode determines which blocks are
under-replicated, etc. The NameNode waits until a specific percentage of
the blocks are present and accounted-for; this is controlled in the
configuration by the dfs.safemode.threshold.pct parameter. After this threshold is met, safemode is automatically exited, and HDFS allows normal operations. The bin/hadoop dfsadmin -safemode what command allows the user to manipulate safemode based on the value of what, described below:
- enter - Enters safemode
- leave - Forces the NameNode to exit safemode
- get - Returns a string indicating whether safemode is ON or OFF
- wait - Waits until safemode has exited and returns
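The get and wait options above lend themselves to scripting, for example to hold back a job until the NameNode has left safemode. The sketch below is one way this might look. The helper names and the HADOOP variable are hypothetical, and the check assumes that -safemode get prints a line containing "Safe mode is ON", which may differ between Hadoop versions.

```shell
# Assumption: HADOOP points at the hadoop launcher used in this section.
HADOOP="${HADOOP:-bin/hadoop}"

# Returns 0 if the NameNode reports that safemode is on, non-zero otherwise.
# Assumption: "-safemode get" prints a line containing "Safe mode is ON".
safemode_is_on() {
  "$HADOOP" dfsadmin -safemode get | grep -q "Safe mode is ON"
}

# run_after_safemode CMD [ARGS...]
# Blocks until the NameNode has exited safemode, then runs the given command.
run_after_safemode() {
  "$HADOOP" dfsadmin -safemode wait || return 1
  "$@"
}

# Example:
#   run_after_safemode bin/hadoop dfs -ls /
```

Using wait instead of polling get in a loop leaves the blocking to the dfsadmin tool itself.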
Changing HDFS membership: When decommissioning nodes, it is
important to disconnect nodes from HDFS gradually to ensure that data is
not lost. See the section on decommissioning later in this document for an explanation of the use of the -refreshNodes dfsadmin command.
Upgrading HDFS versions: When upgrading from one version of
Hadoop to the next, the file formats used by the NameNode and DataNodes
may change. When you first start the new version of Hadoop on the
cluster, you need to tell Hadoop to change the HDFS version (or else it
will not mount), using the command: bin/start-dfs.sh -upgrade. It will then begin upgrading the HDFS version. The status of an ongoing upgrade operation can be queried with the bin/hadoop dfsadmin -upgradeProgress status command. More verbose information can be retrieved with bin/hadoop dfsadmin -upgradeProgress details. If the upgrade is blocked and you would like to force it to continue, use the command: bin/hadoop dfsadmin -upgradeProgress force. (Note: be sure you know what you are doing if you use this last command.)
When HDFS is upgraded, Hadoop retains backup information allowing you to
downgrade to the original HDFS version in case you need to revert
Hadoop versions. To back out the changes, stop the cluster, re-install
the older version of Hadoop, and then use the command: bin/start-dfs.sh -rollback. It will restore the previous HDFS state.
Only one such archival copy can be kept at a time. Thus, after a few
days of operation with the new version (when it is deemed stable), the
archival copy can be removed with the command bin/hadoop dfsadmin -finalizeUpgrade. The rollback command cannot be issued after this point. The upgrade must be finalized in this way before a second Hadoop upgrade is allowed.
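To keep the upgrade, rollback, and finalize commands described above in one place, a small wrapper along the following lines may help. The hdfs_upgrade function and the HADOOP and START_DFS variables are hypothetical names introduced here. The commands it runs are those quoted in this section, and -upgradeProgress in particular may not exist in other Hadoop releases.

```shell
# Assumptions: both paths are relative to the Hadoop install directory,
# as in the commands quoted in this section.
HADOOP="${HADOOP:-bin/hadoop}"
START_DFS="${START_DFS:-bin/start-dfs.sh}"

# hdfs_upgrade PHASE
# Hypothetical wrapper mapping each phase of the upgrade lifecycle to the
# corresponding command. It deliberately omits "-upgradeProgress force".
hdfs_upgrade() {
  case "$1" in
    start)    "$START_DFS" -upgrade ;;                    # begin the upgrade
    status)   "$HADOOP" dfsadmin -upgradeProgress status ;;
    details)  "$HADOOP" dfsadmin -upgradeProgress details ;;
    rollback) "$START_DFS" -rollback ;;                   # only before finalize
    finalize) "$HADOOP" dfsadmin -finalizeUpgrade ;;      # removes the backup
    *)
      echo "usage: hdfs_upgrade {start|status|details|rollback|finalize}" >&2
      return 2
      ;;
  esac
}

# Example:
#   hdfs_upgrade start && hdfs_upgrade status
```

The rollback phase assumes the cluster has already been stopped and the older Hadoop version re-installed, as the text requires.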
Getting help: As with the dfs module, typing bin/hadoop dfsadmin -help cmd will provide more usage information about the command named by cmd.