Hadoop Administration Course Content
Hadoop Admin
- How the Hadoop Distributed File System and MapReduce work
- What hardware configurations are optimal for Hadoop clusters
- How to configure Hadoop’s options for best cluster performance
- How to configure NameNode High Availability
- How to configure NameNode Federation
- How to configure the FairScheduler to provide service-level agreements for multiple users of a cluster
- How to install and implement Kerberos-based security for your cluster
- What system administration issues exist with other Hadoop projects such as Hive, Pig, and HBase
Introduction
- A brief history of Hadoop
- Core Hadoop components
- Fundamental concepts
The Hadoop Distributed File System
- HDFS features
- HDFS design assumptions
- Overview of HDFS architecture
- Writing and reading files
- NameNode considerations
- An overview of HDFS security
MapReduce
- What is MapReduce?
- Features of MapReduce
- Basic MapReduce concepts
- Architectural overview
- Failure recovery
Hadoop Ecosystem
- What is the Hadoop ecosystem?
- Integration tools
- Analysis tools
- Hive
- HBase
- Sqoop
- ZooKeeper
- Pig
Hadoop Cluster prerequisites
- General planning considerations
- Choosing the right hardware
- Network considerations
- Configuring nodes
Hadoop Installation
- Installing Hadoop
- Basic configuration parameters
- Advanced Configuration
Advanced Configuration
- Configuring rack awareness
- Configuring Federation
- Configuring High Availability
Managing and Scheduling Jobs
- Managing running jobs
- The FIFO Scheduler
- The FairScheduler
Cluster Maintenance
- Checking HDFS status
- Copying data between clusters
- Adding and removing cluster nodes
- Rebalancing the cluster
- NameNode Metadata backup
- Cluster upgrading