Hadoop Training Course Content

Course Content for Hadoop Developer

This course covers 100% of the Developer syllabus and 40% of the Administration syllabus.

Introduction to Big Data and Hadoop:-

 Big Data Introduction

 Hadoop Introduction

 What is Hadoop? Why Hadoop?

 Hadoop History

 Different Types of Components in Hadoop

 HDFS, MapReduce, PIG, Hive, SQOOP, HBASE, OOZIE, Flume, Zookeeper and so on…

 What is the scope of Hadoop?

Deep Dive into HDFS (for Storing the Data):-

 Introduction to HDFS

 HDFS Design

 HDFS role in Hadoop

 Features of HDFS

 Daemons of Hadoop and their functionality

o Name Node

o Secondary Name Node

o Job Tracker

o Data Node

o Task Tracker

 Anatomy of File Write

 Anatomy of File Read

 Network Topology

o Nodes

o Racks

o Data Center

 Parallel Copying using DistCp

 Basic Configuration for HDFS

 Data Organization

o Blocks

o Replication

 Rack Awareness

 Heartbeat Signal

 How to Store the Data into HDFS

 How to Read the Data from HDFS

 Accessing HDFS (Introduction to Basic UNIX commands)

 CLI commands

MapReduce using Java (Processing the Data):-

 Introduction to MapReduce

 MapReduce Architecture

 Data flow in MapReduce

o Splits

o Mapper

o Partitioning

o Sort and shuffle

o Combiner

o Reducer

 Understanding the Difference Between Block and InputSplit

 Role of RecordReader

 Basic Configuration of MapReduce

 MapReduce life cycle

o Driver Code

o Mapper

o Reducer

 How MapReduce Works

 Writing and Executing the Basic MapReduce Program using Java

 Submission and Initialization of a MapReduce Job

 File Input/Output Formats in MapReduce Jobs

o Text Input Format

o Key Value Input Format

o Sequence File Input Format

o NLine Input Format

 Joins

o Map-side Joins

o Reduce-side Joins

 Word Count Example

 Partition MapReduce Program

 Side Data Distribution

o Distributed Cache (with Program)

 Counters (with Program)

o Types of Counters

o Task Counters

o Job Counters

o User Defined Counters

o Propagation of Counters

 Job Scheduling

PIG:-

 Introduction to Apache PIG

 Introduction to PIG Data Flow Engine

 MapReduce vs. PIG in detail

 When should PIG be used?

 Data Types in PIG

 Basic PIG programming

 Modes of Execution in PIG

o Local Mode

o MapReduce Mode

 Execution Mechanisms

o Grunt Shell

o Script

o Embedded

 Operators/Transformations in PIG

 PIG UDFs with Program

 Word Count Example in PIG

 Difference between MapReduce and PIG

SQOOP:-

 Introduction to SQOOP

 Use of SQOOP

 Connecting to a MySQL database

 SQOOP commands

o Import

o Export

o Eval

o Codegen etc…

 Joins in SQOOP

 Export to MySQL

 Export to HBase

HIVE:-

 Introduction to HIVE

 HIVE Meta Store

 HIVE Architecture

 Tables in HIVE

o Managed Tables

o External Tables

 Hive Data Types

o Primitive Types

o Complex Types

 Partition

 Joins in HIVE

 HIVE UDFs and UDAFs with Programs

 Word Count Example

HBASE:-

 Introduction to HBASE

 Basic Configurations of HBASE

 Fundamentals of HBase

 What is NoSQL?

 HBase Data Model

o Table and Row

o Column Family and Column Qualifier

o Cell and its Versioning

 Categories of NoSQL Databases

o Key-Value Database

o Document Database

o Column Family Database

 HBASE Architecture

o HMaster

o Region Servers

o Regions

o MemStore

o Store


 How HBASE differs from RDBMS

 HDFS vs. HBase

 Client-side buffering or bulk uploads

 Designing HBase Tables

 HBase Operations

o Get

o Scan

o Put

o Delete

MongoDB:-

 What is MongoDB?

 Where to Use?

 Configuration on Windows

 Inserting data into MongoDB

 Reading data from MongoDB

Cluster Setup:-

 Downloading and installing Ubuntu 12.x

 Installing Java

 Installing Hadoop

 Creating Cluster

 Increasing and Decreasing the Cluster Size

 Monitoring the Cluster Health

 Starting and Stopping the Nodes

Zookeeper:-

 Introduction to Zookeeper

 Data Model

 Operations

OOZIE:-

 Introduction to OOZIE

 Use of OOZIE

 Where to use?

Flume:-

 Introduction to Flume

 Uses of Flume

 Flume Architecture

o Flume Master

o Flume Collectors

o Flume Agents

Project Explanation with Architecture



