Contact Us @ +1-210-503-7101

BIGDATA - HADOOP

Hadoop Training – Course Content

Overview:
Apache Hadoop is the open source data management software that helps organizations analyze huge volumes of structured and unstructured data, is a very hot topic across the tech industry. It can be quickly learn to take advantage of the MapReduce framework through technical sessions and hands on labs.
Training Objectives of Hadoop:
Hadoop Course will provide the basic concepts of MapReduce applications developed using Hadoop, including a close look at framework components, use of Hadoop for a variety of data analysis tasks, and numerous examples of Hadoop in action. This course will further examine related technologies such as Hive, Pig, and Apache Accumulo.
Target Students / Prerequisites:
Students must be belonging to IT Background and familiar with Concepts in Java and Linux.
Introduction , The Motivation for Hadoop

  • Problems with traditional large-scale systems
  • Requirements for a new approach
Hadoop Basic Concepts
  • An Overview of Hadoop
  • The Hadoop Distributed File System
  • Hands on Exercise
  • How MapReduce Works
  • Hands on Exercies
  • Anatomy of a Hadoop Cluster
  • Other Hadoop Ecosystem Components
Writing a MapReduce Program
  • Examining a Sample MapReduce Program
  • With several examples
  • Basic API Concepts
  • The Driver Code
  • The Mapper
  • The Reducer
  • Hadoop’s Streaming API
Delving Deeper Into The Hadoop API
  • More About ToolRunner
  • Testing with MRUnit
  • Reducing Intermediate Data With Combiners
  • The configure and close methods for Map/Reduce Setup and Teardown
  • Writing Partitioners for Better Load Balancing
  • Hands-On Exercise
  • Directly Accessing HDFS
  • Using the Distributed Cache
  • Hands-On Exercise
Performing several hadoopjobs
  • The configure and close Methods
  • Sequence Files
  • Record Reader
  • Record Writer
  • Role of Reporter
  • Output Collector
  • Processing video files and audio files
  • Processing image files
  • Processing XML files
  • Counters
  • Directly Accessing HDFS
  • ToolRunner
  • Using The Distributed Cache
Common MapReduce Algorithms
  • Sorting and Searching
  • Indexing
  • Classification/Machine Learning
  • Term Frequency – Inverse Document Frequency
  • Word Co-Occurrence
  • Hands-On Exercise: Creating an Inverted Index
  • Identity Mapper
  • Identity Reducer
  • Exploring well known problems using MapReduce applications
Usining HBase
  • What is HBase?
  • HBase API
  • Managing large data sets with HBase
  • Using HBase in Hadoop applications
  • Hands-on Exercise
Using Hive and Pig
  • Hive Basics
  • Pig Basics
  • Hands on Exercise
Practical Development Tips and Techniques
  • Debugging MapReduce Code
  • Using LocalJobRunner Mode for Easier Debugging
  • Retrieving Job Information with Countrers
  • Logging
  • Splittable File Formats
  • Determining the Optimal Number of Reducers
  • Map-Only MapReduce Jobs
  • Hands on Exercise
Debugging MapReduce Programs
  • Testing with MRUnit
  • Logging
  • Classification/Machine Learning
  • Advanced MapReduce Programming
  • A Recap of the MapReduce Flow
  • The Secondary Sort
  • CustomizedInputFormats and OutputFormats
  • Pipelining Jobs With Oozie
  • Map-Side Joins
  • Reduce-Side Joins
Joining Data Sets in MapReduce
  • Map-Side Joins
  • The Secondary Sort
  • Reduce-Side Joins
Monitoring and debugging on a Production Cluster
  • Counters
  • Skipping Bad Records
  • Rerunning failed tasks with Isolation Runner
Tuning for Performance in MapReduce
  • Reducing network traffic with combiner
  • Partitioners
  • Reducing the amount of input data
  • Using Compression
  • Reusing the JVM
  • Running with speculative execution
  • Refactoring code and rewriting algorithms Parameters affecting Performance
  • Other Performance Aspects

About Tutor

card image

US IT Training

This is basic card with image on top, title, description and button.This is basic card with image on top, title, description and button.This is basic card with image on top, title, description and button.This is basic card with image on top, title, description and button.

US IT Training

This is basic card with image on top, title, description and button.This is basic card with image on top, title, description and button.This is basic card with image on top, title, description and button.This is basic card with image on top, title, description and button.This is basic card with image on top, title, description and button.This is basic card with image on top, title, description and button.