Big Data Hadoop Certification Training
- Training Type: Live Training
- Category: Data & Analytics
- Duration: 35 Hours
- Rating: 4.9/5
Big Data Hadoop Training Certification Course Introduction
About Big Data Hadoop Training Certification Course
Simpliv’s Big Data Hadoop Training provides in-depth coverage of Big Data and the Hadoop ecosystem, including tools such as HDFS, YARN, MapReduce, Hive, and Pig.
Big Data Hadoop Training Certification Course Objectives
- Hadoop basics and the Hadoop ecosystem
- Managing, monitoring, scheduling, and troubleshooting Hadoop clusters effectively
- Working with Apache Spark, Scala, and Storm for real-time data analytics
- Working with Hive, Pig, HDFS, MapReduce, Sqoop, ZooKeeper, and Flume
- Testing Hadoop clusters with MRUnit and other automation tools
- Successfully integrating various ETL tools with Hive, Pig, and MapReduce
Who Is the Target Audience for the Big Data Hadoop Course?
- Software Developers
- Project Managers
- ETL and Data Warehousing Professionals
- Software Architects
- Data Analysts & Business Intelligence Professionals
- DBAs
- Mainframe Professionals
- Data Engineers
- Senior IT Professionals
- Testing Professionals
- Graduates interested in the Big Data field
What Basic Knowledge Is Required for the Big Data Hadoop Training?
There are no specific prerequisites to learn Hadoop. Prior knowledge of Java and SQL is beneficial.
Big Data Hadoop Course Curriculum
- Introduction to Big Data & Big Data Challenges
- Limitations & Solutions of Big Data Architecture
- Hadoop & its Features
- Hadoop Ecosystem
- Hadoop 2.x Core Components
- Hadoop Storage: HDFS (Hadoop Distributed File System)
- Hadoop Processing: MapReduce Framework
- Different Hadoop Distributions
- Hadoop 2.x Cluster Architecture
- Federation and High Availability Architecture
- Typical Production Hadoop Cluster
- Hadoop Cluster Modes
- Common Hadoop Shell Commands
- Hadoop 2.x Configuration Files
- Single Node Cluster & Multi-Node Cluster Setup
- Basic Hadoop Administration
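The HDFS topics above are usually practiced with the shell commands, but the same operations can be driven from Java through the FileSystem API. The sketch below is a minimal illustration, not course material; the NameNode URI, directory, and file name are placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsTour {
    public static void main(String[] args) throws Exception {
        // Point the client at the cluster; the URI is a placeholder for your NameNode.
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        FileSystem fs = FileSystem.get(conf);

        // Create a directory and copy a local file into it.
        Path dir = new Path("/user/demo");
        fs.mkdirs(dir);
        fs.copyFromLocalFile(new Path("sample.txt"), new Path(dir, "sample.txt"));

        // List the directory contents, similar to `hdfs dfs -ls /user/demo`.
        for (FileStatus status : fs.listStatus(dir)) {
            System.out.println(status.getPath() + "  " + status.getLen() + " bytes");
        }
        fs.close();
    }
}
```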
- Traditional Way vs MapReduce Way
- Why MapReduce?
- YARN Components
- YARN Architecture
- YARN MapReduce Application Execution Flow
- YARN Workflow
- Anatomy of a MapReduce Program
- Input Splits and the Relation between Input Splits and HDFS Blocks
- MapReduce: Combiner & Partitioner
- Counters
- Distributed Cache
- MRUnit
- Reduce Join
- Custom Input Format
- Sequence Input Format
- XML File Parsing using MapReduce
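To make the mapper/combiner/reducer pipeline listed above concrete, here is a minimal word-count sketch against the Hadoop 2.x MapReduce Java API. The class names and the input/output paths (taken from the command line) are illustrative, not part of the course content.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: emit (word, 1) for every token in the input split.
    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }

    // Reducer (also usable as a combiner): sum the counts per word.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(SumReducer.class);   // combiner runs map-side to cut shuffle volume
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```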
- Introduction to Apache Pig
- MapReduce vs Pig
- Pig Components & Pig Execution
- Pig Data Types & Data Models in Pig
- Pig Latin Programs
- Shell and Utility Commands
- Pig UDF & Pig Streaming
- Testing Pig Scripts with PigUnit
- Aviation Use Case in Pig
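Pig Latin scripts are typically run from the Grunt shell, but they can also be embedded in Java through PigServer. The sketch below is a minimal, hedged example of that embedding; the file name, schema, and output directory are made-up placeholders standing in for the aviation use case data.

```java
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

public class PigEmbedded {
    public static void main(String[] args) throws Exception {
        // Local mode for experimentation; use ExecType.MAPREDUCE to run against a cluster.
        PigServer pig = new PigServer(ExecType.LOCAL);

        // Register Pig Latin statements; input file and schema are placeholders.
        pig.registerQuery("flights = LOAD 'passengers.csv' USING PigStorage(',') "
                + "AS (airline:chararray, passengers:int);");
        pig.registerQuery("by_airline = GROUP flights BY airline;");
        pig.registerQuery("totals = FOREACH by_airline GENERATE group, SUM(flights.passengers);");

        // Materialize the relation 'totals' to an output directory.
        pig.store("totals", "totals_out");
        pig.shutdown();
    }
}
```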
- Introduction to Apache Hive
- Hive vs Pig
- Hive Architecture and Components
- Hive Metastore
- Comparison with Traditional Databases
- Hive Data Types and Data Models
- Hive Partitions
- Hive Bucketing
- Hive Tables (Managed Tables and External Tables)
- Importing Data
- Querying Data & Managing Outputs
- Hive Scripts & Hive UDF
- Retail Use Case in Hive
- HiveQL: Joining Tables, Dynamic Partitioning
- Custom MapReduce Scripts
- Hive Indexes and Views
- Hive Query Optimizers
- Hive Thrift Server
- Hive UDF
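HiveQL statements such as the partitioned-table and join examples above can be run from the Hive CLI/Beeline or, via the Thrift-based HiveServer2, through JDBC. A minimal sketch follows, assuming HiveServer2 is listening on localhost:10000; the table name, schema, and credentials are placeholders for the retail use case.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcDemo {
    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC endpoint; host, port, and credentials are placeholders.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection con = DriverManager.getConnection(
                     "jdbc:hive2://localhost:10000/default", "hive", "");
             Statement stmt = con.createStatement()) {

            // Partitioned, managed table with an illustrative retail schema.
            stmt.execute("CREATE TABLE IF NOT EXISTS sales (item STRING, amount DOUBLE) "
                    + "PARTITIONED BY (sale_date STRING) "
                    + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','");

            // Query one partition and print the aggregated result.
            ResultSet rs = stmt.executeQuery(
                    "SELECT item, SUM(amount) FROM sales "
                    + "WHERE sale_date = '2024-01-01' GROUP BY item");
            while (rs.next()) {
                System.out.println(rs.getString(1) + " -> " + rs.getDouble(2));
            }
        }
    }
}
```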
- Apache HBase: Introduction to NoSQL Databases and HBase
- HBase vs RDBMS
- HBase Components
- HBase Architecture
- HBase Run Modes
- HBase Configuration
- HBase Cluster Deployment
- HBase Data Model
- HBase Shell
- HBase Client API
- Hive Data Loading Techniques
- Apache ZooKeeper Introduction
- ZooKeeper Data Model
- ZooKeeper Service
- HBase Bulk Loading
- Getting and Inserting Data
- HBase Filters
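Getting and inserting data through the HBase client API looks roughly like the sketch below. It assumes a running HBase (with its ZooKeeper quorum configured in hbase-site.xml) and an existing table 'customers' with a column family 'cf'; all table, row, and column names are placeholders.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseClientDemo {
    public static void main(String[] args) throws Exception {
        // Reads hbase-site.xml from the classpath; the ZooKeeper quorum comes from there.
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("customers"))) {

            // Insert one cell: row key "cust001", column cf:name.
            Put put = new Put(Bytes.toBytes("cust001"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes("Alice"));
            table.put(put);

            // Read the same row back with a Get.
            Result result = table.get(new Get(Bytes.toBytes("cust001")));
            byte[] name = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("name"));
            System.out.println("name = " + Bytes.toString(name));
        }
    }
}
```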
- What is Spark?
- Spark Ecosystem
- Spark Components
- What is Scala?
- Why Scala?
- SparkContext
- Spark RDD
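The SparkContext and RDD topics above are most often shown in Scala, but the same flow is available from Java. A minimal word-count sketch in local mode follows; the application name, master setting, and input path are placeholders.

```java
import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class SparkWordCount {
    public static void main(String[] args) {
        // Local mode for experimentation; on a cluster the master comes from spark-submit.
        SparkConf conf = new SparkConf().setAppName("SparkWordCount").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Input path is a placeholder (local file or hdfs:// URI).
            JavaRDD<String> lines = sc.textFile("input.txt");

            // Classic RDD pipeline: split lines, pair each word with 1, sum per word.
            JavaPairRDD<String, Integer> counts = lines
                    .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
                    .mapToPair(word -> new Tuple2<>(word, 1))
                    .reduceByKey(Integer::sum);

            counts.take(10).forEach(t -> System.out.println(t._1() + " : " + t._2()));
        }
    }
}
```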
- Oozie
- Oozie Components
- Oozie Workflow
- Scheduling Jobs with Oozie Scheduler
- Demo of Oozie Workflow
- Oozie Coordinator
- Oozie Commands
- Oozie Web Console
- Oozie for MapReduce
- Combining Flow of MapReduce Jobs
- Hive in Oozie
- Hadoop Talend Integration
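Oozie workflows are normally submitted from the oozie command line or the web console; the Java client exposes the same operations. A minimal, hedged sketch follows, assuming an Oozie server at http://localhost:11000/oozie and a workflow application already deployed to HDFS; all URLs, paths, and property values are placeholders.

```java
import java.util.Properties;
import org.apache.oozie.client.OozieClient;
import org.apache.oozie.client.WorkflowJob;

public class OozieSubmitDemo {
    public static void main(String[] args) throws Exception {
        // Oozie server URL is a placeholder.
        OozieClient oozie = new OozieClient("http://localhost:11000/oozie");

        // Job properties; the HDFS paths and ResourceManager address are placeholders.
        Properties props = oozie.createConfiguration();
        props.setProperty(OozieClient.APP_PATH, "hdfs://localhost:9000/user/demo/wordcount-wf");
        props.setProperty("nameNode", "hdfs://localhost:9000");
        props.setProperty("jobTracker", "localhost:8032");

        // Submit and start the workflow, then check its status once.
        String jobId = oozie.run(props);
        WorkflowJob job = oozie.getJobInfo(jobId);
        System.out.println("Workflow " + jobId + " is " + job.getStatus());
    }
}
```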