WHY BIG DATA?
What is Big Data?
Hadoop Architecture & Components
Hadoop Storage & File Formats
HDFS
HDFS Basics
File Storage
Fault Tolerance
Hadoop Processing – MapReduce
HIVE
What is Hive
Modeling in Hive and data loading process
Concepts of Partitioning, Bucketing
Hive data storage formats (ORC, RC, and Parquet)
Introduction to HiveQL and examples
Hive as an ELT tool
Performance tuning in Hive
MAPREDUCE
What Is MapReduce?
Basic MapReduce Concepts
Concepts of Mappers, Reducers, Combiners, and Partitioning
Inputs and Outputs of an MR Program
BIG DATA USING R
MapReduce Programming using Java and R
Hadoop with R
NOSQL
NoSQL in Hadoop
HBASE
HBase – Introduction
HBase Data Model
HBase Families & Components
Data Storage and Distribution
HBase Master
PIG LATIN
Basics of Pig and Why Pig?
Grunt
Pig’s Data Model
Writing Evaluation, Filter, Load & Store Functions
Benefits of Pig over SQL language
Input and Output formats of an MR program
SQOOP
Sqoop Overview
Sqoop Exercises
OOZIE/FLUME/YARN
Oozie Overview
Oozie Workflows
[OPTIONAL]
CLOUDERA
Introduction to Cloudera Manager
Ambari Administration
SPARK/SCALA
What Is Spark? Basic Concepts
How Spark Differs from MapReduce
Working with Scala
Parallel Programming with Spark
Spark Streaming
HADOOP SECURITY
Security Overview
Knox Exercise
Access Control Labels