Big Data Analytics

WHY BIG DATA?

What is Big Data?

Hadoop Architecture & Components

Hadoop Storage & File Formats

HDFS

HDFS Basics

File Storage

Fault Tolerance

Hadoop Processing – MapReduce

HIVE

What is Hive

Modeling in Hive and data loading process

Concepts of Partitioning, Bucketing

Hive data storage formats (ORC, RC, and Parquet)

Introduction to Hive QL and examples

Hive as an ELT tool

Performance tuning in Hive

MAPREDUCE

What Is MapReduce?

Basic MapReduce Concepts

Concepts of Mappers, Reducers

Combiners and Partitioning

Inputs and Outputs of an MR Program

BIG DATA USING R

MapReduce Programming using Java and R

Hadoop with R

NOSQL

NoSQL in Hadoop

HBASE

HBase – Introduction

HBase Data Model, HBase Master

HBase Families & Components

Data Storage and Distribution

PIG LATIN

Basics of Pig and Why Pig?

Grunt

Pig’s Data Model

Writing Evaluation, Filter, Load & Store Functions

Benefits of Pig over SQL language

Input and Output Formats for MR Programs

SQOOP

Sqoop Overview

Sqoop Exercises

OOZIE/FLUME/YARN

Oozie Overview

Oozie Workflows

[OPTIONAL]
CLOUDERA

Introduction to Cloudera Manager

Ambari Administration

SPARK/SCALA

What Is Spark? Basic Concepts

How Spark Differs from MapReduce

Working with Scala

Parallel Programming with Spark

Spark Streaming

HADOOP SECURITY

Security Overview

Knox Exercise

Access Control Labels

Contact

TVASHTAA DATA SOLUTIONS

3-5-1094/15/2/E, 3rd Floor,
Above Sri Suman Electricals,
Next to Narayanaguda Metro Station,
Opp. Blood Bank, Narayanaguda,
Hyderabad - 500029, Telangana, India.

PHONE: +917286900966, 04024760226.

E-MAIL: info@tvashtaa.com