This course teaches the tools you need to access, manipulate, transform, and analyze complex data sets using SQL and familiar scripting languages.
It covers Pig, Sqoop, HDFS, Hive, Impala, and related tools in detail, so that you understand how to pull, transform, and analyze data.
Joining multiple data sets and analyzing disparate data with Pig.
Organizing data into tables, performing transformations, and simplifying complex queries with Hive.
Making multi-structured data accessible with Hive.
Storing data with HDFS.
Performing real-time, complex queries on datasets using Impala.
Knowledge of SQL is assumed, as is basic Linux command-line familiarity.
Knowledge of at least one scripting language (e.g., Bash scripting, Perl, Python, Ruby) would be helpful.
Prior knowledge of Apache Hadoop is not required.
Are you ready? Let’s get started!