Apache Spark Training in Chandigarh
WebtechLearning provides the best training in Hadoop and Spark in Chandigarh, Mohali and Panchkula: 100% practical training with live projects.
Apache Spark developer training
Course Syllabus of Apache Spark:
Hadoop Overview
- Lecture
- How HDFS reads and writes data
- YARN internal architecture
- HDFS internal architecture
- Hands-On
- HDFS Shell Commands
- Install Hadoop & Spark in Ubuntu
- Configure Hadoop/Spark environment in Eclipse
Hive Overview
- Lecture
- How Hive works
- Optimize Hive queries
- Using Sqoop
- Hands-On
- Process CSV, JSON data
- Bucketing and partitioning tables
- Import MySQL/Oracle data using Sqoop
Scala Basics
- Lecture
- Functional language
- Scala vs Java
- Hands-On
- Strings, Numbers
- List, Array, Map, Set
- Control Statements, collections
- Functions, methods
- Pattern matching
Spark Overview
- Lecture
- The power of Spark
- Spark Ecosystem
- Spark Components vs Hadoop
- Hands-On
- Installation & Eclipse configuration
- Programs in Command line Interface & Eclipse
- Process Local, HDFS files
RDD Fundamentals
- Lecture
- Purpose and Structure of RDDs
- Transformations, Actions, and DAG
- Key-Value Pair RDDs
- Hands-On
- Creating RDDs from Data Files
- Reshaping Data to Add Structure
- Interactive Queries Using RDDs
SparkSQL and DataFrames
- Lecture
- Spark SQL and DataFrame Uses
- DataFrame / SQL APIs
- Catalyst Query Optimization
- Hands-On
- Creating (CSV, JSON) DataFrames
- Querying with DataFrame API and SQL
- Caching and Re-using DataFrames
- Process Hive data in Spark
Spark Dataset API
- Lecture
- Power of the Dataset API in Spark 2.0
- Serialization concepts in Datasets
- Hands-On
- Creating Datasets
- Process CSV, JSON, XML, Text data
- Dataset operations
Spark Job Execution
- Lecture
- Jobs, Stages, and Tasks
- Partitions and Shuffles
- Broadcast Variables and Accumulators
- Job Performance
- Hands-On
- Visualizing DAG Execution
- Observing Task Scheduling
- Understanding Performance
- Measuring Memory Usage
- Using Shared Variables
Clustering Architecture
- Lecture
- Cluster Managers for Spark: Spark Standalone, YARN, and Mesos
- Understanding Spark on YARN
- What happens in the cluster when you submit a job
- Hands-On
- Tracking Jobs through the Cluster UI
- Understanding Deploy Modes
- Submit a sample job and monitor it
Spark Streaming
- Lecture
- Streaming Sources and Tasks
- DStream APIs and Stateful Streams
- Flink Introduction
- Kafka architecture
- Hands-On
- Creating DStreams from Sources
- Operating on DStream Data
- Viewing Streaming Jobs in the Web UI
- Sample Flink Streaming program
- Sample Kafka program
AWS with Spark
- Lecture
- AWS architecture
- Redshift, EMR and EC2 functionalities
- How to minimize AWS cost
- Hands-On
- Submit a sample JAR to an AWS cluster
- Create a cluster using EMR
- Read and write Redshift data
Advanced concepts in Spark
- Lecture
- Memory management in Spark
- How to optimize Spark Applications
- How to integrate Spark with other applications
- Hands-On
- Spark with Cassandra Integration
- Alluxio/Tachyon hands-on experience
Sample Spark Project
- Lecture
- End-to-end project overview
- Complicated problems in a project
- Common steps in any project
- Hands-On
- Implement Spark SQL Mini project
- Kafka, Cassandra, Spark Streaming project
- Pull Twitter data and analyse the data
Important notes:
- A task is assigned daily after training
- A solution to that task is provided after training
- Minimum 3 months online support & Job Assistance
- Training covers Spark 2.x and Spark 1.6.2, using the Scala language
- Excellent materials, including all major Spark and Scala books
- Guidance on getting Cloudera/MapR/Databricks Spark certification
WebtechLearning – Web Education Academy
SCO 54-55, 3rd Floor, Sector 34-A, Chandigarh
Mobile: 09878375376