Apache Spark and Scala

This training course delivers the key concepts and expertise developers need to use Apache Spark to develop high-performance parallel applications. Participants will learn how to use Spark SQL to query structured data and Spark Streaming to perform real-time processing on streaming data from a variety of sources. Developers will also practice writing applications that use core Spark to perform ETL processing and iterative algorithms.

About the course

The course covers how to work with “big data” stored in a distributed file system and how to execute Spark applications on a Hadoop cluster. After taking this course, participants will be prepared to face real-world challenges and build applications that support faster decisions, better decisions, and interactive analysis across a wide variety of use cases, architectures, and industries.

Course Contents

Introduction to Apache Hadoop and the Hadoop Ecosystem
  • Apache Hadoop Overview
  • Data Processing
  • Introduction to the Hands-On Exercises
Apache Hadoop File Storage
  • Apache Hadoop Cluster Components
  • HDFS Architecture
  • Using HDFS
Distributed Processing on an Apache Hadoop Cluster
  • YARN Architecture
  • Working With YARN
Apache Spark Basics
  • What is Apache Spark?
  • Starting the Spark Shell
  • Using the Spark Shell
  • Getting Started with Datasets and DataFrames
  • DataFrame Operations

Course Prerequisites

This course is designed for developers and engineers who have programming experience; prior knowledge of Spark and Hadoop is not required. Apache Spark examples and hands-on exercises are presented in Scala and Python, and the ability to program in one of those languages is required. Basic familiarity with the Linux command line is assumed. Basic knowledge of SQL is helpful.

Number of Hours: 40

Key Features

  • One to One Training
  • Online Training
  • Fast Track & Normal Track
  • Resume Modification
  • Mock Interviews
  • Video Tutorials
  • Materials
  • Real Time Projects
  • Virtual Live Experience
  • Preparing for Certification

FAQs

TechyEdz in BTM Layout 2nd Stage offers long-term courses, short-term courses, and certification courses. The long-term programs offer comprehensive learning in subjects such as Web Development, Digital Marketing, Computer Application and Programming, Information Technology, and Data Science. The short-term courses cover topics such as Cloud, RPA, Big Data, Microsoft, VMware, and Oracle. The center is open all week from 07:00 am to 09:00 pm. Payment is accepted by cash, debit card, credit card, and online payment.

TechyEdz Software specializes in developing a customized suite of HR consulting solutions based on the operational models of our clients, with a special focus on small and large enterprises. We bring practical, results-driven HR practice to our clients' businesses to help them retain people and improve business productivity and employee performance.
