Data Engineering with Scala and Spark

Build streaming and batch pipelines that process massive amounts of data using Scala

Created by Eric Tome, David Radford, and Rupam Bhattacharjee

Explore how to build efficient data pipelines using Scala and Spark, focusing on both streaming and batch processing. Learn to transform raw data into reliable, high-quality information that supports your organization's needs. Gain practical experience with modern cloud data architectures and best practices for scalable deployments.

Packt | Jan 2024 | 300 min

Level

Intermediate

What You Will Learn

You will start by setting up a development environment and gradually move into hands-on projects that use Scala and Spark for real-world data processing. Through practical exercises, you will learn to implement, test, and optimize pipelines while applying best practices for cloud deployments and software engineering.

Key Features

Build robust streaming and batch data pipelines using Scala and Spark
Apply test-driven development and CI/CD to automate and orchestrate workflows
Profile, clean, and optimize data for reliable and high-performance delivery

Target Audience

Ideal for data engineers with some experience who want to deepen their skills in building scalable pipelines using Scala and Spark. If you're looking to turn raw data into trusted, actionable insights and want to leverage modern cloud tools and automation, this course will help you reach your goals.
