In-Memory Analytics with Apache Arrow

Accelerate data analytics for efficient processing of flat and hierarchical data structures

Matthew Topol

Unlock faster data analytics by mastering Apache Arrow, a powerful in-memory data format designed for efficient processing of both flat and hierarchical data. Learn how to streamline data workflows, optimize performance, and integrate with popular analytical systems using Arrow's versatile libraries and tools.

Packt | Sep 2024 | 406 min

Level: Expert
Categories: Data Engineering, Data Warehousing and Big Data Processing Frameworks

What You Will Learn

You'll gain hands-on experience with real code examples in Python, C++, and Go while exploring how Apache Arrow works under the hood. Each topic is broken down with clear, practical explanations to help you understand the design choices and performance benefits. Along the way, you'll see how Arrow connects with other data formats and systems.

Key Features

  • Work seamlessly with Arrow's columnar format to boost analytics speed
  • Integrate Arrow with tools like pandas, Parquet, and Spark for efficient workflows
  • Accelerate machine learning pipelines using Arrow's APIs and subprojects

Target Audience

Ideal for developers, data engineers, and data scientists with some familiarity with data analysis. If you're looking to build efficient analytics utilities, speed up data pipelines, or work across different programming languages, you'll find practical guidance here. No deep expertise is required, just curiosity and a desire to improve your data processing skills.
