Train Large Language Models Faster - Parallelism Deep Dive

Accelerating Large Language Model Training with Parallelism Techniques

Created by Paulo Dichone

Explore the world of parallelism in large language model training and discover how to speed up your workflows using advanced techniques. Dive into practical tools like PyTorch, DeepSpeed, and Runpod.io to efficiently train models across multiple GPUs. Build scalable and resilient AI systems with hands-on experience in optimizing and managing LLM training.

Packt | Jun 2025 | 530 min

Level: Expert
Categories: LLM Engineering, Deep Learning Architectures and Frameworks, PyTorch, Python

What You Will Learn

You will learn by combining theory with hands-on practice, working directly with real datasets and modern tools. Step by step, you will implement parallelism techniques, set up multi-GPU environments, and tackle real challenges in distributed training. By practicing these skills, you'll gain confidence in optimizing and scaling large language model training.

Key Features

  • Apply data, model, and hybrid parallelism to train large models faster and more efficiently
  • Use PyTorch, DeepSpeed, and Runpod.io to implement distributed training on real datasets
  • Develop robust workflows with checkpointing and fault tolerance for scalable AI systems

Target Audience

Perfect for AI and machine learning engineers, researchers, and data scientists who want to accelerate large model training. If you already understand machine learning basics and Python and have some experience with GPUs and cloud computing, you'll find the content directly relevant. This is for those aiming to master parallelism and build high-performance AI systems.
