LLM Design Patterns

You're reading from LLM Design Patterns: A Practical Guide to Building Robust and Efficient AI Systems

Product type: Paperback
Published: May 2025
Publisher: Packt
ISBN-13: 9781836207030
Length: 534 pages
Edition: 1st Edition
Author: Ken Huang

Table of Contents (38)

Preface
Part 1: Introduction and Data Preparation
  Chapter 1: Introduction to LLM Design Patterns
  Chapter 2: Data Cleaning for LLM Training
  Chapter 3: Data Augmentation
  Chapter 4: Handling Large Datasets for LLM Training
  Chapter 5: Data Versioning
  Chapter 6: Dataset Annotation and Labeling
Part 2: Training and Optimization of Large Language Models
  Chapter 7: Training Pipeline
  Chapter 8: Hyperparameter Tuning
  Chapter 9: Regularization
  Chapter 10: Checkpointing and Recovery
  Chapter 11: Fine-Tuning
  Chapter 12: Model Pruning
  Chapter 13: Quantization
Part 3: Evaluation and Interpretation of Large Language Models
  Chapter 14: Evaluation Metrics
  Chapter 15: Cross-Validation
  Chapter 16: Interpretability
  Chapter 17: Fairness and Bias Detection
  Chapter 18: Adversarial Robustness
  Chapter 19: Reinforcement Learning from Human Feedback
Part 4: Advanced Prompt Engineering Techniques
  Chapter 20: Chain-of-Thought Prompting
  Chapter 21: Tree-of-Thoughts Prompting
  Chapter 22: Reasoning and Acting
  Chapter 23: Reasoning WithOut Observation
  Chapter 24: Reflection Techniques
  Chapter 25: Automatic Multi-Step Reasoning and Tool Use
Part 5: Retrieval and Knowledge Integration in Large Language Models
  Chapter 26: Retrieval-Augmented Generation
  Chapter 27: Graph-Based RAG
  Chapter 28: Advanced RAG
  Chapter 29: Evaluating RAG Systems
  Chapter 30: Agentic Patterns
Index
Other Books You May Enjoy

Balancing pruning and model performance

Finding the right balance between pruning and model performance is critical. Aggressive pruning can cause significant performance degradation, while too little pruning may not yield meaningful efficiency gains. The key is to identify which parts of the model can be pruned with minimal impact on accuracy, which requires careful validation after each pruning step and close monitoring of key metrics: parameter reduction rate, inference speed gains, memory footprint reduction, changes in perplexity, and task-specific performance. Throughout the process, it's crucial to manage the accuracy-efficiency trade-off so that the pruned model retains acceptable performance despite having fewer parameters.
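
As one illustration of what such a validation step might look like, here is a minimal PyTorch sketch, assuming magnitude-based unstructured pruning of the Linear layers via torch.nn.utils.prune; the model interface, data loader, and next-token loss are illustrative placeholders, not code from this chapter:

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

def prune_linear_layers(model: nn.Module, amount: float = 0.3) -> None:
    """Apply L1-magnitude unstructured pruning to every Linear layer.

    The pruning mask stays attached, so pruned weights remain zero during
    any later fine-tuning; it can be made permanent later with prune.remove().
    """
    for module in model.modules():
        if isinstance(module, nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=amount)

def global_sparsity(model: nn.Module) -> float:
    """Fraction of Linear-layer weights that are exactly zero."""
    total, zeros = 0, 0
    for module in model.modules():
        if isinstance(module, nn.Linear):
            weight = module.weight  # effective (masked) weight
            total += weight.numel()
            zeros += (weight == 0).sum().item()
    return zeros / total

@torch.no_grad()
def perplexity(model: nn.Module, dataloader) -> float:
    """Average perplexity on a held-out set.

    Assumes the model maps a batch of token IDs (batch, seq_len) to logits
    of shape (batch, seq_len, vocab_size).
    """
    model.eval()
    loss_fn = nn.CrossEntropyLoss()
    losses = []
    for input_ids in dataloader:
        logits = model(input_ids)
        # Next-token prediction: predict position t+1 from position t.
        loss = loss_fn(
            logits[:, :-1, :].reshape(-1, logits.size(-1)),
            input_ids[:, 1:].reshape(-1),
        )
        losses.append(loss.item())
    return float(torch.exp(torch.tensor(losses).mean()))

In practice, pruning is typically applied in small increments, with sparsity, perplexity, and task-specific metrics re-measured after each step so the process can stop before quality drops below an acceptable threshold.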

A common strategy is to fine-tune the model after pruning to restore some of the lost performance. Fine-tuning lets the remaining weights adapt to the pruned structure and recover much of the model's original capability:

import torch.nn...
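
The chapter's own listing is truncated in this excerpt. As a rough sketch of the prune-then-fine-tune pattern, not the book's code, the following assumes the prune_linear_layers helper from the earlier sketch, a causal language model that returns logits directly, and a train_loader yielding batches of token IDs:

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

def finetune_after_pruning(model, train_loader, epochs=1, lr=1e-5, device="cpu"):
    """Short fine-tuning pass so the remaining weights can compensate
    for the capacity removed by pruning."""
    model.to(device)
    model.train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for input_ids in train_loader:
            input_ids = input_ids.to(device)
            logits = model(input_ids)
            loss = loss_fn(
                logits[:, :-1, :].reshape(-1, logits.size(-1)),
                input_ids[:, 1:].reshape(-1),
            )
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    # Fold the pruning masks into the weights once fine-tuning is done.
    for module in model.modules():
        if isinstance(module, nn.Linear) and prune.is_pruned(module):
            prune.remove(module, "weight")
    return model

# Illustrative usage:
# prune_linear_layers(model, amount=0.3)
# finetune_after_pruning(model, train_loader, epochs=1)

Because the pruning masks stay attached during fine-tuning, the zeroed weights remain zero while the surviving weights adapt; the masks are folded into the parameters only once fine-tuning is complete.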