Mathematics of Machine Learning

You're reading from Mathematics of Machine Learning: Master linear algebra, calculus, and probability for machine learning

Product type: Paperback
Published in: May 2025
Publisher: Packt
ISBN-13: 9781837027873
Length: 730 pages
Edition: 1st Edition
Author: Tivadar Danka
Table of Contents (36 chapters)

Introduction
Part 1: Linear Algebra
1. Vectors and Vector Spaces
2. The Geometric Structure of Vector Spaces
3. Linear Algebra in Practice
4. Linear Transformations
5. Matrices and Equations
6. Eigenvalues and Eigenvectors
7. Matrix Factorizations
8. Matrices and Graphs
References
Part 2: Calculus
9. Functions
10. Numbers, Sequences, and Series
11. Topology, Limits, and Continuity
12. Differentiation
13. Optimization
14. Integration
References
Part 3: Multivariable Calculus
15. Multivariable Functions
16. Derivatives and Gradients
17. Optimization in Multiple Variables
References
Part 4: Probability Theory
18. What is Probability?
19. Random Variables and Distributions
20. The Expected Value
References
Part 5: Appendix
Appendix A: It’s Just Logic
Appendix B: The Structure of Mathematics
Appendix C: Basics of Set Theory
Appendix D: Complex Numbers
Other Books You May Enjoy
Index

What this book covers

Chapter 1, Vectors and vector spaces, covers what vectors are and how to work with them. We’ll travel from concrete examples through precise mathematical definitions to implementations, understanding vector spaces and NumPy arrays, which are used to represent vectors efficiently. Besides the fundamentals, we’ll learn
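
As a first taste of the NumPy representation (a minimal sketch of my own, not code from the book), vectors are one-dimensional arrays that support the vector space operations directly:

```python
import numpy as np

# Vectors as one-dimensional NumPy arrays
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# Vector space operations: addition and scalar multiplication
print(x + y)           # [5. 7. 9.]
print(2.5 * x)         # [2.5 5.  7.5]

# Linear combinations work the same way
print(2 * x - 3 * y)   # [-10. -11. -12.]
```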

Chapter 2, The geometric structure of vector spaces, moves forward by studying the concepts of norms, distances, inner products, angles, and orthogonality, enhancing the algebraic definition of vector spaces with some much-needed geometric structure. These are not just tools for visualization; they play a crucial role in machine learning. We’ll also encounter our first algorithm, the Gram-Schmidt orthogonalization method, which turns any set of linearly independent vectors into an orthonormal basis.
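
To make the geometric notions concrete, here is a small sketch (my own example, not taken from the book) computing a norm, an inner product, and the angle between two vectors with NumPy:

```python
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])

# Euclidean norm and inner product
norm_x = np.linalg.norm(x)
dot_xy = np.dot(x, y)

# Angle from the inner product: cos(theta) = <x, y> / (||x|| ||y||)
cos_theta = dot_xy / (np.linalg.norm(x) * np.linalg.norm(y))
print(np.degrees(np.arccos(cos_theta)))   # approximately 45 degrees
```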

In Chapter 3, Linear algebra in practice, we break out NumPy once more and implement everything that we’ve learned so far. Here, we learn how to work with high-performance NumPy arrays in practice: operations, broadcasting, and functions, culminating in a from-scratch implementation of the Gram-Schmidt algorithm. This is also the first time we encounter matrices, the workhorses of linear algebra.
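
The book builds its own version step by step; as a rough preview of what a from-scratch Gram-Schmidt might look like, here is a minimal, independent sketch:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a sequence of linearly independent vectors."""
    basis = []
    for v in vectors:
        # Subtract the projections onto the already-built orthonormal vectors
        w = v - sum(np.dot(v, b) * b for b in basis)
        basis.append(w / np.linalg.norm(w))
    return np.array(basis)

vectors = np.array([[1.0, 1.0, 0.0],
                    [1.0, 0.0, 1.0],
                    [0.0, 1.0, 1.0]])
Q = gram_schmidt(vectors)
print(np.round(Q @ Q.T, 8))   # identity matrix: the rows are orthonormal
```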

Chapter 4, Linear transformations, is about the true nature of matrices; that is, structure-preserving transformations between vector spaces. This way, seemingly arcane things – such as the definition of matrix multiplication – suddenly make sense. Once more, we take the leap from algebraic structures to geometric ones, allowing us to study matrices as transformations that distort their underlying space. We’ll also look at one of the most important descriptors of matrices: the determinant, which describes how the underlying linear transformation affects the volume of the space.
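
As a quick illustration of this geometric view (my own toy example, not from the book), a 2×2 matrix acts on vectors by multiplication, and its determinant measures how areas are scaled:

```python
import numpy as np

# A 2x2 matrix is a linear transformation of the plane
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# Applying the transformation is matrix-vector multiplication
v = np.array([1.0, 1.0])
print(A @ v)              # [3. 3.]

# The determinant measures how the transformation scales areas:
# the unit square is mapped to a parallelogram of area |det(A)| = 6
print(np.linalg.det(A))   # 6.0 (up to floating-point error)
```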

Chapter 5, Matrices and equations, presents the third (and for us, the final) face of matrices: systems of linear equations. In this chapter, we first learn how to solve systems of linear equations by hand using Gaussian elimination, then supercharge it with our newfound knowledge of linear algebra, obtaining the mighty LU decomposition. With the help of the LU decomposition, we go hard and achieve a roughly 70,000× speedup on computing determinants.
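
To hint at why LU helps, here is a hedged sketch using SciPy's ready-made scipy.linalg.lu rather than the book's from-scratch routine: since L has a unit diagonal, the determinant can be read off the diagonal of U, up to the sign of the permutation.

```python
import numpy as np
from scipy.linalg import lu

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

# SciPy's LU factorization: A = P @ L @ U, with L unit lower triangular
P, L, U = lu(A)

# det(A) = det(P) * det(L) * det(U); det(L) = 1 and det(U) is the
# product of its diagonal entries
sign = np.linalg.det(P)              # +1 or -1 for a permutation matrix
det_from_lu = sign * np.prod(np.diag(U))

print(np.isclose(det_from_lu, np.linalg.det(A)))   # True
```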

Chapter 6 introduces two of the most important descriptors of matrices: eigenvalues and eigenvectors. Why do we need them?
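
As a preview (an assumed NumPy example, not the book's code), eigenpairs satisfy the defining equation A v = λ v:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Eigenvalues and eigenvectors of A (the order is not guaranteed)
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)    # 3 and 1 for this matrix

# Verify the defining equation A @ v = lambda * v for the first eigenpair
v = eigenvectors[:, 0]
print(np.allclose(A @ v, eigenvalues[0] * v))   # True
```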

Because in Chapter 7, Matrix factorizations, we are able to reach the pinnacle of linear algebra with their help. First, we show that real and symmetric matrices can be written in diagonal form by constructing a basis from their eigenvectors, a result known as the spectral decomposition theorem. In turn, a clever application of the spectral decomposition leads to the singular value decomposition, the single most important result of linear algebra.
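
Both decompositions are available in NumPy, which makes it easy to check the statements numerically; the following sketch is my own illustration, not the book's code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Spectral decomposition of a real symmetric matrix: A = Q @ diag(w) @ Q.T
S = rng.standard_normal((4, 4))
A = (S + S.T) / 2                      # symmetrize
w, Q = np.linalg.eigh(A)               # eigenvalues w, orthonormal eigenvectors Q
print(np.allclose(A, Q @ np.diag(w) @ Q.T))   # True

# Singular value decomposition of an arbitrary matrix: B = U @ diag(s) @ Vt
B = rng.standard_normal((3, 5))
U, s, Vt = np.linalg.svd(B, full_matrices=False)
print(np.allclose(B, U @ np.diag(s) @ Vt))    # True
```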

Chapter 8, Matrices and graphs, closes the linear algebra part of the book by studying the fruitful connection between linear algebra and graph theory. By representing matrices as graphs, we are able to prove deep results such as the Frobenius normal form, and even talk about the eigenvalues and eigenvectors of graphs.
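
As a small, assumed illustration of the dictionary between graphs and matrices, here is the adjacency matrix of a 4-cycle, its eigenvalues, and walk counting via matrix powers:

```python
import numpy as np

# Adjacency matrix of an undirected 4-cycle: vertex i is joined to i±1 mod 4
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)

# The "eigenvalues of the graph" are the eigenvalues of its adjacency matrix
print(np.round(np.linalg.eigvalsh(A), 8))   # approximately [-2, 0, 0, 2]

# Powers of the adjacency matrix count walks: (A^2)[i, j] is the number of
# walks of length 2 from vertex i to vertex j
print(np.linalg.matrix_power(A.astype(int), 2))
```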

In Chapter 9, Functions, we take a detailed look at functions, a concept that we have used intuitively so far. This time, we make the intuition mathematically precise, learning that functions are essentially arrows between dots.

Chapter 10, Numbers, sequences, and series, continues down the rabbit hole, looking at the concept of numbers. Each step from the natural numbers towards the real numbers represents a conceptual jump, peaking at the study of sequences and series.

With Chapter 11, Topology, limits, and continuity, we are almost at the really interesting parts. However, in calculus, the objects, concepts, and tools are most often described in terms of limits and continuous functions. So, we take a detailed look at what they are.

Chapter 12 is about the single most important concept in calculus: Differentiation. In this chapter, we learn that the derivative of a function describes 1) the slope of the tangent line, and 2) the best local linear approximation to a function. From a practical side, we also look at how derivatives behave with respect to operations, most importantly the function composition, yielding the essential chain rule, the bread and butter of backpropagation.
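
A numerical sanity check of the chain rule (my own sketch; the `derivative` helper below is a hypothetical central-difference approximation, not a function from the book):

```python
import numpy as np

def derivative(f, x, h=1e-6):
    """Approximate f'(x) with a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)

f = np.sin                        # outer function
g = lambda x: x ** 2              # inner function
composite = lambda x: f(g(x))

x = 1.3
# Chain rule: (f o g)'(x) = f'(g(x)) * g'(x)
lhs = derivative(composite, x)
rhs = np.cos(g(x)) * 2 * x
print(np.isclose(lhs, rhs))       # True
```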

After all the setup, Chapter 13, Optimization, introduces the algorithm that is used to train virtually every neural network: gradient descent. For that, we learn how the derivative describes the monotonicity of functions and how local extrema can be characterized with the first- and second-order derivatives.
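
In its simplest single-variable form, the idea fits in a few lines; the sketch below is a toy version under my own assumptions, not the book's implementation:

```python
# Minimize f(x) = (x - 3)^2, whose derivative is f'(x) = 2 * (x - 3)

def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)   # step against the derivative
    return x

minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(minimum)   # close to 3.0, the true minimizer
```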

Chapter 14, Integration, wraps up our study of univariate functions. Intuitively speaking, integration describes the (signed) area under a function’s graph, but upon closer inspection, it also turns out to be the inverse of differentiation. In machine learning (and throughout all of mathematics, really), integrals describe various probabilities, expected values, and other essential quantities.
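
As an assumed illustration of the "area under the graph" picture, a simple left Riemann sum already approximates the integral well:

```python
import numpy as np

def riemann_sum(f, a, b, n=100_000):
    """Approximate the signed area under f over [a, b] with left rectangles."""
    x = np.linspace(a, b, n, endpoint=False)   # left endpoints
    dx = (b - a) / n
    return np.sum(f(x)) * dx

# By the fundamental theorem of calculus, the integral of cos over
# [0, pi/2] equals sin(pi/2) - sin(0) = 1
print(riemann_sum(np.cos, 0.0, np.pi / 2))   # approximately 1.0
```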

Now that we understand how calculus is done in single variables, Chapter 15 leads us to the world of Multivariable functions, where machine learning is done. There, we have an entire zoo of functions: scalar-vector, vector-scalar, and vector-vector ones.

In Chapter 16, Derivatives and gradients, we continue our journey, overcoming the difficulties of generalizing differentiation to multivariable functions. Here, we have three kinds of derivatives: partial, total, and directional, resulting in the gradient vector and the Jacobian and Hessian matrices.
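
For a taste, here is a hypothetical helper (my own, not from the book) that approximates the gradient of a scalar-vector function from its partial derivatives:

```python
import numpy as np

# f: R^2 -> R, a scalar-vector function with gradient (2x, 2y)
def f(v):
    return v[0] ** 2 + v[1] ** 2

def numerical_gradient(f, v, h=1e-6):
    """Approximate the gradient via central differences, one coordinate at a time."""
    grad = np.zeros_like(v)
    for i in range(len(v)):
        e = np.zeros_like(v)
        e[i] = h
        grad[i] = (f(v + e) - f(v - e)) / (2 * h)
    return grad

v = np.array([1.0, 2.0])
print(numerical_gradient(f, v))   # approximately [2. 4.]
```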

As expected, optimization is also slightly more complicated in multiple variables. This issue is cleared up by Chapter 17, Optimization in multiple variables, where we learn the analogue of the univariate second-derivative test, and implement the almighty gradient descent in its final form, concluding our study of calculus.
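
A minimal two-variable version of gradient descent might look like the following sketch (my own toy example, not the book's final implementation):

```python
import numpy as np

# Minimize f(x, y) = (x - 1)^2 + (y + 2)^2, whose gradient is
# (2(x - 1), 2(y + 2)); the unique minimizer is (1, -2)
def gradient(v):
    return np.array([2 * (v[0] - 1), 2 * (v[1] + 2)])

v = np.zeros(2)           # starting point
learning_rate = 0.1
for _ in range(200):
    v -= learning_rate * gradient(v)   # step against the gradient

print(np.round(v, 6))     # [ 1. -2.]
```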

Now that we have a mechanistic understanding of machine learning, Chapter 18, What is probability? shows us how to reason and model under uncertainty. In mathematical terms, probability spaces are defined by the Kolmogorov axioms, and we’ll also learn the tools that allow us to work with probabilistic models.
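
One practical way to get a feel for probabilities is simulation; the following Monte Carlo sketch (my own, not from the book) estimates the chance that two fair dice sum to 7:

```python
import numpy as np

rng = np.random.default_rng(0)

# The exact probability is 6/36 = 1/6
n = 1_000_000
dice = rng.integers(1, 7, size=(n, 2))        # n rolls of two fair dice
estimate = np.mean(dice.sum(axis=1) == 7)
print(estimate)                               # close to 0.1667
```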

Chapter 19 introduces Random variables and distributions, allowing us not only to bring the tools of calculus into probability theory, but also to compact probabilistic models into sequences or functions.
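
As an assumed illustration, we can sample a random variable in NumPy and compare an empirical probability against what its distribution predicts:

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw samples of a standard normal random variable
samples = rng.normal(loc=0.0, scale=1.0, size=100_000)

# Empirical probability of landing in [-1, 1] vs. the ~0.6827 predicted
# by the standard normal distribution
print(np.mean(np.abs(samples) <= 1.0))   # close to 0.6827
```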

Finally, in Chapter 20, we learn the concept of The expected value, quantifying probabilistic models and distributions with averages, variances, covariances, and entropy.
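
A small assumed example estimating these quantities by simulation and computing the entropy of a uniform distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

# Monte Carlo estimates for a fair die: the exact values are
# E[X] = 3.5 and Var[X] = 35/12 ≈ 2.9167
rolls = rng.integers(1, 7, size=1_000_000)
print(rolls.mean(), rolls.var())

# Entropy of the uniform die distribution: -sum(p * log2(p)) = log2(6)
p = np.full(6, 1 / 6)
print(-(p * np.log2(p)).sum())   # approximately 2.585 bits
```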
