Here is a Blunt Guide to Mathematically Rigorous Machine Learning


By Harsh Sikka, graduate student at Harvard University studying AI and biocomputation

I recently wrote a brief guide on the math required for Machine Learning. People liked it and asked me to write one on how to master ML at a mathematically rigorous, conceptual level. That is the focus of this guide: no bullshit, no easy routes, just real, fundamental understanding. I’ll be going through the later part of the curriculum myself.

A quick question to ask yourself: Why do I want to learn ML? The following material can be very difficult at times, and staying disciplined is often a matter of keeping your core motivation in mind. For example, I’m trying to validate a new brain-inspired theoretical neural network architecture, and to reason about it effectively, I need a deep intuition for current architectures and their underlying mathematics.

I won’t be going through the math portions again. You can check out my other article or this excellent post by YC on the topic. My advice: learn enough Linear Algebra, Stats, Probability, and Multivariate Calculus to feel good about yourself, and learn everything else as you need it.

1. Elements of Statistical Learning

Prioritize Chapters 1–4 and Chapters 7–8. These cover supervised learning, linear regression, classification, and model assessment and inference. It’s okay if you don’t understand it at first; absolutely nobody does. Keep reading it and learning whatever math you need until you get it. If you want, knock the whole book out. You won’t regret it.
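If you want a feel for what the linear regression material boils down to in the simplest case, here is a minimal sketch of least squares with one predictor, written in plain Python. The function names and the toy data are my own assumptions for illustration, not anything taken from the book, which works in matrix form with many predictors.

```python
# Ordinary least squares with a single predictor, written out by hand.
# The data below is made up purely for illustration.

def fit_simple_ols(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Closed-form solution: slope = S_xy / S_xx
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def residual_sum_of_squares(xs, ys, intercept, slope):
    """Sum of squared differences between observed and fitted values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 1 + 2x, no noise

intercept, slope = fit_simple_ols(xs, ys)
rss = residual_sum_of_squares(xs, ys, intercept, slope)
print(intercept, slope, rss)
```

Because the toy data has no noise, the fit recovers the generating line and the residual sum of squares is zero. Real data won’t be this kind, which is where the model assessment chapters come in.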

If Elements is really just too hard, you can start with An Introduction to Statistical Learning, by some of the same authors. The book sacrifices some mathematical explanation and focuses on a subset of the problems in Elements, but it is a good ramp up to the material. There is an excellent accompanying course provided by Stanford for free.

Both books focus on R, which is worth learning.

2. Stanford CS 229

Once you’ve finished Elements, you’re in a great position to take Stanford’s ML course, taught by Andrew Ng. Think of it as the mathematically rigorous version of his popular Coursera course. Going into this course, make sure to refresh your Multivariate Calculus and Linear Algebra skills, as well as some probability. They provide some handy refresher guides on the course site.

Do all the exercises and problem sets, and try doing the programming assignments in both R and Python. You’ll thank me later.

You can again opt for a slightly easier route in Andrew Ng’s Coursera course, which focuses more on implementation and less on the underlying theory and math. Even if you take CS 229, I would do all the programming assignments from the Coursera course as well. You don’t have to do them in Octave/Matlab; you can write R and Python versions. There are plenty of repos to compare against on GitHub.

3. Deep Learning Book

At this point, you’re starting to get formidable. You have a fundamental mathematical understanding of many popular, historic techniques in Machine Learning, and can choose to dive into any vertical you want. Of course, most people want to go into Deep Learning because of its significance in industry.

Go through the DL book. It will refresh you on a lot of math and also explain much of modern Deep Learning from the ground up. You can start messing around with implementations by spinning up a Linux box and doing cool shit with CNNs, RNNs and regular old feedforward neural networks. Use TensorFlow and PyTorch, and start to get a sense of how awesome these libraries are at abstracting away a lot of the complexity you learned.
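To make “the complexity you learned” a bit more concrete, here is a rough sketch of the forward pass of a tiny feedforward network in plain Python, with no libraries at all. The weights are hand-picked by me so that the network computes XOR; nothing here is trained, and the helper names (`relu`, `dense`, `forward`) are just illustrative choices.

```python
# Forward pass of a 2-2-1 feedforward network with a ReLU hidden layer.
# Weights are chosen by hand (not learned) so the network computes XOR.

def relu(values):
    """Elementwise rectified linear unit."""
    return [max(0.0, v) for v in values]


def dense(weights, biases, inputs):
    """Affine layer: one output per row of the weight matrix."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Hidden layer: h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1)
W1 = [[1.0, 1.0], [1.0, 1.0]]
b1 = [0.0, -1.0]
# Output layer: y = h1 - 2 * h2
W2 = [[1.0, -2.0]]
b2 = [0.0]


def forward(x):
    hidden = relu(dense(W1, b1, x))
    return dense(W2, b2, hidden)[0]


inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
outputs = [forward(x) for x in inputs]
print(outputs)  # [0.0, 1.0, 1.0, 0.0]
```

A framework does the same arithmetic, plus the part that actually matters in practice: computing gradients and updating the weights for you instead of making you pick them by hand.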

I’ve also heard the Deep Learning courses by Andrew Ng and co are worth it. They are not nearly as comprehensive as the textbook by Goodfellow et al., but seem to be a useful companion.

Read the source article at Medium.