Machine Learning in 5 Minutes— Classification

classification edition
Machine Learning in 5 Minutes
Brian Lange

popular examples
- spam filters
- the Sorting Hat

things to know
- you need data labeled with the correct answers to “train” these algorithms before they work
- feature = dimension = attribute of the data
- class = category = Harry Potter house

linear discriminants
“draw a line through it”

define what “shitty” means
e.g. count the points that land on the wrong side of the line
- one candidate line: 6 wrong
- another candidate line: 4 wrong

a map of shittiness
to find the least shitty line
[plot: shittiness as a function of slope and intercept]
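To make the "map" concrete, here is a minimal sketch that brute-forces it: score every (slope, intercept) pair on a coarse grid by how many points it gets wrong, then pick the lowest score. The data points, the grid ranges, and the names `wrong_count` and `error_map` are all made up for illustration, and real linear discriminants usually minimize a smoother measure than a raw wrong-count.

```python
# Toy 2D points as (x, y, label); all values are made up for illustration.
points = [(1.0, 1.0, 0), (2.0, 1.5, 0), (1.5, 3.0, 0),
          (4.0, 5.5, 1), (5.0, 6.0, 1), (3.5, 6.5, 1)]

def wrong_count(slope, intercept):
    """Count points on the wrong side of y = slope * x + intercept.

    Assumption: class 1 should sit above the line, class 0 on or below it.
    """
    wrong = 0
    for x, y, label in points:
        predicted = 1 if y > slope * x + intercept else 0
        if predicted != label:
            wrong += 1
    return wrong

# The "map": one wrong-count per (slope, intercept) pair on a coarse grid.
slopes = [s / 2 for s in range(-4, 5)]       # -2.0 to 2.0 in steps of 0.5
intercepts = [i / 2 for i in range(0, 17)]   # 0.0 to 8.0 in steps of 0.5
error_map = {(s, b): wrong_count(s, b) for s in slopes for b in intercepts}

# The least bad line is the lowest spot on the map.
best_slope, best_intercept = min(error_map, key=error_map.get)
best_errors = error_map[(best_slope, best_intercept)]
```

A grid like this only works because there are two knobs to turn; with more features the map has too many dimensions to search exhaustively.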

linear discriminants:
probably don’t use these

logistic regression
“divide it with a logistic (S-curve) function”

🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
+ gives you probabilities
+ the model is a formula
+ can “threshold” to make model more or less conservative
💩💩💩💩💩💩💩💩💩💩💩
- only works with linear decision boundaries
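The three plus points fit in a few lines. This is a sketch with hypothetical, hand-picked coefficients (`INTERCEPT`, `WEIGHT_LINKS`) and a made-up feature (number of links in an email); a real model would learn the coefficients from labeled data.

```python
import math

# Hypothetical "fitted" coefficients, chosen by hand for illustration.
INTERCEPT = -4.0
WEIGHT_LINKS = 1.5   # weight for the made-up feature "number of links"

def spam_probability(num_links):
    """The whole model is this formula: sigmoid(intercept + weight * feature)."""
    z = INTERCEPT + WEIGHT_LINKS * num_links
    return 1.0 / (1.0 + math.exp(-z))

def is_spam(num_links, threshold=0.5):
    """Raise the threshold to flag less often, lower it to flag more often."""
    return spam_probability(num_links) >= threshold

p = spam_probability(3)                     # z = 0.5, so p is a bit above 0.6
default_call = is_spam(3)                   # flagged at the default threshold
cautious_call = is_spam(3, threshold=0.9)   # not flagged at a stricter one
```

The same email flips from spam to not-spam just by moving the threshold, with no retraining.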

SVMs (support vector machines)
“*advanced* draw a line through it”
- better definition of “shitty”
- lines can turn into non-linear shapes if you transform your data

🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
+ works well on a lot of different shapes of data thanks to the kernel trick
- not super easy to explain to people
- can only kinda do probabilities
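The "transform your data" idea can be shown without an SVM at all. The sketch below is not an SVM and does not use the kernel trick; it applies a hand-picked transform (squaring) explicitly to made-up 1D data, which is roughly what a kernel lets an SVM do implicitly. The helper `separable_by_threshold` is a hypothetical name.

```python
# 1D toy data (made up): class 1 sits on both sides of class 0,
# so no single cut point on x separates the two classes.
xs     = [-3.0, -2.5, -0.5, 0.0, 0.5, 2.5, 3.0]
labels = [1,     1,    0,   0,   0,   1,   1]

def separable_by_threshold(values, labels):
    """True if one cut point puts all of one class below it and the other above."""
    zeros = [v for v, lab in zip(values, labels) if lab == 0]
    ones = [v for v, lab in zip(values, labels) if lab == 1]
    return max(zeros) < min(ones) or max(ones) < min(zeros)

# Transform the data: replace x with x squared (hand-picked, not learned).
squared = [x * x for x in xs]

before = separable_by_threshold(xs, labels)        # raw x: no clean cut
after = separable_by_threshold(squared, labels)    # x squared: clean cut
```

A straight cut in the transformed space corresponds to a non-linear boundary (two cut points) back in the original space.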

KNN (k-nearest neighbors)
“what do similar cases look like?”

[figures: k=1, k=2, k=3]

🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
+ no training, adding new data is easy
+ you get to define “distance”
💩💩💩💩💩💩💩💩💩💩💩
- can be outlier-sensitive
- you have to define “distance”
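A minimal sketch of the idea, assuming made-up 2D points and two common distance choices. The names `training`, `knn_predict`, `euclidean`, and `manhattan` are illustrative, not from any library.

```python
import math
from collections import Counter

# Made-up labeled points: (features, house).
training = [((1.0, 1.0), "Gryffindor"), ((1.5, 2.0), "Gryffindor"),
            ((2.0, 1.0), "Gryffindor"), ((6.0, 6.0), "Slytherin"),
            ((7.0, 5.5), "Slytherin"), ((6.5, 7.0), "Slytherin")]

def euclidean(a, b):
    """Straight-line distance."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def manhattan(a, b):
    """City-block distance: sum of per-feature differences."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

def knn_predict(query, k=3, distance=euclidean):
    """Let the k stored points closest to `query` vote on its class."""
    nearest = sorted(training, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

near_lions = knn_predict((1.2, 1.4), k=3)
near_snakes = knn_predict((6.4, 6.1), k=3, distance=manhattan)

# "No training": new labeled data is simply appended to the stored points.
training.append(((4.0, 4.0), "Hufflepuff"))
new_case = knn_predict((4.1, 3.9), k=1)
```

Swapping `distance` changes what "similar" means, which is both the freedom and the burden listed above.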

decision tree learners
“make a flow chart of it”

[figure: a flow chart grown one split at a time: “x < 3?”, then “y < 4?”, then “x < 5?”, each with yes/no branches]

🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
+ fit all kinds of arbitrary shapes
+ output is a clear set of conditionals
💩💩💩💩💩💩💩💩💩💩💩
- extremely prone to overfitting
- have to rebuild when you get new data
- no good probability estimates (at best, class fractions in a leaf)
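"A clear set of conditionals" can be taken literally. The sketch below writes a tree by hand using the three splits named in the deck; how the splits nest and which class each leaf returns are assumptions, and the labels "A" and "B" are placeholders. A real learner would pick the splits from labeled data.

```python
def classify(x, y):
    """A hand-written tree: each split is one if statement.

    The nesting order and the leaf labels are hypothetical.
    """
    if x < 3:
        return "A"
    if y < 4:
        return "B"
    if x < 5:
        return "A"
    return "B"

left_region = classify(1, 10)    # x < 3, so the first split decides
low_region = classify(4, 1)      # x >= 3 and y < 4
middle_region = classify(4, 6)   # x >= 3, y >= 4, x < 5
far_region = classify(6, 6)      # fails every test, falls to the last leaf
```

Each extra split carves out another rectangle, which is how trees fit arbitrary shapes and also how they end up overfitting.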

ensemble models
make a bunch of models and combine them

🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
+ don’t overfit as much as their component parts
+ generally don’t require much parameter tweaking
+ if data doesn’t change very often, you can make them semi-online by just adding new trees
+ can provide probabilities
💩💩💩💩💩💩💩💩💩💩💩
- slower than their component parts (though if those are fast, it doesn’t matter)
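A minimal sketch of "make a bunch of models and combine them", assuming three deliberately weak, hand-written models that each look at one thing. The rules, cut points, and names (`stump_x`, `stump_y`, `stump_sum`, `ensemble_predict`) are hypothetical; real ensembles such as random forests train their component trees on data.

```python
from collections import Counter

# Three weak, hand-written "models"; the cut points are made up.
def stump_x(point):
    return "B" if point[0] > 3 else "A"

def stump_y(point):
    return "B" if point[1] > 4 else "A"

def stump_sum(point):
    return "B" if point[0] + point[1] > 8 else "A"

models = [stump_x, stump_y, stump_sum]

def ensemble_predict(point):
    """Majority vote; the winning vote share doubles as a rough probability."""
    votes = Counter(model(point) for model in models)
    label, count = votes.most_common(1)[0]
    return label, count / len(models)

label, share = ensemble_predict((5.0, 2.0))       # models disagree: 2 vs 1
sure_label, sure_share = ensemble_predict((5.0, 5.0))   # models agree
```

Because `models` is just a list, adding another component later is an append, at the cost of one more call per prediction.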
