Neural Networks with Python
Tom Dierickx
Data Services Team Knowledge-Sharing
December 7, 2018
Today’s Agenda
● What is “learning” ?
● What is “machine learning”?
○ demo: decision tree (using scikit-learn and XGBoost inside PowerBI)
○ demo: logistic regression (using statsmodels w/ AWS SageMaker)
○ demo: neural network (using scikit-learn w/ Azure Notebooks)
● What is “deep learning”?
○ demo: neural network (using TensorFlow via Keras w/Google Colab)
● Where you can do it online today … for free!
● Resources and links
What is “Learning”?
● We typically think of learning in terms of the act, or subjective
experience, of becoming aware of some new fact (which usually has
a “feel” to it) or chunk of information (which generally has a larger “feel” to it)
● Note how this is a very personal, human-centered interpretation
as it’s implicitly defined in terms of our own “consciousness”
● In truth, our brains give themselves a juicy hit of dopamine as a
reward for each novel fact or piece of news acquired … and it’s
well-known that “emotional-to-us” things get packed away
deeper into our long-term memories and persist there longer
● … BUT, a more objective viewpoint might be to define “learning” in
terms of acquiring new skills that, hopefully, lead to increased
accuracy, efficiency, and speed of recall in some subject
area … after all, that’s why we humans evolved to learn, right?
● So, let’s define learning as “becoming more proficient” in
something (i.e. spotting mistakes more quickly, making fewer errors, connecting dots, etc.)
What is “Machine Learning”?
● Continuous improvement in accurately “predicting” output values
(called “labels”) from input values (called “features”)
● More specifically, automatically improving “on its own” against some
pre-defined metric (typically, a “cost function”) that compares
predicted values versus actual values across all observations [i.e. want a
minimal error rate (in classification problems) or minimal RMSE (in regression problems)]
● Various ML algorithms are used to “train” (i.e. learn rules) against some
“training data” until accuracy is thought sufficiently “good enough”
● The learned rules (i.e. usually in the form of weighted coefficients; albeit, sometimes
tens of thousands of them, or even more!) are then applied against some unseen
“test data” (i.e. usually just some data, like 20%, that was held out from training) to
validate accuracy on “new” data and hope it holds up (see the sketch after this list)
● Want minimal under-fitting (aka, “high bias”)
and minimal over-fitting (aka, “high variance”)
● Under-fitting can be improved with more
data, better features, or a better algorithm
● Over-fitting can be improved with simpler models
and/or adding regularization
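To make that train/validate workflow concrete, here’s a minimal scikit-learn sketch; it uses a built-in toy dataset and illustrative settings only (the demos later use real data):

```python
# Minimal sketch of the train / held-out-test workflow described above
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)

# hold out 20% as unseen "test data", as described above
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# C is an inverse regularization strength: smaller C = more regularization,
# which is one way to combat over-fitting
model = LogisticRegression(C=1.0, max_iter=10000)
model.fit(X_train, y_train)

# compare predicted vs. actual labels on data the model never trained on
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```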
What is “Machine Learning”? (Cont.)
Q: Why is ML such a “big deal” and so hyped today?
A: Because it’s transforming our world as we speak! Instead of
a programmer having to (somehow!) know all the conditional
logic rules needed upfront to produce the desired output,
internal rules that “just work” are “magically” inferred
Note: “learned rules” may not always be directly accessible or even interpretable
What is “Machine Learning”? (Cont.)
● Some popular tools of the trade (2018 edition)
○ Python (esp. Anaconda distro) is soaring (R is fading)
○ SQL (the “language”, in general) still essential
○ scikit-learn and TensorFlow (w/ Keras wrapper) very popular ML libraries
○ Apache Spark (big data backend) remains preferred over classic Hadoop
Example: Decision Tree
● Attempt to find “natural splits” in data (usually by minimizing “entropy”
and, thus, maximizing “information gain”; i.e., find the most homogeneous branches)
● Tend to overfit (thus, under-perform); but can improve with ensemble methods (see the sketch after this list):
○ Bagged trees: (i.e. multiple trees generated by random sampling; use aggregate “consensus”)
■ “Random forest” most common algorithm
○ Boosted trees: (i.e. incrementally build trees, each “learning” from the errors of the prior ones; “snowball”)
■ “XGBoost” most popular algorithm
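A hedged sketch contrasting a single tree with a bagged ensemble, on a toy dataset with illustrative parameters only (the boosted variant comes from the separate xgboost package):

```python
# Single decision tree vs. a bagged ensemble (random forest)
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

tree = DecisionTreeClassifier(random_state=0)                      # single tree: prone to over-fit
forest = RandomForestClassifier(n_estimators=100, random_state=0)  # bagged trees ("consensus")

print("single tree:  ", cross_val_score(tree, X, y, cv=5).mean())
print("random forest:", cross_val_score(forest, X, y, cv=5).mean())

# boosted trees come from the separate xgboost package, e.g.:
#   from xgboost import XGBClassifier
#   boost = XGBClassifier(n_estimators=100)
```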
Live Demo: Decision Tree
● PowerBI Desktop to:
○ fetch dataset on 80k+ UFO sightings from the interwebs via URL
○ serve as an interactive, GUI reporting container to slice & dice things
● Python scripts inside PowerBI to:
○ download historical lunar phase data from external website
○ combine everything using Pandas
○ predict most likely times for UFO sightings using:
■ scikit-learn module to build a “simple” decision tree
■ XGBoost module to gain even better results
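For flavor, here’s a standalone sketch of the kind of script the demo runs. Inside a PowerBI Python visual the selected fields actually arrive pre-loaded as a pandas DataFrame named `dataset`; the tiny frame and column names below are hypothetical stand-ins for the real UFO and lunar-phase fields:

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# hypothetical stand-in for the UFO + lunar-phase join (PowerBI would
# supply this as the DataFrame `dataset`)
df = pd.DataFrame({
    "hour":       [21, 22, 23, 2, 14, 20, 1, 13],                # hypothetical feature
    "moon_illum": [0.9, 0.8, 0.95, 0.1, 0.5, 0.7, 0.2, 0.4],     # hypothetical feature
    "sighting":   [1, 1, 1, 0, 0, 1, 0, 0],                      # hypothetical 0/1 label
})

clf = DecisionTreeClassifier(max_depth=3)   # a "simple" tree, as in the demo
clf.fit(df[["hour", "moon_illum"]], df["sighting"])

query = pd.DataFrame({"hour": [22], "moon_illum": [0.85]})
print(clf.predict(query))                   # predict for a given hour / moon phase
```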
Example: Logistic Regression
● Good for predicting binary output (i.e. 1/0, yes/no, true/false, win/loss, pass/fail, in/out)
● Models the “probability” [0 ≤ p ≤ 1] of a “Y/N” response, making it a natural fit for binary classification
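That probability mapping is the logistic (sigmoid) function; here it is in a few lines of Python, with made-up coefficients purely for illustration:

```python
# The logistic (sigmoid) function squashes any real-valued score
# into a probability 0 <= p <= 1 -- the heart of logistic regression
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# a linear score b0 + b1*x mapped to a probability
b0, b1, x = -1.5, 0.8, 3.0   # illustrative coefficients, not from a real fit
print(sigmoid(b0 + b1 * x))  # ~0.71
```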
Live Demo: Logistic Regression
● AWS SageMaker platform to:
○ Create jupyter notebook in the cloud
○ Look at NFL turnover +/- margin vs win/loss for week 13 games
○ Use statsmodels library to perform logistic regression
○ Use seaborn plotting library for creating nice visuals
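A hedged sketch of the statsmodels call from the demo; the turnover/win numbers below are made-up stand-ins, not the real week-13 data:

```python
import numpy as np
import statsmodels.api as sm

turnover_margin = np.array([-3, -2, -1, -1, 0, 0, 1, 1, 2, 3])   # made-up data
won_game        = np.array([ 0,  0,  0,  1, 0, 1, 0, 1, 1, 1])   # made-up data

X = sm.add_constant(turnover_margin)      # add an intercept column
model = sm.Logit(won_game, X).fit()       # fit the logistic regression
print(model.summary())
print(model.predict([1, 2]))              # P(win | margin = +2); leading 1 = intercept
```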
Example: Neural Network
● A neural network is similar to logistic regression in some ways, but:
○ Has a hidden layer in the middle, with multiple nodes, instead of going straight to a single output
○ These nodes (called “neurons”) in the middle each generate their own 0 ≤ value ≤ 1
○ Activation functions other than the “sigmoid” can be used to introduce non-linearity
○ Output layer can support multiple predicted output values (though not shown here)
● Technical notes:
○ Weights are learned by the gradient descent optimization algorithm (i.e. following partial derivatives of the cost function)
○ The gradients for those weight updates are computed layer-by-layer via backpropagation
○ Can take many iterations for the cost function to be minimized and converge (sketched below)
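Tying those notes to code: scikit-learn’s MLPClassifier trains exactly this way, backpropagation plus gradient-descent-style updates over many iterations. A minimal sketch on a toy dataset:

```python
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=500, noise=0.2, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(8,),   # one hidden layer, 8 neurons
                    activation="logistic",     # the sigmoid discussed above
                    max_iter=2000,             # allow many iterations to converge
                    random_state=0)
net.fit(X, y)
print("iterations used:", net.n_iter_)
print("training accuracy:", net.score(X, y))
```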
Live Demo: Neural Network
● Microsoft Azure Notebooks platform to:
○ Create jupyter notebook in the cloud
○ Look at tic-tac-toe board configurations (https://datahub.io/machine-learning/tic-tac-toe-endgame)
○ Use scikit-learn library to train a “simple” neural network to “learn” what combination
of moves equates to winning or losing for X
○ Validate a prediction by hand to show how the math works
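The “validate by hand” step boils down to one forward pass; here’s a sketch with a tiny 2-input, 2-hidden-neuron network and made-up weights:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x  = np.array([1.0, 0.0])        # hypothetical input features
W1 = np.array([[0.5, -0.3],      # hidden-layer weights (made up)
               [0.8,  0.2]])
b1 = np.array([0.1, -0.1])       # hidden-layer biases (made up)
W2 = np.array([0.7, -0.4])       # output-layer weights (made up)
b2 = 0.05

hidden = sigmoid(W1 @ x + b1)    # each hidden neuron outputs a value 0..1
output = sigmoid(W2 @ hidden + b2)  # final probability
print(output)
```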
What is “Deep Learning”?
● There really is no exact definition, but the term commonly refers to the subset
of machine learning focused on very “deep” neural networks
(i.e. many hidden layers and nodes)
● Some evolving variants, like recurrent neural networks (RNNs) for sequential
data (think: speech recognition) or convolutional neural networks (CNNs) for image
data (think: image recognition), perform clever, custom calculations and connect the
hidden layers together in slightly different ways to perform even better (i.e.
faster, more accurately, with fewer calculations needed) than a “classic” feed-forward
network (FFN)
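For a hedged glimpse of what “connect the hidden layers differently” looks like in practice, here are skeletal CNN and RNN definitions in Keras (shapes and layer sizes are arbitrary):

```python
from tensorflow.keras import layers, models

# CNN: convolution + pooling layers scan images for local patterns
cnn = models.Sequential([
    layers.Conv2D(16, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),            # shrink the feature maps
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])

# RNN: a recurrent layer carries state across the steps of a sequence
rnn = models.Sequential([
    layers.SimpleRNN(32, input_shape=(None, 8)),  # sequences of 8-dim steps
    layers.Dense(1, activation="sigmoid"),
])
```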
What is “Deep Learning”? (Cont.)
[Figure: GoogLeNet (Inception v1) — winner of ILSVRC 2014 (image classification)]
Current Landscape
● This is the best picture I have seen depicting how the various pieces of the
data and analytics landscape relate to each other (... and my “problem” is I find every
piece so interesting in its own right that I feel like I never know enough about any of them!!)
Source: https://www.machinecurve.com/index.php/2017/09/30/the-differences-between-artificial-intelligence-machine-learning-more/
Example: Deep Neural Network
● A neural network having multiple hidden layers and more nodes, so it can “learn”
more complex patterns … but it requires much more data to do so, of course
● Newer architectures even employ different types of hidden-layer nodes
● More complicated networks even stitch together multiple networks into one
larger network in a pipeline fashion
● It’s common to plug in pre-trained networks, especially in audio and/or vision
applications, so you don’t have to train from scratch; this is called transfer learning (sketched below)
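A hedged Keras sketch of that transfer-learning idea: reuse a network pre-trained on ImageNet as a frozen feature extractor (the new task and layer sizes are hypothetical):

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# pre-trained convolutional base; include_top=False drops its old classifier
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                       # freeze the pre-trained weights

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(64, activation="relu"),     # train only these new layers
    layers.Dense(1, activation="sigmoid"),   # e.g. a new binary task
])
```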
Live Demo: Deep Neural Network
● Google Colab platform to:
○ Create jupyter-based notebook in the cloud
○ Look at 60 years’ worth of daily weather data for Rockford, IL
(generated from https://www.ncdc.noaa.gov)
○ Upload raw file to google drive
○ Use Keras wrapper library on top of TensorFlow library to
train a “deep” neural network to “learn” for us if it will rain or snow for
upcoming Saturday given today’s weather is X
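A rough sketch of the shape of the Keras model behind this demo; the feature count, layer sizes, and random placeholder data below are hypothetical, not the real NOAA columns:

```python
import numpy as np
from tensorflow.keras import layers, models

# pretend: 5 weather features today -> rain/snow next Saturday (0/1)
X = np.random.rand(1000, 5)                  # placeholder data, not real NOAA
y = (np.random.rand(1000) > 0.5).astype(int)

model = models.Sequential([
    layers.Dense(32, activation="relu", input_shape=(5,)),  # hidden layer 1
    layers.Dense(16, activation="relu"),                    # hidden layer 2 ("deep")
    layers.Dense(1, activation="sigmoid"),                  # P(rain or snow)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
```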
Where you can do it online - for free!
There’s a bit of a “space race” to take over the world through AI and ML. With
cloud-based computing now ubiquitous and a commodity resource, typically
metered by the hour, there are lots of 100% free (for now, anyway) places to learn and
practice ML (generally) and neural networks (specifically) beyond just your own laptop:
● Google Colab
(Python; 20GB RAM, free GPU/TPU hardware)
● Kaggle
(Python or R; 17GB RAM; acquired by Google in 2017; compete for prizes!)
● Azure Notebooks
(Python or R or F#; 4GB RAM)
● Amazon SageMaker
(Python, R, Scala; 4GB RAM, access to AWS ecosystem, free tier = 250 hours limit)
● IBM Watson Studio
(Python, R, Scala; 4GB RAM, feature-rich options)
● Many more out there popping up every day...
Resources and links
● “The differences between AI, machine learning & more”
https://www.machinecurve.com/index.php/2017/09/30/the-differences-between-artificial-intelligence-machine-learning-more/
● “Introduction to Data Science”
https://www.saedsayad.com/data_mining_map.htm
● “Definitions of common machine learning terms”
https://ml-cheatsheet.readthedocs.io/en/latest/glossary.html
● “Decision Trees and Boosting, XGBoost | Two Minute Papers #55”
https://www.youtube.com/watch?v=0Xc9LIb_HTw
● “Logistic Regression - Fun and Easy Machine Learning”
https://www.youtube.com/watch?v=7qJ7GksOXoA
● “3Blue1Brown: Neural networks”
https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi
● “An introduction to Machine Learning (and a little bit of Deep Learning)”
https://www.slideshare.net/ThomasDaSilvaPaula/an-introduction-to-machine-learning-and-a-little-bit-of-deep-learning
● “Modern Convolutional Neural Network techniques for image segmentation”
https://www.slideshare.net/GioeleCiaparrone/modern-convolutional-neural-network-techniques-for-image-segmentation
● “Neural Networks and Deep Learning” free online course
https://www.coursera.org/learn/neural-networks-deep-learning
● “NUFORC geolocated and time standardized UFO reports”
https://github.com/planetsig/ufo-reports
