Data Science Salon: Deep Learning as a Product @ Scribd

Who is the talk for?
People interested in our experience using deep learning
Product side of deep learning
Not a technical talk on how to best utilize LSTM cells/activation functions/network architectures/GANs/etc

Motivation
Spelling correction to improve the user experience

Deep Learning as Product
Does the user even care?

What can Deep Learning even do?
Image Editing
Object Detection
Boundary Detection
Music Generation
Voice Generation
Translate Languages

This will be fun!
Style Transfer
- Product
- Visual
- Imperfect
- Visualize the errors

Classic Van Gogh Style Transfer Example
Source: http://ideate.xsead.cmu.edu/gallery/projects/art-transitions-style-transfers-by-neural-networks

https://www.theverge.com/2017/3/30/15124466/ai-photo-style-transfer-deep-neural-nets-adobe
Style Transfer

STYLE TRANSFER FTW - PRISMA
Base Foreground Transfer Heisenberg

Can it solve our problem?
If they can translate styles onto images...

Spelling correction
Spelling correction is solved, right? Only if you have enough data
Google is ok, we are not
They can take a probabilistic approach, good for them :P
We need to get fancier

Let’s get started
1. Read blog posts
2. Read books
3. Read tutorials
4. Watch videos
5. Deep Learn

Which one to pick?
We did a comprehensive framework bakeoff…
AND THE WINNER IS ?!?!
J/K
We picked Keras
- Abstraction layer (easier to get started)
- Familiarity with Python

Life as a Deep Learner
http://everythingbrilliant.co.uk/wp-content/uploads/2017/01/emotional-journey-of-creating-anything-new.jpeg

Architectures
RNN
CNN
Deep/Wide

Which architecture to use?
1. Read blog posts
2. Read books
3. Read tutorials
4. Watch Videos
5. Deep Learn

TL;DR
Certain problems require certain architectures
For example:
CNN -> Image parsing/Object Detection

Where are we?
What do we have? Keras (framework)
What do we need? Architecture/algorithm
Constraints? Avoid implementing our own from scratch

Can we even do this?
Let’s code some boilerplate for a basic neural network

Training
How long does model development take?
TBD -> 1-2 weeks
How long to let the model train?
7 days
Is more GPU moar better? Not yet...

So many parameters...
1. Read blog posts
2. Read books
3. Read tutorials
4. Watch videos
5. Deep Learn

Details (Hyperparameters)
Activation Units
Sigmoid, relu, prelu (yes, not even joking), lolu (actually joking)
Optimization Functions
Adam, sgd, etc, etc
Dropout Rate
Batch Size
Continues on forever...

Activation
[Figure: activation function plots for Sigmoid, ReLU, and Leaky ReLU/PReLU]
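
The three activations named on this slide have standard textbook definitions, sketched below in plain Python. The function names and the default slope alpha = 0.01 are illustrative assumptions, not values from the talk. PReLU uses the same formula as Leaky ReLU, except that the slope is learned during training instead of fixed.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the open interval (0, 1)."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relu(x: float) -> float:
    """Pass positive inputs through unchanged, clamp negatives to zero."""
    return max(0.0, x)


def leaky_relu(x: float, alpha: float = 0.01) -> float:
    """Like ReLU, but negative inputs keep a small slope `alpha`.

    alpha=0.01 is an assumed default; in PReLU this slope is a learned parameter.
    """
    return x if x > 0 else alpha * x


for value in (-2.0, 0.0, 3.0):
    print(value, sigmoid(value), relu(value), leaky_relu(value))
```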

Let’s find our algorithm
1. Read blog posts
2. Read books
3. Read tutorials
4. Watch Videos
5. Deep Learn

Sequence to Sequence mapping
TL;DR
Given a sequence of things, turn it into another sequence
Spelling Correction
‘Cande’ -> ‘candy’
English -> Spanish language translation
‘Hello’ -> ‘hola’
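
Before a seq2seq model can map one sequence to another, both sides have to be turned into sequences of integer ids. The talk does not show how this was done; the sketch below assumes a character-level vocabulary with hypothetical <pad>, <s> and </s> marker tokens, which is one common way to set it up.

```python
# Hypothetical marker tokens; the names are assumptions, not from the talk.
PAD, START, END = "<pad>", "<s>", "</s>"


def build_vocab(words):
    """Map the marker tokens and every character seen in `words` to an integer id."""
    chars = sorted({char for word in words for char in word})
    return {token: idx for idx, token in enumerate([PAD, START, END] + chars)}


def encode(word, vocab):
    """Turn a word into a list of ids wrapped in start/end markers."""
    return [vocab[START]] + [vocab[char] for char in word] + [vocab[END]]


def decode(ids, vocab):
    """Turn a list of ids back into a word, skipping the marker tokens."""
    inverse = {idx: token for token, idx in vocab.items()}
    return "".join(inverse[i] for i in ids if inverse[i] not in (PAD, START, END))


# Training pairs taken from the slide: (input sequence, output sequence).
pairs = [("cande", "candy"), ("hello", "hola")]
vocab = build_vocab([word for pair in pairs for word in pair])

source_ids = encode("cande", vocab)  # what the encoder would read
target_ids = encode("candy", vocab)  # what the decoder would be trained to emit
print(source_ids, target_ids, decode(target_ids, vocab))
```

The model itself only ever sees the id lists; decode is what turns its output back into a suggestion for the user.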

Where are we?
1. We have a framework
2. We have an algorithm
3. Let’s find an example

Deep Spelling Example
Blog post on Medium
Github code
90%+ Accuracy!

Data science blog posts
Not reproducible
Proprietary data sets that are not shared
Incorrect accuracy metrics
Someone please solve this (Academic + Industry partnership on peer-reviewed + reproducible data science algorithms)

PIT OF DESPAIR
http://everythingbrilliant.co.uk/wp-content/uploads/2017/01/emotional-journey-of-creating-anything-new.jpeg

Let’s find a new approach
1. Read blog posts
2. Read books
3. Read tutorials
4. Watch videos
5. Deep Learn

What about seq2seq?
New approach
- Existing libraries/frameworks
- OpenNMT
- Open Source contribution

What’s edit distance?
Minimum number of single-letter operations (substitutions, insertions, deletions) needed to transform a given word into another
kitten -> sitting = edit distance of 3
1. k -> s (substitution)
2. e -> i (substitution)
3. _ -> g (insertion)
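
The kitten -> sitting count above can be checked with the standard dynamic-programming (Levenshtein) algorithm. This is a minimal sketch for illustration; the function name is an assumption, and the talk does not say which implementation was used.

```python
def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance: fewest single-letter edits turning source into target."""
    # previous[j] holds the distance between the processed prefix of `source`
    # and the first j letters of `target`.
    previous = list(range(len(target) + 1))
    for i, source_char in enumerate(source, start=1):
        current = [i]  # deleting all i letters of the source prefix
        for j, target_char in enumerate(target, start=1):
            cost = 0 if source_char == target_char else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or free match)
            ))
        previous = current
    return previous[-1]


print(edit_distance("kitten", "sitting"))  # 3
print(edit_distance("cande", "candy"))     # 1
```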

https://gist.github.com/mrelich/36b5f37233026e828af6d63f6015554b#file-deep_spell_example-py

Production
Developed on AWS, but need to deploy internally
We can’t use OpenNMT
Ops :(, they have their reasons but FML
What’s next?
Tensorflow to the rescue!!!

Seq2Seq in Tensorflow
Tensorflow is a pain
Tensorflow is a collection of code, not a “library/framework”

Let’s figure out Tensorflow
1. Read blog posts
2. Read books
3. Read tutorials
4. Watch Videos
5. Deep Learn

Outcome
Added a week of development time...BUT
Simpler and Faster algorithm
Basis for production deployment

SO CLOSE!
We can properly correct spelling but…
We also have to guess partial words correctly
Query: ‘ub of the centry’
Corrected to: ‘club of the century’ ← Where we are
Correct query: ‘pub of the century’ ← Where we need to go

Dictionary matching FTW
Build a dictionary where the keys are phrases with 1 word dropped.
{ [joy luck]: “joy luck club”, [luck club]: “joy luck club”}
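
A minimal sketch of how such a dictionary could be built and queried. The function name and the tie-breaking rule (the first phrase to claim a key keeps it) are assumptions, not details from the talk. Dropping each word in turn also produces the key "joy club", which the slide does not list.

```python
def build_drop_one_index(phrases):
    """Map every 'phrase with one word dropped' to the full phrase."""
    index = {}
    for phrase in phrases:
        words = phrase.split()
        if len(words) < 2:
            continue  # dropping the only word would leave an empty key
        for i in range(len(words)):
            key = " ".join(words[:i] + words[i + 1:])
            # Assumption: on a key collision, the first phrase seen wins.
            index.setdefault(key, phrase)
    return index


index = build_drop_one_index(["joy luck club"])
print(index.get("joy luck"))   # joy luck club
print(index.get("luck club"))  # joy luck club
print(index.get("joy"))        # None
```

Each lookup is a single hash-table access, which fits the slide's point that the dictionary step is far cheaper than the model call.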

How much time does it add?
● Seq2Seq model adds 15 ms
● Dictionary lookup adds 0.2 ms

Why do we do hard things?
Hard Problem (Spelling) -> Knowledge accumulation
New problem -> Query parsing
[‘john grisham pelican brief’] -> [author,author,title,title]

More Projects
Query Parsing (Authors/Titles/Topics/Series/etc)
Content Summarization (Books/Articles/Documents)
Churn Prediction
Document Classification (Study Guides/Court cases/etc)

Data Blog (coming in 2018)
Posts on:
Spelling Correction
Query Tagging
Multi-Armed Bandits
A/B Test System Infrastructure

Work on hard data problems?
Work at Scribd.

Questions
Or we can all leave early!!!

Appendix
https://imgflip.com/i/1woc8x https://imgflip.com/i/1wpxvp https://imgflip.com/i/1wpy1z
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
http://www.dogster.com/wp-content/uploads/2015/05/scolded-dog.jpg
http://bravodog.ca/wp-content/uploads/2016/11/Reward-training-Calling-your-dog-to-come-4.jpeg
https://www.skylinelabs.in/blog/images/tensorflow.jpg
https://deeplearning4j.org/assets/themes/thedocs/img/DL4J-LOGO-2.png
https://raw.githubusercontent.com/dmlc/dmlc.github.io/master/img/logo-m/mxnet2.png
https://valohai.com/static/img/support-logos/theano.svg
http://cs231n.github.io/neural-networks-1/
