Bringing an AI Ecosystem to the Domain Expert and Enterprise AI Developer with Fred Reiss and Vijay Bommireddipalli

Bringing an AI Ecosystem to the
Domain Expert and Enterprise AI
Developer
Fred Reiss
Vijay Bommireddipalli
IBM Center for Open-Source Data & AI Technologies
(http://codait.org)

IBM’s history of strong AI leadership
1997: Deep Blue
• Deep Blue became the first machine to beat a world chess champion in tournament play
2011: Jeopardy!
• Watson beat two top Jeopardy! champions
1968: “2001: A Space Odyssey”
• IBM was a technical advisor
• HAL is “the latest in machine intelligence”
2018: Open Tech, AI & emerging standards
• New IBM centers of gravity for AI
• Open source projects increasing exponentially
• Emerging global standards in AI

Center for Open Source Data and AI Technologies
• CODAIT aims to make AI solutions dramatically easier to create, deploy, and manage in the enterprise
• Relaunch of the Spark Technology Center (STC) to reflect an expanded mission
• codait (French) = coder/coded
• https://m.interglot.com/fr/en/codait
• CODAIT: codait.org

CODAIT by the numb3rs
• The team contributes to over 10 open source projects, including Spark, TensorFlow, Keras, SystemML, Arrow, Bahir, Toree, Livy, Zeppelin, R4ML, Stocator, and Jupyter Enterprise Gateway
• 17 committers and many contributors in Apache projects: Spark, Arrow, SystemML, Bahir, Toree, Livy
• Over 980 JIRAs and 50,000 lines of code committed to Apache Spark itself, and over 65,000 lines of code committed to SystemML
– Established IBM as the number 1 contributor to Spark Machine Learning in the Spark 2.0 release
• Over 25 product lines within IBM leverage Apache Spark in some form or another. CODAIT engineers have interacted and interlocked with many of them.
• Speakers at over 100 conferences, meetups, unconferences, etc.
[Figure: Spark code contribution growth by week]

Improving the Enterprise AI lifecycle in Open Source
Center for Open Source Data and AI Technologies
• Code: Build and improve practical frameworks to enable more developers to realize immediate value (e.g. FfDL, TensorFlow, Jupyter, Spark)
• Content: Showcase solutions to complex, real-world AI problems
• Community: Bring developers and data scientists to engage with IBM (e.g. MAX)
[Figure: enterprise AI lifecycle. Stages: Gather Data, Analyze Data, Machine Learning, Deep Learning, Deploy Model, Maintain Model. Tools shown: Python Data Science Stack, Pandas, Scikit-Learn, Apache Spark, Jupyter, Keras + TensorFlow, Fabric for Deep Learning (FfDL), MLeap + PFA, Model Asset eXchange.]

Fabric for Deep Learning
https://github.com/IBM/FfDL
https://www.youtube.com/watch?v=nQsYWmkfLP4
• FfDL provides a scalable, resilient, and fault-tolerant deep learning framework
• Fabric for Deep Learning, or FfDL (pronounced “fiddle”), is an open source project that aims to make deep learning easily accessible to the people to whom it matters most, i.e. data scientists and AI developers.
• FfDL provides a consistent way to deploy, train, and visualize deep learning jobs across multiple frameworks such as TensorFlow, Caffe, PyTorch, and Keras.
• FfDL is being developed in close collaboration with IBM Research and IBM Watson. It forms the core of Watson’s Deep Learning service in open source.
• FfDL GitHub page: https://github.com/IBM/FfDL
• FfDL dwOpen page: https://developer.ibm.com/code/open/projects/fabric-for-deep-learning-ffdl/
• FfDL announcement blog: http://developer.ibm.com/code/2018/03/20/fabric-for-deep-learning
• FfDL technical architecture blog: http://developer.ibm.com/code/2018/03/20/democratize-ai-with-fabric-for-deep-learning
• Deep Learning as a Service within Watson Studio: https://www.ibm.com/cloud/deep-learning
• Research paper, “Scalable Multi-Framework Management of Deep Learning Training Jobs”: http://learningsys.org/nips17/assets/papers/paper_29.pdf

Jupyter Enterprise Gateway
• A lightweight, multi-tenant, scalable, and secure gateway that enables Jupyter Notebooks to share resources across an Apache Spark or Kubernetes cluster for enterprise/cloud use cases
• Jupyter Enterprise Gateway at IBM Code: https://developer.ibm.com/code/openprojects/jupyter-enterprise-gateway/
• Jupyter Enterprise Gateway source code at GitHub: https://github.com/jupyter-incubator/enterprise_gateway
• Jupyter Enterprise Gateway documentation: http://jupyter-enterprise-gateway.readthedocs.io/en/latest/
March 30 2018 / © 2018 IBM Corporation

Road Map
• Background: Deep Learning Models
• The IBM Code Model Asset Exchange
• Demo
• What’s Next

CODAIT: Enabling End-to-End AI in the Enterprise
[Figure: enterprise AI lifecycle. Stages: Gather Data, Analyze Data, Machine Learning, Deep Learning, Deploy Model, Maintain Model. Tools shown: Python Data Science Stack, Pandas, Scikit-Learn, Apache Spark, Jupyter, Keras + TensorFlow, Fabric for Deep Learning (FfDL), MLeap + PFA, Model Asset eXchange.]

Making AI as Ubiquitous as the Telephone

This talk is about enabling domain experts to use deep learning in the enterprise.
Q: What is deep learning?
A: Machine learning using deep neural networks.
Q: What is a deep neural network?
A: A neural network with multiple hidden layers.

Q: What is a neural network?

What is a neural network?
Linear regression:
$y = a x_1 + b x_2 + c x_3$
[Figure: inputs x1, x2, x3 connect to a single output y through weights a, b, c.]
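The linear regression on this slide is a weighted sum of three inputs. A minimal sketch follows; the function name `linear_regression` and the numeric values are illustrative assumptions, not taken from the talk.

```python
def linear_regression(x, weights):
    """Return the weighted sum of the inputs: y = a*x1 + b*x2 + c*x3."""
    if len(x) != len(weights):
        raise ValueError("x and weights must have the same length")
    return sum(w * xi for w, xi in zip(weights, x))


# Hypothetical weights a, b, c and inputs x1, x2, x3 (illustrative values only).
a, b, c = 2.0, -1.0, 0.5
y = linear_regression([1.0, 3.0, 4.0], [a, b, c])
print(y)  # 2*1 - 1*3 + 0.5*4 = 1.0
```

A real regression usually also has an intercept term; it is left out here because the slide's formula does not show one.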

Multiple linear regressions at the same time
[Figure: inputs x1, x2, x3 each connect to four outputs y1, y2, y3, y4.]
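Running several linear regressions on the same inputs amounts to multiplying the input vector by a weight matrix. The sketch below assumes three inputs and four outputs as in the diagram; the helper name `dense` and the weight values are made up for illustration.

```python
def dense(x, weights):
    """Apply one linear regression per output column to the same inputs.

    weights[i][j] is the coefficient linking input i to output j.
    """
    n_out = len(weights[0])
    return [sum(x[i] * weights[i][j] for i in range(len(x))) for j in range(n_out)]


# Hypothetical 3x4 weight matrix: 3 inputs (rows), 4 outputs (columns).
W = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
ys = dense([2.0, 3.0, 5.0], W)
print(ys)  # [2.0, 3.0, 5.0, 10.0]
```

Each column of `W` is one regression, so adding an output only means adding a column.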

Multilayer Perceptron Neural Network
[Figure: the network of linear regressions with a second layer of linear regressions added, and the same network in a more compact notation:]
Input (3) → Dense (3×4) → Dense (4×2) → Output (2)
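The compact notation Input (3) → Dense (3×4) → Dense (4×2) → Output (2) can be sketched in plain Python as two stacked layers. The ReLU nonlinearity between the layers is an assumption added here (the slide only mentions layers of linear regressions), and all weight values are hypothetical.

```python
def dense(x, weights):
    """One layer of linear regressions; weights[i][j] links input i to output j."""
    n_out = len(weights[0])
    return [sum(x[i] * weights[i][j] for i in range(len(x))) for j in range(n_out)]


def relu(values):
    """Assumed activation: clamp negative values to zero."""
    return [max(0.0, v) for v in values]


def mlp(x, w1, w2):
    hidden = relu(dense(x, w1))  # Dense (3x4): 3 inputs -> 4 hidden values
    return dense(hidden, w2)     # Dense (4x2): 4 hidden values -> 2 outputs


# Hypothetical weights, chosen only so the arithmetic is easy to follow.
W1 = [
    [1.0, -1.0, 0.0, 0.5],
    [0.0, 1.0, -1.0, 0.5],
    [1.0, 0.0, 1.0, -2.0],
]
W2 = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.0, -1.0],
]
out = mlp([1.0, 2.0, 3.0], W1, W2)
print(out)  # [5.0, 2.0]
```

Without a nonlinearity, the two layers would collapse into a single 3×2 linear map, which is why practical networks put an activation between layers.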

[Figure: Input (3) → Dense (3×4) → Dense (4×2) → Output (2)]
A: A neural network with multiple hidden layers.

A: A neural network with multiple hidden layers.
[Figure: Input (3) → Dense (3×8) → Dense (8×6) → Dense (6×4) → Dense (4×2) → Output (2)]
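As a small worked example, the layer shapes in this diagram can be checked for consistency and their weights counted. This is a sketch under the assumption that the layers have no bias terms; the helper names are hypothetical.

```python
# Layer shapes as (inputs, outputs), taken from the diagram on this slide.
LAYER_SHAPES = [(3, 8), (8, 6), (6, 4), (4, 2)]


def shapes_chain(shapes):
    """True when every layer's output count equals the next layer's input count."""
    return all(a[1] == b[0] for a, b in zip(shapes, shapes[1:]))


def count_weights(shapes):
    """Total number of coefficients, assuming no bias terms (an assumption)."""
    return sum(n_in * n_out for n_in, n_out in shapes)


# Every Dense output except the final one is a hidden layer.
hidden_layers = len(LAYER_SHAPES) - 1
print(shapes_chain(LAYER_SHAPES), hidden_layers, count_weights(LAYER_SHAPES))
# True 3 104
```

Even this toy network holds 24 + 48 + 24 + 8 = 104 weights, which gives a feel for how quickly model size grows with depth and width.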

Q: What is deep learning?
A: Machine learning using deep neural networks.
[Figure: InceptionV3 Convolutional Neural Net (a “medium-sized” deep learning model). Image source: https://github.com/tensorflow/models/blob/master/research/inception/g3doc/inception_v3_architecture.png]

Characteristics of Deep Learning (1)
• State-of-the-art prediction quality in many domains
– Image classification
– Machine translation
– Facial recognition
– Time series prediction
– Many more

• Large, complex models
– Model size is generally determined by “how big a model can you fit on your device?”
[Figure: InceptionV3 Convolutional Neural Net (a “medium-sized” deep learning model). Each box ≈ between 32 and 768 linear regression models.]

• Poorly understood today, even by experts
– Why do the models converge?
– Why do they converge with low loss?
– Why do they generalize?

Focus of this Talk
Incorporating well-understood deep learning models into enterprise applications.

The Parts of a Deep Learning Model
• Neural Network Graph: Input (3) → Dense (3×8) → Dense (8×6) → Dense (6×4) → Dense (4×2) → Output (2)
• Weights (not to scale)
• Driver Program
[Figure: the model outputs the label “cat”.]

Example: Get an Image Classifier
Step 1: Find a suitable neural network graph.
– Need to read some papers
Step 2: Find code to generate the neural network graph
[Figure: TensorFlow code to build ResNet50 neural network graph]
Step 3: Find some pre-trained weights for your graph
[Figure: Caffe2 ResNet50 model weights*]
* Caffe2 only. Find a different binary file if your framework is not Caffe2.
Step 4: Find example code that performs model inference
[Figure: TensorFlow code for training and batch inference* on ResNet50]
* Single-crop inference only. Additional code required to use multiple crops.
Step 5: Write your own code to perform model inference on one image at a time
Step 6: Package your inference code, graph creation code, and pre-trained weights together
Step 7: Deploy your package
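Steps 5 and 6 can be pictured with a toy package that bundles graph code, weights, and a one-item-at-a-time driver. This is a hypothetical sketch, not code from the talk or from any real framework: `ModelPackage`, `toy_forward`, the weights, and the labels are all invented, and a three-number feature vector stands in for an image.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class ModelPackage:
    """Hypothetical bundle of graph code, pre-trained weights, and driver logic."""

    forward: Callable[[Sequence[float], List[List[float]]], List[float]]
    weights: List[List[float]]
    labels: List[str]

    def predict_one(self, features: Sequence[float]) -> str:
        """Run inference on a single input and return the best-scoring label."""
        scores = self.forward(features, self.weights)
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.labels[best]


def toy_forward(x, weights):
    """Stand-in for the graph: one linear score per label (one weight row each)."""
    return [sum(xi * w for xi, w in zip(x, row)) for row in weights]


# Stand-in for pre-trained weights; a real model would load these from a file.
package = ModelPackage(
    forward=toy_forward,
    weights=[[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
    labels=["cat", "dog"],
)
print(package.predict_one([0.9, 0.1, 0.2]))  # cat
```

Keeping the three parts behind one `predict_one` call is what lets a deployment step treat the model as a single unit.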

Model Marketplaces
• Collections of well-understood deep learning models
• Provide a central place to find known-good implementations of these models

The IBM Code Model Asset eXchange
• Free, open-source models.
• Wide variety of domains.
• Multiple deep learning frameworks.
• Vetted and tested code and IP.
• Build and deploy a web service in 30 seconds.
• Start training on Watson Studio in minutes.

Model Asset eXchange: Summary
• Free, open-source models.
• Wide variety of domains.
• Multiple deep learning frameworks.
• Vetted and tested code and IP.
• Build and deploy a web service in 30 seconds.
• Start training on Watson Studio in minutes.

Model Asset eXchange: What’s Next
• More models
• More deployment options
• Code Patterns showing how to use the models (including today’s demo!)

Thank you!
• http://codait.org
• https://developer.ibm.com/code/exchanges/models/
• github.com/codait
• developer.ibm.com/code

Call for Code inspires developers to solve pressing global problems with sustainable software solutions, delivering on their vast potential to do good.
Bringing together NGOs, academic institutions, enterprises, and startup developers to compete to build effective disaster mitigation solutions, with a focus on health and well-being.
International Federation of Red Cross/Red Crescent, The American Red Cross, and the United Nations Office of Human Rights combine for the Call for Code Award to elevate the profile of developers.
Award winners will receive long-term support through open source foundations, financial prizes, and the opportunity to present their solution to leading VCs, and will deploy their solution through IBM’s Corporate Service Corps.
Developers will jump-start their project with dedicated IBM Code Patterns, combined with optional enterprise technology to build projects over the course of three months.
Judged by the world’s most renowned technologists, the grand prize will be presented in October at an Award Event.
developer.ibm.com/callforcode

IBM Sessions at Spark+AI Summit 2018 (Tuesday, June 5)
Date, Time, Location & Duration | Session title | Speaker
Tue, June 5, 11 AM, 2010-2012, 30 mins | Productionizing Spark ML Pipelines with the Portable Format for Analytics | Nick Pentreath (IBM)
Tue, June 5, 2 PM, 2018, 30 mins | Making PySpark Amazing—From Faster UDFs to Dependency Management and Graphing! | Holden Karau (Google), Bryan Cutler (IBM)
Tue, June 5, 2 PM, Nook by 2001, 30 mins | Making Data and AI Accessible for All | Armand Ruiz Gabernet (IBM)
Tue, June 5, 2:40 PM, 2002-2004, 30 mins | Cognitive Database: An Apache Spark-Based AI-Enabled Relational Database System | Rajesh Bordawekar (IBM T.J. Watson Research Center)
3016-3022, 30 mins | Dynamic Priorities for Apache Spark Application’s Resource Allocations | Michael Feiman (IBM Spectrum Computing), Shinnosuke Okada (IBM Canada Ltd.)
2001-2005, 30 mins | Model Parallelism in Spark ML Cross-Validation | Nick Pentreath (IBM), Bryan Cutler (IBM)
2007, 30 mins | Serverless Machine Learning on Modern Hardware Using Apache Spark | Patrick Stuedi (IBM)
2002-2004, 30 mins | Create a Loyal Customer Base by Knowing Their Personality Using AI-Based Personality Recommendation Engine | Sourav Mazumder (IBM Analytics), Aradhna Tiwari (University of South Florida)
2007, 30 mins | Transparent GPU Exploitation on Apache Spark | Dr. Kazuaki Ishizaki (IBM), Madhusudanan Kandasamy (IBM)
2009-2011, 30 mins | Apache Spark Based Hyper-Parameter Selection and Adaptive Model Tuning for Deep Neural Networks | Yonggang Hu (IBM), Chao Xue (IBM)

IBM Sessions at Spark+AI Summit 2018 (Wednesday, June 6)
Date, Time, Location & Duration | Session title | Speaker
Wed, June 6, 12:50 PM | Birds of a Feather: Apache Arrow in Spark and More | Bryan Cutler (IBM), Li Jin (Two Sigma Investments, LP)
Wed, June 6, 2 PM, 2002-2004, 30 mins | Deep Learning for Recommender Systems | Nick Pentreath (IBM)
Wed, June 6, 3:20 PM, 2018, 30 mins | Bringing an AI Ecosystem to the Domain Expert and Enterprise AI Developer | Frederick Reiss (IBM), Vijay Bommireddipalli (IBM Center for Open-Source Data & AI Technologies)
Meet us at the IBM booth in the Expo area.

Thank you!
http://codait.org
https://developer.ibm.com/code/exchanges/models/
https://www.linkedin.com/in/fred-reiss/
https://www.linkedin.com/in/vijayrb
github.com/codait
developer.ibm.com/code
