A Journey Through The Far Side of Data Science
Ted Washburne @ Stevens Institute of Technology / Sept. 27, 2018
Your competition
is pursuing the
opportunities that
AI & smart
automation offer
in every business
unit and going
beyond cost
savings or
productivity
improvements
The MBA is no
longer more
valuable than a
BA degree in
computer science
with machine
learning expertise
Salesforce
asserts that
“66% of a sales
rep’s time is
spent not selling”
and AI can
automate the
drudgery work
like onboarding
The younger
wealthy now
have 24/7
service
expectations
that can only
be delivered by
intelligent
automation
Driving revenue
growth and
profitability
Lowering the
cost of many
financial activities
to near-zero by
understanding
the critical
building blocks in
designing an AI
and smart
automation
strategy
Data alone
doesn’t help
business leaders
transform their
organizations.
AI enables faster
and larger-scale
intelligent
process
optimization,
intelligent agents,
and innovation
1
Data is the
new
foundation
2
AI becomes
the new
norm
3
Innovation is
intensifying
4
AI for the
Front-Office
5
Technology
is the
Business
6
AI enabled
Customer
Experience
7
Everything
happens in
the Platform
The reach and depth of AI technology are transforming Business
… Intelligent Automation (Aitomation) is the key …
Complete a thorough benefit analysis
before committing to investments
Make aitomation a strategic imperative
and get senior leadership backing
Focus on AInnovation with a central
team providing Governance
Identify and rapidly scale high-
impact aitomation use cases
Artificial Intelligence is making
significant inroads in Legal and Accounting
Develop the people capabilities needed for
maximum value (Python, Machine Learning)
Honestly assess the Competition and establish
ecosystem partners to challenge them
Who Made this all Possible?
Paul Werbos & The Chain Rule
• https://en.wikipedia.org/wiki/Backpropagation
• https://en.wikipedia.org/wiki/Paul_Werbos
• http://explained.ai/matrix-calculus/index.html
• differential matrix calculus, the
shotgun wedding of linear
algebra and multivariate calculus.
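As a minimal sketch of what the chain rule does in backpropagation (not taken from the linked articles), the snippet below differentiates a single sigmoid neuron with a squared-error loss and checks the analytic gradient against a finite-difference estimate. The function names and sample values are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, b, x, target):
    """Loss of one sigmoid neuron y = sigmoid(w*x + b) under squared error."""
    y = sigmoid(w * x + b)
    loss = 0.5 * (y - target) ** 2
    return loss, (x, y, target)

def backward(cache):
    """Chain rule: dL/dw = dL/dy * dy/dz * dz/dw, and likewise for b."""
    x, y, target = cache
    dL_dy = y - target          # derivative of the squared error
    dy_dz = y * (1.0 - y)       # derivative of the sigmoid
    dL_dz = dL_dy * dy_dz
    return dL_dz * x, dL_dz     # (dL/dw, dL/db)

def numeric_grad(f, v, eps=1e-6):
    """Central finite difference, used only to sanity-check the analytic gradient."""
    return (f(v + eps) - f(v - eps)) / (2 * eps)

# Hypothetical sample values
w, b, x, target = 0.5, -0.2, 1.5, 1.0
loss, cache = forward(w, b, x, target)
grad_w, grad_b = backward(cache)
num_w = numeric_grad(lambda v: forward(v, b, x, target)[0], w)
num_b = numeric_grad(lambda v: forward(w, v, x, target)[0], b)
```

The analytic and numeric gradients should agree to several decimal places; deep learning frameworks automate the same bookkeeping for networks with millions of parameters.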
Silicon Graphics
-> Nvidia +
Ian Buck
• Inventor of CUDA, the
established standard for GPU
Computing worldwide. Built
engineering team from two
people into international and
matrixed organization in
numerical libraries, compilers,
system software, IDEs, profilers,
debuggers, APIs, QA, build and
release, and AI frameworks.
https://www.nvidia.com/en-us/deep-learning-ai/education/
Data Science Themes at SIT
Artificial
Intelligence,
Machine Learning
& Cybersecurity
"We knew it was
the Russians,
and they knew
we knew,"
Johnston told
The New York Times of
the
cyberwarfare. "I
would say it was
the cyber
equivalent of
hand-to-hand
combat."
Biomedical
Engineering,
Healthcare & Life
Sciences
Deep Learning’s
Deepest Impact:
AI Storming
Through $6.5
Trillion
Healthcare
Industry
Complex Systems
& Networks
USAA developed
a
comprehensive
analytical
simulation
model of their
complex
operations and
the market that
is allowing
senior
executives to
explore a wide
range of
scenarios and
strategic options
and understand
long term
implications of
these decisions.
Data Science and
Information
Systems
Cognitive
knowledge
graphs encode a
model of expert
knowledge of
every domain
within a context.
This gives bots a
semantic
understanding
of the context
and helps them
to respond to
complex
queries.
Financial Systems
& Technologies
Providing an
extension of the
bank's main
quant team
covering the
whole range of
quant tasks from
numerical
algorithms to
multi-curve
building and
financial
modelling.
Resilience &
Sustainability
Utilities now use
machine learning
to classify network
assets at high risk
of failure and
manage the
complexities of
distributed energy
resource
management.
Preparation for
cyberattacks
includes
transformers and
circuit breakers, as
well as secure
warehouses to
store them in
select locations
and the
preplanned
transportation and
logistics to get
them where they
need to go as the
situation requires.
Maritime Security
HawkEye 360, a
developer of
space-based
radio frequency
(RF) mapping
and analytics
systems,
develops deep
convolutional
neural networks,
Bayesian
propagation
networks and
statistical
anomaly
detection for
maritime
domain
awareness
Systems
Engineering
Research Center
*Constraint
Programming to
Incorporate
Engineering
Methodologies
into the Design
Process of
Complex
Systems.
*Recurrent Nets
to solve
Knapsack
problems in
Mission
Management.
*Petri Nets for
Process
Discovery
https://www.usnews.com/news/articles/2016-09-23/is-the-energy-grid-in-danger
Chua, L.O., Lin, G.-N
“Nonlinear programming
without computation”
Early Years
Career
SIT Courses & Case Studies
•Citi Cards
•Capital One
FIN 615 Financial
Decision Making
•Ceph
•Schneider Electric
•Spark
MIS 630 Database
Systems and Decision
Support
•Snowflake
•Merging Accounts
•Bad Days
MIS 636 Data
Warehousing and
Business Intelligence
•Knapsack
•HSBC RPA
BIA 650 Process
Optimization and
Analytics
•BMW China
•Coin-OR
•Visa Fraud
•Morgan Stanley
High Frequency
BIA 670 Risk
Management: Methods
and Applications
•Kaggle
•Numpy
•Pandas
•Seaborn, Keras
BIA 652
Multivariate Data
Analytics
•Schneider Electric
•Unmanned ground
vehicle
• Capital One
BIA 664 Data and
Information Quality
•TV Advertising
•Direct Mail
BIA 654 Experimental
Design
•MBA Forecasts
•Asurion Trouble
Prediction
MIS 637 Knowledge
Discovery in Databases
•DBS Treasury
•Hastie, Stork
BIA 656 Statistical
Learning and Analytics
•Flu prediction
•Amobee link
prediction
•Alternative
Influence Network
BIA 658 Social Network
Analytics
•DoubleClick
Attribution
•Yelp reviews
BIA 660 Web Mining
•PwC Audit.ai
•Automated Feature
Engineering
BIA 662 Cognitive
Computing
•A template for
understanding Big
Debt Crises
•by Ray Dalio
BIA 670 Risk
Management: Methods
and Applications
•Omnicom
•HP Forecast
•Lowes Hardware
•Cambridge
Analytica
BIA 672 Marketing
Analytics
•Walmart Labs
•Nike
•NV Energy
•Hawkeye360
BIA 674 Supply Chain
Analytics
•DBS ATM Maint.
•Insurance
Telematics
•ASW
BIA 676 Data Streams
Analytics: Internet of
Things
•Spark killed Hadoop
•MapReduce is dead
•DataBricks
•SparkFlows.io
BIA 678 Big Data
Technologies
•Next Best Action
Models for Wealth
Management
•HR systems
BIA 686 Applied
Analytics in a World of
Big Data
https://www.youtube.com/watch?v=g6oIQ5MXBE4
https://aws.amazon.com/rekognition/
Useful Algorithms For Your BI Career
Reinforcement
Learning
Dueling Deep Q
Network
Robots for
unloading ships,
warehouse
forklifts,
harvesting crops
Recommender
Systems
Netflix Prize
A quantum-
inspired classical
algorithm for
recommendation
systems.
Ewin Tang, July 10,
2018
Segmentation
Decision Tree
Clustering (k-
means or EM)
& factoextra
•Avoid “lazy”
dimensionality
reduction, like
principal
components
Forecasting
Tsintermittent
Recurrent NN
XGBoost
•http://docs.h2o.ai/driverless-ai/latest-stable/docs/userguide/time-series.html?highlight=forecast
Regression
GLM/GBM
Classifiers
Trees or XGBoost
Logistic Regression
Convolutional
Neural Nets
NLP
Word2Vec
Truncated SVD
OR
Coin-OR branch
and cut
Knapsack
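To make the segmentation entry above concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The function name `kmeans`, the toy points, and the fixed seed are illustrative assumptions, not material from the talk; in practice a library implementation would normally be used.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # start from k distinct data points
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # index of the nearest center by squared Euclidean distance
            j = min(range(k),
                    key=lambda i: sum((a - c) ** 2 for a, c in zip(p, centers[i])))
            clusters[j].append(p)
        new_centers = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:           # assignments have stabilized
            break
        centers = new_centers
    return centers, clusters

# Hypothetical customers described by two features
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
          (8.0, 8.1), (7.9, 8.3), (8.2, 7.8)]
centers, clusters = kmeans(points, k=2)
```

On this toy data the two well-separated groups end up in separate clusters; real segmentation work would also need feature scaling and a way to choose k.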
DS is IT + Biz + software engineering + design
Get to know Docker, Kubeflow, Kubernetes, and Seldon
A 2018 paper in Nature
cited AlphaGo's
approach as the basis for
a new means of
computing potential
pharmaceutical drug
molecules
Systems Engineering Research
• Along with a few other contributors, I authored a paper on a Hopfield (recurrent) system that we had built at Lockheed Research
Labs, and it was accepted as a poster paper at a neural networks conference in Maryland
• Some weeks later, I was given a new clearance and told to report to a building with no
windows.
• The project had something to do with Mission Management, and its challenge was
solving a really big knapsack optimization problem over multiple time windows
• The problem often arises in resource allocation where there are financial constraints and is studied in
fields such as combinatorics, computer science, complexity theory, cryptography, applied
mathematics, and daily fantasy sports.
• While researching the problem, I was concerned that Hopfield networks can
get stuck in local minima and fail to find the globally optimal solution
• Researching all the way back to the 1970s, I found a paper by a Berkeley professor named
Chua that described a better-behaved network for this problem
• The network could be modeled in the popular circuit modeling software “SPICE”, using a Sun SPARC
workstation, as was done at Analog Devices and Linear Technology (which merged in 2017)
• SPICE (Simulation Program with Integrated Circuit Emphasis) is a general-purpose, open source analog electronic
circuit simulator. MATLAB can do this now
• We found that we could set up very large constraint matrices and completely random starting
points and the system would find an almost perfect solution in under a second and then spend
the next 10 seconds converging to the optimal solution
• Far faster than a Cray using the traditional linear programming ‘Greedy Algorithm’
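For readers who have not met the knapsack problem, the sketch below contrasts an exact dynamic-programming solution with a greedy value-per-weight heuristic on a tiny single-window instance. It is not the Chua network or the system described above, and the item values, weights, and function names are hypothetical.

```python
def knapsack_dp(items, capacity):
    """Exact 0/1 knapsack via dynamic programming.

    items: list of (value, weight) pairs with integer weights.
    """
    best = [0] * (capacity + 1)   # best[c] = max value achievable with capacity c
    for value, weight in items:
        # iterate capacity downward so each item is used at most once
        for cap in range(capacity, weight - 1, -1):
            best[cap] = max(best[cap], best[cap - weight] + value)
    return best[capacity]

def knapsack_greedy(items, capacity):
    """Greedy heuristic: take items in order of value/weight while they fit."""
    total = 0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if weight <= capacity:
            capacity -= weight
            total += value
    return total

# Hypothetical (value, weight) pairs
items = [(60, 10), (100, 20), (120, 30)]
capacity = 50
optimal = knapsack_dp(items, capacity)
greedy = knapsack_greedy(items, capacity)
```

Here the greedy heuristic reaches 160 while the exact answer is 220, which illustrates why a method that converges to the true optimum matters. The exact table grows with capacity times the number of items, so large multi-window instances need other approaches.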
Maritime Security
• Neural Nets for Sonar
Signal Processing
Financial Systems & Technologies
NBA: Recommended Solution for Client Investment & Trading Tech / Treasury and Markets
Sales Person
Counterparty
Client
Product
Counterparty
Market Data
Quantitative
Product Info
Market Data
Revenue
P&L Records
Unstructured
Data (WWW)
Counterparties clustered by similarity
(clients and non-clients)
Graph of relations
• Salesperson x CPTY as nodes
• Products related to each graph edge
Temporal structure of relations:
Mutations of the graph in time
Market Regime
Switches
• Regime-driven behaviour (SP, CPTY)
• Regime-driven revenue and risks
Contracts/products parameters templates
for each SP/CPTY in time
Client Request
Historical Data Knowledge Discovery OUTCOMES
•Stable transitions?
•Periodic events?
•Hidden connections?
•Recognized states?
•Unstructured outlier?
•Atypical states series?
•Unrecovered gaps?
•Unstable behaviour?
Patterns
Anomalies
Client Existing Data
Required Extra Data for Future Phases
Discovered Knowledge
Key: NBA Data
NBA Model
MODELS
Data Science and Information Systems
• https://databricks.com/session/moving-ebays-data-warehouse-over-to-apache-spark-spark-as-core-etl-platform-at-ebay
• Snowflake - Beyond Hadoop: Modern Cloud Data
Warehousing
• Capital One / NY / Sr. Software Engineer
• All of our infrastructure runs on AWS, and we are
eyeing other cloud providers too. We use Elasticsearch
and ELK stack, Redis, PostgreSQL, Redshift, and
Snowflake. We do analytics using H2O, Spark and
MLlib, Databricks/EMR, TensorFlow and Keras. We
build awesome products for our users that use this
data. We write microservices in Go and Node.js,
orchestrated by Kubernetes, with user experiences
written in React and TypeScript. We embrace
serverless. Your hands-on expertise in at least some of
these tools will be valuable, as well as your track
record of providing effective technical guidance.
CPA + AI
Accounting jobs are not going away – the skill set is changing
• Business Setting
• PwC partnered with H2O.ai to build a revolutionary bot that uses AI and machine
learning to ‘x-ray’ a business, analyzing billions of data points in milliseconds,
seeing what humans can’t, and applying judgement to detect anomalies in the
general ledger. Called GL.ai, it is the first module of PwC’s Audit.ai.
• Approach
• GL.ai harnesses PwC’s global knowledge and experience, embedding it in
algorithms trained to replicate the thinking and decision-making of expert auditors.
• It examines every uploaded transaction, every user, every amount and every
account to find unusual transactions (indicating potential error or fraud) in the
general ledger, without bias or variability.
• Impact
• Experience confirms that GL.ai speeds up the audit process, generates insights that
boost efficiency, and provides comfort that attention is being focused on areas of
true risk. These benefits are a direct result of GL.ai’s ability to analyze huge
amounts of data, not limited by sampling.
• The next Audit.ai modules are in development. They are set to revolutionize the
audit, enhancing client service, quality and efficiency, and giving our people more
time to do what machines can’t: thinking strategically and engaging,
communicating and building the relationships needed to turn data insights into
business action.
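How GL.ai works internally is not described here, so the following is only a generic sketch of the underlying idea: score every transaction instead of a sample and flag the unusual ones. It uses a median/MAD based modified z-score; the function name, the threshold of 3.5, and the ledger amounts are assumptions for illustration.

```python
from statistics import median

def flag_unusual(amounts, threshold=3.5):
    """Return indices of amounts whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a few extreme entries do not distort the baseline.
    """
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        return []   # no spread to measure against
    return [i for i, a in enumerate(amounts)
            if abs(0.6745 * (a - med) / mad) > threshold]

# Hypothetical general-ledger amounts; every entry is examined
ledger = [120.0, 135.5, 128.0, 131.2, 119.9, 9800.0, 125.4, 130.1]
suspicious = flag_unusual(ledger)
```

Only the 9800.0 entry (index 5) is flagged. A production system would score many attributes per transaction (user, account, timing) rather than the amount alone.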
Complex Systems & Networks
Prescriptive Models combine:
•Known system facts, structure, and process
•Facts derived from data and statistical analysis
•Causal hypotheses (business judgment and assumptions)
•Dynamics of the system
•To not only predict the behavior, but also tell you why the system behaves that way
•Point to actions you can take today
•Prepare for responses if certain events materialize (real options)
•Help to point out wrong assumptions
•Explore a wide range of outcomes (scenarios)
Hybrid Modeling
•Discrete Event Simulation – Models how entities flow through a process and consume resources. Good for finding bottlenecks and
throughput/volume issues for well defined processes and non-adaptive entities (PROCESS CENTRIC)
•System Dynamics (Causal Loop) – Models aggregate causal relationships between system components to study system level behavior.
Incorporates non-linear causation, feedback loops, delays, system interdependencies, and soft variables. Good for broad, system level
understanding of dynamic behavior (SYSTEM CENTRIC)
•Agent Based Modeling – Models individual agents and how they react to external stimuli and their relationships with other agents. The
complex dynamic system level behavior emerges from the interactions of simple agents following simple rules. Agents have biases and
bounded rationality. They adapt and learn, but operate in a noisy uncertain environment (INDIVIDUAL CENTRIC)
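The process-centric view can be sketched with a very small discrete-event style simulation of a call queue. This is an illustrative toy, not the USAA model: callers are served first come, first served by identical reps, and the function name, arrival times, and service times are hypothetical.

```python
import heapq

def simulate_queue(arrivals, service_times, servers=1):
    """Simulate a FIFO queue with identical servers; return each caller's wait."""
    free_at = [0.0] * servers          # time at which each server becomes free
    heapq.heapify(free_at)
    waits = []
    for arrive, service in zip(arrivals, service_times):
        earliest = heapq.heappop(free_at)   # next server to become available
        start = max(arrive, earliest)       # wait if every server is busy
        waits.append(start - arrive)
        heapq.heappush(free_at, start + service)
    return waits

# Hypothetical call arrival times and handling times, in minutes
arrivals = [0, 1, 2, 3]
service_times = [3, 3, 3, 3]
one_rep = simulate_queue(arrivals, service_times, servers=1)
two_reps = simulate_queue(arrivals, service_times, servers=2)
```

With one rep the waits grow to 0, 2, 4 and 6 minutes; with two reps they fall to 0, 0, 1 and 1. Comparing scenarios like this is how such models expose bottlenecks, and richer versions add abandonment, skills, and routing logic.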
Superior Models
•Far Higher Granularity, Time
•Skills/Groups, Missing Effects
•Abandonment behavior, Multi-skilled Sales Reps, Routing logic
•Also Incorporate Call and Sales Rep Attributes, Attribute based routing, Individual Agent Behaviors
Business Results
•Benefits
•Increased Revenue due to reduced abandons
•Improved customer satisfaction due to reduced wait times
•Reduction in hiring and training costs
•Risk Reduction
•Better understanding of operational risks and where they might surface
•This allows USAA to design mitigation strategies that are more proactive than reactive
•Investment Prioritization
•Reduced/Avoided rework due to better sequencing of work
•Better allocation of resources to create most value
How this Relates to Strategy and Value
[Diagram. Columns: External Factors; Leadership Decisions; Process; Outcomes/Measures (KEIs). Labels: Competition, GDP, Capability Investments, Hire Staff, Customer SAT, Quality, Profit, contacts, apps, products, rework, Value, Operational Processes, Strategic Choices, Risks. Causal Hypotheses: Analytics, Business Judgment, Assumptions.]
Example II – Portfolio Roadmap and Dependencies
* - Notional Data
Incorporate data analytics as well as
business judgment in your models
Using Data Science and Simulation to Create Business Value
Dr. Bipin Chadha - Data Scientist
USAA Enterprise Data Analytics Office
Nov. 2015
Biomedical Engineering, Healthcare & Life Sciences
Best Opportunities to Meet Healthcare’s Needs
Clinical
• Value Stream
Mapping/Design, Leading
Kaizen and Relentless Root
Cause Analysis
• Personalization of care
using claims and biometric
data (Apple watch?) models
• Predict outcomes and
adverse events
• Accountable care – Medical
Economics
Enterprise
• Cybersecurity
• Payment Integrity
• Pricing Optimization
• Fraud, AML, KYC
• Intelligent RPA with Petri
Nets for Process Discovery
Marketing & Sales
•Rep-Broker-Sponsor Attribution
Modeling
•Real-time targeting in the call center
•CLV, health behavior/activation models,
experiment design
•Attribution models
•Develop audience segmentations, core
value propositions, messaging strategy
for the different segments, measure
impact and efficacy of marketing
investments
•End to end delivery of behavior change
campaigns
•Chatbots & knowledge graphs
•Customer Journey Analytics
•Direct-to-Consumer marketing & sales
Member
Experience
• Advocacy Analytics
• Flu region prediction (with
active listening of Social
Media) and Next Best
Action messaging
• Churn prediction
• Personalized Robo-advisors
provide greater
convenience and insightful,
real-time recommendations
Artificial Intelligence, Machine Learning & Cybersecurity
Detecting Fraud or Cybersecurity Transactions Involves Monitoring Multiple Views Simultaneously
Endpoint authentication
Is the session being
compromised?
Is this account’s
behavior normal for this
channel?
Is this account’s
behavior normal for all
channels?
Are multiple accounts
showing
correlated behaviors
across multiple
channels?
Endpoints
Merchant POS
ATM
Online Purchase
Acquirer
third-party online
payment platform
Cloud Communication
Network
Production Fraud
Scoring
•Endpoint authentication
•Is the session being
compromised?
•Is this account’s behavior
normal for this channel?
•Is this account’s behavior
normal for all channels?
•Are multiple accounts
showing
correlated behaviors across
multiple channels?
Transaction histories
with fraud categorization
•Machine learning fraud
models
•Cross account and cross
channel graph analytics
Issuers
Banks
Alternative Payments
Enablers
Blockchain & Bitcoin
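The cross-account, cross-channel graph analytics named above can be sketched as a connected-components problem: accounts that touch the same endpoint are linked, and linked groups are candidates for correlated behavior. This is a minimal illustration with hypothetical account and endpoint names, not a production fraud model.

```python
from collections import defaultdict

def linked_account_groups(events):
    """Group accounts that share an endpoint, directly or transitively.

    events: iterable of (account, endpoint) pairs.
    Returns a sorted list of sorted account groups with more than one member.
    """
    parent = {}

    def find(x):
        # union-find lookup with path halving
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}   # endpoint -> first account observed on it
    for account, endpoint in events:
        find(account)
        if endpoint in first_seen:
            parent[find(account)] = find(first_seen[endpoint])
        else:
            first_seen[endpoint] = account

    groups = defaultdict(set)
    for account in parent:
        groups[find(account)].add(account)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)

# Hypothetical events across ATM, online, and point-of-sale channels
events = [
    ("acct_A", "atm_17"), ("acct_B", "atm_17"),
    ("acct_B", "web_ip_9"), ("acct_C", "web_ip_9"),
    ("acct_D", "pos_3"),
]
groups = linked_account_groups(events)
```

Accounts A, B and C form one group because B bridges the ATM and the web endpoint, while D stays isolated. Such groups would then feed the machine learning fraud models as additional features.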
A Unified Approach to Interpreting Model Predictions
Scott Lundberg, Su-In Lee
What Capital One is Looking For
FINANCE ASSOCIATE
Finance
• Analyze financial metrics and performance
• Develop, improve and / or automate reporting and analysis to provide insight into business trends
• Play a key role in evolving product and strategy decisions by providing finance analysis and forecasts
• Prepare for the future by learning new Tech skills
Accounting
• Participate in the external financial reporting process, including the quarterly earnings release and
securitization trust reporting
• Perform financial and operational audits, testing controls and identifying efficiencies
• Monitor/Enhance business specific analytics in support of external and internal financial reporting
• Evaluate and engage in Robotic Process Automation (RPA) projects across the Controller’s
Organization.
Preferred Qualifications:
• Bachelor’s degree in Finance, or Economics, or Business, or Accounting
• A demonstrated interest in financial management and technology aptitude
• At least 6 months of experience or course work in Financial Planning & Analysis (FP&A)
• Aptitude with technologies such as Python, SQL and R is strongly preferred
DATA SCIENTIST INTERN
On any given day, you might:
• Evaluate open source and internally-developed modeling and analytics tools using real business data
• Integrate internal data with external data sources and APIs to discover and implement actionable
insights
• Design and craft rich data visualizations to communicate stories to customers and company leadership
We'd love to find someone who is…
• 1. Intellectually curious. You ask why, you explore, and you are excited to imagine and create new
ideas by inventing self-adaptive models or by tapping into unstructured data sources. You love mining
data for insights into behaviors, intent and sentiment.
• 2. A builder. You are passionate about delivering better experiences and better products to our
customers and have a deep sense of ownership for your craft.
• 3. An experimental scientist. You love putting on your lab coat and trying new things, new
combinations of tools, techniques and feature engineering approaches even if you sometimes fail.
Basic Qualifications:
• At least 6 months of experience or course work in open source programming languages for data
analysis
• At least 6 months of experience or course work in inferential statistics or machine learning
Preferred Qualifications:
• Direct experience with either Python or R, plus one other general purpose programming language
such as Java or C/C++
• Experience or course work with large scale data analysis
Further Reading Worthy Of Becoming The
Basis Of Your Thesis
• https://amp-businessinsider-com.cdn.ampproject.org/c/s/amp.businessinsider.com/why-attitude-is-more-important-than-iq-2017-2
• https://www-theverge-com.cdn.ampproject.org/c/s/www.theverge.com/platform/amp/2018/9/5/17822562/google-dataset-search-service-scholar-scientific-journal-open-data-access
• https://www.liebertpub.com/doi/full/10.1089/big.2018.0083
• https://blogs.microsoft.com/blog/2018/07/19/powering-our-customers-the-innovation-story-behind-microsofts-earnings/
• https://www.fastcompany.com/40590772/my-three-decades-at-disney-taught-me-not-to-fear-automation
Thank You
Ted Washburne
tlcj97@yahoo.com
Appendix
H2O Materials
Cheat Sheets for AI, Neural Networks, Machine Learning, Deep Learning & Big Data
H2O Materials
Publicly Available Materials
http://docs.h2o.ai/
https://github.com/h2oai/h2o-meetups
Cheat Sheets for AI, Neural
Networks, Machine Learning,
Deep Learning & Big Data
https://becominghuman.ai/cheat-sheets-for-ai-neural-networks-machine-learning-deep-learning-big-data-678c51b4b463


A Journey Through The Far Side Of Data Science

  • 1. A Journey Through The Far Side of Data Science Ted Washburne @ Stevens Institute of Technology / Sept. 27, 2018
  • 2. Your competition is pursuing opportunities for AI and smart automation offers in every business unit, going beyond cost savings or productivity improvements. The MBA is no longer more valuable than a BA degree in computer science with machine learning expertise. Salesforce asserts that “66% of a sales rep’s time is spent not selling,” and AI can automate drudgery work like onboarding. The younger wealthy now have 24/7 service expectations that can only be delivered by intelligent automation. Driving revenue growth and profitability. Lowering the cost of many financial activities to near-zero by understanding the critical building blocks in designing an AI and smart automation strategy. Data alone doesn’t help business leaders transform their organizations. AI enables faster and larger-scale intelligent process optimization, intelligent agents, and innovation. 1. Data is the new foundation. 2. AI becomes the new norm. 3. Innovation is intensifying. 4. AI for the front office. 5. Technology is the business. 6. AI-enabled customer experience. 7. Everything happens in the platform. Reach and depth of AI technology is transforming business.
  • 3. … Intelligent Automation (Aitomation) is the key … Complete a thorough benefit analysis before committing to investments. Make aitomation a strategic imperative and get senior leadership backing. Focus on AInnovation with a central team providing governance. Identify and rapidly scale high-impact aitomation use cases. Artificial General Intelligence is making significant inroads in Legal and Accounting. Develop the people capabilities needed for maximum value (Python, Machine Learning). Honestly assess the competition and establish ecosystem partners to challenge them.
  • 4. Who Made this all Possible? Paul Werbos & The Chain Rule • https://en.wikipedia.org/wiki/Backpropagation • https://en.wikipedia.org/wiki/Paul_Werbos • http://explained.ai/matrix-calculus/index.html • Differential matrix calculus, the shotgun wedding of linear algebra and multivariate calculus. Silicon Graphics -> Nvidia + Ian Buck • Inventor of CUDA, the established standard for GPU computing worldwide. Built an engineering team from two people into an international, matrixed organization covering numerical libraries, compilers, system software, IDEs, profilers, debuggers, APIs, QA, build and release, and AI frameworks. https://www.nvidia.com/en-us/deep-learning-ai/education/
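The chain rule credited on slide 4 can be illustrated with a single sigmoid neuron. This is a minimal sketch, not code from the talk: the function names, the squared-error loss, and the finite-difference check are all assumptions chosen for illustration.

```python
import math


def forward(w, b, x):
    """One sigmoid neuron: y = sigmoid(w * x + b)."""
    z = w * x + b
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, x, target):
    """Squared error for a single training example (an assumed loss)."""
    return 0.5 * (forward(w, b, x) - target) ** 2


def backprop(w, b, x, target):
    """Gradient of the loss with respect to w and b via the chain rule."""
    y = forward(w, b, x)
    dL_dy = y - target        # derivative of 0.5 * (y - t)^2 w.r.t. y
    dy_dz = y * (1.0 - y)     # derivative of the sigmoid w.r.t. z
    dL_dz = dL_dy * dy_dz     # chain rule: dL/dz = dL/dy * dy/dz
    return dL_dz * x, dL_dz   # dz/dw = x and dz/db = 1


def numeric_grad(w, b, x, target, eps=1e-6):
    """Central finite differences, used only to sanity-check backprop."""
    dw = (loss(w + eps, b, x, target) - loss(w - eps, b, x, target)) / (2 * eps)
    db = (loss(w, b + eps, x, target) - loss(w, b - eps, x, target)) / (2 * eps)
    return dw, db
```

Comparing `backprop(0.5, -0.2, 1.5, 1.0)` with `numeric_grad(0.5, -0.2, 1.5, 1.0)` should give nearly identical pairs; deep learning frameworks apply the same rule layer by layer, using matrices in place of scalars.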
  • 5. Data Science Themes at SIT. Artificial Intelligence, Machine Learning & Cybersecurity: "We knew it was the Russians, and they knew we knew," Johnston told the NY Times of the cyberwarfare. "I would say it was the cyber equivalent of hand-to-hand combat." Biomedical Engineering, Healthcare & Life Sciences: Deep Learning’s Deepest Impact: AI Storming Through $6.5 Trillion Healthcare Industry. Complex Systems & Networks: USAA developed a comprehensive analytical simulation model of their complex operations and the market that is allowing senior executives to explore a wide range of scenarios and strategic options and understand the long-term implications of these decisions. Data Science and Information Systems: Cognitive knowledge graphs encode a model of expert knowledge of every domain within a context. This gives bots a semantic understanding of the context and helps them to respond to complex queries. Financial Systems & Technologies: Providing an extension of the bank's main quant team covering the whole range of quant tasks, from numerical algorithms to multi-curve building and financial modelling. Resilience & Sustainability: Utilities now use machine learning to classify network assets at high risk of failure and manage the complexities of distributed energy resource management. Preparation for cyberattacks includes transformers, circuit breakers, secure warehouses to store them in select locations, and the preplanned transportation and logistics to get them where they need to go as the situation requires. Maritime Security: HawkEye 360, a developer of space-based radio frequency (RF) mapping and analytics systems, develops deep convolutional neural networks, Bayesian propagation networks and statistical anomaly detection for maritime domain awareness. Systems Engineering Research Center: * Constraint Programming to Incorporate Engineering Methodologies into the Design Process of Complex Systems. * Recurrent Nets to solve Knapsack problems in Mission Management. * Petri Nets for Process Discovery. https://www.usnews.com/news/articles/2016-09-23/is-the-energy-grid-in-danger Chua, L.O., Lin, G.-N., “Nonlinear programming without computation”
  • 8. SIT Courses & Case Studies. FIN 615 Financial Decision Making: Citi Cards, Capital One. MIS 630 Database Systems and Decision Support: Ceph, Schneider Electric, Spark. MIS 636 Data Warehousing and Business Intelligence: Snowflake, Merging Accounts, Bad Days. BIA 650 Process Optimization and Analytics: Knapsack, HSBC RPA. BIA 670 Risk Management: Methods and Applications: BMW China, Coin-OR, Visa Fraud, Morgan Stanley High Frequency. BIA 652 Multivariate Data Analytics: Kaggle, Numpy, Pandas, Seaborn, Keras. BIA 664 Data and Information Quality: Schneider Electric, Unmanned ground vehicle, Capital One. BIA 654 Experimental Design: TV Advertising, Direct Mail. MIS 637 Knowledge Discovery in Databases: MBA Forecasts, Asurion Trouble Prediction. BIA 656 Statistical Learning and Analytics: DBS Treasury; Hastie, Stork. BIA 658 Social Network Analytics: Flu prediction, Amobee link prediction, Alternative Influence Network. BIA 660 Web Mining: DoubleClick Attribution, Yelp reviews. BIA 662 Cognitive Computing: PwC Audit.ai, Automated Feature Engineering. BIA 670 Risk Management: Methods and Applications: “A template for understanding Big Debt Crises” by Ray Dalio. BIA 672 Marketing Analytics: Omnicom, HP Forecast, Lowes Hardware, Cambridge Analytica. BIA 674 Supply Chain Analytics: Walmart Labs, Nike, NV Energy, Hawkeye360. BIA 676 Data Streams Analytics: Internet of Things: DBS ATM Maint., Insurance Telematics, ASW. BIA 678 Big Data Technologies: Spark killed Hadoop, MapReduce is dead, DataBricks, SparkFlows.io. BIA 686 Applied Analytics in a World of Big Data: Next Best Action Models for Wealth Management, HR systems. https://www.youtube.com/watch?v=g6oIQ5MXBE4 https://aws.amazon.com/rekognition/
  • 9. Useful Algorithms For Your BI Career. Reinforcement Learning: Dueling Deep Q Network; robots for unloading ships, warehouse forklifts, harvesting crops. Recommender Systems: Netflix Prize; “A quantum-inspired classical algorithm for recommendation systems,” Ewin Tang, July 10, 2018. Segmentation: Decision Tree; Clustering (k-means or EM) & factoextra; avoid “lazy” dimensionality reduction, like principal components. Forecasting: tsintermittent; Recurrent NN; XGBoost (http://docs.h2o.ai/driverless-ai/latest-stable/docs/userguide/time-series.html?highlight=forecast). Regression: GLM/GBM. Classifiers: Trees or XGBoost; Logistic Regression; Convolutional Neural Nets. NLP: Word2Vec; Truncated SVD. OR: Coin-OR branch and cut; Knapsack. DS is IT + Biz + software engineering + design. Get to know Docker, Kubeflow, Kubernetes, and Seldon. A 2018 paper in Nature cited AlphaGo's approach as the basis for a new means of computing potential pharmaceutical drug molecules.
  • 10. Systems Engineering Research • I authored a paper on a Hopfield (recurrent) system that we had built at Lockheed Research Labs, along with a few other contributors, and got it accepted to a Neural Networks conference in Maryland as a poster paper. • Some weeks later, I was getting a new clearance and was told to report to a building with no windows. • The project had something to do with Mission Management, and its challenge was solving a really big knapsack optimization problem over multiple time windows. • The knapsack problem often arises in resource allocation where there are financial constraints and is studied in fields such as combinatorics, computer science, complexity theory, cryptography, applied mathematics, and daily fantasy sports. • While researching the problem, I was concerned that Hopfield networks can get stuck in local minima and fail to find the globally optimal solution. • Researching all the way back to the 1970s, I found a paper by a Berkeley professor named Chua that described a better-behaved network for this problem. • The network could be modeled in the popular circuit modeling software SPICE on a Sun SPARC workstation, as was done at Analog Devices and Linear Technology (which merged last year). • SPICE (Simulation Program with Integrated Circuit Emphasis) is a general-purpose, open source analog electronic circuit simulator. MATLAB can do this now. • We found that we could set up very large constraint matrices and completely random starting points, and the system would find an almost perfect solution in under a second and then spend the next 10 seconds converging to the optimal solution. • Far faster than a Cray using the traditional linear programming ‘Greedy Algorithm’.
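To make slide 10's knapsack problem concrete, the sketch below contrasts a greedy heuristic with an exact dynamic program on a tiny 0/1 instance. It is an illustration only: it is not the Hopfield or Chua network from the talk, the item values and weights are made up, and integer weights are assumed.

```python
def greedy_knapsack(items, capacity):
    """Greedy heuristic: take items by best value/weight ratio while they fit."""
    total_value = 0
    remaining = capacity
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if weight <= remaining:
            total_value += value
            remaining -= weight
    return total_value


def optimal_knapsack(items, capacity):
    """Exact 0/1 knapsack by dynamic programming over integer capacities."""
    best = [0] * (capacity + 1)  # best[c] = best value achievable with capacity c
    for value, weight in items:
        # Iterate capacities downward so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


# Hypothetical (value, weight) pairs chosen so the greedy answer is suboptimal.
ITEMS = [(60, 10), (100, 20), (120, 30)]
CAPACITY = 50

if __name__ == "__main__":
    print(greedy_knapsack(ITEMS, CAPACITY))   # 160
    print(optimal_knapsack(ITEMS, CAPACITY))  # 220
```

The greedy pass settles for 160 while the optimum is 220, which shows why a method that reliably reaches the optimum matters. The dynamic program's cost grows with capacity times the number of items, so it does not scale to very large, multi-window instances like the one described.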
  • 11. Maritime Security • Neural Nets for Sonar Signal Processing
  • 12. Financial Systems & Technologies NBA: Recommended Solution for Client Investment & Trading Tech / Treasury and Markets Sales Person Counterparty Client Product Counterparty Market Data Quantitative Product Info Market Data Revenue P&L Records Unstructured Data (WWW) Counterparties clustered by similarity (clients and non-clients) Graph or relations • Salesperson x CPTY as nodes • Products related to each graph edge Temporal structure of relations: Mutations of the graph in time Market Regime Switches • Regime-driven behaviour (SP, CPTY) • Regime-driven revenue and risks Contracts/products parameters templates for each SP/CPTY in time Client Request Historical Data Knowledge Discovery OUTCOMES • Stable transitions? • Periodic events? • Hidden connections? • Recognized states? • Unstructured outlier? • Atypical states series? • Unrecovered gaps? • Unstable behaviour? Patterns Anomalies Client Existing Data Required Extra Data for Future Phases Discovered Knowledge Key: NBA Data NBA Model MODELS
  • 13. Data Science and Information Systems • https://databricks.com/session/moving-ebays-data-warehouse-over-to-apache-spark-spark-as-core-etl-platform-at-ebay • Snowflake - Beyond Hadoop: Modern Cloud Data Warehousing • Capital One / NY / Sr. Software Engineer • All of our infrastructure runs on AWS, and we are eyeing other cloud providers too. We use Elasticsearch and ELK stack, Redis, PostgreSQL, Redshift, and Snowflake. We do analytics using H2O, Spark and MLlib, Databricks/EMR, TensorFlow and Keras. We build awesome products for our users that use this data. We write microservices in Go and Node.js, orchestrated by Kubernetes, with user experiences written in React and TypeScript. We embrace serverless. Your hands-on expertise in at least some of these tools will be valuable, as well as your track record of providing effective technical guidance.
  • 14. CPA + AI Accounting jobs are not going away – the skill set is changing • Business Setting • PwC partnered with H2O.ai to build a revolutionary bot that uses AI and machine learning to ‘x-ray’ a business, analyzing billions of data points in milliseconds, seeing what humans can’t, and applying judgement to detect anomalies in the general ledger. Called GL.ai, it is the first module of PwC’s Audit.ai. • Approach • GL.ai harnesses PwC’s global knowledge and experience, embedding it in algorithms trained to replicate the thinking and decision-making of expert auditors. • It examines every uploaded transaction, every user, every amount and every account to find unusual transactions (indicating potential error or fraud) in the general ledger, without bias or variability. • Impact • Experience confirms that GL.ai speeds up the audit process, generates insights that boost efficiency, and provides comfort that attention is being focused on areas of true risk. These benefits are a direct result of GL.ai’s ability to analyze huge amounts of data, not limited by sampling. • The next Audit.ai modules are in development. They are set to revolutionize the audit, enhancing client service, quality and efficiency, and giving our people more time to do what machines can’t: thinking strategically and engaging, communicating and building the relationships needed to turn data insights into business action.
  • 15. Complex Systems & Networks Prescriptive Models combine: • Known system facts, structure, and process • Facts derived from data and statistical analysis • Causal hypotheses (business judgment and assumptions) • Dynamics of the system • To not only predict the behavior, but also tell you why the system behaves that way • Point to actions you can take today • Prepare for responses if certain events materialize (real options) • Help to point out wrong assumptions • Explore a wide range of outcomes (scenarios) Hybrid Modeling • Discrete Event Simulation – Models how entities flow through a process and consume resources. Good for finding bottlenecks and throughput/volume issues for well-defined processes and non-adaptive entities (PROCESS CENTRIC) • System Dynamics (Causal Loop) – Models aggregate causal relationships between system components to study system-level behavior. Incorporates non-linear causation, feedback loops, delays, system interdependencies, and soft variables. Good for broad, system-level understanding of dynamic behavior (SYSTEM CENTRIC) • Agent Based Modeling – Models individual agents and how they react to external stimuli and their relationships with other agents. The complex dynamic system-level behavior emerges from the interactions of simple agents following simple rules. Agents have biases and bounded rationality. They adapt and learn, but operate in a noisy, uncertain environment (INDIVIDUAL CENTRIC) Superior Models • Far Higher Granularity, Time • Skills/Groups, Missing Effects • Abandonment behavior, Multi-skilled Sales Reps, Routing logic • Also Incorporate Call and Sales Rep Attributes, Attribute-based routing, Individual Agent Behaviors Business Results • Benefits • Increased Revenue due to reduced abandons • Improved customer satisfaction due to reduced wait times • Reduction in hiring and training costs • Risk Reduction • Better understanding of operational risks and where they might surface • This allows USAA to design mitigation strategies that are more proactive than reactive • Investment Prioritization • Reduced/Avoided rework due to better sequencing of work • Better allocation of resources to create most value How this Relates to Strategy and Value [diagram labels: External Factors; Leadership Decisions; Process; Outcomes/Measures (KEIs); Competition; GDP; Capability Investments; Hire Staff; Customer SAT; Quality; Profit; contacts; apps; products; rework; Value; Operational Processes; Strategic Choices; Risks; Causal Hypotheses: Analytics, Business Judgment, Assumptions] Example II – Portfolio Roadmap and Dependencies (* Notional Data) Incorporate data analytics as well as business judgment in your models. Using Data Science and Simulation to Create Business Value, Dr. Bipin Chadha, Data Scientist, USAA Enterprise Data Analytics Office, Nov. 2015
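The discrete event simulation idea on slide 15, including caller abandonment, can be sketched in a few lines of standard-library Python. This is a hypothetical toy model, not USAA's: the exponential arrival, service, and patience distributions, the parameter values, and the function name are all assumptions.

```python
import heapq
import random


def simulate_call_center(n_callers, n_reps, mean_interarrival=1.0,
                         mean_service=4.0, mean_patience=2.0, seed=0):
    """First-come-first-served call center with impatient callers.

    Requires n_reps >= 1. Times are in arbitrary units (say, minutes).
    """
    rng = random.Random(seed)
    rep_free_at = [0.0] * n_reps      # min-heap of times each rep becomes free
    heapq.heapify(rep_free_at)
    clock = 0.0
    served = abandoned = 0
    total_wait = 0.0
    for _ in range(n_callers):
        clock += rng.expovariate(1.0 / mean_interarrival)  # next arrival time
        service = rng.expovariate(1.0 / mean_service)
        patience = rng.expovariate(1.0 / mean_patience)
        start = max(clock, rep_free_at[0])  # earliest moment a rep is available
        wait = start - clock
        if wait > patience:
            abandoned += 1                  # caller hangs up; no rep is consumed
            continue
        heapq.heapreplace(rep_free_at, start + service)  # rep busy until then
        served += 1
        total_wait += wait
    mean_wait = total_wait / served if served else 0.0
    return {"served": served, "abandoned": abandoned, "mean_wait": mean_wait}
```

Running it for several staffing levels, for example `simulate_call_center(10_000, n)` for `n` in 3 to 6, gives a rough picture of how abandons and waits respond to hiring, which is the kind of scenario exploration the slide describes. An agent-based variant would additionally give each caller and rep its own attributes and behavior rules.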
  • 16. Biomedical Engineering, Healthcare & Life Sciences Best Opportunities to Healthcare’s Needs Clinical • Value Stream Mapping/Design, Leading Kaizen and Relentless Root Cause Analysis • Personalization of care using claims and biometric data (Apple watch?) models • Predict outcomes and adverse events • Accountable care – Medical Economics Enterprise • Cybersecurity • Payment Integrity • Pricing Optimization • Fraud, AML, KYC • Intelligent RPA with Petri Nets for Process Discovery Marketing & Sales •Rep-Broker-Sponsor Attribution Modeling •Real-time targeting in the call center •CLV, health behavior/activation models, experiment design •Attribution models •Develop audience segmentations, core value propositions, messaging strategy for the different segments, measure impact and efficacy of marketing investments •End to end delivery of behavior change campaigns •Chatbots & knowledge graphs •Customer Journey Analytics •Direct-to-Consumer marketing & sales Member Experience • Advocacy Analytics • Flu region prediction (with active listening of Social Media) and Next Best Action messaging • Churn prediction • Personalized Robo-advisors provide greater convenience and insightful, real-time recommendations
  • 17. Artificial Intelligence, Machine Learning & Cybersecurity Detecting Fraud or Cybersecurity Transactions Involves Monitoring Multiple Views Simultaneously Endpoint authentication. Is the session being compromised? Is this account's behavior normal for this channel? Is this account's behavior normal for all channels? Are multiple accounts showing correlated behaviors across multiple channels? Endpoints: Merchant POS, ATM, Online Purchase, Acquirer, third-party online payment platform, Cloud Communication Network. Production Fraud Scoring • Endpoint authentication • Is the session being compromised? • Is this account's behavior normal for this channel? • Is this account's behavior normal for all channels? • Are multiple accounts showing correlated behaviors across multiple channels? Transaction histories with fraud categorization • Machine learning fraud models • Cross-account and cross-channel graph analytics Issuers, Banks, Alternative Payments Enablers, Blockchain & Bitcoin
  • 18. A Unified Approach to Interpreting Model Predictions Scott Lundberg, Su-In Lee
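The paper cited on slide 18 builds on Shapley values from cooperative game theory. The sketch below computes exact Shapley attributions for a tiny model by averaging marginal contributions over every feature ordering. It is an illustration of the underlying idea only, not the `shap` library or the paper's algorithms: replacing "missing" features with a single baseline vector is a simplifying assumption, and the toy model is made up.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline) to each feature.

    Cost grows factorially with the number of features, so this is only
    practical for a handful of inputs.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)       # start with every feature "missing"
        prev = model(current)
        for i in order:
            current[i] = x[i]          # reveal feature i
            new = model(current)
            phi[i] += new - prev       # marginal contribution in this ordering
            prev = new
    return [p / len(orderings) for p in phi]


def toy_model(v):
    """Hypothetical model with one interaction term between features 0 and 2."""
    return 2 * v[0] + 3 * v[1] + v[0] * v[2]
```

For `shapley_values(toy_model, [1, 1, 1], [0, 0, 0])` the attributions are 2.5, 3.0, and 0.5: the interaction term is split evenly between features 0 and 2, and the three values sum to the difference between the prediction and the baseline output.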
  • 19. What Capital One is Looking For FINANCE ASSOCIATE Finance • Analyze financial metrics and performance • Develop, improve and / or automate reporting and analysis to provide insight into business trends • Play a key role in evolving product and strategy decisions by providing finance analysis and forecasts • Prepare for the future by learning new Tech skills Accounting • Participate in the external financial reporting process, including the quarterly earnings release and securitization trust reporting • Perform financial and operational audits, testing controls and identifying efficiencies • Monitor/Enhance business specific analytics in support of external and internal financial reporting • Evaluate and engage in Robotics Process Automation (RPA) projects across the Controller’s Organization. Preferred Qualifications: • Bachelor’s degree in Finance, or Economics, or Business, or Accounting • A demonstrated interest in financial management and technology aptitude • At least 6 months of experience or course work in Financial Planning & Analysis (FP&A) • Aptitude with technologies such as Python, SQL and R is strongly preferred DATA SCIENTIST INTERN On any given day, you might: • Evaluate open source and internally-developed modeling and analytics tools using real business data • Integrate internal data with external data sources and APIs to discover and implement actionable insights • Design and craft rich data visualizations to communicate stories to customers and company leadership We'd love to find someone who is… • 1. Intellectually curious. You ask why, you explore, and you are excited to imagine and create new ideas by inventing self-adaptive models or by tapping into unstructured data sources. You love mining data for insights into behaviors, intent and sentiment. • 2. A builder. You are passionate about delivering better experiences and better products to our customers and have a deep sense of ownership for your craft. • 3. An experimental scientist. 
You love putting on your lab coat and trying new things, new combinations of tools, techniques and feature engineering approaches even if you sometimes fail. Basic Qualifications: • - At least 6 months of experience or course work in open source programming languages for data analysis • - At least 6 months of experience or course work in inferential statistics or machine learning Preferred Qualifications: • - Direct experience with either Python or R, plus one other general purpose programming language such as Java or C/C++ • - Experience or course work with large scale data analysis
  • 20. Further Reading Worthy Of Becoming The Basis Of Your Thesis • https://amp-businessinsider-com.cdn.ampproject.org/c/s/amp.businessinsider.com/why-attitude-is-more-important-than-iq-2017-2 • https://www-theverge-com.cdn.ampproject.org/c/s/www.theverge.com/platform/amp/2018/9/5/17822562/google-dataset-search-service-scholar-scientific-journal-open-data-access • https://www.liebertpub.com/doi/full/10.1089/big.2018.0083 • https://blogs.microsoft.com/blog/2018/07/19/powering-our-customers-the-innovation-story-behind-microsofts-earnings/ • https://www.fastcompany.com/40590772/my-three-decades-at-disney-taught-me-not-to-fear-automation
  • 22. Appendix H2O Materials Cheat Sheets for AI, Neural Networks, Machine Learning, Deep Learning & Big Data
  • 23. H2O Materials Publicly Available Materials http://docs.h2o.ai/ https://github.com/h2oai/h2o-meetups
  • 32. Cheat Sheets for AI, Neural Networks, Machine Learning, Deep Learning & Big Data https://becominghuman.ai/cheat-sheets-for-ai-neural-networks-machine-learning-deep-learning-big-data-678c51b4b463