April 2, 2020
Oracle Cloud Infrastructure Data Science
Oracle and Java are registered trademarks of Oracle Corporation.
2 Copyright © 2020 Oracle and/or its affiliates.
Overview
• Machine learning (ML), OSS, Oracle Accelerated Data Science (ADS)
• ML
• PaaS, IaaS
• Notebook
• Jupyter Notebook, ML, Compute
• Compartment, VCN, Subnet, Compute, Block Volume
• ML
• Keras
• scikit-learn
• XGBoost
• Oracle Accelerated Data Science (ADS)
[Diagram labels: Accelerated Data Science, scikit-learn, ML, Jupyter Notebook, Notebook, Compute, Block Storage]
[Figure: Notebook; surviving labels: Python, Notebook, OCI, Jupyter Notebook]
Oracle Accelerated Data Science (ADS)
• Oracle Cloud Infrastructure Data Science, Python
• API
• Oracle AutoML
[Diagram: Accelerated Data Science workflow; surviving step labels: (2) Data transformation, (5) Model evaluation, (6) Model interpretation; AutoML]
• ADS
• DatasetFactory
• OCI Object Storage, Amazon S3, Google Cloud Storage, Azure Blob
• Oracle DB, ADW, MongoDB, HDFS, NoSQL DB, Elasticsearch, etc.
• CSV, TSV, Parquet, libsvm, JSON, Excel, HDF5, SQL, XML, Apache server log files (clf, log), ARFF
import os
from ads.dataset.factory import DatasetFactory

# Local file
ds = DatasetFactory.open("/path/to/data.data", format='csv', delimiter=" ")

# OCI Object Storage Service
ds = DatasetFactory.open("oci://<bucket-name>/<file-name>", storage_options={
    "config": "~/.oci/config",
    "profile": "DEFAULT_USER"
})

# Amazon S3
ds = DatasetFactory.open("s3://bucket_name/iris.csv", storage_options={
    'key': 'aws key',
    'secret': 'aws secret',
    'blocksize': 1000000,
    'client_kwargs': {
        "endpoint_url": "https://s3-us-west-1.amazonaws.com"
    }
})

# ADW (table and index_col are defined by the caller)
uri = f'oracle+cx_oracle://{os.environ["ADW_USER"]}:{os.environ["ADW_PASSWORD"]}@{os.environ["ADW_SID"]}'
ds = DatasetFactory.open(uri, format="sql", table=table, index_col=index_col, target='label')
• RDB
• String
• Null
ADS
# Show the recommended transformations, then get the dataset with the selected ones applied
ds.get_recommendations()
transformed_ds = ds.get_transformed_dataset()
# Or apply all recommended transformations automatically
transformed_ds = ds.auto_transform()
ADS AutoML
[Screenshots: four examples of get_recommendations() output in ADS; the recommended actions that survive in the captions are "Drop", "Up-sample" and "Down-sample"]
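The "Up-sample" and "Down-sample" recommendations address class imbalance. As a rough illustration of the idea only (this is not how ADS implements it; the function `resample` and the toy rows are hypothetical), a standard-library sketch could look like this:

```python
import random
from collections import defaultdict


def resample(rows, label_of, mode="up", seed=0):
    """Balance classes by up-sampling minorities or down-sampling majorities."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    sizes = [len(group) for group in by_class.values()]
    # "up": grow every class to the largest size; "down": shrink to the smallest.
    target = max(sizes) if mode == "up" else min(sizes)
    balanced = []
    for group in by_class.values():
        if len(group) >= target:
            # Keep or down-sample: draw `target` rows without replacement.
            balanced.extend(rng.sample(group, target))
        else:
            # Up-sample: keep every row, then draw the extras with replacement.
            balanced.extend(group)
            balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Toy data (assumption): 8 rows of class 0 and 2 rows of class 1.
rows = [("a", 0)] * 8 + [("b", 1)] * 2
up = resample(rows, label_of=lambda r: r[1], mode="up")
down = resample(rows, label_of=lambda r: r[1], mode="down")
print(len(up), len(down))
```

Up-sampling keeps all the original rows but repeats minority rows, while down-sampling discards majority rows, so the choice trades overfitting risk against information loss.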
• API (Seaborn, Matplotlib, GIS)
# show_in_notebook() displays a summary of the dataset in the notebook
ds.show_in_notebook()
# Plot a single column
ds.plot("col02").show_in_notebook(figsize=(4,4))
# Plot one column against another
ds.plot("col02", y="col01").show_in_notebook(figsize=(4,4))
# Plot another pair of columns with the default figure size
ds.plot("col01", y="col03").show_in_notebook()
API (Seaborn, Matplotlib, GIS)
# Matplotlib
import matplotlib.pyplot as plt
import pandas as pd
from numpy.random import randn
from ads.dataset.factory import DatasetFactory

df = pd.DataFrame(randn(1000, 4), columns=list('ABCD'))

def ts_plot(df, figsize):
    # Index the rows by a daily date range, then plot the cumulative sums
    df = df.set_index(pd.date_range('1/1/2000', periods=len(df)))
    df = df.cumsum()
    plt.figure()
    df.plot(figsize=figsize)
    plt.legend(loc='best')

ds = DatasetFactory.from_dataframe(df, target='A')
ds.call(ts_plot, figsize=(7,7))
• ADS AutoML
1. Algorithm selection
2. Adaptive sampling
3. Feature selection
4. Hyperparameter tuning
import logging
from ads.automl.provider import OracleAutoMLProvider
from ads.automl.driver import AutoML
# Split into training and test data
train, test = transformed_ds.train_test_split(test_size=0.1)
# Train with Oracle AutoML
ml_engine = OracleAutoMLProvider(n_jobs=-1, loglevel=logging.ERROR)
oracle_automl = AutoML(train, provider=ml_engine)
automl_model1, baseline = oracle_automl.train()
• AdaBoostClassifier
• DecisionTreeClassifier
• ExtraTreesClassifier
• KNeighborsClassifier
• LGBMClassifier
• LinearSVC
• LogisticRegression
• RandomForestClassifier
• SVC
• XGBClassifier
Oracle AutoML
oracle_automl.visualize_algorithm_selection_trials()
oracle_automl.visualize_adaptive_sampling_trials()
oracle_automl.visualize_feature_selection_trials()
oracle_automl.visualize_tuning_trials()
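[Figure: a single TRAIN / TEST split compared with cross-validation (※1), iterations 1 to 5, where the TEST block moves to a different part of the data on each iteration]
※1 The data is split into N parts. In iteration 1, one part is used as TEST and the remaining N-1 parts as TRAIN. In iteration 2, a different part is used as TEST and the remaining N-1 parts as TRAIN. This is repeated N times.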
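The N-iteration TEST / TRAIN rotation in the footnote can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the ADS implementation; the helper name `k_fold_splits` is hypothetical and the samples are not shuffled.

```python
def k_fold_splits(n_samples, n_folds):
    """Yield (train_indices, test_indices) for each of the n_folds iterations."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        stop = start + base + (1 if fold < extra else 0)
        test = indices[start:stop]               # 1 part is TEST
        train = indices[:start] + indices[stop:]  # remaining N-1 parts are TRAIN
        yield train, test
        start = stop


splits = list(k_fold_splits(n_samples=10, n_folds=5))
for i, (train, test) in enumerate(splits, start=1):
    print(f"iteration {i}: TEST={test} TRAIN={train}")
```

Every sample lands in TEST exactly once, so averaging the N scores uses all of the data for evaluation without ever scoring a model on rows it was trained on.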
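• PR curve, ROC curve
from ads.evaluations.evaluator import ADSEvaluator
# Create an evaluator for the two binary classification models
bin_evaluator = ADSEvaluator(test, models=[bin_lr_model, bin_rf_model],
                             training_data=train)
# Show the evaluation plots in the notebook
bin_evaluator.show_in_notebook(perfect=True)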
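To make the PR and ROC plots concrete, the sketch below computes the points behind both curves for a binary classifier by hand. It is an illustration only and does not reproduce ADSEvaluator; the function names and the toy labels and scores are assumptions, and it requires both classes to be present in `y_true`.

```python
def confusion_counts(y_true, y_score, threshold):
    """Return (tp, fp, fn, tn) when scores >= threshold are predicted positive."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, y_score):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def pr_and_roc_points(y_true, y_score):
    """One (precision, recall) and one (fpr, tpr) point per distinct score."""
    pr, roc = [], []
    for threshold in sorted(set(y_score), reverse=True):
        tp, fp, fn, tn = confusion_counts(y_true, y_score, threshold)
        pr.append((tp / (tp + fp), tp / (tp + fn)))
        roc.append((fp / (fp + tn), tp / (tp + fn)))
    return pr, roc


# Toy labels and scores (assumption), sorted by descending score.
y_true = [1, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
pr, roc = pr_and_roc_points(y_true, y_score)
for (precision, recall), (fpr, tpr) in zip(pr, roc):
    print(f"precision={precision:.2f} recall={recall:.2f} fpr={fpr:.2f} tpr={tpr:.2f}")
```

Lowering the threshold moves along both curves at once: recall (the true positive rate) can only rise, while precision and the false positive rate show what that costs.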
• Global Explainer
- Feature Permutation Importance
- Individual Conditional Expectation (ICE)
- Partial Dependence Plot (PDP)
• Local Explainer
ADS Global Explainer – Feature Permutation Importance

Original data (baseline_score):
PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Fare | Embarked
1 | 0 | 3 | Braund, Mr. Owen | male | 22 | 1 | 0 | 7.25 | S
2 | 1 | 1 | Cumings, Mrs. John | female | 38 | 1 | 0 | 71.2833 | C
3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26 | 0 | 0 | 7.925 | S
4 | 1 | 1 | Futrelle, Mrs. Jacques Heath | female | 35 | 1 | 0 | 53.1 | S

Data with the Sex column shuffled (shuffled_score):
PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Fare | Embarked
1 | 0 | 3 | Braund, Mr. Owen | Female | 22 | 1 | 0 | 7.25 | S
2 | 1 | 1 | Cumings, Mrs. John | Male | 38 | 1 | 0 | 71.2833 | C
3 | 1 | 3 | Heikkinen, Miss. Laina | Male | 26 | 0 | 0 | 7.925 | S
4 | 1 | 1 | Futrelle, Mrs. Jacques Heath | male | 35 | 1 | 0 | 53.1 | S

Feature importance: baseline_score - shuffled_score
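The baseline-minus-shuffled score on this slide can be sketched without any ML library. The code below is a hedged illustration, not the MLX implementation: the toy model, the toy rows and the helper names are all assumptions, and accuracy stands in for whatever score the real explainer uses.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, column, n_iter=20, seed=0):
    """Average of baseline_score - shuffled_score over n_iter shuffles."""
    rng = random.Random(seed)
    baseline_score = accuracy(model, X, y)
    drops = []
    for _ in range(n_iter):
        # Shuffle one column only; every other column keeps its values.
        values = [row[column] for row in X]
        rng.shuffle(values)
        shuffled_X = [{**row, column: v} for row, v in zip(X, values)]
        shuffled_score = accuracy(model, shuffled_X, y)
        drops.append(baseline_score - shuffled_score)
    return sum(drops) / n_iter


# Toy model (assumption): it looks only at "Sex" and ignores "Age".
def toy_model(row):
    return 1 if row["Sex"] == "female" else 0


X = [{"Sex": s, "Age": a} for s, a in
     [("male", 22), ("female", 38), ("female", 26), ("female", 35),
      ("male", 35), ("male", 54), ("female", 27), ("male", 2)]]
y = [toy_model(row) for row in X]

sex_importance = permutation_importance(toy_model, X, y, "Sex")
age_importance = permutation_importance(toy_model, X, y, "Age")
print(f"Sex: {sex_importance:.3f}  Age: {age_importance:.3f}")
```

Shuffling a column the model depends on lowers the score, while shuffling an ignored column leaves it unchanged, which is why the difference serves as an importance measure.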
Global Explainer – Feature Importance Sample Code
# Build the model using AutoML. 'model' is a subclass of type ADSModel.
# Note that the ADSExplainer below works with any model (classifier or
# regressor) that is wrapped in an ADSModel
import logging
from ads.automl.provider import OracleAutoMLProvider
from ads.automl.driver import AutoML
ml_engine = OracleAutoMLProvider(n_jobs=-1, loglevel=logging.ERROR)
oracle_automl = AutoML(train, provider=ml_engine)
model, baseline = oracle_automl.train()
# Create the ADS explainer object, which is used to construct global
# and local explanation objects. The ADSExplainer takes as input the
# model to explain and the train/test dataset
from ads.explanations.explainer import ADSExplainer
explainer = ADSExplainer(test, model, training_data=train)
# With ADSExplainer, create a global explanation object using
# the MLXGlobalExplainer provider
from ads.explanations.mlx_global_explainer import MLXGlobalExplainer
global_explainer = explainer.global_explanation(
    provider=MLXGlobalExplainer())
# A summary of the global feature permutation importance algorithm and
# how to interpret the output can be displayed with
global_explainer.feature_importance_summary()
# Compute the global Feature Permutation Importance explanation
importances = global_explainer.compute_feature_importance()
# ADS supports multiple visualizations for the global Feature
# Permutation Importance explanations (see "Interpretation" above)
# Simple bar chart highlighting the average impact on model score
# across multiple iterations of the algorithm
importances.show_in_notebook()
ADS Global Explainer - Individual Conditional Expectation (ICE)

F1 | F2 | F3 | T
2 | 1.2 | 0 | 15.1
7 | 2.4 | 4 | 12.5
8 | 9.7 | 3 | 18.1
. | ... | ... | 13.5

F1 | F2 | F3 | T
2 | 1.2 | 0 | 15.1

Input:
F1 | F2 | F3 | T
1 | 1.2 | 0 | ?
2 | 2.4 | 4 | ?
3 | 9.7 | 3 | ?
. | ... | ... | ?

Predicted T:
F1 | F2 | F3 | T
1 | 1.2 | 0 | 13.5
2 | 2.4 | 4 | 15.1
3 | 9.7 | 3 | 17.5
. | ... | ... | ...

[Figure: ICE plot with F1 on the x-axis and T on the y-axis]
ADS Global Explainer - Partial Dependence Plot (PDP)
[Figure: ICE curves of T against F1, with the PDP curve drawn through them]
PDP = average of the ICE curves
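The relationship between ICE and PDP can be written out in a few lines. This is a minimal sketch under stated assumptions, not the MLX algorithm: the model `toy_model`, the sample rows and the grid of F1 values are hypothetical and chosen so the arithmetic is easy to follow.

```python
def ice_curve(model, row, feature, grid):
    """Predictions for one row as `feature` sweeps over `grid`."""
    return [model({**row, feature: value}) for value in grid]


def pdp_curve(model, rows, feature, grid):
    """PDP: the point-wise mean of the ICE curves of all rows."""
    curves = [ice_curve(model, row, feature, grid) for row in rows]
    return [sum(points) / len(points) for points in zip(*curves)]


# Hypothetical model: T = 2*F1 + F2*F3
def toy_model(row):
    return 2 * row["F1"] + row["F2"] * row["F3"]


rows = [
    {"F1": 2, "F2": 1.0, "F3": 0},
    {"F1": 7, "F2": 2.0, "F3": 4},
    {"F1": 8, "F2": 0.5, "F3": 2},
]
grid = [1, 2, 3]  # values substituted for F1

ice = [ice_curve(toy_model, row, "F1", grid) for row in rows]
pdp = pdp_curve(toy_model, rows, "F1", grid)
print(ice)  # one curve per row
print(pdp)  # a single averaged curve
```

Each ICE curve holds one row's other features fixed and varies only F1, so differences between curves reveal interactions that the averaged PDP curve hides.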
Global Explainer – ICE/PDP Sample Code
# Build the model using AutoML. 'model' is a subclass of type ADSModel.
# Note that the ADSExplainer below works with any model (classifier or
# regressor) that is wrapped in an ADSModel
import logging
from ads.automl.provider import OracleAutoMLProvider
from ads.automl.driver import AutoML
ml_engine = OracleAutoMLProvider(n_jobs=-1, loglevel=logging.ERROR)
oracle_automl = AutoML(train, provider=ml_engine)
model, baseline = oracle_automl.train()
# Create the ADS explainer object, which is used to construct
# global and local explanation objects. The ADSExplainer takes
# as input the model to explain and the train/test dataset
from ads.explanations.explainer import ADSExplainer
explainer = ADSExplainer(test, model, training_data=train)
# With ADSExplainer, create a global explanation object using
# the MLXGlobalExplainer provider
from ads.explanations.mlx_global_explainer import MLXGlobalExplainer
global_explainer = explainer.global_explanation(
    provider=MLXGlobalExplainer())
# A summary of the global partial feature dependence explanation
# algorithm and how to interpret the output can be displayed with
global_explainer.partial_dependence_summary()
# Compute the 1-feature PDP on the categorical feature, "sex",
# and numerical feature, "age"
pdp_sex = global_explainer.compute_partial_dependence("sex")
pdp_age = global_explainer.compute_partial_dependence(
    "age", partial_range=(0, 1))
# ADS supports PDP visualizations for both 1-feature and 2-feature
# Feature Dependence explanations, and ICE visualizations for 1-feature
# Feature Dependence explanations (see "Interpretation" above)
# Visualize the categorical feature PDP for the True (Survived) label
pdp_sex.show_in_notebook(labels=True)
Local Explainer
• (Survived = 0 or 1)
Example: Titanic dataset (https://www.kaggle.com/c/titanic)

PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Fare | Embarked
1 | 0 | 3 | Braund, Mr. Owen | male | 22 | 1 | 0 | 7.25 | S
2 | 1 | 1 | Cumings, Mrs. John | female | 38 | 1 | 0 | 71.2833 | C
3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26 | 0 | 0 | 7.925 | S
... | ... | ... | ... | ... | ... | ... | ... | ... | ...

Sample to predict:
PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Fare | Embarked
500 | ? | 1 | Anna. Miss. Bworn | female | 36 | 1 | 0 | 71.283 | C

Prediction:
PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Fare | Embarked
500 | 1 | 1 | Anna. Miss. Bworn | female | 36 | 1 | 0 | 71.283 | C

Why?
Local Explainer
[Figure: the training data and the sample with Passenger ID = 500 shown alongside Oracle MLX, followed by the resulting explanation for PassengerID 500]
Local Explainer Sample Code
# Build the model using AutoML. 'model' is a subclass of type ADSModel.
# Note that the ADSExplainer below works with any model (classifier or
# regressor) that is wrapped in an ADSModel
import logging
from ads.automl.provider import OracleAutoMLProvider
from ads.automl.driver import AutoML
ml_engine = OracleAutoMLProvider(n_jobs=-1, loglevel=logging.ERROR)
oracle_automl = AutoML(train, provider=ml_engine)
model, baseline = oracle_automl.train()
# Create the ADS explainer object, which is used to construct
# global and local explanation objects. The ADSExplainer takes
# as input the model to explain and the train/test dataset
from ads.explanations.explainer import ADSExplainer
explainer = ADSExplainer(test, model, training_data=train)
# With ADSExplainer, create a local explanation object using
# the MLXLocalExplainer provider
from ads.explanations.mlx_local_explainer import MLXLocalExplainer
local_explainer = explainer.local_explanation(
    provider=MLXLocalExplainer())
# A summary of the local explanation algorithm and how to interpret
# the output can be displayed with
local_explainer.summary()
# Select a specific sample (instance/row) to generate a local
# explanation for
sample = 14
# Compute the local explanation on our sample from the test set
explanation = local_explainer.explain(test.X.iloc[sample:sample+1],
                                      test.y.iloc[sample:sample+1])
# Visualize the explanation for the label True (Survived). See
# the "Interpretation" section above for more information
explanation.show_in_notebook(labels=True)
• Data Science Platform
• ADS, ML
• scikit-learn, Keras, XGBoost, LightGBM
Oracle Functions
[Diagram labels: OCI Data Science; OCI API Gateway (REST endpoint, http://hoge:8080/invoke/..); OCI Functions Service; OCI Registry Service; Application (func.yaml, func.py, scorefn.py, requirements.txt); cURL]
• func.yaml
• func.py
• scorefn.py
• requirements.txt
• Fn, OCI Functions
• OCI API Gateway
• OCI (OCI Functions)
• REST (API Gateway)
• REST, OCI Functions
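The scoring code that sits behind the REST endpoint can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: the function names `load_model`, `predict` and `handle_request`, the `"input"` and `"prediction"` JSON keys, and the rule-based stand-in model are hypothetical, and the Fn FDK wiring that a real func.py needs is deliberately left out.

```python
import json


def load_model():
    """Hypothetical stand-in for deserializing the trained model artifact."""
    # A trivial rule keeps the sketch self-contained: predict 1 for "female".
    return lambda row: 1 if row.get("Sex") == "female" else 0


def predict(rows, model=None):
    """Score a list of input rows and return a JSON-serializable result."""
    model = model or load_model()
    return {"prediction": [model(row) for row in rows]}


def handle_request(body):
    """Parse the JSON body of a REST call, score it, and return JSON text."""
    rows = json.loads(body)["input"]
    return json.dumps(predict(rows))


# Simulate the body that a client such as cURL would POST to the endpoint.
request_body = json.dumps({"input": [{"Sex": "female", "Age": 36},
                                     {"Sex": "male", "Age": 22}]})
response = handle_request(request_body)
print(response)
```

Keeping model loading and prediction in their own functions, separate from request parsing, lets the scoring logic be tested locally before it is packaged and deployed as a function.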