IaaS, PaaS, and DevOps
for Data Science
Dmitry Petukhov,
Machine Learning Preacher, Microsoft AI MVP && Coffee Addict
#Azure + #AI = #AzureDay
Agenda
−IaaS
−PaaS
−Architecture
−DevOps
… for Data Science
Who is who?
CRISP-DM
Picture credits: Wikipedia
Scrum
Hypothesis
Raw Data Collection
Receive -> Aggregate -> Transform -> Store
Investigate Data
Data structure (metadata)
Data distribution
Visualize data
Descriptive statistics
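The "Investigate Data" step can be illustrated with a minimal sketch that computes a few descriptive statistics using only the Python standard library. The function name `describe` and the sample values are assumptions made for illustration, not part of the original slides.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a non-empty list of numbers."""
    ordered = sorted(values)
    return {
        "count": len(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "stdev": statistics.stdev(ordered),  # sample standard deviation
    }


# Hypothetical sample, used only to exercise the function.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
summary = describe(sample)
```

A summary like this is usually the first check of data distribution before any visualization.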
Pre-processing
Clean & consistent:
- Handling missing data
- Handling incorrect data
- Sensitive/private data
Convert data types (e.g. formatting culture-sensitive data)
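The pre-processing items above (missing data, incorrect data, culture-sensitive formats) could be sketched as follows. This is one possible approach under stated assumptions: numbers arrive as strings with a comma as the decimal separator, and missing or unparseable entries are imputed with the column mean. The helper names `parse_decimal` and `clean_column` are hypothetical.

```python
def parse_decimal(text, decimal_sep=","):
    """Parse a number written with a culture-specific decimal separator."""
    thousands_sep = "." if decimal_sep == "," else ","
    cleaned = text.strip().replace(" ", "").replace(thousands_sep, "")
    return float(cleaned.replace(decimal_sep, "."))


def clean_column(raw_values, decimal_sep=","):
    """Convert strings to floats and impute missing or invalid entries with the mean."""
    parsed = []
    for raw in raw_values:
        try:
            parsed.append(parse_decimal(raw, decimal_sep))
        except (ValueError, AttributeError):
            # None (missing) and unparseable text (incorrect) are both marked as gaps.
            parsed.append(None)
    known = [v for v in parsed if v is not None]
    if not known:
        raise ValueError("column has no valid values")
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in parsed]


# Hypothetical raw column with one missing and one incorrect entry.
cleaned = clean_column(["1.234,5", "10,5", None, "n/a", "3"])
```

Mean imputation is only one option; dropping rows or using the median may suit a given dataset better.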
Feature Extraction*
Training Dataset
Machine Learning Workflow
Data Sources
Splitting Data
Test Dataset
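Splitting data into training and test datasets can be sketched with the standard library alone. The function name, the default test fraction, and the fixed seed are assumptions for illustration; the slides do not prescribe a particular split.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # local RNG, global state untouched
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train, test = train_test_split(range(10), test_fraction=0.3)
```

The test dataset is set aside here and should only be used for the final model evaluation.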
To be continued…
Software Engineering Workflow
Machine Learning Workflow
Software Engineering Workflow
Model evaluation
Evaluate model quality metrics
(ROC, RMSE, F-score, etc.)
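Two of the metrics named above, RMSE and F-score, can be computed in a few lines. This is a minimal sketch using their standard definitions; the function names and the binary 0/1 label encoding are assumptions.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error for regression predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def f_score(y_true, y_pred, beta=1.0):
    """F-beta score for binary labels encoded as 0 (negative) and 1 (positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


regression_error = rmse([3.0, 5.0], [1.0, 5.0])
f1 = f_score([1, 1, 0, 0], [1, 0, 1, 0])
```

RMSE applies to regression models and F-score to classifiers, so a given model is normally judged by only one of them.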
Feature Selection**
Feature Selection
Feature Scaling (Normalization)
Dimension Reduction
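Feature scaling (normalization) can be sketched in two common variants, min-max scaling and standardization. The function names are assumptions, and the slides do not say which variant the author had in mind.

```python
import math


def min_max_scale(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature carries no information
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


scaled = min_max_scale([10, 20, 30])
z_scores = standardize([1.0, 2.0, 3.0])
```

In practice the scaling parameters should be computed on the training dataset only and then reused for the test dataset.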
Final Model
Training ML algorithm
Model Publication
Revision
Final Model Evaluation
Data Flow
Cross-validation
Source 0xCODE.IN
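The cross-validation step in the workflow above can be illustrated with a k-fold index generator. This is a sketch of plain (unshuffled, unstratified) k-fold splitting; the function name `k_fold_indices` is an assumption.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


folds = list(k_fold_indices(7, 3))
```

Each sample lands in exactly one validation fold, so averaging the metric over the folds uses all of the training data for evaluation.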
Classic service requirements:
−Distributed
−Highly scalable
−Fault-tolerant
−Reliable
−Big Data ready
−OSS-based
−Lawful
Intelligent service requirements:
−Distributed
−Highly scalable
−Fault-tolerant
−Reliable
−Big Data ready
−OSS-based
−Lawful
Data Scientist
Data (SDD)
On-Premises "On-the-Knee" (Makeshift) Data Science
1. code -> 2. compute -> 3. result -> 4. commit
Flexible, Distributed, Highly scalable, Fault-tolerant, Reliable, OSS-based, Big Data ready, Secure, Lawful
Pros and Cons:
+ Quick start and easy to use
+ Flexible
+ Markdown, code and visualization in
one place
+ Quick check of hypotheses
- All code in one(!) file
- Tedious version control
- Not scalable
- Will never reach production
AI Lab. Where?
[Figure: the same stack shown for four hosting options (On-premises, IaaS, PaaS, Model as a Service) on a scale from Unmanaged to Managed. Stack layers: Intelligent App, ML Model, Data, IDEs and Tools, Languages, ML/DL Frameworks, Runtime, OS, Virtualization, Servers, Storage, Networking.]
Pros and Cons:
+ Quick start and easy to use
+ Latest hardware
+ Preinstalled ML/DL frameworks
+ Horizontally scalable
+ On demand => zero acquisition cost
- High cost (24h)
- Not for production
- Old versions of frameworks
- Zoo of unused tools
IaaS: Data Science VM Images
Picture credit: Azure Docs
NVIDIA Tesla GPU(s)
Ubuntu / Windows Server
TensorFlow / CNTK (via Keras)
RStudio / Visual Studio
Azure NC/ND Series VMs
R / Python / C#
XGBoost / LightGBM / CatBoost
Flexible, Distributed, Highly scalable, Fault-tolerant, Reliable, OSS-based, Big Data ready, Secure, Lawful
Microsoft ML Server
Azure Blob Storage
Apache Spark / Hadoop
Tasks
Azure HDInsight
Head Node
Worker Node
HDFS API
Tasks
PaaS: Data Science VM Images
Flexible, Distributed, Highly scalable, Fault-tolerant, Reliable, OSS-based, Big Data ready, Secure, Lawful
Data Scientist
MS ML Client
<connect>
Pros and Cons:
+ as PaaS(!)
+ Distributed, scalable, etc.
+ REST API
+ Python and R
- High cost
- Not all operations / ML algorithms can be distributed
- Not all OSS
- Proprietary output
λ Architecture
Data sources: Data Streams, Enterprise Data Sources, Others
Batch Layer: Train ML model, Tune hyperparams, Cross-validation, Preprocessing data
Speed Layer: Inference ML model, Stream analysis, Preprocessing data
Distributed File System
[Diagram: numbered step markers 1-5 and 1-6 on the two layers.]
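The division of work between the two layers can be sketched with a toy example: the batch layer retrains on the full history, and the speed layer scores each incoming event with the latest published model. Everything here is an assumption for illustration, including the class names and the "model" itself, which is just the mean of the stored values.

```python
class BatchLayer:
    """Keeps the full history and periodically retrains the model on it."""

    def __init__(self):
        self.history = []  # stands in for the distributed file system

    def store(self, value):
        self.history.append(value)

    def train(self):
        # Toy "model": the mean of all stored values (illustrative only).
        if not self.history:
            raise ValueError("no data to train on")
        return sum(self.history) / len(self.history)


class SpeedLayer:
    """Scores each incoming event with the most recently published model."""

    def __init__(self, model):
        self.model = model

    def update_model(self, model):
        self.model = model

    def infer(self, event):
        # Flag events that exceed the baseline learned by the batch layer.
        return event > self.model


batch = BatchLayer()
for value in [10, 12, 11, 13]:
    batch.store(value)

speed = SpeedLayer(batch.train())   # initial baseline from the batch layer
first_flag = speed.infer(20)        # scored against the initial baseline
batch.store(20)                     # streamed events also land in the history
batch.store(40)
speed.update_model(batch.train())   # publish the retrained model
second_flag = speed.infer(15)       # scored against the retrained baseline
```

The design point is that inference never waits for training: the speed layer keeps answering with the last published model until the batch layer publishes a new one.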
Architecture as Service: λ Architecture for Data Science
Data sources: Market Data Streams, Macroeconomic Data, Tweets Feed
Azure services: Azure Event Hub, Azure Data Factory, Azure Storage (Store data)
Training pipeline: Train ML model, Tune hyperparams, Cross-validation, Preprocessing data
Inference pipeline: Inference ML model, Aggregate stats, Preprocessing data
h(θ0, θn)
[Diagram: numbered step markers 1-6 on each pipeline.]
Flexible, Distributed, Highly scalable, Fault-tolerant, Reliable, OSS-based, Big Data ready, Secure, Lawful
Azure HDInsight
Azure SQL +
Microsoft ML Server
Data Scientist + web browser
<SSH tunnel>
Azure Storage
Azure Data Factory
Azure VM
Azure GPU Instances or Memory-intensive Instances
Azure / On-premises
Data flow
Tasks flow
Azure Batch (AI)
Data transfer inside one DC
Low-priority VM
Storage options: tiers, redundancy
PremiumRS tier
B-series VM
Scheduling
Provisioning
Use Open Source
Only for Big Data cases
DevOps for Data Science
© 2018, Dmitry Petukhov. CC BY-SA 4.0 license. Microsoft and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
Thank you!
Q&A
Now or later (see contacts below)
Stay connected
Friend me on Facebook/@codez0mb1e
Read me on Habr/@codezombie
All contacts on http://0xCode.in/@codez0mb1e
Download slides from
http://0xcode.in/2018/azure-day-moscow or


Editor's Notes

  • #10: If there are no legal restrictions. Let's name them in alphabetical order.
  • #11: Swiss Army knife
  • #12: Done with IaaS, moving on to PaaS. A cup of coffee and DFS. App servers: ML Server, H2O, SparkR
  • #13: In real life there are 2 data pipelines: training and inference. An architecture from the CS world that will help people in Data Science.
  • #18: (c) 2018, Dmitry Petukhov. CC BY-SA 4.0 license.