Data transformation
in AWS
#cloudconf2022
Alessandra Bilardi
Data & Automation Specialist
alessandra.bilardi@corley.it
corley.it
SUMMARY
The importance
Methods
AWS Services
Comparison
The importance of data transformation
1 Quantity
2 Quality
3 Noise
4 Compatibility
AWS data transformation methods
1 Extraction, parsing, reduction, cleaning, anonymization and encryption
2 Translation, typecasting, formatting, renaming, and mapping
3 Filtering, aggregation, summarization, indexing and ordering
4 Enrichment
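As a rough illustration of the four groups above, the sketch below applies one or two steps from each group to a tiny in-memory dataset, using only the Python standard library. The records, field names and the country lookup table are invented for this example and do not come from the talk; on AWS the same steps would run inside one of the services listed in the next slide. Hashing the e-mail address with plain SHA-256 is closer to pseudonymization than to strong anonymization, so treat it as a placeholder.

```python
import hashlib
from collections import defaultdict

# Hypothetical raw records; field names and values are invented for illustration.
RAW = [
    {"Email": "ann@example.com", "Country": "IT", "Amount": "10.50"},
    {"Email": "bob@example.com", "Country": "it", "Amount": "4.50"},
    {"Email": "cid@example.com", "Country": "FR", "Amount": ""},
    {"Email": "dan@example.com", "Country": "FR", "Amount": "7.00"},
]

# Hypothetical lookup table used for the enrichment step.
COUNTRY_NAMES = {"IT": "Italy", "FR": "France"}


def clean_and_anonymize(rows):
    """Group 1: drop rows with a missing amount and hash the e-mail address."""
    out = []
    for row in rows:
        if not row["Amount"]:
            continue  # cleaning: skip incomplete records
        digest = hashlib.sha256(row["Email"].encode("utf-8")).hexdigest()
        out.append({**row, "Email": digest})
    return out


def rename_and_typecast(rows):
    """Group 2: rename fields, normalize formatting, cast strings to numbers."""
    return [
        {
            "user_id": row["Email"],
            "country": row["Country"].upper(),
            "amount": float(row["Amount"]),
        }
        for row in rows
    ]


def aggregate_and_order(rows):
    """Group 3: sum amounts per country and sort by descending total."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["country"]] += row["amount"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def enrich(totals):
    """Group 4: attach the full country name from the lookup table."""
    return [
        {"country": code, "name": COUNTRY_NAMES.get(code, "unknown"), "total": total}
        for code, total in totals
    ]


def transform(rows):
    """Chain the four groups in the order they appear on the slide."""
    return enrich(aggregate_and_order(rename_and_typecast(clean_and_anonymize(rows))))


if __name__ == "__main__":
    for record in transform(RAW):
        print(record)
```

Each function covers one group, so a step can be swapped or dropped without touching the others.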
AWS data transformation services
1 AWS Glue
DataBrew
2 AWS Data Pipeline
3 Amazon SageMaker
Data Wrangler
4 Notebooks
Comparison of AWS data transformation services
Service          Difficulty  Execution time     Cost
Glue (Job)       🏖🤓🔧      17 secs (12 s)     $ 0.00
DataBrew         🏖🏖📚      1 min 19 secs      $ 0.16
Notebooks        🏖🤓🔧      2 mins (1.61s)     $ 0.00
SageMaker (Job)  🤓🔧🔧      5 mins (1m 12s)    $ 0.02
Data Wrangler    🏖📚🔧      5 mins (1m 12s)    $ 0.74
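One possible way to read the comparison is to rank the services by cost first and by execution time second. The sketch below does that with the figures from the table. It uses only the first execution time of each row, because the slides do not say what the value in parentheses measures, and the ranking rule itself is an assumption, not a recommendation from the talk.

```python
import re

# Figures copied from the comparison table. Only the first (unparenthesized)
# execution time is kept; the meaning of the second value is not stated.
SERVICES = [
    ("Glue (Job)", "17 secs", 0.00),
    ("DataBrew", "1 min 19 secs", 0.16),
    ("Notebooks", "2 mins", 0.00),
    ("SageMaker (Job)", "5 mins", 0.02),
    ("Data Wrangler", "5 mins", 0.74),
]

UNIT_SECONDS = {"sec": 1, "min": 60}


def to_seconds(text):
    """Convert a duration such as '1 min 19 secs' to a number of seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total


def rank(services):
    """Sort by cost in USD, then by execution time in seconds."""
    return sorted(services, key=lambda s: (s[2], to_seconds(s[1])))


if __name__ == "__main__":
    for name, duration, cost in rank(SERVICES):
        print(f"{name:<16} {to_seconds(duration):>4} s  $ {cost:.2f}")
```

The difficulty column is left out on purpose: it is a qualitative rating, and how much weight it deserves depends on the questions in the next slide.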
Questions
1 What is my goal?
2 How experienced am I?
3 How many resources do I have?
4 What is the right tool?
Contacts
Thanks for listening.
Data transformation on AWS - 2022-10-11