Introduction to Google's Big Data Solutions
Ismael Yuste
Strategic Cloud Engineer, Google Cloud
MSMK Madrid, September 21, 2017
Agenda
● Google Cloud Platform
● Big Data
● Machine Learning
● Use Cases
Google Cloud
Platform
Google Data Centers
Google's data centers are the foundation of the entire Google Cloud platform. They provide computing power, storage, memory, and GPUs for our applications. They also host the core of applications such as Gmail, YouTube, and Search.
● Speed
● Low latency
● Operational efficiency
● Energy efficiency
● Use of renewable energy
● Proximity to the user
● Information security
Google Datacenters - Cloud Regions
Big Data
End-to-end integrated Big Data solutions that let you capture, process, and store data on a single platform. They combine cloud-native services with managed open source tools, for both real-time and batch workloads.
Big Data
● BigQuery
● Cloud Dataflow
● Cloud Dataproc
● Cloud Datalab
● Cloud Pub/Sub
● Genomics
Big Data - BigQuery
Your enterprise data warehouse: fast, economical, and fully managed, for analyzing large datasets.
● Flexible data ingestion.
● Global availability.
● Built-in security and permissions.
● Cost control.
● Highly available.
● Fully integrated.
● Connects with other Google products.
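To give a feel for the kind of ad hoc analysis a data warehouse such as BigQuery is meant for, here is a minimal sketch of an aggregate SQL query. It runs against Python's built-in sqlite3 module as a local stand-in, not against BigQuery; the `sales` table, its columns, and its rows are made up for illustration.

```python
import sqlite3

# Hypothetical rows: (country, day, amount). Not taken from the slides.
ROWS = [
    ("ES", "2017-09-01", 120.0),
    ("ES", "2017-09-02", 80.0),
    ("FR", "2017-09-01", 50.0),
    ("FR", "2017-09-03", 70.0),
    ("PT", "2017-09-02", 30.0),
]

# A typical warehouse-style question: orders and revenue per country.
QUERY = """
    SELECT country, COUNT(*) AS orders, SUM(amount) AS revenue
    FROM sales
    GROUP BY country
    ORDER BY revenue DESC
"""


def revenue_by_country(rows):
    """Load rows into an in-memory table and run the aggregate query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (country TEXT, day TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()
```

Calling `revenue_by_country(ROWS)` returns one tuple per country, highest revenue first. In a managed warehouse the query text would look much the same; what changes is that storage, scaling, and execution are handled by the service.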
Big Data - Cloud Dataflow
A fully managed service and programming model for Big Data processing.
● Integrated resource management.
● On demand.
● Intelligent job execution.
● Autoscaling.
● Unified programming model.
● Open source.
● Monitoring.
● Integration.
● Reliable and consistent processing.
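The "unified programming model" bullet can be illustrated with a small sketch: one function that counts events per fixed time window and works the same on a finite list (batch) or on a generator that keeps producing events (streaming). This is plain Python, not the Apache Beam or Dataflow API; the function name and the event shape `(timestamp_seconds, key)` are assumptions made for the example, and it assumes events arrive in timestamp order.

```python
from collections import Counter


def windowed_counts(events, window_seconds):
    """Yield (window_start, {key: count}) for each fixed window.

    Assumes events arrive ordered by timestamp. Real streaming engines
    also handle late and out-of-order data, which this sketch does not.
    """
    current_start = None
    counts = Counter()
    for timestamp, key in events:
        start = (timestamp // window_seconds) * window_seconds
        if current_start is not None and start != current_start:
            # A later window has begun, so the previous one can be emitted.
            yield current_start, dict(counts)
            counts = Counter()
        current_start = start
        counts[key] += 1
    if current_start is not None:
        # The input ended: emit whatever is left in the last window.
        yield current_start, dict(counts)
```

With a list, `list(windowed_counts(events, 60))` gives all windows at once. With a generator as input, each window is yielded as soon as an event from the next window shows up, so results can be consumed while data is still arriving.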
Big Data - Cloud Dataproc
Managed Spark and Hadoop service.
● Integrated cluster management.
● Resizable clusters.
● Integration.
● Versioning.
● Management tools.
● Initialization actions.
● Manual or automatic management.
● Flexible virtual machines.
Big Data
Datalab. A tool for exploring, analyzing, and visualizing Big Data.
Pub/Sub. A global, real-time service for messaging and data streaming.
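The publish/subscribe pattern behind a messaging service like Pub/Sub can be sketched in a few lines. This is an in-memory toy, not the Cloud Pub/Sub client library: the `Topic` class and its methods are hypothetical names, and features such as acknowledgements, retries, and persistence are left out.

```python
from collections import deque


class Topic:
    """In-memory topic that fans each message out to every subscription."""

    def __init__(self):
        self._subscriptions = {}

    def subscribe(self, name):
        # Each subscription has its own queue, so consumers read independently.
        self._subscriptions.setdefault(name, deque())

    def publish(self, message):
        # Publishers do not know who consumes; they only write to the topic.
        for queue in self._subscriptions.values():
            queue.append(message)

    def pull(self, name):
        """Return the next pending message for a subscription, or None."""
        queue = self._subscriptions[name]
        return queue.popleft() if queue else None
```

For example, after subscribing `"fraud"` and `"analytics"` to the same topic, a single `publish` call makes the message available to both, which is what decouples the producers of events from the systems that process them.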
Big Data
Dataprep. An intelligent data service for exploring, cleaning, and preparing structured or unstructured data for later analysis.
Data Studio. Turns your data into reports and dashboards that are easy to build, easy to share, and fully customizable, from data sources such as BigQuery, Analytics, or YouTube.
Data Lifecycle Steps
Ingest
The first stage is to pull in the raw data, such as streaming data from devices, on-premises batch data, application logs, or mobile-app user events and analytics.
Store
After the data has been retrieved, it needs to be stored in a format that is durable and can be easily accessed.
Process & Analyze
In this stage, the data is transformed from raw form into actionable information.
Explore & Visualize
The final stage is to convert the results of the analysis into a format that is easy to draw insights from and to share with colleagues and peers.
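The four lifecycle stages can be sketched as four small functions chained together. This is a toy illustration in plain Python with made-up log lines; a list stands in for durable storage, and none of the function names correspond to a Google Cloud API.

```python
import json
from collections import Counter

# Hypothetical raw input: application log lines, one of them malformed.
RAW_LOGS = [
    '{"user": "ana", "event": "search"}',
    '{"user": "luis", "event": "purchase"}',
    "not valid json",
    '{"user": "ana", "event": "purchase"}',
]


def ingest(lines):
    """Ingest: parse raw lines, skipping the ones that are not valid JSON."""
    records = []
    for line in lines:
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            continue
    return records


def store(records, storage):
    """Store: append records to the store (a plain list stands in here)."""
    storage.extend(records)
    return storage


def process(storage):
    """Process & analyze: turn raw events into counts per event type."""
    return Counter(record["event"] for record in storage)


def visualize(counts):
    """Explore & visualize: render counts as a small text bar chart."""
    return [f"{event:<10}{'#' * n}" for event, n in sorted(counts.items())]
```

Chaining them as `visualize(process(store(ingest(RAW_LOGS), [])))` walks one batch of data through all four stages. In practice each stage would be backed by a different product, which is the mapping the next slide shows.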
© 2017 Google Inc. All rights reserved.
Products to Support Data Lifecycle
Ingestion: Cloud Pub/Sub, Stackdriver Logging, Cloud Transfer Service
Storage: Cloud Storage, Cloud SQL, Cloud Datastore, Cloud Bigtable, BigQuery, Cloud Spanner
Process & Analyze: Cloud Dataflow, Cloud Dataproc, BigQuery
Explore & Visualize: Cloud Console, Google Data Studio, Google Sheets, Cloud Datalab, BI/Analytics Partners
Typical Big Data Jobs
● Programming
● Resource provisioning
● Performance tuning
● Monitoring
● Reliability
● Deployment & configuration
● Handling growing scale
● Utilization improvements
Big Data with Google
Focus on insights.
Not infrastructure.
From batch to real-time.
Programming
Understanding
Data & Analytics
Cloud Dataproc. Fully managed Hadoop and Spark with industry-leading performance.
BigQuery. Fully managed data warehouse for large-scale analytics.
Cloud Dataflow. Real-time data pipelines, with open source SDK via Apache Beam.
Separation of Storage and Compute
● Access any storage system from any processing tool
● Keep as much data as you want, economically
● Share data in place, no more FTP and copying
Storage: BigQuery Storage (tables), Cloud Bigtable (NoSQL), Cloud Storage (files)
Processing: BigQuery Analytics, Cloud Dataproc, Cloud Dataflow
10+ years of Big Data innovation - Open Source
[Timeline figure, 2002 to 2015]
Google Papers: GFS, MapReduce, BigTable, Dremel, FlumeJava, Millwheel, PubSub, Dataflow, TensorFlow
Open Source: Apache Beam (Incubating)
Google Cloud Products: BigQuery, Pub/Sub, Dataflow, Bigtable
Product Mapping
● BigQuery
● Cloud Dataflow
● Cloud Dataproc
● Cloud Datalab
● Cloud Pub/Sub
Machine Learning
Google Cloud ML Platform provides modern machine learning services, with pre-trained models and a service for building your own models.
Machine Learning
● Cloud Machine Learning
● Vision API
● Speech API
● Natural Language API
● Translation API
● Jobs API
Machine Learning - Cloud ML
Machine learning on data of any type and volume.
● Prediction at scale.
● Simple model building.
● Deep learning capabilities.
● Integration.
● HyperTune.
● Managed, scalable service.
● Portable models.
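HyperTune refers to automated hyperparameter tuning. As a rough illustration of the idea only, the sketch below runs a plain random search over one hyperparameter. Random search is a deliberately simple stand-in and is not claimed to be the strategy the service uses; the toy objective `validation_loss` and its minimum at 0.1 are invented for the example.

```python
import random


def validation_loss(learning_rate):
    """Toy objective with its minimum at learning_rate = 0.1 (made up)."""
    return (learning_rate - 0.1) ** 2


def random_search(objective, low, high, trials, seed=0):
    """Try `trials` random values in [low, high] and keep the best one."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    best_value, best_loss = None, float("inf")
    for _ in range(trials):
        candidate = rng.uniform(low, high)
        loss = objective(candidate)
        if loss < best_loss:
            best_value, best_loss = candidate, loss
    return best_value, best_loss
```

In a real tuning job the objective would be a full training run that reports a validation metric, so each trial is expensive; that is why a managed service that schedules trials and picks candidates carefully is useful.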
Machine Learning - APIs
Vision API. Analyzes images with the power of Google.
Speech API. Converts speech to text with the power of the cloud.
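The pre-trained APIs are called over REST with a JSON body. As a hedged sketch, the function below builds a label-detection request body for the Vision API `images:annotate` method without sending it; authentication (an API key or OAuth token) and the HTTP call itself are left out, and the helper name `build_label_request` is an assumption made for this example.

```python
import base64

# REST endpoint of the Vision API annotate method.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_bytes, max_results=5):
    """Build the JSON body for a label-detection request (nothing is sent)."""
    return {
        "requests": [
            {
                # Image content travels as base64-encoded text inside the JSON.
                "image": {
                    "content": base64.b64encode(image_bytes).decode("ascii")
                },
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }
```

Serializing the returned dictionary with `json.dumps` and sending it in a POST to `VISION_ENDPOINT` with valid credentials would be the next step; other feature types follow the same request shape.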
Machine Learning - APIs
Natural Language API. Derives insights from unstructured text with Cloud ML.
Translation API. Translates on the fly between thousands of language pairs.
Machine Learning - APIs
Jobs API. Powers your job portal with Cloud ML.
Cloud Video Intelligence API. Analyzes your videos and extracts information from them.
References for staying up to date
Google Cloud Platform Blog
Google Cloud Platform website
GCP Twitter
Google+ GCP Community
GCP Podcast
Google Cloud Platform YouTube channel
Use cases
When art meets big data: Analyzing 200,000 items from The Met
collection in BigQuery
Today we’re adding a new public dataset to
Google BigQuery: over 200,000 items from The
Metropolitan Museum of Art (aka “The Met”),
representing all its public domain art from a
total of 1.5 million art objects. The Met Museum
Public Domain dataset includes metadata about
each piece of art, along with an image or
images of the artifact. Google and The Met
Museum have been close collaborators for
years through Google Arts & Culture and we’re
incredibly excited to bring the museum's public
dataset to BigQuery.
Use cases
Traveloka’s journey to stream analytics on Google Cloud Platform
Traveloka is a travel technology company based
in Jakarta, Indonesia, currently operating in six
countries. Founded in 2012 by former Silicon
Valley engineers, its goal is to revolutionize
human mobility.
One of the most strategic parts of our business
is a streaming data processing pipeline that
powers a number of use cases, including fraud
detection, personalization, ads optimization,
cross selling, A/B testing, and promotion
eligibility. That pipeline is also used by our
business analysts for monitoring and
understanding business metrics, both for
historical analysis and in real time.
Use cases
Getting Your Feet Wet in the Data Lake: Analytics 360 in BigQuery
Benefits for Data Engineers, Analysts and
Marketers
As a Big Data platform, BigQuery offers benefits
for multiple stages and roles in the Big Data
process:
For marketers and analysts, you can run ad hoc
queries and get the results within minutes or
seconds. The elusive quest for understanding
online and offline attribution, user funnels, and
long-term customer value comes within reach.
For data engineers, BigQuery offers a
tremendous operational benefit, as outlined in
the next section.
Use cases
How WePay uses stream analytics for real-time fraud detection
using GCP and Apache Kafka
When payments platform WePay was founded in 2008,
MySQL was our only backend storage. It served its purpose
well when data volume and traffic throughput were relatively
low, but by 2016, our business was growing rapidly and they
were growing along with it. Consequently, we started to see
performance degradation to the point where we could no
longer run concurrent queries without a negative impact on
latency.
Clearly, we needed a new stream analytics pipeline for fraud
detection that would give us answers to queries in near-real
time without affecting our main transactional business
system. In this post, I’ll explain how we built and deployed
such a pipeline to production using Apache Kafka and
Google Cloud Platform (GCP) services like Google Cloud
Dataflow and Cloud Bigtable.
Questions?
Ismael Yuste
linkedin.com/in/ismaelyuste/
@IsmaelYuste

Modern Thinking área digital MSKM 21/09/2017
