SQL AND SEARCH WITH SPARK IN YOUR BROWSER
Romain Rigaux
100 000
WEB INTERFACE FOR ANALYZING DATA
WITH APACHE HADOOP


FREE AND OPEN SOURCE
—> USER TOOL
WHAT IS HUE?
CORE FEATURE: QUERY
1. Explore
2. Compose
3. Productionize
1. PREPARE, BROWSE
LIGHT ETL

http://livy.io/
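Livy exposes Spark through a REST API, which is what lets a browser tool submit snippets to a cluster. As a minimal sketch, the two requests an interactive session needs could be built as below. The host is a hypothetical placeholder (8998 is Livy's default port), the helper names are made up for illustration, and nothing is actually sent.

```python
import json

# Assumption: hypothetical Livy host; 8998 is Livy's default port.
LIVY_URL = "http://localhost:8998"


def create_session_request(kind="pyspark"):
    """Build the POST /sessions request that starts an interactive session."""
    return {
        "method": "POST",
        "url": f"{LIVY_URL}/sessions",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"kind": kind}),
    }


def run_statement_request(session_id, code):
    """Build the POST /sessions/{id}/statements request that runs a snippet."""
    return {
        "method": "POST",
        "url": f"{LIVY_URL}/sessions/{session_id}/statements",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"code": code}),
    }


if __name__ == "__main__":
    print(create_session_request())
    print(run_statement_request(0, "sc.parallelize(range(10)).sum()"))
```

Any HTTP client can send these dictionaries. The statement result is then fetched by polling the statement resource until it is no longer running.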
INDEXER

morphline.conf
schema.xml
+
+
Collection
2. COMPOSE
EDITOR(S)

API
HS2
HS2
oozie
solr
Oracle, PostgreSQL, MySQL, JDBC, Phoenix, MR, ...
SEARCH DASHBOARDS

/select
/admin/collections
/get
/luke...
/add_widget
/zoom_in
/select_facet
/select_range...
REST AJAX
Templates
+
JS Model
state:CA
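A dashboard action such as selecting the state:CA facet ends up as a filter on a Solr /select request. The sketch below shows one plausible way to build that URL. The Solr host, the collection name used in the example, and the helper name are assumptions for illustration, not Hue's actual code.

```python
from urllib.parse import urlencode

# Assumption: hypothetical Solr base URL; 8983 is Solr's default port.
SOLR_URL = "http://localhost:8983/solr"


def select_url(collection, query="*:*", filters=(), facet_fields=(), rows=10):
    """Build a /select URL with optional filter queries and field facets."""
    params = [("q", query), ("rows", rows), ("wt", "json")]
    # Each selected facet value becomes its own fq (filter query) parameter.
    params += [("fq", f) for f in filters]
    if facet_fields:
        params.append(("facet", "true"))
        params += [("facet.field", f) for f in facet_fields]
    return f"{SOLR_URL}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    # "logs" is a hypothetical collection name.
    print(select_url("logs", filters=["state:CA"], facet_fields=["state"]))
```

Keeping the user's selections in fq rather than in q leaves the main query untouched, so widgets can add and remove filters independently.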
3. PRODUCTIONIZE
WORKFLOWS
HDFS file
Saved query
SCHEDULER

/data/20160601
/data/20160602
/data/20160603
...
year=2016,month=06,day=01
year=2016,month=06,day=02
year=2016,month=06,day=03
...
Instance wf 1
Instance wf 2
Instance wf 3
Instance wf 4
workflow.xml
coordinator.xml
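The scheduler slide pairs each daily input directory with a table partition and one workflow instance. That mapping can be sketched in Python as follows. This only illustrates the date arithmetic, it is not the coordinator.xml syntax, and the function name is hypothetical.

```python
from datetime import date, timedelta


def daily_instances(start, days, base="/data"):
    """Yield one workflow instance per day with its input path and partition."""
    for i in range(days):
        day = start + timedelta(days=i)
        yield {
            "instance": i + 1,
            "path": f"{base}/{day:%Y%m%d}",
            "partition": f"year={day:%Y},month={day:%m},day={day:%d}",
        }


if __name__ == "__main__":
    for inst in daily_instances(date(2016, 6, 1), 3):
        print(inst["instance"], inst["path"], inst["partition"])
```

Each yielded entry corresponds to one "Instance wf N" on the slide: the same workflow definition, parameterized by the date of the run.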
DEMO TIME

Bank auditing: > 10 000$
WHAT IS NEXT 

- Improved Query experience
- Richer SQL experience
- Metadata search
- Single page layout UI
TWITTER
@gethue
USER GROUP
hue-user@
WEBSITE
http://gethue.com
LEARN
http://learn.gethue.com
THANKS!

