Google App Engine
A Big Data Laboratory?
J Singh, Early Stage IT
March 20, 2012
© J Singh, 2011
App Engine as a Big Data Laboratory?
• Why bother? Why not use Hadoop?
• Evaluating App Engine as a Big Data Laboratory
– Loading Data
– Analytics Capabilities
– Visualization Capabilities
• Conclusions
Why Bother? Why not Hadoop?
• No install and configuration required
– Focus on the task: Analytics and Visualization
– Use the technology that powers Google Earth and Google Finance
• Works with Google Datastore
– Makes sense if your data is already there
• No import/export of data necessary
• But a purely ‘low-level’ programming environment
– Write Map and Reduce functions in Python / Java
– No Pig, Hive, …
• Is this story for real? We wanted to find out.
Loading Data into GAE
• What? No native OS environment to work in?
– No OS commands, no file system accessible to the programmer
– Data Prep must be done elsewhere.
• But other options exist
1. Upload a file into Blobstore through an HTTP request
• Max object size 2GB, max get/put in one call: 1MB.
• Process into Datastore entities using BlobstoreInputReader or
BlobstoreZipInputReader classes.
2. Use remote_api to upload CSV files
• It's painful
– Only needs to be done one-time, we hope
– Or we need to set up a process for staging and feeding the data
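A staging process of the kind mentioned above could start by splitting a prepared CSV file into batches that each fit within one call. The following is a minimal sketch in plain Python 3, not App Engine code: `batch_rows`, the byte estimate, and the sample data are assumptions made for illustration, and the 1MB figure from the slide is treated as a simple byte budget.

```python
import csv
import io

MAX_CALL_BYTES = 1_000_000  # the slide's 1MB per-call limit, used as a byte budget


def batch_rows(csv_text, max_bytes=MAX_CALL_BYTES):
    """Yield lists of row dicts whose estimated size stays under max_bytes."""
    reader = csv.DictReader(io.StringIO(csv_text))
    batch, size = [], 0
    for row in reader:
        # Estimate the payload size from the UTF-8 length of the row's values.
        row_bytes = sum(len(str(v).encode("utf-8")) for v in row.values())
        # Close the current batch before it would exceed the budget.
        if batch and size + row_bytes > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(row)
        size += row_bytes
    if batch:
        yield batch


# Made-up sample data; a tiny budget forces one row per batch.
sample = "city,temp\nBoston,41\nAustin,75\nDenver,52\n"
batches = list(batch_rows(sample, max_bytes=10))
```

Each batch would then be handed to whichever upload path is in use (an HTTP request into Blobstore or remote_api); that step is left out because it depends on the App Engine SDK.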
Data Analysis: NumPy and SciPy
• NumPy and SciPy libraries using the traditional computing
model (not Map/Reduce) include:
– Array and Matrix manipulation
– Optimization algorithms, e.g., curve fitting, linear regression,
multi-variate regression.
– Multithreading (for embarrassingly parallel problems)
• Replace map(…) with parallel_map(…).
– map is a Python primitive
– parallel_map is a NumPy primitive
– Other scientific algorithms, e.g., Kalman Filtering, Signal
smoothing, Markov Chains.
• NumPy and SciPy depend on the Python 2.7 runtime
– Enabled in Fall 2011
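The "replace map with a parallel map" idea for embarrassingly parallel problems can be sketched with the Python 3 standard library alone. The `parallel_map` below is a hypothetical helper built on `concurrent.futures`; it is not the function the slide refers to, and `square` is just a stand-in workload.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(func, items, workers=4):
    """Apply func to every item on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))


def square(x):
    return x * x


# Same call shape, same results; only the execution strategy differs.
serial = list(map(square, range(6)))
parallel = parallel_map(square, range(6))
```

Because each item is processed independently, the two calls are interchangeable. In CPython, threads mainly help with I/O-bound work, so the gain for pure computation may be small.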
Data Analysis: MapReduce
• Input Reader
– Several provided by GAE, can write your own
• Map function: Written by Programmer
• Shuffle function: Provided
– Can write your own overrides for partitioning
(sharding) and comparison (use in sort)
• Reduce function: Written by Programmer
– Can be skipped if not needed
• Output Writer
– Several provided by GAE, can write your own
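The five stages listed above can be illustrated with a word count in plain Python 3. This is a single-process sketch of the data flow only; none of the function names below come from the GAE MapReduce library, and the real shuffle runs across shards rather than in memory.

```python
from collections import defaultdict


def input_reader(lines):
    """Input reader: hand records to the mapper one at a time."""
    for line in lines:
        yield line


def map_fn(line):
    """Map: emit a (word, 1) pair for every word in the record."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group values by key, then order the groups by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_fn(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def run_job(lines):
    mapped = (pair for rec in input_reader(lines) for pair in map_fn(rec))
    # Output writer: here, simply a dict built from the reduced pairs.
    return dict(reduce_fn(k, vs) for k, vs in shuffle(mapped))


counts = run_job(["big data lab", "big data on app engine"])
```

Only `map_fn` and `reduce_fn` correspond to code the programmer writes; the other stages stand in for what the framework provides.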
Data Analysis: Pipeline API
• Based on Python Generator functions
• Allows chaining of map reduce jobs
– Primitives for setting up various types of chains
• MapreducePipeline (prev page) was just one type of pipeline
• Available for Python or Java
– Python side better documented
Split and Merge example
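# Imports assume the standalone App Engine pipeline package layout.
import pipeline
from pipeline import common

class aPipe(pipeline.Pipeline):
    def run(self, e_kind, prop_name, *value_list):
        # Split: start one bPipe child stage (defined elsewhere) per value.
        all_bs = []
        for v in value_list:
            stage = yield bPipe(e_kind, prop_name, v)
            all_bs.append(stage)
        # Merge: combine the children's outputs into one list.
        yield common.Append(*all_bs)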
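To show why generator functions suit this kind of chaining, here is a toy driver in plain Python 3 that mimics the split-and-merge shape without App Engine. Everything in it (`b_stage`, `append_stage`, `a_pipe`, `run`) is hypothetical. It runs stages immediately and in order, so it does not reproduce the scheduling of the actual Pipeline API.

```python
def b_stage(value):
    """Hypothetical child stage: returns a one-item list for its input."""
    return [value * 2]


def append_stage(*lists):
    """Hypothetical merge stage: concatenate the child results."""
    merged = []
    for items in lists:
        merged.extend(items)
    return merged


def a_pipe(*values):
    """Parent stage: fan out one child per value, then merge the results."""
    results = []
    for v in values:
        # Each yield describes a child stage; the driver sends back its result.
        result = yield (b_stage, (v,))
        results.append(result)
    yield (append_stage, tuple(results))


def run(gen):
    """Toy driver: run each yielded stage and send its result back in."""
    result = None
    try:
        stage, args = next(gen)
        while True:
            result = stage(*args)
            stage, args = gen.send(result)
    except StopIteration:
        # The last stage's result is the output of the whole chain.
        return result


merged = run(a_pipe(1, 2, 3))
```

The parent never calls its children directly; it only yields descriptions of work, which is what lets a framework decide where and when each stage runs.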
Data Visualization
• App Engine supports multiple web frameworks for serving data
directly from the Datastore into an HTML5 browser:
– Django, Jinja2, CherryPy, …
• Options:
– jQuery Visualize
– Google Visualization API
• Including MotionCharts
– Hans Rosling's Visualization API
– Check out his TED talk
• Conclusion:
– A rich set of facilities for visualization and taking action
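On the server side, feeding a chart mostly means turning query results into JSON the browser library can read. The sketch below, in plain Python 3, builds the `cols`/`rows` literal structure that the Google Visualization API accepts for a DataTable. `to_datatable` is a hypothetical helper, the rows are made-up sample values, and fetching entities from the Datastore is left out.

```python
import json


def to_datatable(rows, columns):
    """Build a DataTable-style dict from row dicts.

    columns is a list of (name, type) pairs, e.g. ("year", "number").
    """
    return {
        "cols": [{"id": name, "label": name, "type": ctype}
                 for name, ctype in columns],
        "rows": [{"c": [{"v": row[name]} for name, _ in columns]}
                 for row in rows],
    }


# Made-up sample rows standing in for entities read from the Datastore.
rows = [{"country": "India", "year": 2010, "income": 3.2},
        {"country": "Kenya", "year": 2010, "income": 1.6}]
columns = [("country", "string"), ("year", "number"), ("income", "number")]

table = to_datatable(rows, columns)
payload = json.dumps(table)  # what a request handler would send to the page
```

A handler in any of the frameworks listed above could return `payload`, leaving the page script to pass it to the charting library.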
Decision Factors
Usage / Discussion
• Proof of Concept or Demo in GAE
– Need a process for Data Loading
– But saves on having to do Hadoop setup
– Absence of Pig/Hive may be a limiting factor
– Advantage in Visualization
– Better security and isolation than Hadoop
• Production in GAE
– Analyze cost before committing
– Lock-in risk?
• Production elsewhere
– Good semantic match between Datastore and HBase
– Need to do Hadoop setup and operation
Thank you
• J Singh
– President, Early Stage IT
• Technology Services and Strategy for Startups
• DataThinks.org is a new service of Early Stage IT
– “Big Data” analytics solutions
