Text Analytics & Linked Data Management As-a-Service
Marin Dimitrov, Alex Simov, Yavor Petkov
May 31st, 2015
Text Analytics & Linked Data Management -aaS / Wasabi’2015 / May 2015
About Ontotext
• Provides products & solutions for content enrichment and metadata management
– 70 employees, headquarters in Sofia (Bulgaria)
– Sales presence in London, NYC & Boston
• Major clients and industries
– Media & Publishing
– Health Care & Life Sciences
– Cultural Heritage & Digital Libraries
– Government
– Education
Contents
• Semantic Technology adoption challenges
• The Self-Service Semantic Suite (S4)
• Lessons learned
Semantic Technology Adoption Challenges
Time-to-value gap (Gartner)
[Figure, from Wasabi @ ESWC’2014; labels: Performance, Integration, Penetration, Payback & ROI]
Semantic Technology adoption
• Limiting factors
– Complexity & cost of existing solutions
– Limited resources to evaluate novel technologies (startups)
– Slow procurement processes, risk aversion (enterprises)
• How can we…
– Reduce time-to-market
– Reduce adoption risks
– Optimise costs
The Self-Service Semantic Suite (S4)
What is S4?
• Capabilities for text analytics, content enrichment and smart data management
– Text analytics for news, life sciences and social media
– RDF graph database as-a-service
– Access to large open knowledge graphs
• Available on-demand, anytime, anywhere
– Simple RESTful services
• Simple pay-per-use pricing
– No upfront commitments
Benefits
• Enables quick prototyping
– Instantly available, no provisioning & operations required
– Focus on building applications, don’t worry about infrastructure
• Free tier!
• Easy to start, shorter learning curve
– Various add-ons, SDKs and demo code
• Based on enterprise semantic technology
Text analytics with S4
• Text analytics services
– News annotation
– News categorisation
– Biomedical
– Twitter
• Entity linking & disambiguation
– Mappings to DBpedia & GeoNames instances
– Mappings to biomedical data sources (LinkedLifeData)
• HTML, MS Word, XML, plain text input
• Simple JSON output
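As a rough sketch of how a client might call one of these services, the snippet below builds a JSON request for a text-annotation REST endpoint and reads entity annotations out of a JSON response. The endpoint URL, the request fields (`document`, `documentType`) and the response layout are assumptions made for illustration, not the documented S4 API; the actual contract should be taken from the S4 documentation.

```python
import json
import urllib.request

# Hypothetical endpoint; the real S4 service URLs are not given in the slides.
NEWS_ENDPOINT = "https://s4.example.org/v1/text-analytics/news"


def build_annotation_request(text, mime_type="text/plain", endpoint=NEWS_ENDPOINT):
    """Build (but do not send) a POST request carrying the document as JSON."""
    payload = {"document": text, "documentType": mime_type}  # assumed field names
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def extract_entities(response_json):
    """Flatten an assumed response layout {"entities": {type: [annotation, ...]}}
    into (type, surface text, linked URI) tuples."""
    doc = json.loads(response_json)
    found = []
    for entity_type, annotations in doc.get("entities", {}).items():
        for ann in annotations:
            found.append((entity_type, ann.get("string"), ann.get("uri")))
    return found


# Made-up sample response, used only to exercise the parser offline.
SAMPLE_RESPONSE = json.dumps({
    "entities": {
        "Location": [
            {"string": "Sofia", "uri": "http://dbpedia.org/resource/Sofia"}
        ]
    }
})

if __name__ == "__main__":
    request = build_annotation_request("Ontotext is headquartered in Sofia.")
    print(request.get_method(), request.full_url)
    print(extract_entities(SAMPLE_RESPONSE))
```

Keeping request construction separate from response parsing lets the parsing step be checked against a canned response before any live call is made.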
News analytics example
[Figure: S4 result]
Fully managed RDF DB in the Cloud
• Low-cost graph DBaaS available 24/7
• Ideal for small & moderate data volumes
– Database options: 1M, 10M, 50M, 250M and 1B triples
• Instantly deploy new databases when needed
• Zero administration: automated operations, maintenance & upgrades
• Users pay only for the actual database utilisation
– Number of triples stored + number of queries per month
• OpenRDF REST API
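To make the sizing and pricing model concrete, the sketch below picks the smallest listed database option that fits a dataset and estimates a monthly bill from triples stored plus queries run. The option sizes come from the slide; the two rates are invented placeholders, not Ontotext's actual prices, and the function names are hypothetical.

```python
# Database options listed on the slide, in triples.
TIERS = [1_000_000, 10_000_000, 50_000_000, 250_000_000, 1_000_000_000]

# Placeholder rates, assumed purely for illustration (not real S4 prices).
PRICE_PER_MILLION_TRIPLES = 0.50   # currency units per million triples per month
PRICE_PER_THOUSAND_QUERIES = 0.10  # currency units per thousand queries


def smallest_tier(triple_count):
    """Return the smallest listed option that can hold triple_count triples."""
    if triple_count < 0:
        raise ValueError("triple_count must be non-negative")
    for capacity in TIERS:
        if triple_count <= capacity:
            return capacity
    raise ValueError("dataset exceeds the largest listed option (1B triples)")


def estimate_monthly_cost(triples_stored, queries_per_month):
    """Pay-per-use estimate: storage component plus query component."""
    storage = triples_stored / 1_000_000 * PRICE_PER_MILLION_TRIPLES
    queries = queries_per_month / 1_000 * PRICE_PER_THOUSAND_QUERIES
    return round(storage + queries, 2)


if __name__ == "__main__":
    print(smallest_tier(12_500_000))
    print(estimate_monthly_cost(12_500_000, 40_000))
```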
Knowledge graphs with S4
• SPARQL query endpoint to the FactForge semantic data warehouse
– 500 million entities / 5 billion triples
• Key LOD datasets integrated
– DBpedia, Freebase/WikiData, GeoNames, WordNet
– Dublin Core, SKOS, PROTON ontologies and vocabularies
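A minimal sketch of querying such an endpoint over the standard SPARQL 1.1 Protocol: the query is sent as a URL-encoded `query` parameter and results are requested in the SPARQL JSON results format. The endpoint URL is a placeholder (the slides do not give the FactForge address), and the sample result is made up so that the parsing step can be shown offline.

```python
import json
import urllib.parse
import urllib.request

# Placeholder address; substitute the real SPARQL endpoint.
ENDPOINT = "https://factforge.example.org/sparql"

QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Sofia> rdfs:label ?label .
  FILTER (lang(?label) = "en")
} LIMIT 1
"""


def build_sparql_request(query, endpoint=ENDPOINT):
    """Build a GET request as defined by the SPARQL 1.1 Protocol (query via GET)."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def bindings_to_rows(results_json):
    """Convert SPARQL JSON results into a list of {variable: value} dicts."""
    doc = json.loads(results_json)
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in doc["results"]["bindings"]
    ]


# Made-up response in the W3C SPARQL JSON results layout.
SAMPLE_RESULTS = json.dumps({
    "head": {"vars": ["label"]},
    "results": {"bindings": [
        {"label": {"type": "literal", "xml:lang": "en", "value": "Sofia"}}
    ]},
})

if __name__ == "__main__":
    print(build_sparql_request(QUERY).full_url)
    print(bindings_to_rows(SAMPLE_RESULTS))
```

Sending the built request with `urllib.request.urlopen` and passing the response body to `bindings_to_rows` would complete the round trip once a real endpoint address is filled in.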
Cloud native architecture of S4
Elasticity vs High Availability vs Cost Efficiency
Lessons Learned
• You must build a “cost aware” cloud platform
• Cloud-native architectures are more efficient, but more difficult to build
• A microservices architecture improves system resilience & agility, but is difficult to design right
• Extensive and continuous benchmarking & monitoring
– Some problems emerge only at large scale
• Assume failures will happen & design for resilience
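One common way to act on the last point is to wrap calls to remote services in retries with exponential backoff and jitter. The sketch below is a generic illustration of that pattern, not a description of how S4 itself is built; the function name and default parameters are assumptions.

```python
import random
import time


def call_with_retries(operation, max_attempts=4, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep, retry_on=(ConnectionError, TimeoutError)):
    """Run operation(); on a transient error wait and retry.

    The wait ceiling grows exponentially (base_delay * 2**attempt, capped at
    max_delay) and the actual wait is drawn at random below it ("full jitter")
    so that many clients do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))


if __name__ == "__main__":
    attempts = {"n": 0}

    def flaky():
        # Simulated dependency that fails twice before succeeding.
        attempts["n"] += 1
        if attempts["n"] < 3:
            raise ConnectionError("simulated outage")
        return "ok"

    print(call_with_retries(flaky, sleep=lambda seconds: None))
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without real delays.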
Thank you!
