Real-time Information
Discovery and Analytics
Plamen Penev
Co-founder
WHERE DATA FLOWS,
KNOWLEDGE GROWS
Yatrus is building a platform that discovers early
market-moving information from
non-traditional sources, for hedge funds and
investment bankers.
www.yatrusanalytics.com | Data Science Society, 18 Jan 2016
Why does real-time analysis
matter?
Data: big and small, fast and
smart; variety of data
What are the volumes of data
that we are seeing today?
30 BILLION PIECES OF CONTENT
were added to Facebook this past month
by 600 million plus users.
ZYNGA PROCESSES 1 PETABYTE OF CONTENT
for players every day; a volume of data that is
unmatched in the social game industry.
32 BILLION SEARCHES
were performed last month... on Twitter.
MORE THAN 2 BILLION VIDEOS
were watched on YouTube... yesterday.
3 BILLION PEOPLE
will be online, pushing the data created
and shared to nearly 8 zettabytes.
WORLDWIDE IP TRAFFIC
will quadruple by 2015.
A new IDC study says the market
for big data technology and services
will grow from $3.2 billion in 2010 to
$16.9 billion in 2015. That’s a 40%
CAGR.
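The growth rate quoted from the IDC study can be checked with the standard compound annual growth rate formula. The sketch below is illustrative only: the function name `cagr` is an assumption made for this example, and the inputs are the two market sizes from the slide, in billions of US dollars. The result comes out just under 40%, consistent with the rounded figure on the slide.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.40 means 40% per year)."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market sizes from the slide, in billions of US dollars (2010 and 2015).
rate = cagr(3.2, 16.9, 2015 - 2010)
print(f"CAGR 2010-2015: {rate:.1%}")
```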
2/3 of surveyed businesses in North
America said big data will
become a concern within
the next five years.
Every day, business and consumer life
creates 2.5 quintillion bytes of data.
90% of the data in the world today
has been created in the last two years alone.
BIG DATA
• Agriculture
• Health Care
• Real Estate
• Marketing
• Cybersecurity
• Retail
• Energy
• Finance
Context and Semantics
• To put data into context and unite disparate sources
• Semantics seems to be the solution
Context and Semantics (word cloud): Semantic Web, Data, Agent, Knowledge, Graph, Communication, Approach, URLs, Networks, Blog, User, Dataset, Nodes, Metadata, Interface, Concepts, Class, Content, Algorithm, Detections
Linked Open Data: Knowledge
bases and Ontologies
• Knowledge networks
• DBpedia
• Freebase
• Domain ontologies
Linked Open Data: Knowledge
bases and Ontologies
(Linked Open Data cloud diagram) Datasets shown: DBpedia, riese, US Census Data, World Factbook, Eurostat, YAGO, lingvoj, UMBEL, DBLP Hannover, RKB Explorer, LinkedMDB, flickr wrappr, Revyu, SemanticWeb.org, Flickr exporter, RDF Book Mashup, BBC Programmes, GeoNames, CrunchBase, QDOS, AudioScrobbler, MusicBrainz, MySpace wrapper, BBC Playcount Data, Jamendo, Magnatune, Wikicompany, GovTrack, W3C WordNet, Project Gutenberg, DBLP Berlin, BBC John Peel, DOAP Space, FOAF profiles
Natural language processing
and Semantics
• Watson with domain ontologies and knowledge bases: reasoning
• Named entity recognition
• Classification
• Watson
Architectural advances
• Spark
• Lambda Architecture
• Hadoop + graph dbs
• Storm
• Druid
• Cassandra
Social networks – Twitter
“The heartbeat of the world”
• Twitter specifics
• Extracting value from Twitter data
Social Networks
True facts:
• 105,779,710 registered users
• 140 characters in a message
• 400 million active users
• 39: average age of user
• 6,000 tweets per second
• 350,000 tweets per minute
• 500-700 million tweets per day
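As a quick consistency check on the tweet rates above, the per-second figure can be scaled to per-minute and per-day volumes. This is a minimal sketch that assumes a constant rate; the helper name `scale_rate` is hypothetical. At 6,000 tweets per second the rate works out to 360,000 per minute, slightly above the 350,000 on the slide, and to about 518 million per day, within the 500-700 million range.

```python
SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 24 * 60 * 60


def scale_rate(tweets_per_second):
    """Scale a constant per-second rate to per-minute and per-day totals."""
    per_minute = tweets_per_second * SECONDS_PER_MINUTE
    per_day = tweets_per_second * SECONDS_PER_DAY
    return per_minute, per_day


# 6,000 tweets per second is the figure quoted on the slide.
per_minute, per_day = scale_rate(6000)
print(f"{per_minute:,} tweets per minute, {per_day:,} tweets per day")
```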
Open Data World
• A variety of sources
• Government data
• Demographics
• Company data
• Financial data
Open data sources (diagram): Government, Data aggregators, Social data, Weather data, Sports data, Google, Markets, Universities and research, News data
Yatrus real-time analytical flow
• Network analysis
• Sentiment analysis
• Complex systems science
• Natural Language Processing
• Machine Learning
Yatrus real-time analytical flow
Market sentiment cycle (diagram), with bullish and bearish phases. Labels: Optimism, Excitement, Thrill, Euphoria, Anxiety, Denial, Fear, Depression, Panic, Capitulation, Despondency, Depression, Hope, Relief, Optimism
Complex Systems
• Definition: a system consisting of many diverse
and autonomous but interrelated and
interdependent components or parts,
linked through many (dense) interconnections.
• Examples: ecosystems, brains, societies,
the Internet (of Things)
SANDY PENTLAND
“It’s all about paying attention to patterns in life and
using that information to help with things like setting
privacy patterns, sharing things with people, notifying
people - basically, to help you live your life.”
Yatrus real-time analytical flow
Social Physics
by Alex Pentland
Systems' analytics tools
• Network analysis
• Fitness landscape
• Agent-based modeling
Conclusions

Real-time information analysis: social networks and open data
