Big Data in Organisations


Roland Haeve,
Global Director Big Data
for Atos International

             07/04/2013
What is Big Data?                                                                  Data Pioneers
                                                                                   10 April 2013




▶ In the year 2000 we produced 2 Exabytes of new data

▶ In the year 2011 we produced 1.8 Zettabytes of new data


▶ That is 1,800,000,000,000,000,000,000 bytes (1.8 × 10^21)

▶ By 2020: 40x more data, heading towards 35 Zettabytes

▶ This growth continues every year, eventually reaching Yottabytes (1 Yottabyte = 10^24 bytes)
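
The unit arithmetic behind these figures can be checked with a short sketch. It assumes decimal (SI) prefixes, which is what the byte count on the slide implies; the variable names are illustrative only.

```python
# SI decimal prefixes for bytes (powers of 1000); an assumption that matches
# the 1.8 x 10^21 byte count quoted on the slide.
EXABYTE = 10**18
ZETTABYTE = 10**21
YOTTABYTE = 10**24

data_2000 = 2 * EXABYTE            # "2 Exabytes of new data" (year 2000)
data_2011 = 18 * ZETTABYTE // 10   # "1.8 Zettabytes", kept as an exact integer

# How many times more new data in 2011 than in 2000
growth_2000_to_2011 = data_2011 // data_2000

# The 2011 figure written out in full, with thousands separators
formatted_2011 = f"{data_2011:,}"

# One Yottabyte expressed in Zettabytes
zettabytes_per_yottabyte = YOTTABYTE // ZETTABYTE

print(growth_2000_to_2011)       # 900
print(formatted_2011)            # 1,800,000,000,000,000,000,000
print(zettabytes_per_yottabyte)  # 1000
```

Integers are used throughout so that the 22-digit byte count stays exact.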



What is Big Data, the 3-4 traditional V’s




Source: Oracle




From the traditional 3-4 V’s towards the 5-7 V’s

 Viscosity: Viscosity measures the resistance to flow in the volume of data.
 This resistance can come from different data sources, friction from
 integration flow rates, and the processing required to turn the data into
 insight. Technologies to deal with viscosity include improved streaming,
 agile integration buses, and complex event processing.

 Virality: Virality describes how quickly information is dispersed across
 people-to-people (P2P) networks. It measures how quickly data is spread and
 shared to each unique node. Time is a determining factor, along with the
 rate of spread.

 Veracity: Trust & Quality

 [Figure labels: Value, Veracity]


Big Data & Internet of Things
Context is key for generating value



[Diagram labels: Web Portal to get user actions; M2M (Machine to Machine: NFC, GPS, accelerometer);
B2B Partner IS (Data Provider); Sensors / Actuators; Mediators; Aggregators;
Big Data Engine (Subscriptions, Aggregation, Correlation); Context Broker Platform; Big Data Engine;
Applications]




Struggles for Business




▶ Driver: Who is the driving force: IT, Business, or Cost?

▶ Opportunities: Which opportunities can Big Data (Analytics) deliver, and how can
  Big Data make a difference?

▶ How to Start: Which roadmap(s) should we follow?

▶ How to Integrate: How do we integrate Big Data (Strategy) within the
  current infrastructure architecture?



     McKinsey calls Big Data “the next frontier
     for innovation, competition, and
     productivity”



Use Cases



Large Hadron Collider: An example of sensor and machine data is found at the Large Hadron Collider at CERN, the
European Organization for Nuclear Research. CERN scientists can generate 40 terabytes of data every second during
experiments.
Boeing Jets: Boeing jet engines can produce 10 terabytes of operational information for every 30 minutes they run. A
four-engine jumbo jet can create 640 terabytes of data on just one Atlantic crossing; multiply that by the more than
25,000 flights flown each day, and you get an understanding of the impact that sensor and machine-produced data can
make on a BI environment.
Twitter: The microblogging site Twitter serves more than 200 million users who produce more than 90 million "tweets"
per day, or 800 per second. Each of these posts is approximately 200 bytes in size. On an average day, this traffic equals
more than 12 gigabytes and, throughout the Twitter ecosystem, the company produces a total of eight terabytes of data
per day. In comparison, the New York Stock Exchange produces about one terabyte of data per day.
Wal-Mart: Transactional data has grown in velocity and volume at many companies. As recently as 2005, the largest data
warehouse in the world was estimated to be 100 terabytes in size. Today, Wal-Mart, the world's largest retailer, is logging
one million customer transactions per hour and feeding information into databases estimated at 2.5 petabytes in size.
Financial services: Discover fraud patterns based on multiple years' worth of credit card transactions, on a time scale that
does not allow new patterns to accumulate significant losses. Measure transaction processing latency across many
business processes by processing and correlating system log data.
Internet retailers: Discover fraud patterns in Internet retailing by mining web click logs. Assess risk by product type and
session Internet Protocol (IP) address activity.
Retailers: Perform sentiment analysis by analysing social media data.
Drug discovery: Perform large-scale text analytics on publicly available information sources.
Healthcare: Analyse medical insurance claims data for financial analysis, fraud detection, and preferred patient treatment
plans. Analyse patient electronic health records for evaluation of patient care regimes and drug safety.
Mobile telecom: Discover mobile phone churn patterns based on analysis of call detail records and correlation with
activity in subscribers' networks of callers.
IT technical support: Perform large-scale text analytics on help desk support data and publicly available support forums
to correlate system failures with known problems.
Scientific research: Analyse scientific data to extract features (e.g., identify celestial objects from telescope imagery).
Internet travel: Improve product ranking (e.g., of hotels) by analysing multiple years' worth of web click logs.
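
As a rough sanity check, the volume figures quoted for Twitter, Boeing and Wal-Mart can be recomputed directly. The sketch below uses only the numbers stated above, decimal units, and illustrative variable names. Straight multiplication of the Twitter inputs gives somewhat higher values than the 800 tweets per second and 12 gigabytes quoted; the source does not say how those two figures were derived.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Twitter: "more than 90 million tweets per day", "approximately 200 bytes" each
tweets_per_day = 90_000_000
bytes_per_tweet = 200
tweets_per_second = tweets_per_day / SECONDS_PER_DAY
tweet_gigabytes_per_day = tweets_per_day * bytes_per_tweet / 10**9

# Boeing: 10 TB per engine per 30 minutes, four engines, 640 TB per crossing
tb_per_engine_per_half_hour = 10
engines = 4
crossing_tb = 640
half_hours = crossing_tb / (engines * tb_per_engine_per_half_hour)
implied_flight_hours = half_hours / 2

# Wal-Mart: one million customer transactions per hour
transactions_per_hour = 1_000_000
transactions_per_second = transactions_per_hour / 3600

print(round(tweets_per_second))        # 1042
print(tweet_gigabytes_per_day)         # 18.0
print(implied_flight_hours)            # 8.0
print(round(transactions_per_second))  # 278
```

The Boeing figures are internally consistent: 640 terabytes corresponds to an eight-hour crossing with four engines.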


Examples of Atos projects / cases




DAaaS








Big Data & Internet of Things
Smart Metering at ERDF




▶ Atos is the first IT services company to manage
  such a large-scale implementation of smart
  meters in Europe
▶ Target: 35 million meters to be installed for
  French distribution system operator ERDF. The
  smart meter solutions developed by Atos help
  Smart Utilities to meet three goals: lower costs;
  improved delivery and more efficient services
  to home and business users; and a reduction of
  energy usage by regulating the network. At the
  beginning of March 2011, ERDF started
  operating the new IT platform of its Linky
  project.




Atos Olympic Games, London 2012,
and vision for 2020 (Real Big Data)




Opportunity from CNES: Big Data for Control Systems
▶ Atos won a 25 M€ contract with the French Space Agency
  (CNES) for a “Product Line” to build Control Systems for
  spacecraft
  – The first control system will be for a military satellite
▶ IP of some components will be shared between CNES
  and Atos
▶ A more interesting asset to share is the infrastructure
▶ Key components: several data stores
  – Distributed architecture
  – Lightweight
  – Very fast
  – Based on manageable, understandable open source
     components
     • Security, maintainability, long term support, …

▶ Our innovative architecture was a key factor in
  our selection


Red Spotted Hanky,
Travel Web Site

redspottedhanky.com sells discounted train tickets on-line. Customers gather
loyalty points for each ticket purchased, which can be used to buy additional
train tickets.

Business Issue
▶ Limited understanding of the dynamics of marketing response and external
  influences on web traffic and sales
▶ “Static” customer information
Use Cases
▶ How can RSH derive the best possible value from its marketing strategies, e.g.:
  – Does a positive spike in social media sentiment translate into an increase
    in web sales?
  – Does a local radio advertising campaign translate into increased web
    traffic and sales?
  – What impact does weather have on web traffic?
Solution
▶ Cloud based Big Data platform integrating, storing and analysing
  unstructured and structured data
▶ Hadoop based solution integrating weather, Twitter feeds, ticketing sales,
  CRM and web traffic data into a single repository for trend identification
  and analysis
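
To make the weather question concrete, the sketch below joins two daily data sets by date and computes a Pearson correlation. It is a minimal illustration in plain Python, not the Hadoop solution described on the slide; all figures, names and dates are hypothetical and chosen only so the example runs.

```python
from math import sqrt

# Hypothetical daily figures; none of these numbers come from the RSH project.
rainfall_mm = {"2013-04-01": 0.0, "2013-04-02": 5.0,
               "2013-04-03": 10.0, "2013-04-04": 2.0}
web_visits = {"2013-04-01": 100, "2013-04-02": 150, "2013-04-03": 200}


def join_by_date(left, right):
    """Keep only the dates present in both sources (an inner join)."""
    shared = sorted(left.keys() & right.keys())
    return [(left[day], right[day]) for day in shared]


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


pairs = join_by_date(rainfall_mm, web_visits)  # 2013-04-04 has no traffic data
correlation = pearson(pairs)
print(len(pairs), correlation)
```

With real data, the same join would extend to the other sources named above (Twitter feeds, ticketing sales, CRM), and a correlation would only be a starting point for trend analysis.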




MyCity – Real Time Traffic Forecast
[Diagram labels: Traffic sensors of the City of Berlin (1 ... 1200); CityCockpit for RTTF;
Vehicle’s on-board unit; Smart phone app; Traffic web server;
Other data sources (e.g. crowd sourced and open data);
Traffic data server (Data, Services & Analytics); Traffic forecast server (4 hours forecast service).
Data flows: Real-time sensor data; Additional data; Real-time data; Forecasted data]


New opportunities: Customer profiling
Personal Based Economy / Personal Data Economy



▶ The customer's most recent web clicks / click-stream analysis
  – Show the right advertisements
  – Flexible prices / offers
  – Loyalty programme
▶ Customer “usage patterns” of your services
  – Frequently called phone numbers → special offers
▶ Customer location
  – Location based services
▶ Genetic / DNA patterns of your customer / patient
  – Prescribe the most effective medication based on the best-performing
     statistical analysis
  – Preventive medicine
▶ Policy / claims profile of customers
  – Fraud detection / investigation / management
  – Proactively offer insurance packages
▶ Customers have a Twitter stream, a Facebook page
  – Detect hobbies and interests
  – Detect important life events (birth, house move, etc.)
  – Quantified Self by Numbers

...and don't forget: all of this can in turn be combined with Big Data!

Keeping track of the Customer journey



 From Traditional (single path, predictable process)
   View TV or print ad → Go to Store → Compare Options → Choose Best Option → Buy Item

 To Connected (multi channel, multi path, complex unpredictable process)
   Touchpoints: Search, Smartphone app, Compare prices, Demo in store, Like on Facebook,
   Watch on Youtube, View print ad, Watch tutorial, Buy Item, Read reviews, Read Blog,
   Online shopping, View banner ad
   Levels of Customer Interest (top to bottom): Loyalty, Commitment, Evaluation, Interest, Awareness




Now Banking (Atos Smart Mobility + Big Data)
Interacting with consumers and guiding them in their day-to-day lives



 Morning to Evening, by location:
   Home (morning): Personal FM, Casualty Man, Savings, Mortgages, Investments, Financial goals
   Travel: Car insurance, Liability, Work away*
   Work: Income, Life insurance
   Hospital: Health insurance
   Shopping: Credit/Debit cards, Personal loans
   Culture: Micro credit, Sustainable banking, Sponsoring
   Travel: Car insurance, Liability, Work away
   Home (evening): Personal FM, Casualty Man, Savings, Mortgages, Investments, Financial goals

*Atos proposition for banks that facilitates working anywhere, anytime



Risks to be aware of
(several, and quite diverse)

▶ Policies: security, privacy, intellectual property, liability ..
▶ The risks of misuse: “Lies, Damned Lies and Statistics”
▶ Emergent, immature technologies
▶ Mixing “old” tech with the new platforms
▶ Data Garbage: “Digital Diogenes”
▶ Access to data can be problematic
▶ Data ownership issues
▶ Transparency is hard to achieve
▶ Scarcity of talent in a complex field (“Data Scientists”)



More info




▶ See Factsheet and whitepaper Open source Solutions For Big Data Management:
  http://nl.Atos.net/BigData




Contact?




           »   Name: Roland Haeve
           »   Role: Global Director Big Data;
               Information Management &
               Analytics
           »   Mail: Roland.Haeve@atos.net
           »   Tel: 06-22465013
           »   @Rhaeve





Data Pioneers - Roland Haeve (Atos Nederland) - Big Data in Organisations
