Research in Business Analytics

                                  February 2013




                           Contacts: Arnaud Quirin <aquirin@gradiant.org>
                                          Héctor Cerezo <hcerezo@gradiant.org>


GALICIAN RESEARCH AND DEVELOPMENT CENTER IN ADVANCED TELECOMMUNICATIONS
Business Analytics in Gradiant: an Overview
 Key technologies
  •   Data/Graph mining
  •   Big Data / Business Intelligence
  •   Data network characterization/analysis
 HW and SW Resources
  •   HPC resources (304 cores, 1.22 TB of RAM, including 8 virtualization and 2 GPU
      servers)
 Project participation
  •   National projects: AIBOT, CELTIC, SmartED
  •   International projects: LIFTGATE
 IP & Technology Transfer
  •   Real-time data compression algorithm for in-memory databases (US pat. pend.
      61566689)
  •   Scalable analytical frameworks (in-memory databases, Hadoop, Cascading, HBase)
 25+ publications in international journals, conferences, ...




                 CENTRO TECNOLÓXICO DE TELECOMUNICACIÓNS DE GALICIA
Expertise in Data/Graph Mining
 Data/Graph mining
  •   Data/graph mining extracts useful insights from large, complex data collections
      by combining pre-processing, transformation, modeling and analysis
      techniques.
 Application cases in Gradiant
  •   Satellite image classification of marine seaweeds for ecological conservation,
      prediction of wind farm electricity supply, prediction and simulation models for
      emergency departments.
  •   Discovering the relationships among scientific and industrial data for
      technological surveys; user profiling, relationship and influence mining using
      Social Media content.




                                                                        http://www.mapofscience.com/



Expertise in Scalable Business Intelligence

 Big Data/NoSQL
  •   Big Data solutions address the problem of analyzing large datasets with efficient
      scalable frameworks. This includes distributed NoSQL databases.
 Business Intelligence (BI)
  •   BI solutions process the historical records of digital companies to produce
      insightful analytics.
  •   Gradiant has experience in in-memory analytics, as well as in technologies for
      interactive visualization of analytics over aggregated business-generated content.
 Application cases in Gradiant
  •   Gradiant participates in projects requiring fast processing of Social Media
      content (~4 million records/day), with storage in HBase and analysis with
      Hadoop and Cascading.
  •   Gradiant owns in-memory algorithms for analytical databases (US pat. pend.
      61566689), comparable to QlikView solutions.
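
To put the ~4 million records/day figure in perspective, the short calculation below converts it to an average ingest rate. It is only a sketch: it assumes the load is spread evenly over 24 hours, which the slide does not state, so real peak rates may be considerably higher.

```python
# Average ingest rate implied by ~4 million Social Media records per day.
# Assumption (not stated in the source): records arrive evenly over 24 hours.
RECORDS_PER_DAY = 4_000_000
SECONDS_PER_DAY = 24 * 60 * 60

avg_records_per_second = RECORDS_PER_DAY / SECONDS_PER_DAY
print(f"average rate: {avg_records_per_second:.1f} records/s")
```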



Expertise in Data Network Analysis

 Data network characterization/analysis
  •   Data network analytics solutions combine monitoring
      and characterization of network operator traffic, using
      simulation and mathematical models.
  •   Gradiant has experience in projects with Vodafone.
 Application cases in Gradiant
  •   Monitoring and traffic characterization of
      xDSL-connected users in operator networks, leased link
      performance analysis and dimensioning.
  •   Detection of saturation points in leased links using
      indirect measures.
  •   Large-scale network simulation models (experience with ns-3
      and OMNeT++).
  •   Visualization of statistical data (d3.js and processing.js).




HW and SW Resources

Hardware
•   Storage (in cabinet), total 62 TB
    -   HP M6612 2TB 6G SAS 7.2K 3.5in MDL HDD x 24
    -   HP M6625 600GB 6G SAS 10K 2.5in HDD x 24
•   Servers, total 304 cores and 1.22 TB RAM
    -   1 x HP DL980 G7 Intel E7 2860 (80 cores, 512 GB RAM)
    -   2 x HP SL390 G Intel X5675 (24 cores, 192 GB)
    -   4 x HP DL380G7 Intel Performance (48 cores, 192 GB)
    -   2 x HP DL380G7 Intel Efficiency (24 cores, 96 GB)
    -   2 x HP DL385G7 AMD Performance (64 cores, 128 GB)
    -   2 x HP DL385G7 AMD Efficiency (64 cores, 128 GB)
•   Cryptoprocessors
    -   2 x Utimaco SE10 PCIe boards
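
The stated totals can be cross-checked against the individual entries. The sketch below assumes that the core and RAM figures in parentheses are totals per row (not per server), an interpretation the slide does not state but which reproduces the quoted 304 cores. It also assumes 1 TB = 1024 GB for RAM and 1 TB = 1000 GB for disk capacity.

```python
# Cross-check of the hardware totals quoted on the slide.
# Assumption: the figures in parentheses are totals per row, not per server.
servers = [
    # (row description, total cores, total RAM in GB)
    ("1 x HP DL980 G7 Intel E7 2860", 80, 512),
    ("2 x HP SL390 G Intel X5675", 24, 192),
    ("4 x HP DL380G7 Intel Performance", 48, 192),
    ("2 x HP DL380G7 Intel Efficiency", 24, 96),
    ("2 x HP DL385G7 AMD Performance", 64, 128),
    ("2 x HP DL385G7 AMD Efficiency", 64, 128),
]

# (number of drives, capacity per drive in GB)
disks = [(24, 2000), (24, 600)]

total_cores = sum(cores for _, cores, _ in servers)
total_ram_gb = sum(ram for _, _, ram in servers)
total_ram_tb = total_ram_gb / 1024           # binary units assumed for RAM
total_storage_tb = sum(n * gb for n, gb in disks) / 1000  # decimal units for disks

print(f"cores:   {total_cores}")
print(f"RAM:     {total_ram_gb} GB = {total_ram_tb:.2f} TB")
print(f"storage: {total_storage_tb:.1f} TB")
```

Under these assumptions the rows add up to 304 cores, 1248 GB (about 1.22 TB) of RAM and 62.4 TB of raw disk capacity, which matches the totals given above.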

Software
•   Big Data platform: Hadoop/HBase/Cascading


Current Project Cases (I)

 CELTIC: Strategic knowledge
 gathered from business intelligence
 technologies
 •   Spanish consortium, supported by
     Innterconecta tech funds.
 •   Under contract, 2012-2014
 •   Partners: INDRA (leader), Elogia, SaecData,
     Imaxin Software
 BA Goals
 •   User profiling for marketing decision
     support; design of customer influence
     metrics built on scalable, efficient
     frameworks.

Current Project Cases (II)

 SmartED: Intelligent management for
 emergency health services
 •   Spanish consortium, supported by CDTI tech
     fund.
 •   Under contract, 2011-2013
 •   Partners: Everis (leader), Arantia, Balidea
 BA Goals
 •   Simulation of hospital emergency
     departments; neural-network-based
     prediction models of patient arrivals.




Current Project Cases (III)

 AIBOT: Framework for the
 integration of BSS/OSS systems
 designed to market
 non-telecommunication services through
 telecommunication operators
 •   Spanish consortium, supported by
     Innterconecta tech funds
 •   Under contract, 2012-2015
 •   Partners: Sivsa (leader), Discalis, Optare
 BA Goals
 •   Root Cause Analysis (RCA) algorithms in
     the cloud to track the causes of incidents
     in complex telco systems




IP & Technology Transfer
 Patent
 •   Real-time data compression algorithm for in-memory databases
     (US pat. pend. 61566689)
 Non-patented IP
 •   Distributed framework to process large amounts of data
 •   Scalable framework for user profiling and relationship analysis
     of social media data
 •   Prediction algorithms for hospital emergencies
 •   Indirect techniques to predict saturation in leased network lines
 •   Root Cause Analysis (RCA) prediction algorithms to track
     incidents in complex systems




Dissemination


•   P. Sendin-Raña, E. Rodriguez-Fernandez, F. J. Gonzalez-Castaño et al., "Web-oriented
    business intelligence solution based on Associative Query Logic," Software: Practice
    and Experience, 40(9): 779-796 (2010).
•   P. Sendín Raña, F. J. González Castaño, E. Pérez Barros, P. S. Rodríguez Hernández,
    F. Gil Castiñeira, J. M. Pousada Carballo, "Improving the performance and
    functionality of Mondrian open-source OLAP systems," Software: Practice and
    Experience, 39(3): 279-298 (2009).
•   A. Quirin, O. Cordon, V. P. Guerrero-Bote, B. Vargas-Quesada, F. Moya-Anegon, "A
    Quick MST-based Algorithm to Obtain Pathfinder Networks," Journal of the American
    Society for Information Science and Technology, 59(12): 1912-1924 (2008).
•   A. Quirin, O. Cordon, B. Vargas-Quesada, F. de Moya-Anegon, "Graph-based Data
    Mining: A New Tool for the Analysis and Comparison of Scientific Domains
    Represented as Scientograms," Journal of Informetrics, 4(3): 291-312 (2010).
•   E. Serrano, A. Quirin, J. Botia, O. Cordon, "Debugging Complex Software Systems by
    Means of Pathfinder Networks," Information Sciences, 180(5): 561-583 (2010).



