Lots of Big Data, but When Will Security Catch Up?
July 9, 2013
Juan Carlos Vázquez
Sales Systems Engineer, LTAM
"Personal data is the oil of the 21st century"
A mountain of data
By 2015… greater demand on data centers
• More than 1 billion additional netizens [1]
• More than 15 billion connected devices [2]
• More than 1 zettabyte (1,000 exabytes) of Internet traffic [3]
Sources:
1. IDC “Server Workloads Forecast” 2009.
2. IDC “The Internet Reaches Late Adolescence” Dec 2009, extrapolation by Intel for 2015; ECG “Worldwide Device Estimates Year 2020 - Intel One Smart Network Work” forecast.
3. http://www.cisco.com/assets/cdc_content_elements/networking_solutions/service_provider/visual_networking_ip_traffic_chart.html extrapolated to 2015
A data explosion
2.7 ZB of data in 2012; 15 billion connected devices in 2015
• About 24 petabytes of data processed by Google* per day in 2011
• 4 billion pieces of content shared on Facebook* every day (July 2011)
• 250 million tweets per day in October 2011
• 5.5 million (legitimate) emails per second in 2011
Source: IDC, 2011 Worldwide Enterprise Storage Systems 2011–2015 Forecast Update.
Worldwide Enterprise Storage Consumption Capacity Shipped by Model, 2006–2015 (PB)
More data…
By 2020, the volume of information will reach 35.2 zettabytes
By 2020, the volume of digital information will reach 35.2 zettabytes (1 ZB
equals 1 trillion GB), up from 1.8 ZB in 2010. This exponential growth in data
makes Big Data the driving force of the information age, according to
estimates from Sogeti, a Capgemini Group company.
For its part, the consulting firm Gartner states that companies able to obtain
more valuable information, process it, and manage it will achieve financial
results 20% better than their competitors.
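As a quick check on the figures above, the two endpoints (1.8 ZB in 2010, 35.2 ZB in 2020) imply a compound annual growth rate. The sketch below only restates the slide's numbers; the variable names are illustrative, and decimal units (1 ZB = 10^21 bytes, 1 GB = 10^9 bytes) are assumed.

```python
# Figures from the slide: 1.8 ZB in 2010, 35.2 ZB projected for 2020.
start_zb = 1.8
end_zb = 35.2
years = 2020 - 2010

# Compound annual growth rate implied by the two endpoints.
cagr = (end_zb / start_zb) ** (1 / years) - 1

# Unit check, assuming decimal units: 1 ZB = 10**21 bytes, 1 GB = 10**9 bytes.
gb_per_zb = 10**21 // 10**9

print(f"Implied growth: {cagr:.1%} per year")
print(f"1 ZB = {gb_per_zb:,} GB")
```

Under these assumptions the volume grows by roughly a third each year, and 1 ZB works out to 10^12 GB, which matches the "1 trillion GB" on the slide.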
One case
The New York Times used 100 Amazon EC2 instances
and Hadoop to process 4 TB of TIFF image data
into 11 million PDFs in 24 hours, at a
cost of USD 240
http://en.wikipedia.org/wiki/Apache_Hadoop
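The unit economics of this case can be worked out from the slide's own numbers. This is a rough sketch that assumes all 100 instances ran for the full 24 hours, which the slide does not state explicitly; the variable names are illustrative.

```python
# Figures quoted on the slide.
instances = 100
hours = 24
total_cost_usd = 240
pdfs = 11_000_000

# Assumption: every instance was billed for the whole 24-hour run.
instance_hours = instances * hours
cost_per_instance_hour = total_cost_usd / instance_hours

# Cost and throughput per unit of output.
cost_per_million_pdfs = total_cost_usd / (pdfs / 1_000_000)
pdfs_per_second = pdfs / (hours * 3600)

print(f"{instance_hours} instance-hours at ${cost_per_instance_hour:.2f} each")
print(f"${cost_per_million_pdfs:.2f} per million PDFs")
print(f"{pdfs_per_second:.0f} PDFs per second on average")
```

Under that assumption the job costs about ten cents per instance-hour, which is the kind of figure that makes on-demand batch processing attractive.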
Another case
The Hadoop clusters at Yahoo! span
40,000 servers and store 40
petabytes of data; the largest cluster has
4,000 servers
http://www.aosabook.org/en/hdfs.html
Just one more case
In 2010, Facebook stated that it had the largest
Hadoop cluster in the world, with 21 PB.
In 2011 it announced that the cluster had grown to 30 PB,
and by mid-2012 it had reached 100 PB. On
November 8, 2012, the company announced that its
data warehouse was growing by roughly half a PB
per day.
http://en.wikipedia.org/wiki/Apache_Hadoop
Big Data
A term applied to data sets that exceed the capacity of conventional
software to capture, manage, and process them in a reasonable
time. The size that counts as “Big Data” keeps growing; in 2012 it
ranged from a dozen terabytes to several petabytes of data in a
single data set.
The challenges include capture, processing, storage,
intelligence sharing, analysis, and visualization.
Benefits reach health care, finance, telcos, energy, traffic, marketing,
manufacturing, security… but who will ask the right question?
The four Vs
• Volume. When the term big data is used, data volume typically ranges from multiple terabytes
to petabytes. This certainly fits the enterprise security model as it is not uncommon for
large organizations to collect tens of terabytes of security data on a monthly basis.
• Velocity. This term is used with respect to real-time data analysis requirements. In
cybersecurity, velocity can refer to the need for immediate anomaly or incident
detection. Real-time data analysis is critical here to minimize damages associated with a
cybersecurity attack.
• Variety. Big data can be made up of multiple data types and feeds including structured
and unstructured data. From a security perspective, data variety could include log files,
network flows, IP packet capture, external threat/vulnerability intelligence, click streams,
network/physical access, and social networking activity. It is not unusual for
enterprises to collect hundreds of different types of data feeds for security analysis.
• Veracity. Big data must be trustworthy and accurate. From a security perspective, this
means trusting the confidentiality, integrity, and availability of data sources like log files
and external data feeds.
The Big Security Data Challenge
[Figure: from thousands of events to billions of events. Labels: Consolidate Logs, Correlate Events; Perimeter, APTs, Cloud, Data, Insider.]
The Security Dilemma
MONITORING TECHNIQUES MUST ADVANCE
VISIBILITY
INSTRUMENTATION
Instrumentation and data collection are still critical, but applying filters derived
from intelligence is the path to achieving better security.
Big Data vs. Big Security Data
BIG DATA
Datasets whose size and variety are beyond the ability of typical
database software to capture, store, manage, and analyze.
BIG SECURITY DATA
• Size of security data doubling annually
• Advanced threats demand collecting more data
• Legacy data management approaches failing
• SIEM use shifting from compliance to security
Understanding Security Data As Big Data
• How do I gather security context?
• How do I manage big security information?
• How do I make security information management work?
Security Big Data is about matching security intelligence with the right collected data.
Gartner says…
• The amount of data analyzed by enterprise
information security organizations will double every
year through 2016.
• By 2016, 40% of enterprises will actively analyze at
least 10 terabytes of data for information security
intelligence, up from less than 3% in 2011.
• By 2016, 40% of Type A enterprises will create and
staff a security analytics role, up from less than 1%
in 2011.
Goal…
One of the primary drivers of security
analytics will be the need to identify when
an advanced targeted attack has bypassed
traditional preventative security controls
and has penetrated the organization.
Needle in a Datastack
• Organizations are storing approximately 11-15 terabytes of security data a week.
• The ability to detect data breaches within minutes is critical in preventing data loss, yet
only 35 percent of firms stated that they have the ability to do this.
• In fact, more than a fifth (22 percent) said they would need a day to identify a breach,
and five percent said this process would take up to a week. On average, organizations
reported that it takes 10 hours for a security breach to be recognized.
• Nearly three quarters (73 percent) of respondents claimed they can assess their
security status in real time, and they expressed confidence in their ability to
identify, in real time, insider threats (74 percent), perimeter threats (78
percent), zero-day malware (72 percent) and compliance controls (80 percent).
However, of the 58 percent of organizations that said they had suffered a security
breach in the last year, just a quarter (24 percent) had recognized it within minutes. In
addition, when it came to actually finding the source of the breach, only 14 percent
could do so in minutes, while 33 percent said it took a day and 16 percent said a week.
The study, conducted by research firm Vanson Bourne, interviewed 500 senior IT decision makers in January 2013, including 200 in the USA and 100 each in the UK, Germany and Australia.
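To put the survey figures in perspective, the weekly storage range and the detection percentages can be turned into absolute numbers. This is a back-of-the-envelope sketch: it assumes a 52-week year and that the 58 percent breach figure applies to all 500 respondents, neither of which the slide states.

```python
# Figures quoted on the slide (Vanson Bourne survey, January 2013).
WEEKS_PER_YEAR = 52  # assumption for the yearly projection
weekly_tb_low, weekly_tb_high = 11, 15

yearly_tb_low = weekly_tb_low * WEEKS_PER_YEAR
yearly_tb_high = weekly_tb_high * WEEKS_PER_YEAR

# Assumption: the 58 percent applies to the full sample of 500 respondents.
respondents = 500
breached = respondents * 58 // 100
found_in_minutes = round(breached * 24 / 100)  # 24 percent of breached firms

print(f"Security data per year: {yearly_tb_low}-{yearly_tb_high} TB")
print(f"Breached organizations: {breached}, detected within minutes: {found_in_minutes}")
```

At that rate an organization accumulates more than half a petabyte of security data per year, while only a small fraction of breached respondents spotted the breach within minutes.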
Useful data… from Verizon 2012
• “84% of security incidents (successful intrusions)
were reflected in the logs”
• “Only 8% of the security incidents
detected by companies were found by mining
their logs”
Normalization
• What else happened at this time? Near this time? What is the time zone?
• What is this service? What other messages did it produce? What other systems does it run on?
• What is the host's IP address? Other names? Location on the network/datacenter? Who is the admin? Is this system vulnerable to exploits?
• What does this number mean? Is this documented somewhere?
• Who is this user? What is the user's access level? What is the user's real name, department, location? What other events came from this user?
• What is this port? Is this a normal port for this service? What else is this service being used for?
• DNS name, Windows name, other names? Whois info? Organization owner? Where does the IP originate from (geolocation info)? What else happened on this host? Which other hosts did this IP communicate with?
SIEM is Still Evolving …Beyond Logs
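The questions above amount to normalizing a raw log line and enriching it with context. A minimal sketch of that idea follows; the log format, the `USERS`, `ASSETS`, and `PORTS` lookup tables, and the `normalize` function are all hypothetical stand-ins for a real directory service and asset inventory, not part of any product described in the deck.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical context tables; real deployments would query a directory
# service and an asset inventory instead.
USERS = {"jdoe": {"name": "Jane Doe", "department": "Finance", "access": "standard"}}
ASSETS = {"10.0.0.5": {"hostname": "fin-db-01", "admin": "dba-team", "location": "DC1"}}
PORTS = {22: "ssh", 443: "https", 3389: "rdp"}


def normalize(raw: str, utc_offset_hours: int) -> dict:
    """Parse 'YYYY-MM-DD HH:MM:SS key=value ...' and attach context."""
    date_part, time_part, *pairs = raw.split()
    fields = dict(pair.split("=", 1) for pair in pairs)

    # Answer "what is the time zone?" by converting local time to UTC.
    local = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S")
    local = local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))

    port = int(fields["port"])
    return {
        "time_utc": local.astimezone(timezone.utc).isoformat(),
        "user": fields["user"],
        "src_ip": fields["src"],
        "dst_ip": fields["dst"],
        "port": port,
        "action": fields["action"],
        # Enrichment: who is this user, what is this host, what is this port?
        "user_info": USERS.get(fields["user"]),
        "dst_asset": ASSETS.get(fields["dst"]),
        "service": PORTS.get(port, "unknown"),
    }


event = normalize(
    "2013-07-09 10:15:00 user=jdoe src=203.0.113.7 dst=10.0.0.5 port=22 action=login_failed",
    utc_offset_hours=-5,
)
print(event["time_utc"], event["service"], event["dst_asset"]["hostname"])
```

Storing the timestamp in UTC and attaching user and asset context at ingestion time means later correlation does not have to repeat those lookups for every query.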
SEM + SIM = SIEM
SIEM is the Evolution and Integration of Two Distinct Technologies
• Security Event Management (SEM): primarily focused on collecting and aggregating security events
• Security Information Management (SIM): primarily focused on the enrichment, normalization, and correlation of security events
Security Information & Event Management (SIEM) is a Set of Technologies for:
• Log data collection
• Correlation
• Aggregation
• Normalization
• Retention
• Analysis and workflow
Three Major Factors Driving the Majority of SIEM Implementations
1. Real-time threat visibility
2. Security operational efficiency
3. Compliance and/or log management requirements
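Correlation is the least self-explanatory item in that list, so here is a minimal sketch of one classic rule: several failed logins from the same source within a short window, followed by a success. The rule, the `correlate_bruteforce` name, the thresholds, and the sample events are illustrative assumptions, not taken from any particular SIEM.

```python
from collections import defaultdict, deque


def correlate_bruteforce(events, threshold=3, window_s=60):
    """Alert when a source logs `threshold` failures within `window_s`
    seconds and then succeeds. Events are (timestamp, source, outcome)."""
    failures = defaultdict(deque)  # source -> timestamps of recent failures
    alerts = []
    for ts, src, outcome in sorted(events):
        recent = failures[src]
        # Drop failures that fell out of the time window.
        while recent and ts - recent[0] > window_s:
            recent.popleft()
        if outcome == "fail":
            recent.append(ts)
        elif outcome == "success":
            if len(recent) >= threshold:
                alerts.append({"src": src, "time": ts, "failures": len(recent)})
            recent.clear()
    return alerts


# Hypothetical, already-normalized events (timestamps in seconds).
events = [
    (0, "203.0.113.7", "fail"),
    (10, "203.0.113.7", "fail"),
    (20, "203.0.113.7", "fail"),
    (25, "203.0.113.7", "success"),
    (5, "198.51.100.2", "fail"),
    (30, "198.51.100.2", "success"),
]
alerts = correlate_bruteforce(events)
print(alerts)
```

Only the first source triggers an alert; a single mistyped password followed by a successful login stays below the threshold.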
The State of SIEM
SIEM Promise:
• Turns security data into actionable information
• Provides an intelligent investigation platform
• Supports management and demonstration of compliance
VS
Legacy SIEM Reality:
• Antiquated architectures force choices between time-to-data and intelligence
• Events alone do not provide enough context to combat today’s threats
• Complex usability and implementation have caused costs to skyrocket
Shifting from Compliance to Security
Source: InformationWeek 2012 Security Information and Event Management Vendor Evaluation Survey of 322 business technology
professionals, April 2012
SIEM as a solution to detect cyberattacks
[Figure: chart with legend Medium Risk, High Risk.]
Global Threat Intelligence and SIEM
McAfee Labs: IP reputation updates
IP REPUTATION CHECK: GOOD / SUSPECT / BAD
Activity categories shown:
• Botnet/DDoS
• Mail/spam sending
• Web access
• Malware hosting
• Network probing
• Presence of malware
• DNS hosting activity
• Intrusion attacks
AUTOMATIC IDENTIFICATION
AUTOMATIC RISK ANALYSIS VIA ADVANCED CORRELATION ENGINE
GTI with SIEM Delivers Even Greater Value
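The reputation check on this slide can be pictured as a lookup that maps an IP address to a GOOD, SUSPECT, or BAD verdict. The sketch below is not the McAfee GTI interface; `REPUTATION_FEED`, `RISK`, and `check_reputation` are hypothetical, the feed is a hard-coded table of documentation address ranges, and the "low" risk level for GOOD is an assumption.

```python
from ipaddress import ip_address, ip_network

# Hypothetical reputation feed: network -> (verdict, activity category).
REPUTATION_FEED = {
    "203.0.113.0/24": ("BAD", "botnet"),
    "198.51.100.0/24": ("SUSPECT", "network probing"),
}

# Assumed mapping from verdict to the risk level used for alert priority.
RISK = {"GOOD": "low", "SUSPECT": "medium", "BAD": "high"}


def check_reputation(ip: str):
    """Return (verdict, category) for an IP; unknown addresses count as GOOD."""
    addr = ip_address(ip)
    for cidr, (verdict, category) in REPUTATION_FEED.items():
        if addr in ip_network(cidr):
            return verdict, category
    return "GOOD", None


for ip in ("203.0.113.9", "198.51.100.20", "192.0.2.1"):
    verdict, category = check_reputation(ip)
    print(ip, verdict, RISK[verdict], category)
```

Treating unknown addresses as GOOD is a simplification; a production feed would more likely distinguish "unknown" from "known good".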
Sorting Through a Sea of Events…
200M events → 18,000 alerts and logs → dozens of endpoints → handful of users → specific files breached (if any) → optimized response (RESPOND)
Have I Been Communicating With Bad Actors?
Which Communication Was Not Blocked?
What Specific Servers/Endpoints/ Devices Were Breached?
Which User Accounts Were Compromised?
What Occurred With Those Accounts?
How Should I Respond?
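The first four questions describe a funnel of successive filters over the event stream. A minimal sketch, assuming events have already been normalized into dictionaries; the field names, `BAD_IPS`, and the sample data are hypothetical.

```python
# Hypothetical set of addresses flagged by a reputation feed.
BAD_IPS = {"203.0.113.7"}

# Hypothetical normalized events.
events = [
    {"src": "10.0.0.5", "dst": "203.0.113.7", "user": "jdoe", "blocked": True},
    {"src": "10.0.0.8", "dst": "203.0.113.7", "user": "asmith", "blocked": False},
    {"src": "10.0.0.9", "dst": "192.0.2.10", "user": "bwong", "blocked": False},
]

# Have I been communicating with bad actors?
bad_contacts = [e for e in events if e["dst"] in BAD_IPS]

# Which communication was not blocked?
not_blocked = [e for e in bad_contacts if not e["blocked"]]

# Which endpoints and user accounts are involved?
endpoints = sorted({e["src"] for e in not_blocked})
users = sorted({e["user"] for e in not_blocked})

print(len(events), "events ->", len(bad_contacts), "bad contacts ->",
      len(not_blocked), "not blocked ->", endpoints, users)
```

Each step narrows the set an analyst has to look at, which is the point of the funnel: the remaining questions (what happened with those accounts, how to respond) are asked only about the few hosts and users that survive the filters.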
Event handling…
Prioritize security events
From the top down…
Fine, but who do I talk to?
Knowledge of my environment…
McAfee ESM
McAfee Starts at the Core
McAfee DB
• Real-time, complex analysis
• Indexing purpose-built for SIEM
• Massive context feeds with enrichment
• Historical retrieval and analytics
• Integrated log and event management
• No DBA required
SMART FAST
Scale, Analytical flexibility, Performance
Malicious websites…
The malware is here…
Spam and bots in decline…
Conclusions…
• Use your logs, and turn them on
• Log management first, then a SIEM
• There are no “silver bullets”
• Thinking beats technology
• Less is more
• Windows Event Logs
• Syslogs
• DNS
• App logs
• Context awareness (geolocation, users, VM, asset management, etc.)
• Use cases, use cases, use cases!
• Big Data architectures
• High speed (I/O): hours to see a report, or minutes to get a view?
• Security feeds (reputation systems)
• Interconnected security
• An IP with a bad reputation is automatically blocked by the IPS.
• A host that contacted a malicious IP is analyzed from the SIEM.
“If you’re in a fight, you need to know that while it’s happening, not after the fact”
The context of massive data integration

El contexto de la integración masiva de datos

  • 1. Mucho Big Data ¿y la Seguridad para cuándo? July 9, 2013 Juan Carlos Vázquez Sales Systems Engineer, LTAM
  • 2. "Los datos personales son el petróleo del siglo XXI"
  • 4. >15000 Millones Dispositivos Conectados2 (15B) 1. IDC “Server Workloads Forecast” 2009. 2.IDC “The Internet Reaches Late Adolescence” Dec 2009, extrapolationby Intel for 2015 2.ECG “Worldwide Device Estimates Year 2020 - Intel One Smart Network Work” forecast 3. Source: http://www.cisco.com/assets/cdc_content_elements/networking_solutions/service_provider/visual_networking_ip_traffic_chart.html extrapolatedto 2015 En 2015… Mayor demanda para los Data Centers >1000 Million Mas Netizen’s1 (1B) >1 Zetabyte Tráfico en Internet3 (1000 Exabytes)
  • 5. Source: IDC, 2011 Worldwide Enterprise Storage Systems 2011–2015 Forecast Update. Worldwide Enterprise Storage Consumption Capacity Shipped by Model, 2006–2015 (PB) 2.7 ZB de datos en 2012, 15,000 milliones de dispositivos conectados en 2015 Al rededor de 24 Petabytes De datos procesados por Google* al día en 2011 4,000 milliones Piezas de contenido compartidas en Facebook* cada día (Julio 2011) 250 milliones …de Tweets por día en Octubre de 2011 5.5 milliones Emails (legítimos) por segundo en 2011 Una explosión de datos
  • 6. Más datos… 6 En 2020, el volumen de información será de 35.2 Zettabytes En el 2020, el volumen de información digital alcanzará los 35.2 Zettabytes (1 ZB es igual a 1 trillón de GB), frente al 1.8 ZB de 2010. Ese crecimiento exponencial de los datos hace de Big Data la fuerza motriz de la era de la información, de acuerdo con estimaciones de Sogeti, compañía del Grupo Capgemini. Por su parte, la consultora Gartner afirma que las empresas capaces de tener información más valiosa, procesarla y administrarla, obtendrán resultados financieros un 20% mejor que sus competidores.
  • 7. Un caso El New York Times usó 100 instancias de Amazon EC2 y Hadoop para procesar 4 TB de datos en imágenes TIFF y obtener 11 millones de PDFs en 24 hrs a un costo de $240 usd http://en.wikipedia.org/wiki/Apache_Hadoop
  • 8. Otro caso Los clusters para Hadoop en Yahoo! cuentan con 40,000 servidores y almacenan 40 petabytes de datos, y donde el cluster mayor es de 4,000 sevidores http://www.aosabook.org/en/hdfs.html
  • 9. Solo un caso más En 2010 Facebook declaró que tenía el cluster de Hadoop mas grande del mundo con 21 PB. En 2011 anunció que había crecido a 30PB y hacia la mitad de 2012 alcanzó los 100PB. En Noviembre 8, 2012 ellos anunciaron que su almacen de datos crece casi la mitad de un PB por día. http://en.wikipedia.org/wiki/Apache_Hadoop
  • 10. Big Data 10 Es un término aplicado a conjuntos de datos que superan la capacidad del software habitual para ser capturados, gestionados y procesados en un tiempo razonable. Los tamaños del “Big Data" se encuentran constantemente en movimiento creciente, de esta forma en 2012 se encontraba dimensionado en un tamaño de una docena de terabytes hasta varios petabytes de datos en un único data set. Los retos incluyen la captura, el procesamiento, el almacenamiento, el compartir inteligencia, el análisis y la visualización. Beneficio para el sector Salud, Financiero, Telcos, Energía, Tráfico, Marketing, Manufactura, Seguridad… quién hará la pregunta correcta?
  • 11. The four Vs 11 • Volume. When the term big data is used, data volume typically ranges multiple terabytes to petabytes. This certainly fits the enterprise security model as it is not uncommon for large organizations to collect tens of terabytes of security data on a monthly basis. • Velocity. This term is used with respect to real-time data analysis requirements. In cybersecurity, velocity can refer to the need for immediate anomaly, or incident detection. Real-time data analysis is critical here to minimize damages associated with a cybersecurity attack. • Variety. Big data can be made up of multiple data types and feeds including structured and unstructured data. From a security perspective, data variety could include log files, network flows, IP packet capture, external threat/vulnerability intelligence, click streams, network/physical access, and social networking activity, etc. It is not unusual for enterprises to collect hundreds of different types of data feeds for security analysis. • Veracity. Big data must be trustworthy and accurate. From a security perspective, this means trusting the confidentiality, integrity, and availability of data sources like log files and external data feeds.
  • 12. Thousands of Events The Big Security Data Challenge BILLIONS OF EVENTS Correlate Events Consolidate Logs Perimeter APTs Cloud Data Insider BILLIONS OF EVENTS
  • 13. The Security Dilemma MONITORING TECHNIQUES MUST ADVANCE VISIBILITY INSTRUMENTATION Instrumentation and data collection are still critical, but applying filters derived from intelligence is the path to achieving better security.
  • 14. Big Data vs. Big Security Data Datasets whose size and variety is beyond the ability of typical database software to capture, store, manage and analyze. Understanding Security Data As Big Data • How do I gather security context? • How do I manage big security information? • How do I make security information management work? BIG DATA BIG SECURITY DATA • Size of Security Data doubling annually • Advanced threats demand collecting more data • Legacy data management approaches failing • SIEM use shifting from compliance to security Security Big Data is about matching security intelligence with the right collected data.
  • 15. Gartner says… • The amount of data analyzed by enterprise information security organizations will double every year through 2016. • By 2016, 40% of enterprises will actively analyze at least 10 terabytes of data for information security intelligence, up from less than 3% in 2011. • By 2016, 40% of Type A enterprises will create and staff a security analytics role, up from less than 1% in 2011.
  • 16. Goal… One of the primary drivers of security analytics will be the need to identify when an advanced targeted attack has bypassed traditional preventative security controls and has penetrated the organization.
  • 17. Needle in a Datastack 17 • Organizations are storing approximately 11-15 terabytes of security data a week. • The ability to detect data breaches within minutes is critical in preventing data loss, yet only 35 percent of firms stated that they have the ability to do this. • In fact, more than a fifth (22 percent) said they would need a day to identify a breach, and five percent said this process would take up to a week. On average, organizations reported that it takes 10 hours for a security breach to be recognized. • Nearly three quarters (73 percent) of respondents claimed they can assess their security status in real-time and they also responded with confidence in their ability to identify in real-time insider threat detection (74 percent), perimeter threats (78 percent), zero day malware (72 percent) and compliance controls (80 percent). However, of the 58 percent of organizations that said they had suffered a security breach in the last year, just a quarter (24 percent) had recognized it within minutes. In addition, when it came to actually finding the source of the breach, only 14 percent could do so in minutes, while 33 percent said it took a day and 16 percent said a week. The study, conducted by research firm Vanson Bourne, interviewed 500 senior IT decision makers in January 2013, including 200 in the USA and 100 each in the UK, Germany and Australia.
  • 18. Useful data… from Verizon 2012 • “84% of security incidents (successful intrusions) were reflected in the logs” • “Only 8% of the security incidents detected by companies were found by mining their logs”
  • 20. SIEM Is Still Evolving… Beyond Logs • What else happened at this time? Near this time? What is the time zone? • What is this service? What other messages did it produce? What other systems does it run on? • What is the host's IP address? Other names? Location on the network/datacenter? Who is the admin? Is this system vulnerable to exploits? • What does this number mean? Is this documented somewhere? • Who is this user? What is the user's access level? What is the user's real name, department, location? What other events came from this user? • What is this port? Is this a normal port for this service? What else is this service being used for? • DNS name, Windows name, other names? Whois info? Organization owner? Where does the IP originate from (geolocation info)? What else happened on this host? Which other hosts did this IP communicate with?
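The questions on this slide amount to enriching a raw event with context about the host, the user, and the port. The following is a minimal sketch of that idea, not any product's implementation: the lookup tables (`ASSETS`, `USERS`, `WELL_KNOWN_PORTS`), the field names, and the sample event are all hypothetical and stand in for real asset-management and directory sources.

```python
# Hypothetical context tables; in practice these would come from asset
# management, a directory service, and similar sources.
ASSETS = {
    "10.0.0.5": {"hostname": "web01", "admin": "alice", "location": "dc1-rack4"},
}
USERS = {
    "jdoe": {"real_name": "John Doe", "department": "Finance", "access_level": "user"},
}
WELL_KNOWN_PORTS = {22: "ssh", 53: "dns", 80: "http", 443: "https"}


def enrich(event: dict) -> dict:
    """Return a copy of the event with whatever context is available."""
    enriched = dict(event)
    # Unknown hosts and users resolve to None rather than raising.
    enriched["host_info"] = ASSETS.get(event.get("host_ip"))
    enriched["user_info"] = USERS.get(event.get("user"))
    enriched["service"] = WELL_KNOWN_PORTS.get(event.get("port"), "unknown")
    return enriched


# Illustrative event; the schema is an assumption for this sketch.
raw_event = {"host_ip": "10.0.0.5", "user": "jdoe", "port": 22, "msg": "login failed"}
enriched_event = enrich(raw_event)
```

Missing context is kept as `None` or `"unknown"` so an analyst can see which of the slide's questions remain unanswered for a given event.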
  • 21. SEM + SIM = SIEM • SIEM is the Evolution and Integration of Two Distinct Technologies • Security Event Management (SEM): Primarily focused on Collecting and Aggregating Security Events • Security Information Management (SIM): Primarily focused on the Enrichment, Normalization, and Correlation of Security Events • Security Information & Event Management (SIEM) is a Set of Technologies for: Log Data Collection, Correlation, Aggregation, Normalization, Retention, Analysis and Workflow • Three Major Factors Driving the Majority of SIEM Implementations: (1) Real-Time Threat Visibility (2) Security Operational Efficiency (3) Compliance and/or Log Management Requirements
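To make normalization, aggregation, and correlation concrete, here is a small sketch under stated assumptions. The two log formats, the common schema, and the threshold rule are invented for illustration; real SIEM parsers and correlation rules are far richer.

```python
import re
from collections import Counter

# Two made-up log formats standing in for different sources.
SSH_RE = re.compile(r"Failed password for (?P<user>\S+) from (?P<src>\S+)")
WEB_RE = re.compile(r"login_failed user=(?P<user>\S+) ip=(?P<src>\S+)")


def normalize(line: str):
    """Map a raw log line to a common schema, or None if unrecognized."""
    for source, pattern in (("ssh", SSH_RE), ("web", WEB_RE)):
        match = pattern.search(line)
        if match:
            return {"source": source, "action": "login_failed",
                    "user": match["user"], "src_ip": match["src"]}
    return None


def correlate(events, threshold=3):
    """Aggregate failures per source IP and flag IPs at or above threshold."""
    counts = Counter(e["src_ip"] for e in events if e["action"] == "login_failed")
    return sorted(ip for ip, n in counts.items() if n >= threshold)


# Sample lines use documentation-only IP ranges.
raw_logs = [
    "Failed password for root from 203.0.113.7",
    "login_failed user=admin ip=203.0.113.7",
    "Failed password for admin from 203.0.113.7",
    "Failed password for bob from 198.51.100.2",
    "unrelated line",
]
events = [e for e in map(normalize, raw_logs) if e is not None]
suspicious_ips = correlate(events)
```

Because both sources are normalized to the same `src_ip` field, failures seen over SSH and over the web application count toward the same rule, which is the point of normalizing before correlating.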
  • 22. The State of SIEM • SIEM Promise: Turns Security Data Into Actionable Information • Provides an Intelligent Investigation Platform • Supports Management and Demonstration of Compliance • VS • Legacy SIEM Reality: Antiquated Architectures Force Choices Between Time-to-Data and Intelligence • Events Alone Do Not Provide Enough Context to Combat Today’s Threats • Complex Usability and Implementation Have Caused Costs To Skyrocket
  • 23. Shifting from Compliance to Security • Source: InformationWeek 2012 Security Information and Event Management Vendor Evaluation Survey of 322 business technology professionals, April 2012
  • 24. SIEM as a solution to detect cyberattacks
  • 25. Global Threat Intelligence and SIEM • McAfee Labs IP Reputation Updates • IP reputation check: GOOD, SUSPECT, BAD • Risk levels: Medium Risk, High Risk • Activity categories: Botnet/DDoS, Mail/Spam Sending, Web Access, Malware Hosting, Network Probing, Presence of Malware, DNS Hosting Activity, Intrusion Attacks • Automatic identification • Automatic risk analysis via advanced correlation engine
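An IP reputation check of the kind this slide depicts can be sketched as a lookup that turns a verdict into a risk level. This is an illustration only and not how McAfee GTI works: the `REPUTATION` feed, the mapping of GOOD to "low" risk, and the choice to treat unknown addresses as GOOD are all assumptions made for the example.

```python
# Hypothetical reputation feed: IP -> (verdict, observed activity).
REPUTATION = {
    "203.0.113.7": ("BAD", "botnet"),
    "198.51.100.2": ("SUSPECT", "network probing"),
}

# SUSPECT -> medium and BAD -> high follow the slide; "low" is an assumption.
RISK_BY_VERDICT = {"GOOD": "low", "SUSPECT": "medium", "BAD": "high"}


def check_ip(ip: str) -> dict:
    """Look up an IP and attach a risk level derived from its verdict."""
    # Assumption: addresses absent from the feed are treated as GOOD.
    verdict, activity = REPUTATION.get(ip, ("GOOD", None))
    return {"ip": ip, "verdict": verdict, "activity": activity,
            "risk": RISK_BY_VERDICT[verdict]}


def flag_connections(connections):
    """Keep only connections whose remote IP is not rated GOOD."""
    results = (check_ip(remote) for _, remote in connections)
    return [r for r in results if r["verdict"] != "GOOD"]


# (internal host, remote IP) pairs; all addresses are illustrative.
observed = [("10.0.0.5", "203.0.113.7"), ("10.0.0.6", "192.0.2.10"),
            ("10.0.0.7", "198.51.100.2")]
flagged = flag_connections(observed)
```

A production deployment would more likely default unknown addresses to an explicit "unrated" state, since absence from a feed is not evidence of good behavior.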
  • 26. GTI with SIEM Delivers Even Greater Value • Sorting Through a Sea of Events… 200M events → 18,000 alerts and logs → Dozens of endpoints → Handful of users → Specific files breached (if any) → Optimized response • Have I Been Communicating With Bad Actors? • Which Communication Was Not Blocked? • What Specific Servers/Endpoints/Devices Were Breached? • Which User Accounts Were Compromised? • What Occurred With Those Accounts? • How Should I Respond?
  • 28. Prioritize security events
  • 29. From the top down…
  • 30. OK, fine, but who do I talk to?
  • 31. Knowledge of my environment…
  • 32. McAfee ESM • McAfee Starts at the Core • McAfee DB • Real-time, complex analysis • Indexing purpose-built for SIEM • Massive context feeds with enrichment • Historical retrieval and analytics • Integrated log and event management • No DBA required • SMART, FAST • Scale, Analytical flexibility, Performance
  • 34. The malware is here…
  • 35. Spam and bots in decline…
  • 36. Conclusions… • Turn on and use your logs • Log management first, before a SIEM • There are no “silver bullets” • Thinking beats technology • Less is more • Windows Event Logs • Syslogs • DNS • App Logs • Context Awareness (Geolocation, Users, VM, Asset Mgmt, etc.) • Use cases, use cases, use cases! • Big Data architectures • High speed (I/O): hours to see a report, or minutes for a view? • Security feeds (reputation systems) • Interconnected security • An IP with a bad reputation is automatically blocked by the IPS. • A host that had contact with a malicious IP is analyzed from the SIEM
  • 37. “If you’re in a fight, you need to know that while it’s happening, not after the fact”