Copyright © 2014 Splunk Inc.
Machine Data 101:
Turning Data Into Insight
Eric Merkel
Sr. Sales Engineer
Agenda
• What is Machine Data? What is Splunk?
• Non-Traditional Data Sources
• Data Enrichment
• Level Up on Search and Reporting Commands
• Data Models and Pivot
• Advanced Visualizations and the Web Framework
What Does Machine Data Look Like?
(Figure: sample raw events from four sources: order processing, Twitter, the care IVR, and a middleware error log.)
Machine Data Contains Critical Insights
(Figure: the same events with the critical fields highlighted: Customer ID, Order ID, Product ID, Twitter ID, Company's Twitter ID, Customer's Tweet, and Time Waiting On Hold, tying the sources together.)
Splunk Approach to Machine Data
• Traditional: structured data in an RDBMS, schema at write, ETL, SQL search
• Splunk: unstructured data at volume, velocity and variety; schema at read; universal indexing; search
Splunk: The Platform for Machine Data
Any amount, any location, any source: online services, web, proxy, data loss prevention, storage, desktops, packaged applications, custom applications, databases, call detail records, smartphones and devices, firewalls, authentication, file servers, endpoints, badging records, email servers, VPN, plus external lookups such as threat intelligence, asset & CMDB, employee/HR info, data stores, and applications.
• Universal indexing
• Schema-on-the-fly
• No back-end RDBMS
• No need to filter data
On top of the indexed data: ad hoc search, monitoring and alerting, reporting and analysis, custom dashboards, and a developer platform.
The Splunk Portfolio: Platform for Operational Intelligence
• Data reaches the platform via forwarders, syslog/TCP, mobile, IoT devices, network wire data, Hadoop, mainframe data, and relational databases
• On top of the platform sit a rich ecosystem of apps & add-ons and Splunk premium solutions
Workshop Setup
Wi-Fi access: Splunk! (pwd: splunk2017)
1. Download free Splunk Enterprise: http://www.splunk.com/download
2. Download the tutorial data (tutorialdata.zip): http://splunk.box.com/mdw101
3. Download the lookup table (http_status.csv): http://splunk.box.com/mdw101
4. Add the tutorial data to Splunk
Non-Traditional Data Sources
• Network Inputs
• HTTP Event Collector
• Log Event Alert Action
• Splunk App for Stream
• Scripted Inputs
• Database Inputs
• Splunk ODBC Driver
• Modular Inputs
• zLinux Forwarder
• MINT
• Non-Splunk Datastores
Traditional Data Sources
• Captures events from log files in real time
• Runs scripts to gather system metrics, connect to APIs and databases
• Listens to syslog and gathers Windows events
• Universally indexes any data format, so it doesn't need adapters
Windows: registry, event logs, file system, sysinternals
Linux/Unix: configurations, syslog, file system, ps/iostat/top
Virtualization: hypervisor, guest OS, guest apps
Applications: web logs; Log4J, JMS, JMX; .NET events; code and scripts
Databases: configurations, audit/query logs, tables, schemas
Network: configurations, syslog, SNMP, netflow
Network Inputs
• Collect data over any UDP or TCP port (see the inputs.conf sketch below)
• Some devices can only send data over a network port
• Best practice: use syslog-ng or rsyslog
  – Offers persistence
  – Categorizes data by host
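A minimal inputs.conf sketch of both kinds of listener; the port choices and the network_feed sourcetype are illustrative assumptions, while the stanza format and the connection_host setting follow Splunk's documented inputs.conf conventions:

# inputs.conf
# Listen for syslog on UDP 514 and categorize each event's host via reverse DNS
[udp://514]
sourcetype = syslog
connection_host = dns

# Listen on a TCP port for a device that can only emit over the network
[tcp://5514]
sourcetype = network_feed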
HTTP Event Collector (HEC)
• Collect data over HTTP or HTTPS directly into Splunk (see the curl sketch below)
• Application-developer focus: a few lines of code in an app send the data
• HEC features include:
  – Token-based, not credential-based, authentication
  – Indexer acknowledgements, which guarantee data indexing
  – Raw and JSON-formatted event payloads
  – SSL, CORS (Cross-Origin Resource Sharing), and network restrictions
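A minimal sketch of sending one event to HEC with curl; the hostname and token are placeholders, while the /services/collector/event endpoint, the Splunk authorization header, and the default port 8088 follow the documented HEC API:

curl -k https://splunk.example.com:8088/services/collector/event \
  -H "Authorization: Splunk <your-hec-token>" \
  -d '{"event": "user login succeeded", "sourcetype": "myapp", "host": "web01"}'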
Log Event Alert Action
• Use Splunk alerting to index a custom log event
• Yields a Splunk-searchable index of custom alert events
• Configurable features include:
  – Host
  – Source
  – Sourcetype
  – Index
  – Event text: construct the exact syntax of the log event, including any text, tokens, or other information
The Splunk App for Stream
• Wire data enhances the platform for operational intelligence
• Efficient, cloud-ready wire data collection
• Simple deployment supports fast time to value
Stream = Better Insights for *
Solution Area: Application Management
  Contextual Data: application logs, monitoring data, metrics, events
  Wire Data: protocol conversations on database performance, DNS lookups, client data, business transaction paths…
  Enriched View: measure application response times, deeper insights for root-cause diagnostics, trace transaction paths, establish baselines…
Solution Area: IT Operations
  Contextual Data: application logs, monitoring data, metrics, events
  Wire Data: payload data including process times, errors, transaction traces, ICA latency, SQL statements, DNS records…
  Enriched View: analyze traffic volume, speed and packets to identify infrastructure performance issues, capacity constraints, and changes; establish baselines…
Stream = Better Insights for * (continued)
Solution Area: Security
  Contextual Data: app + infra logs, monitoring data, events
  Wire Data: protocol identification, protocol headers, content and payload information, flow records
  Enriched View: build analytics and context for incident response, threat detection, monitoring and compliance
Solution Area: Digital Intelligence
  Contextual Data: website activity, clickstream data, metrics
  Wire Data: browser-level customer interactions
  Enriched View: customer experience (analyze website and application bottlenecks to improve customer experience and online revenues); customer support, online and call center (faster root-cause analysis and resolution of customer issues with the website or apps)
Scripted Inputs
• Send data to Splunk via a custom script (see the sketch below)
• Splunk indexes anything written to stdout
• Splunk handles the scheduling
• Supports shell and Python scripts, Windows batch, PowerShell, and any other utility that can format and stream data
Streaming mode
• Splunk executes the script and indexes stdout
• Checks for any running instances
Write-to-file mode
• Splunk launches a script that produces an output file, so no external scheduler is needed
• Splunk monitors the output file
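A minimal sketch of a streaming-mode scripted input, assuming a hypothetical disk_usage.sh; the script:// stanza, the interval setting, and the stdout contract follow Splunk's documented inputs.conf behavior:

# $SPLUNK_HOME/bin/scripts/disk_usage.sh (hypothetical)
#!/bin/sh
df -k    # anything the script writes to stdout gets indexed

# inputs.conf
[script://$SPLUNK_HOME/bin/scripts/disk_usage.sh]
interval = 300            # rerun the script every five minutes
sourcetype = disk_usage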
Use Cases for Scripted Inputs
• Alternative to file-based or network-based inputs
• Stream data from command-line tools, such as vmstat and iostat
• Poll a web service, API or database and process the results
• Reformat complex or binary data for easier parsing into events and fields
• Maintain data sources with slow or resource-intensive startup procedures
• Provide special or complex handling for transient or unstable inputs
• Scripts that manage passwords and credentials
• Wrapper scripts for command-line inputs that contain special characters
Database Inputs
DB Connect provides reliable, scalable, real-time integration between Splunk and traditional relational databases.
• Create value with structured data
• Enrich search results with additional business context
• Easily import data for deeper analysis
• Integrate multiple DBs concurrently
• Simple set-up, non-invasive and secure
Configure Database Inputs
• DB Connect App
  – Real-time, scalable integration with relational DBs
  – Browse and navigate schemas and tables before data import
  – Reliable scheduled import
  – Seamless installation and UI configuration
  – Supports connection pooling and caching
• "Tail" tables or import entire tables
  – Detect and import new/updated rows using timestamps or unique IDs
• Supports many RDBMS flavors
  – AWS RDS Aurora, AWS Redshift, IBM DB2 for Linux, Informix, MemSQL, MS SQL, MySQL, Oracle, PostgreSQL, SAP SQL Anywhere (aka Sybase SA), Sybase ASE and IQ, Teradata
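Once a connection is defined, DB Connect also exposes a search command for ad hoc queries. A minimal sketch using dbxquery; the connection name and the SQL are hypothetical:

| dbxquery connection="orders_db" query="SELECT order_id, status FROM orders"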
Splunk ODBC Driver
• Interact with, manipulate and visualize machine data in Splunk Enterprise using business software tools
• Leverage analytics from Splunk alongside third-party solutions such as Microsoft Excel and Tableau Desktop
• Industry-standard connectivity to Splunk Enterprise
• Empowers business users with direct and secure access to machine data
• Combine machine data with structured data for better operational context
Modular Inputs
• Create your own custom inputs: a scripted input with structure and intelligence (see the Python sketch below)
• First-class citizen in the Splunk management interface
  – Appears under Settings > Data Inputs
• Benefits over a simple scripted input:
  – Instance control: launch a single instance or multiple instances
  – Input validation
  – Support for multiple platforms
  – Stream data as text or XML
  – Secure access to modular input scripts via REST endpoints
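A minimal sketch of the modular input protocol in Python, for a hypothetical input with a single url parameter; the --scheme handshake and the stdin/stdout contract follow Splunk's documented modular input protocol:

import sys

SCHEME = """<scheme>
    <title>URL Poller (hypothetical)</title>
    <description>Illustrative modular input sketch</description>
    <streaming_mode>simple</streaming_mode>
    <endpoint>
        <args>
            <arg name="url"><title>URL to poll</title></arg>
        </args>
    </endpoint>
</scheme>"""

if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "--scheme":
        print(SCHEME)   # Splunk calls the script with --scheme to discover its parameters
    else:
        config_xml = sys.stdin.read()   # Splunk passes the input's configuration as XML on stdin
        print("event=poll_succeeded")   # in simple streaming mode, stdout is indexed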
Example Modular Inputs
Twitter
• Stream JSON data from a Twitter source to Splunk using Tweepy
Amazon S3 Online Storage
• Index data from the Amazon S3 online storage web service
Java Messaging Service (JMS)
• Poll message queues and topics through the JMS Messaging API
• Talks to multiple providers: MQSeries (WebSphere MQ), ActiveMQ, TibcoEMS, HornetQ, RabbitMQ, Native JMS, WebLogic JMS, Sonic MQ
Splunk Windows Inputs
• Retrieve Windows event logs, registry keys, perfmon counters
More modular inputs are available on the Splunk apps site.
zLinux Forwarder
• Easily collect and index data on IBM mainframes
• Collect application and platform data
• Download as a new forwarder distribution for s390x Linux
Extend Operational Intelligence to Mobile Apps
• Deliver better-performing, more reliable apps
• Deliver real-time omni-channel analytics
• End-to-end performance and capacity insights
Monitor App Usage and Performance
• Improve user retention by quickly identifying crashes and performance issues
• Establish whether issues are caused by an app or the network(s)
• Correlate app, OS and device type to diagnose crash and network performance issues
Integrated Analytics Platform for Diverse Data Stores
• Full-featured, integrated product
• Fast insights for everyone
• Works with what you have today
Explore, analyze, visualize, dashboard, and share data in Hadoop clusters (via Hadoop client libraries) and in NoSQL and other data stores (via streaming resource libraries), with bi-directional integration with Hadoop.
Connect to NoSQL and Other Data Stores
• Build custom streaming resource libraries
• Search and analyze data from other data stores in Hunk
• In partnership with leading NoSQL vendors
• Use in conjunction with DB Connect for relational database lookups
Virtual Indexes
• Enables seamless use of almost the entire Splunk stack on the data
• Automatically handles MapReduce
• Technology is patent pending
Data Enrichment
Agenda
• Tags – categorize and add meaning to data
• Field Aliases – simplify search and correlation
• Calculated Fields – shortcut complex or repetitive computations
• Event Types – group common events and share knowledge
• Lookups – augment data with additional external fields
What is Data Enrichment?
• Adds inline meaning/context/specificity to raw data
• Used to normalize metadata or raw data
• Simplifies correlation of multiple data sources
• Created in Splunk or transferred from external sources
Tags
• Add meaning/context/specificity to raw data
• Labels describing team, category, platform, geography
• Applied to a field-value combination
• Multiple tags can be applied to each field-value
• Case sensitive

Create Tags
(Screenshots: creating tags in the UI.)
Tags in Action: Find the Web Servers
• Tag the host as webserver; tag the sourcetype as web
• Search events with the tag in any field: tag=webserver
• Search events with the tag in a specific field: tag::host=webserver
• Search events with the tag using wildcards: tag=web*
Field Aliases
• Normalize field labels to simplify search and correlation
• Apply multiple aliases to a single field
  – Example: Username | cs_username | User → user
  – Example: c_ip | client | client_ip → clientip
• Processed after field extractions and before lookups
• Can apply to lookups
• Aliases appear alongside the original fields (see the props.conf sketch below)
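The same alias can be defined in configuration instead of the UI. A minimal props.conf sketch for the tutorial's access_combined sourcetype, using Splunk's documented FIELDALIAS syntax:

# props.conf
[access_combined]
FIELDALIAS-customer = clientip AS customer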
Create Field Alias: Re-Label a Field to an Intuitive Name
(Screenshots: creating the field alias in the UI.)

Field Alias in Action: Search Using an Intuitive Field Name
1. Create a field alias of clientip = customer
2. Search events in the last 15 minutes (sourcetype=access_combined) and find the customer field
3. The field alias (customer) and the original field (clientip) are both displayed
Calculated Fields
• Shortcut for performing repetitive, long, or complex transformations using the eval command (see the props.conf sketch below)
• Based on extracted or discovered fields only
• Do not apply to lookup or generated fields
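A minimal sketch of the calculated field built in the next exercise, expressed in configuration; the EVAL- syntax follows Splunk's documented props.conf format:

# props.conf
[access_combined]
EVAL-kilobytes = bytes/1024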
Create Calculated Field: Compute Kilobytes from Bytes
(Screenshots: creating the calculated field in the UI.)

Calculated Fields in Action: Search Using kilobytes Instead of bytes
1. Create kilobytes = bytes/1024
2. Search events in the last 15 minutes (sourcetype=access_combined) for kilobytes and bytes
Event Types
• Classify and group common events
• Capture and share knowledge
• Based on a search
• Use in combination with fields and tags to define event topography
Create Event Types
• Best practice: use the punct field
  – A default metadata field describing the event's structure
  – Built from the event's "interesting" punctuation characters: , ; - # $ % & + . / : = ? @ ' | * ( ) { } < > [ ] ^ !
  – Can use wildcards
Example events and their punct values:
event: ####<Jun 3, 2014 5:38:22 PM MDT> <Notice> <WebLogicServer> <bea03> <asiAdminServer> <WrapperStartStopAppMain> <>WLS Kernel<> <> <BEA-000360> <Server started in RUNNING mode>
punct: ####<_,__::__>_<>_<>_<>_<>_<>_
event: 172.26.34.223 - - [01/Jul/2005:12:05:27 -0700] "GET /trade/app?action=logout HTTP/1.1" 200 2953
punct: ..._-_-_[:::_-]_"_?=_/."__
Create Event Type: Classify Events as Known Bad
1. Show punct for sourcetype=access_combined
2. Pick a punct value, then wildcard it after the timestamp
3. Add NOT status=200:
   sourcetype="access_combined" punct="..._-_-_[//_:::]*" NOT status=200
4. Save as a "bad" event type with Color: red and Priority: 1 (shift-reload the browser to show the coloring), then search: eventtype=bad
(A configuration equivalent is sketched below.)
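A minimal sketch of the same event type in configuration; the stanza attributes follow Splunk's documented eventtypes.conf format:

# eventtypes.conf
[bad]
search = sourcetype="access_combined" punct="..._-_-_[//_:::]*" NOT status=200
color = red
priority = 1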
Lookups to Enrich Raw Data
Data goes in; insight comes out. Create additional fields from the raw data with a lookup to an external data source such as LDAP/AD, watch lists, CRM/ERP, or a CMDB.
Lookups
• Augment raw events with additional fields
• Provide context or supporting details
• Translate field values into more descriptive data
  – Example: add text descriptions for error codes and IDs
  – Example: add contact details to user names or IDs
  – Example: add descriptions to HTTP status codes
• File-based or scripted lookups
Configure a Static Lookup: Convert a Code into a Description
1. Upload/create the table
2. Assign the table to a lookup object
3. Map the lookup to a data set

Step 1: Create the HTTP Status Table
• Get the http_status.csv lookup file from http://splunk.box.com/mdw101
• Lookup table files > Add new
  – Name: http_status.csv (must have a .csv file extension)
  – Upload: <path to .csv>
• Verify the lookup was created successfully: | inputlookup http_status.csv
Step 2: Add a Lookup Definition
• Lookup definitions > Add new
  – Name: http_status
  – Type: File-based
  – Lookup file: http_status.csv
• Invoke the lookup manually:
  sourcetype=access_combined | lookup http_status status OUTPUT status_description
Step 3: Configure an Automatic Lookup
• Automatic lookups > Add new
  – Name: http_status (cannot have spaces)
  – Lookup table: http_status
  – Apply to: sourcetype = access_combined
  – Lookup input field: status
  – Lookup output field: status_description
• Verify the lookup is invoked automatically with a plain search: sourcetype=access_combined
(A props.conf equivalent is sketched below.)
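A minimal sketch of the same automatic lookup in configuration; the LOOKUP- syntax follows Splunk's documented props.conf format:

# props.conf
[access_combined]
LOOKUP-http_status = http_status status OUTPUT status_description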
Fancy Lookups
• Temporal lookups for time-based matching
  – Example: identify users on your network based on their IP address and the timestamp in DHCP logs
• Use search results to populate a lookup table (see the sketch below)
  – … | outputlookup <tablename|filename>
• Call an external command or script
  – Python scripts only
  – Example: DNS lookup to turn an IP into a host
• Create a lookup table using a relational database
  – Review matches against a database column or SQL query
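A minimal sketch of populating a lookup table from search results and reading it back; the clientip_activity.csv filename is a hypothetical example:

sourcetype=access_combined | stats count by clientip | outputlookup clientip_activity.csv

| inputlookup clientip_activity.csv | sort - count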
More Data Enrichment
• Creating and managing alerts (Job Inspector)
• Macros
• Workflow actions
Level Up on Search & Reporting Commands
Agenda
• Doing more with basic search commands
• Advanced search commands
• Doing more with basic reporting commands
Search Syntax Components
(Diagram: the components of search syntax.)
Anatomy of a Search
(Diagram: how a search executes, reading events from disk.)
Doing More with Basic Search Commands
• top – limit
• rare – same options as top
• timechart – parameters
• stats – functions (sum, avg, list, values, sparkline)
• sort – inline ascending or descending
• addcoltotals
• addtotals
Using the top + rare Commands: Find the Most and Least Active Customers
• Commands have parameters or qualifiers
• top and rare have similar syntax
• Each search command has its own syntax (show the inline help)
IPs with the most visits:
... | top limit=20 clientip
IPs with the least visits:
... | rare limit=20 clientip
Using the sort Command: Sort the Number of Customer Requests
• Sort inline, descending or ascending
Number of requests by customer, descending:
... | stats count by clientip | sort - count
Number of requests by customer, ascending:
... | stats count by clientip | sort + count
Using Functions + the rename Command: Determine Total Customer Payload
• See the Search Command Reference docs: functions for eval + where, and functions for stats, chart and timechart
• Invoke a function, then rename it inline
Total payload by customer, descending:
... | stats sum(bytes) by clientip | sort - sum(bytes)
The same, renamed inline:
... | stats sum(bytes) as totalbytes by clientip | sort - totalbytes
Using the list + values Functions: Observe Customer Activity
List all values of a field (activity by customer):
... | stats list(action) by clientip
List only the distinct values of a field (distinct actions by customer):
... | stats values(action) by clientip
Combine the list + values Functions: Analyze Customer Activity
• Show distinct actions and the cardinality of each action
sourcetype=access_combined
| stats count(action) as value by clientip, action
| eval pair=action + " (" + value + ")"
| stats list(pair) as values by clientip
Add Columns and Sum Columns: Building a Table of Customer Activity
Add columns (two columns, clientip + action):
... | stats count by clientip, action
Sum specific columns (the totalbytes and totalevents columns):
... | stats sum(bytes) as totalbytes, avg(bytes) as avgbytes, count as totalevents by clientip | addcoltotals totalbytes, totalevents
Sum Across Rows: Building a Table of Customer Activity
For each row, add totalbytes + totalother:
... | stats sum(bytes) as totalbytes, sum(other) as totalother by clientip | addtotals fieldname=totalstuff
A better real-world example: physical memory + virtual memory = total memory.
Sparklines in Action: Trend Individual Customer Activity
Sparklines render inline in tables, showing each row's trend in the context of the larger event set:
... | stats sparkline(count) as trendline by clientip
... | stats sparkline(count) as trendline sum(bytes) by clientip
Advanced Search Commands
• transaction – Groups events by a common field value. Convenient, but resource-intensive.
• cluster – Clusters similar events together. Can be used on _raw or a field.
• associate – Identifies correlations between fields. Calculates entropy between field values.
• correlate – Calculates the correlation between different fields. Evaluates the relationship of all fields in a result set.
• contingency – Builds a contingency table for two fields. Computes co-occurrence: the percentage of events in which the two fields appear together.
• anomalies – Computes an unexpectedness score for an event. Computes the similarity of an event (X) to a set of previous events (P).
• anomalousvalue – Finds and summarizes irregular, or uncommon, search results. Considers frequency of occurrence or number of standard deviations from the mean.
Using the transaction Command: View Customer Activity by Session
• Sews events together and creates duration and eventcount fields
Group by JSESSIONID:
... | transaction JSESSIONID | table JSESSIONID, action, product_id
Using the cluster Command
• Intelligent grouping: creates cluster_count and cluster_label
... | cluster showcount=1 | table _raw, cluster_count, cluster_label
Doing More with Basic Reporting Commands
• Predict over time
• Chart overlay, with and without streamstats
• Maps with iplocation + geostats
• Single value
• Metered visuals with gauge
Using the predict Command: Predict Website Traffic
• Predict future values with lower/upper bounds, for single and multiple series
... | timechart count as traffic | predict traffic
Simple Chart Overlay: Compare Browsing vs. Buying Activity
sourcetype=access_combined (action=view OR action=purchase)
| timechart span=10m count(eval(action="view")) as Viewed, count(eval(action="purchase")) as Purchased
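The agenda above also mentions overlays with streamstats. A minimal sketch that overlays a moving average on the traffic count; the six-bucket window is an illustrative choice:

sourcetype=access_combined
| timechart span=10m count as traffic
| streamstats window=6 avg(traffic) as moving_avg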
Geolocation in Action: Map Customer Activity Geographically
Combine IP lookup with geo mapping:
... | iplocation clientip | geostats count by clientip
Single Value in Action: Display a Simple Count of Events
... | stats count
Single Value, Radial and Filler Gauges in Action: Display Counts Using Gauges
... | stats count | gauge count 10000 20000 30000 40000 50000
Data Models and Pivot
Agenda
• What is a data model?
• Build a data model
• The Pivot interface
• Accelerate a data model
Powerful Analytics Anyone Can Use
• Pivot enables non-technical users to build complex reports without the search language
• The data model provides a more meaningful representation of the underlying raw machine data
• The analytics store's acceleration technology delivers results faster as volume increases
Define Relationships in Machine Data: Data Model
• Describes how underlying machine data is represented and accessed
• Defines meaningful relationships in the data
• Enables a single authoritative view of underlying raw data
(Screenshot: hierarchical object view of the underlying data; add constraints to filter out events.)
High-Performance Analytics Store: Transparent Acceleration
• Automatically collected: handles timing issues, backfill…
• Automatically maintained: uses an acceleration window
• Stored on the indexers, peer to the buckets
• Fault-tolerant collection
(Screenshot: check to enable acceleration of a data model and set the time window of data that is accelerated.)
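Accelerated data models can also be queried directly with tstats. A minimal sketch, assuming the CIM Web data model (introduced below) is installed and accelerated:

| tstats count from datamodel=Web by Web.status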
Pivot: Easy-to-Use Analytics
• Drag-and-drop interface enables any user to analyze data
• Create complex queries and reports without learning the search language
• Click to visualize any chart type; reports dynamically update when fields change
(Screenshot: select fields from the data model, pick a time window, choose any chart type from the chart toolbox, and save the report to share.)
Common Information Model (CIM) App
• Defines the least common denominator for a data domain
• A standard method to parse, categorize, and normalize data
• A set of field names and tags per domain
• Packaged as data models in a Splunk app
• Domains: security, web, inventory, JVM, performance, network sessions, and more
• Minimal setup to use the Pivot interface
Custom Visualizations and the Web Framework Toolkit
Agenda
• Developer Platform
• Web Framework Toolkit (WFT)
• REST API and SDKs
• Get a Flying Start
Optimizing the Analytics Process
• Focus on the data: intuitive tools to enable the analyst
• No single visualization exists to handle all data sets
• Never lose sight of the raw data
(Diagram: Splunk analytics loop of explore, context, visualize, and algorithms.)
Simple, Interactive, and Extensible
• Exploration: Pivot, data models
• Visualization: interactive forms, contextual drilldown, dashboard editor
• Customizable framework: Web Framework
• Powerful analytics
The Splunk Enterprise Platform: A Powerful Platform for Enterprise Developers
• Core engine: collection, indexing, and the Search Processing Language (core functions)
• Content: inputs, apps, and other content
• User and developer interfaces: Web Framework, REST API, SDKs
Developers Can Customize and Extend
• Build Splunk apps on the Web Framework: Simple XML, JavaScript, HTML5
• Extend and integrate Splunk through the REST API and SDKs for Java, JavaScript, Python, Ruby, C# and PHP
• Plus data models, search extensibility, and modular inputs
A Wealth of Splunk Apps
Over 1,300 apps are available on the Splunk apps site, built on the API, SDKs and UI, covering server, storage and network; server virtualization; operating systems; custom applications; business applications; cloud services; app performance monitoring; ticketing and others; web intelligence; mobile applications; and Stream.
Example Advanced Visualizations
• Interactive, cut-and-paste examples from popular source repositories: D3, GitHub, jQuery
• Splunk 6.x Dashboard Examples App: https://apps.splunk.com/app/1603
• Custom SimpleXML Extensions App: https://apps.splunk.com/app/1772
• Splunk Web Framework Toolkit App: https://apps.splunk.com/app/1613
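Dashboards built from these examples start from Simple XML. A minimal sketch of a one-panel dashboard; the label and the embedded search are illustrative:

<dashboard>
  <label>Customer Activity</label>
  <row>
    <panel>
      <chart>
        <search>
          <query>sourcetype=access_combined | timechart count by action</query>
          <earliest>-24h</earliest>
          <latest>now</latest>
        </search>
      </chart>
    </panel>
  </row>
</dashboard>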
Resources
Splunk Documentation
• http://docs.splunk.com
• Official product docs
• Wiki and community topics
• Updated daily
• Can be printed to PDF
Splunk Answers
• http://answers.splunk.com
• Community-driven and Splunk-supported
• Knowledge exchange and Q&A
Splunk Education
• Recommended for users: Using Splunk; Searching & Reporting
• Recommended for UI/dashboard developers: Developing Apps
• Instructor-led courses: web or onsite
Eric Merkel
emerkel@splunk.com
Happy Splunking!
More Related Content

What's hot (20)

PDF
Conf2014_SplunkSearchOptimization
Splunk
 
PDF
My first 90 days with ClickHouse.pdf
Alkin Tezuysal
 
PPTX
Logging using ELK Stack for Microservices
Vineet Sabharwal
 
PPTX
Securing Data in Hadoop at Uber
DataWorks Summit
 
PDF
Salary Study 2019 - Software Developer Luxembourg, Paris, Brussels
Eric BUSCH
 
PPTX
Introduction to Satellite(1).pptx
joshua45075
 
PDF
Planning For Catastrophe with IBM WAS and IBM BPM
WASdev Community
 
PDF
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
HostedbyConfluent
 
PPTX
Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...
Simplilearn
 
PDF
华为智慧农业解决方案
ssuser220dc6
 
PDF
SQL Server Tuning to Improve Database Performance
Mark Ginnebaugh
 
PPTX
XPagesDay2015 - 誰も教えてくれなかったデバッグ方法
Mitsuru Katoh
 
PDF
Tuning data warehouse
Srinivasan R
 
PDF
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Databricks
 
PPTX
Introduction to SharePoint Framework (SPFx)
Fabio Franzini
 
PPTX
Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu
Flink Forward
 
PDF
Embulk, an open-source plugin-based parallel bulk data loader
Sadayuki Furuhashi
 
PDF
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Spark Summit
 
PDF
PostgreSQL: Advanced indexing
Hans-Jürgen Schönig
 
PPTX
SplunkLive! Paris 2018: Splunk Overview
Splunk
 
Conf2014_SplunkSearchOptimization
Splunk
 
My first 90 days with ClickHouse.pdf
Alkin Tezuysal
 
Logging using ELK Stack for Microservices
Vineet Sabharwal
 
Securing Data in Hadoop at Uber
DataWorks Summit
 
Salary Study 2019 - Software Developer Luxembourg, Paris, Brussels
Eric BUSCH
 
Introduction to Satellite(1).pptx
joshua45075
 
Planning For Catastrophe with IBM WAS and IBM BPM
WASdev Community
 
Streaming Data Lakes using Kafka Connect + Apache Hudi | Vinoth Chandar, Apac...
HostedbyConfluent
 
Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...
Simplilearn
 
华为智慧农业解决方案
ssuser220dc6
 
SQL Server Tuning to Improve Database Performance
Mark Ginnebaugh
 
XPagesDay2015 - 誰も教えてくれなかったデバッグ方法
Mitsuru Katoh
 
Tuning data warehouse
Srinivasan R
 
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Databricks
 
Introduction to SharePoint Framework (SPFx)
Fabio Franzini
 
Virtual Flink Forward 2020: A deep dive into Flink SQL - Jark Wu
Flink Forward
 
Embulk, an open-source plugin-based parallel bulk data loader
Sadayuki Furuhashi
 
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Spark Summit
 
PostgreSQL: Advanced indexing
Hans-Jürgen Schönig
 
SplunkLive! Paris 2018: Splunk Overview
Splunk
 

Viewers also liked (20)

PPTX
Threat Hunting with Splunk Hands-on
Splunk
 
PDF
Building Business Service Intelligence with ITSI
Splunk
 
PPTX
Building a Security Information and Event Management platform at Travis Per...
Splunk
 
PPTX
Threat Hunting with Splunk
Splunk
 
PPTX
Softcat Splunk Discovery Day Manchester, March 2017
Splunk
 
PPTX
Wie Sie Ransomware aufspüren und was Sie dagegen machen können
Splunk
 
PPTX
Splunk Overview
Splunk
 
PDF
Splunk Enterprise for IT Troubleshooting Hands-On
Splunk
 
PDF
Machine Data 101
Splunk
 
PPTX
Delivering business value from operational insights at ING Bank
Splunk
 
PDF
Getting Started with IT Service Intelligence
Splunk
 
PDF
Molina Healthcare Customer Presentation
Splunk
 
PDF
Building Business Service Intelligence with ITSI
Splunk
 
PPTX
How to Design, Build and Map IT and Business Services in Splunk
Splunk
 
PPTX
SplunkLive! Frankfurt 2017 - MediaMarktSaturn
Splunk
 
PPTX
Splunk Webinar – IT Operations auf den nächsten Level bringen
Splunk
 
PDF
Machine Learning + Analytics in Splunk
Splunk
 
PPTX
Getting Started with Splunk Enterprise
Splunk
 
PPTX
Splunk Discovery Day Hamburg - Data Driven Insights
Splunk
 
PPTX
Scale Splunk
Splunk
 
Threat Hunting with Splunk Hands-on
Splunk
 
Building Business Service Intelligence with ITSI
Splunk
 
Building a Security Information and Event Management platform at Travis Per...
Splunk
 
Threat Hunting with Splunk
Splunk
 
Softcat Splunk Discovery Day Manchester, March 2017
Splunk
 
Wie Sie Ransomware aufspüren und was Sie dagegen machen können
Splunk
 
Splunk Overview
Splunk
 
Splunk Enterprise for IT Troubleshooting Hands-On
Splunk
 
Machine Data 101
Splunk
 
Delivering business value from operational insights at ING Bank
Splunk
 
Getting Started with IT Service Intelligence
Splunk
 
Molina Healthcare Customer Presentation
Splunk
 
Building Business Service Intelligence with ITSI
Splunk
 
How to Design, Build and Map IT and Business Services in Splunk
Splunk
 
SplunkLive! Frankfurt 2017 - MediaMarktSaturn
Splunk
 
Splunk Webinar – IT Operations auf den nächsten Level bringen
Splunk
 
Machine Learning + Analytics in Splunk
Splunk
 
Getting Started with Splunk Enterprise
Splunk
 
Splunk Discovery Day Hamburg - Data Driven Insights
Splunk
 
Scale Splunk
Splunk
 
Ad

Similar to Machine Data 101 Hands-on (20)

PPTX
Workshop splunk 6.5-saint-louis-mo
Mohamad Hassan
 
PPTX
Machine Data 101: Turning Data Into Insight
Splunk
 
PPTX
Machine Data 101: Turning Data Into Insight
Splunk
 
PDF
Machine Data 101 Workshop
Splunk
 
PPTX
Machine Data 101
Splunk
 
PDF
Splunk workshop-Machine Data 101
Splunk
 
PDF
Getting Started with Splunk Enterprise
Splunk
 
PDF
Getting Started with Splunk Enterprise
Splunk
 
PDF
Getting Started with Splunk Enterprise
Splunk
 
PPTX
Splunk Discovery Day Düsseldorf 2016 - Splunk für IT Operations
Splunk
 
PPTX
Data Onboarding Breakout Session
Splunk
 
PPTX
What's New in 6.3 + Data On-Boarding
Splunk
 
PPTX
Splunk Discovery: Warsaw 2018 - Getting Data In
Splunk
 
PDF
SplunkLive Auckland - Operational Intelligence
Splunk
 
PDF
SplunkLive Wellington 2015 - Operational Intelligence
Splunk
 
PDF
Delivering New Visibility and Analytics for IT Operations
Gabrielle Knowles
 
PDF
Azure IoT Suite
Samir Arezki ☁
 
PDF
Motadata brochure
RajDodiya4
 
PDF
inmation Presentation
inmation Software GmbH
 
PPT
SQL Server 2008 Positioning
ukdpe
 
Workshop splunk 6.5-saint-louis-mo
Mohamad Hassan
 
Machine Data 101: Turning Data Into Insight
Splunk
 
Machine Data 101: Turning Data Into Insight
Splunk
 
Machine Data 101 Workshop
Splunk
 
Machine Data 101
Splunk
 
Splunk workshop-Machine Data 101
Splunk
 
Getting Started with Splunk Enterprise
Splunk
 
Getting Started with Splunk Enterprise
Splunk
 
Getting Started with Splunk Enterprise
Splunk
 
Splunk Discovery Day Düsseldorf 2016 - Splunk für IT Operations
Splunk
 
Data Onboarding Breakout Session
Splunk
 
What's New in 6.3 + Data On-Boarding
Splunk
 
Splunk Discovery: Warsaw 2018 - Getting Data In
Splunk
 
SplunkLive Auckland - Operational Intelligence
Splunk
 
SplunkLive Wellington 2015 - Operational Intelligence
Splunk
 
Delivering New Visibility and Analytics for IT Operations
Gabrielle Knowles
 
Azure IoT Suite
Samir Arezki ☁
 
Motadata brochure
RajDodiya4
 
inmation Presentation
inmation Software GmbH
 
SQL Server 2008 Positioning
ukdpe
 
Ad

More from Splunk (20)

PDF
Splunk Leadership Forum Wien - 20.05.2025
Splunk
 
PDF
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
PDF
Building Resilience with Energy Management for the Public Sector
Splunk
 
PDF
IT-Lagebild: Observability for Resilience (SVA)
Splunk
 
PDF
Nach dem SOC-Aufbau ist vor der Automatisierung (OFD Baden-Württemberg)
Splunk
 
PDF
Monitoring einer Sicheren Inter-Netzwerk Architektur (SINA)
Splunk
 
PDF
Praktische Erfahrungen mit dem Attack Analyser (gematik)
Splunk
 
PDF
Cisco XDR & Splunk SIEM - stronger together (DATAGROUP Cyber Security)
Splunk
 
PDF
Security - Mit Sicherheit zum Erfolg (Telekom)
Splunk
 
PDF
One Cisco - Splunk Public Sector Summit Germany April 2025
Splunk
 
PDF
.conf Go 2023 - Data analysis as a routine
Splunk
 
PDF
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV
Splunk
 
PDF
.conf Go 2023 - Navegando la normativa SOX (Telefónica)
Splunk
 
PDF
.conf Go 2023 - Raiffeisen Bank International
Splunk
 
PDF
.conf Go 2023 - På liv og død Om sikkerhetsarbeid i Norsk helsenett
Splunk
 
PDF
.conf Go 2023 - Many roads lead to Rome - this was our journey (Julius Bär)
Splunk
 
PDF
.conf Go 2023 - Das passende Rezept für die digitale (Security) Revolution zu...
Splunk
 
PDF
.conf go 2023 - Cyber Resilienz – Herausforderungen und Ansatz für Energiever...
Splunk
 
PDF
.conf go 2023 - De NOC a CSIRT (Cellnex)
Splunk
 
PDF
conf go 2023 - El camino hacia la ciberseguridad (ABANCA)
Splunk
 
Splunk Leadership Forum Wien - 20.05.2025
Splunk
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Building Resilience with Energy Management for the Public Sector
Splunk
 
IT-Lagebild: Observability for Resilience (SVA)
Splunk
 
Nach dem SOC-Aufbau ist vor der Automatisierung (OFD Baden-Württemberg)
Splunk
 
Monitoring einer Sicheren Inter-Netzwerk Architektur (SINA)
Splunk
 
Praktische Erfahrungen mit dem Attack Analyser (gematik)
Splunk
 
Cisco XDR & Splunk SIEM - stronger together (DATAGROUP Cyber Security)
Splunk
 
Security - Mit Sicherheit zum Erfolg (Telekom)
Splunk
 
One Cisco - Splunk Public Sector Summit Germany April 2025
Splunk
 
.conf Go 2023 - Data analysis as a routine
Splunk
 
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV
Splunk
 
.conf Go 2023 - Navegando la normativa SOX (Telefónica)
Splunk
 
.conf Go 2023 - Raiffeisen Bank International
Splunk
 
.conf Go 2023 - På liv og død Om sikkerhetsarbeid i Norsk helsenett
Splunk
 
.conf Go 2023 - Many roads lead to Rome - this was our journey (Julius Bär)
Splunk
 
.conf Go 2023 - Das passende Rezept für die digitale (Security) Revolution zu...
Splunk
 
.conf go 2023 - Cyber Resilienz – Herausforderungen und Ansatz für Energiever...
Splunk
 
.conf go 2023 - De NOC a CSIRT (Cellnex)
Splunk
 
conf go 2023 - El camino hacia la ciberseguridad (ABANCA)
Splunk
 

Recently uploaded (20)

PDF
"Beyond English: Navigating the Challenges of Building a Ukrainian-language R...
Fwdays
 
PDF
The Builder’s Playbook - 2025 State of AI Report.pdf
jeroen339954
 
PDF
Windsurf Meetup Ottawa 2025-07-12 - Planning Mode at Reliza.pdf
Pavel Shukhman
 
PDF
How Startups Are Growing Faster with App Developers in Australia.pdf
India App Developer
 
PDF
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
PPTX
Building Search Using OpenSearch: Limitations and Workarounds
Sease
 
PDF
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
PDF
Log-Based Anomaly Detection: Enhancing System Reliability with Machine Learning
Mohammed BEKKOUCHE
 
PPTX
Top iOS App Development Company in the USA for Innovative Apps
SynapseIndia
 
PDF
Presentation - Vibe Coding The Future of Tech
yanuarsinggih1
 
PDF
Fl Studio 24.2.2 Build 4597 Crack for Windows Free Download 2025
faizk77g
 
PDF
SFWelly Summer 25 Release Highlights July 2025
Anna Loughnan Colquhoun
 
PDF
DevBcn - Building 10x Organizations Using Modern Productivity Metrics
Justin Reock
 
PDF
CIFDAQ Market Insights for July 7th 2025
CIFDAQ
 
PPTX
Q2 FY26 Tableau User Group Leader Quarterly Call
lward7
 
PPTX
OpenID AuthZEN - Analyst Briefing July 2025
David Brossard
 
PPTX
"Autonomy of LLM Agents: Current State and Future Prospects", Oles` Petriv
Fwdays
 
PPTX
AUTOMATION AND ROBOTICS IN PHARMA INDUSTRY.pptx
sameeraaabegumm
 
PDF
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
PDF
CIFDAQ Weekly Market Wrap for 11th July 2025
CIFDAQ
 
"Beyond English: Navigating the Challenges of Building a Ukrainian-language R...
Fwdays
 
The Builder’s Playbook - 2025 State of AI Report.pdf
jeroen339954
 
Windsurf Meetup Ottawa 2025-07-12 - Planning Mode at Reliza.pdf
Pavel Shukhman
 
How Startups Are Growing Faster with App Developers in Australia.pdf
India App Developer
 
Chris Elwell Woburn, MA - Passionate About IT Innovation
Chris Elwell Woburn, MA
 
Building Search Using OpenSearch: Limitations and Workarounds
Sease
 
Reverse Engineering of Security Products: Developing an Advanced Microsoft De...
nwbxhhcyjv
 
Log-Based Anomaly Detection: Enhancing System Reliability with Machine Learning
Mohammed BEKKOUCHE
 
Top iOS App Development Company in the USA for Innovative Apps
SynapseIndia
 
Presentation - Vibe Coding The Future of Tech
yanuarsinggih1
 
Fl Studio 24.2.2 Build 4597 Crack for Windows Free Download 2025
faizk77g
 
SFWelly Summer 25 Release Highlights July 2025
Anna Loughnan Colquhoun
 
DevBcn - Building 10x Organizations Using Modern Productivity Metrics
Justin Reock
 
CIFDAQ Market Insights for July 7th 2025
CIFDAQ
 
Q2 FY26 Tableau User Group Leader Quarterly Call
lward7
 
OpenID AuthZEN - Analyst Briefing July 2025
David Brossard
 
"Autonomy of LLM Agents: Current State and Future Prospects", Oles` Petriv
Fwdays
 
AUTOMATION AND ROBOTICS IN PHARMA INDUSTRY.pptx
sameeraaabegumm
 
Newgen Beyond Frankenstein_Build vs Buy_Digital_version.pdf
darshakparmar
 
CIFDAQ Weekly Market Wrap for 11th July 2025
CIFDAQ
 

Machine Data 101 Hands-on

  • 1. Copyright © 2014 Splunk Inc. Machine Data 101: Turning Data Into Insight Eric Merkel Sr. Sales Engineer
  • 2. Agenda  What is Machine Data? What is Splunk?  Non-Traditional Data Sources  Data Enrichment  Level Up on Search and Reporting Commands  Data Models and Pivot  Advanced Visualizations and the Web Framework 3
  • 3. What Does Machine Data Look Like? Sources Order Processing Twitter Care IVR Middleware Error 4
  • 4. Machine Data Contains Critical Insights Customer ID Order ID Customer’s Tweet Time Waiting On Hold Twitter ID Product ID Company’s Twitter ID Customer IDOrder ID Customer ID Sources Order Processing Twitter Care IVR Middleware Error 5
  • 5. Machine Data Contains Critical Insights Order ID Customer’s Tweet Time Waiting On Hold Product ID Company’s Twitter ID Order ID Customer ID Twitter ID Customer ID Customer ID Sources Order Processing Twitter Care IVR Middleware Error 6
  • 6. Structured RDBMS SQL Search Schema at Write Schema at Read Traditional Splunk Splunk Approach to Machine Data Copyright © 2014 Splunk Inc. 7 ETL Universal Indexing Volume Velocity Variety Unstructured
  • 7. Splunk: The Platform for Machine Data 8 Developer Platform Report and analyze Custom dashboards Monitor and alert Ad hoc search Online Services Web Proxy Data Loss Prevention Storage Desktops Packaged Applications Custom Applications Databases Call Detail Records Smartphones and Devices Firewall Authentication File servers Endpoint Threat Intelligence Asset & CMDB Employee / HR Info Data Stores Applications External Lookups Badging records Email servers VPN Any amount, any location, any source Schema- on-the-fly Universal indexing No back-end RDBMS No need to filter data
  • 8. Platform for Operational Intelligence The Splunk Portfolio Rich Ecosystem of Apps & Add-Ons Splunk Premium Solutions Mainframe Data Relational Databases MobileForwarders Syslog/TCP IoT Devices Network Wire Data Hadoop
  • 10. Workshop Setup 11 Wi-fi Access: Splunk! (pwd: splunk2017) 1. Download free Splunk Enterprise http://www.splunk.com/download 2. Download tutorial data (tutorialdata.zip): http://splunk.box.com/mdw101 3. Download lookup table (http_status.csv): http://splunk.box.com/mdw101 4. Add tutorial data to Splunk
  • 12. Non-Traditional Data Sources  Network Inputs  HTTP Event Collector  Log Event Alert Action  Splunk App for Stream  Scripted Inputs  Database Inputs  Splunk ODBC Driver  Modular Inputs  zLinux Forwarder  MINT  Non-Splunk Datastores 13
  • 13. Traditional Data Sources  Captures events from log files in real time  Runs scripts to gather system metrics, connect to APIs and databases  Listens to syslog and gathers Windows events  Universally indexes any data format so it doesn’t need adapters 14 Windows • Registry • Event logs • File system • sysinternals Linux/Unix • Configurations • Syslog • File system • Ps, iostat, top Virtualization • Hypervisor • Guest OS • Guest Apps Applications • Web logs • Log4J, JMS, JMX • .NET events • Code and scripts Databases • Configurations • Audit/query logs • Tables • Schemas Network • Configurations • syslog • SNMP • netflow
  • 14. Network Inputs  Collect data over any UDP or TCP port  Some devices only send data over a network port  Best Practice: use syslog-ng or rsyslog  Offers persistence  Categorizes data by host 15
  • 15. HTTP Event Collector (HEC)  Collect data over HTTP or HTTPS directly to Splunk  Application Developer focus – few lines of code in app to send data  HEC Features Include:  Token-based, not credential based  Indexer Acknowledgements – guarantees data indexing  Raw and JSON formatted event payloads  SSL, CORS (Cross Origin Resource Sharing), and Network Restrictions 16
  • 16. Log Event Alert Action  Use Splunk alerting to index a custom log event  Splunk searchable index of custom alert events  Configurable Features Include:  Host  Source  Sourcetype  Index  Event text – construct the exact syntax of the log event, including any text, tokens, or other information 17
  • 17. The Splunk App for Stream Wire Data Enhances the Platform for Operational Intelligence Efficient, Cloud-ready Wire Data Collection Simple Deployment Supports Fast Time to Value 18
  • 18. Stream = Better Insights for * Solution Area Contextual Data Wire Data Enriched View Application Management application logs, monitoring data, metrics, events protocol conversations on database performance, DNS lookups, client data, business transaction paths… Measure application response times, deeper insights for root- cause diagnostics, trace tx paths, establish baselines… IT Operations application logs, monitoring data, metrics, events payload data including process times, errors, transaction traces, ICA latency, SQL statements, DNS records… Analyze traffic volume, speed and packets to identify infrastructure performance issues, capacity constraints, changes; establish baselines… 19
  • 19. Stream = Better Insights for * Solution Area Contextual Data Wire Data Enriched View Security app + infra logs, monitoring data, events protocol identification, protocol headers, content and payload information, flow records Build analytics and context for incident response, threat detection, monitoring and compliance Digital Intelligence website activity, clickstream data, metrics browser-level customer interactions Customer Experience – analyze website and application bottlenecks to improve customer experience and online revenues Customer Support (online, call center) – faster root cause analysis and resolution of customer issues with website or apps 20
  • 20. Scripted Inputs 21  Send data to Splunk via a custom script  Splunk indexes anything written to stdout  Splunk handles scheduling  Supports shell, Python scripts, WIN batch, PowerShell  Any other utility that can format and stream data Streaming Mode  Splunk executes script and indexes stdout  Checks for any running instances Write to File Mode  Splunk launches script which produces output file, no need for external scheduler  Splunk monitors output file
  • 21. Use Cases for Scripted Inputs 22  Alternative to file-base or network-based inputs  Stream data from command-line tools, such as vmstat and iostat  Poll a web service, API or database and process the results  Reformat complex or binary data for easier parsing into events and fields  Maintain data sources with slow or resource-intensive startup procedures  Provide special or complex handling for transient or unstable inputs  Scripts that manage passwords and credentials  Wrapper scripts for command line inputs that contain special characters
  • 22. Database Inputs  Create value with structured data  Enrich search results with additional business context  Easily import data for deeper analysis  Integrate multiple DBs concurrently  Simple set-up, non-invasive and secure DB Connect provides reliable, scalable, real-time integration between Splunk and traditional relational databases 23
  • 23. Configure Database Inputs 24  DB Connect App  Real-time, scalable integration with relational DBs  Browse and navigate schemas and tables before data import  Reliable scheduled import  Seamless installation and UI configuration  Supports connection pooling and caching  “Tail” tables or import entire tables  Detect and import new/updated rows using timestamps or unique IDs  Supports many RDBMS flavors  AWS RDS Aurora, AWS RedShift, IBM DB2 for Linux, Informix, MemSQL, MS SQL, MySQL, Oracle, PostgreSQL, SAP SQL Anywhere (aka Sybase SA), Sybase ASE and IQ, Teradata
  • 24. Splunk ODBC Driver 25  Interact with, manipulate and visualize machine data in Splunk Enterprise using business software tools  Leverage analytics from Splunk alongside third party solutions such as Microsoft Excel and Tableau Desktop  Industry-standard connectivity to Splunk Enterprise  Empowers business users with direct and secure access to machine data  Combine machine data with structured data for better operational context
  • 25. Modular Inputs 26  Create your own custom inputs  Scripted input with structure and intelligence  First class citizen in the Splunk management interface  Appears under Settings > Data Inputs  Benefits over simple scripted input  Instance control: launch a single or multiple instances  Input validation  Support multiple platforms  Stream data as text or XML  Secure access to mod input scripts via REST endpoints
  • 26. Example Modular Inputs 27 Twitter  Stream JSON data from a Twitter source to Splunk using Tweepy Amazon S3 Online Storage  Index data from the Amazon S3 online storage web service Java Messaging Service (JMS)  Poll message queues and topics through JMS Messaging API  Talks to multiple providers: MQSeries (Websphere MQ), ActiveMQ, TibcoEMS, HornetQ, RabbitMQ, Native JMS, WebLogic JMS, Sonic MQ Splunk Windows Inputs  Retrieve WIN event logs, registry keys, perfmon counters
  • 28. zLinux Forwarder 29  Easily collect and index data on IBM mainframes  Collect application and platform data  Download as new Forwarder distribution for s390x Linux
  • 29. Extend Operational Intelligence to Mobile Apps 30 Deliver Better Performing, More Reliable Apps Deliver Real-Time Omni-Channel Analytics End-to-End Performance and Capacity Insights
  • 30. Monitor App Usage and Performance • Improve user retention by quickly identifying crashes and performance issues • Establish whether issues are caused by an app or the network(s) • Correlate app, OS and device type to diagnose crash and network performance issues 31
  • 31. Integrated Analytics Platform for Diverse Data Stores Full-featured, Integrated Product Fast Insights for Everyone Works with What You Have Today Explore Visualize Dashboard s ShareAnalyze Hadoop Clusters NoSQL and Other Data Stores Hadoop Client Libraries Streaming Resource Libraries Bi-directional Integration with Hadoop
  • 32. Connect to NoSQL and Other Data Stores • Build custom streaming resource libraries • Search and analyze data from other data stores in Hunk • In partnership with leading NoSQL vendors • Use in conjunction with DB Connect for relational database lookups
  • 33. Virtual Indexes  Enables seamless use of almost the entire Splunk stack on data  Automatically handles MapReduce  Technology is patent pending
  • 35. Agenda  Tags – categorize and add meaning to data  Field Aliases – simplify search and correlation  Calculated Fields – shortcut complex/repetitive computations  Event Types – group common events and share knowledge  Lookups – augment data with additional external fields 37
  • 36.  Adds inline meaning/context/specificity to raw data  Used to normalize metadata or raw data  Simplifies correlation of multiple data sources  Created in Splunk  Transferred from external sources What is Data Enrichment? 38
  • 37.  Add meaning/context/specificity to raw data  Labels describing team, category, platform, geography  Applied to field-value combination  Multiple tags can be applied for each field-value  Case sensitive Tags 39
  • 39.  Search events with tag in any field  Search events with tag in a specific field  Search events with tag using wildcards Find the Web Servers Tags in Action 41 tag=webserver tag::host=webserver tag=web*  Tag the host as webserver  Tag the sourcetype as web 1 2 3 4 5 Back to Slides
  • 40.  Normalize field labels to simplify search and correlation  Apply multiple aliases to a single field  Example: Username | cs_username | User  user  Example: c_ip | client | client_ip  clientip  Processed after field extractions + before lookups  Can apply to lookups  Aliases appear alongside original fields Field Aliases 42
  • 41. Re-Label Field to Intuitive Name Create Field Alias 43 1 2 3
  • 42.  Create field alias of clientip = customer  Search events in last 15 minutes, find customer field  Field alias (customer) and original field (clientip) are both displayed Search using an Intuitive Field Name Field Alias in Action 44 1 3 2 sourcetype=access_combined
  • 43.  Shortcut for performing repetitive/long/complex transformations using eval command  Based on extracted or discovered fields only  Do not apply to lookup or generated fields Calculated Fields 45
  • 44. Compute Kilobytes from Bytes Create Calculated Field 46 1 2 1 2 3
  • 45.  Create kilobytes = bytes/1024  Search events in last 15 minutes for kilobytes and bytes Search Using Kilobytes instead of Bytes Calculated Fields in Action 47 1 2 sourcetype=access_combined Back to Slides
  • 46.  Classify and group common events  Capture and share knowledge  Based on search  Use in combination with fields and tags to define event topography Event Types 48
  • 47.  Best Practice: Use punct field  Default metadata field describing event structure  Built on interesting characters: ",;-#$%&+./:=?@'|*nr"(){}<>[]^! »  Can use wildcards Create Event Types 49 event punct ####<Jun 3, 2014 5:38:22 PM MDT> <Notice> <WebLogicServer> <bea03> <asiAdminServer> <WrapperStartStopAppMain> <>WLS Kernel<> <> <BEA-000360> <Server started in RUNNING mode> ####<_,__::__>_<>_<>_<>_<>_<>_ 172.26.34.223 - - [01/Jul/2005:12:05:27 -0700] "GET /trade/app?action=logout HTTP/1.1" 200 2953 ..._-_-_[:::_-]_"_?=_/."__
  • 48.  Show punct for sourcetype=access_combined  Pick a punct, then wildcard it after the timestamp  Add NOT status=200  Save as “bad” event type + Color:red + Priority:1 (shift reload in browser to show coloring) Classify Events as Known Bad Create Event Type 50 eventtype=bad sourcetype="access_combined" punct="..._-_-_[//_:::]*" NOT status=200 1 2 3 4 Back to Slides
  • 49. Lookups to Enrich Raw Data LDAP AD Watch Lists CRM/ ERP CMDB External Data Sources Insight comes out Data goes inCreate additional fields from the raw data with a lookup to an external data source
  • 50.  Augment raw events with additional fields  Provide context or supporting details  Translate field values to more descriptive data  Example: add text descriptions for error codes, IDs  Example: add contact details to user names or IDs  Example: add descriptions to HTTP status codes  File-based or scripted lookups Lookups 52
  • 51. 53 1. Upload/create table 2. Assign table to lookup object 3. Map lookup to data set Convert a Code into a Description Configure a Static Lookup
  • 52. 1. Get the lookup http_status.csv file from link: http://splunk.box.com/mdw101  Lookup table files > Add new  Name: http_status.csv (must have .csv file extension)  Upload: <path to .csv>  Verify lookup was created successfully 1. Create HTTP Status Table 54 | inputlookup http_status.csv 1 2 3
  • 53.  Lookup definitions > Add new  Name: http_status  Type: File-based  Lookup file: http_status.csv  Invoke the lookup manually 2. Add Lookup Definition 55 1 2 sourcetype=access_combined | lookup http_status status OUTPUT status_description
  • 54.  Automatic lookups > Add new  Name: http_status (cannot have spaces)  Lookup table: http_status  Apply to: sourcetype = access_combined  Lookup input field: status  Lookup output field: status_description  Verify lookup is invoked automatically 3. Configure Automatic Lookup 56 1 2 sourcetype=access_combined Back to Slides
  • 55.  Temporal lookups for time-based lookups  Example: Identify users on your network based on their IP address and the timestamp in DHCP logs  Use search results to populate a lookup table  … | outputlookup <tablename|filename>  Call an external command or script  Python scripts only  Example: DNS lookup for IP  Host  Create a lookup table using a relational database  Review matches against a database column or SQL query Fancy Lookups 57
  • 56.  Creating and Managing Alerts (Job Inspector)  Macros  Workflow Actions More Data Enrichment 58
  • 57. Level Up on Search & Reporting Commands
  • 58. Agenda  Doing more with basic search commands  Advanced search commands  Doing more with basic reporting commands 60
  • 60. Anatomy of a Search 62 Disk
  • 61.  top – limit  rare – same options as top  timechart – parameters  stats – functions (sum, avg, list, values, sparkline)  sort – inline ascending or descending  addcoltotals  addtotals Doing More with Basic Search Commands 63
  • 62.  Commands have parameters or qualifiers  top and rare have similar syntax  Each search command has its own syntax – show inline help Find Most and Least Active Customers Using the top + rare Commands ... | top limit=20 clientip ... | rare limit=20 clientip IPs with the most visits IPs with the least visits
  • 63.  Sort inline descending or ascending 65 ... | stats count by clientip | sort - count ... | stats count by clientip | sort + count Number of requests by customer - descending Number of requests by customer - ascending Sort the Number of Customer Requests Using the sort Command
  • 64.  Show Search Command Reference Docs  Functions for eval + where  Functions for stats + chart and timechart  Invoke a function  Rename inline 66 ... | stats sum(bytes) by clientip | sort - sum(bytes) ... | stats sum(bytes) as totalbytes by clientip | sort - totalbytes Total payload by customer - descending Total payload by customer - descending Determine Total Customer Payload Using functions + rename command
  • 65.  List all values of a field  List only distinct values of a field 67 ... | stats values(action) by clientip ... | stats list(action) by clientip Activity by customer Distinct actions by customer Observe Customer Activity Using the list + values Functions
  • 66.  Show distinct actions and cardinality of each action 68 sourcetype=access_combined | stats count(action) as value by clientip, action | eval pair=action + " (" + value + ")" | stats list(pair) as values by clientip Analyze Customer Activity Combine list + values Functions
  • 67.  Add columns  Sum specific columns 69 ... | stats count by clientip, action 2 cols: clientip + action ... | stats sum(bytes) as totalbytes, avg(bytes) as avgbytes, count as totalevents by clientip | addcoltotals totalbytes, totalevents Sum totalbytes and totalevents colums Building a Table of Customer Activity Add Columns and Sum Columns
  • 68. 70 ... | stats sum(bytes) as totalbytes, sum(other) as totalother by clientip | addtotals fieldname=totalstuff For each row, add totalbytes + totalother A better example: physical memory + virtual memory = total memory Building a Table of Customer Activity Sum Across Rows
  • 69. 71 ... | stats sparkline(count) as trendline by clientip In context of larger event set ... | stats sparkline(count) as trendline sum(bytes) by clientip Inline in tables Trend Individual Customer Activity Sparklines in Action Back to Slides
  • 70. Advanced Search Commands
  transaction – Groups events by a common field value. Convenient, but resource intensive.
  cluster – Clusters similar events together. Can be used on _raw or a field.
  associate – Identifies correlations between fields. Calculates entropy between field values.
  correlate – Calculates the correlation between different fields. Evaluates the relationship of all fields in a result set.
  contingency – Builds a contingency table for two fields. Computes co-occurrence, or the % of events in which two fields appear together.
  anomalies – Computes an unexpectedness score for an event. Computes the similarity of an event (X) to a set of previous events (P).
  anomalousvalue – Finds and summarizes irregular, or uncommon, search results. Considers frequency of occurrence or number of stdevs from the mean.
  • 71.  Sews events together and creates duration + eventcount fields 73 ... | transaction JSESSIONID | table JSESSIONID, action, product_id Group by JSESSIONID View Customer Activity by Session Using the transaction Command
  • 72.  Intelligent grouping (creates cluster_count and cluster_label) Cluster 74 ... | cluster showcount=1 | table _raw, cluster_count, cluster_label
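The remaining commands from the table compose the same way. For instance, a sketch that keeps only events containing statistically unusual field values, where pthresh sets the probability threshold:
  sourcetype=access_combined | anomalousvalue action=filter pthresh=0.02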
  • 73.  Predict over time  Chart Overlay with and without streamstats  Maps with iplocation + geostats  Single value  Metered visuals with gauge Doing More with Basic Reporting Commands 75
  • 74.  Predict future values using lower/upper bounds – single and multiple series 76 ... | timechart count as traffic | predict traffic Predict Website Traffic Using the predict Command
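predict also accepts multiple series at once. A sketch, assuming both action values exist in the events:
  sourcetype=access_combined | timechart span=1h count(eval(action="view")) as views, count(eval(action="purchase")) as purchases | predict views purchases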
  • 75. 77 sourcetype=access_combined (action=view OR action=purchase) | timechart span=10m count(eval(action="view")) as Viewed, count(eval(action="purchase")) as Purchased Compare Browsing vs. Buying Activity Simple Chart Overlay
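The streamstats variant mentioned in the agenda lays a running statistic over the raw trend. A sketch, assuming a 6-bucket (one-hour) moving window is the comparison you want:
  sourcetype=access_combined | timechart span=10m count as traffic | streamstats window=6 avg(traffic) as hourly_avg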
  • 76. 78 ... | iplocation clientip | geostats count by clientip Combine IP lookup with geo mapping Map Customer Activity Geographically Geolocation in Action
  • 77. 79 ... | stats count Display a Simple Count of Events Single Value in Action
  • 78. Display Counts Using Gauges Single Value, Radial and Filler Gauges in Action 80 ... | stats count | gauge count 10000 20000 30000 40000 50000
  • 79. Data Model and Pivot
  • 80. Agenda  What is a data model?  Build a data model  Pivot Interface  Accelerate a data model 82
  • 81. Powerful Analytics Anyone Can Use Enables non-technical users to build complex reports without the search language Provides more meaningful representation of underlying raw machine data Acceleration technology delivers results faster as volume increases 83 Pivot Data Model Analytics Store
  • 82. Define Relationships in Machine Data Data Model • Describes how underlying machine data is represented and accessed • Defines meaningful relationships in the data • Enables single authoritative view of underlying raw data Hierarchical object view of underlying data Add constraints to filter out events
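A defined model is also searchable directly from SPL via the datamodel command. A minimal sketch, assuming a hypothetical data model named Web whose root object is also named Web:
  | datamodel Web Web search | search Web.status=404 | stats count by Web.http_method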
  • 83. Transparent Acceleration • Automatically collected – Handles timing issues, backfill… • Automatically maintained – Uses acceleration window • Stored on the indexers – Peer to the buckets • Fault tolerant collection Time window of data that is accelerated Check to enable acceleration of data model High Performance Analytics Store
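Accelerated models can be queried directly with tstats, which reads the analytics store instead of raw events. Again assuming a Web data model:
  | tstats count from datamodel=Web where Web.status=404 by Web.http_method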
  • 84. Easy-to-Use Analytics • Drag-and-drop interface enables any user to analyze data • Create complex queries and reports without learning search language • Click to visualize any chart type; reports dynamically update when fields change Select fields from data model Time window All chart types available in the chart toolbox Save report to share Pivot
  • 85.  Defines the least common denominator for a data domain  Standard method to parse, categorize, normalize data  Set of field names and tags by domain  Packaged as data models in a Splunk app  Domains: security, web, inventory, JVM, performance, network sessions, and more  Minimal setup to use Pivot interface Common Information Model (CIM) App 87
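Because CIM-compliant sources share tags and field names, one search spans every product that feeds the model. A sketch, assuming authentication data has been mapped to the CIM:
  tag=authentication action=failure | stats count by user, src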
  • 86. Custom Visualizations and the Web Framework Toolkit
  • 87. Agenda  Developer Platform  Web Framework Toolkit (WFT)  REST API and SDKs  Get a Flying Start 89
  • 88. Optimizing the Analytics Process 90 Focus on the data – intuitive tools to enable the analyst No single visualization exists to handle all data sets. Never lose sight of the raw data Splunk Analytics Explore Context Visualize Algorithms
  • 89. Simple, Interactive, and Extensible 91 VISUALIZATION EXPLORATION CUSTOMIZABLE FRAMEWORK POWERFUL ANALYTICS Pivot Data Models Interactive Forms Contextual Drilldown Dashboard Editor Web Framework
  • 90. The Splunk Enterprise Platform  Core Engine – Core Functions: Collection, Indexing, Search Processing Language  Content: Inputs, Apps, Other Content  User and Developer Interfaces: Web Framework, REST API, SDK
  • 91. Powerful Platform for Enterprise Developers Developers Can Customize and Extend REST API Build Splunk Apps Extend and Integrate Splunk Simple XML JavaScript HTML5 Web Framework Java JavaScript Python Ruby C# PHP Data Models Search Extensibility Modular Inputs SDKs
  • 92. A Wealth of Splunk Apps Over 1,300 apps available on the Splunk apps site API SDKs UI Server, Storage, Network Server Virtualization Operating Systems Custom Applications Business Applications Cloud Services App Performance Monitoring Ticketing and Other Web Intelligence Mobile Applications Stream
  • 93.  Interactive, cut/paste examples from popular source repositories: D3, GitHub, jQuery  Splunk 6.x Dashboard Examples App https://apps.splunk.com/app/1603  Custom SimpleXML Extensions App https://apps.splunk.com/app/1772  Splunk Web Framework Toolkit App https://apps.splunk.com/app/1613 Example Advanced Visualizations 95
  • 95. Splunk Documentation 97 • http://docs.splunk.com • Official Product Docs • Wiki and community topics • Updated daily • Can be printed to PDF
  • 96. Splunk Answers 98 • http://answers.splunk.com • Community driven • Splunk supported • Knowledge exchange • Q & A
  • 97. Splunk Education 99 • Recommended for Users – Using Splunk – Searching & Reporting • Recommended for UI/Dashboard Developers – Developing Apps • Instructor-Led Courses – Web – Onsite

Editor's Notes

  • #5: Unlike traditional structured data or multi-dimensional data– for example data stored in a traditional relational database for batch reporting – machine data is non-standard, highly diverse, dynamic and high volume. You will notice that machine data events are also typically time-stamped – it is time-series data.   Take the example of purchasing a product on your tablet or smartphone: the purchase transaction fails, you call the call center and then tweet about your experience. All these events are captured - as they occur - in the machine data generated by the different systems supporting these different interactions.   Each of the underlying systems can generate millions of machine data events daily. Here we see small excerpts from just some of them.
  • #6: When we look more closely at the data we see that it contains valuable information – customer id, order id, time waiting on hold, twitter id … what was tweeted.   What's important is, first, the ability to see across all these disparate data sources, and then to correlate related events across them to deliver meaningful insight.
  • #7: If you can correlate and visualize related events across these disparate sources, you can build a picture of activity, behavior and experience. And what if you can do all of this in real time? You can respond more quickly to events that matter. For example, if an organization captured the customer's Twitter ID in its customer profile, this correlation would be possible. Where that doesn't exist, it could at least group the tweets by demographic. You can extrapolate this example to a wide range of use cases – security and fraud, transaction monitoring and analysis, web analytics, IT operations and so on.
  • #8: The rise of big data has forced IT organizations to transition from a focus on structured, relational data, to accommodate unstructured data, driven by the volume, velocity and variety of today's applications and systems. As the data has changed from structured data to unstructured data, the technology approach needs to change as well. When you don't know what data types you'll need to analyze tomorrow or what questions you need to ask in a week, flexibility becomes a key component of your technology decisions. The ability to index any data type, search across silos and avoid being locked into a rigid schema opens a new world of analytics and business insights to your organization. Schema at Read – Enables you to ask any question of the data Search – Enables rapid, iterative exploration of the data along with advanced analytics Universal Indexing – Enables you to ingest any type of machine data Horizontal scaling over commodity hardware enables big data analytics
  • #9: On the right is how Splunk can ingest all machine data, including data from non-traditional data sources like physical badge data. Talk about how we get data in, no schema, distributed architecture/search. And Splunk can understand machine data and the fields in it – time stamps, IPs, usernames, event codes, etc. Then talk to the bottom around external lookups on asset and identity information to add context/accuracy at search time. Splunk can be made aware of people’s roles and also the criticality of the assets & systems people are accessing. This context is needed to help distinguish between legitimate insider activity and inappropriate/malicious insider activity. For example, the CFO logging into a critical financial application is okay. But if the receptionist does this, that is a red flag. Then talk to the general use cases/capabilities at the top. We will talk about insider threat use cases in a few slides from now. Tie this slide back to the prior requirements in the step1-4 slide.
  • #10: The Splunk platform consists of multiple products and deployment models to fit your needs. Splunk Enterprise – for on-premise deployment Splunk Cloud – Fully managed service with 100% SLA and all the capabilities of Splunk Enterprise…in the Cloud Hunk – for analytics on data in Hadoop Splunk Mint – to get insights into data from Mobile devices The products can pull in data from virtually any source to support multiple use cases. Splunk Apps extend and simplify deployments by providing pre-packaged content designed for specific use cases and data types.
  • #15: Splunk’s mission statement is to make machine data accessible, useful and valuable to everyone. Splunk can take any machine data and automatically index it for fast searching. Because Splunk doesn’t use a database, there are no additional licenses, and most importantly, no pre-defined schema to limit how you use your information. Examples include the configuration files, syslog, Windows events and registry settings, as well as WMI. But the most important thing to note is how easy it is to get data into Splunk and make it useful.
  • #19: The Splunk App for Stream software captures real-time wire data from distributed infrastructures, including private, public and hybrid clouds with on-the-fly deployment and fine-grained filtering capabilities.
  • #24: Splunk DB Connect delivers reliable, scalable, real-time integration between Splunk Enterprise and traditional relational databases. With Splunk DB Connect, structured data from relational databases can be easily integrated into Splunk Enterprise, driving deeper levels of operational intelligence and richer business analytics across the organization. Organizations can drive more meaningful insights for IT operations, security and business users. For example, IT operations teams can track performance, outage and usage by department, location and business entities. Security professionals can correlate machine data with critical assets and watch-lists for incident investigations, real-time correlation and advanced threat detection using the award-winning Splunk Enterprise. Business users can analyze service levels and user experience by customer in real-time to make more informed decisions.
  • #31: To address the needs of developers, operations and product management, you need operational intelligence for your mobile apps. This is what we call mobile intelligence. Mobile intelligence provides real-time insight on how your mobile apps are performing, and can correlate with and enhance operational intelligence. Splunk software enables organizations to search, monitor, analyze and visualize machine-generated data from websites, applications, servers, networks, sensors and mobile devices. The Splunk MINT product line helps organizations monitor mobile app usage and performance, gain deep visibility into mobile app transactions and accelerate development.
Deliver better performing, more reliable apps: When a user has a problem with a mobile app, the issue could be isolated or spread across all app versions, handsets and OS types. With Splunk MINT, you can see issues with app performance or availability in real time. Bugs can be addressed quickly, and app developers can gain a head start in creating and delivering valuable app updates.
End-to-End Application Transaction Performance: When mobile apps fail, there are many potential sources of failure. With Splunk MINT Express, you can analyze overall transaction performance. And using Splunk MINT Enterprise, you can correlate this data with information from back-end apps to gain detailed insight on transaction problems. As a result, operations can reduce MTTR and better anticipate future mobile app back-end requirements.
Deliver real-time omnichannel analytics: Mobile apps give enterprises new ways of conducting digital business. With mobile app information in Splunk Enterprise, you can correlate usage and performance information—a form of omni-channel analytics—to better understand how users are engaging all aspects of your organization.
  • #32: Splunk MINT Express provides a dashboard that offers an at-a-glance view of mobile app health and usage. This includes an overall index called "MobDex", which provides a blended view of application usage, crashes, engagement and abandonment. The insight boxes provide top-level aggregated information, which you can click on to get more specific information, and context.
  • #33: Hunk offers Full-featured Analytics in an Integrated Platform Explore, analyze and visualize data, create dashboards and share reports from one integrated platform. Hunk enables everyone in your organization to unlock the business value of data locked in Hadoop Hunk integrates the processes of data exploration, analysis and visualization into a single, fluid user experience designed to drive rapid insights from your big data in Hadoop. Enable powerful analytics for everyone with Splunk’s Data Models and the Pivot interface, first released in Splunk Enterprise 6. And Hunk works with what you have today Hunk works on Apache Hadoop and most major distributions, including those from Cloudera, Hortonworks, IBM, MapR and Pivotal, with support for both first-generation MapReduce and YARN (Yet Another Resource Negotiator, the technical acronym for 2nd generation MapReduce). Preview results and interactively search across one or more Hadoop clusters, including from different distribution vendors. Use the ODBC driver for saved searches with report acceleration to feed data from Hunk to third-party data visualization tools or business intelligence software. Streaming Resource Libraries enables developers to stream data from NoSQL and other data stores, such as Apache Accumulo, Apache Cassandra, Couchbase, MongoDB and Neo4j, for exploration, analysis and visualization in Hunk.
  • #35: One of the key innovations in this product is Splunk Virtual Index technology. This patent-pending capability enables the seamless use of almost the entire Splunk technology stack, including the Splunk Search Processing Language for interactive exploration, analysis and visualization of data stored anywhere, as if it was stored in a Splunk Index. Splunk Analytics for Hadoop uses this foundational technology and is the first product to come from this innovation. To configure the virtual index, specify the external resource provider the virtual index is serviced by and specify the data paths that belong to this virtual index.
  • #52: The data, for example, may have a userid but you want to search on a name. Splunk's lookup capability can enrich the raw data by adding additional fields at search time. Some common use cases include event and error code description fields. Think "Page not Found" instead of "404". Enriching your data can lead to entirely new insight. In the example shown, Splunk took the userid and looked up the name and role of the user from an HR database. Similarly, it determined the location of the failed log in attempt by correlating the IP address. Even though these fields don't exist in the raw data, Splunk allows you to search or pivot on them at any time. You can also mask data. For example, you may want social security numbers to be replaced with all X's for regular users but not masked for others. Removing data can also be useful, such as filtering PII, before writing it to an index in Splunk.
  • #84: Splunk 6 takes large-scale machine data analytics to the next level by introducing three breakthrough innovations: Pivot – opens up the power of Splunk search to non-technical users with an easy-to-use drag and drop interface to explore, manipulate and visualize data Data Model – defines meaningful relationships in underlying machine data and making the data more useful to broader base of non-technical users Analytics Store – patent pending technology that accelerates data models by delivering extremely high performance data retrieval for analytical operations, up to 1000x faster than Splunk 5 Let’s dig into each of these new features in more detail.
  • #85: Data Models are created using the Data Model Builder and are usually designed and implemented by users who understand the format and semantics of their indexed data, and who are familiar with the Splunk Search Processing Language (SPL). They define meaningful relationships in the data. Unlike data models in the traditional structured world, Splunk Data Models focus on machine data and data mashups between machine data and structured data. Splunk software is founded on the ability to flexibly search and analyze highly diverse machine data employing late-binding or search-time techniques for schematization (“schema-on-the-fly”). And Data Models are no exception. They define relationships in the underlying data, while leaving the raw machine data intact, and map these relationships at search time. They are therefore highly flexible and designed to enable users to rapidly iterate. Security is also a key consideration and data models are fully permissionable in Splunk 6.
  • #86: Data Models are accelerated using the High Performance Analytics Store, new in Splunk 6. The High Performance Analytics Store represents a breakthrough innovation from Splunk that dramatically accelerates analytical operations across massive data sets by up to 1000x over Splunk 5. The Analytics Store contains a separate store of pre-extracted values derived from the underlying Splunk index. This data is organized in columns for rapid retrieval and powers dramatic improvements in the performance of analytical operations. Once created, the Analytics Store is used seamlessly by Data Models and in turn the Pivot interface. For users more comfortable with the Splunk Search Processing Language (SPL), the Analytics Store can also be used directly in the search language. The Splunk Analytics Store is different from traditional columnar databases – it is based on the Splunk lexicon and optimized for data retrieval (versus updates) by the Splunk Data Model or directly from the Splunk Search Processing Language. With the Analytics Store, Splunk Enterprise now uniquely optimizes data retrieval for both rare term searches and analytical operations, all in the same software platform. The new Pivot interface, combined with Data Models and the Analytics Store, makes it dramatically easier for non-technical users and technical users alike to analyze and visualize data in Splunk, and represents an important step towards Splunk's mission of making machine data accessible, usable and valuable to everyone.
  • #87: The Pivot interface enables non-technical and technical users alike to quickly generate sophisticated charts, visualizations and dashboards using simple drag and drop and without learning the Search Processing Language (SPL). Users can access different chart types from the Splunk toolbox to easily visualize their data different ways. Queries using the Pivot interface are powered by underlying “data models” which define the relationships in Machine Data.
  • #93: What does this platform look like? The platform consists of two layers: a core engine and an interface layer. On top of the platform you can run a broad spectrum of content that supports use cases. Use cases range from application management and IT operations, to ES and PCI compliance, to web analytics. The core engine provides the basic services for real-time data input, indexing and search, as well as alerting, large-scale distributed processing and role-based access. The interface layer consists of the basic UI for search, reporting and visualization – it contains developer interfaces, the REST API, SDKs and Web Framework. The SDKs provide convenient access to core engine services in a variety of programming language environments. The Web Framework enables developers to quickly create Splunk Apps by using the modern web programming paradigm, including pre-built components, styles, templates, and reusable samples, as well as supporting the development of custom logic, interactions, components, and UI. Developers can choose to program their Splunk App using Simple XML, JavaScript or Django (or any combination thereof). These programmatic interfaces allow you to either: extend Splunk, integrate Splunk with other applications, or build completely new applications from scratch that require the OI or analytical services that Splunk provides.
  • #94: BUILD SPLUNK APPS The Splunk Web Framework makes building a Splunk app look and feel like building any modern web application.   The Simple Dashboard Editor makes it easy to BUILD interactive dashboards and user workflows as well as add custom styling, behavior and visualizations. Simple XML is ideal for fast, lightweight app customization and building. Simple XML development requires minimal coding knowledge and is well-suited for Splunk power users in IT to get fast visualization and analytics from their machine data. Simple XML also lets the developer "escape" to HTML with one click to do more powerful customization and integration with JavaScript.   Developers looking for more advanced functionality and capabilities can build Splunk apps from the ground up using popular, standards-based web technologies: JavaScript and Django. The Splunk Web Framework lets developers quickly create Splunk apps by using prebuilt components, styles, templates, and reusable samples as well as supporting the development of custom logic, interactions, components, and UI. Developers can choose to program their Splunk app using Simple XML, JavaScript or Django (or any combination thereof).
EXTEND AND INTEGRATE SPLUNK Splunk Enterprise is a robust, fully-integrated platform that enables developers to INTEGRATE data and functionality from Splunk software into applications across the organization using Software Development Kits (SDKs) for Java, JavaScript, C#, Python, PHP and Ruby. These SDKs make it easier to code to the open REST API that sits on top of the Splunk Engine. With almost 200 endpoints, the REST API lets developers do programmatically what any end user can do in the UI and more. The Splunk SDKs include documentation, code samples, resources and tools to make it faster and more efficient to program against the Splunk REST API using constructs and syntax familiar to developers experienced with Java, Python, JavaScript, PHP, Ruby and C#. Developers can easily manage HTTP access, authentication and namespaces in just a few lines of code.
Developers can use the Splunk SDKs to:
- Run real-time searches and retrieve Splunk data from line-of-business systems like Customer Service applications
- Integrate data and visualizations (charts, tables) from Splunk into BI tools and reporting dashboards
- Build mobile applications with real-time KPI dashboards and alerts powered by Splunk
- Log directly to Splunk from remote devices and applications via TCP, UDP and HTTP
- Build customer-facing dashboards in your applications powered by user-specific data in Splunk
- Manage a Splunk instance, including adding and removing users as well as creating data inputs from an application outside of Splunk
- Programmatically extract data from Splunk for long-term data warehousing
Developers can EXTEND the power of Splunk software with programmatic control over search commands, data sources and data enrichment. Splunk Enterprise offers search extensibility through:
- Custom Search Commands: developers can add a custom search script (in Python) to Splunk to create their own search commands. To build a search that runs recursively, developers need to make calls directly to the REST API.
- Scripted Lookups: developers can programmatically script lookups via Python.
- Scripted Alerts: can trigger a shell script or batch file (we provide guidance for Python and Perl).
- Search Macros: make chunks of a search reusable in multiple places, including saved and ad hoc searches.
Splunk also provides developers with other mechanisms to extend the power of the platform. - Data Models: allow developers to abstract away the search language syntax, making Splunk queries (and thus, functionality) more manageable and portable/shareable. - Modular Inputs: allow developers to extend Splunk to programmatically manage custom data input functionality via REST.
  • #95: Here are just some of the new Splunk Apps that have been delivered over the past year. Their goal is to make it easier to use Splunk for specific technologies and use cases – prepackaging inputs, field extractions, searches and visualizations. Highlight a few apps. These apps along with 100’s of others have been developed not only by Splunk but by partners, customers and members of the Splunk community.
  • #98: “After this workshop, if you want more information, all the product documentation is available online. The documentation is divided into several manuals. For reporting and dashboards you will likely be most interested in the User and Developer Manuals.”
  • #99: “For a more interactive approach to getting your questions addressed there is Splunk Answers. It is a web based Splunk community of Splunkers like you. Splunk employees are also regular experts on the site.”
  • #100: "It is not possible to cover everything you need to know about building reports and dashboards in 30-45 minutes. For more structured training with labs, consider Splunk education courses. These are available as instructor-led web-based courses or onsite if there are enough participants per class."