Snowplow drives everything we do
What and why?
Digital and print publisher
Family-owned German company
116 sites across Australia and New Zealand
Tag management across all sites
Bauer Media
Just start collecting
Snowplow data collection in 2014
We didn’t really have a use case
Snowplow is at the core of everything we do
Stuff we record
Page views
Metadata around content
User logins
Email click-throughs
Ad impressions
Use cases started showing up
Cross-site integrated reporting
Ad hoc tricky analysis
Sanity checking industry audience reporting
Stalking individual users
Audience overlaps
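A minimal sketch of the audience-overlap use case, assuming each event can be reduced to a (site, user id) pair. The function name and the sample site and user names are hypothetical, not taken from the deck.

```python
from itertools import combinations


def audience_overlaps(events):
    """Count users shared by each pair of sites.

    events: iterable of (site, user_id) pairs (a hypothetical, simplified
    view of the event data). Returns {(site_a, site_b): shared_user_count}.
    """
    users_by_site = {}
    for site, user_id in events:
        # A set per site, so repeat visits by one user count once.
        users_by_site.setdefault(site, set()).add(user_id)

    overlaps = {}
    for site_a, site_b in combinations(sorted(users_by_site), 2):
        overlaps[(site_a, site_b)] = len(users_by_site[site_a] & users_by_site[site_b])
    return overlaps


sample_events = [
    ("site-a", "u1"), ("site-a", "u2"), ("site-a", "u2"),
    ("site-b", "u2"), ("site-b", "u3"),
    ("site-c", "u1"), ("site-c", "u2"),
]
print(audience_overlaps(sample_events))
```

At the scale of 116 sites this would more likely be a query in the warehouse; the set intersection only illustrates the idea.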
User behaviour
Ad impressions
Content metadata
Trending service
Recommendations
Dashboards
Ad hoc analysis
Some things you can’t do in GA
Tag-based reporting
Accurate reporting of Facebook in-app browser traffic, using "user-agent contains FBAN"
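The FBAN check can be sketched as a plain substring test on the user-agent string. The function names and the sample user-agent strings below are illustrative assumptions; only the "contains FBAN" rule comes from the slide.

```python
def is_facebook_in_app(user_agent):
    """True when the user-agent string contains the FBAN token."""
    return "FBAN" in (user_agent or "")


def facebook_in_app_share(user_agents):
    """Fraction of the given user agents that match the FBAN rule."""
    agents = list(user_agents)
    if not agents:
        return 0.0
    return sum(is_facebook_in_app(ua) for ua in agents) / len(agents)


# Made-up, shortened user-agent strings for illustration only.
sample_agents = [
    "Mozilla/5.0 (iPhone) AppleWebKit Mobile [FBAN/FBIOS;FBAV/55.0]",
    "Mozilla/5.0 (Windows NT 10.0) AppleWebKit Chrome Safari",
    "Mozilla/5.0 (Macintosh) AppleWebKit Safari",
    "Mozilla/5.0 (iPad) AppleWebKit Mobile [FBAN/FBIOS;FBAV/56.0]",
]
print(facebook_in_app_share(sample_agents))
```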
We’re using Snowplow 0.9.2 from 2014-04-29!
It just works
We’ve been busy building other stuff
But...
Page pings are b0rken: no time spent or scroll depth
(Out-of-the-box) browser categorisation is terrible
Hourly batches are a bit higher latency than we’d like
No context shredding, but JSON queries are performant enough
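Without shredding, a context stays a JSON string in a single column and is parsed at query time. A rough sketch of that kind of lookup, assuming a flat JSON object per row; the field names ("section", "tags") and the helper name are hypothetical.

```python
import json


def context_field(contexts_json, key, default=None):
    """Pull one key out of an unshredded JSON context string.

    Returns `default` when the column is empty, is not valid JSON,
    or is not a JSON object.
    """
    if not contexts_json:
        return default
    try:
        parsed = json.loads(contexts_json)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return default
    if isinstance(parsed, dict):
        return parsed.get(key, default)
    return default


row_context = '{"section": "food", "tags": ["recipes", "dinner"]}'
print(context_field(row_context, "tags"))
```

The cost is that every query re-parses the string, which is the trade-off the slide describes as "performant enough".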
runSnowPlow.sh
[Pipeline diagram; boxes listed in slide order]
Web page (JavaScript in page creates image beacon)
S3
Cloudfront
SnowCannon (Node app in Elastic Beanstalk)
ETL (Elastic Map Reduce)
S3
events (Redshift)
events_temp (Redshift)
x_events (Redshift)
Arrow labels: "Redirects to", "Writes logs to"
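The first hop in the diagram, a page requesting an image beacon, amounts to a GET request with the event packed into the query string. The sketch below shows the shape of such a request and how a collector could read it back. The collector host, the "/i" path and the parameter names (e, url, page, aid, p) are assumptions based on a general understanding of the Snowplow tracker protocol, not details from the deck, and the real beacon is built by JavaScript in the page.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def beacon_url(collector_host, page_url, page_title, app_id):
    """Build a page-view beacon URL (parameter names are assumptions)."""
    params = {
        "e": "pv",           # event type: page view
        "url": page_url,     # address of the page being viewed
        "page": page_title,  # page title
        "aid": app_id,       # which site sent the event
        "p": "web",          # platform
    }
    return "https://" + collector_host + "/i?" + urlencode(params)


def read_beacon(url):
    """Decode a beacon URL back into a flat dict, as a collector might."""
    query = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in query.items()}


url = beacon_url("collector.example.com", "https://example.com/a?b=1", "Home page", "site-a")
print(url)
print(read_beacon(url))
```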
Tips
Redshift can get very expensive very quickly
Decent dashboarding platforms are rare
And plenty of crap ones are overpriced
Just tip everything in and worry about what you’ll do later
What’s next?
Future plans
Upgrade ETL to real-time: probably our own solution
Time spent and scroll depth
Shredding?
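Once page pings work, time spent can be estimated by counting pings per page view and multiplying by the heartbeat interval. A sketch under stated assumptions: the per-view identifier and the 10-second heartbeat are hypothetical, and the estimate is only as fine-grained as the heartbeat.

```python
from collections import Counter


def time_spent_seconds(events, heartbeat_seconds=10):
    """Estimate time spent per page view from page ping counts.

    events: iterable of (page_view_id, event_type) pairs, where event_type
    is "page_view" or "page_ping". Returns {page_view_id: estimated seconds}.
    """
    events = list(events)
    pings = Counter(view_id for view_id, event_type in events if event_type == "page_ping")
    view_ids = {view_id for view_id, _ in events}
    # Views with no pings get 0 seconds rather than being dropped.
    return {view_id: pings[view_id] * heartbeat_seconds for view_id in view_ids}


sample = [
    ("view-a", "page_view"), ("view-a", "page_ping"), ("view-a", "page_ping"),
    ("view-b", "page_view"),
]
print(time_spent_seconds(sample))
```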


Editor's Notes

  • #10 (slide: User behaviour, Ad impressions, Content metadata): Dolly usage by hour