Interesting Applications of Analytics
In the simplest terms, analytics is "answering questions with data". With advances in the technology for capturing data and for accessing and analyzing it efficiently, analytics has become mainstream. Gone are the days when it was an esoteric science that required scientists to find anything in millions of rows and columns of numbers. Today, with some training and orientation, everyone can do "analytics", from simple aggregated analysis to advanced predictive modeling. As a wise analyst once quipped – bring me the right question and I will give you the right answer through data. Analytics today is leveraged across functions, domains, industries and day-to-day activities, from a smart energy meter at home to predicting presidential election results.
Here we bring you some stories where analytics was leveraged in a non-traditional way in non-
traditional industries to achieve success.
Obama’s re-election campaign
The problem:
Given the losses during the first term of the presidency, re-election was no longer a sure thing. The Obama team had a simple fundamental principle – get everyone who voted the first time to vote again, and also attempt to bring in new voters, whether from new, growing demographics or by swaying the undecideds into their fold through targeted messaging. To them it became a task of reorganizing the coalition of supporters one by one, through personal touchpoints.
The solution:
Obama For America (OFA) 2012 will be remembered for leveraging best practices in analytics, surveys, testing, visualization & reporting, and for how it molded them into a coherent "Data-Driven Strategic Framework".
“The Cave” in OFA’s Chicago headquarters housed the campaign’s Analytics team. Behind closed doors,
more than 50 data analysts used Big Data to predict the individual behavior of tens of millions of
American voters.
 Data Collection: (Dan Wagner)
o Survey Manager: A series of surveys on voters' attitudes and preferences fed into software called 'Survey Manager' as tables. Surveys ranged from short-term to long-term interviews with voters. At one point, the campaign completed 8,000 to 9,000 such calls per night.
o Constituent Relationship Management system: Captured everything about interactions with voters, volunteers, donors and website users. Vertica software from Hewlett-Packard allowed combined access to the party's 180-million-person voter file and all the other data systems.
o The analytics staff also routinely aggregated all the varied data sources -- Benenson's aggregate battleground survey, the state tracking polls, the analytical calls and even public polling data -- and fed them into predictive models.
 Micro-targeting models: (Predictive Analysis)
o The electorate could be seen as a collection of individual citizens who could each be measured and assessed on their own terms – their likelihood of casting a ballot and of supporting Obama. These models were adjusted weekly based on new data. Applying these models identified which non-registrants were most likely to be Democrats and which Republicans, and informed subsequent "Get-out-the-vote" and "persuasion" campaigns.
o Models estimated support for Obama and Romney in each state and media market. They controlled for the "house effects" of each pollster or data collection method, and each nightly run of the model involved approximately 66,000 "Monte Carlo" simulations, which allowed the campaign to calculate its chances of winning each state (a simplified sketch of such a simulation appears after this list).
 Experiment-informed programs (EIPs): Designed by the Analyst Institute, these are A/B tests (or correlation analyses) used for various purposes:
o Testing 'Resonating' Messages: Measuring the effectiveness of different types of messages at moving public opinion. Experimenters would randomly assign voters to receive varied sequences of direct mail – four pieces on the same policy theme, each making a slightly different case for Obama – and then use ongoing survey calls to isolate the attributes of those whose opinions changed as a result; e.g., the 45-65 age group responded better to Medicare messages than the 65+ group already enrolled in the program.
o Fundraising: Testing different variations of fundraising e-mails to find the ones with the best response rate. In one campaign, they tested 18 variations of subject line, email copy and number of mails sent. When they rolled out the winning variation, "I will be outspent", to the broader email base, it raised more than $2.6 million on June 26, 2012 (a minimal way of evaluating such a test is sketched after this list).
o User Interface Optimization: The campaign conducted 240 A/B tests on its donation page, which resulted in a 49% increase in conversion rate. By making the platform 60% faster, they saw a 14% increase in donations. In June 2012, the campaign switched to a four-step donation process and saw a 5% increase in conversions (donations).
 Campaign Response Analytics: (D’Agostino) (Predictive Analysis)
o Customized Donation Requests: Instead of soliciting a fixed amount like $25, the campaign tested different percentages of donors' highest previous donation amounts and found that all versions of those requests did better than a set amount.
o Targeted Communications: Models based on recipients' responses to past e-mail campaigns helped organizers better target their communications; e.g., certain recipients were more open to volunteering than to donating online. This was facilitated by a new system called Narwhal, which ran the 'analytics' algorithms above.
 Optimizer (Davidsen): (Aggregate Analysis) A product that helped with behaviorally targeted TV buys. It combined the model predictions above with TV viewing behavior to find the quarter-hour segments of the day with the greatest number of persuadable targets per dollar across 60 channels. It was developed in coordination with a company called Rentrak. The campaign estimated that this made the TV buy as a whole 10-20% more efficient – the equivalent of between $40 million and $80 million in added media.
 Social Analytics: (Aggregate Analysis) OFA scored 50,000 Twitter accounts by political
affiliation. They used Twitter influence (looking at number of tweets & followers) to target direct
messages asking people to get involved.
 Communication Analytics (Matthew Rattigan): (Aggregate Analysis) OFA had a tool that looked at the coverage of speeches in local newspapers to understand people's reactions across geographic regions and which parts were quoted most. Speechwriters were therefore able to see whether the messages they wanted to convey were actually the ones that got covered.
 Dashboard was the campaign's grassroots organizing platform, which mapped directly to how the campaign was structured in the field. It provided a unified view of the team, activity (calls, messages), voting info, fundraising, etc. A mobile app allowed a canvasser to download and return walk sheets without ever entering a campaign office.
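As a concrete illustration of the nightly win-probability simulations mentioned in the micro-targeting bullet above, here is a minimal sketch in Python (shown in Python for illustration, though the campaign itself used R; this is not the campaign's actual model, and the state-level support estimates and standard errors are invented):

# A minimal sketch: estimate the chance of winning a state by simulating many
# elections around a polling-based estimate. All numbers are hypothetical.
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical inputs: modeled two-party support and its uncertainty per state.
states = {
    # state: (estimated Obama two-party share, standard error)
    "OH": (0.51, 0.02),
    "FL": (0.50, 0.02),
    "CO": (0.52, 0.025),
}

n_sims = 66_000  # the campaign reportedly ran roughly 66,000 simulations nightly

for state, (mean_share, std_err) in states.items():
    # Draw a plausible "true" vote share for each simulated election.
    simulated_share = rng.normal(mean_share, std_err, size=n_sims)
    win_probability = (simulated_share > 0.5).mean()
    print(f"{state}: P(win) = {win_probability:.1%}")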
Tool Used: R was used for analytics projects throughout the campaign; D3 for data visualization.
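The fundraising and donation-page experiments described above come down to comparing conversion rates between variants. Below is a minimal sketch of one common way to evaluate such a test, a two-proportion z-test; the winning subject line is the one reported above, but the counts are hypothetical:

# A minimal sketch of evaluating an email A/B test; the counts are made up.
from statistics import NormalDist

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_a, p_b, z, p_value

# Hypothetical counts for two subject-line variants.
rate_a, rate_b, z, p = two_proportion_ztest(conv_a=620, n_a=50_000,   # "I will be outspent"
                                            conv_b=480, n_b=50_000)   # control subject line
print(f"A: {rate_a:.2%}, B: {rate_b:.2%}, z={z:.2f}, p={p:.4f}")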
The Results:
 Some of the congressional predictions from the models were within a +/-2.5% range; e.g., the final predicted margin for a 2009 special election for an open congressional seat in upstate New York was 150 votes, well before Election Day.
 The final simulations were accurate to within 0.2% in Ohio and 0.4% in Florida, but were 1% too
cautious in Colorado.
 OFA's final projection was for a 51-48 battleground-state margin for the president, which is approximately where the race ended up.
As OFA summarized, "Data analytics made a national presidential campaign run the way of a local ward campaign."
References:
http://www.technologyreview.com/featuredstory/509026/how-obamas-team-used-big-data-to-rally-
voters/
http://engagedc.com/download/Inside%20the%20Cave.pdf
http://techpresident.com/news/23214/how-analytics-made-obamas-campaign-communications-more-
efficient
http://www.businessweek.com/articles/2012-11-29/the-science-behind-those-obama-campaign-e-mails
Operation Blue Crush (Crime Reduction Using Statistical History) 2005
The problem:
Blue Crush began as the brainchild of University of Memphis professor Richard Janikowski, who met with then Memphis Police Department (MPD) director Larry Godwin to talk about new ways to reduce crime. With the MPD open to both a new strategy and sharing data, "Operation Blue Crush" was born as a predictive-analytics-based crime-fighting effort in one of the most crime-ridden cities in America. The University of Memphis actively coordinates with the MPD on this program.
The solution:
The underlying philosophy is to pre-emptively identify places to dedicate police resources to prevent
and/or reduce crimes.
 Data collection/monitoring: The Memphis Police Department gathers data on every crime
reported in the city and then tracks and maps all crimes over time. When patterns of criminal
activity emerge from the data, officers are assigned to "details" and sent to those areas that
data show are being hardest hit. Handheld devices help the MPD file reports on the spot -- making them available to detectives within minutes -- and check instantly for local and national outstanding warrants.
 Aggregate Analysis: Past criminal-event statistics are cross-tabulated and mapped to create "focus areas" (a minimal sketch of this idea appears below).
 Trend & Correlation Analysis: Long/short term Trend in Crimes, by various drivers like time of
day, day of week, etc.
The above analyses help police proactively deploy resources -- from organized crime and special-ops units to the mounted patrol, K-9, traffic and DUI enforcement.
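A minimal sketch of the crosstab idea behind "focus areas": count incidents by place and time of day and flag cells well above average. The incident records, precinct names and threshold below are invented for illustration:

# A minimal sketch (with made-up incident records) of the kind of crosstab
# that could surface candidate "focus areas" by place and time of day.
import pandas as pd

incidents = pd.DataFrame({
    "precinct": ["North", "North", "East", "East", "East", "South", "North", "East"],
    "hour":     [22,      23,      1,      2,      22,      14,      2,       23],
})
incidents["time_of_day"] = pd.cut(
    incidents["hour"], bins=[0, 6, 12, 18, 24],
    labels=["night", "morning", "afternoon", "evening"], right=False,
)

counts = pd.crosstab(incidents["precinct"], incidents["time_of_day"])
print(counts)

# Flag cells well above the overall average as candidate focus areas.
stacked = counts.stack()
threshold = counts.values.mean() * 1.5
print("Candidate focus areas:")
print(stacked[stacked > threshold])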
Tool Used: IBM SPSS is a partner.
The Results:
 In its first 7 years, violent crime was down 23%, and burglaries declined at five times the national average. The area most impacted by Blue Crush was apartments around the city, where violent crime was cut by more than 35%.
 IBM is also circulating a June case study that says Memphis made an 863 percent return on its
investment, calculated using the percentage decline in crime and the number and cost of
additional cops that would be needed to match the declining rate.
 The study by Nucleus Research said Memphis has paid on average $395,249 a year on the initiative, including personnel costs, for a $7.2 million return (a figure that appears to conflict with the IBM number).
 Blue Crush has become a department-wide philosophy – facilitating effective deployment of
resources and higher level of accountability and responsibility from all officers of MPD.
 Blue Crush has now evolved into a more intense and targeted community policing strategy at
chronic crime hotspots.
References:
http://www.commercialappeal.com/news/2010/sep/19/blue-crush-gives-ibm-a-boost/
http://www.memphispolice.org/BLUE%20Crush.htm
http://www.commercialappeal.com/news/2013/jan/27/blue-crush-controversy/
http://wreg.com/2013/05/01/the-brain-behind-operation-blue-crush-retires/
Global Warming Prediction Report
The problem:
Global warming is a global problem with global ramifications. The IPCC is the leader in this domain and published a model based on CO2 concentrations back in 2007. With the aim of modeling and predicting global temperature anomalies through "Self-Organizing Knowledge Extraction" using public data, Insights (formerly KnowledgeMiner), a research, consulting and software development company in the field of high-end predictive modeling, initiated this project. Insights presented 6-year monthly global mean temperature predictions in September 2011, which were then discussed on Climate Etc. in October 2011.
The solution:
The philosophy of this model is to let the data tell the story – don't start with hypotheses to test, since there is a lot we humans don't know and can't use to predict. One technique based on this philosophy, "Self-Organizing Modeling", works on adaptive networks, where predictive variables self-organize into a mathematical equation of optimal complexity and reliable predictive accuracy.
 Data collection: Data comes from public sources. Data inputs include sun, ozone, cloud, aerosols
and CO2 concentrations.
 Predictive Analysis: The self-organizing approach builds a dynamic system model – a system of nonlinear difference equations – from monthly observation data covering the past 33 years. Once built, the model revealed interdependencies in the system (e.g., ozone affects other variables), and these interdependencies then combine in a way that predicts global temperatures.
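To make the "system of difference equations" idea concrete, here is a greatly simplified sketch: a linear, lag-1 difference equation fitted by least squares to synthetic monthly data. The real self-organizing approach searches over nonlinear model structures using actual public records; everything below is illustrative only:

# A greatly simplified sketch: predict next month's temperature anomaly from the
# lagged anomaly and lagged drivers using ordinary least squares on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
n = 396  # roughly 33 years of monthly observations

# Synthetic driver series standing in for solar, ozone, aerosol and CO2 inputs.
drivers = rng.normal(size=(n, 4))
anomaly = np.zeros(n)
for t in range(1, n):
    anomaly[t] = (0.8 * anomaly[t - 1]
                  + drivers[t - 1] @ [0.05, -0.03, -0.02, 0.04]
                  + rng.normal(scale=0.05))

# Design matrix: intercept, lag-1 anomaly and lag-1 drivers.
X = np.column_stack([np.ones(n - 1), anomaly[:-1], drivers[:-1]])
y = anomaly[1:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# One-step-ahead prediction for the next month.
x_next = np.concatenate([[1.0], [anomaly[-1]], drivers[-1]])
print("next-month anomaly forecast:", x_next @ coef)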
Tool Used: KnowledgeMiner
The Results:
 The model shows that the sun, ozone, aerosols and clouds are the primary drivers of global warming. It also acknowledges that there could be outside forces that have not been accounted for here.
 The model shows an accuracy of 75% given the noise and uncertainty in the observation data. It was also tested between April and December 2012, and KnowledgeMiner regularly updates the performance and predictions.
 The model also predicts that global mean temperatures will stabilize near current levels, although there may be regional variations.
 The above model could have ramifications for the debates/discussions/controversies around current strategies to fight global warming, especially the role of greenhouse gases and how to combat them.
References:
http://www.climateprediction.eu/cc/Main/Entries/2011/9/13_What_Drives_Global_Warming.html
http://www.knowledgeminer.eu/about.html
http://www.climateprediction.eu/cc/About.html
Predictive Analytics at Delegat's Wine Estates, a listed New Zealand Wine Company
The problem:
Delegat's is New Zealand's largest listed wine company – in 2012 Delegat's alone sold nearly two million cases of wine worldwide. The entire winemaking process is managed in-house by 350 staff globally, from growing the grapes to producing, distributing and selling the wine, with direct sales teams in each country. The winemaking business is demand/supply sensitive – a change in one area can impact the ability to serve customers in another. It is also time sensitive – highly specific growing and harvest seasons give winemakers only a brief window to find and fix supply problems at the vineyards. Predictive analytics provides the unique advantage of being prepared for such fluctuations.
The solution:
Together with IBM Business Partner Cortell NZ, Delegat's deployed an integrated planning and reporting tool:
 Data collection: Internal (Production, Supply, Demand, Product, Sales, Viticulture inputs) and
Market data.
 Reporting: Integrated standard planning & monitoring suite to keep track of all aspects of
business using KPIs.
 Aggregate Analysis: Supply/Demand profiling (what product for which region) & elasticity
studies (changes and response strategies) of markets and consumers.
 Predictive Analysis: Net profitability modeling based on yield, production, supply and demand.
System modeling on how one component affects others in the chain.
 Trend & Correlation Analysis: Short and Long term changes in the company and the industry.
 Sizing/Estimation: What-if scenarios on profitability, brand and other KPIs.
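A minimal sketch of what such a "what-if" profitability scenario grid might look like; the price, cost and scenario ranges below are invented, not Delegat's figures:

# A minimal what-if profitability grid over hypothetical yield and demand changes.
from itertools import product

base_cases_sold = 2_000_000        # roughly the 2012 volume cited above
price_per_case = 90.0              # hypothetical average selling price (USD)
cost_per_case = 65.0               # hypothetical all-in cost per case

yield_changes = [-0.10, 0.0, 0.10]     # harvest down 10%, flat, up 10%
demand_changes = [-0.05, 0.0, 0.05]    # demand down 5%, flat, up 5%

for yield_chg, demand_chg in product(yield_changes, demand_changes):
    supply = base_cases_sold * (1 + yield_chg)
    demand = base_cases_sold * (1 + demand_chg)
    cases = min(supply, demand)        # can only sell what is both grown and demanded
    profit = cases * (price_per_case - cost_per_case)
    print(f"yield {yield_chg:+.0%}, demand {demand_chg:+.0%} -> profit ${profit:,.0f}")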
Tool Used: IBM Cognos TM1
The Results:
 Time to produce reports was reduced by 90 percent, and planning cycles were shortened to six weeks.
 Efficient day-to-day and strategic business decisions - e.g.,
o Decision on acquisition of Oyster Bay Marlborough’s remaining shares based on insights
from “what-if” scenario models apart from business analysis.
o Expansion from owning ten vineyards to 15 based on insights from scenario modeling.
References:
http://asmarterplanet.com/blog/2012/10/new-zealand-vintner-taps-predictive-analytics-for-global-
wine-markets.html
http://www.huffingtonpost.com/paul-chang/ibm-analyze-this-how-tech_b_1967131.html
Predictive Energy Analytics to reduce Operational Costs
The problem:
Energy cost mitigation at a California wastewater facility was extremely difficult due to many factors outside the control of plant personnel, including the dynamics of wastewater flow, energy sources and the requirement to integrate effluent from multiple municipal agency treatment facilities arriving at various levels of water treatment. Dissimilar data collection platforms and "raw data only" reporting compounded the issue.
The solution:
Mike Murray and TR Bietsch from Heliopower group companies worked on implementing a data
analytics framework to overcome the above challenges. The primary goal of the project was to provide
operators with real-time, on-demand energy analytics.
 Understanding of requirements:
o Requirement Sessions: HelioPower (creator of PredictEnergy) conducted an energy audit to understand costs, energy utilization patterns and other business requirements.
o KPI: The primary philosophy was to increase output per energy cost by combining operational data with financial information. Consumption, demand and cost baselines were established, KPIs quantified and targets set (a simple KPI sketch follows this list).
o Energy sources (utility, co-gen and solar) were paired against uses (facility processes).
 Data collection: Data from energy sources, production information and the utility tariff cost structure. PredictEnergy, the tool used for the analysis, combined the existing SCADA (Supervisory Control And Data Acquisition) and metering systems with historical, current and predictive energy data from the utility and from distributed (in-house co-gen and solar) energy sources, installed at key points such as the main power meter and load centers like the co-gen unit.
 Reporting: Dashboard to monitor & analyze KPIs across various slices & dices.
 Aggregate Analysis: Profiled real-time energy costs for pumping and processing, optimized co-
gen energy cost off-set and quantified cost avoidance provided by Solar.
 Trend Analysis: Short and Long term changes in KPIs.
 Predictive Analysis: Patent-pending algorithms that compare actual energy consumption and demand against utility billings and baselines, iteratively predict best outcomes, and apply constant feedback error correction.
 Sizing/Estimation: What-if scenarios on KPIs and iterations for best performance.
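A minimal sketch of the output-per-energy-cost KPI and a simple load-shifting check; the tariff windows, loads and throughput figures below are invented for illustration:

# A minimal KPI and load-shifting sketch with hypothetical hourly data.
import pandas as pd

hourly = pd.DataFrame({
    "hour": range(24),
    "kwh": [120] * 8 + [260] * 10 + [150] * 6,                 # hypothetical plant load
    "gallons_treated": [40_000] * 8 + [70_000] * 10 + [45_000] * 6,
})
# Hypothetical time-of-use tariff: peak pricing from noon to 6 pm.
hourly["rate_per_kwh"] = [0.12 if 12 <= h < 18 else 0.08 for h in hourly["hour"]]
hourly["energy_cost"] = hourly["kwh"] * hourly["rate_per_kwh"]

cost_per_kgal = hourly["energy_cost"].sum() / (hourly["gallons_treated"].sum() / 1000)
print(f"energy cost per 1,000 gallons treated: ${cost_per_kgal:.2f}")

# Flag hours where heavy, potentially deferrable load runs during the peak tariff window.
peak_hours = hourly[(hourly["rate_per_kwh"] > 0.10) & (hourly["kwh"] > 200)]
print("candidate hours to shift load out of:", list(peak_hours["hour"]))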
Tool Used: PredictEnergy from HelioEnergySolutions
The Results:
 Analyses enabled operators to shift process loads and energy-source usage to minimize operational expenses (15+% via man-hour reduction, process load reduction and cross-billing error reduction) and energy costs (3-5%). The findings were shared with four other facilities as best practices to reduce their own costs.
References:
http://www.heliopower.com/wp-content/uploads/2013/01/Predictive-Energy-Analytics-to-Reduce-
Operational-Costs.pdf
http://www.heliopower.com/wp-content/uploads/2013/01/PredictEnergy-Product-Overview.pdf
http://heliopower.com/wp-content/uploads/2013/03/PredictEnergy-Implementation-Phases.pdf
Predicting High School Graduation and Dropout
The problem:
Every education board needs to understand the drivers of student graduation and dropout, so that it can tailor its programs to address the needs of students at risk, increase gain scores and reform schools. Raj Subedi, a researcher with the Department of Educational Psychology and Learning Systems, Florida Department of Education, submitted a dissertation in educational research which could inform the Department's efforts to address this problem.
The solution:
Predictive Models were built to understand the effect of Student, Teacher and School level variables on
the Graduation outcome (Yes or No) of the student.
 Data collection: This study used 6,184 students and 253 mathematics teachers from all middle
schools in the Orange County Public Schools (OCPS), which is the tenth largest school district out
of 14,000 in the USA.
o Outcome variable: Grades 6–8 mathematics YoY (’05 vs. ’04) gain scores in NRT-NCE
(Norm Referenced Test-Normal Curve Equivalent) portion of the Florida Comprehensive
Assessment Test, a state mandated standardized test of student achievement of the
benchmarks in reading, mathematics, science, and social studies in Florida schools.
o Student level predictors: Pretest scores and socio-economic status (participation in the free and reduced lunch program).
o Teacher level predictors: Mathematics content-area certification, advanced
mathematics or mathematics education degree, and experience.
o School level predictors: School poverty is defined as the percent of free and reduced
lunch students in each school, and teachers’ school mean experience is defined as the
average number of years taught by middle school teachers in a given school.
 Predictive Analysis: A three-level Hierarchical Linear Model (HLM) with a Value Added Model (VAM) approach was used to investigate student-, teacher- and school-level predictors. The HLM method accounts for interactions between the predictors at the various levels (student, teacher & school). Value Added Modeling estimates a teacher's contribution in a given year by comparing students' current test scores with their scores in the previous year, and also with the scores of other students in the same year.
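For readers who want to see what a multilevel model looks like in code, here is a minimal sketch using a mixed-effects model on synthetic data. The actual study used a three-level HLM (students within teachers within schools) in SAS; this simplified example nests synthetic students within schools only, with teacher characteristics as fixed effects:

# A minimal mixed-effects (multilevel) sketch on synthetic student data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_schools, students_per_school = 20, 50
school_effect = rng.normal(0, 2, n_schools)   # unobserved school-level variation

rows = []
for s in range(n_schools):
    certified = rng.integers(0, 2)            # are the school's teachers content-certified?
    for _ in range(students_per_school):
        pretest = rng.normal(50, 10)
        low_ses = rng.integers(0, 2)          # free/reduced lunch participation
        gain = (0.3 * (pretest - 50) - 1.5 * low_ses + 2.0 * certified
                + school_effect[s] + rng.normal(0, 4))
        rows.append({"school": s, "pretest": pretest, "low_ses": low_ses,
                     "certified": certified, "gain": gain})
df = pd.DataFrame(rows)

model = smf.mixedlm("gain ~ pretest + low_ses + certified", df, groups=df["school"])
print(model.fit().summary())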
Tool Used: SAS
The Results:
 Findings suggested that already high performing students can be expected to score better.
 Students from poorer economic backgrounds performed poorly on their own but improved with guidance from content-certified teachers.
 However, such students did not show performance improvement when paired with teachers possessing longer work experience.
 The findings also indicated that content certification and teaching experience – among the accountability requirements under the "No Child Left Behind" Act – had a positive impact. In other words, teacher competency had a substantial impact on student performance.
References:
http://www.hindawi.com/journals/edu/2011/532737/
http://diginole.lib.fsu.edu/cgi/viewcontent.cgi?article=4896&context=etd
http://www.palmbeachschools.org/dre/documents/Predicting_Graduation_and_Dropout.pdf
Oklahoma Town Uses Workforce Analytics to Lure Manufacturer to Region
The problem:
As the U.S. skilled labor supply continues to tighten, economic development groups struggle to demonstrate the availability of skilled workers to investors (site selectors and businesses). The Belgian high-tech materials company Umicore decided to expand its germanium wafer production to the U.S. The initial search parameters narrowed the location to three cities – Phoenix, Albuquerque and Quapaw in Oklahoma. Quapaw is a small town of 966 residents. Even though Umicore's Opticals division had an existing plant in Quapaw and expanding there would be cost efficient, the company couldn't be sure of the availability of the requisite workforce.
When the site selector approached Judee Snodderly, executive director of the Miami Area Economic Development Service, Oklahoma's public data didn't have sufficient detail or flexibility to answer the questions. To complicate the problem, Miami's regional labor force is shared by four states (Oklahoma, Kansas, Missouri and Arkansas), and integrating their data to get an accurate and reliable picture was difficult.
The solution:
 Data collection:
o EMSI (Economic Modeling Specialists International) captures labor market data and stores it within "Analyst", a web-based market research, visualization & analytics tool. EMSI's broad array of products, from Analyst (labor market) to Career Coach (career vision) and Economic Impact Studies, brings tremendous value to analysts and decision makers.
o Analyst taps into a composite of more than 90 federal, state, and private data sources, refreshed quarterly and available at many granularities – county, ZIP, MSA, or multi-state region. Internationally, it also has data for the UK and Canada.
 Aggregate Analysis: The Analyst tool has dashboard capabilities with multiple visualization options such as maps, charts and tables.
o Gary Box, the business retention coordinator at the Workforce Investment Board of Southwest Missouri, had access to EMSI's Analyst tool, which provided him with industry and occupation reports highlighting the availability of high-tech manufacturing skills. It also enabled him to emphasize the availability of "compatible" talent for Umicore in the region.
Tool Used: EMSI’s web-based labor market tool, Analyst
The Results:
 In late June 2008, Umicore chose Quapaw as the new location for its germanium wafer
production site, resulting in an investment of $51 million into the region and 165 new jobs with
an average salary of $51,000 a year not including benefits.
 Construction began in 2008 and continued through late 2009. EMSI estimates that the total
impact on the economy of Ottawa County during the construction phase alone was more than
160 jobs and nearly $9 million in earnings annually.
 Once the plant began operating, that impact rose to more than 250 jobs and more than $12
million in earnings annually.
References:
http://thehiringsite.careerbuilder.com/2013/03/04/workforce-data-case-study/
http://www.economicmodeling.com/wp-content/uploads/Analyst_onepage2013_v1b.pdf
http://www.economicmodeling.com/analyst/
Optimization of “Procure-to-Pay” process - Strategic Customer benefit initiative by VISA
The problem:
A $2 billion U.S. Construction company was exploring options to maximize its card program by analyzing
its processing costs and efficiency across its entire Procure-to-Pay process. Its Visa Issuer (Issuing bank)
introduced it to Visa's Procure-to-Pay and Visa PerformSource, a consultative service aimed at
maximizing value from the commercial card program and Procure-to-Pay process.
The solution:
Through a new program, Visa and its issuing banks (Issuers) helped the US construction company identify opportunities to improve Procure-to-Pay operations and increase savings through its card programs. The Optimization Review utilized analytical tools (the Procure-to-Pay Performance Gauge & Accounts Payable tool) designed to identify, benchmark, and improve Procure-to-Pay practices. These tools helped define a plan and a financial impact estimate for the expansion of the Visa Commercial card programs.
 Data collection: Procurement and card operations data
 Aggregate Analysis:
o Procure-to-Pay Performance Gauge: This tool is designed to assist a company in understanding how to improve its current Procure-to-Pay processes and technology. A customized diagnostic report was developed, comparing the company against best-practice companies of the same revenue size.
o The Accounts Payable Analysis Tool: Helped the company analyze spend patterns and
develop both strategic and tactical commercial card program implementation or
expansion plans organized by commodity, business unit and supplier. Additionally, the
built-in ROI calculator estimated the financial benefits they could realize through the
card program. The tool also allows companies to set program goals over a three-year
time frame for the Visa Commercial card program expansion.
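A minimal sketch of the kind of arithmetic such an ROI calculator performs; every figure below is hypothetical, not the construction company's data or Visa's actual methodology:

# A minimal, hypothetical card-program ROI calculation.
annual_card_spend = 30_000_000          # spend migrated to the commercial card
rebate_rate = 0.01                      # assumed issuer rebate on card spend
cost_per_check_invoice = 8.0            # assumed cost to process a paper invoice/check
cost_per_card_invoice = 2.5             # assumed cost once paid by card
invoices_migrated = 40_000
annual_program_cost = 150_000           # assumed fees, administration, controls

rebates = annual_card_spend * rebate_rate
process_savings = invoices_migrated * (cost_per_check_invoice - cost_per_card_invoice)
net_benefit = rebates + process_savings - annual_program_cost
roi = net_benefit / annual_program_cost

print(f"net annual benefit: ${net_benefit:,.0f}  (ROI {roi:.1f}x)")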
Tool Used: VISA Performsource service – toolkit (Procure-to-Pay Performance Gauge & Accounts
Payable tool)
The Results:
 The company scored an overall Advanced rating (51-75). It had a good foundation – a clear Procure-to-Pay strategy, ROI analysis of all initiatives, automated processes & inclusion of payment method in preferred vendor contracts.
 It also implemented a best practice commercial card program (distribution of approved vendor
list, accounts payable interception of invoices & distribution of cards to super users)
 However, there was still scope for further cost reductions, greater control and process efficiencies – ongoing vendor list management, communication/audit/reporting of non-compliance and regular reports to senior management. VISA projected net process efficiency savings of $0.6 MM over 3 years from the program.
References:
http://usa.visa.com/corporate/corporate_solutions/perform_source/case_study_construction.html
http://usa.visa.com/corporate/corporate_solutions/perform_source/index.html
The US Women's Cycling Team's Big Data Story of the London 2012 Olympics
The problem:
The last time the US women's team had won an Olympic track medal was two decades earlier. It was outfunded by the British, who spent $46 MM, and outstaffed by 10 to 1. They entered the competition 5 seconds away from even being considered for a medal. Closing this gap was considered an almost impossible task by many.
The solution:
Sky Christopherson, himself a recent world record holder and a big user of Quantified Self data, was an instrumental force in helping the team do the impossible.
 Data collection: Quantified Self data from sensors, cameras and iPads covering environment, sleep patterns, genetics, blood glucose tracking & just about everything that is important to the cyclists. The data was collected every second, 24 hours a day, 7 days a week.
 Reporting: Visualization of key metrics on charts, tables, etc.
 Aggregate Analysis: Profiling and drill-down of various key metrics by other levers.
 Correlation Analysis: Among various drivers – lifestyle patterns and routines versus performance.
 Trend Analysis: Impact of changes on the performance over time.
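A minimal sketch of the correlation analysis described above, relating tracked drivers to lap times; all numbers are fabricated for illustration:

# A minimal correlation sketch on fabricated daily tracking data.
import pandas as pd

days = pd.DataFrame({
    "sleep_hours":         [7.5, 6.0, 8.2, 7.9, 5.8, 8.5, 7.1],
    "room_temp_c":         [18,  22,  17,  18,  23,  16,  19],
    "glucose_variability": [12,  20,  10,  11,  22,   9,  14],
    "lap_time_sec":        [32.1, 33.0, 31.8, 31.9, 33.4, 31.6, 32.3],
})

# Which tracked drivers move together with lap time? (negative = faster laps)
print(days.corr()["lap_time_sec"].sort_values())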
Tool Used: Datameer
The Results:
 Race strategies, health & recovery routines, changes to day-to-day lifestyle patterns and habits – in short, all data-driven actions to improve the cyclists' performance.
 The US team beat Australia, the overwhelming favorites, in the semi-final round by 0.08 seconds and went on to win the silver medal.
References:
http://www.datameer.com/learn/videos/us-womens-olympic-cycling-team-big-data-story.html
http://fora.tv/2013/06/27/The_US_Womens_Cycling_Teams_Big_Data_Story
http://en.wikipedia.org/wiki/Quantified_Self
Helping keep Manitoba Food Chain safe by tracking disease outbreaks
The problem:
The Manitoba Agriculture, Food & Rural Initiatives (MAFRI) ministry, under Chief Veterinary Officer Dr. Wayne Lees, is responsible for safeguarding the agri-food chain in Manitoba, Canada and beyond. It either actively manages animal disease outbreaks or strategizes on how best to prevent or effectively control them the next time. The electronic tracking system, built on the livestock premises identification system, tracks the movement of livestock across the food chain, ensuring the ability to pinpoint risks from animal-to-animal exposure. It also enables effective inter-agency collaboration for a rapid and effective response in the event of outbreaks.
Manitoba is the largest pork-exporting province in Canada, and almost two thirds of its hog production is exported to the US for finishing. Any unmanaged outbreak could be financially catastrophic. Information on an outbreak can come from multiple sources – people, rumour, diagnostic labs and veterinary practitioners. Within months of the launch of the tracking system, there was an outbreak of transmissible gastroenteritis (TGE) in pigs within a cluster of three farms.
The solution:
 Data collection: Premises ID database collects all information on the livestock.
 Aggregate Analysis: Visualization of disease trackers on maps (origin location, proximity & size
of outbreak and at risk, livestock movements)
 Sizing & Estimation: Calculation of optimal buffer zone, what-if scenario planning of spread
patterns and response strategies.
 Correlation Analysis: Identification of potentially exposed herds, animals that had come into
and out of the affected farms.
 Predictive Analysis: Modeling of outbreak spreads based on farm locations, animal densities
and other factors.
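A minimal sketch of the buffer-zone idea: flag premises within a chosen radius of an affected farm using great-circle distance. The coordinates and the 10 km radius below are invented for illustration:

# A minimal buffer-zone check using the haversine great-circle distance.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

affected_farm = (49.90, -97.10)          # hypothetical outbreak location
premises = {                             # hypothetical registered premises
    "farm_A": (49.95, -97.05),
    "farm_B": (50.40, -96.50),
    "farm_C": (49.88, -97.15),
}

buffer_km = 10
at_risk = {name: round(haversine_km(*affected_farm, *coords), 1)
           for name, coords in premises.items()
           if haversine_km(*affected_farm, *coords) <= buffer_km}
print("premises inside buffer zone (km from outbreak):", at_risk)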
Tool Used: IBM Maximo Asset Management & IBM Global Business Services –Industry Consulting
The Results:
 By identifying, quantifying and analyzing risk factors at the time of detection, MAFRI reduced
the average disease control cycle by 80% - in this case weeks, with no additional farms affected.
 Almost halved the “downstream” containment costs, such as manpower and transport.
 Able to do a more targeted and accurate application of epidemic responses such as quarantines.
 Reduced the risk of export restrictions and cull-related losses resulting from animal disease
epidemics, which represents millions of dollars in direct losses to the Manitoba livestock
industry and local economies.
 Able to execute a more efficient deployment of animal disease control specialists in the field
during outbreaks.
References:
http://www.ibm.com/smarterplanet/us/en/leadership/mafri/assets/pdf/MAFRI_Paper.pdf
http://www.ibm.com/smarterplanet/us/en/leadership/mafri/
Chapter Summary
The intent of this chapter was to illustrate the application of data analysis principles, from visual analytics to advanced analytics, in various walks of our lives. Our hope is that the above stories have struck a chord with your imagination, and that the next time you observe the life around you, you can relate to it at an "analytical" level. Given the rapid pace of digitization of our lives, data generation is only going to grow, and the analysis of such data is only going to make our lives better. Do not be surprised tomorrow if your self-driving, self-thinking car recommends a great exotic restaurant after analyzing your food-channel viewing patterns.

More Related Content

PPTX
Predictive Policing
PPTX
PredPol: How Predictive Policing Works
DOCX
Individual project 2.20
PPTX
Predictive Policing on Gun Violence Using Open Data
PPTX
Predictive policing computational thinking show and tell
PPSX
Metodologia para el analisis de redes sociales
PDF
Crime Analysis based on Historical and Transportation Data
PDF
UNGP_ProjectSeries_Mobile_Data_Privacy_2015 (1)
Predictive Policing
PredPol: How Predictive Policing Works
Individual project 2.20
Predictive Policing on Gun Violence Using Open Data
Predictive policing computational thinking show and tell
Metodologia para el analisis de redes sociales
Crime Analysis based on Historical and Transportation Data
UNGP_ProjectSeries_Mobile_Data_Privacy_2015 (1)

What's hot (13)

PPTX
Team Spyglass Final Presentation
PPTX
Chicago Crime Dataset Project Proposal
PDF
Forrester Research: Challenge Thinking. Lead Change. #LSS2016
PDF
87692_GV499
PPT
Crime mapping conference 2009
PDF
Merseyside Crime Analysis
PPTX
7 misconceptions about predictive policing webinar
PDF
Matching Mobile Applications for Cross Promotion
PPTX
Foresight Analytics
PPT
Evolving social data mining and affective analysis
PDF
Google trends correlate
PDF
Can tweets help predict a stock's price movements?
Team Spyglass Final Presentation
Chicago Crime Dataset Project Proposal
Forrester Research: Challenge Thinking. Lead Change. #LSS2016
87692_GV499
Crime mapping conference 2009
Merseyside Crime Analysis
7 misconceptions about predictive policing webinar
Matching Mobile Applications for Cross Promotion
Foresight Analytics
Evolving social data mining and affective analysis
Google trends correlate
Can tweets help predict a stock's price movements?
Ad

Similar to Analytics anecdotes (20)

PDF
Inside the cave
PDF
Political Campaigns & Predictive Analytics- Changing how to campaign
PDF
TARGET CRM
PPTX
The New Polling Mix: Increasing Accuracy With Online Surveys
PDF
Voter id targeting and election integrity
PDF
Inside the-cave
PDF
Campaign Sciences Analytics White Paper
PPTX
[DSC Europe 23] Alen Kisic - How can do Facebook data and machine learning al...
PDF
Mediawave, social media monitoring & data analytics
PDF
Who are you trying to reach and how? Building and using the modern public sec...
DOCX
1CONTEXTUAL THINKING ABOUT DIFFERENT SCENARIOS Scenario A L.docx
DOCX
Glossary
PPTX
Mobilising your Social Media Journey
 
PPTX
Future of market research
PDF
Social Media Tracing D2P2 Analysis-Austere Systems Ltd
PDF
Making data actionable: A look at the power of people-driven data in activati...
PDF
data, big data, open data
PPTX
Orchestrating Collective Intelligence
DOCX
1) Values in Computational Models RevaluedComputational mode.docx
PDF
A Review of machine learning approaches to mine Social Choice of voters.
Inside the cave
Political Campaigns & Predictive Analytics- Changing how to campaign
TARGET CRM
The New Polling Mix: Increasing Accuracy With Online Surveys
Voter id targeting and election integrity
Inside the-cave
Campaign Sciences Analytics White Paper
[DSC Europe 23] Alen Kisic - How can do Facebook data and machine learning al...
Mediawave, social media monitoring & data analytics
Who are you trying to reach and how? Building and using the modern public sec...
1CONTEXTUAL THINKING ABOUT DIFFERENT SCENARIOS Scenario A L.docx
Glossary
Mobilising your Social Media Journey
 
Future of market research
Social Media Tracing D2P2 Analysis-Austere Systems Ltd
Making data actionable: A look at the power of people-driven data in activati...
data, big data, open data
Orchestrating Collective Intelligence
1) Values in Computational Models RevaluedComputational mode.docx
A Review of machine learning approaches to mine Social Choice of voters.
Ad

More from Ramkumar Ravichandran (20)

PPTX
Risk Product Management - Creating Safe Digital Experiences, Product School 2019
PPTX
Improving AI products with Analytics
PDF
Advancing the analytics maturity curve at your organization
PDF
Advancing Testing Program Maturity in your organization
PDF
Leadership, analytics & you
PPTX
Augment the actionability of Analytics with the “Voice of Customer”
PPTX
Predictive Analytics as a Product
PPTX
Prepping the Analytics organization for Artificial Intelligence evolution
PPTX
Power of Small Data
PPTX
Optimizing Marketing Decisions
PPTX
Building & nurturing an Analytics Team
PPTX
Analytics as an enabler of Company Culture
PPTX
Digital summit Dallas 2015 - Research brings back the 'human' aspect to insights
PPTX
Social media analytics - a delicious treat, but only when handled like a mast...
PPTX
Optimizing product decisions
PPTX
Moving beyond numbers
PPTX
Taming the Data Lake with Scalable Metrics Model Framework
PPTX
Actionability of insights
PPTX
A/B Testing Best Practices - Do's and Don'ts
PPTX
Transform your Analytics Practice into Insights Practice
Risk Product Management - Creating Safe Digital Experiences, Product School 2019
Improving AI products with Analytics
Advancing the analytics maturity curve at your organization
Advancing Testing Program Maturity in your organization
Leadership, analytics & you
Augment the actionability of Analytics with the “Voice of Customer”
Predictive Analytics as a Product
Prepping the Analytics organization for Artificial Intelligence evolution
Power of Small Data
Optimizing Marketing Decisions
Building & nurturing an Analytics Team
Analytics as an enabler of Company Culture
Digital summit Dallas 2015 - Research brings back the 'human' aspect to insights
Social media analytics - a delicious treat, but only when handled like a mast...
Optimizing product decisions
Moving beyond numbers
Taming the Data Lake with Scalable Metrics Model Framework
Actionability of insights
A/B Testing Best Practices - Do's and Don'ts
Transform your Analytics Practice into Insights Practice

Recently uploaded (20)

PPTX
Introduction to Fundamentals of Data Security
PPT
Chinku Sharma Internship in the summer internship project
PPTX
SET 1 Compulsory MNH machine learning intro
PPT
statistics analysis - topic 3 - describing data visually
PPTX
OJT-Narrative-Presentation-Entrep-group.pptx_20250808_102837_0000.pptx
PDF
Tetra Pak Index 2023 - The future of health and nutrition - Full report.pdf
PPT
DU, AIS, Big Data and Data Analytics.ppt
PPTX
Hushh.ai: Your Personal Data, Your Business
PDF
Navigating the Thai Supplements Landscape.pdf
PPT
dsa Lec-1 Introduction FOR THE STUDENTS OF bscs
PPTX
cp-and-safeguarding-training-2018-2019-mmfv2-230818062456-767bc1a7.pptx
PPTX
machinelearningoverview-250809184828-927201d2.pptx
PPT
PROJECT CYCLE MANAGEMENT FRAMEWORK (PCM).ppt
PPTX
The Data Security Envisioning Workshop provides a summary of an organization...
PPT
statistic analysis for study - data collection
PPTX
FMIS 108 and AISlaudon_mis17_ppt_ch11.pptx
PPTX
Lesson-01intheselfoflifeofthekennyrogersoftheunderstandoftheunderstanded
PDF
Session 11 - Data Visualization Storytelling (2).pdf
PPTX
ch20 Database System Architecture by Rizvee
PDF
technical specifications solar ear 2025.
Introduction to Fundamentals of Data Security
Chinku Sharma Internship in the summer internship project
SET 1 Compulsory MNH machine learning intro
statistics analysis - topic 3 - describing data visually
OJT-Narrative-Presentation-Entrep-group.pptx_20250808_102837_0000.pptx
Tetra Pak Index 2023 - The future of health and nutrition - Full report.pdf
DU, AIS, Big Data and Data Analytics.ppt
Hushh.ai: Your Personal Data, Your Business
Navigating the Thai Supplements Landscape.pdf
dsa Lec-1 Introduction FOR THE STUDENTS OF bscs
cp-and-safeguarding-training-2018-2019-mmfv2-230818062456-767bc1a7.pptx
machinelearningoverview-250809184828-927201d2.pptx
PROJECT CYCLE MANAGEMENT FRAMEWORK (PCM).ppt
The Data Security Envisioning Workshop provides a summary of an organization...
statistic analysis for study - data collection
FMIS 108 and AISlaudon_mis17_ppt_ch11.pptx
Lesson-01intheselfoflifeofthekennyrogersoftheunderstandoftheunderstanded
Session 11 - Data Visualization Storytelling (2).pdf
ch20 Database System Architecture by Rizvee
technical specifications solar ear 2025.

Analytics anecdotes

  • 1. Interesting Applications of Analytics In shortest terms, Analytics is “answering questions with data”. With the progress in technology of data capturing and efficient access and analysis, analytics has become mainstream. Gone are the days when it was an esoteric science which required scientists to find anything from millions of rows and columsn of number. Today everyone can do “analytics” from simple aggregated analytics to advanced predictive modeling with some training and orientation. Like the quip of a wise analyst – bring me the right question and I will give you the right answer through data. Analytics today is leveraged across functions, domains, industries and day-to-day activities from a smart-energy meter at home to predictive presidential election results. Here we bring you some stories where analytics was leveraged in a non-traditional way in non- traditional industries to achieve success. Obama’s re-election campaign The problem: Given the losses during the first term of the presidency, re-election was no longer firmly under the belt. Obama team had a simple fundamental principle – get everyone who voted the first time to vote again and also attempt to bring in new voters – from new growing demographics or by swaying the undecideds into their fold through targeted messaging. To them it became a task of reorganizing the coalition of supporters one-by-one, through personal touchpoints. The solution: Obama For America (OFA) 2012 will be quoted in history for leveraging best practices in Analytics, Survey, Testing, Visualization & Reporting; and how it molded them into a coherent “Data-Driven- Strategic Framework”. “The Cave” in OFA’s Chicago headquarters housed the campaign’s Analytics team. Behind closed doors, more than 50 data analysts used Big Data to predict the individual behavior of tens of millions of American voters.  Data Collection: (Dan Wagner) o Survey Manager: Collection of series of surveys on voters’ attitudes and preferences, which fed into software called ‘Survey Manager’ as tables. Surveys could be short-term and long term interviews with voters. At one time, the campaign completed 8,000 to 9,000 such calls per night. o Constituent Relationship Management system: Captured everything about the interaction with the voter, volunteer, donor and website user. Vertica software from Hewlett-Packard allowed combined access the party’s 180-million-person voter file and all the other data systems.
  • 2. o The analytics staff also routinely aggregated all varied data sources-- Benenson's aggregate battleground survey, the state tracking polls, the analytical calls and even public polling data and feeding to predictive models  Micro-targeting models: (Predictive Analysis) o The electorate could be seen as a collection of individual citizens who could each be measured and assessed on their own terms - casting a ballot and supporting Obama. These models were adjusted weekly based on new data. Applying these models identified which non-registrants were most likely to be Democrats and which ones Republicans. It informed subsequent “Get-out-the-vote” and “persuasion” campaigns. o Models estimated support for Obama and Romney in each state and media market. They controlled for the "house effects" of each pollster or data collection method, and each nightly run of the model involved approximately 66,000 "Monte Carlo" simulations, which allowed the campaign to calculate its chances of winning each state.  Experiment-informed programs (EIPs): Designed by Analyst Institute, these are A/B Tests (or Correlation Analysis) used for various purposes o Testing ‘Resonating’ Messages: measure effectiveness of different types of messages at moving public opinion. Experimenters would randomly assign voters to receive varied sequences of direct mail—four pieces on the same policy theme, each making a slightly different case for Obama—and then use ongoing survey calls to isolate the attributes of those whose opinions changed as a result, e.g., age group 45-65 responded better to Medicare messages compared to 65+ already in a program. o Fundraising: By testing different variations of fundraising e-mails to find the ones with best response rate. In one of the campaigns, they tested 18 variations on subject line, email copy and number of mails sent. When they rolled out the winning variation “I will be outspent” to broader email base message, it raised $2.6+ million on June 26th , 2012. o User Interface Optimization: The campaign conducted 240 A/B tests on their donation page. This resulted in a 49% increase in their conversion rate. By making the platform 60% faster, they saw a 14% increase in donations. In June 2012, the campaign switched to the 4 step donation process and saw a 5% increase in conversions (donations).  Campaign Response Analytics: (D’Agostino) (Predictive Analysis) o Customized Donation Requests: Instead of soliciting for a fixed amount like $25, the campaign tested on different percentages of donors' highest previous donation amounts and found that all versions of those requests did better than set amount. o Targeted Communications: Models based on recipients' past responses to past e-mail campaigns helped organizers better target their specific communications, e.g., certain recipients more open to volunteering than to donate online. This was facilitated by a new system called, Narwhal, with above ‘analytics’ algorithms.  Optimizer (Davidsen): (Aggregate Analysis) Product that helped with behaviorally targeted TV buys. It co-ordinates model predictions from above and user TV viewing behavior and comes up with quarter-hour segment of day with the greatest number of persuadable targets per dollar across 60 channels. It was developed in coordination with a company called Rentrak. Campaign
  • 3. estimated that this made the TV buy as a whole 10-20% more efficient. That’s the equivalent of $40 million and $80 million in added media.  Social Analytics: (Aggregate Analysis) OFA scored 50,000 Twitter accounts by political affiliation. They used Twitter influence (looking at number of tweets & followers) to target direct messages asking people to get involved.  Communication Analytics (Matthew Rattigan): (Aggregate Analysis) OFA had a tool to look at the coverage of speeches in local newspapers and understand people’s reaction across geographic regions and which parts were quoted most. Speechwriters were therefore able to see how the messages they wanted to convey were actually the ones that were covered.  Dashboard was the campaign's grassroots organizing platform that mapped directly to how the campaign was structured in the field. It provided unified view of the team, activity (calls, messages), voting info, fund raising, etc. A mobile app allowed a canvasser to download and return walk sheets without ever entering a campaign office Tool Used: R was used for Analytics projects throughout the campaign. D3 for data visualization. The Results:  Some of the congressional predictions from the Models were in +/-2.5% range, e.g., final predictive margin, for a 2009 special election for an open congressional seat in upstate New York, was 150 votes well before Election Day.  The final simulations were accurate to within 0.2% in Ohio and 0.4% in Florida, but were 1% too cautious in Colorado.  OFA’s final projection was for a 51-48 battleground-state margin for the president, which is approximately where the race ended up “Like OFA summarized, Data Analytics made a national presidential campaign run the way of a local ward campaign” References: http://www.technologyreview.com/featuredstory/509026/how-obamas-team-used-big-data-to-rally- voters/ http://engagedc.com/download/Inside%20the%20Cave.pdf http://techpresident.com/news/23214/how-analytics-made-obamas-campaign-communications-more- efficient http://www.businessweek.com/articles/2012-11-29/the-science-behind-those-obama-campaign-e-mails Operation Blue Crush (Crime Reduction Using Statistical History) 2005 The problem: Blue Crush began as a brainchild of Memphis Professor Richard Janikowski who met with then Memphis Police Department (MPD), Police director, Larry Godwin to talk about new ways to reduce crime. With
  • 4. MPD opening to both new strategy and Sharing data, was born “Operation Blue Crush” as a predictive analytics based Crime fighting effort in one of the most crime-ridden cities in America. University of Memphis actively co-ordinates with MPD on this program. The solution: The underlying philosophy is to pre-emptively identify places to dedicate police resources to prevent and/or reduce crimes.  Data collection/monitoring: The Memphis Police Department gathers data on every crime reported in the city and then tracks and maps all crimes over time. When patterns of criminal activity emerge from the data, officers are assigned to "details" and sent to those areas that data show are being hardest hit. Hand held devices help the Memphis Police Department (MPD) file reports on the spot -- making them available to detectives within minutes -- and check for local and national outstanding warrants instantly.  Aggregate Analysis: Past criminal event statistics are used to create maps with crosstabs to create "focus areas".  Trend & Correlation Analysis: Long/short term Trend in Crimes, by various drivers like time of day, day of week, etc. Above analysis help police proactively deployed resources -- from organized crime and special ops units to the mounted patrol, K-9, traffic and DUI enforcement. Tool Used: IBM SPSS is a partner. The Results:  In its first 7 years, violent crime was down 23%. Burglaries went down five times the national average. Area most impacted by Blue Crush was apartments around the city where violent crime was cut by more than 35%.  IBM is also circulating a June case study that says Memphis made an 863 percent return on its investment, calculated using the percentage decline in crime and the number and cost of additional cops that would be needed to match the declining rate.  The study by Nucleus Research said Memphis has paid on average $395,249 a year on the initiative, including personnel costs, for a $7.2 million return. (contradicting with IBM number?)  Blue Crush has become a department-wide philosophy – facilitating effective deployment of resources and higher level of accountability and responsibility from all officers of MPD.  Blue Crush has now evolved into a more intense and targeted community policing strategy at chronic crime hotspots. References: http://www.commercialappeal.com/news/2010/sep/19/blue-crush-gives-ibm-a-boost/ http://www.memphispolice.org/BLUE%20Crush.htm http://www.commercialappeal.com/news/2013/jan/27/blue-crush-controversy/ http://wreg.com/2013/05/01/the-brain-behind-operation-blue-crush-retires/
  • 5. Global Warming Prediction Report The problem: Global Warming is a global problem with global ramifications. IPCC is the leader in this domain and had come up with a model based on CO2 concentrations way back in 2007. With an aim, of modeling and predicting the global temperature anomalies, through “Self-Organizing Knowledge Extraction” using public data, Insights (formerly KnowledgeMiner) a research, consulting and software development company in the field of high-end predictive modeling initiated this project. Insights presented a “6-year monthly global mean temperature predictions” in Sep, 2011 which was then discussed in Climate Etc. in Oct, 2011. The solution: The philosophy of this model is based on letting the data tell the story – don’t start with hypotheses to tests, since there is a lot we human don’t know and can’t use to predict. One technique based on this philosophy, “Self Organizing Modeling” technique works on Adaptive Networks, where Self-organization of predictive variables happens to give a mathematical equation of optimal complexity and reliable predictive accuracy.  Data collection: Data comes from public sources. Data inputs include sun, ozone, cloud, aerosols and CO2 concentrations.  Predictive Analysis: The self-organized model builds a dynamic system model - a system of nonlinear difference equations. This system model was obtained from monthly observation data of the past 33 years. The model when built proved interdependencies in the system, e.g. ozone affects other variables, and then these interdependencies then merge together in a fashion that predicts global temperatures. Tool Used: KnowledgeMiner The Results:  The model shows that Sun, Ozone, Aersosol and cloud are primary drivers of global warming. It also acknowledges that there could be outside forces that haven’t been accounted here.  The model shows an accuracy of 75% given the noise and uncertainty in the observation data. It has also been tested between Apr and Dec, 2012. KnowledgeMiner regularly updates the performance and predictions.  Model also predicts that level of global mean temperatures is going to stabilize near current levels, although there may be regional variations.  The above model could have ramifications on the debates/discussions/controversies on the current strategies to fight global warming, esp role of green-house gases and the combat. References: http://www.climateprediction.eu/cc/Main/Entries/2011/9/13_What_Drives_Global_Warming.html
  • 6. http://www.knowledgeminer.eu/about.html http://www.climateprediction.eu/cc/About.html Predictive Analytics at Delegat’s Wine Estates, a listed New Zealander Wine Company The problem: Delegat’s is the New Zealand’s largest listed Wine company - in 2012 Delegat’s alone sold nearly two million cases of wine worldwide. The entire winemaking process is managed in-house by 350 staff globally, from growing the grapes to producing, distributing and selling the wine with direct sales teams in each country. Winemaking business is demand/supply sensitive- a change in one area can impact the ability to serve customers in another. It’s also time sensitive - highly specific growing and harvest seasons give winemakers only a brief window to find and fix supply problems at the vineyards. Predictive Analytics imparts a unique advantage of being prepared for such fluctuations. The solution: Together with IBM Business Partner Cortell NZ, Delegat’s deployed an integrated planning and reporting tool  Data collection: Internal (Production, Supply, Demand, Product, Sales, Viticulture inputs) and Market data.  Reporting: Integrated standard planning & monitoring suite to keep track of all aspects of business using KPIs.  Aggregate Analysis: Supply/Demand profiling (what product for which region) & elasticity studies (changes and response strategies) of markets and consumers.  Predictive Analysis: Net profitability modeling based on yield, production, supply and demand. System modeling on how one component affects others in the chain.  Trend & Correlation Analysis: Short and Long term changes in the company and the industry.  Sizing/Estimation: What-if scenarios on profitability, brand and other KPIs. Tool Used: IBM Cognos TM1 The Results:  Time to produce reports reduced by 90 percent and shortened its planning cycles to six weeks.  Efficient day-to-day and strategic business decisions - e.g., o Decision on acquisition of Oyster Bay Marlborough’s remaining shares based on insights from “what-if” scenario models apart from business analysis. o Expansion from owning ten vineyards to 15 based on insights from scenario modeling. References:
References:
http://asmarterplanet.com/blog/2012/10/new-zealand-vintner-taps-predictive-analytics-for-global-wine-markets.html
http://www.huffingtonpost.com/paul-chang/ibm-analyze-this-how-tech_b_1967131.html

Predictive Energy Analytics to reduce Operational Costs

The problem:
Energy cost mitigation at a California wastewater facility was extremely difficult due to many factors outside the control of plant personnel, including the dynamics of wastewater flow, energy sources and the requirements of integrating effluent from multiple municipal agency treatment facilities provided at various levels of water treatment. Dissimilar data collection platforms and “raw data only” reporting compounded the issue.

The solution:
Mike Murray and TR Bietsch from the HelioPower group of companies implemented a data analytics framework to overcome the above challenges. The primary goal of the project was to provide operators with real-time, on-demand energy analytics.
 Understanding of requirements:
o Requirement sessions: HelioPower (creator of PredictEnergy) conducted an energy audit to understand cost, energy utilization patterns and other business requirements.
o KPI: The primary philosophy was to increase output per energy cost by combining energy data with financial information. Consumption, demand and cost baselines were established, KPIs were quantified and targets were set (a rough sketch of this KPI arithmetic follows below).
o Energy sources (utility, co-gen and solar) were paired against uses (facility processes).
 Data collection: Data from energy sources, production information and the utility tariff cost structure. PredictEnergy, the tool used for the analysis, combined current SCADA (Supervisory Control And Data Acquisition) and metering systems with historical, current and predictive energy data from the utility and distributed (in-house co-gen and solar) energy sources, installed at key points such as the main power meter and load centers like the co-gen plant.
 Reporting: Dashboard to monitor and analyze KPIs across various slices and dices.
 Aggregate Analysis: Profiled real-time energy costs for pumping and processing, optimized co-gen energy cost offsets and quantified the cost avoidance provided by solar.
 Trend Analysis: Short- and long-term changes in KPIs.
 Predictive Analysis: Patent-pending algorithms relating actual energy consumption and demand to utility billings and baselines, with iterative best-outcome predictions and continuous feedback error correction.
 Sizing/Estimation: What-if scenarios on KPIs and iterations for best performance.
Tool Used: PredictEnergy from HelioEnergySolutions
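Here is a rough sketch of the “output per energy cost” KPI and baseline comparison described above, with invented numbers; PredictEnergy’s actual calculations and data model are not reproduced here.

```python
# Toy output-per-energy-cost KPI against a baseline, with made-up figures.
from dataclasses import dataclass

@dataclass
class Period:
    gallons_treated: float   # facility output for the period
    kwh_utility: float       # energy drawn from the utility
    kwh_cogen: float         # energy from on-site co-generation
    kwh_solar: float         # energy from on-site solar
    utility_rate: float      # blended $/kWh under the tariff
    cogen_cost_per_kwh: float

    def energy_cost(self) -> float:
        # Solar is treated as zero marginal cost in this toy example.
        return self.kwh_utility * self.utility_rate + self.kwh_cogen * self.cogen_cost_per_kwh

    def output_per_dollar(self) -> float:
        return self.gallons_treated / self.energy_cost()

baseline = Period(9.0e6, 410_000, 120_000, 30_000, 0.17, 0.09)
current = Period(9.2e6, 350_000, 150_000, 60_000, 0.17, 0.09)

improvement = current.output_per_dollar() / baseline.output_per_dollar() - 1
print(f"baseline: {baseline.output_per_dollar():,.0f} gal per $")
print(f"current:  {current.output_per_dollar():,.0f} gal per $  ({improvement:+.1%} vs baseline)")
```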
The Results:
 The analyses showed operators how to shift process loads and energy source usage to minimize operational expenses (15+% via man-hour reduction, process load reduction and cross-billing error reduction) and energy costs (3–5%). The findings were shared with four other facilities as best practices to reduce their own costs.
References:
http://www.heliopower.com/wp-content/uploads/2013/01/Predictive-Energy-Analytics-to-Reduce-Operational-Costs.pdf
http://www.heliopower.com/wp-content/uploads/2013/01/PredictEnergy-Product-Overview.pdf
http://heliopower.com/wp-content/uploads/2013/03/PredictEnergy-Implementation-Phases.pdf

Predicting High School Graduation and Dropout

The problem:
Every education board needs to understand the drivers of student graduation and dropout so that it can tailor its programs to address the needs of students at risk, increase gain scores and reform schools. Raj Subedi, a researcher with the Department of Educational Psychology and Learning Systems, Florida Department of Education, submitted a dissertation on educational research that could inform the Department’s efforts to address this problem.

The solution:
Predictive models were built to understand the effect of student-, teacher- and school-level variables on the graduation outcome (yes or no) of the student.
 Data collection: This study used 6,184 students and 253 mathematics teachers from all middle schools in the Orange County Public Schools (OCPS), the tenth largest school district out of 14,000 in the USA.
o Outcome variable: Grades 6–8 mathematics year-over-year (’05 vs. ’04) gain scores on the NRT-NCE (Norm Referenced Test – Normal Curve Equivalent) portion of the Florida Comprehensive Assessment Test, a state-mandated standardized test of student achievement of the benchmarks in reading, mathematics, science and social studies in Florida schools.
o Student-level predictors: Pretest scores and socio-economic status (participation in the free and reduced lunch program).
o Teacher-level predictors: Mathematics content-area certification, advanced mathematics or mathematics education degree, and experience.
o School-level predictors: School poverty, defined as the percent of free and reduced lunch students in each school, and teachers’ school mean experience, defined as the average number of years taught by middle school teachers in a given school.
 Predictive Analysis: A three-level Hierarchical Linear Model (HLM) with a Value Added Model (VAM) approach was used to investigate student-, teacher- and school-level predictors. The HLM method checks for interactions between predictors at the various levels (student, teacher and school). Value Added Modeling estimates a teacher’s contribution in a given year by comparing students’ current test scores with their scores in the previous year, and with the scores of other students in the same year (a simplified multilevel sketch follows this case’s references).
Tool Used: SAS
The Results:
 Findings suggested that already high-performing students can be expected to score better.
 Students from poorer economic backgrounds performed poorly on their own; however, they improved under the guidance of content-certified teachers.
 Such students did not, however, show performance improvement when paired with teachers possessing longer work experience.
 The findings also indicated that content certification and teaching experience – some of the accountability requirements under the “No Child Left Behind” Act – had a positive impact. In other words, teacher competency had a substantial impact on student performance.
References:
http://www.hindawi.com/journals/edu/2011/532737/
http://diginole.lib.fsu.edu/cgi/viewcontent.cgi?article=4896&context=etd
http://www.palmbeachschools.org/dre/documents/Predicting_Graduation_and_Dropout.pdf
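To illustrate the multilevel structure described above, here is a simplified, two-level stand-in (students nested within teachers, with a random intercept per teacher). The data and variable names are synthetic; the actual study used a three-level HLM with school effects and was run in SAS, not statsmodels.

```python
# Simplified two-level mixed model on synthetic data; not the study's model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n_teachers, students_per_teacher = 50, 25

rows = []
for t in range(n_teachers):
    teacher_effect = rng.normal(0, 2)        # random intercept per teacher
    certified = rng.integers(0, 2)           # content-area certification flag
    for _ in range(students_per_teacher):
        pretest = rng.normal(50, 10)
        low_income = rng.integers(0, 2)      # free/reduced lunch proxy
        gain = (2.0 + 0.1 * pretest - 1.5 * low_income
                + 1.0 * certified + teacher_effect + rng.normal(0, 3))
        rows.append(dict(teacher_id=t, pretest=pretest,
                         low_income=low_income, certified=certified, gain=gain))

df = pd.DataFrame(rows)

# Random-intercept model: gain ~ student- and teacher-level predictors,
# grouped by teacher (level 2). A full HLM/VAM would also model school effects.
model = smf.mixedlm("gain ~ pretest + low_income + certified", df, groups="teacher_id")
result = model.fit()
print(result.summary())
```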
Oklahoma Town Uses Workforce Analytics to Lure Manufacturer to Region

The problem:
As the U.S. skilled labor supply continues to tighten, economic development groups struggle to demonstrate the availability of skilled workers to investors (site selectors and businesses). The Belgian high-tech materials company Umicore decided to expand its germanium wafer production to the U.S. The initial search parameters narrowed the location to three cities – Phoenix, Albuquerque and Quapaw in Oklahoma. Quapaw is a small town of 966 residents. Even though Umicore’s Opticals division had an existing plant in Quapaw and expanding there would be cost efficient, the company could not be sure of the availability of the requisite workforce. When the site selector approached Judee Snodderly, executive director of the Miami Area Economic Development Service, Oklahoma’s public data did not have sufficient detail or flexibility to answer the questions. To complicate the problem, Miami’s regional labor force is shared by four states (Oklahoma, Kansas, Missouri and Arkansas), and integrating them to get an accurate and reliable picture was difficult.

The solution:
 Data collection:
o EMSI (Economic Modeling Specialists International) captures and stores labor market data within its web-based market research, visualization and analytics tool, “Analyst”. EMSI’s broad array of products, from Analyst (labor market) to Career Coach (career vision) and economic impact studies, brings substantial value to analysts and decision makers.
o Analyst taps into a composite of more than 90 federal, state and private data sources, refreshed quarterly and available at many granularities – county, ZIP, MSA or multi-state region (a toy example of this kind of regional aggregation follows this case’s references).
o It also covers the UK and Canada internationally.
 Aggregate Analysis: The Analyst tool has dashboard capabilities with multiple visualization options such as maps, charts and tables.
o Gary Box, the business retention coordinator at the Workforce Investment Board of Southwest Missouri, had access to EMSI’s Analyst tool, which provided him with industry and occupation reports highlighting high-tech manufacturing skill availability. It also enabled him to emphasize the availability of “compatible” talent for Umicore in the region.
Tool Used: EMSI’s web-based labor market tool, Analyst
The Results:
 In late June 2008, Umicore chose Quapaw as the new location for its germanium wafer production site, resulting in an investment of $51 million into the region and 165 new jobs with an average salary of $51,000 a year, not including benefits.
 Construction began in 2008 and continued through late 2009. EMSI estimates that the total impact on the economy of Ottawa County during the construction phase alone was more than 160 jobs and nearly $9 million in earnings annually.
 Once the plant began operating, that impact rose to more than 250 jobs and more than $12 million in earnings annually.
References:
http://thehiringsite.careerbuilder.com/2013/03/04/workforce-data-case-study/
http://www.economicmodeling.com/wp-content/uploads/Analyst_onepage2013_v1b.pdf
http://www.economicmodeling.com/analyst/
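A toy illustration of the regional aggregation idea follows: combining county-level occupation counts across a labor shed that crosses state lines. The counties, occupations and job counts below are invented, not EMSI data.

```python
# Toy multi-state labor-shed aggregation with made-up counts.
import pandas as pd

county_jobs = pd.DataFrame([
    # state, county, occupation, jobs (all hypothetical)
    ("OK", "Ottawa",   "Semiconductor processing technicians", 120),
    ("OK", "Ottawa",   "Industrial machinery mechanics",        310),
    ("KS", "Cherokee", "Semiconductor processing technicians",   45),
    ("KS", "Cherokee", "Industrial machinery mechanics",        150),
    ("MO", "Jasper",   "Semiconductor processing technicians",   80),
    ("MO", "Jasper",   "Industrial machinery mechanics",        420),
    ("AR", "Benton",   "Semiconductor processing technicians",   60),
    ("AR", "Benton",   "Industrial machinery mechanics",        390),
], columns=["state", "county", "occupation", "jobs"])

# The labor shed crosses state lines, so aggregate by occupation over the
# whole four-state region instead of reporting each state separately.
regional_supply = (county_jobs
                   .groupby("occupation", as_index=False)["jobs"]
                   .sum()
                   .sort_values("jobs", ascending=False))
print(regional_supply.to_string(index=False))
```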
Optimization of the “Procure-to-Pay” process – a strategic customer benefit initiative by VISA

The problem:
A $2 billion U.S. construction company was exploring options to maximize its card program by analyzing its processing costs and efficiency across its entire Procure-to-Pay process. Its Visa Issuer (issuing bank) introduced it to Visa’s Procure-to-Pay and Visa PerformSource, a consultative service aimed at maximizing value from the commercial card program and the Procure-to-Pay process.

The solution:
Visa and its issuing banks (Issuers), through a new program, helped the U.S. construction company identify opportunities to improve Procure-to-Pay operations and increase savings through its card programs. The Optimization Review utilized analytical tools (the Procure-to-Pay Performance Gauge and the Accounts Payable tool) designed to identify, benchmark and improve Procure-to-Pay practices. These tools helped define a plan and a financial impact estimate for the expansion of Visa Commercial card programs.
 Data collection: Procurement and card operations data.
 Aggregate Analysis:
o Procure-to-Pay Performance Gauge: This tool is designed to help a company understand how to improve its current Procure-to-Pay processes and technology. A customized diagnostic report was developed, benchmarking the company against best-practice companies of the same revenue size.
o The Accounts Payable Analysis Tool: Helped the company analyze spend patterns and develop both strategic and tactical commercial card program implementation and expansion plans, organized by commodity, business unit and supplier. Additionally, the built-in ROI calculator estimated the financial benefits the company could realize through the card program (a toy sketch of this kind of ROI arithmetic follows this case’s references). The tool also allows companies to set program goals over a three-year time frame for the Visa Commercial card program expansion.
Tool Used: Visa PerformSource service toolkit (Procure-to-Pay Performance Gauge and Accounts Payable tool)
The Results:
 The company scored an overall Advanced rating (51–75). It had a good foundation – a clear Procure-to-Pay strategy, ROI analysis of all initiatives, automated processes and inclusion of payment method in preferred vendor contracts.
 It also implemented a best-practice commercial card program (distribution of an approved vendor list, accounts payable interception of invoices and distribution of cards to super users).
 However, there was still scope for further cost reductions, greater control and process efficiencies – ongoing vendor list management, communication/audit/reporting of non-compliance and regular reports to senior management. Visa projected net process efficiency savings of $0.6 million over three years from the program.
References:
http://usa.visa.com/corporate/corporate_solutions/perform_source/case_study_construction.html
http://usa.visa.com/corporate/corporate_solutions/perform_source/index.html
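For a sense of the kind of three-year ROI arithmetic such a calculator might perform, here is a minimal sketch. Every number and parameter name below is invented for illustration; it does not reproduce Visa PerformSource’s actual methodology.

```python
# Toy three-year card-program ROI calculation with assumed costs and rebates.
def card_program_roi(invoices_per_year: int,
                     share_moved_to_card: float,
                     cost_per_check_invoice: float = 12.0,   # assumed manual AP cost
                     cost_per_card_invoice: float = 4.0,     # assumed card-based cost
                     rebate_rate: float = 0.01,              # assumed card rebate
                     avg_invoice_value: float = 1_500.0,
                     annual_program_cost: float = 50_000.0,
                     years: int = 3) -> dict:
    moved = invoices_per_year * share_moved_to_card
    process_savings = moved * (cost_per_check_invoice - cost_per_card_invoice)
    rebates = moved * avg_invoice_value * rebate_rate
    annual_net = process_savings + rebates - annual_program_cost
    total_net = annual_net * years
    roi = total_net / (annual_program_cost * years)
    return {"annual_net_benefit": annual_net, "total_net_benefit": total_net, "roi": roi}

result = card_program_roi(invoices_per_year=40_000, share_moved_to_card=0.25)
print({k: round(v, 2) for k, v in result.items()})
```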
The US Women’s Cycling Team’s Big Data Story of the London 2012 Olympics

The problem:
The last time the US women’s team had won a track medal at the Olympics was two decades earlier. The team was outspent by the British, who had a budget of $46 million, and outstaffed by 10 to 1. They entered the tournament 5 seconds away from even being considered for the medals. Closing this gap to reach a medal was considered a near-impossible task by many.

The solution:
Sky Christopherson, himself a recent world record holder and a heavy user of Quantified Self data, was an instrumental force in helping the team do the impossible.
 Data collection: Quantified Self data from sensors, cameras and iPads covering environment, sleep patterns, genetics, blood glucose tracking and just about everything that matters to the cyclists. The data was collected every second, 24 hours a day, 7 days a week.
 Reporting: Visualization of key metrics in charts, tables, etc.
 Aggregate Analysis: Profiling and drill-down of various key metrics by other levers.
 Correlation Analysis: Among various drivers – lifestyles and routines versus performance (a toy correlation sketch follows this case’s references).
 Trend Analysis: Impact of changes on performance over time.
Tool Used: Datameer
The Results:
 Race strategies, health and recovery routines, and changes to day-to-day lifestyle patterns and habits – in short, data-driven actions to improve the performance of the cyclists.
 The US team beat Australia, the overwhelming favorites, in the semi-final round by 0.08 seconds and went on to win the silver medal.
References:
http://www.datameer.com/learn/videos/us-womens-olympic-cycling-team-big-data-story.html
http://fora.tv/2013/06/27/The_US_Womens_Cycling_Teams_Big_Data_Story
http://en.wikipedia.org/wiki/Quantified_Self
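Here is a toy correlation analysis relating daily routine metrics to performance, in the spirit of the quantified-self work described above. The data below is randomly generated and the metric names are invented; it is not the team’s actual sensor data or Datameer output.

```python
# Toy correlation of routine metrics with a performance measure (synthetic data).
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
days = 60
sleep_hours = rng.normal(7.5, 0.8, days)
avg_glucose = rng.normal(95, 8, days)
training_load = rng.normal(300, 40, days)

# Hypothetical relationship: more sleep helps, extreme load hurts; noise added.
sprint_time = (11.0 - 0.08 * (sleep_hours - 7.5) + 0.002 * (avg_glucose - 95)
               + 0.001 * np.abs(training_load - 300) + rng.normal(0, 0.05, days))

df = pd.DataFrame({"sleep_hours": sleep_hours, "avg_glucose": avg_glucose,
                   "training_load": training_load, "sprint_time_s": sprint_time})

# Correlation of each routine metric with sprint time (lower time = better).
print(df.corr()["sprint_time_s"].drop("sprint_time_s").round(2))
```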
Helping keep the Manitoba food chain safe by tracking disease outbreaks

The problem:
The Manitoba Agriculture, Food & Rural Initiatives (MAFRI) ministry, under Chief Veterinary Officer Dr. Wayne Lees, is responsible for safeguarding the agri-food chain in Manitoba, Canada and beyond. The ministry either actively manages an animal disease outbreak and/or strategizes on how best to prevent it or effectively control it the next time. The electronic tracking system containing the livestock premises identification system tracks the movement of livestock across the food chain, ensuring the ability to pinpoint risks from animal-to-animal exposure. It also enables effective inter-agency collaboration for rapid and effective response in the event of outbreaks. Manitoba is the largest pork-exporting province in Canada, and almost two thirds of its hog production is exported to the US for finishing. Any unmanaged outbreak could be financially catastrophic. Information on an outbreak can come from multiple sources – people, rumour, diagnostic labs and veterinary practitioners. Within months of the launch of the tracking system, there was a transmissible gastroenteritis (TGE) outbreak in pigs within a cluster of three farms.

The solution:
 Data collection: The Premises ID database collects all information on the livestock.
 Aggregate Analysis: Visualization of disease trackers on maps (origin location, proximity, size of the outbreak and at-risk population, livestock movements).
 Sizing & Estimation: Calculation of the optimal buffer zone, what-if scenario planning of spread patterns and response strategies (a toy buffer-zone sketch follows this case’s references).
 Correlation Analysis: Identification of potentially exposed herds – animals that had come into and out of the affected farms.
 Predictive Analysis: Modeling of outbreak spread based on farm locations, animal densities and other factors.
Tool Used: IBM Maximo Asset Management & IBM Global Business Services – Industry Consulting
The Results:
 By identifying, quantifying and analyzing risk factors at the time of detection, MAFRI reduced the average disease control cycle by 80% – in this case by weeks – with no additional farms affected.
 Almost halved the “downstream” containment costs, such as manpower and transport.
 Enabled more targeted and accurate application of epidemic responses such as quarantines.
 Reduced the risk of export restrictions and cull-related losses resulting from animal disease epidemics, which represent millions of dollars in direct losses to the Manitoba livestock industry and local economies.
 Enabled more efficient deployment of animal disease control specialists in the field during outbreaks.
References:
http://www.ibm.com/smarterplanet/us/en/leadership/mafri/assets/pdf/MAFRI_Paper.pdf
http://www.ibm.com/smarterplanet/us/en/leadership/mafri/
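To make the buffer-zone idea concrete, here is a minimal sketch that flags premises within a fixed radius of an outbreak location. The coordinates, farm names and radius are invented; this is not MAFRI’s premises-ID system or IBM Maximo logic.

```python
# Toy buffer-zone check: which farms fall inside a control radius of an outbreak.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

outbreak = (49.90, -97.14)   # hypothetical outbreak premises
buffer_km = 25               # hypothetical control-zone radius

farms = {                    # hypothetical premises registry
    "Farm A": (49.95, -97.20),
    "Farm B": (50.30, -96.80),
    "Farm C": (49.88, -97.05),
    "Farm D": (50.80, -98.00),
}

for name, (lat, lon) in farms.items():
    d = haversine_km(outbreak[0], outbreak[1], lat, lon)
    status = "AT RISK (inside buffer)" if d <= buffer_km else "outside buffer"
    print(f"{name}: {d:5.1f} km -> {status}")
```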
Chapter Summary
The intent of this chapter was to illustrate the application of data analysis principles, from visual analytics to advanced analytics, in various walks of life. Our hope is that the above stories have struck a chord with your imagination, and that the next time you observe the life around you, you are able to relate to it at an “analytical” level. Given the rapid digitization of our lives, data generation will only keep growing, and the analysis of such data is only going to make our lives better. Do not be surprised if, tomorrow, your self-driving, self-thinking car recommends a great exotic restaurant after analyzing your food-channel viewing patterns.