Designing Quality In
Applying Information Quality Principles
to
Data Protection
You cannot inspect quality into the product;
it is already there.

W. Edwards Deming
Some Theory….
A Case Study…
Lessons learned…
In less than 60 mins…
(including Q&A)
Some Theory….
Peter Drucker
“So far, for 50 years, the information
revolution has centered on data - their
collection, storage, transmission, analysis,
and presentation.
It has centered on the "T" in IT.
The next information revolution asks, what
is the MEANING of information, and what is
its PURPOSE?”
Technology is just a tool.
Data Protection Rules
1. Personal data which is being processed must be fairly obtained
and processed

2. Personal Data shall be kept accurate and complete and, where
necessary, kept up to date
3. Personal Data shall be obtained for a Specified and Lawful
Purpose
4. Personal Data shall not be processed in a manner incompatible
with the specified purpose.
Data Protection Rules
5. Data processed must be adequate, relevant and not
excessive
6. Personal data should not be kept for longer than necessary
for the specified purpose or purposes
7. Personal Data should be kept Safe & Secure
8. Data Subjects have a right of Access.
Nominated Responsibility

Legislation requires a
nominated “Data
Protection Officer” to
exist in organisations
Linking to Data Quality
EU Directive 95/46/EC defines “Data Protection”
principles as
“Data Quality Principles”.
What is “Information Quality”?

The degree to which information and data can
be a trusted source for any or all required uses.
The degree to which data and information
meets the specific needs of specific customers.

Consistently meeting or exceeding knowledge
worker/end customer expectations.
Data Quality Characteristics
Information Chains
[Diagram: Information Flow. Some Input → Some Action (by someone) → Some Output, that becomes an Input → Some Action (by someone) → Some Output, that becomes an Input → Some Action (by someone) → Some Output]
Who
• are they doing it for?
When
• are they doing it?
• are they stopped from doing it?
• are they finished doing it?
• do they start doing it again?
What
• do they need to do it?
• do IT systems need to interact?
How
• does this get done?
• is technology deployed to support?
Why
• are we doing this?
If you can't describe what
you are doing as a
process...

... You don’t know what you
are doing.

W. Edwards Deming
THIS IS NOT A PROCESS Map or
Info Chain Description
• We do this.
• Then Martin in Accounts does that.
• Then Betty in Receivables does this other thing.
• Then it comes back to us.
• Then something else happens.
• 4th Thursday of month the Jabberwock audits.
If I had wanted to
know what you did
on your holidays,
I’d have asked.

Process Improvement Lead, Telco industry
The Info Asset Life Cycle
Plan
What resource do we need? What do we need it for? What
attributes/characteristics should this resource have? Are we
prepared for this resource?

Obtain
How will we get this resource? Where will we get it?

Store/Share
How will we accommodate/store this resource? How (if necessary)
will it be shared amongst functions in the business?

Maintain
How will we maintain and develop this resource to ensure maximum
utility and value?

Apply
How will we use this resource? How will the resource be used to
generate net cash inflows or support the delivery of services?

Dispose
How will we reduce our volumes of this asset when it no longer
serves a valuable purpose? What conditions will indicate that an
asset is no longer serving a valuable purpose?
Data Quality Characteristics
Metrics & Measurement
Statistical Process Control

Scorecards

From Vishagashe.wordpress.com
Only 1 in 10 companies performed some form of
data profiling on their datasets,
affecting risk assessment on data
migrations and other initiatives.

Source: Bloor Research 2007
Summary (of Theory)
Data Protection & Information
Quality are closely linked disciplines
Understanding your Processes is key

Quality has to be built in
Inspecting defects out is not Quality
POSMAD Information Life Cycle gives context
You can measure Quality of Information
(across many characteristics)
A Case Study…
Disclaimer:
The case study described here is a composite of a
number of projects that I have either worked on directly or
studied in direct contact with the project sponsors and
project managers involved.
This has been done to preserve the anonymity of the
organisation(s) involved.
Not all the projects were “Data Protection” focussed.
However, the application of Information Quality and Data
Governance best practices resulted in opportunities to
support Data Protection being identified and seized.
The Project

I’ve got a cunning plan.
The Project
• Sales Force Automation
• Single View of Customer
• Master Data Management
• “e-Nable” traditional paper-based processes
• Outsource some Call Centre/Field Sales operations
• E-billing for customers
Wanted
Got
The First Mistake
Focus on IT Architecture and systems infrastructure
Process Definition & Data Quality issues “descoped”
Plans became based on “Systems”
Project teams became siloed
Focus on “Data Subject” lost
Second Mistake

Key information wasn’t
available in the right
format for certain
processes
Couldn’t handle
Consumer customer
with >1 billing account.

Interim Solution:

Needed “Customer Name” as
<title><firstname><lastname>

Interim Solution:
All customer names in Single View of
Customer database parsed using an
MS Access Database on a laptop each
week
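The name-format requirement above is the kind of transformation that can be written down once and tested, rather than re-run by hand each week. What follows is only a minimal sketch, assuming names arrive as a single free-text string in the order title, first name, last name. The function name, the `ParsedName` type and the short list of titles are hypothetical and not taken from the projects described; real customer names are considerably messier.

```python
from typing import NamedTuple

# Hypothetical, deliberately short list of titles; a real list would be
# agreed with the business and maintained as reference data.
KNOWN_TITLES = {"mr", "mrs", "ms", "miss", "dr", "prof"}


class ParsedName(NamedTuple):
    title: str
    first_name: str
    last_name: str


def parse_customer_name(raw: str) -> ParsedName:
    """Split a free-text name into title, first name and last name.

    Assumes the order "<title> <firstname> <lastname>" with an optional
    title. Everything after the first name is treated as the last name.
    """
    parts = raw.split()
    title = ""
    # Treat the first token as a title only if it is in the known list.
    if parts and parts[0].rstrip(".").lower() in KNOWN_TITLES:
        title = parts[0].rstrip(".")
        parts = parts[1:]
    first_name = parts[0] if parts else ""
    last_name = " ".join(parts[1:])
    return ParsedName(title, first_name, last_name)
```

Names that do not fit the assumed order (for example, a missing first name or a multi-word first name) will be split incorrectly, so a real implementation would also need to log and review the records it cannot parse with confidence.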
Third Mistake

Personal Data was being
captured that wasn’t actually
being used anywhere or whose
use was unclear
[Mock-up of the paper form that had historically been used]
First name
Last name
Address 1
Address 2
Address 3
Address 4
PPSN
Age
Gender
Junk Mail: Yes/No
Bank A/C details/Credit Card: Acct Number, Branch Code, Credit Card Number
Contact Telephone
Mother’s Maiden Name
Details of any Medical Conditions
Household Demographics: Marital Status (Married, Single, Separated, Divorced, Cohabiting), Number of Children, Partner’s Name
Services you are interested in: Service 1, Service 2, Service 3, Service 4
Why Spock?
Why?
Basic design also used in Call Centre and for on-line self service sign up
(write once, use often)
Fourth Mistake

Junk Mail: Yes/No
Assumption: No means No?
Fifth Mistake

Incentivisation:
Performance related pay
tied to % of fields
POPULATED
Sixth Mistake

Outsource Contract
No contract terms re:
• Data Protection
• Data Quality
Plan

Processes were mapped
Timing of data needs identified
Relevance of Data was documented
Critical Quality Measures picked
Meaning of Data clearly defined
Purpose of Data clearly understood
Obtain
[Diagram: process map for customer sign-up. Labels: Customer on-line, On-line Self Service, Call Centre, Sales Team. Steps: Customer inputs data; Log Expression of Interest; Review Customer Request; Customer Acquisition Decision; Complete Additional Data. Output: Customer data.]

Call Centre & Web forms redesigned
Unused data removed from forms
Timing of questions was changed
The downstream processes were age-sensitive. Just
capturing AGE at a point in time meant customer
services could not be delivered reliably.
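One common way to avoid a stale AGE value is to hold a date of birth and derive the age whenever a process needs it. The slides do not say which fix the projects chose, so the sketch below is an assumption for illustration only, and the function name `age_on` is hypothetical.

```python
from datetime import date


def age_on(date_of_birth: date, as_of: date) -> int:
    """Return the number of completed years between date_of_birth and as_of."""
    years = as_of.year - date_of_birth.year
    # Subtract one if the birthday has not yet occurred in the as_of year.
    if (as_of.month, as_of.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years
```

Because the age is calculated against the date the process runs, the same stored value stays correct for every age-sensitive step, which a captured "Age" field cannot do.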
Maintain

Key metrics defined for Information
Processes defined for Maintaining data
… by the customer (self service)
… by the organisation
… in response to leading indicators
Data Quality Characteristics
Maintain

Manual Data?

Define characteristics to be measured
Random sample of files (Poisson Distribution)
Simples
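The sampling step for manual files can be sketched as below. This is an illustration only: it draws a simple random sample without replacement and reports the proportion of sampled files that fail a check. How the sample size was chosen in the projects (the slide mentions a Poisson distribution) is not reproduced here, and the function names and the `has_defect` callback are hypothetical.

```python
import random


def sample_files(file_ids, sample_size, seed=None):
    """Draw a simple random sample of file identifiers without replacement."""
    population = list(file_ids)
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable for audit
    return rng.sample(population, min(sample_size, len(population)))


def defect_rate(sampled_files, has_defect):
    """Proportion of sampled files for which has_defect(file) is true."""
    if not sampled_files:
        return 0.0
    defects = sum(1 for f in sampled_files if has_defect(f))
    return defects / len(sampled_files)
```

The `has_defect` check is where the characteristics chosen for measurement are applied, for example "the consent box is blank" or "the address is more than three years old".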
Maintain

Measured performance of Process

Completeness

Currency (age of data)
Consistency
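Two of these measures, completeness and currency, are simple enough to sketch. The example below assumes each record is a dictionary with a `last_updated` date; the field names and function names are hypothetical, and consistency is left out because it depends on rules specific to each dataset.

```python
from datetime import date


def completeness(records, required_fields):
    """Fraction of required field values that are populated (non-empty)."""
    total = len(records) * len(required_fields)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for record in records
        for field in required_fields
        if record.get(field) not in (None, "")
    )
    return filled / total


def average_age_days(records, as_of, date_field="last_updated"):
    """Mean number of days since each record was last updated."""
    ages = [(as_of - r[date_field]).days for r in records if r.get(date_field)]
    return sum(ages) / len(ages) if ages else 0.0
```

Note that completeness measured this way only shows that a field is populated, not that the value is correct, which is the trap behind the Fifth Mistake above.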
Store/Share

Outsourcing contract was renegotiated
Clear expectations set
Duties & Obligations
Security/Privacy
Process standards
Data Quality
Store/Share

Controls implemented over
“End User Computing”

Use and purpose informed future
development roadmap
Apply

By understanding “HOW”
data was used, and
where...
Apply

• Right Data to Right place at Right
time for Right Reason
• Right Risk Assessment, Right Security,
Right controls.

• Less cost of rework and worry
Disposal/Destruction

Metrics

Average Age (of data)
Average Time since
last update/access

Actions

Business Case
Support Retention Policy
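A retention policy can use the "time since last update/access" metric directly. The sketch below assumes each record carries a `last_accessed` date and that the retention period is expressed in days; both are illustrative assumptions, as is the function name, and the actual retention periods would come from the organisation's policy.

```python
from datetime import date, timedelta


def records_due_for_review(records, as_of, retention_days):
    """Return records whose last access is older than the retention period."""
    cutoff = as_of - timedelta(days=retention_days)
    return [r for r in records if r["last_accessed"] < cutoff]
```

The output is a list for review rather than an automatic deletion, so that the decision to dispose of data stays with whoever owns the retention policy.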
Lessons learned…
Technology is just a tool.
Information Value
Chains are the
Missing Link

A.K.A. Processes
A.K.A. “CyCles”
A.K.A. SIPOC
A.K.A. Workflow

If you can't describe
what you are doing as a
process...
... you don’t know what you
are doing.
Information has a life cycle.
That life cycle gives important
CONTEXT
From Toothpastefordinner.com

Process & Context
=
Meaning & Purpose
Information has attributes you
can measure...

Measurement can support Controls and Policies
Metrics can support Change Management goals
What is measured gets
done.
Perspective is important...
The Project did not set out to address
Data Protection per se.
But deliverables supported proactive
Data Protection Management
Sustaining the change is the
Big Challenge
Data Protection often sold on FEAR
How can you feed the GREED motive?
What is your
Data Protection Scorecard?
How does it translate to your
bottom line?

From Asset to Impact - Presentation to ICS Data Protection Conference 2011


Editor's Notes

  • #30 The Bloor Research study looked at planning for data migrations. Only 1 in 10 of the companies that responded to the survey had done any data profiling as part of that planning. That statistic backs up the anecdotal stories from Risk Management consultants that the biggest problem in Risk Audits or Risk Reviews is that people don’t have the information to make informed decisions and, for the information they do have, are not always in a position to stand over its accuracy.