Charleston Conference
November 2013
Christine Orr
1-2-3
One: Get Ready to Clean Your Data
Two: Three Things You Can Do Right Now to Improve Your Data
Three: Next Steps
One: Get Ready
1. What problem are you trying to solve?
2. Prioritize your records
3. Gather tools & resources
…and a tip
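
The tip, per the speaker notes, is to back up the original data and work from a copy. A minimal sketch of that habit follows; the `make_working_copy` helper and the `_working` suffix are illustrative assumptions, not part of the talk.

```python
import shutil
from pathlib import Path


def make_working_copy(export_path, suffix="_working"):
    """Copy an exported data file and return the path of the copy.

    The original export is never modified, so it doubles as the backup.
    """
    source = Path(export_path)
    working = source.with_name(f"{source.stem}{suffix}{source.suffix}")
    shutil.copy2(source, working)  # copy2 also preserves file timestamps
    return working
```

All cleanup edits would then be made in the returned copy, with the untouched export kept in case any overwritten records cause problems after re-import.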
Two: 3 Easy Things You Can Do Right Now
Really.
Step 1: Create a Style Sheet
• Define each data element
  • Make a list including all fields of a record, and define what each means (and what it doesn’t)
• Decide which elements are required
  • Which must be filled in now as part of the cleanup effort, or for the creation of any new records
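
One way to make a style sheet concrete is to keep a machine-readable version alongside the written one. The sketch below is only an illustration: the field names echo the cleaned-up example later in the deck, while the definitions, the `FieldRule` class, and the `required_fields` helper are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRule:
    """One style-sheet entry: what a field means, what it excludes, whether it is required."""
    definition: str
    excludes: str
    required: bool


# Hypothetical first-draft style sheet; the wording of each definition is illustrative.
STYLE_SHEET = {
    "prefix": FieldRule("Honorific such as Dr. or Ms.", "Job titles", required=False),
    "job_title": FieldRule("Role held at the organization", "Honorifics", required=False),
    "account_name": FieldRule("Official organization name", "Departments or libraries", required=True),
    "city": FieldRule("City of the mailing address", "State or postal code", required=True),
    "country": FieldRule("Country of the mailing address", "Free-text variants", required=True),
}


def required_fields(style_sheet):
    """Return the names of the fields that must be filled in, sorted for stable output."""
    return sorted(name for name, rule in style_sheet.items() if rule.required)


print(required_fields(STYLE_SHEET))
```

Recording what a field excludes, not only what it holds, is what keeps a job title out of the prefix column.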
Set naming conventions
To abbreviate or not to abbreviate?
UCL:
• University College London (UK)
• Université Catholique de Louvain (Belgium)
• Universidad Cristiana Latinoamericana (Ecuador)
• University College Lillebælt (Denmark)
• Centro Universitario Celso Lisboa (Brazil)
• Union County Library (USA)

English or Native Language?
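
The UCL example can be expressed as a lookup that refuses to guess. This is a hedged sketch: the expansions come from the slide and from the sample data later in the deck, but the `ABBREVIATIONS` table and the `expand` function are hypothetical names introduced for illustration.

```python
# Known expansions per abbreviation. "UCL" is taken from the slide above;
# "KUL" appears in the sample data later in the deck.
ABBREVIATIONS = {
    "UCL": [
        "University College London (UK)",
        "Université Catholique de Louvain (Belgium)",
        "Universidad Cristiana Latinoamericana (Ecuador)",
        "University College Lillebælt (Denmark)",
        "Centro Universitario Celso Lisboa (Brazil)",
        "Union County Library (USA)",
    ],
    "KUL": ["Katholieke Universiteit Leuven"],
}


def expand(abbreviation):
    """Return the expansion only when exactly one candidate exists, otherwise None."""
    candidates = ABBREVIATIONS.get(abbreviation.strip().upper(), [])
    return candidates[0] if len(candidates) == 1 else None


print(expand("KUL"))
print(expand("UCL"))
```

An ambiguous abbreviation returns `None` so that a person, not the script, decides which institution is meant.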
Style Sheet: Don’t reinvent the wheel
Any authority files available that you can use? Yes!
• ISO Country Code List (ISO 3166)
• UN/LOCODE for Cities
• Carnegie Classifications Data File
• Standard Identifiers (ISNI, Ringgold ID)
• Europa World of Learning
First draft only, for now.
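
As an illustration of leaning on an authority file, the sketch below maps free-text country values onto ISO 3166-1 alpha-2 codes. The variants are the kind seen in the sample data in this deck; the `COUNTRY_VARIANTS` table is a small hand-built assumption, and a real project would load the full ISO list rather than type it in.

```python
# Free-text variants mapped to ISO 3166-1 alpha-2 codes (illustrative subset only).
COUNTRY_VARIANTS = {
    "india": "IN",
    "canada": "CA",
    "can": "CA",
    "uk": "GB",
    "united kingdom": "GB",
    "belgium": "BE",
    "be": "BE",
    "usa": "US",
    "united states": "US",
    "united states of america": "US",
}


def normalize_country(raw):
    """Return the alpha-2 code for a known variant, or None so the record can be reviewed."""
    key = " ".join(raw.replace(".", "").split()).lower()
    return COUNTRY_VARIANTS.get(key)


print(normalize_country("United  States of America"))
print(normalize_country("U.K."))
```

Unknown values come back as `None` rather than being passed through, which keeps non-standard country names from slipping into the cleaned file unnoticed.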
Step 2: Fill in what is missing
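
A quick way to see what is missing is to list, per record, which required fields are empty. The sketch below assumes a particular set of required fields and uses two records loosely based on the sample data in this deck; `REQUIRED` and `missing_required` are hypothetical names.

```python
# Assumed required fields; the real list comes from your own style sheet.
REQUIRED = ("account_name", "city", "postal", "country")


def missing_required(record, required=REQUIRED):
    """Return the required field names that are absent or blank in a record."""
    return [name for name in required if not str(record.get(name) or "").strip()]


# Sample records loosely based on the "Data: Before" slide.
records = [
    {"id": "55897", "account_name": "High Altitude Warfare School",
     "city": "", "postal": "", "country": "India"},
    {"id": "55632", "account_name": "Katholieke Univ Leuven",
     "city": "Leuven", "postal": "3000", "country": "Belgium"},
]

gaps = {r["id"]: missing_required(r) for r in records if missing_required(r)}
print(gaps)
```

The resulting list of gaps can be worked through by priority, starting with the fields needed to fulfill services or orders.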
Step 3: Put everything in its proper place
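
Much of this step can be done programmatically in the working file. The sketch below shows two hedged examples: collapsing spelling variants with a lookup, and splitting a city that was joined onto a street address. The variant tables and function names are assumptions for illustration, and the comma split is deliberately naive, so addresses with several commas still need a manual look.

```python
# Variant spellings like those on the "Data: Before" slide, mapped to one standard form.
CITY_VARIANTS = {"sf": "San Francisco", "san fran": "San Francisco",
                 "san franisco": "San Francisco"}
STATE_VARIANTS = {"calif": "CA", "cal": "CA", "california": "CA"}


def clean_value(raw, variants):
    """Replace a known variant with its standard form; leave other values trimmed but unchanged."""
    key = raw.strip().rstrip(".").lower()
    return variants.get(key, raw.strip())


def split_street_city(address1):
    """Split 'street, city' at the last comma; return (street, '') when there is no comma."""
    street, sep, city = address1.rpartition(",")
    if not sep:
        return address1.strip(), ""
    return street.strip(), city.strip()


print(clean_value("Cal.", STATE_VARIANTS))
print(clean_value("San Franisco", CITY_VARIANTS))
print(split_street_city("778 North Ring Road, Gulmarg"))
```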
Data: Before
ID
55897

Title
Lt.

FirstName

LastName Company
Worf
High Altitude Warfare
School

Address1
778 North Ring
Road, Gulmarg

Address 2

69583

Ms.

587

City

State

Deanna

Troi

Roots of Empathy

Continuing
Education Dept.

P.O. Box 3232

Toronto

Capt.

Jean

Luc

Picard

University of
Birmingham
Shakespeare
Institute

Edgbaston

Stratford

55632

Dr.

Beverly

Crusher

Katholieke Univ Leuven
5963 Central
Departement Molecularie University Road
Celbiologie

Leuven

55633

Dr.

Bev

Crusher

KUL Dept of Molecular Cell 5963 Central
Bio
University Road

Leuven

55634

Mr.

Geordie

LaForge

UW MSN

School of
Engineering

55638

Mr.

W.

Crusher

Husson Uvinersity

50 Main Street

59648

Dr.

Wesley

Crusher

Sawyer Library, Husson
University

Serials Dept.

50 Main Street Bangor

58963
57741

Lieutenant
Com.

Commander
William

Data
T. Riker

Starfleet Headquarters
Star Fleet

Main Library
1 Ellipse Way

1 Ellipse Way
Main Library

SF
CA
San Franisco Calif

Z29.30

Librarian

Starfleet

1 Ellipse Way

Serials
Department

San Fran

58582

Librarian

Starfleet

1 Ellipse Way

Serials
Department

San Francisco Cal.

Postal

Country
India

Title
Weaponry Weekly

ONT

Canada

European Journal of
Empathy and Sympathy

on-Avon

UK

Leadership Quarterly

3000

Belgium

Journal of Interspecies
Bioengineering

3000

BE

Prognosis Negative: the
Journal of Poor Outcomes

90044

960 Vilas St.

Madison

WI

53706

Bangor

Maine

044012999

USA

Navigational Systems

ME

044012999

USA

Child Prodigy Management

CA

Advances in Warp Speed
Engine Efficiency

Alls Package
Leadership Quarterly

United
States
United
States of
America

Annals of Mind Meld
Research

USA

Transporter Diagnostics
Data: After
Account
ID *
55897

Prefix
Lt.

FirstName * MI
Steve

LastName *
Worf

69583

Ms.

Deanna

Troi

58700

Capt.

Jean-Luc

Picard

55632

Dr.

Beverly

Crusher

55632

Dr.

Beverly

55634

Mr.

55638

X

Title
Dean of
Students

Org Name

Account Name *
High Altitude Warfare
School

Department
Combat Instruction
Division

Street Address1 *
778 North Ring
Road

Roots of Empathy

Continuing Education
Department

Address 2

City *
Gulmarg

State

Postal *
90044

Country *
IN

P.O. Box 3231

Toronto

ONT

M4B 1B7 CAN

Edgbaston

Stratford-onAvon

CV37 9FX UK

University of
Birmingham

University of Birmingham Chancellors Office
Shakespeare Institute

Professor

Katholieke
Universiteit
Leuven

Katholieke Univ Leuven Pulaski Memorial Library 5963 Central
Departement Molecularie
University Road
Celbiologie

Leuven

3000

BE

Crusher

Professor

Katholieke
Universiteit
Leuven

Katholieke Univ Leuven Pulaski Memorial Library 5963 Central
Departement Molecularie
University Road
Celbiologie

Leuven

3000

BE

Geordie

LaForge

Dean of
Engineering

University of Wisconsin - School of Engineering
Madison

959 Vilas St.

Madison

WI

537062525

USA

Mr.

Wesley

Crusher

Serials
Librarian

Husson
University

Sawyer Library

Serials Department

50 Main Street

Bangor

ME

044012999

USA

55638

Dr.

Wesley

Crusher

Serials
Librarian

Husson
University

Sawyer Library

Serials Department

50 Main Street

Bangor

ME

044012999

USA

58963

Lt.

Phillip

N.

Data

Electronic
Resources
Librarian

Starfleet

Starfleet Academy

Uhura Information
Transfer Center

1 Ellipse Way

Serials
Department

San
Francisco

CA

941210001

USA

57741

Com.

William

T.

Riker

Starfleet

Starfleet Command

Fort Baker

Mailstop 27

San
Francisco

CA

941270601

USA

58963

Lt.

Phillip

N.

Data

Electronic
Resources
Librarian

Starfleet

Starfleet Academy

Uhura Information
Transfer Center

1 Ellipse Way

Serials
Department

San
Francisco

CA

941210001

USA

58963

Lt.

Phillip

N.

Data

Electronic
Resources
Librarian

Starfleet

Starfleet Academy

Uhura Information
Transfer Center

1 Ellipse Way

Serials
Department

San
Francisco

CA

941210001

USA
Three: Next Steps
1. Evangelize the style sheet
2. Encourage & enforce good data governance practices where you can
3. Keep the ball rolling…
…and maintain good data hygiene
• Review new records on a regular basis
• Determine if additional data sets need attention
• Educate new staff
• Show tangible ROI
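
A regular review of new records can include a simple check for likely duplicates. The sketch below groups records that share a last name, street address, and postal code. The choice of matching fields, the field names, and the sample records (loosely based on the sample data in this deck) are all assumptions, and the output is a list of candidates for a person to review, not an automatic merge.

```python
from collections import defaultdict


def match_key(record):
    """Build a comparison key from lower-cased, whitespace-normalized fields."""
    fields = ("last_name", "address1", "postal")
    return tuple(" ".join(str(record.get(f, "")).lower().split()) for f in fields)


def duplicate_candidates(records):
    """Return lists of record IDs that share the same match key."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


new_records = [
    {"id": "55632", "first_name": "Beverly", "last_name": "Crusher",
     "address1": "5963 Central University Road", "postal": "3000"},
    {"id": "55633", "first_name": "Bev", "last_name": "Crusher",
     "address1": "5963 Central  University Road", "postal": "3000"},
    {"id": "55634", "first_name": "Geordie", "last_name": "LaForge",
     "address1": "960 Vilas St.", "postal": "53706"},
]

print(duplicate_candidates(new_records))
```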
To sum up:
It’s absolutely worth it
You can do it
…But don’t do it alone
Christine Orr
Sales Director, North America
Ringgold Inc.
Christine.orr@ringgold.com
+1 540.359.6620
ORCID: http://orcid.org/0000-0003-1362-3330

Spring Cleaning: Easy Ways to Tidy Your Customer Data

Editor's Notes

  • #3 October has explained why it’s so important to have and maintain clean customer data. But you are probably thinking that your data is voluminous and held in multiple systems, that there are loads of issues you’d like to fix, and that cleaning it all (implementing a vast data governance program) feels like an insurmountable task. This is not that presentation: we are not going to cover everything, enterprise-wide. We are going to talk about simple things that you can do right now to improve the health of the data that you decide makes up an important record set. This is a short-term project to get your data fit for a single purpose, to make decisions on a single issue.
  • #4 I want to focus now on things you can do right now, with the resources you have available to you, to improve the health of your customer data. These things can also apply to prospect and contact data in your CRM, live customer records in your fulfillment system, your authentication system: anywhere your customer data lives. The takeaways here could even be used on product data (think about all the ways your title data might possibly be rendered). We may not be able to get it to the size of a molehill, but we can absolutely cut it down to size.
  • #5 TAKE TIME TO PLAN THIS SHORT-TERM DATA CLEANUP PROPERLY. Why are you doing this? Are you experiencing service problems, adding a new journal, migrating systems, or supporting pricing decisions? Think of a problem and let’s focus on getting your data fit for that purpose. Prioritizing will help you avoid scope creep. What timeframe of records is relevant: just current customer records, or last year’s, too? (It may not be worth cleaning anything older than 3 years, unless historical trends are needed.) Are any geographic regions more important or troublesome than others? What type of customer records need cleaning: institutional subscribers, members, individual subscribers, CRM contacts, or just live customer records from your fulfillment system or AMS? Tools and resources: People (can you enlist anyone else in your department or customer service? A consultant? IT staff?). Info sources: data sources, authority files, taxonomies. Time: how much time have you got? Are there any pressing deadlines, such as a system going live or a need to redo your pricing to align with the renewals schedule? And a tip: back up your original data and work from a copy of your data. Pull a spreadsheet or file, make your changes there, then re-import to your system of choice. You should also keep a copy of the pre-cleaned data in case you experience problems with any overwritten records.
  • #7 Get granular when you define fields for customer and product records: do not permit a field to hold two possible value types (e.g., job title vs. prefix; product name and format). Do you need to add any new fields? Address, middle name, official institution name…
  • #8 Set naming conventions for organization names, addresses, etc., for as many fields as you can. Abbreviation: once you abbreviate anything, it’s easy to start abbreviating everything. See the UCL example. If you must abbreviate due to fixed-length fields, be very clear about which words may be abbreviated and how. (Of course, a standard unique identifier like ISNI or RIN, or a separate “Official Name” field, means you may not need to try to standardize all org names anyway.) English vs. native: this applies to cities and countries (Germany/Deutschland; Venice/Venezia; Beijing/Peking), as well as institution names. If you are an international org with staff in different countries, native may be more helpful, as you are likely to be creating records in multiple languages. Just be consistent.
  • #10 You can always go back and refine this, define additional “nice to have” fields, or cover your products as well as customer addressing/service fields. It is easy to try to make the style sheet a huge part of the project, striving for perfection and trying to cover every possible data element. A first draft is enough for now, but it is a key step and will become your rulebook.
  • #11 Step 2: Ensure data is complete by filling in any missing required fields first: cities, states, postal codes, org names, etc. Some of this you will know easily: Munich with no “Germany”, NYC with no NY. Some will need to be filled in with progressive profiling (next time the customer renews, get the data), or by collapsing records you know to be duplicates, since each may have part of the info you need. Or use other data sources: the institution’s website can be a reliable source of information. Start with the fields you need to fulfill services or orders: these belong to your highest-value clients and are therefore your highest-value records. If it is easy to complete additional fields (URL, FTEs, consortia memberships) that you don’t have a use for now but are “nice to have”, do it: you cannot anticipate all future use cases for your data. Are there records that are completely missing? A “parent” record that might link together two related records, or a record for a consortium that could link all member records?
  • #12 Step 3: Put data in the right fields. You will have defined what counts as the right field in your style sheet, e.g., where to put the library name vs. the organization name. You may find you already have bits of correct data, but in the wrong place. Addresses frequently get concatenated or joined together, as do middle names. We will often see a full address (street, city, postcode, country, with carriage returns) all in one address field! Lots of the cleaning in your working file may be able to be done programmatically or with a search/replace: change all 5 spellings of Beijing in one swoop! Before you begin: you may want to do just a sample of the required set yourself to see how long it takes. If it is too time-consuming, see who else can help, such as a consultant.
  • #13 Notice: See the red text for suspect data: customer IDs that don’t seem to fit the pattern; data obviously in the wrong fields; four different versions of “California”; variant and abbreviated company names; post codes that are not standard; country names that are not standard.
  • #14 Data was cleaned here by using only the 3 steps outlined above.
  • #15 Evangelize: Stress the importance of good data. Put it in monetary terms, customer service terms, or membership terms: whatever language is understood in your corporate culture, and whatever it takes to get anyone responsible for data capture to adhere to the style sheet. Value the data entry staffers (and win over their bosses, if they don’t report to you). They may be “low level”, but their cooperation as part of the data capture process is vital to ongoing data health, and their contribution is important to ensuring your high-value clients are well served, and that you can even know who your high-value clients are! Enforce: Good data capture practices: encourage the use of drop-down menus in any internal or external systems. Free text is the death of good data. If you capture it correctly up front, you will not have to clean it later. Become a data champion.
  • #16 Continue to improve your data:
  • #17 Create a regular review process and schedule for new records. Renewal season is a good time for this: lots of new records are being added, but they are all high-value records. What can you do if you get orders with no identifiable institution name? Push back on whoever submitted the order and ask for better information, or match on ID number (the agent’s or your own customer number) and import only the new title/product order info rather than overwriting the org name and address fields. Determine if additional data sets or records need attention; these simple steps can be applied to any data set. Educate new staff on the importance of clean data. Continue to evangelize and get buy-in from those affected by poor data, or those who rely on good data to make key decisions. Show the results of your data cleansing efforts (examples: time saved, revenue preserved or increased, ROI).
  • #18 1. You and your customers will reap great benefits from clean data. 2. It need not be overwhelming if you keep to a tight scope of work that will fix the problem at hand. 3. Get help if you need it: build alliances with your colleagues and agents, and use existing style sheets and naming conventions where available. Lastly, call upon external resources if your needs for data governance go beyond what we’ve covered here.
  • #19 And now for any questions……