IMPACT Conference 2011. Richard Boulderstone, Director, eStrategy & Programmes. October 2011
Fantastic Project!
  • Highly collaborative
  • Addressing a common set of issues across Europe
  • Will have multi-year benefits for organisations that do digitisation
  • Will result in much richer and more value-added applications
  • Will benefit the citizens of Europe for many years to come
  • Could finish here... However, would like to talk about: my views on print, digitisation, OCR, apps and the future...
The British Library
  • Exists for everyone who wants to do research, for academic, personal, and commercial purposes.
  • Covers all subject areas: sciences, technology, medicine, arts, humanities, social sciences...
  • Receives a copy of every item published in the UK.
  • Holds over 150 million items, with 3 million items added each year.
  • Used by over 16,000 people each day (on site and online).
2020 Mission & Vision
  • Digitisation provides a long-lasting digital copy
  • Digital content can support advanced analysis
  • Digital is easier to access
  • Digital content has much greater reach
  • We can only accomplish these objectives with partners
Physical Collections
  • British Library has 150M items in its collection
  • Estimated number of pages: 5,000M
  • Therefore average number of pages per item = 33
  • CENL (Conference of European National Librarians) Survey 2006: 400M items in national libraries
  • Estimate: 13,200M pages (33 * 400M)
  • Lots to digitise!
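The arithmetic on this slide can be reproduced in a few lines. This is only a sketch of the back-of-envelope estimate: the constants are the figures quoted above, while the variable names and the choice to round the average down to 33 (as the slide does) are assumptions made for illustration.

```python
# Figures quoted on the slide ("M" = million).
BL_ITEMS = 150_000_000        # items held by the British Library
BL_PAGES = 5_000_000_000      # estimated pages in those items
CENL_ITEMS = 400_000_000      # items in national libraries (CENL Survey 2006)

# Average pages per item, rounded down as on the slide (33.3... -> 33).
avg_pages_per_item = BL_PAGES // BL_ITEMS

# Extrapolate the BL average to all national library holdings.
cenl_pages = avg_pages_per_item * CENL_ITEMS

print(f"Average pages per item: {avg_pages_per_item}")
print(f"Estimated pages in national libraries: {cenl_pages / 1_000_000:,.0f}M")
```

The extrapolation assumes the British Library's average item length is representative of other national libraries, which the slide takes as given.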
Digital not Digitalis
  • Born-Digital: normally contemporary material that we acquire in digital form (eJournals, eBooks, web sites, etc.).
  • Digitised: digital image of a physical collection item (newspapers, books, manuscripts, journals, audio, etc.).
  • Not... Digitalization: "The administration of digitalis (foxglove) or one of its active constituents to a patient or an animal so that the required physiological changes occur in the body; also, the state of the body resulting from this." (Oxford English Dictionary)
Digitisation - Create Images
  • BL has digitised 57M objects, around 1% of the physical collection
  • However, a partnership with Brightsolid is digitising the newspaper collection (fee service): up to an additional 40M pages
  • Google to digitise 250,000 books (80M pages)
  • Cost to digitise: initially much more than £1 per page, more recently less than £1 per page
  • For the entire BL collection, estimated storage required at 10 Mbytes per page is 50 petabytes (5 * 10^16 bytes)
  • CENL Survey 2006: 4.8M items; 2012 projection: 17M items (~4%)
  • [Diagram: Physical Item -> Digitisation -> Digitised Item]
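The storage and coverage figures on this slide follow from simple arithmetic, sketched below. The 5,000M page count comes from the Physical Collections slide and the 400M item total from the CENL survey figure quoted there; decimal units (1 Mbyte = 10^6 bytes, 1 petabyte = 10^15 bytes) are an assumption, chosen because they match the 5 * 10^16 figure on the slide. All names are illustrative.

```python
# Inputs quoted in the slides.
BL_PAGES = 5_000_000_000          # estimated pages in the BL collection
BYTES_PER_PAGE = 10 * 10**6       # 10 Mbytes per page (decimal units assumed)
CENL_ITEMS = 400_000_000          # items in national libraries (CENL 2006)
DIGITISED_2006 = 4_800_000        # digitised items, CENL Survey 2006
DIGITISED_2012 = 17_000_000       # digitised items, 2012 projection
GOOGLE_BOOKS = 250_000
GOOGLE_PAGES = 80_000_000

# Storage needed to hold an image of every page.
storage_bytes = BL_PAGES * BYTES_PER_PAGE
storage_pb = storage_bytes // 10**15

# Share of national library holdings digitised, as a percentage.
share_2006 = 100 * DIGITISED_2006 / CENL_ITEMS
share_2012 = 100 * DIGITISED_2012 / CENL_ITEMS

# Implied average length of a book in the Google programme.
pages_per_book = GOOGLE_PAGES // GOOGLE_BOOKS

print(f"Storage: {storage_bytes:.0e} bytes = {storage_pb} PB")
print(f"Digitised share: {share_2006:.2f}% (2006), {share_2012:.2f}% (2012)")
print(f"Google programme: about {pages_per_book} pages per book")
```

The 2012 share works out slightly above 4%, consistent with the "~4%" on the slide.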
OCR - Gateway to Advanced Digital Functionality
  • OCR works very well for modern collections, with high accuracy rates
  • However, some way to go for older material (Going Grey? Comparing the OCR Accuracy Levels of Bitonal and Greyscale Images, Tracy Powell & Gordon Paynter, NLNZ)
  • Vital for advanced digital functionality
  • IMPACT has made significant progress in this area
  • How good can it get? (Rose Holley, NLA): researchers require high accuracy to trust the text. Good: 98-99%. Poor: below 90%
  • [Diagram: Physical Item -> Digitisation -> Digitised Item -> Optical Character Recognition -> Digital Item (METS XML record)]
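To make the accuracy bands concrete, here is a minimal sketch of one common way to score OCR output: character accuracy derived from the edit distance between the OCR text and a hand-corrected ground truth. The slide does not say how accuracy was measured, so the metric, the function names and the label for the unnamed band between 90% and 98% are all assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(truth: str, ocr: str) -> float:
    """Character accuracy in [0, 1], relative to the ground-truth length."""
    if not truth:
        raise ValueError("ground truth must not be empty")
    return max(0.0, 1 - levenshtein(truth, ocr) / len(truth))


def rate(accuracy: float) -> str:
    """Map an accuracy to the bands quoted on the slide."""
    if accuracy >= 0.98:
        return "good"
    if accuracy < 0.90:
        return "poor"
    return "unrated"  # hypothetical label: the slide does not name this band


truth = "The quick brown fox"
ocr = "The qu1ck brovvn fox"   # typical confusions: i -> 1, w -> vv
acc = char_accuracy(truth, ocr)
print(f"{acc:.1%} -> {rate(acc)}")
```

Three character errors in a 19-character line are already enough to push it into the "poor" band, which gives a sense of how demanding the 98-99% target is for older material.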
Adding Value To Collection Items
  • Do we need all these applications? Are they value for money?
  • [Diagram: Physical Item -> Digitisation -> Digitised Item -> Optical Character Recognition -> Digital Item (METS XML record)]
  • Applications built on the digital item: Indexing; Basic Search & Discovery; Text Analysis; Text Mining; Image Comparison; Specialist Applications; Application Programming Interface (API); Social Networking
  • Collect & Store: Comments, Annotations, Additions
Commercial Break…..
Value of Digitisation
  • Splashes and Ripples: Synthesizing the Evidence on the Impact of Digital Resources, 2011 - Eric T. Meyer, Oxford Internet Institute
  • JISC-funded review of the value of digitisation projects
  • Examined 12 JISC-funded digitisation projects
  • Various types of benefits analysed:
  • Quantitative: Analytics, Income, Log Files, Scientometrics, Surveys, Webometrics
  • Qualitative: Content Analysis, Feedback, Focus Groups, Interviews, Referrer
Webometrics for 12 JISC-funded Digitisation Projects
  • [Charts: webometrics and monthly statistics for 12 JISC-funded digitisation projects]
  • Does this tell us whether we should do these projects?
Print vs Digital: Digital Wins!!!
Factor | Print | Digital | Winner
Durability | Good (some not so good, e.g. newspapers) but eventually destroyed through use | Requires specialist system to retain for ever, but possible | Tie
Look & Feel | Original item | Good simulations possible; also multi-layer digitisation; electronic comparisons provide additional utility | Tie
Search | Only catalogue | With good OCR, full text | Digital
Distribution | Slow, expensive & cumbersome | Fast, cheap, entire internet | Digital
Linking, text mining, social networking | Not possible | Potentially | Digital
Revenue | Very limited opportunities | Already have a number of revenue-generating apps | Digital
CENL Survey Digitised Items: Potential
  • Enormous potential for digitisation
  • "If we match the total physical holdings [of] national libraries against digital holdings (objects) of a library it becomes clear that content digitisation [is] still in its infancy and how enormous the potential for digitisation of content in National Libraries is."
My Vision
  • Cost reductions in storage technologies, mass digitisation processes and application development make it possible for the first time to imagine digitising the entire holdings of major libraries.
  • This creates the opportunity to allow all citizens to experience, enjoy, learn from and build on the world's knowledge.
Concluding Comments
  • Digitisation projects have created a fantastic resource for scholars, researchers and the public
  • European national libraries, including the British Library, will have digitised around 4% of their collections by 2012
  • Funding, standards, copyright, technology and interoperability will remain major issues
  • However, these programmes have the potential to radically improve access to collections across Europe and beyond
  • We will need to work together to unleash the potential of these resources
  • ... is a great example of this collaboration


IMPACT Final Conference - Richard Boulderstone


Editor's Notes

  • #4: Intro to British Library. Facts.