linked data – In the mailbox

“The end is nigh”: RiC(h) Description – part 2

The period for comment on the EGAD RiC-CM draft standard or model is coming to an end. Since I last posted, there has been a flurry of activity, with comments from at least two Society of American Archivists technical subcommittees (TS-DACS and TS-EAS being the ones I know of), Artefactual (the developers of the Access to Memory software), the Australian Society of Archivists, Chris Hurley and Ross Spencer.

Each has something of value to add, whether concerned with specifics or thinking about the broader implications for archival description in an online and connected world.

RiC-CM – some comments

I have been looking through the RiC-CM model or standard, as
proposed by the Expert Group on Archival Description.

The document, as set out, provides a good overview of the development
of archival description to date, particularly as regards the
principles of provenance and respect des fonds. The group notes the
need for broader involvement from a range of traditions and cultures.
It is disappointing, therefore, that within the current group and
document there is not a broader exposition of the implementation of
the ICA standards across the countries involved, such as DACS and RAD,
not to mention Australian Series registration. This is particularly
evident in relation to the focus in the document on relationships and
the discussion of the limitations of fonds-based and hierarchical
description.

Although the principal audience is identified as archivists, there
seems to be little thought as to how the standard would be
implemented. The one graphical model provided looks at the
relationships, but does not identify how the descriptions were created
that sustained that model. Similarly, though the ICA standards are
identified as being used as the basis for this new standard, there is
little evidence within the specific document that identifies how the
new descriptions relate to the old. It would be useful to have
references to the class numbers and standards within the descriptions.
The document notes that “RiC is complex and detailed”. Some advice or
suggestions as to minimum or mandated elements would be useful.

Some fairly broad statements are made about the implementation of the
current standards, but no evidence is provided in support of these
assertions. Some greater evidence of the research undertaken in
pursuit of the proposed standard would be useful. The same is also
true for discussions of other methods of making content web
accessible, such as the use of markup language, and the way in which
relational and other databases respond to or are aligned with the
current standards. There are now a number of software systems (e.g.
Access to Memory) and metadata standards (EAD, EAC, METS) that allow for
the development of relational, rather than purely hierarchical
description and linkages to other descriptive systems.

More specifically, I have the following comments:

Entity type – record. I think this relates to the idea of the smallest
descriptive entity (e.g. US document or UK item). Relating this back
to ISAD(G) and also to the Multilingual glossary –
http://www.ica.org/en/online-resource-centre/multilingual-archival-terminology
would confirm this.
I like the idea of a record component, but would note that the example
of two digital photographs might also be two record entities. The
concept of a ‘complex record’ (US file/folder, UK piece, Australian
item?) needs to be further explored.

Record set – is too generic. This could be a ‘complex record’, or
fonds or series level description?

Function and Function (abstract) – too confusing.

The range of relationship types is overwhelming. It would be better to
look at the OWL ontology, if that is the model used, and work from the
relationships already described in the current ICA standards.
Similarly, look at the schema.org definitions (the definition of Thing
will cause problems when trying to make linkages with it). It may be
better to provide examples showing how ICA standard descriptions may
be integrated into a range of models using clever software and
analysis. The PROVisualiser demonstrates how such a system might work
– http://conaltuohy.com/blog/visualizing-government-archives-through-linked-data/

While it is commendable that the ICA and EGAD are looking at ways of
sharing and connecting archival metadata, I think that they have
become too focused on one metadata model, possibly to the detriment of others.

RiC[h]-CM description, relationships and standards

A few months ago, at the ICA2016 conference in Seoul, the Expert Group on Archival Description (EGAD) released their first draft model (or standard) for a new relationally enhanced mode for archival description. There’s an email list for comments so I thought I would start there…

(<ica-egad-ric.lists.village.Virginia.EDU>)

Jenny Bunn from the Department of Information Studies, University College London, kicked off by asking what format was preferred for responses, as she and some colleagues were getting together to work through the standard:

“Primary Entities
1. Do you agree with the membership of the list? Should anything else be included as a primary entity? Should anything be taken off this list?
2. Do you have any specific comments on any of the entities in particular, e.g. changes to wording, additional examples, confusion about usage?

Properties
1. Do you agree with the lists of properties for each entity? Should anything be added/taken away?
2. Do you have any specific comments on any of the properties in particular, e.g. changes to wording, additional examples, confusion about usage?

Relations
1. Do you agree with the lists of relations? Can you suggest further relations?
2. How should these relations be presented? What information do you need/would you like about each relation?

General comments
1. Anything else you want to say.”

Daniel Pitti, who appears to have been the driving force, agreed to that format, suggesting that general comments come first.

Australia’s Chris Hurley immediately picked up on the relationships, noting that he had identified “792 relationships and still counting.” He then suggested that there needed to be an understanding of the different relationship types, and also a glossary. Chris provided some examples of relationship categories, but I think it would be useful to go back to the original standards and work from there.

RiC is based on the four standards produced by the ICA – ISAD(G) for archival descriptions (fonds, series, items, etc), ISAAR(CPF) for archival authorities (organisations, families and individuals), ISDF for the functions which are the reason for records to be created and ISDIAH, which describes the archival institutions and collecting organisations. Of these, only ISAAR(CPF) has relationships included in it, which are hierarchical (which organisation controls or owns which), temporal (which organisation preceded which), associative and related, which is mostly used for families and individuals. The different relationships are described in a follow-on field. The Australian series system recognises relationships within and between archival descriptions, authorities and functions, and identifies that they may be reciprocal. In amending Access to Memory software for use with series registration, my colleagues and I at the State Records Office of Western Australia worked with the relationships and created subsets within the temporal and hierarchical relationships – controlled and controlling, subordinate and superior; succeeding and preceding. Relationships among individuals were not well defined, but in a private archive or manuscript library scenario I can see how these too may be developed. There are also relationships such as custodian, creating and transferring, which describe the relationships between authorities and descriptions.
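The reciprocal pairs mentioned above can be sketched as a small lookup table. This is only an illustration in Python, not the Access to Memory data model: the names RECIPROCALS, Relationship and reciprocal are hypothetical, as are the agency names in the usage note, and the pairings simply follow the subsets listed in the paragraph.

```python
from dataclasses import dataclass

# Hypothetical reciprocal pairs, following the subsets named in the post.
RECIPROCALS = {
    "controlling": "controlled",
    "controlled": "controlling",
    "superior": "subordinate",
    "subordinate": "superior",
    "preceding": "succeeding",
    "succeeding": "preceding",
}


@dataclass(frozen=True)
class Relationship:
    """A typed link read as: source is <kind> in relation to target."""
    source: str
    kind: str
    target: str


def reciprocal(rel: Relationship) -> Relationship:
    """Return the same link as seen from the other entity."""
    return Relationship(rel.target, RECIPROCALS[rel.kind], rel.source)
```

For example, reciprocal(Relationship("Agency A", "preceding", "Agency B")) gives the link from Agency B back to Agency A with the kind "succeeding". Storing one direction and deriving the other is one way of keeping the two sides of a relationship from drifting apart.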

George Charonitis (Georgia State Archives) concurred with Chris with respect to identifying relationship types and also advocated for some more definitions, particularly with respect to context/s, provenance, creation, accumulation and selection. Chris’s next post looked at and suggested some common properties that could be used across all description types – identifier, dates and relationships, as well as looking at and reminding us of the relationships used in series registration, between deed, doer and document.

John Machin, also from Australia, picked up on the next part of the RiC process – the creation of ontologies, asking whether any existing ontologies would be used and how closely they would be followed. Florence Clavaud, from the EGAD group, responded that RiC-O (for ontologies) would probably be unique, and that they would then work on linkages and alignments.

Professor George Bak, from University of Manitoba, also commented on the new standard, pointing out that the introduction is very Eurocentric (a point also made by the InterPARES Trust, of which slightly more below) and asking whether much thought had been given to indigenous perspectives, and to the perspective of social memory. He queried how much of the standard had been aligned with current data visualisation practice, and looked at the scholarship in this area. He then followed up with a summary of a discussion held by some Winnipeg-based archivists, looking at digital systems and raising the question of definitions and understandings of the way in which information is created and understood, by pointing to the OAIS model for representation information, information objects and so on (he also writes beautifully, so it’s worth reading his posts just for the language).

Finally, the InterPARES Trust have released a bit of a broadside, however politely phrased, against EGAD online, pointing to the lack of communication over the past few years (the RiC project was instigated in 2012). One of their criticisms, which I agree with, is that although RiC is based on the four current standards produced by ICA, there is no evidence of a review of those standards, or how they have been implemented by different archival cultures. Like Bak and Machin, they are concerned that there is no higher ontological model or ‘anchor’ on which the new standard is based. They also suggest that looking at current relational database models, rather than focusing on data visualisation, may be of more use to both users of archives and those describing them for use. Indeed, the lack of user representation or awareness of the new model is also an issue.

Should you wish to review RiC-CM or add your voices to the mix, you have until the end of December to do so. For Australian archivists, the ASA is looking at presenting a combined response, so please do contact them.

STOP PRESS – deadline for comment now extended to 31 January, 2017

Archival description and discovery layers

Some years ago, Campbell’s Soups ran a campaign about their thick and rich soup range, one of which included Australia’s own Rose Porteous (I can’t find the link, perhaps someone cottoned on?). Anyway, I always think of that and, more academically, of Clifford Geertz’s ‘thick description’ when I think about the ways in which archives can describe their holdings. It’s not always the case, of course. Sometimes, time and pressure mean that holdings and archival authorities are described in minimalistic terms, but the potential for rich and thick description still exists, especially when contextual relationships between creators, functions and records are fully developed. It’s this that sets us apart from library description, and why archives generally don’t use the library MARC (MAchine-Readable Cataloging) formats, even though there is a special set for archives (MARC-AMC). Libraries describe the individual elements of the soup – the pea, the bean, the meaty chunk, the liquid – on their own merit. The author statement can bring these elements together but doesn’t give a sense of how they interact. Archives describe the soup, and then the elements.

Given this difference, it’s been interesting to see how different archives have been included in broader, generally library-based discovery layers. Our own TROVE is one such instance, and I’ve previously flagged how both the ANU Archives and PROV have added content to TROVE in my #GovHack posts. I’ve not seen much about what compromises had to be made, so I was very interested when the Digital Repository of Ireland brought out its guide to including archival description a few months ago. The Digital Public Library of America has recently released a white paper for similar content. Both the DRI guidelines and the DPLA white paper use EAD (Encoded Archival Description) as the major tool for exploring and exporting information. Both work within a fonds-based hierarchical descriptive framework, and focus on the archival object or levels of description. The links made to archival authority and to function (Chris Hurley’s doer and deed) are minimal at best.

The DRI guideline is, by its nature, prescriptive. If you are looking for a good description of the elements within EAD and how they can be matched to standard elements in descriptive practice, then this is a good place to start. The descriptions of each required and recommended element are clear, and provide some food for thought in Australian practice with regard to name, place and subject indexing of archival holdings. I think it would be relatively easy to implement the recommendations for a TROVE-like discovery system (although we have yet to investigate why or whether we want one, and what we would expect to get out of it).

The DPLA white paper is, also by its nature, more complex, looking at comparative descriptive practices, meditating on the differences between library and archival description, and aggregated (fonds, collection, series, even Australian item level) description. It focuses, however, on individual digital objects, either a product of digitisation or natively created in the digital environment, such as pages of books or individual photographs in an album. The working group looked at description both at a higher aggregated level (using the term ‘collection’) and for individual objects. Again, a number of examples are given for both, and some recommendations come from that. The working group is to be commended for the way in which they have approached the task at hand. Like the DRI guidelines, the white paper raises some important questions for Australian archivists looking at either a federated system, as proposed by Chris Hurley and others at the recent ASA 2016 conference, or in support of further work with TROVE.

Reflecting on #govHack

A fortnight ago, I gave up a little bit of time to see if I could engage hackers in using cultural heritage data, either to enhance a project or to be the basis for one.

This year’s #govHackWA was held in a new space, and included a link to a regional centre, Geraldton. After four years, it has become far more slick and professional, which was needed with the large number of entrants, but meant that some of the more social components of the weekend had gone by the wayside (the introduction and welcome from the central committee sounded more like phoning a government organisation with a long phone menu than the somewhat quirky presentation by @pia_waugh of earlier years). We shared information via Slack, an internet relay chat system with pretensions of grandeur, and the data sets needed to be on the various data portals a week ahead of the competition (rather than on a thumb drive or hard drive brought in at the last minute).

The Slack channels worked well, enabling information, advice and requests to be shared with a large or small group as required. I have some concerns about these sorts of channels for more formal communication, particularly from a government recordkeeping perspective, but it was an effective tool for a specific project. There was a specific channel for project ideas, so I was able to suggest a few things, one of which, I think, was incorporated into the ihero project, about facial recognition of WWI photographs.

The data portals are clearly identified on the various government websites, with a link to each state from the Commonwealth portal, which shows how data can be connected across jurisdictions. However, I found the quality of the datasets to be variable, and I do wonder how many of them will last or remain useful, either because of the specificity of the data collected, or the format in which the data is presented (but this is a discussion for another day). Nevertheless, by searching keywords in the data portals I was able to identify a range of useful data sets, and also links to databases, which provide more complex data. I collated some datasets and sources and also printed off my previous post on some #govHack tools.

I was able to help two groups with identifying data and suggesting some ways of working with the data that they had – colourfulpast and ihero. I had more involvement with the colourfulpast team, because they had worked with cultural data in the past and they included a colleague from the State Library of WA, but it was great to see how both projects evolved over the course of the weekend. I was able to promote both projects via Twitter and on relevant Facebook groups after the event, so that the target audience could identify and work with the projects and, hopefully, provide feedback and vote!

That said, there are some things that I would do differently next time. The WA Fisheries Department were there all weekend, with just one dataset – their shark data. Their ability to work with multiple groups and to provide both data and technical expertise meant that three groups elected to work primarily with their data. Had I been more switched on, I could have had a look at the WA Museum and SRO trial discovery layer which Andrew brought to the weekend and identified additional shark data. Similarly, working with Trove to develop some complementary data might also have been useful for them. The teams are time poor, so helping by providing some easily used and pre-collated data is worth considering. And, I would work to have some specific datasets identified in the portal, which I was really familiar with.

Next year, I hope to return to GovHack with a fully working SROWA catalogue and some datasets derived from the collection. I’ll also have a look at the other data provided by cultural organisations, and work on identifying projects and problems with them. Having specific datasets and clearly identified projects is of benefit to both the organisations and the hackers.

Getting ready for #govHack 2 – tools, other data sources and examples

In this post, I’m going to point to some of the tools that I know from digital humanities and the like. They are mostly used in the cultural sphere, but that is not to say that they aren’t useful for exposing and manipulating other sorts of data. I’ll also try and provide some examples of the way data has been used for some simple and not so simple projects. GovHack is all about getting something up and running in 24 hours so, like a thesis, the parameters of time, space and subject need to be clearly defined. However, also like a thesis, the project should show some potential for further work, research and avenues for publication.

I’ve already provided a link to the TROVE API, and to some of the blogs that discuss using it. The API has been acknowledged as a source of inspiration for the Europeana and Digital Public Library of America (DPLA) APIs, too (a good way of incorporating some international data): http://help.nla.gov.au/trove/building-with-trove/api; http://labs.europeana.eu/api; https://dp.la/info/developers/codex/. Library cataloguing data, including Australian libraries, can be found on WorldCat – https://www.worldcat.org/affiliate/tools?atype=wcapi, while archival and manuscript collections can be found via ArchiveGrid – http://beta.worldcat.org/ArchiveGrid.

Libraries and some archives use a format called MARC (MAchine-Readable Cataloging) to describe resources. It’s a standard developed by the Library of Congress, and about halfway down their MARC documentation page, you’ll find a list of crosswalks and mappings to other formats including Dublin Core (developed by OCLC, the people who run WorldCat) and geospatial data – http://www.loc.gov/marc/marcdocz.html
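To give a feel for what a crosswalk does, here is a minimal Python sketch that maps a handful of MARC tags to Dublin Core elements. The five tag pairings are common ones from the Library of Congress mapping, but the selection is illustrative and far from complete, the record is flattened to one value per tag (real MARC has indicators, subfields and repeatable fields), and the names MARC_TO_DC and crosswalk are my own.

```python
# A few illustrative rows from a MARC to Dublin Core crosswalk.
# Real crosswalks work at subfield level; this sketch does not.
MARC_TO_DC = {
    "100": "creator",      # main entry, personal name
    "245": "title",        # title statement
    "260": "publisher",    # publication details
    "520": "description",  # summary note
    "650": "subject",      # topical subject heading
}


def crosswalk(marc_record: dict[str, str]) -> dict[str, list[str]]:
    """Convert a flattened {tag: value} record into Dublin Core elements."""
    dc: dict[str, list[str]] = {}
    for tag, value in marc_record.items():
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue  # tags outside the sketch are skipped, not guessed
        dc.setdefault(element, []).append(value)
    return dc
```

Calling crosswalk({"245": "Example title", "999": "local note"}) keeps the title and drops the unmapped local field, which is also a reminder that every crosswalk loses something along the way.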

Other archives use Encoded Archival Description (EAD) and Encoded Archival Context (EAC) to create and share descriptions. Although these standards were developed independently, the Library of Congress also maintains documentation to support them, and again has some crosswalks – http://www.loc.gov/ead/ag/agappb.html. EAC is used by the SNAC Project and the eScholarship Research Centre at University of Melbourne (which is a data provider for ANDS) to create connections between organisations and individuals – http://socialarchive.iath.virginia.edu/; http://www.esrc.unimelb.edu.au/about-us/informatics-lab/

Beyond the world of library and archives description (and you just wanted some simple headers to capture data, right?), there is Zotero, open source citation software developed by the Roy Rosenzweig Centre for History and New Media (CHNM) – https://www.zotero.org/ . Zotero comes with some nice tools, including a simple timeline, and is also something I’d like to play with to open up referencing from archival sources. The CHNM spends a lot of time creating neat tools for historians and cultural curation so they also have Omeka, an online exhibition tool, and Scripto for transcription purposes – http://chnm.gmu.edu/research-and-tools/.

You can also use the open source project Blacklight (including Spotlight) to play with library described data – http://projectblacklight.org/; http://www.rubydoc.info/gems/blacklight-spotlight/0.19.1. (Turns out Blacklight, Spotlight and other delights are the work of Stanford University Libraries – https://library.stanford.edu/blogs/topic/blacklight).

There are some good tutorials on Zotero and other tools on the Programming Historian site – http://programminghistorian.org/lessons/

The ever fabulous and creative Tim Sherratt has a whole host of tools, and examples of how to use them, on his wraggelabs site. The focus is on TROVE and the National Archives of Australia – http://wraggelabs.com/emporium/: e.g. http://troveconsole.herokuapp.com/ and http://faceapi.herokuapp.com/

Finally, I’d like to point to some interesting uses of cultural data, both as part of govHack and more generally.

Not open source, but fun, there’s HistoryPin and NowandThen https://www.historypin.org/en/ and http://nowandthen.net.au/Main_Page. Pixstory, from the 2013 Govhack, explored some of these ideas – https://www.youtube.com/watch?v=pUDxyrOhVQs

As part of the WW1 centenary project, the RSL teamed up with a local TAFE to create a virtual ‘Digger’ app – http://rslwahq.org.au/News/Well-done-Tom.aspx

Last year, at least two projects used cultural data for govhack – http://2015.hackerspace.govhack.org/content/citizen-culture-heritage-lest-we-forget

http://2015.hackerspace.govhack.org/content/exploring-indigenous-culture

And, there are all those geospatial projects, e.g. https://www.gaiaresources.com.au/sro-archive-maps/

Getting ready for #govHack – cultural datasets

Next week, the largest hackathon in the world, GovHack, takes place in Australia and New Zealand. There are govhack sites at universities and regional centres, and in all the major centres. Each site has participants, who make the things, and mentors who provide advice and guidance on tools and datasets. There’s even a specific cultural hack node in Canberra, run by Tim Sherratt.

This year, I’ve signed up as a data mentor for WA, which has a central city site and a regional node at Geraldton. This will be my third year as a data mentor, and my first year as a general mentor, talking cultural data generally (mostly archival, of course) rather than representing the SROWA. It’s a lot more organised than I anticipated, and people are already asking for more information to help them prepare. To this end, I’m going to use this post to talk about some datasets. Participants need to use at least one official data set, but can then also look for other data that they can mash together or reuse. This way, I can print off the page as a guide, and provide a link to it for #govhackwa participants. I’ll do an additional post or two on some tools for analysing them and provide some examples of how data has been used.

Official cultural datasets

These datasets are taken from the various government data portals.

Searching the data.gov.au portal reveals 144 datasets for the keyword ‘library’, 117 for ‘archive’, 52 for ‘museum’ and 129 for ‘cultural’. The latter includes some GIS datasets, including the “coarse cultural topographical data”, showing where major population centres are, and the CPI. In addition to collection links and collection subsets, State and National Libraries have contributed statistical datasets relating to location of libraries, user statistics and so on.
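Keyword counts like these can also be pulled programmatically. The sketch below assumes the portal exposes the standard CKAN action API (package_search is a real CKAN action, but the base URL shown here is an assumption and should be checked against the portal’s own documentation). It only builds the request URL and reads the count out of a response body, so it can be tried without a network connection; the function names are my own.

```python
import json
from urllib.parse import urlencode

# Assumed base URL for the portal's CKAN action API; verify before use.
BASE = "https://data.gov.au/api/3/action/package_search"


def search_url(keyword: str, rows: int = 0) -> str:
    """Build a package_search URL; rows=0 asks for the count only."""
    return f"{BASE}?{urlencode({'q': keyword, 'rows': rows})}"


def dataset_count(response_text: str) -> int:
    """Read the number of matching datasets from a CKAN JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful search")
    return payload["result"]["count"]
```

Fetching search_url("library") with any HTTP client and passing the body to dataset_count should give a figure comparable to those above, although the numbers will have moved since this post was written.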

My top picks, outside of TROVE (from the National Library) and ANDS (Australian National Data Service) are –

The National Portrait Gallery – https://data.gov.au/dataset/portraits-and-people

The Antarctic artefacts bibliography – https://data.gov.au/dataset/aad-aa-bibliography and Commonwealth Bay artefacts survey data – https://data.gov.au/dataset/aad-cden-artefacts-gis

Indigenous protected areas – https://data.gov.au/dataset/indigenous-protected-areas-ipa-declared

The National Archives of Australia – “Memory of a nation” – digitised content from online exhibitions – https://data.gov.au/dataset/memory-of-a-nation-data and the Commonwealth Agencies dataset, which provides a comprehensive set of federal government departments, ministries, offices and so on. Because of the way archives link data, some state and local government agencies are also included. This dataset was last updated in April, 2016 – https://data.gov.au/dataset/commonwealth-agencies.

The State Records Office of New South Wales has a number of indexes available as csv files in the NSW data portal – including convicts, soldier settlement indexes and wills and probate, not to mention their Flickr dataset. SRNSW collection information can also be searched via their online catalogue. Queensland State Archives has 55 datasets in the data.qld.gov.au portal – https://data.qld.gov.au/dataset?q=archive&tags=Queensland+State+Archives&groups=historical. State Records South Australia has 5 datasets – http://data.sa.gov.au/data/organization/state-records.

The Powerhouse Museum API – http://data.nsw.gov.au/data/dataset/bf9df234-7890-4907-94f6-e7872c8f4258

Other museum datasets include the gorgeous Scott Sisters collection from the Australian Museum, itself the subject of a remix competition in 2013/2014 – http://data.nsw.gov.au/data/dataset/4e57d134-79e9-42ad-a0a9-83fc91e1091c

There’s a plethora of WW1 related datasets – searching for ‘World War’ returns 24 datasets, of which only two are not clearly related, and the majority of which are from State Libraries.

It’s worth remembering that data in TROVE is harvested from all public libraries, and includes data from museums and archives. The content can be filtered via the TROVE API. The Public Record Office Victoria and the Australian National University and Noel Butlin Archives have both contributed data to TROVE. The State Library of Queensland not only has data in TROVE, but also contributed over 50,000 photographs to Wikipedia.

http://help.nla.gov.au/trove/building-with-trove/api

TROVE has some useful examples and help sheets – http://help.nla.gov.au/trove/building-with-trove/examples
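As a rough idea of what filtering through the API looks like, here is a small Python sketch that builds a search URL. The endpoint and the parameter names (key, zone, q, encoding) are assumptions based on the version 2 API and may not match the version currently on offer, so check them against the Trove documentation linked above. The function name and the placeholder key are hypothetical, and you will need to register for a real key.

```python
from urllib.parse import urlencode

# Assumed endpoint; the API has been through several versions.
TROVE_ENDPOINT = "https://api.trove.nla.gov.au/v2/result"


def trove_search_url(api_key: str, query: str, zone: str = "picture") -> str:
    """Build a search URL restricted to one zone (e.g. picture, newspaper)."""
    params = {
        "key": api_key,      # personal API key issued by the National Library
        "zone": zone,        # which part of Trove to search
        "q": query,          # free-text search terms
        "encoding": "json",  # ask for JSON rather than XML
    }
    return f"{TROVE_ENDPOINT}?{urlencode(params)}"
```

Something like trove_search_url("YOUR_KEY", "shark attack", zone="newspaper") would then give a URL to fetch with any HTTP client. Keeping the URL building separate from the fetching makes it easy to check a query before spending any of a 24 hour hack on debugging it.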

The Australian National Data Service is similarly rich and complex. Again, the Public Record Office Victoria (PROV) has contributed data to ANDS, along with the State Records Office of NSW. The PROV’s semantic wiki is available as an XML-formatted download – https://www.data.vic.gov.au/data/dataset/public-record-office-victoria-semantic-wiki.

Postscript – I’ve just been advised that the Curtin Library has made weather observation data from Jon Sanders’ 1986 – 1988 circumnavigation available through ANDS – https://researchdata.ands.org.au/search/#!/rows=15/sort=list_title_sort%20asc/class=collection/q=jon%20sanders/p=1/group=Curtin%20University/. There’s also a nice blog – http://triplesolo.library.curtin.edu.au/ – and you can follow along on Twitter #triplesolo #noonsummary.

Weather aficionados may also be interested in the digitised daily observations from colonial Perth, now in the NAA collection – http://recordsearch.naa.gov.au/SearchNRetrieve/Interface/ListingReports/ItemsListing.aspx?series=PP430/1

Finally, in the WA datasets, you will find a range of historical maps and plans, taken from the State Records Office digitised collection – each map links to the series at the top, but there are some older links to the previous catalogue. For better searching and exporting of data, it’s best to go straight to the new catalogue – https://archive.sro.wa.gov.au/

WA theme parks – taken from the Landgate ‘locate the 80s’ site – http://catalogue.beta.data.wa.gov.au/dataset/wa-theme-parks

State Heritage Office datasets – http://catalogue.beta.data.wa.gov.au/organization/state-heritage-office

WA Maritime Archaeology datasets, provided by the WA Museum – http://catalogue.beta.data.wa.gov.au/organization/western-australian-museum

In which there are too many hashtags, again!

Barely had the American Library Association (#alaac2016) conference finished, when I became aware of a groundswell of European conferences and workshops.

The first to pop up was #DAMEU for, obviously, Digital Asset Management. This conference is of interest because of its focus both on how to manage current digital content (whether or not it is a copy of an analogue record) and on long term management and preservation. Formats, platforms, repositories – all the buzzwords are there.

The second is #LIBER16, taking place in Helsinki, for European research libraries. There’s quite a bit of overlap with the #DAMEU conference, but unlike the ones in Hobart last week, I don’t think participants could run from one to the other. Once again, open access, data repositories, and the management and maintenance of data are being discussed.

And finally, for today at least, is #eu2016nl (#eunl2016), a conference held in the final week of the Dutch Presidency of the EU, and focusing on the ‘digitalisation’ of cultural heritage (their words and spelling, not mine). This looks at digital platforms, but also at digitisation programs and linked content through the mighty Europeana. The best quote so far is that it is time to focus on quality not quantity.

Serendipity or design? #dha2016

Over in Tasmania, at the Digital Humanities conference today, there was a panel discussion on GLAM and humanities research and access to collections. @mikejonesmelb and others tweeted some of the content, and I’d love to see some of the papers and presentations.

The focus was, of course, the relationship between GLAM bodies and academia, with some suggestions for collaboration, such as the McCoy Project between University of Melbourne and Victoria Collections, and having LIS students help with digital humanities projects. It was noted that libraries and archives are not generally recognised as research institutions (although with the changes to ARC funding a few years ago, I think the larger institutions can now partner with academics?), and that funding is not generally available for research within collections as part of the institutions’ roles.

Digitisation was also discussed with mixed feelings. It’s one way of providing data, but as Janet Carding, one of the panellists, said, “the role for GLAM institutions isn’t to shut themselves in a room with a flatbed scanner for the next 20 years …”. It was also suggested that APIs for collections need to be made more open and accessible for users. I think there may be some more general discussion that needs to occur vis-à-vis collections data and the ‘ordinary punter’, as one of the panellists put it. The discussion appeared to range over the ways in which libraries and archives make information available about their collections (which is their raison d’être) while galleries and museums have been much slower to enable access to collection databases. There are also the dichotomies between science and cultural heritage collections to be considered.

Mike Jones then spoke about context and connections, suggesting a web of knowledge lies within archival descriptions, and considering ways in which meaning can be layered over time. Deb Verhoeven followed up with a discussion of HuNI and serendipity, for which she later provided a link to a three-minute summary. Aimed at academic researchers, it still leaves lots to think about with regard to the ways in which we make connections across collections for all researchers.

Thinking about convergence

This semester, I was the co-ordinator for a unit called “Convergence and cultural institutions”. It was a little ironic, as I have been one of the few voices in the department that routinely challenges the idea that convergence is a) happening and b) inevitable. I was (and to a significant degree remain) a convergence sceptic.

Part of this is, I think, about the way in which convergence is defined. In the digital media world, where convergence theory had its start, convergence is about shared or single modes of delivery of content. However, convergence in the library and information sector now seems to include shared resources, single points of physical access and so on.

A recent research article, Passion trumps pay, highlights some of these concerns. In this article, the researchers focus on the role of the information professional in the GLAM sector. This seems to me to be the start of confusion. Information professionals such as archivists, records managers and librarians may be found in a range of institutions and organisations, often as small specialist sectors, as identified by Vanessa Finney in her presentation in Canberra in 2013. Similarly, as any school archivist will tell you, archives or special collections staff may also find themselves in charge of a small collection of realia or artwork, which may or may not be managed in accordance with museum principles. Does the inclusion of these staff within a GLAM institution constitute convergence (or some degree of the same)? Or is it just that we work in a range of organisations, some of which are also tagged as ‘memory’ or cultural institutions?

Interestingly, the museum participants identified that their co-workers – science researchers and some curators (for which read art or history curators?) – lacked the information literacy and information management skills of the information professionals. The study has suggested that this might need to be addressed in undergraduate degrees, but I would rather suggest that this is why there are information professionals in those organisations in the first place.

Archivists and gallery staff apparently disputed whether or not they were information professionals in accordance with the definition used in the study:

an individual working in a library, archive, museum, cultural heritage or information environment whose aim is to maintain, and often improve, access to the ever growing amount of information generated from within the culture and heritage industry, the media, and, increasingly, by the general public.

(Terras, 2009)

According to the authors of the study, this is because archivists identified that

archives until now have not been driven by access (the principle theme of the Terras (2009) definition) but rather by their legislated requirements (in terms of the records initially kept) and the need to preserve the material that they manage. Although they conceded that the archive is moving towards a more access-focussed model, they see their role as more specialised, and in some cases more crucial, as archivists often manage the only copies of specific information that exists.

I’m sorry, but what? Archivists are not, or have not been, driven by access?

“His Creed, The Sanctity of Evidence; his Task, the Conservation of every scrap of Evidence attaching to the Documents committed to his charge; his Aim, to provide, without prejudice or afterthought, for all who wish to know the Means of Knowledge” (Jenkinson 2003:258).

I think this highlights that we may be talking at cross purposes, as separate textual communities, where we share common terminology but have different understandings of what is meant. Until we resolve these textual problems, convergence will be some way off.

Finally, there was some discussion on the role of education in a ‘converged’ environment. Librarians and museologists agreed that this was something that could be addressed, but the archivists again disagreed, identifying that there are already enough pressures in the standard archives course, leading to minimal knowledge in core areas. Speaking as an educator of both librarians and archivists, in a combined BA degree and combined Masters program, I would agree. In fact, I would suggest that the archives and records curriculum is somewhat truncated, when compared with that for librarians.

I’m very keen on looking at ways we can work together, through collaboration and linked data, but the idea that we will somehow become a single profession, working across ‘memory’ or knowledge institutions, seems unlikely.

On one thing, though, we do agree: the study’s authors, the participants and I. Our professions are about passion. How we teach that, or maintain it, is far more challenging.