By Alice Meadows, Co-founder, MoreBrains Cooperative
@alicemeadows.bsky.social, www.linkedin.com/in/alice-meadows
Take Home Points:
Persistent identifiers (PIDs), such as Digital Object Identifiers (DOIs) and ORCID IDs, are foundational to the research infrastructure on which we all depend. They are integrated into many of the systems used by researchers (as authors, readers, and reviewers), as well as by funders, institutions, libraries, publishers, and others.
Open PIDs (including DOIs and ORCID IDs) – those with openly available metadata that can be readily accessed by both humans and machines – are especially powerful. They level the playing field and allow everyone to benefit from a shared open infrastructure that supports both research and researchers.
The value of proprietary PIDs, like Elsevier’s Scopus Author ID and Web of Science’s ResearcherID, is mostly to the companies that create and manage them, although they very often include open PIDs in their metadata.
Publishers (and journals) can play a vital role in maximizing the power of PIDs by integrating them in systems, encouraging (or, where appropriate, requiring) their use, and educating authors about the value – to them and to us all – of providing good metadata.
What is a PID?
At their most basic, PIDs are unique and long-lasting references to an entity (a person, place, or thing) – typically a digital entity. Their primary use case is disambiguation; they ensure that people, places, or things with the same or similar names are uniquely identified. That may not sound very exciting but, together with their associated metadata, PIDs do much more than just ensure reliable identification and, therefore, attribution and recognition. They also enable discovery, leading to increased usage; they support connections between different entities, which facilitates collaboration; and they increase transparency by including information about provenance (who did what, when, and how). All of these attributes help to build trust in research and the research ecosystem.
And there’s more! PIDs have the most value as a network – their value grows as the number of participants and users grows. When integrated and adopted at scale, PIDs increase efficiency, saving time and money, by allowing reliable information to flow automatically between different research systems. For example, work that we’ve done at MoreBrains showed that the automated transfer of information between systems, made possible by implementing five priority PIDs (for researchers, outputs, grants, organizations, and projects) and achieving 80% adoption of them, could save the UK 55,000 person days a year, equivalent to £19M (approximately $25.5M) annually. Time that researchers currently spend rekeying the same information multiple times could then instead be spent doing more research. PIDs also improve the speed and accuracy of reporting, benefiting everyone in the research process, from funders to publishers and, not least, researchers themselves. Importantly, they are an essential element of open research: content literally can’t be FAIR (Findable, Accessible, Interoperable, Reusable) without PIDs.
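To put the savings estimate in more tangible terms, here is a quick back-of-the-envelope sketch in Python. It uses only the three figures quoted above; the per-day cost and the exchange rate it derives are implied by those figures, not numbers taken from the MoreBrains analysis itself, and the function names are purely illustrative.

```python
# Figures quoted in the text above.
PERSON_DAYS_SAVED = 55_000   # person days per year
SAVINGS_GBP = 19_000_000     # pounds per year
SAVINGS_USD = 25_500_000     # approximate dollar equivalent per year


def implied_cost_per_person_day(total_gbp: float, person_days: float) -> float:
    """Average cost of one person day implied by the headline figures."""
    return total_gbp / person_days


def implied_exchange_rate(total_usd: float, total_gbp: float) -> float:
    """Dollars per pound implied by the two quoted totals."""
    return total_usd / total_gbp


cost_per_day = implied_cost_per_person_day(SAVINGS_GBP, PERSON_DAYS_SAVED)
usd_per_gbp = implied_exchange_rate(SAVINGS_USD, SAVINGS_GBP)

print(f"Implied cost per person day: £{cost_per_day:,.2f}")
print(f"Implied exchange rate: ${usd_per_gbp:.3f} per £1")
```

In other words, the estimate values a person day at roughly £345 and converts pounds to dollars at about 1.34.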
The magic of metadata
A few years ago, I wrote a Scholarly Kitchen post entitled “How Better Metadata Could Help Save the World”. Admittedly, it was a bit of a tongue-in-cheek title but, actually, the metadata (additional information) associated with PIDs is what makes them so powerful. On their own, PIDs are like geographical coordinates; they uniquely identify something (in the case of coordinates, an exact location) but without the context you need for that information to be useful. (Would it mean anything to you that the town I live in is latitude 42.332218, longitude -71.121483? I think not!) However, in combination with their metadata, PIDs provide that additional context. They support the fast, reliable, and transparent sharing of a wide range of invaluable data such as publisher, publication title, licensing information, authorship, and much, much more – including links to other PIDs. Critically, metadata typically includes provenance information: who added or edited the data, when, and how. This type of transparency helps to improve trust in the reliability of both PIDs and metadata.
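The coordinates analogy can be made concrete with a short Python sketch: a bare DOI on one side, and the context pulled from its metadata on the other. It assumes the `https://doi.org/` resolver and the public Crossref REST API route `https://api.crossref.org/works/{doi}`; the helper names are hypothetical, and the sample record is hand-written for illustration (it only mimics the shape of a Crossref-style record and is not real data). No network call is made.

```python
from urllib.parse import quote


def normalize_doi(raw: str) -> str:
    """Strip common prefixes and lowercase (DOIs are case-insensitive)."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.lower()


def resolver_url(doi: str) -> str:
    """Link a human would follow to reach the identified object."""
    return "https://doi.org/" + quote(normalize_doi(doi), safe="/")


def crossref_api_url(doi: str) -> str:
    """URL a machine would query for the DOI's metadata (assumed route)."""
    return "https://api.crossref.org/works/" + quote(normalize_doi(doi), safe="/")


def summarize(record: dict) -> dict:
    """Pull out the kinds of context the text mentions from a metadata record."""
    titles = record.get("title") or [""]
    return {
        "title": titles[0],
        "publisher": record.get("publisher", ""),
        "licenses": [lic["URL"] for lic in record.get("license", []) if "URL" in lic],
        "author_orcids": [a["ORCID"] for a in record.get("author", []) if "ORCID" in a],
    }


# Hand-written, made-up record for illustration only.
sample_record = {
    "DOI": "10.5555/12345678",
    "title": ["An Illustrative Article"],
    "publisher": "Example Press",
    "license": [{"URL": "https://creativecommons.org/licenses/by/4.0/"}],
    "author": [
        {"given": "Josiah", "family": "Carberry",
         "ORCID": "https://orcid.org/0000-0002-1825-0097"},
        {"given": "A.", "family": "Nother"},
    ],
}

print(resolver_url("doi:10.5555/12345678"))
print(crossref_api_url("https://doi.org/10.5555/12345678"))
print(summarize(sample_record))
```

Note how the summary already contains a link to another PID (an author's ORCID iD), which is the kind of connection between entities that metadata makes possible.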
So PIDs and metadata are inextricably linked – the value of each is maximized by the other.
Now let’s take a look at some of the most commonly used PIDs in the research space, and what you should know about them.
PIDs for people, places, and things
At 25 years old, Crossref’s Digital Object Identifier (DOI) is arguably the best-known PID in our community – and certainly the most ubiquitous. Originally launched in 2000 to enable citation links between journal articles, Crossref has now registered about 175 million DOIs and supports nearly two million monthly API queries. Citation linking is something that we take completely for granted today, but 25 years ago it was almost unimaginable. It was made possible thanks to the support of an initial group of 12 publishers – both commercial and not-for-profit organizations – who came together to solve this industry-wide need collaboratively. Today Crossref provides a host of DOI and metadata services including grant linking, Crossmark, Similarity Check, and more. Crossref DOIs are now used to register a wide variety of content types, not just journal articles. This includes book chapters, review reports, images, dissertations, preprints and, most recently, grants.
ORCID (Open Researcher and Contributor Identifier) has a similar backstory, though its founding group included a broader mix of organizations including funders, institutions, publishers, and platforms. Launched in 2012, ORCID is now probably just as well known as Crossref, but as a – or perhaps the – PID for people. There are currently around 10 million active ORCID IDs, which have been registered by researchers and other contributors to research working in every country of the world. As well as researchers at all career stages, across all disciplines, and in a range of settings, ORCID users also include librarians, editors, lab technicians, and many more. Registration is free and ORCID records are fully controlled by the researchers themselves – they decide what information to add to their record (the only requirements are a name, which is shown publicly, and a confirmed email, which by default is kept private, but can be made openly available); who is allowed to edit it (ORCID members must request access); what is fully public, what is fully private, and what can only be seen by ORCID members.
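For publishers integrating ORCID IDs into their systems, one practical detail is that the final character of an ORCID ID is a check digit (computed with the ISO 7064 MOD 11-2 algorithm), so a mistyped ID can often be caught before it is stored. The following Python sketch shows one way to do that; the function names are illustrative, and the test identifier belongs to Josiah Carberry, the fictional researcher ORCID uses in its own examples. Passing this check only shows that an ID is well formed, not that it is registered or belongs to the person who entered it.

```python
import re

# Four hyphen-separated groups of four; the last character may be X.
_ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string is well formed and its check digit matches."""
    if not _ORCID_PATTERN.match(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_digit(compact[:15]) == compact[15]


print(is_valid_orcid("0000-0002-1825-0097"))  # well formed, correct check digit
print(is_valid_orcid("0000-0002-1825-0098"))  # last digit mistyped
```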
Crossref DOIs and ORCID IDs are especially well-known and widely used by scholarly publishers because they relate directly to two central entities in the publishing process – researchers (scholarly authors, editors, reviewers, etc.) and their outputs. But it’s worth also being aware of a number of other open PIDs that are increasingly important in the research ecosystem.
DataCite, launched in 2009, also registers DOIs. Its original focus, unsurprisingly, was identifiers for datasets, but coverage has now expanded to include many of the same entities as Crossref, as well as DOIs for data management plans (DMPs), International Generic Sample Numbers (IGSNs) for physical samples, and Research Activity Identifiers (RAiDs) for research projects.
In addition, Crossref, DataCite, and the California Digital Library (CDL) collectively operate another important PID – the Research Organization Registry (ROR) identifier – which is also the default organization identifier in the Crossref, DataCite, and ORCID registries. Like people, organizations change their names; they also merge, split, shut down, and re-emerge. ROR IDs help publishers and others keep track of research organizations and, in combination with Crossref, DataCite, and ORCID, their outputs and researchers. Those organizations now include funders, following the ongoing transition of Crossref’s Open Funder Registry to ROR.
Conclusion and take-home messages
Of course, PIDs aren’t THE solution to all the challenges around trustworthy identification, attribution, provenance, etc. For example, they don’t – and were never intended to – verify the quality of an entity. In addition, open PIDs can be vulnerable to misuse; the ORCID registry has millions of unused records, many of which are spam, which is why the ORCID statistics page now focuses on active records. And, because open does not equal free, the cost of joining a PID organization and/or integrating PIDs into researcher systems can still be prohibitive for some organizations and communities. However, PIDs can and should be part of the solution to many of these challenges.