Dan Valen, figshare
2.2017
Metadata and the [Data] Repository
Why research data repositories?
•  Funder Mandates
•  Publisher Mandates
•  FACTS (aka, data or it didn’t happen)
Source: http://datasharing.sparcopen.org/data
Drivers: Governments aligning to support public access of research
North America
•  OSTP Memo
•  NIH, NSF, NOAA, SI, and more…
•  Tri-Agency Open Access Policy
•  FASTR
•  Other Agency/Funder Policies, e.g. Gates, Moore, Sloan, NEH, Howard Hughes
UK / Europe
•  Concordat on Open Research Data (HEFCE, RCUK, UUK, Wellcome Trust), EPSRC, and more
•  Horizon 2020, OpenAIRE
Australia
•  ARC – data management plan expectation from researchers
•  Australian National Data Service – supporting the sector, Institutional Data Management Frameworks
Drivers: Nudge the system towards promoting research integrity
Source: http://www.dlib.org/dlib/january15/01guest_editorial.html
Drivers: Data as a “first-class citizen”, Transparency, and the 3 Rs
Source: https://www.rd-alliance.org/sites/default/files/attachment/The%20Data%20Harvest%20Final.pdf
So what is the community doing to make data discoverable?
Source: http://www.niso.org/apps/group_public/download.php/15375/PrimerRDM-2015-0727.pdf
Standards and the community
Source: http://rd-alliance.github.io/metadata-directory/standards/ + http://www.dcc.ac.uk/resources/subject-areas/physical-science
Adhering to Force11 Joint Declaration of Data Citation Principles (for discovery)…
Source: https://www.force11.org/group/joint-declaration-data-citation-principles-final
Source: https://thlibrary.wordpress.com/2013/08/13/spotlight-on-data-citation-index/
PREAMBLE

Sound, reproducible scholarship rests upon a foundation of robust, accessible data. For this to be so in practice as well as theory, data must be accorded due importance in the practice of scholarship and in the enduring scholarly record. In other words, data should be considered legitimate, citable products of research. Data citation, like the citation of other evidence and sources, is good research practice and is part of the scholarly ecosystem supporting data reuse.
In support of this assertion, and to encourage good practice, we offer a set of guiding principles for data within scholarly literature, another dataset, or any other research object.
Data FAIRport. N.p., n.d. Web. 20 June 2016.
What’s out there?
The Generalist Repository
A look at the generalist data repo
Source: https://zenodo.org/dev#harvest-metadata
Metadata at Zenodo: behind the scenes
Metadata at Zenodo: the “finished” product
Source: https://zenodo.org/record/313122
Metadata at figshare: OAI-PMH
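To make the harvesting idea concrete, here is a minimal sketch of an OAI-PMH `ListRecords` request and of reading Dublin Core fields out of a response. The verb, the `metadataPrefix` parameter, and the XML namespaces come from the OAI-PMH 2.0 protocol. The endpoint `https://repository.example.org/oai` and the sample record are assumptions made up for illustration; they are not Zenodo's or figshare's actual endpoints or output.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL, optionally limited by a start date."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Made-up response body shaped like a ListRecords reply (assumption).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:12345</identifier>
        <datestamp>2017-02-01T00:00:00Z</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:identifier>https://doi.org/10.1234/example</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def parse_records(xml_text):
    """Return one dict per harvested record with a few Dublin Core fields."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind("oai:ListRecords/oai:record", NS):
        records.append({
            "oai_identifier": record.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": record.findtext(".//dc:title", namespaces=NS),
            "creators": [e.text for e in record.iterfind(".//dc:creator", NS)],
        })
    return records


if __name__ == "__main__":
    print(list_records_url(BASE_URL, from_date="2017-02-01"))
    print(parse_records(SAMPLE_RESPONSE))
```

A real harvester would also follow the `resumptionToken` element to page through large result sets; that is left out here to keep the sketch short.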
Metadata at figshare: behind the scenes
Metadata at figshare: the “finished” product
Source: https://doi.org/10.6084/m9.figshare.104616.v4
Data repos, metadata, and discovery
Source: http://www.makeuseof.com/tag/technology-explained-what-is-a-meta-search-engine/
Index content for discovery in Google + Google Scholar

Can expose hidden content within published articles
Google and Datasets (tabular data and beyond)
Source: https://developers.google.com/search/docs/data-types/datasets
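As a rough illustration of dataset markup that search engines can read, the sketch below builds a schema.org `Dataset` description as JSON-LD and wraps it in a script tag for a landing page. The property names (`name`, `description`, `identifier`, `creator`, `license`, `distribution`) are schema.org terms. The helper names and every value in `example` are hypothetical, and this is not a description of how any particular repository generates its pages.

```python
import json


def dataset_jsonld(name, description, doi, creators, license_url,
                   download_url, encoding_format):
    """Build a schema.org Dataset description as a plain dict."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": doi,
        "creator": [{"@type": "Person", "name": c} for c in creators],
        "license": license_url,
        "distribution": [{
            "@type": "DataDownload",
            "contentUrl": download_url,
            "encodingFormat": encoding_format,
        }],
    }


def as_script_tag(doc):
    """Serialize the description for embedding in a landing page's HTML."""
    return ('<script type="application/ld+json">\n'
            + json.dumps(doc, indent=2)
            + "\n</script>")


# All values below are made up for illustration.
example = dataset_jsonld(
    name="Example dataset",
    description="Tabular measurements used in an example article.",
    doi="https://doi.org/10.1234/example",
    creators=["Jane Doe"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
    download_url="https://repository.example.org/files/example.csv",
    encoding_format="text/csv",
)

if __name__ == "__main__":
    print(as_script_tag(example))
```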
Additional ways to discover content: SHARE
https://share.osf.io/discover?sources=figshare
Additional ways to discover content: DataCite + the importance of metadata
Source: https://search.datacite.org
Mapping to different metadata standards is tough…
Source: http://jennriley.com/metadatamap/
From a data model perspective, this problem has at least one solution

•  RDF (machine-readable format to describe anything)
•  Ontology (a machine-readable dictionary for getting meaning from RDF)
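A toy sketch of the two ideas above: records are stored as RDF-style (subject, predicate, object) triples, and a tiny "ontology" declares that two properties from different vocabularies mean the same thing, so one query finds titles from both. The property URIs are real Dublin Core Terms, schema.org, and OWL identifiers. The dataset URIs, the equivalence statement itself, and the helper functions are assumptions for illustration, not an official crosswalk, and a real system would use an RDF library rather than plain tuples.

```python
DCT = "http://purl.org/dc/terms/"
SDO = "https://schema.org/"
OWL_EQUIVALENT = "http://www.w3.org/2002/07/owl#equivalentProperty"

# Two repositories describe their datasets with different vocabularies.
# The dataset URIs are hypothetical.
triples = [
    ("https://example.org/dataset/1", DCT + "title", "Example dataset"),
    ("https://example.org/dataset/2", SDO + "name", "Another dataset"),
]

# Illustrative ontology statement (an assumption, not an official mapping).
ontology = [
    (DCT + "title", OWL_EQUIVALENT, SDO + "name"),
]


def equivalent_properties(prop, ontology):
    """Return prop plus every property declared equivalent to it (one hop)."""
    props = {prop}
    for s, p, o in ontology:
        if p != OWL_EQUIVALENT:
            continue
        if s == prop:
            props.add(o)
        if o == prop:
            props.add(s)
    return props


def values_for(prop, triples, ontology):
    """Find (subject, value) pairs for prop or any equivalent property."""
    props = equivalent_properties(prop, ontology)
    return [(s, o) for s, p, o in triples if p in props]


if __name__ == "__main__":
    # Asking for dcterms:title also returns the schema.org "name" value.
    for subject, value in values_for(DCT + "title", triples, ontology):
        print(subject, "->", value)
```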
Challenges and next steps
•  Expose metadata as RDF or Linked Data (LD) and allow folks to build extensions or mappings between what the community has and other formats
•  Common metadata standards
•  (e.g., from CRIS/RIM -> paper repo -> data repo -> through to)
•  OAI-PMH wasn’t designed with versioning in mind and lacks information on the actual files related to metadata records, which might be of use for some people… so JSON-LD via the API?
•  Automated metadata capture and generation upon upload
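The last bullet, automated metadata capture upon upload, could look something like the sketch below, which derives file-level metadata (name, size, MIME type, checksum, modification time) from an uploaded file using only the Python standard library. The function name and the field names are hypothetical, and this is one possible approach rather than a description of any repository's upload pipeline.

```python
import hashlib
import mimetypes
import os
from datetime import datetime, timezone


def capture_file_metadata(path, chunk_size=1 << 20):
    """Derive basic technical metadata from a file on disk."""
    sha256 = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large uploads do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            sha256.update(chunk)

    stat = os.stat(path)
    # The MIME type is guessed from the file extension only.
    mime_type, _ = mimetypes.guess_type(path)

    return {
        "name": os.path.basename(path),
        "size_bytes": stat.st_size,
        "mime_type": mime_type or "application/octet-stream",
        "sha256": sha256.hexdigest(),
        "modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
    }
```

Descriptive fields such as title, authors, and license still need a human or an upstream system; a sketch like this only covers what can be read off the file itself.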
	
Image courtesy of sheffield.figshare.com
Thank you so much for your time
dan@figshare.com
@figshare
figshare.com
