OpenAIRE Infrastructure Services:
technologies for Open Science
Paolo Manghi
(OpenAIRE infrastructure)
Istituto di Scienza e Tecnologie dell’Informazione
Consiglio Nazionale delle Ricerche
OPENAIRE SERVICES
TODAY
Services: Search & Browse; Deposit (publications & data); Curate & collaborate; Visualize and manage Enhanced Publications; Linked Content; Statistics; Research impact (citations, usage statistics); APIs; Get support (NOADs)
Content providers: data repositories/aggregators; data journals; publication repositories/aggregators (institutional & thematic); Open Access journals/publishers; institutional CRIS systems; CERN/OpenAIRE “catch-all” repository
Collected content: metadata on data; metadata and PDFs; usage data
Funding information: national funding; EC funding
Registries: OpenDOAR; ResearchID
Guidelines: for data interoperability; for use of services
Architecture overview: PRODUCTION
Data Sources (repositories, entity registries, etc.)
Aggregation subsystem: Collect, Validate, Transform, Clean
De-duplication subsystem: Identify duplicates on publications, persons, and organisations; Action Set (equivalence rels)
Information Inference subsystem: Mine; Action Set (inferred info)
Data provision subsystem: Populate, Merge, Enrich
Publishing subsystem (next slide)
Data states: files, cleaned records, info packages, native slices; Native, De-duplicated, De-duplicated & enriched
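As a rough illustration of the flow on this slide, the sketch below cleans native records, identifies duplicates, emits an action set of equivalence relations, and merges the duplicates. The record fields, the title-based matching key, and every function name are assumptions made for the example; the slides do not say how OpenAIRE actually matches records.

```python
from collections import defaultdict


def clean(record):
    """Normalise whitespace and case in the title (hypothetical cleaning rule)."""
    cleaned = dict(record)
    cleaned["title"] = " ".join(record["title"].split()).lower()
    return cleaned


def find_equivalences(records):
    """Return an action set of (representative id, duplicate id) relations.

    Records sharing the same cleaned title are treated as duplicates. This
    exact-match key is an assumption, not OpenAIRE's matching logic.
    """
    by_title = defaultdict(list)
    for rec in records:
        by_title[rec["title"]].append(rec["id"])
    action_set = []
    for ids in by_title.values():
        first = ids[0]
        action_set.extend((first, other) for other in ids[1:])
    return action_set


def merge(records, action_set):
    """Apply the equivalence relations: one merged record per group."""
    representative = {dup: rep for rep, dup in action_set}
    merged = {}
    for rec in records:
        rep_id = representative.get(rec["id"], rec["id"])
        target = merged.setdefault(
            rep_id, {"id": rep_id, "title": rec["title"], "sources": []}
        )
        target["sources"].append(rec["source"])
    return list(merged.values())


# Made-up native records from two hypothetical data sources.
native = [
    {"id": "r1", "title": "Open  Science as a Service", "source": "repoA"},
    {"id": "r2", "title": "open science as a service", "source": "repoB"},
    {"id": "r3", "title": "Data Journals", "source": "repoA"},
]
cleaned = [clean(r) for r in native]
equivalences = find_equivalences(cleaned)
deduplicated = merge(cleaned, equivalences)
```

Keeping the equivalence relations as a separate action set, rather than merging in place, mirrors the slide's split between the De-duplication subsystem and the Data provision subsystem that applies the result.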
Publishing subsystem: PRODUCTION
Input from the Data provision subsystem: De-duplicated & enriched
Pre-public content preview (test.openaire.eu): shadow index collection, shadow OAI-PMH, shadow StatsDB, shadow LOD
Quality Monitor: Data Flow monitoring, alignment of back-ends, content increase / decrease
Preview, then Go
PUBLIC (www.openaire.eu): Public HTTP API, index collection, OAI-PMH, StatsDB, LOD
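The shadow/public arrangement on this slide can be sketched as a two-stage publish: content is loaded into shadow back-ends, a quality check looks at back-end alignment and content increase or decrease, and only then does "Go" promote it. The class, the method names, and the 5% tolerance are hypothetical; the slides only name the back-ends and the checks.

```python
class PublishingSubsystem:
    """Two sets of back-ends: a shadow set for preview and a public set."""

    BACKENDS = ("index", "oai_pmh", "stats_db", "lod")

    def __init__(self, max_decrease=0.05):
        self.public = {name: [] for name in self.BACKENDS}
        self.shadow = {name: [] for name in self.BACKENDS}
        self.max_decrease = max_decrease  # hypothetical tolerance, not from the slides

    def load_shadow(self, records):
        """Feed de-duplicated & enriched records into every shadow back-end."""
        for name in self.BACKENDS:
            self.shadow[name] = list(records)

    def quality_report(self):
        """Check back-end alignment and content increase / decrease."""
        sizes = {len(content) for content in self.shadow.values()}
        aligned = len(sizes) == 1
        old, new = len(self.public["index"]), len(self.shadow["index"])
        decrease = (old - new) / old if old else 0.0
        return {
            "aligned": aligned,
            "decrease": decrease,
            "ok": aligned and decrease <= self.max_decrease,
        }

    def go(self):
        """Promote shadow to public only if the quality checks pass."""
        if not self.quality_report()["ok"]:
            return False
        self.public = {name: list(content) for name, content in self.shadow.items()}
        return True


pub = PublishingSubsystem()
pub.load_shadow(["p1", "p2", "p3"])
first = pub.go()          # initial load: nothing to compare against, so it passes
pub.load_shadow(["p1"])   # a large, unexpected drop in content
second = pub.go()         # blocked; the public back-ends keep the previous content
```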
OPENAIRE SERVICES
TOWARDS OS-AAS
Research Communities and “digital” science
Research Communication Infrastructure: Publishing; Repeat/Reproduce/Reuse and Evaluation
Scientific process: e-infra, Tools & Services
Research data
Research methods: e.g. software, workflows, protocols, algorithms, scripts
Research literature: Articles, docs, white papers
What does Open Science Publishing mean?
Open Science publishing
Supporting reuse/reproducibility and transparent evaluation
Scientific process: e-infra, Tools & Services
Research data; Research methods; Research literature: Articles, docs, white papers
Publishing: Publication Repository, Data Repository, Method Repository, Package Repository
Links between artefacts: citation
Enabling Reproducibility
Enabling Transparent evaluation
Open Science publishing
Publishing of all kinds of research artefacts
Publishing packages of artefacts
Publishing an up-to-date record of research artefacts: metadata and links
Enabling transparent
evaluation
Enabling reproducibility
Research Communication barriers to Open Science
Repositories lack support for Open Science publishing
• No support for integration of repositories for methods or packages
• Minimal or no support for links between artefacts in different repositories
• No support for keeping repositories up to date with links between artefacts
Research communities lack a culture of Open Science publishing
• Lack of e-infrastructure for Open Science: e.g. the repository limits above, exchange formats, workflows
• Difficulty self-organizing and sustaining research communication solutions: e.g. identifying the problems, seeing the benefits, devising solutions, applying economies of scale
The OpenAIRE Infrastructure
European (and beyond) infrastructure for Open
Access and Open Science
Networking infrastructure: Open Access/Science advocacy and
support, global alignment and interoperability, etc.
Technical infrastructure: services for monitoring of Open
Access/Science and Research Impact for funders and communities
OpenAIRE to support Open Science
Facilitate research communities' adoption of Open Science publishing principles by supporting publishing tools as-a-Service
Help repositories move towards Open Science publishing by supporting notification-based research communication as-a-Service
OpenAIRE technical infrastructure
[Diagram: Harvest, Harmonize, De-duplicate, Inference; link label: fund]
Deposit articles/datasets (DOI)
Claim links
Stats:
• research impact
• Open Access trends
Repositories of publications, datasets, projects
Entities:
• 14M+ publications (de-duplicated)
• 200,000+ links publication-project from 5 funders
• 20,000+ datasets linked to publications or projects
• 80,000+ organizations (de-duplicated)
Collected from:
• 700+ “direct” data providers
• 5,000+ “indirect” data providers (inherited from aggregators)
Open Science as-a-Service (OSaaS) in OpenAIRE
Components: Catch-All-Notification Broker; Research Community Dashboard
Functions: Harvesting; Search-Browse-Monitor-Research Impact; Subscribe & Receive Notification
Actors: Researchers; Content Providers
Artefact types: Articles, Data, Projects, Methods, Packages
OpenAIRE OSaaS: methods and packages
Repositories of publications, datasets, projects, methods, packages
• Metadata description for methods and packages: citation and
reproducibility (e.g. Research Objects, Rmap)
• Interoperability guidelines for exchanging packages of interlinked
artefacts: enabling exchange of information across research
communication infrastructure
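To make the idea of a package of interlinked artefacts concrete, here is a minimal data-structure sketch. It is not the Research Objects or RMap model and not an OpenAIRE schema; the class names, field names, relation label, and identifiers are all placeholders chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Artefact:
    identifier: str   # e.g. a DOI; the values used below are placeholders
    kind: str         # "publication", "dataset", "method" or "package"
    repository: str   # where the artefact is deposited


@dataclass
class Package:
    identifier: str
    artefacts: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # (source id, relation, target id)

    def add(self, artefact):
        self.artefacts[artefact.identifier] = artefact

    def link(self, source, relation, target):
        """Record a typed link; both ends must already be in the package."""
        if source not in self.artefacts or target not in self.artefacts:
            raise KeyError("both artefacts must be added before linking")
        self.links.append((source, relation, target))

    def missing_kinds(self, required=("publication", "dataset", "method")):
        """Kinds not yet present, as a crude completeness check."""
        present = {a.kind for a in self.artefacts.values()}
        return [kind for kind in required if kind not in present]


pkg = Package("pkg-1")
pkg.add(Artefact("pub-1", "publication", "repoA"))
pkg.add(Artefact("data-1", "dataset", "repoB"))
pkg.link("pub-1", "cites", "data-1")
missing = pkg.missing_kinds()  # the method is not in the package yet
```

Because each artefact carries its own repository, a package like this can span several repositories, which is the cross-infrastructure exchange the guidelines bullet refers to.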
OpenAIRE OSaaS: Dashboard for Research Communities
Repositories of publications, datasets, projects, methods, packages
Research Community Operator: Request Dashboard
OpenAIRE OSaaS: Dashboard for Research Communities
Repositories of publications, datasets, projects, methods, packages
Research Community Operator:
• Manage users
• Configure stats
• Configure inference
Researchers:
• Deposit (DOI)
• Claim
• Stats: research impact & OA
Research Community: Open Science benefits
Common repository for publishing (deposition) of datasets, methods, and packages
• Communities can continue their publishing practices but, if needed, have support for deposition of any artefact
Collaborative curation of a community-specific research communication domain
• Community information space to share, discover, and reuse (reproduce) scientific results
Research impact and statistics
• Scientific reward strategies can be developed
OpenAIRE OSaaS: Notification Broker
Repositories of publications, datasets, projects, methods, packages
Open Science as-a-Service: Notification Broker for repositories
Repository Manager: Request Notification Workspace
OpenAIRE OSaaS: Notification Broker
Repositories of publications, datasets, projects, methods, packages
Open Science as-a-Service: Notification Broker for repositories
Repository Manager:
• Subscription: enrichment or additions to repository
• Notification management
Notifications:
• Mail
• APIs
• UIs
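The subscribe-and-notify cycle on these slides can be sketched as a small in-memory broker: a repository manager subscribes to enrichments or additions for a repository and picks a delivery channel, and matching events are queued for that channel. The class names, event fields, and topic and channel strings are assumptions for illustration, not the OpenAIRE broker's API.

```python
from dataclasses import dataclass, field


@dataclass
class Subscription:
    repository: str
    topic: str     # "enrichment" or "addition"
    channel: str   # "mail", "api" or "ui"


@dataclass
class NotificationBroker:
    subscriptions: list = field(default_factory=list)
    outbox: dict = field(default_factory=dict)  # (repository, channel) -> events

    def subscribe(self, repository, topic, channel="api"):
        self.subscriptions.append(Subscription(repository, topic, channel))

    def publish(self, event):
        """Queue an event for every matching subscription; return the count."""
        delivered = 0
        for sub in self.subscriptions:
            if sub.repository == event["repository"] and sub.topic == event["topic"]:
                self.outbox.setdefault((sub.repository, sub.channel), []).append(event)
                delivered += 1
        return delivered


broker = NotificationBroker()
broker.subscribe("repoA", "enrichment", channel="mail")
broker.subscribe("repoA", "addition")

# An enrichment for repoA: a record could gain a link to a dataset.
sent = broker.publish({"repository": "repoA", "topic": "enrichment",
                       "record": "r1", "detail": "link to dataset d7"})
# Nobody subscribed on behalf of repoB, so this event is not delivered.
ignored = broker.publish({"repository": "repoB", "topic": "addition", "record": "r9"})
```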
Enabling a Network of Research Communication Brokers
Producers of events
Brokers: Subscriptions
Consumers of events: subscribe, notify
Between brokers: exchanging subscriptions & channeling notifications
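One way to read "exchanging subscriptions & channeling notifications" is that each broker tells its peers which topics its local consumers care about, and forwards matching events to those peers. The sketch below assumes exactly that, with one-hop forwarding only; the `Broker` class and its methods are hypothetical and do not describe an existing protocol.

```python
class Broker:
    def __init__(self, name):
        self.name = name
        self.peers = []
        self.local = {}    # topic -> list of consumer callbacks
        self.remote = {}   # topic -> set of peer brokers interested in it

    def connect(self, other):
        """Link two brokers so they can exchange subscriptions."""
        self.peers.append(other)
        other.peers.append(self)

    def subscribe(self, topic, consumer):
        """Register a local consumer and share the subscription with peers."""
        self.local.setdefault(topic, []).append(consumer)
        for peer in self.peers:
            peer.remote.setdefault(topic, set()).add(self)

    def notify(self, topic, event, forwarded=False):
        """Deliver locally, then channel the event once to interested peers."""
        for consumer in self.local.get(topic, []):
            consumer(event)
        if not forwarded:  # forwarded events are not relayed again (one hop)
            for peer in self.remote.get(topic, ()):
                peer.notify(topic, event, forwarded=True)


a, b = Broker("A"), Broker("B")
a.connect(b)
received = []
b.subscribe("links", received.append)   # consumer attached to broker B
a.notify("links", {"record": "r1"})     # producer attached to broker A
a.notify("stats", {"record": "r2"})     # no subscriber anywhere: dropped
```

The `forwarded` flag keeps two connected brokers from bouncing the same event back and forth; a larger network would need proper loop detection, which is left out here.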
Repositories: Open Science benefits
Extending repository metadata models to Open Science
• Enabling addition of links to artefacts of any kind
Keeping their collection up-to-date: enrichments and additions
• “Almost real-time” exchange of information: notifications about links to other artefacts, missing properties, and missing artefacts
Fostering notification-based and federated dissemination of knowledge
• Enabling repositories to be notified of content of interest, so that research-focused aggregators can be built from notifications
OpenAIRE towards Open Science
Research Community Dashboard
Repository Notification Broker
Served on-demand according to the OSaaS approach
Customizable by different disciplines and providers, each with
different practices and maturity levels
Framework aligning communities and repositories on practices
addressing transparent evaluation and reproducibility
Questions?
Off-line questions are most welcome:
paolo.manghi@isti.cnr.it

OpenAIRE: Open Science as-a-Service - presentation at #DI4R2016
