FAIR Principles

The FAIR Principles are a set of guiding principles aimed at making data and other research outputs Findable, Accessible, Interoperable, and Reusable. First described in 2016, they have rapidly come to define best practice in research data management, and considerable international attention has focused on describing what they mean in practice and on determining how to assess the quality of their implementation. The Digital Repository of Ireland (DRI) has contributed to this process through the European Commission’s expert group on FAIR, which produced the Turning FAIR into Reality report, and through involvement in the RDA FAIR Data Maturity Model Working Group, the European Open Science Cloud (EOSC) FAIR working group and the WorldFAIR Project.

DRI is committed to advocacy and policy efforts around the FAIR principles in national and international contexts, and has also pioneered their implementation in Ireland as a trustworthy and certified repository for social sciences and humanities data. Our work on FAIRification has focused on optimising policies; applying persistent identifiers consistently, including Digital Object Identifiers (DOI) and Open Researcher and Contributor IDs (ORCID); providing rich, standardised metadata that is indexed and searchable; supplying machine-actionable rights and licences for reuse; and promoting the use of vocabularies.

As part of the WorldFAIR Project, DRI completed a FAIR Implementation Profile (FIP) describing the various resources, standards and tools used to support the FAIR sharing of collections and research data in the Repository. DRI’s profile is being used to support the development of a Cross-Domain Interoperability Framework designed “to provide a core set of agreements, which integrate the generic aspects of FAIR (basic discovery metadata, structural aspects of data needed for processing and integration, basic access information, provenance, etc.) with links to domain-specific semantic components such as controlled vocabularies and ontologies.”

Below, we outline some of the specific ways that we have addressed the different elements of FAIR.

Findable

F1. (Meta)data are assigned a globally unique and persistent identifier

F2. Data are described with rich metadata

F3. Metadata clearly and explicitly include the identifier of the data they describe

F4. (Meta)data are registered or indexed in a searchable resource

All objects within the Repository are assigned a unique identifier on ingest, and then a persistent identifier, a DataCite DOI, when published.

The DOI is included in the automatically generated citations, which users can select in a range of styles in order to credit the depositor properly and cite the Repository as the source.
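As an illustration of how a DOI can be carried into a generated citation, the following is a minimal sketch. The `PublishedObject` fields, the `doi_url` and `cite` helpers, and the citation layout are all hypothetical and do not reproduce the Repository's actual citation styles; `10.1234/example` is a placeholder DOI.

```python
from dataclasses import dataclass


@dataclass
class PublishedObject:
    """Hypothetical subset of the metadata needed for a citation."""
    creator: str
    year: int
    title: str
    publisher: str
    doi: str  # may be bare ("10.1234/example") or prefixed ("doi:10.1234/example")


def doi_url(doi: str) -> str:
    """Normalise a DOI to its resolvable https://doi.org/ form."""
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return f"https://doi.org/{doi}"


def cite(obj: PublishedObject) -> str:
    """Assemble one illustrative citation string (layout is an assumption)."""
    return f"{obj.creator} ({obj.year}) {obj.title}. {obj.publisher}. {doi_url(obj.doi)}"
```

Ending the citation with the resolvable form of the DOI means the reference stays actionable even if the landing page URL changes.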

Collections are also made discoverable through FAIR Signposting, a standardised format for machine agents to navigate scholarly web objects.
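FAIR Signposting exposes typed links (for example `cite-as`, `describedby`, `item` and `license`) that a machine agent can read, commonly from the HTTP `Link` response header. The sketch below shows how a client might group such links by relation type. The `parse_signposting` helper and the sample URLs used with it are hypothetical, and the parser handles only the simple header shapes shown, not every case the Link header syntax allows.

```python
import re
from collections import defaultdict

# One link: <target> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(r'<([^>]+)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
# One parameter: the value is either quoted or a bare token.
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_signposting(link_header: str) -> dict:
    """Group the links in an HTTP Link header by their rel value."""
    links = defaultdict(list)
    for target, raw_params in LINK_RE.findall(link_header):
        params = {
            name.lower(): quoted or bare
            for name, quoted, bare in PARAM_RE.findall(raw_params)
        }
        # A single link may carry several space-separated relation types.
        for rel in params.get("rel", "").split():
            links[rel].append({"href": target, "type": params.get("type")})
    return dict(links)
```

A client could then follow `links["cite-as"]` to find the persistent identifier and `links["describedby"]` to find a metadata record, without scraping the landing page.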

Datasets or Digital Objects ingested into DRI are described by rich metadata. This includes a number of mandatory fields such as Title, Description and Rights, as well as a large number of recommended fields such as subjects, places and dates. All metadata is indexed in a Solr search engine and can be searched either across all fields or within individual fields such as Subjects and Person names.
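The difference between an all-fields search and a fielded search can be sketched as a Solr query URL. This is an illustration only: the core URL and the field name `subject` used with it are hypothetical, the Repository's internal index is not necessarily exposed in this way, and an unfielded query is matched against whatever default field the Solr core is configured with.

```python
from typing import Optional
from urllib.parse import urlencode


def solr_select_url(core_url: str, text: str, field: Optional[str] = None, rows: int = 10) -> str:
    """Build a Solr /select URL for a phrase search, optionally limited to one field."""
    # Escape backslashes and double quotes so the text stays inside the phrase.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    phrase = f'"{escaped}"'
    query = f"{field}:{phrase}" if field else phrase
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{core_url}/select?{urlencode(params)}"
```

For example, `solr_select_url(core, "folk music", field="subject")` restricts matching to the hypothetical `subject` field, while omitting `field` leaves the choice to the core's default.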

Accessible

A1. (Meta)data are retrievable by their identifier using a standardised communications protocol

A1.1 The protocol is open, free, and universally implementable

A1.2 The protocol allows for an authentication and authorisation procedure, where necessary

A2. Metadata are accessible, even when the data are no longer available

Metadata and data can be retrieved over the HTTPS protocol, in a web browser or programmatically via a REST API. While metadata is always accessible, there may be cases where access to data files is restricted, e.g. where the data contains potentially identifiable personal information. In this case the data is still available to authenticated and authorised users over HTTPS, and can be downloaded by authenticated applications.

In the case where data is removed, the metadata remains available.
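The access pattern described above, where metadata is open and some data files require authentication, can be sketched as an HTTPS request that carries credentials only when they are needed. The `/objects/<id>` path, the JSON `Accept` header and the bearer-token scheme are assumptions made for illustration; they are not taken from the Repository's API documentation.

```python
from typing import Optional
from urllib.request import Request


def build_object_request(base_url: str, object_id: str, token: Optional[str] = None) -> Request:
    """Prepare (but do not send) a GET request for one object over HTTPS."""
    if not base_url.startswith("https://"):
        raise ValueError("objects should be requested over HTTPS")
    headers = {"Accept": "application/json"}
    if token:
        # Hypothetical scheme: restricted assets are requested with a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return Request(f"{base_url}/objects/{object_id}", headers=headers, method="GET")
```

Passing the returned object to `urllib.request.urlopen` would send it. An anonymous request built this way is enough for metadata, and the same function with a token covers restricted files.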

Interoperable

I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation

I2. (Meta)data use vocabularies that follow FAIR principles

I3. (Meta)data include qualified references to other (meta)data

All metadata in the Repository use formal, accessible, shared and domain-appropriate metadata standards, including Dublin Core, Qualified Dublin Core, EAD, MODS and MARC XML formats. DRI chose to implement these particular standards because they are widely used by our community members and support current disciplinary practices.

The metadata can be entered via an online form which enforces the input of the mandatory fields, allows users to search various controlled vocabularies for metadata values, and provides tooltips and guidance to assist users in making decisions about the metadata that they enter. For users who have already catalogued their data in an external system, the Repository supports ingest of metadata in the formats listed above. DRI also provides tabular templates to assist users in gathering metadata with additional guidance, and tools for exporting from a tabular format to Qualified Dublin Core for batch ingest into the Repository. DRI also publishes several guidelines and fact sheets about metadata and runs regular training courses to help users to provide quality metadata.
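The conversion from a tabular template to Qualified Dublin Core can be sketched as follows. This is not DRI's export tool: the column names, the `|` separator for multi-valued cells, the `qualifieddc` root element and the choice of three mandatory columns are assumptions for illustration. Only the Dublin Core namespace URIs are standard.

```python
import csv
import io
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
ET.register_namespace("dc", DC)
ET.register_namespace("dcterms", DCTERMS)

# Hypothetical mapping from template columns to Dublin Core elements.
COLUMN_TO_ELEMENT = {
    "title": f"{{{DC}}}title",
    "description": f"{{{DC}}}description",
    "rights": f"{{{DC}}}rights",
    "subject": f"{{{DC}}}subject",
    "created": f"{{{DCTERMS}}}created",
}
MANDATORY = ("title", "description", "rights")  # assumed subset of required fields


def row_to_qdc(row: dict) -> str:
    """Turn one table row into a Qualified Dublin Core XML string."""
    missing = [col for col in MANDATORY if not (row.get(col) or "").strip()]
    if missing:
        raise ValueError(f"missing mandatory fields: {', '.join(missing)}")
    root = ET.Element("qualifieddc")
    for column, tag in COLUMN_TO_ELEMENT.items():
        # A cell may hold several values separated by "|".
        for value in (row.get(column) or "").split("|"):
            if value.strip():
                ET.SubElement(root, tag).text = value.strip()
    return ET.tostring(root, encoding="unicode")


def csv_to_qdc(csv_text: str) -> list:
    """Convert every row of a CSV template into its own XML record."""
    return [row_to_qdc(row) for row in csv.DictReader(io.StringIO(csv_text))]
```

Rejecting rows with empty mandatory cells before any XML is written mirrors the check that the online form performs, so problems surface before batch ingest rather than during it.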

A machine-readable version of the object can be retrieved as a JSON or RDF record. This includes the main metadata fields as well as links to the digital asset file(s), related objects, licence, etc. This information can also be exported in BagIt format, or harvested via an OAI-PMH feed.
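Harvesting over OAI-PMH follows a simple loop: request `ListRecords`, read the records, and repeat with the `resumptionToken` until none is returned. The sketch below covers the two offline parts of that loop, building the request URL and reading a response. The endpoint URL used with it is hypothetical; the verb, the `oai_dc` metadata prefix and the XML namespace come from the OAI-PMH 2.0 protocol.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple
from urllib.parse import urlencode

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc",
                     resumption_token: Optional[str] = None) -> str:
    """Build a ListRecords URL; a resumption token replaces all other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text: str) -> Tuple[List[str], Optional[str]]:
    """Return the record identifiers and the next resumption token, if any."""
    root = ET.fromstring(xml_text)
    identifiers = [el.text for el in root.iter(f"{OAI}identifier")]
    token_el = root.find(f".//{OAI}resumptionToken")
    # An empty resumptionToken element marks the final page.
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token
```

A harvester would fetch `list_records_url(base)`, pass the body to `parse_list_records`, and keep requesting with the returned token until it is `None`.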

Reusable

R1. (Meta)data are richly described with a plurality of accurate and relevant attributes

R1.1. (Meta)data are released with a clear and accessible data usage license

R1.2. (Meta)data are associated with detailed provenance

R1.3. (Meta)data meet domain-relevant community standards

All objects in the repository require a formal Rights Statement and supplementary rights information. Each object must also have a Licence associated with it which can be selected from any of the supported DRI licences. The Licences specify under what conditions the digital assets can be reused. All of the descriptive metadata is reusable under a CC0 licence, unless otherwise indicated.

The internal DRI unique identifier can be used to create relationships between multiple objects. These relations enable researchers to link to supporting documentation, such as a publication or a record describing how the dataset was collected (for instance, a datasheet).

Technical metadata about the file format and other significant properties is automatically generated, as is provenance metadata relating to actions carried out on the file after it has been ingested. This additional metadata is stored with the object.

DRI is funded by the Department of Further and Higher Education, Research, Innovation and Science (DFHERIS) via the Higher Education Authority (HEA).