Guest Blog from DCU Library: Seven steps to publishing your collection on the Digital Repository of Ireland

by Liam O’Dwyer and Killian Downing, Dublin City University Library.

Introduction

An increasing number of Irish organisations, including libraries, galleries, archives and museums, are using the Digital Repository of Ireland (DRI) to publish their collections online. Liam O’Dwyer and Killian Downing, Dublin City University (DCU) Library, speak about publishing the Henry Morris and Seán Lester Collections on the DRI and Europeana.

DCU Library has been publishing its collections with the DRI since 2021. We are also aggregating these collections so they are discoverable on Europeana, the European Union’s platform for sharing digital cultural heritage. In this blog, we show our local implementation of the staging process, our approaches to it and our aims in publishing to DRI and Europeana in the first place.

The guide below is an iterative one as we continue to add DCU collections to the DRI but as a work-in-progress we hope it may be useful for any members at a similar stage. We will look at it in the context of two collections, Seán Lester and Henry Morris, both at different points in our cycle.

About the Collections:

The Seán Lester Collection is available on both the DRI and Europeana platforms. It comprises diaries and personal papers from his years as High Commissioner of the League of Nations in the Free City of Danzig (now Gdańsk) in the mid-1930s, and subsequently his time in Geneva during the League’s remaining years. During World War 2 Lester buried the diaries in a metal case near the League of Nations headquarters in Geneva. After the war ended, he retrieved the case and returned with the diaries to Ireland where this collection was subsequently donated to DCU Library by Patricia Kilroy and Ann Gorski, daughters of Seán Lester.

*Lester, Seán, 1888-1959. Diary 6: September 1938 – March 1939, Digital Repository of Ireland [Distributor], Dublin City University [Depositing Institution], https://doi.org/10.7486/DRI.ws85q5671*

The Henry Morris Collection has recently been published on DRI and is in the process of being aggregated to Europeana. Henry Morris was a writer and Irish scholar, born on 14 January 1874 in Lisdoonan, Donaghmoyne, Co. Monaghan, Ireland. Morris was a teacher and school inspector for the Department of Education, collector of 18th and 19th century Irish manuscripts, and involved in the revival of Irish language, and antiquarian studies. The collection is an album of correspondence, press-cuttings, autographs, warrants, postcards, invitations, and receipts, collected by Henry Morris including correspondence from notable figures in Irish History such as Padraic Pearse, Eoin MacNeill and T.M. Healy.

Repeal Association of Ireland. (2023) Repeal Association of Ireland Membership Card for Francis Connolly, 17 April 1843. — Repeal Association of Ireland. Repeal Association of Ireland Membership Card for Francis Connolly, 17 April 1843, Digital Repository of Ireland [Distributor], Dublin City University [Depositing Institution], https://doi.org/10.7486/DRI.8c980h269

1. Creating the Collection

In DRI’s repository, objects cannot be uploaded without a parent collection so the first step is to create your collection. This work is done via a webform in the DRI Workspace and is shown in the video below.

Walkthrough 1: Creating Collections in the Workspace

2. Collection Arrangement / Presentation

Once the collection is created, there are different ways you can go with DRI’s Collection/ Subcollection/ Object/ Asset model. The main principles are:

A collection can contain subcollections
Subcollections can contain further subcollections
Collections and subcollections can each contain objects
Objects contain one or more assets (e.g. image files) along with metadata
An object cannot be uploaded without a parent collection

We take a case-by-case approach to what suits each collection here. With Lester, we could have created a subcollection as a container for each diary :

Collection = Sean Lester Collection Sub Collection = Diary 1: October 1935 - January 1936 Object = Diary 1: October 1935 - January 1936 Sub Collection = Diary 2: January - June 1936 Object = Diary 2: January - June 1936 Sub Collection = Correspondence from Diary 2: January - June 1936

Instead we decided to have each diary as a compound object (i.e. a set of multiple assets) directly below the top level collection.

Collection = Sean Lester Collection Object = Diary 1: October 1935 - January 1936 Object = Diary 2: January - June 1936

This mirrors how the Lester Collection is arranged on AtoM, our web based archival management system.

With the Morris Collection, we are not matching the AtoM arrangement to the same degree. This is a set of letters and other objects held in a scrapbook. On AtoM it is a single compound PDF of the original scrapbook reflecting the original order. For DRI however, the letters and other items are presented individually with object-level metadata. This entailed some further metadata creation but the arrangement adds value and usability, and better leverages the benefits afforded by IIIF. Some of this was also with further aggregation in mind such as Europeana (expanded in step further below) and how the content would appear there.

3. File Formats

The type of objects in a collection also influences our choice of file format.

For Lester, each diary contained 100 assets/images so we included a compound pdf alongside the individual image files for ease of navigation.
For Morris, the letters objects are a lot smaller so the image files alone are enough.
In both cases the images are made available as hi-res tiffs to take full advantage of DRI’s IIIF implementation. It also enables us to make hi-res versions accessible which we do not yet do via our local publishing platforms.

4. Batch Ingest

Once our collection arrangement is finalised we can begin preparing object-level metadata and assets for upload. With batch ingest all metadata is created or managed outside of the DRI workspace prior to upload in DCMI (Dublin Core) XML format. This has benefits in that we can store it locally, and potentially reuse it later in other platforms such as Omeka.

It also acts as a useful audit of asset files as it requires consistent file-naming. Metadata and associated asset files must have matching filenames. Where a single metadata record has multiple assets, you follow the metadata filename with an underscore and numeric sequence. An example of this from our Lester upload :

5. Creating Batch Ingest-Ready Metadata

Archival ISAD(G) metadata is exported from AtoM as DC records and moved into OpenRefine where each row describes a single item or record. The exported DC records may require varying levels of enhancement and standardisation to fit DRI DC requirements. OpenRefine is very useful for this process and is also used for the next stage of work, that of turning the OpenRefine dataset into a set of Dublin Core XML records ready for import. The two videos below outline this process:

Walkthrough 2: Templating in OpenRefine to create Batch records file

Walkthrough 3: Splitting the Exported Batch File as Ingest-ready XML Files

DRI OpenRefine export template and accompanying python script referenced in the videos above are shared on github.

6. Staging and Ingest

Once we have our metadata and data files ready and named correctly, we can go ahead with the batch ingest process. We first stage the files to make them available to the repository prior to ingest. Both staging and ingest are relatively straightforward processes but it is important to review files after each stage to ensure they have all moved across successfully. Particularly during staging we can find one or two have been missed out and we simply add them again before ingest.

We upload tiff files which can take some time to move across, particularly during staging. For larger collections we tend to not stage all the files at once but do it in lots of 50-100 assets.

The video below shows this process for the first batch of 40 objects for the Henry Morris Collection.

Walkthrough 4: Staging and Batch Ingest

7. Aggregation to Europeana

The DRI is the accredited national aggregator for Europeana, and an active member of the Europeana Aggregators’ Forum. Published DRI collections can be shared with Europeana, giving educators, researchers, culture lovers and people everywhere the opportunity to explore Ireland’s cultural heritage.

Publishing to Europeana is coordinated by the DRI team, who guide members how to add at least one type of contextual metadata under the Europeana Publishing Framework. This framework outlines the minimum metadata requirements and shows how the quality of your metadata can impact on your collection’s discoverability and reach. Here’s a quick summary on managing metadata under the Europeana Publishing Framework.

At DCU Library, we added temporal and spatial subject metadata to the Sean Lester Diaries. This work involved adding metadata using Library of Congress Subject Headings (LCSH), an online thesaurus of subject headings which is a controlled vocabulary for subject indexing used at DCU Library and GeoNames for geographical coordinates.

This work occurred In the DRI Workspace where we used the editor tools menu to add temporal and spatial metadata. This work is shown in the video below.

Walkthrough 5: Aggregation to Europeana

We hope this guide may be of assistance to any DRI members preparing to publish their collections on the DRI and Europeana. Don’t hesitate to reach out to the brilliant DRI Team if you have any further questions and we’d both highly recommend doing the online DRI introductory training and join the upcoming workshop on Europeana for DRI Members on Wednesday, 20th September 2023.