In the first of a short series of blog posts relating to the DRI’s Batch Metadata Template, we define the concepts of metadata and metadata standards. We then move on to introduce the DRI Batch Metadata Template, and how it can be used to input bulk metadata.
—
Funders are increasingly asking researchers to submit data management plans, either as part of their funding application or shortly after they begin their research. One topic that often causes researchers difficulty is metadata. We are asked what metadata standards we will use, and how we will capture, create and manage our metadata. Is there a discipline-specific standard that we should know about? We also have to explain how our research can be shared and re-used. The DRI has developed a Batch Metadata Template to help researchers generate useful metadata that will enable their data to be archived and shared.
First things first. What is metadata?
Metadata is information that describes an object. The metadata we encounter most regularly is a library catalogue record. This record tells us the title of a book, the name of the author, the publisher and when it was published. If you are holding a book in your hand, you can flick to the first few pages and you will find all this metadata recorded in the book. The advantage of material objects is that the metadata can be embedded within the object. Metadata has become an important topic for researchers in the last few years as we are moving, more and more, towards producing digital objects – Word documents, sound files, JPEGs of photographs. The difficulty with digital objects is that they don’t necessarily have any information attached to them that tells you what the object is. We have all come across documents on our laptops with no information on who wrote them or when, or photographs whose only information is a vague numerical name. Increasingly, it is important for researchers to be diligent and methodical about recording metadata for the data they generate.
And, a ‘metadata standard’?
A metadata standard is an international agreement about what information should be recorded. There are hundreds of standards. Almost all standards will ask for basic information such as the name of the object and when it was created. After that, the standards diverge depending on the specific type of data that they are used to describe. That is, different domains or disciplines might decide they need to capture more information on an object – libraries may want to ensure that the version number of a book is recorded, while geographers will want geographical location data noted.
Introducing the DRI Batch Metadata Template
The DRI Batch Metadata Template is based on one of the simplest standards: Qualified Dublin Core. Dublin Core was born in 1995 at a meeting in Dublin, Ohio. With the rise of the internet, a group of librarians recognised the need for a very simple list of terms that could be used to describe digital data. The standard outlines 15 pieces of information you should try to record about your data. Qualified Dublin Core, introduced in 2008, built on these 15 to allow more detailed description. Metadata standards always face a tension: adding more terms makes the standard more precise in describing an object, but it also makes a record take longer to fill in, and so makes the standard less likely to be used.
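To make the 15 pieces of information concrete, here is a minimal Python sketch that lists the element names of the Dublin Core Metadata Element Set and checks which of them a record leaves empty. The sample record and the helper name `missing_elements` are invented for illustration; they are not part of Dublin Core or of the DRI template.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]


def missing_elements(record):
    """Return the Dublin Core elements that are absent or blank in record."""
    return [
        name for name in DUBLIN_CORE_ELEMENTS
        if not record.get(name, "").strip()
    ]


# Hypothetical record describing a single photograph.
photo = {
    "title": "Harvest day at the family farm",
    "creator": "A. Researcher",
    "date": "1987-08-14",
    "type": "Image",
    "format": "image/jpeg",
    "rights": "CC BY 4.0",
}

# Lists the elements still to be filled in, in the order given above.
print(missing_elements(photo))
```

A record does not have to fill in all 15 elements to be useful, so a check like this is best read as a reminder of what could still be recorded rather than as a validation rule.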
Why use a standard at all? Why not just pick terms that best suit your own data? The advantage of using a widely shared standard is that, if you deposit your data in a repository or a library or share it online, the metadata you use will be similar to that used in other projects. This means it will be easier for other researchers both to find your research and to make useful connections with work completed elsewhere.
The DRI Batch Metadata Template is a spreadsheet that can be opened in Excel or OpenOffice. Each object has a row to itself and the columns outline the metadata required for each object. The template also outlines which metadata is necessary for ingest into the DRI repository, should you want to do this at a later date. It also gives examples showing how other researchers have filled in the fields within DRI.
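The one-row-per-object layout can also be produced and checked with a short script. The sketch below, in Python, writes a small sheet as CSV and reports which rows leave a required column empty. The column names, the choice of required columns and the sample rows are all assumptions made for illustration; the actual columns and mandatory fields are the ones set out in the DRI template itself.

```python
import csv
import io

# Hypothetical column layout; the real template defines its own columns.
COLUMNS = ["filename", "title", "creator", "date", "description", "rights"]
# Assumed for illustration only; check the template for the real mandatory fields.
REQUIRED = ["title", "creator", "date", "rights"]


def build_sheet(objects):
    """Write one row per object and return the sheet as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(objects)
    return buffer.getvalue()


def rows_with_gaps(csv_text):
    """Map each filename to the required columns its row leaves empty."""
    gaps = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        empty = [col for col in REQUIRED if not (row.get(col) or "").strip()]
        if empty:
            gaps[row["filename"]] = empty
    return gaps


# Invented sample data: the second row is deliberately incomplete.
objects = [
    {"filename": "IMG_0001.jpg", "title": "Harvest day",
     "creator": "A. Researcher", "date": "1987-08-14",
     "description": "Family gathering hay", "rights": "CC BY 4.0"},
    {"filename": "IMG_0002.jpg", "title": "",
     "creator": "A. Researcher", "date": "",
     "description": "", "rights": "CC BY 4.0"},
]

sheet = build_sheet(objects)
print(rows_with_gaps(sheet))
```

Running a check like this before depositing makes it easier to spot objects, such as a photograph known only by a numerical filename, that still lack a title or date.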
The Batch Metadata Template can be found here: Digital Repository of Ireland Batch Metadata Template.
Further Resources:
DCC: List of Metadata Standards
Digital Repository of Ireland: Guide to Research Data Management Plans
Digital Repository of Ireland: Factsheet No. 1: Metadata and the DRI
Digital Repository of Ireland, Metadata Guidelines: Simple Dublin Core and the Digital Repository of Ireland
Digital Repository of Ireland, Metadata Guidelines: Qualified Dublin Core and the Digital Repository of Ireland
Digital Repository of Ireland, Metadata Guidelines: MARC21 encoded as MARCXML and the Digital Repository of Ireland
Digital Repository of Ireland, Metadata Guidelines: EAD, ISAD(G) and the Digital Repository of Ireland
Digital Repository of Ireland, Metadata Guidelines: MODS and the Digital Repository of Ireland
The Data Documentation Initiative (DDI): https://www.ddialliance.org/
Data Curation Centre Briefing Papers: What are Metadata Standards?
University of Cambridge Research Data Management: Electronic Lab Notebooks
Tools and formats for working with metadata embedded directly in files:
JHOVE: http://hul.harvard.edu/jhove/
XMP: https://www.pdflib.com/knowledge-base/xmp-metadata/
Exif: http://www.exif.org/
[Main image: Photo by Filiberto Santillán on Unsplash]