
New Paper on Preserving Data Journalism … and a few thoughts about the challenges of Open Access

by Natalie Harrower and Kathryn Cassidy 

We’re excited to announce the publication of a new paper -- a literature review on the digital preservation of data journalism, with a particular focus on the challenges of preserving data visualisations. Along with our co-authors Dr. Bahareh Heravi at University College Dublin and Edie Davis at Trinity College Dublin, we hope that this will make a significant contribution to the fields of digital preservation and journalism studies, and mark a step forward in preserving data journalism for the longer term.

On the digital preservation side, which is the ‘thing we do’ at DRI, the paper builds on the work of many digital preservationists, reviews the literature from adjacent fields (because there is not a lot specifically on preserving data journalism), and highlights the challenges facing the preservation of complex digital objects that are too often stored in locations or built on applications that are not sustainable. It discusses issues familiar to those working in digital preservation and also identifies gaps in how digital preservation plays out (or doesn’t) in existing journalistic workflows. A key issue is that the archiving systems of most newspapers were built with print copy as the model and need to be adapted to the new digital-first reality, which involves multiple dependencies, reliance on third-party platforms, and so on. Without intervention, a significant portion of data journalism - and therefore a significant portion of contemporary journalism history - will be lost. We offer some recommendations about what needs to happen to make these outputs and workflows more sustainable, and we also provide suggestions for immediate actions that can be taken until systemic solutions are adopted. The paper is available, Open Access, on the Journalism Practice website of the publisher, Taylor & Francis.

The paper is also available as a preprint on our colleague Dr. Heravi’s website. Why a preprint version if the journal version is open access, you may ask? Well, there is a story behind this that involves several twists and turns! There was a fairly involved decision-making process behind the publication of this article, and it wasn’t without its challenges. We know that many scholars face these same challenges when aiming to get their research in front of those who would benefit from it most, so we thought we would share our process in this blog, in the spirit of keeping the dialogue open.

We said above that digital preservation is the ‘thing we do’ at DRI, but we also do a number of other things, and for the last number of years one of our key areas of focus has been supporting the transition to Open Science. We contribute in a number of ways to the work of the National Open Research Forum, we serve on national and international expert and advisory groups related to data management, digital archiving, repository sustainability and digital humanities, we are the national aggregator for the open culture network Europeana, and we do a lot of education and advocacy around FAIR data in research and for GLAM institutions. Importantly, we have also taken steps to ensure that our repository is certified, that it enables the FAIR principles, and that we can support researchers in making their research outputs FAIR. Our focus has been on data management, but we also advocate for the wider aspects of Open Science, which include the movement to make all outputs of publicly funded research Open Access.

Journalism Practice, like most peer reviewed journals currently, is not open access by default. So… why did we send our article there? Well, the short answer is that it was the journal we thought would best reach the audience who needs to read this work — journalism scholars working on these issues — and because we planned to put an immediate open access preprint in the DRI Publications collection. Great journal, Green Open Access, mission accomplished.

…and then we realised that Journalism Practice allows you to publish a preprint on a personal website immediately, but has an 18-month embargo period on publishing preprints in repositories. This was a bit of a surprise, because Taylor & Francis’s Archives and Library & Information Science journals are immediate Green OA. But not Journalism Practice. Taylor & Francis also have a number of Open Access agreements by country and institution. But at the point of our submission, in March 2021, there was no agreement with Ireland, nor with any of our institutions.

So after a year of work and several gruelling rounds of revisions, we were left with two options: pay the Open Access article processing charge (APC), calculated at £2,650, for immediate Gold Open Access, or don’t pay the Open Access fee and place it on a personal website for that 18-month embargo period. For us, the fee was not impossible to pay, but would have a noticeable effect on our very modest research budget; for others, this cost may be entirely prohibitive. The ‘personal website’ option was not the way we had envisioned making the paper immediately available, and we knew it was not the most desirable outcome. However, after considerable discussion, this was the option we chose. And here is why:

The world of scholarly publishing is undergoing a massive transformation. At the national level, a number of agreements have been reached since January 2020 between IReL (Irish Research e-Library) and publishers -- some with major publishers, some for a subset of journals by a particular publisher, and so on. See the IReL website for a list of facilitated agreements. At DRI’s headquarters, the Royal Irish Academy, the Dictionary of Irish Biography recently opened its entire catalogue online, and earlier this year an OA agreement was signed to publish without APCs in the RIA’s six journals. This kind of development might seem like a coup for researchers, but it’s only one step in a much larger, longer, concerted effort to transform scholarly publication practices, and to ensure that research funders don’t pay for research twice: first by funding the research, and then by funding journal access charges or OA charges. That effort is still underway, sustained by the work of librarians, researchers, research support staff, funders, open scholarship organisations, and governments and alliances around the world.

We considered paying the fee so that the article would appear on the journal’s site, free and open, from the moment of its publication. But we decided against that on principle, for two reasons. First, we did not want to support a system that is in the process of being transformed. Second, the end goal of ensuring that the paper could be read immediately and freely by anyone interested could still be achieved through preprint publication on the lead author’s website. Because the lead author is also a leading scholar of data journalism, we felt assured that the publication would reach the intended audiences through this method.

Decisions made, we wrote this blog, and just as we were checking all links and getting ready to publish it, IReL announced that a read and publish agreement had been reached with Taylor & Francis! The agreement is for 3 years going forward, but importantly, it included this clause:

For articles accepted between 1 January 2021 and 30 March 2021: eligible articles approved by a participating institution will be made open access retrospectively as part of the agreement, with the exception of articles already published OA.

Our article was eligible, and suddenly the decision to not pay the fee seemed wiser than we could have possibly anticipated!


Read the article here:

Bahareh Heravi, Kathryn Cassidy, Edie Davis & Natalie Harrower (2021) Preserving Data Journalism: A Systematic Literature Review, Journalism Practice, DOI: 10.1080/17512786.2021.1903972