Publication: The Paper of Record Meets an Ephemeral Web: An Examination of Linkrot and Content Drift within The New York Times
Date
2021-04-26
Published Version
Publisher
Library Innovation Lab, Harvard Law School
The Harvard community has made this article openly available.
Citation
Zittrain, Jonathan, John Bowers, and Clare Stanton. 2021. "The Paper of Record Meets an Ephemeral Web: An Examination of Linkrot and Content Drift within The New York Times." Library Innovation Lab, Harvard Law School.
Abstract
Hyperlinks are a powerful tool for journalists and their readers. Diving deep into the context of an article is just a click away. But hyperlinks are a double-edged sword; for all of the internet’s boundlessness, what’s found on the web can also be modified, moved, or removed entirely. This often-irreversible decay of web content is commonly known as linkrot. It comes with a related problem, content drift: the often-unannounced changes (retractions, additions, replacements) to the content at a particular URL.
Our team of researchers at Harvard Law School has undertaken a project to gain insight into the extent and characteristics of journalistic linkrot and content drift. Working from a dataset provided to us by the Times, we examined hyperlinks in New York Times articles from the launch of the Times website in 1996 through mid-2019. We focus on the Times not because it is an influential publication whose archives are often used to help form a historical record. Rather, the substantial linkrot and content drift we find across the New York Times corpus accurately reflect the inherent difficulties of long-term linking to pieces of a volatile web.
Results show a near-linear increase in linkrot over time, with distinct patterns emerging within certain sections of the paper or across top-level domains. Over half of the articles containing at least one URL also contained a dead link. Additionally, among the ostensibly “healthy” links in articles, a hand review revealed further erosion of citations through content drift.
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service