Strategies for Creating Open Digital Projects with Paywalled Sources

October 19th 2016 @ 12:00pm to 1:00pm
Hazelbaker Hall (E159)
Herman B Wells Library
1320 E 10th St
Bloomington, IN 47405
Kalani Craig, Department of History
Digital history projects bring with them visions of freely accessible data sets, clean and ready to be adapted from one project for use in another. The reality of the data landscape for digital humanists is much more complicated. Many digitized, transcribed sources available for use in text mining are institutionally owned and often paywalled as a result. Additionally, archives often place limitations on the transcription results of their archived documents. These copyrighted digitized texts are still valuable targets for text mining, hGIS and network-theoretical approaches, but it can be difficult to provide results that meet open-data best practices within the constraints of the original copyrights.
The historical analysis of memory building in medieval saints' lives at the core of this paper is built on a foundation of data curation oriented toward the preservation of citation data, which is often lost or obscured as we work with topic modeling and natural-language processing. The paper will use this memory-building argument as a case study to demonstrate a process for scraping, cleaning and importing paywalled sources, and then adding a layer of natural language processing that preserves the citation data scholars need to participate in a historiographic debate.
A set of online documentation and resources for replicating the process will accompany the paper.
This presentation is part of the ongoing Digital Library Brown Bag Series. Follow and contribute to the presentations and discussions on Twitter: #dlbb.

Fall 2016 Digital Library Brown Bag Schedule

Programs will be held from 12:00 pm to 1:00 pm EST in the Herman B Wells Library in Room E159 (Hazelbaker Hall in the Scholars' Commons).

Remote Access to the Brown Bag

This semester's Digital Library Brown Bag series will be available for remote access via the Web, unless otherwise specified. Anyone may log in; you do not need to be an IU affiliate.

Presentation slides and audio will be available via the Adobe Connect Meeting Service. Go to to view and listen to the presentation. If you are not a registered user for Connect Meeting/Breeze, select the "Enter as a Guest" option.

Sign up for email reminders! Send an email to with the message body: sub dl-brownbag-l Your Full Name




Contact Info

Wells Library W501
1320 East Tenth Street
Indiana University
Bloomington, IN 47405
Michelle Dalmau - Head, Digital Collections Services, Associate Librarian
(812) 855-1261