Musings on Digital History

Databases and Digitization: Using Online Tools for Historical Research


Things I Wish I Learned Earlier In My Research Career.

Open Source Research Management Tools

Zotero Citation App

Since I began GMU’s PhD program two weeks ago, I have been trying to establish good habits and organized workflows for my dissertation research. During my master’s, I experienced the horrors of not having a good system for keeping track of my research materials, and I am really trying to avoid a repeat of that in grad school, part 2. About a year ago, when I decided to focus my research on Instrumental Women, I started using Zotero, a citation manager for secondary sources. Zotero allows you to organize your source materials by folder and subfolder, and it generates footnotes, endnotes, and bibliographies in your preferred style. It also has a plug-in for your web browser so that you can add citations directly from the web. It has had a life-changing impact on my research. I am still glad to know how to write individual footnote citations and bibliographic entries in Chicago Style (thanks, Dr. Norton!), but Zotero is a powerful tool for keeping track of everything in a way that actually makes sense. I started learning how to use Zotero from my husband Jeff, who uses it for his scientific publications, but it wasn’t until recently that I learned it was developed by George Mason University’s Roy Rosenzweig Center for History and New Media.

Tropy Archival Material Organizer

This week, I learned about the wonders of Tropy, a research app for organizing digitized archival photos and documents. I have always had an incredibly frustrating relationship with the photos I have taken in archival collections. What should I name this photo? Can I combine these photos without turning them into a PDF? Where do I keep the metadata for these photos? Most importantly, which box and folder did the damn thing come from, so that I can properly cite it? Luckily for me, the Rosenzweig Center has me covered. Tropy (est. 2017) solves these problems.

Here is a screenshot of the interface. I used a set of photos I have taken of instruments made for women by men. The image below came from the Smithsonian Archives of American Art, and I used it for a blog post they asked me to write about my research. I was able to add dates and box and folder numbers, and even save a zoomed-in shot of an important part of this advertisement:

Last year, I was hired as a proxy researcher at the Archives of American Art for a client in England. I took thousands of photos, and I really wish I had had Tropy to use! Tropy allows the user to group items and also to do batch edits. For example, you can add “Box 10” to hundreds of photos at one time. How cool is that? Note to self: do not delete the photos you are using in Tropy. The app needs to communicate with your computer’s storage, so the images don’t “live” in the app. I learned this after furiously searching for a photo I had added a few days before.

OpenRefine: Cleaning Up Messy Data

As a museum cataloger, I have been tasked with database management, cross-checking data against approved style guides, and grooming object records for consistency. Generating an Excel spreadsheet of over 1,000 objects to groom can be an incredibly tedious process. Over the years, catalogers have used different date naming conventions (19th Century/1800s/1800-1900/c.1800-1900/about 1800-1900). When you are trying to generate statistics about the make-up of a collection, these seemingly innocuous date names can be time-consuming and frustrating to fix. OpenRefine is data grooming software that is far more agile than a spreadsheet and better equipped to groom large amounts of data quickly. While it definitely has a learning curve, OpenRefine lets the user separate names into multiple columns (more easily than Excel does for names made up of multiple words, hyphenations, etc.). OpenRefine also lets the user zero in on subsets of data to groom in smaller batches. I am looking forward to using this software on my Instrumental Women database, which needs some serious grooming!
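To make the date problem concrete, here is a minimal sketch, written in Python rather than in OpenRefine itself, of what collapsing those five date conventions into one label could look like. The function name normalize_date_label and the choice of a “1800-1900” style as the canonical form are my own assumptions for illustration, as is reading “1800s” as the whole century rather than a decade. OpenRefine does this kind of cleanup through its own interface, so this shows the idea and not how the software works internally.

```python
import re
from collections import Counter


def normalize_date_label(label: str) -> str:
    """Collapse common century-style date labels into a 'YYYY-YYYY' range.

    Hypothetical helper for illustration; labels it does not recognize
    are returned unchanged (apart from trimming whitespace).
    """
    text = label.strip().lower()
    # Drop approximation markers such as "c.", "ca.", "circa", "about".
    text = re.sub(r"^(c\.|ca\.|circa|about)\s*", "", text)

    # "19th century" -> "1800-1900"
    match = re.fullmatch(r"(\d{1,2})(st|nd|rd|th) century", text)
    if match:
        start = (int(match.group(1)) - 1) * 100
        return f"{start}-{start + 100}"

    # "1800s" -> "1800-1900" (assumes the label means the whole century)
    match = re.fullmatch(r"(\d{2})00s", text)
    if match:
        start = int(match.group(1)) * 100
        return f"{start}-{start + 100}"

    # "1800-1900" or "1800 – 1900" -> "1800-1900"
    match = re.fullmatch(r"(\d{4})\s*[-–]\s*(\d{4})", text)
    if match:
        return f"{match.group(1)}-{match.group(2)}"

    return label.strip()


if __name__ == "__main__":
    labels = ["19th Century", "1800s", "1800-1900", "c.1800-1900", "about 1800-1900"]
    # Count how many records fall under each cleaned-up label.
    print(Counter(normalize_date_label(label) for label in labels))
```

Once the labels agree, counting objects per period becomes a one-line tally instead of an afternoon of find-and-replace. Anything the function does not recognize is left alone, so odd entries still get a human look.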

Admittedly, I was hesitant and nervous about the tutorial on OpenRefine. When I hear the term data, I think math. I guess I never considered my research points to be data points. But now that I am rethinking how I look at my materials, I am interested in data analysis and visualization. Christof Schöch states that as historians, “we need smart big data because it can not only adequately represent a sufficient number of relevant features of humanistic objects of inquiry to enable the level of precision and nuance scholars in the humanities need, but it can also provide us with a sufficient amount of data to enable quantitative methods of inquiry that help us transgress the limitations inherent in methods based on close reading strategies. To put it in a nutshell: only smart big data enables intelligent quantitative methods.”1 Henceforth, I am going to seriously consider the information I gather and its potential uses in both qualitative and quantitative ways.

Open Source = Free!

I must mention that all of these tools are open source and free. Last week’s blog looked at how digital history has the potential to democratize historical research; open source tools like Zotero, Tropy, and OpenRefine are likewise democratizing research methods and data organization.

Visualizing, Organizing, and Telling History Online

As I embark on later phases of my Instrumental Women Project, I have been thinking about how my database of living makers and my timeline of women in the industry could be shared online. I have been inspired by the interface and interactivity of several DH projects, including the Trans-Atlantic Slave Trade Database and Old Bailey Online. These projects allow the user to sort and search through data in a very user-friendly way. As I move forward, I want to figure out how best to represent my work, and I think I will be returning to these sites for inspiration.

Digitization and the Big Picture

Just this week, in Professor O’Malley’s course on the Gilded Age and the Progressive Era, I was introduced to the idea of Transnational History. Lara Putnam writes that the shift in how historians define their fields (i.e., away from geographically bounded history as a specific course of study) is happening simultaneously with the rise of the digital age. She says that “source digitization has transformed historians’ practice in ways that facilitate border-crossing research in particular.”2 Simply put, increased digital access to sources and materials from various repositories, coupled with our increased awareness of globalization and its influence, is potentially making studies like “American History” obsolete. Even more basically, we all study world history: American history, like any other, is shaped by global influences. Before the digital age, a geographical boundary made pragmatic sense, since “the real-world geography of textual sources used to define our work. Information in physical form…tends to cluster in administrative centers near where it was produced.”2 She warns, however, that “Meanwhile, we are going to have to work actively so those systematically less present in printed sources do not fall out of view. Size up the absence. Who wasn’t publishing papers or pamphlets, or wasn’t reading them, or was far from the people who did? Rural people, illiterate people, people who stayed put: all stand in the shadows that digitized sources cast.”2

Jonathan Blaney and Judith Siefring address the strange bias that exists toward the printed word when citing sources. Without even thinking about it, I too have succumbed to hunting down the page number of an online article so that I can use the hard-copy citation. Why? Maybe because it looks cleaner, with fewer blue hyperlinks. Blaney and Siefring posit that “change of practice at individual scholarly level reflects and promotes change at a wider cultural level. As more and more established academics are open about their use of online resources, the belief that digital content is less scholarly should lessen, and citation should improve…Digital citation is important because it is a reflection of how digital resources are valued.”3


I’ll be brief. This week I learned about some seriously cool tools for research, along with some great contextual literature that lets me think critically about the responsibility I have for the data I collect.

Alas, I am tired. See y’all next week.

  1. Christof Schöch, “Big? Smart? Clean? Messy? Data in the Humanities,” Journal of Digital Humanities 2, no. 3 (2013).
  2. Lara Putnam, “The Transnational and the Text Searchable: Digitized Sources and the Shadows They Cast,” American Historical Review 121, no. 2 (April 2016): 377-402.
  3. Jonathan Blaney and Judith Siefring, “A Culture of non-citation: Assessing the digital impact of British History Online and the Early English Books Online Text Creation Partnership,” Digital Humanities Quarterly 11, no. 1 (2017).

3 replies on “Databases and Digitization: Using Online Tools for Historical Research”

Jayme, you raise an interesting point concerning whether the study of American history will become obsolete due to the availability of digital sources and transnational study. It’s hard to imagine it becoming obsolete, but the study will definitely become more complex as more of those stories are included. I read a transnational book for a class this summer, What Soldiers Do by Mary Louise Roberts. While it’s a recent publication, at a quick glance it doesn’t appear to use digital sources, even though it seems like a lot of the army newspapers that were consulted could possibly be digitized. It does, however, cite numerous archives in France and the US. At the very least, the study of GI behavior is definitely grounded in a transnational approach and complicates our understanding of the GI during WWII.

That book sounds interesting! To be honest, I don’t think American History is going away. I think it is just interesting to see how our historical research is becoming even more interdisciplinary. When I teach ethnomusicology courses (the study of musical cultures), there is often an east vs. west dichotomy, and I often remind my students that we are always east AND west of something and that these directional monikers are constructions. To me, what is most interesting is seeing where things intersect, and acknowledging that everything is global at this point.

“Digital citation is important because it is a reflection of how digital resources are valued.”

That’s a great sentence to add to your blog, and it reminds me of how differently generations feel about digital materials online. I remember that during undergrad, one professor specifically asked us *not* to use digital archives to find materials for our Research Seminar capstone class. He wanted us to experience the very essence of real, in-person research in the local archives, as if there were a difference in how sources are found and located, even though they are ultimately the same thing at the end of the project. I always wondered why references from physical archives were looked at so differently from those available online.

As noted in that article, I wonder if older generations worried that all web-based sources could be manipulated. Even though reputable libraries store these materials, they get put in the same category as Wikipedia. Students might also *lose* the experience of working in the archives.

I remember that in that final course in undergrad, I defied his orders and did more research online than in person. While researching a WW1 battalion that primarily originated from Macon, Georgia, I found that the only materials collected could be found at the University of Nebraska. The only materials I could find in the local library archives were newspapers on microfilm. All of this is to say that the access provided by UNL’s archives gave me opportunities I would have missed if I had only collected sources from the local, physical archive.
