The image dataset was developed as a way to compile and understand the visual material throughout the entire tuberculosis corpus (X.1.1; X.1.2). As of the completion of the dissertation, 3981 unique images had been described and entered into the dataset, covering a little less than half of the books in the full corpus. The dataset was developed by Sean Purcell with the help of data entry specialists Xavier Daniels and Lauren Sweany.
The dataset was developed for future researchers interested in the history of medicine and the digital health humanities. Creating the spreadsheet was also valuable for the dissertation itself, because it afforded a detailed examination of each image in the corpus, which in turn provided the contextual information necessary for the first and second chapters (0.2.1).
The protocol for creating the dataset is detailed in the next section (X.1.4). The goal was to describe the type of each image (whether it was a photograph, a graphic/illustration, or an x-ray) and to provide some overarching descriptors of what was imaged. For example, the descriptors record whether an image was pathological (2.2.2); architectural (1.2.3; 1.2.5); an advertisement; a museum exhibit; or a doctor’s portrait (1.4.1).
Some images were included in the dataset but otherwise removed from the larger collection of images because they depicted children in vulnerable positions. All of these carry the tag "explicit". I chose to remove those images early in the process of developing the image corpus, so they did not have a major influence on the observations made elsewhere in this dissertation.
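For researchers who want to work with the dataset programmatically, the description above can be sketched as a small record structure. This is a minimal illustration only, not the dataset's actual schema: the field names (`image_id`, `image_type`, `tags`), the identifiers, and the three sample rows are all hypothetical and invented for the example. Only the three image types and the "explicit" tag come from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum


class ImageType(Enum):
    # The three image types named in the dataset description.
    PHOTOGRAPH = "photograph"
    GRAPHIC = "graphic/illustration"
    XRAY = "x-ray"


@dataclass
class ImageRecord:
    image_id: str                           # hypothetical identifier
    image_type: ImageType
    tags: set = field(default_factory=set)  # descriptors, e.g. "pathological"


def displayable(records):
    """Return only the records that do not carry the 'explicit' tag."""
    return [r for r in records if "explicit" not in r.tags]


# Invented sample rows, for illustration only.
records = [
    ImageRecord("img-0001", ImageType.PHOTOGRAPH, {"architectural"}),
    ImageRecord("img-0002", ImageType.XRAY, {"pathological"}),
    ImageRecord("img-0003", ImageType.PHOTOGRAPH, {"explicit"}),
]

visible = displayable(records)
print([r.image_id for r in visible])
```

Storing the descriptors as a set of tags rather than as fixed columns is an assumption made here for brevity; it lets one image carry several descriptors at once and makes filtering on a single tag, such as "explicit", a one-line check.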
Sean Purcell, 2023-2025. Community-Archive Jekyll Theme by Kalani Craig is licensed under CC BY-NC-SA 4.0. Framework: Foundation 6.