Readings
- Judith H. Dobrzynski, “Modernizing Art History,” Wall Street Journal, April 28, 2014, sec. Life and Style, http://www.wsj.com/news/articles/SB10001424052702304518704579519632304010744
- Rob Kitchin, “Conceptualising Data,” in The Data Revolution (Thousand Oaks, CA: SAGE Publications Ltd, 2014).
- Matthew D. Lincoln, “Foreign and Domestic Interaction in the Early Modern Printmaking Network,” Matthew Lincoln, October 17, 2014, http://matthewlincoln.net/2014/10/17/foreign-and-domestic-interaction-in-the-early-modern-printmaking-network.html
- Trevor Owens, “Defining Data for Humanists: Text, Artifact, Information or Evidence?,” Journal of Digital Humanities 1, no. 1 (March 16, 2012), http://journalofdigitalhumanities.org/1-1/defining-data-for-humanists-by-trevor-owens/
- Hadley Wickham, “Tidy Data,” Journal of Statistical Software 59, no. 10 (August 2014), http://www.jstatsoft.org/v59/i10/paper
Activities
Morning
- Discussion of data and readings
- Anatomy of textual data
- Cleaning textual data
- Demo Session: Bookworm and the Google Books Ngram Viewer
- Demo and Hands-on Session: Lexos and Voyant Tools
Afternoon
- Getting data from PDFs: using OCR and Tabula
- Anatomy of tabular data
- Cleaning tabular data
- Demo Session: Manipulating and cleaning with Excel and OpenRefine
- Hands-on Session: working with data using spreadsheets
- Sample datasets (via git fork/clone or zip download): https://github.com/robertss/getty-institute-data
Sites
- GitHub: https://github.com/
Tools
- OpenRefine: http://openrefine.org/download.html
- Voyant: http://voyant-tools.org
Extra Material
- Zotero Folder – Day 5 – Data and Text Analysis – Extra Material