Can you describe your thesis in two minutes?
Most people think historians spend all their time in the library reading books, and you wouldn't be far off, but recently the library has gotten too big. Way too big. And it's getting bigger at an alarming rate. That's because billions of records have been digitized and are now online. Historians are faced with far more material than they could ever hope to read in a lifetime, or even a hundred lifetimes.
My research looks at a pretty typical historical question: how were Irish immigrants to London, England, treated at the dawn of the industrial revolution? But instead of heading to the library, I'm heading to my computer to apply some of the best tricks of computer science to the task, namely distant reading. Distant reading basically means figuring out what something says without actually reading it. It's the type of classifying that Google does to help you find a recipe for an apple pie. Google hasn't read those webpages. Instead they've created a computer program that does it for them.
I'm doing the same thing, but instead of focusing on pies I'm asking questions such as, which documents refer to Irish people? Like Google, I've developed a set of computerized tests to determine if a document is relevant or if it's not. That automation is crucial when you're dealing with databases containing hundreds of millions of words of text.
But finding relevant material isn't all we can do in the age of the Internet. Computers have also allowed me to measure aspects of nineteenth-century life in which the Irish experience differed from that of a typical Londoner. For example, I can tell you that an Irish person was roughly four times more likely than an English counterpart to appear in a London court on trial for his or her life. There's no way we would ever have found that out without distant reading.
We live in a world in which information is overabundant, and managing it effectively can mean the difference between finding what you're after and getting lost in a jumble of data. There's too much to read out there, so it's time we found another way to do it. My name is Adam Crymble, I'm a student at King's College London in the United Kingdom, and the title of my thesis is "Understanding the Irish London Immigrant Experience through large-scale textual analysis, 1801 to 1820."
Big Data + Old History (posted by Afra on 2014/09/14)