Along the banks of the Allegheny River on a tepid September day in 2009, a college freshman decided to read the complete works of Henry David Thoreau.
Needless to say, he never reached his goal.
Reading the entire corpus of an author is pretty difficult. Not only for the sheer volume it contains, but also for the access it requires, with some books relegated to expensive collections. It’s also a question of utility: Why read an author’s entire oeuvre, when you’ll probably forget most of it?
But in digital humanities, the use of technology allows a range of new practices–new “reading” and analysis–that make this act a little more feasible. Franco Moretti’s “distant reading,” for example, can allow a scholar to sift through millions of texts, using different data-driven lenses to pry out patterns.
And while this ability to access large swaths of text is helpful in itself, technology can play with texts in other ways, highlighting certain words, collecting certain patterns, making visualizations. As Tanya Clement points out, such methodologies “defamiliarize texts, making them unrecognizable in a way (putting them at a distance) that helps scholars identify features they might not otherwise have seen.” This defamiliarizing lies at the heart of literary scholarship, finding new ways to understand texts.
But for now, I want to get back to my freshman self, sitting on the riverbank, reading an old library book of Thoreau.
Thoreau was a curious figure. Shirking the usual American work ethic, he took long solitary walks around Concord almost daily, even in bad weather. He also read everything from field guides to the Upanishads, taking full advantage of Harvard’s library when he studied there. One of his first major publications was a poetic guide to local flora and fauna around Concord, and one of his last projects–unfinished before his death–was a field guide for berries.
A once-jailed tax-dodger who bummed off his friends (occasionally), Thoreau was also ugly and irascible. As friend and author Nathaniel Hawthorne described him, “He is as ugly as sin, long-nosed, queer-mouthed, and with uncouth and somewhat rustic, although courteous manners.” Though, Hawthorne later adds, “his ugliness is of an honest and agreeable fashion, and becomes him much better than beauty.” Also, Thoreau barely left the pastoral forests around Concord, and when he did, as on his trips to Maine, the wilderness he found possibly overwhelmed him.
Thoreau was also a surveyor. While walking, he took in broad sweeps of the Concord countryside through the lens of his equipment, measuring and tabulating the fields and forests. But on such walks, he also noticed the smallest of details, like the “purple grass” he describes in “Autumnal Tints,” “puny cases” that he nearly misses, “difficult to detect” up close.
Thoreau smelled, heard, and tasted the landscape, describing with relish the sweetness of wild apples on the ground crystallized by frost. He measured the landscape. He classified it. He described it in poetry and taxonomies.
With the decline in book reading that Katherine Hayles cites in “How We Read” and the decline of “deep attention” (to use Hayles’ word) that many authors discuss, notably Nicholas Carr, questions of reading and what reading is or could be seem particularly pertinent. Here, “hyperreading” sticks out, especially with its “computer-assisted” element. Since hyperreading is not merely “speed reading” or “skimming” by default, though they may be connected, this technology-assisted element is a major extension, defamiliarizing the act of reading substantially.
Part of this change involves our attention. As Hayles writes, “hyperattention is useful for its flexibility in switching between different information streams, its quick grasp of the gist of material, and its ability to move rapidly among and between different kinds of texts” (72). Different from the sustained focus of deep attention, hyperattention is flexible and fast. This in itself makes it a tool, argues Hayles, and not some evil that we must eradicate, though deep reading has its place as well.
Beyond increasing hyperattention, technology allows us to play with reading further. Topic modelling, distant reading, data mining, working through interfaces to access thousands of texts–all of these acts are reading. So is the quantifying of word clusters. And the measuring of trends. More importantly, these interactions between human reader and machine reader can increase understanding, as Seth Long illustrates with his work with the Unabomber Manifesto.
Thoreau hammered relentlessly on his draft of Walden. It is a heavily wrought book, hardly the sort of effusive rambling on paper that one might imagine. He cut a lot out and refined the wording heavily.
Fortunately, you can see some of his editing in the fluid text edition, a project from the “Digital Thoreau,” “a resource and a community dedicated to promoting the deliberate reading of Thoreau’s works in new ways, ways that take advantage of technology to illuminate Thoreau’s creative process and facilitate thoughtful conversation about his words and ideas,” as their homepage notes.
Thoreau’s cross-outs and revisions open a window into his process. Like his journals, the edition is a guidebook of sorts, allowing one to survey Thoreau’s thinking and writing process. His ecology of writing, in a sense.
Like entering a pathless wood, one can enter the text as a database, equipped to study it. Perhaps one may get lost, but the shifted reading itself is new. On a wider scale, one can also use tools like searches to “read” texts in The Reader’s Thoreau by finding certain word patterns or points of interest, instead of following the narrative path.
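To make this kind of search-as-reading concrete, here is a minimal sketch in Python of a keyword-in-context listing, one simple way of finding a word pattern without following the narrative path. It is not how The Reader’s Thoreau or the fluid text edition works; the function name `keyword_in_context`, its `width` parameter, and the choice of a single sentence from Walden as sample text are my own assumptions for illustration.

```python
import re


def keyword_in_context(text, keyword, width=3):
    """Return (left, word, right) for every occurrence of `keyword`.

    `width` is the number of neighboring words kept on each side.
    Hypothetical helper for illustration, not part of any Thoreau project.
    """
    # Split the text into words, dropping punctuation.
    words = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            hits.append((left, word, right))
    return hits


# A single sentence from Walden, standing in for a full text.
sample = (
    "I went to the woods because I wished to live deliberately, "
    "to front only the essential facts of life, and see if I could "
    "not learn what it had to teach."
)

# Line up each hit so the keyword forms a column down the page.
for left, word, right in keyword_in_context(sample, "to"):
    print(f"{left:>25} [{word}] {right}")
```

Read down the column of hits rather than across the sentence, and the passage becomes a small pattern of repeated intentions instead of a narrative, which is the shift in reading I am describing.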
Here, I want to stress the malleability of the old word “reading.” None of what I’m saying is new. In fact, everything I’m saying is probably a bit dull to DH scholars. But I think stressing these methods as “reading” and not just researching offers a key point regarding scale. In other words, I see no difference between “searching” and “reading” in this context. As we change our literacy practices, our understanding of literacy-related terms must also change, including a romanticized view of reading.
A shifted view of reading may be more accurate for many of the tasks we do daily, ranging from database exploration to skimming articles online. A shifted view may also be more helpful for students who feel that to “read” for class involves certain set, inaccurate practices, not a situated process that may involve multiple agents and approaches. I continually encounter resistance to reading from students, having to walk them through “strategies” that they often feel reluctant to follow because they see them as lazy or as cheating.
Reading can be a romantic cover-to-cover experience–like the arch-Romantic image of my freshman self reading Thoreau under an oak. But reading can also be a frenetic scanning of titles and texts or a digitized data-cloud of words and associations.
While I was undertaking my reading of Thoreau, I took up his routine of walking for an hour or two every day. As a freshman with no close friends, I had plenty of time. Sometimes I walked the same paths, but I often diverged from them, picking my own way through deer trails and ridges. In rain and winter, I took in the Appalachian landscape.
I got used to its smells and rhythms, its sounds and locations. It was a dialogic knowledge, a daily communication between my senses and the space around me. Often on paths, but often not.
But also like Thoreau, I would “survey,” using Google Earth or local maps to quantify and measure the landscape. In depth. Often from a distance. I could see the way the river cut alongside the cornfields, and how one cornfield butted up against an alfalfa field, or the way oil drilling had left artificial ridges like grids weaving through the landscape in an emergent pattern. I suddenly saw the same landscape in a completely new way.
Much of what Matthew Jockers and Julia Flanders get at in “A Matter of Scale” speaks to this: using different levels of scale together to see new connections, exploring the text with as many tools as possible in strategic ways.
But what interests me about Thoreau and our own reading is that, in each case, the data are the “same” landscape, yet different. The database emerges as narratives and patterns in a co-creative sense, through human and machine, just as a landscape emerges through our experience of it, through environment and human. So we may have the same raw data, but a different “text,” pattern, or insight emerges depending on our scale and our tools. Texts and databases are dynamic and expressive interfaces through our co-interactions, not dead repositories we study.
As Thoreau says, “It’s not what you look at that matters, it’s what you see.” And, I would add, the tools that you use.