Select one of the articles behind the links above (and/or in the exercises) to annotate, asking yourself, ‘who benefits from this? who is hurt from this?’. Make an entry in your blog on this theme.
I found Michelle Moravec’s blog post “Corpus Linguistics for Historians” the most interesting. I appreciated how she explicitly set out her main reasons for using corpus linguistics, then briefly explained which tools she used and why she chose them. She then gave visual examples of each tool and explained how she used it and what she was thinking at each step.
I found this a good example of how historians can practice principles of digital history, such as being “open source”, while also producing an educational and easy-to-follow document. I think that anyone interested in using any of the tools she identifies would benefit from her post. Her “in-process” visuals, and her explanations of what is happening and why, help the reader follow the text while also critically engaging with the tools and methods being presented. From this perspective, I do not see how anyone could be harmed.
However, the situation gets more ambiguous when we look at the examples she provides and the way she uses the tools. By viewing files based on word frequency, she assumes that the words used most often are the most important ones. This potentially misses the few instances where a rarely used word carries more weight. Depending on how often this happened (which is really impossible to know), a vital piece of her argument may have been left out. This could affect how people use and build on her work. However, the fact that she is aware of how and why she focused on particular words or clusters (her methodology), and presents this in narrative form to the reader, is beneficial.
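To make the concern concrete, here is a minimal sketch of a frequency ranking. The sample text and the `top_words` helper are invented for illustration; they are not drawn from Moravec’s corpus or from the tools she used. The point is only that a word appearing once never reaches a “top words” list, whatever weight it carries in context.

```python
from collections import Counter
import re

def top_words(text, n=3):
    """Return the n most frequent lowercase word tokens in text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens).most_common(n)

# Invented sample text (an assumption for this sketch).
sample = (
    "vote vote vote vote rights rights rights "
    "women women convention suffrage betrayal"
)

ranked = top_words(sample, n=3)
print(ranked)

# A word used only once is absent from the ranking entirely.
print("betrayal" in dict(ranked))
```

A reader who only looks at the ranked list would never see “betrayal”, which is why a frequency view works better as a starting point for close reading than as a replacement for it.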
Her use of the cluster feature is troubling for the same reason. While it seems logical to assume that frequency equates to importance, this is an assumption particular to North Americans. We cannot be sure that others hold it, nor that the people who created the program intended it.
The fact that the author presents her post as an “exploration” can be helpful or harmful, depending on how familiar the reader is with history, particularly digital history, and how dedicated they are to good practice and methodology (as explored in the previous module). For example, when the author notes that the densest file is once again Stanton’s and that she “[continues] exploring”, she shows both the strengths and the weaknesses of digital tools such as these.
One strength of this type of analysis is that it allows for rapid and easy manipulation of data, letting the user quickly see whether an idea or theory is feasible. For example, a single CSV file can produce many different types of charts using multiple variables. A user can find an interesting trend and quickly follow it through, checking it against the data or cross-referencing it with another variable. The weakness comes from the same source as the strength: it is the user who decides how the data is viewed and what it is compared against. Someone who is not familiar with the ethics and methodology of good digital history can easily work from their biases and produce potentially misleading results. This could harm their reputation and the scholarship at large.
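As a rough sketch of how much the chosen view shapes the story, the snippet below sums the same small CSV table by two different columns. Every row, the `word_count` column, and the `total_by` helper are invented for illustration (only the name Stanton comes from the post), so the numbers mean nothing beyond the example.

```python
import csv
import io
from collections import defaultdict

# Invented rows for illustration only (an assumption for this sketch).
RAW = """author,year,word_count
Stanton,1868,1200
Stanton,1869,800
Anthony,1868,500
Anthony,1869,900
"""

def total_by(column, raw=RAW):
    """Sum word_count for each distinct value of the chosen column."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(raw)):
        totals[row[column]] += int(row["word_count"])
    return dict(totals)

by_author = total_by("author")  # one author has a clearly larger total
by_year = total_by("year")      # the two years come out equal

print(by_author)
print(by_year)
```

Grouped by author, one writer dominates; grouped by year, nothing stands out. Both views are accurate, and the user decides which one ends up in the chart.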
Ultimately, as the author writes, these tools are useful because they allow for broader comparison of “patterns and shifts over time and space”. These practices can also be harmful if not combined with traditional academic methods, a point the author also makes when she argues that these tools are a good step before beginning close readings.
In conclusion, I think the same tools can be both beneficial and harmful. It is not the tool that has a negative or positive value, but the people who use it and the methods they apply. That is why this class is about learning how to use the tools responsibly and helpfully to contribute to history at large. Making vast quantities of data easier to use poses the same level of harm to the unsuspecting user as a casual scan of library shelves or reliance on a single archival source. For those inclined to do poor history, these tools will allow them to continue; for those concerned about methodology, these tools help broaden the data they can explore.