Aaron Beveridge
6 articles
-
Using Natural Language Processing to Rhetorically Contextualize Audiences: Vaccine Sentiment Analysis of Newspaper Comments, 2017–2023
Abstract
This article demonstrates the value of sentiment analysis for contextualizing audiences in Rhetoric of Health and Medicine (RHM) by comparing vaccine-related newspaper comments to non-vaccine-related comments in the New York Times from 2017–2023 (n = 22,330,999). Our results show that while all comments skew negative and follow a similar trend line, vaccine-related comments decouple from the negative trend of baseline non-vaccine comments after the emergence of COVID-19, becoming more negative and volatile. These results raise additional questions about the nature of the negativity in vaccine-related comments, and we provide a properly sampled dataset for follow-up research to encourage iterative investigation into the public response to vaccine policy. In addition to these findings, this article calls for broader engagement with Natural Language Processing (NLP) and data science in RHM.
-
Abstract
This article advocates for web scraping as an effective method to augment and enhance technical and professional communication (TPC) research practices. Web scraping is used to create consistently structured and well-sampled data sets about domains, communities, demographics, and topics of interest to TPC scholars. After providing an extended description of web scraping, the authors identify technical considerations of the method and provide practitioner narratives. They then give an overview of project-oriented web scraping. Finally, they discuss implications for the concept as a sustainable approach to developing web scraping methods for TPC research.
-
Hand Collecting and Coding Versus Data-Driven Methods in Technical and Professional Communication Research
Abstract
Background: Qualitative technical communication research often produces datasets that are too large to manage effectively with hand-coded approaches. Text-mining methods, used carefully, may uncover patterns and provide results for larger datasets that are more easily reproduced and scaled. Research questions: 1. To what degree can hand collection results be replicated by automated data collection? 2. To what degree can hand-coded results be replicated by machine coding? 3. What are the affordances and limitations of each method? Literature review: We introduce the stages of data collection and analysis that researchers typically discuss in the literature, and show how researchers in technical communication and other fields have discussed the affordances and limitations of hand collection and coding versus automated methods throughout each stage. Research methodology: We utilize an existing dataset that was hand-collected and hand-coded. We discuss the collection and coding processes, and demonstrate how they might be replicated with web scraping and machine coding. Results/discussion: We found that web scraping demonstrated an obvious advantage of automated data collection: speed. Machine coding was able to provide comparable outputs to hand coding for certain types of data; for more nuanced and verbally complex data, machine coding was less useful and less reliable. Conclusions: Our findings highlight the importance of considering the context of a particular project when weighing the affordances and limitations of hand collecting and coding over automated approaches. Ultimately, a mixed-methods approach that relies on a combination of hand coding and automated coding should prove to be the most productive for current and future kinds of technical communication work, in which close attention to the nuances of language is critical, but in which processing large amounts of data would yield significant benefits as well.
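As a rough illustration of the second research question (how far hand-coded results can be replicated by machine coding), the sketch below compares two sets of labels using Cohen's kappa, a common chance-corrected agreement measure. It is not the authors' method: the abstract does not say which agreement statistic or machine coder was used, and the `keyword_code` function, its keyword lists, and the label names are hypothetical stand-ins.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Proportion of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout; agreement is trivial.
        return 1.0
    return (observed - expected) / (1 - expected)


def keyword_code(text, keywords):
    """Hypothetical machine coder: first label whose keyword appears, else 'other'."""
    lowered = text.lower()
    for label, words in keywords.items():
        if any(word in lowered for word in words):
            return label
    return "other"


if __name__ == "__main__":
    # Invented example data, for illustration only.
    keywords = {"request": ["please", "could you"], "thanks": ["thank"]}
    texts = ["Please send the file", "Thank you so much", "See attached"]
    hand_codes = ["request", "thanks", "other"]
    machine_codes = [keyword_code(t, keywords) for t in texts]
    print(cohen_kappa(hand_codes, machine_codes))
```

A kappa near 1 would suggest the machine coder reproduces the hand codes closely, while values near 0 indicate agreement no better than chance. That pattern would be consistent with the abstract's finding that machine coding works for some data types but not for more nuanced ones.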
-
Abstract
As multimodal writing continues to shift and expand in the era of Big Data, writing studies must confront the new challenges and possibilities emerging from data mining, data visualization, and data-driven arguments. Often collected under the broad banner of data literacy, students’ experiences of data visualization and data-driven arguments are far more diverse than the phrase data literacy suggests. Whether it is the quantitative rhetoric of “likes” in entertainment media, the mapping of social sentiment on cable news, the use of statistical predictions in political elections, or the pervasiveness of the algorithmic phrase “this is trending,” data-driven arguments and their accompanying visualizations are now a prevalent form of multimodal writing. Students need to understand how to read data-driven arguments, and, of equal importance, produce such arguments themselves. In Writing through Big Data, a newly developed writing course, students confront Big Data’s political and ethical concerns head-on (surveillance, privacy, and algorithmic filtering) by collecting social network data and producing their own data-driven arguments.