Journal of Writing Analytics

6 articles
Topic: rhetorical criticism

January 2021

  1. Computer-Assisted Rhetorical Analysis: Instructional Design and Formative Assessment Using DocuScope
    doi:10.37514/jwa-j.2021.5.1.09

January 2018

  1. Placing Writing Tasks in Local and Global Contexts: The Case of Argumentative Writing
    Abstract

    Background: Current research in composition and writing studies is concerned with issues of writing program evaluation and how writing tasks and their sequences scaffold students toward learning outcomes. These issues are beginning to be addressed by writing analytics research, which can be useful for identifying recurring types of language in writing assignments and how those can inform task design and student outcomes. To address these issues, this study provides a three-step method of sequencing, comparison, and diagnosis to understand how specific writing tasks fit into a classroom sequence as well as compare to larger genres of writing outside of the immediate writing classroom environment. By doing so, we provide writing program administrators with tools for describing what skills students demonstrate in a sequence of writing tasks and diagnosing how these skills match with writing students will do in later contexts. Literature Review: Student writing that responds to classroom assignments can be understood as genres, insofar as they are constructed responses that exist in similar rhetorical situations and perform similar social actions. Previous work in corpus analysis has looked at these genres, which helps us as writing instructors understand what kind of constructed responses are required of students and to make those expectations explicit. Aull (2017) examined a corpus of first-year undergraduate writing assignments in two courses to create “sociocognitive profiles” of these assignments. We analyze student writing that responds to similar writing tasks, but use a different corpus method that allows us to understand the tasks in both local and global contexts. By doing so, we gain confidence and depth in our understanding of these tasks, analyze how they sequence together, and are able to compare argumentative writing across institutions and contexts. 
Research Questions: Two questions guided our study: What is the trajectory of skills targeted by the sequence of tasks in the two first-year writing courses, as evidenced by the rhetorical strategies employed by the writers in successive assignments? Focusing on the final argument assignments, how similar are they to argumentative writing in other contexts, in terms of rhetorical profiles? Methodology: We first conducted a local analysis, in which we used a dictionary-based corpus method to analyze the rhetorical strategies used by writers in the first-year writing courses to understand how they built on each other to form a sequence. Having understood what skills students are demonstrating in a course, we then conducted a global analysis which calculated a “distance” between the first-year argument writing and a corpus of argument writing drawn from other contexts. Recognizing that there was a non-trivial distance, we then identified and evaluated the sources of the distance so that the writing tasks could be assessed or modified. Results: The local analysis revealed eight key rhetorical strategies that student writing exhibits between the two first-year writing courses. With this understanding, we then placed the argument writing in global contexts to find that the assignments in both courses differ somewhat from argument writing in other contexts. Upon analyzing this difference, we found that the first-year writing primarily differs in its usage of academic language, the personal register, assertive language, and reasoning. We suggest that these differences stem primarily from the rhetorical situation and learning objectives associated with first-year writing, as well as the sequencing of the courses. Discussion: The three-step method presented provides a means for writing program administrators to describe and analyze writing that students produce in their writing programs. 
We intend these steps to be understood as an iterative process, whereby writing programs can use these results to evaluate what rhetorical skills their students are exhibiting and to benchmark those against the program’s goals and/or other similar writing programs. Conclusions: By presenting these analyses together, we ultimately provide a cohesive method by which to analyze a writing program and benchmark students’ use of rhetorical strategies in relation to other argumentative contexts. We believe this method to be useful not only to individual writing programs, but to assessment literature broadly. In future research, we anticipate learning how this process will practically feed back into pedagogy, as well as understanding what placing writing tasks into a global context can tell us about genre theory.
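The comparison and diagnosis steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the distance measure, so Euclidean distance is an assumption, and the `CATEGORIES` dictionary, the `profile` and `distance` helpers, and the sample sentences are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy dictionary mapping rhetorical categories to cue words.
# It is NOT the dictionary used in the study.
CATEGORIES = {
    "personal": {"i", "my", "me"},
    "reasoning": {"because", "therefore", "thus"},
    "assertive": {"must", "clearly", "certainly"},
}

def profile(text):
    """Return each category's rate per 100 tokens for one text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for tok in tokens:
        for cat, words in CATEGORIES.items():
            if tok in words:
                counts[cat] += 1
    n = len(tokens) or 1  # avoid division by zero on empty input
    return {cat: 100 * counts[cat] / n for cat in CATEGORIES}

def distance(p, q):
    """Euclidean distance between two profiles (an assumed measure)."""
    return math.sqrt(sum((p[c] - q[c]) ** 2 for c in CATEGORIES))

first_year = "I think we must act because my experience shows it clearly"
published = "The policy failed because costs rose and therefore support fell"

p_local, p_global = profile(first_year), profile(published)
d = distance(p_local, p_global)

# Diagnosis: rank categories by how much they contribute to the distance.
sources = sorted(CATEGORIES, key=lambda c: abs(p_local[c] - p_global[c]), reverse=True)
```

Ranking the per-category differences, as `sources` does here, mirrors the diagnosis step: it points to which rhetorical categories account for most of the gap between the local and global corpora.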

    doi:10.37514/jwa-j.2018.2.1.03
  2. Is this Too Polite? The Limited Use of Rhetorical Moves in a First-Year Corpus
    Abstract

    Background: The researchers conducted a corpus analysis of 548 research-based argument essays, totalling 1,465,091 words, written by first-year students at The City College of New York (CCNY). The purpose of this study was to better understand the ways in which CCNY students were constructing arguments in research essays in order to better support our instruction of the research essay. Curricular guidelines for the research assignment are general. Instructors are directed to require a research-based, persuasive argument that includes conflicting points of view. Model assignment sheets are provided to instructors, but they are free to write their own. Assignment sheets are not collected or approved. In the fall semester in which this corpus was collected, over 70 part-time instructors taught approximately 120 sections of the first- or second-semester composition course. Literature Review: The study of The City College of New York Corpus (CCNYC) partially replicates and relies on the analysis of three corpora of academic writing conducted by Zak Lancaster (2016a) in his examination of Gerald Graff and Cathy Birkenstein’s textbook They Say/I Say: The Moves that Matter in Academic Writing (2014). The current study also compares the CCNYC findings to studies of stance and voice marker frequency conducted by Ken Hyland (2012) and Ellen Barton (1993) and suggests the classroom use of corpus analysis as described by Raith Abid and Shakila Manan (2015) and Maggie Charles (2007). Research Questions: The study was guided by a narrowly focused interest in learning whether or not the CCNYC would demonstrate the range and distribution of rhetorical moves that Lancaster found in his study of academic writing (2016a). The analysis of the corpus consists of frequency counts; we did not conduct other statistical analyses. 
Since we had little prior experience with corpus analysis, we wondered what would be revealed about students’ writing practices by a partial replication of Lancaster’s study. We did not reproduce Lancaster’s analysis but relied on his published results. This study served as an assessment tool, providing a microscopic view of a limited number of rhetorical moves across a large corpus of student essays. As a result of our study, we hoped to be able to create assignments for research essays that responded directly to the patterns that we saw in our students’ essays. Methodology: Modeled on Lancaster’s study and the templates of rhetorical moves offered by Graff and Birkenstein, concordances of terms used to introduce objections, offer concessions, and make counterarguments were drawn from the CCNYC and then analyzed to confirm that the rhetorical form was in fact functioning as one of the above rhetorical moves within the context of the essay in which it was found. Results: Our study demonstrates that CCNY students use fewer linguistic resources than their peers at other institutions, a finding that helps shape faculty development seminars. The corpus analysis reveals that while CCNY students introduce objections to their arguments at about the same rates as in other corpora, they are less likely to concede to those objections. In addition, when students made counterarguments, they used only a limited range of the linguistic resources available to them. Conclusions: The low rate of engagement with opposing points of view and the limited use of linguistic resources for counterarguments all suggest the potential value of focused, corpus-based instruction.
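The concordance-and-count procedure described in the methodology can be sketched as below. This is an illustration under stated assumptions, not the study's tooling: the `concordance` helper, the `CUE_PHRASES` list, and the sample essay are hypothetical, and the manual step of confirming that each hit actually functions as the rhetorical move is left to a human reader.

```python
import re

# Hypothetical cue phrases for objections and concessions; the study drew its
# terms from Lancaster (2016a) and Graff and Birkenstein's templates.
CUE_PHRASES = ["some might object", "admittedly", "it is true that"]

def concordance(text, phrase, width=30):
    """Return keyword-in-context lines for each case-insensitive match."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    lines = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append(f"{left}[{m.group(0)}]{right}")
    return lines

essay = (
    "Some might object that tuition is fair. "
    "Admittedly, costs have risen. Admittedly, aid helps."
)

# Raw frequency counts per cue phrase; each concordance line would then be
# read in context to confirm the phrase performs the expected move.
hits = {phrase: concordance(essay, phrase) for phrase in CUE_PHRASES}
counts = {phrase: len(lines) for phrase, lines in hits.items()}
```

Keeping the context lines alongside the counts matters here, because a raw count alone cannot show whether a phrase is being used to concede, object, or do something else.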

    doi:10.37514/jwa-j.2018.2.1.04

January 2017

  1. Applying Natural Language Processing Tools to a Student Academic Writing Corpus: How Large are Disciplinary Differences Across Science and Engineering Fields?
    Abstract

    • Background: Researchers have been working towards better understanding differences in professional disciplinary writing (e.g., Ewer & Latorre, 1969; Hu & Cao, 2015; Hyland, 2002; Hyland & Tse, 2007) for decades. Recently, research has taken important steps towards understanding disciplinary variation in student writing. Much of this research is corpus-based and focuses on lexico-grammatical features in student writing as captured in the British Academic Written English (BAWE) corpus and the Michigan Corpus of Upper-level Student Papers (MICUSP). The present study extends this work by analyzing lexical and cohesion differences among disciplines in MICUSP. Critically, we analyze not only linguistic differences in macro-disciplines (science and engineering), but also in micro-disciplines within these macro-disciplines (biology, physics, industrial engineering, and mechanical engineering).
    • Literature Review: Hardy and Römer (2013) used a multidimensional analysis to investigate linguistic differences across four macro-disciplines represented in MICUSP. Durrant (2014, in press) analyzed vocabulary in texts produced by student writers in the BAWE corpus by discipline and level (year) and disciplinary differences in lexical bundles. Ward (2007) examined lexical differences within micro-disciplines of a single discipline.
    • Research Questions: The research questions that guide this study are as follows:
      1. Are there significant lexical and cohesive differences between science and engineering student writing?
      2. Are there significant lexical and cohesive differences between micro-disciplines within science and engineering student writing?
    • Research Methodology: To address the research questions, student-produced science and engineering texts from MICUSP were analyzed with regard to lexical sophistication and textual features of cohesion. Specifically, 22 indices of lexical sophistication calculated by the Tool for the Automatic Analysis of Lexical Sophistication (TAALES; Kyle & Crossley, 2015) and 38 cohesion indices calculated by the Tool for the Automatic Analysis of Cohesion (TAACO; Crossley, Kyle, & McNamara, 2016) were used. These features were then compared both across science and engineering texts (addressing Research Question 1) and across micro-disciplines within science and engineering (biology and physics, industrial and mechanical engineering; addressing Research Question 2) using discriminant function analyses (DFA).
    • Results: The DFAs revealed significant linguistic differences, not only between student writing in the two macro-disciplines but also between the micro-disciplines. Differences in classification accuracy based on students’ years of study hovered at about 10%. An analysis of accuracies of classification by paper type found they were similar for larger and smaller sample sizes, providing some indication that paper type was not a confounding variable in classification accuracy.
    • Discussion: The findings provide strong support that macro-disciplinary and micro-disciplinary differences exist in student writing in these MICUSP samples and that these differences are likely not related to student level or paper type. These findings have important implications for understanding disciplinary differences. First, they confirm previous research that found the vocabulary used by different macro-disciplines to be “strikingly diverse” (Durrant, 2015), but they also show a remarkable diversity of cohesion features. The findings suggest that the common understanding of the STEM disciplines as “close” bears reconsideration in linguistic terms. Second, the lexical and cohesion differences between micro-disciplines are large enough and consistent enough to suggest that each micro-discipline can be thought of as containing a unique linguistic profile of features. Third, the differences discerned in the NLP analysis are evident at least as early as the final year of undergraduate study, suggesting that students at this level already have a solid understanding of the conventions of the disciplines of which they are aspiring to be members. Moreover, the differences are relatively homogeneous across levels, which confirms findings by Durrant (2015) but, importantly, extends these findings to include cohesion markers.
    • Conclusions: The findings from this study provide evidence that macro-disciplinary and micro-disciplinary differences at the linguistic level exist in student writing, not only in lexical use but also in text cohesion. A number of pedagogical applications of writing analytics are proposed based on the reported findings from TAALES and TAACO. Further studies using different corpora (e.g., BAWE) or purpose-assembled corpora are suggested to address limitations in the size and range of text types found within MICUSP. This study also points the way toward studies of disciplinary differences using NLP approaches that capture data which goes beyond the lexical and cohesive features of text, including the use of part-of-speech tags, syntactic parsing, indices related to syntactic complexity and similarity, rhetorical features, or more advanced cohesion metrics (latent semantic analysis, latent Dirichlet allocation, Word2Vec approaches).

    doi:10.37514/jwa-j.2017.1.1.04
  2. Transforming Text: Four Valences of a Digital Humanities Informed Writing Analytics
    Abstract

    Aim: This research note narrates existing and continuing potential crossover between the digital humanities and writing studies. I identify synergies between the two fields’ methodologies and categorize current research in terms of four permutations, or “valences,” of the phrase “writing analytics.” These valences include analytics of writing, writing of analytics, writing as analytics, and analytics as writing. I bring recent work in the two fields together under these common labels, with the goal of building strategic alliances between them rather than delimiting them or being comprehensive. I offer the valences as one heuristic for establishing connections and distinctions between two fields engaged in complementary work without firm or definitive discursive borders. Writing analytics might provide a disciplinary ground that incorporates and coheres work from these different domains. I further hope to locate the areas in which my current research in digital humanities, grounded in archival studies, might most shape writing analytics. Problem Formation: Digital humanities and writing studies are two fields in which scholars are performing massive data analysis research projects, including those in which data are writing or metadata that accompanies writing. There is an emerging environment in the Modern Language Association friendly to crossover between the humanities and writing studies, especially in work that involves digital methods and media. Writing analytics accordingly hopes to find common disciplinary ground with digital humanities, with the goal of benefitting from and contributing to conversations about the ethical application of digital methods to its research questions. Recent work to bridge digital humanities and writing studies more broadly has unfortunately focused more on territorial and usability concerns than on identifying resonances between the fields’ methodological and ethical commitments. 
Information Collection: I draw from a history of meta-academic literature in digital humanities and writing studies to review their shared methodological commitments, particularly in literature that recognizes and responds to pushback against the fields’ ostensible use of extra-disciplinary methods. I then turn to current research in both fields that uses and critiques computational techniques, which is most relevant to writing analytics’ articulated focus on massive data analysis. I provide a more detailed explanation, drawing from my categorization of this work, of the conversations in digital humanities surrounding the digital archives that enable data analysis. Conclusions: A review of past and current research in digital humanities and writing studies reveals shared attention to techniques for tokenizing texts at different scales for analysis, which is made possible by the curation of large corpora. Both fields are writing new genres to compose this analysis. In these genres, both fields emphasize process in their provisional work, which is sociocognitively repurposed in different rhetorical contexts. Finally, both fields recognize that the analytical methods they employ are themselves modes of composition and argumentation. An ethics of data transformation present in digital humanities, however, is largely absent from writing studies. This ethics comes to digital humanities from the influence of textual studies and archival studies. Further research in writing analytics might benefit from reframing writing corpora as archives—what Paul Fyfe (2017) calls a shift from “data mining” to “data archaeology”—in its analyses. This is especially true for analyses of text, which in particular foreground writing and analysis of writing as acts of transformation. 
Directions for Further Research: I recommend that future efforts to find crossover between digital humanities and writing studies do so by identifying their common values rather than trying to co-opt language and spaces or engaging in broad definitional work. I further provide a set of guiding principles that writing analytics might follow in order to pursue research that draws upon and contributes to both digital humanities and writing studies. These research projects might consider and account for the silences of writing corpora—unseen versions of documents, and documents’ elements not described in structured data—while attending to the silences that these efforts might in turn (re)produce.

    doi:10.37514/jwa-j.2017.1.1.11
  3. Doing Big Data: Considering the Consequences of Writing Analytics
    Abstract

    Aim: This research note focuses on some of the consequences of big data as an emerging methodology. Its purpose is to provide a brief literature review of the method’s development and some of the critical questions researchers should consider as they move forward. Salvo (2012) contends that big data as a form of design of communication itself “is necessarily a rhetorically-based field” (p. 38). With big data as an up-and-coming methodology (McNely, 2012; Salvo, 2012), using caution in its application is a necessity for scholars. Not only should researchers seek out the unseen and untapped applications of big data, but they should learn its limitations as well (Spinuzzi, 2009). You adopt a methodology, you adopt its flaws. Problem Formation: This section identifies a gap in the field as it relates to some of the consequences of applying big data as a methodology and seeing it as a rhetorical tool. As big data gains steam in the field of humanities, some are sure to question what they see as a flaw: the act of quantifying language. This argument is not new, nor is its rebuttal. Harris (1954) discusses the distributional structure of language with each part of a sentence acting as co-occurrents, each in a particular position, and each with a relationship to the other co-occurrents (p. 146). Salvo (2012) argues that the combination of these new methodologies and technologies “knits together invention, arrangement, style, memory, and delivery in ways that challenge conceptions of print based literacy and textuality” (p. 39). While big data itself has several rhetorical methodologies embedded within, deciding which one to use depends on the amount of data and how it’s aggregated. Information Collection: As described above, this research note functions primarily as a brief review of literature. This section focuses on how writing analytics developed from content analysis in mass communications and shifted into latent semantic analysis assisted by computer technology. 
Riffe, Lacy, & Fico (1995) offer a clear explanation of content analysis, which was developed with comparably small data sets in mind: “Usually, but not always, content analysis involves drawing representative samples of content, training coders to use the category rules developed to measure or reflect differences in content, and measuring reliability (agreement or stability over time) of coders applying the rules” (p. 2). Finding a representative sample of content was once a more feasible methodology, but in the digital age the amount of content increases exponentially every day. Conclusions: As latent semantic analysis is an extension of quantitative content analysis (and vice versa), and knowing that an adopted methodology carries adopted flaws, it makes sense to turn to some of the concerns voiced by mass communication scholars in order to understand limitations. As quantitative content analysis grew in popularity in mass communication, so did the refinement of its methods. Reporting the reliability of a study adds credibility to the study itself, and when a human coder is involved, the reporting of this intercoder reliability becomes imperative (Hayes & Krippendorff, 2007; Krippendorff, 2008, 2011). While intercoder reliability measures the degree to which coders agree, researchers should also be keenly aware of the theory and valence informing their study, which impacts their coders, which ultimately impacts the results of the study itself. Directions for Further Research: As the field of writing studies begins to adopt big data methodologies, researchers must continue to challenge and question their applications, implementations, and implications, turning to familiar questions from our own fields. Big data is exciting and new, but it’s not the methodology to explain it all. It’s just as rhetorical as every other methodology; it’s just better at hiding it.
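The idea of intercoder reliability discussed in this abstract can be illustrated with a small chance-corrected agreement calculation. Note the substitution: the sources cited in the abstract concern Krippendorff's alpha, whereas this sketch computes Cohen's kappa for two coders because it is shorter to show. The function name `cohens_kappa` and the sample codes are hypothetical.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders over the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement if each coder assigned codes independently at
    # their own observed rates.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned to four text segments by two coders.
coder_a = ["concede", "object", "object", "counter"]
coder_b = ["concede", "object", "counter", "counter"]
kappa = cohens_kappa(coder_a, coder_b)
```

Raw percent agreement for these two coders is 0.75, but kappa is lower because some of that agreement would be expected by chance, which is the reason reliability is reported with a corrected statistic.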

    doi:10.37514/jwa-j.2017.1.1.12