Journal of Writing Analytics
3 articles
January 2018
-
Abstract
The Writing Mentor™ (WM) application is a Google Docs add-on designed to help students improve their writing in a principled manner and to promote their writing success in postsecondary settings. WM provides automated writing evaluation (AWE) feedback using natural language processing (NLP) methods and linguistic resources. AWE features in WM have been informed by research about postsecondary student writers often classified as developmental (Burstein et al., 2016b), and these features address a breadth of writing sub-constructs (including use of sources, claims, and evidence; topic development; coherence; and knowledge of English conventions). Through an optional entry survey, WM collects self-efficacy data about writing and English language status from users. Tool perceptions are collected from users through an optional exit survey. Informed by language arts models consistent with the Common Core State Standards Initiative and valued by the writing studies community, WM takes initial steps to integrate the reading and writing process by offering a range of textual features, including vocabulary support, intended to help users to understand unfamiliar vocabulary in coursework reading texts. This paper describes WM and provides discussion of descriptive evaluations from an Amazon Mechanical Turk (AMT) usability task situated in WM and from users-in-the-wild data. The paper concludes with a framework for developing writing feedback and analytics technology.
January 2017
-
Abstract
Aim: This research note narrates existing and continuing potential crossover between the digital humanities and writing studies. I identify synergies between the two fields’ methodologies and categorize current research in terms of four permutations, or “valences,” of the phrase “writing analytics.” These valences include analytics of writing, writing of analytics, writing as analytics, and analytics as writing. I bring recent work in the two fields together under these common labels, with the goal of building strategic alliances between them rather than to delimit or be comprehensive. I offer the valences as one heuristic for establishing connections and distinctions between two fields engaged in complementary work without firm or definitive discursive borders. Writing analytics might provide a disciplinary ground that incorporates and coheres work from these different domains. I further hope to locate the areas in which my current research in digital humanities, grounded in archival studies, might most shape writing analytics.
Problem Formation: Digital humanities and writing studies are two fields in which scholars are performing massive data analysis research projects, including those in which data are writing or metadata that accompanies writing. There is an emerging environment in the Modern Language Association friendly to crossover between the humanities and writing studies, especially in work that involves digital methods and media. Writing analytics accordingly hopes to find common disciplinary ground with digital humanities, with the goal of benefitting from and contributing to conversations about the ethical application of digital methods to its research questions. Recent work to bridge digital humanities and writing studies more broadly has unfortunately focused more on territorial and usability concerns than on identifying resonances between the fields’ methodological and ethical commitments.
Information Collection: I draw from a history of meta-academic literature in digital humanities and writing studies to review their shared methodological commitments, particularly in literature that recognizes and responds to pushback against the fields’ ostensible use of extra-disciplinary methods. I then turn to current research in both fields that uses and critiques computational techniques, which is most relevant to writing analytics’ articulated focus on massive data analysis. I provide a more detailed explanation, drawing from my categorization of this work, of the conversations in digital humanities surrounding the digital archives that enable data analysis.
Conclusions: A review of past and current research in digital humanities and writing studies reveals shared attention to techniques for tokenizing texts at different scales for analysis, which is made possible by the curation of large corpora. Both fields are writing new genres to compose this analysis. In these genres, both fields emphasize process in their provisional work, which is sociocognitively repurposed in different rhetorical contexts. Finally, both fields recognize that the analytical methods they employ are themselves modes of composition and argumentation. An ethics of data transformation present in digital humanities, however, is largely absent from writing studies. This ethics comes to digital humanities from the influence of textual studies and archival studies. Further research in writing analytics might benefit from reframing writing corpora as archives, what Paul Fyfe (2017) calls a shift from “data mining” to “data archaeology,” in its analyses. This is especially true for analyses of text, which in particular foreground writing and analysis of writing as acts of transformation.
Directions for Further Research: I recommend that future efforts to find crossover between digital humanities and writing studies do so by identifying their common values rather than trying to co-opt language and spaces or engaging in broad definitional work. I further provide a set of guiding principles that writing analytics might follow in order to pursue research that draws upon and contributes to both digital humanities and writing studies. These research projects might consider and account for the silences of writing corpora—unseen versions of documents, and documents’ elements not described in structured data—while attending to the silences that these efforts might in turn (re)produce.
-
Abstract
Technique Identification: A new graphical technique is presented for visualizing and assessing inter-rater agreement in discrete ordinal or categorical data, such as rubric ratings. To that aim, a chance-corrected Kappa with two new features is derived. First, it is based on interpreting ratings for each subject as vectors to visualize the data. This is done by creating two-dimensional vectors from a subject-rating summary table, sorting the vectors by their slopes, and plotting them in that order to create a trajectory that displays all the data in context. Second, it presents a graph and accompanying statistics (Kappa, p-value) for each pair of ratings in an organized display so that all useful comparisons of the data are visually displayed and statistically assessed. This information is presented on a logical grid, usually called facets. Kappa is calculated in the usual way, by referencing the actual results with an average of random rating assignments. This average becomes a reference line on each graph as a visual cue, as well. The statistical basis for the Kappa and significance testing are derived, and the test assumptions are specified.
Value Contribution: The most commonly used statistics for inter-rater agreement, such as the Cohen Kappa or Inter-Class Correlation, give only a single parameter estimate of reliability from which to make judgments about ratings data. The technique presented here constructs graphs of all the data that allow visual inspection of the ratings versus a reference curve that represents chance-matching. The detailed reports on inter-rater agreement can show how to fine-tune ratings systems, such as understanding which parts of an ordinal scale are working best. This solves a practical problem for researchers who rely on rating-type classification by revealing which overall aspects of the rating system need to be improved and adds to the list of tools available for assessing rating reliability. In creating this approach to analysis of rater data, human usability is emphasized. Specifically, the use of geometry is designed to facilitate interpretability rather than being a mathematical derivation from first principles.
Technique Application: Two applications are given, both involving social meaning-making. The first uses data from wine-judging to illustrate how the method can illuminate expertise in that domain. The results reproduce published findings that were based on a classical statistical method. A second sample application uses data from a university assessment of student writing in which ratings on a developmental scale are assigned by course instructors to their students. The rating program is an example of social meaning-making that can be used to generate larger data sets than are typical for classroom-based assessment programs. The analysis shows the strengths and weaknesses of the rating system in terms of reliability and demonstrates how that knowledge leads to improvements in assessment.
Directions for Further Research: An argument is made for a public library of inter-rater data for empirical use by researchers. The social aspects of rating are discussed, and there is an illustration of the potential to derive new measures of inter-rater agreement from the meaning-making program that produces the data.
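Illustrative sketch (not from the article): the vector construction named in the abstract above, in which per-subject rating counts become two-dimensional vectors that are sorted by slope and chained into a trajectory, might look roughly like the following. The function names and the `(n_a, n_b)` input layout are assumptions made for illustration. The `cohen_kappa` function is the standard two-rater Cohen Kappa mentioned in the abstract as a commonly used statistic; it is not the new chance-corrected Kappa the article derives.

```python
import math
from collections import Counter


def slope_sorted_trajectory(counts):
    """Build the cumulative path for one pair of rating categories.

    counts: one (n_a, n_b) pair per subject, the number of ratings that
    subject received in category a and in category b (hypothetical layout).
    """
    # Subjects with no ratings in either category contribute no vector.
    vectors = [(a, b) for a, b in counts if a or b]
    # Sort by direction; atan2 orders by slope and handles vertical vectors.
    vectors.sort(key=lambda v: math.atan2(v[1], v[0]))
    path = [(0, 0)]
    for a, b in vectors:
        x, y = path[-1]
        path.append((x + a, y + b))  # chain vectors head to tail
    return path


def cohen_kappa(rater1, rater2):
    """Standard two-rater Cohen Kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(rater1)
    # Observed agreement: share of subjects given the same rating by both.
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement from each rater's marginal rating frequencies.
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum(c1[k] * c2[k] for k in c1) / (n * n)
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


print(slope_sorted_trajectory([(0, 2), (2, 0), (1, 1)]))
print(cohen_kappa([1, 1, 2, 2], [1, 1, 2, 2]))
```

Plotting the returned path against a straight line from the origin to its endpoint gives a rough visual analogue of comparing ratings with a reference line, though the article's actual reference curve and significance test are derived differently and are not reproduced here.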