Journal of Writing Analytics
5 articles
January 2021
January 2019
Understanding Attainment Disparity: The Case for a Corpus-Driven Analysis of the Language used in Written Feedback Information to Students of Different Backgrounds
Abstract
Background: Disparity of attainment between different groups of students in UK higher education has been correlated with ethnicity (UUK & NUS, 2019). For example, students who declared their ethnicity as Black were 20% less likely to graduate with a top classification than those who declared their ethnicity as White (OfS, 2018a). The causes of such attainment gaps are complex, and one important factor may be the nature of the feedback given by academic staff on assignments written by different groups of students. This paper aims to explore the feasibility of investigating this hypothesis by analyzing written feedback and looking for patterns in feedback given to different groups of students.
Literature Review: Research on attainment among Black and Minority Ethnic (BAME) students in the UK has explored a number of aspects, and has generally concluded that there are issues of “belonging” (Richardson, 2015), particularly in institutions where the majority of academic staff and students are White, but that no single variable can explain the disparity. The wording of feedback on lower-scoring papers has been shown to be more impersonal and distant than that given to students on higher-scoring papers (e.g., Gardner, 2004), which has the (unintended) result of increasing the sense of belonging of higher-performing students in ways that can build incrementally over the years of a degree course. While there have been many such small-scale studies of written feedback, none have aimed to collect large quantities of authentic written feedback for analysis.
Research Questions: The hypotheses that drive our exploration are that written feedback information (WFI) (Boud & Molloy, 2013) is worded differently to different groups of students, and that there is a direct relationship between this aspect of feedback and academic attainment as measured by grades on summative assessments. Specifically, we asked:
1. Can a framework of WFI functions be developed for our data that share a meaningful set of attributes?
2. Can these categories be used to differentiate WFI to different groups of students?
Methodology: A small pilot corpus was compiled from written feedback comments on twelve student assignments from two large Faculties. Metadata was added to each file, and the WFI comments were annotated and analyzed according to a framework developed in a branching format through a recursive construction process informed by the literature reviewed and the data in the corpus. This technique was used to characterize the WFI styles of the two Faculties.
Results: The results show that all WFI comments could be classified using the novel systematic framework developed, and that its binary nature enabled ready cross-tabulation with metadata variables. Praise and critique were found to be most frequent, with specific praise of ideas (P1A) accounting for 68% of all praise, and specific critique of content (C1A) accounting for 49% of all critique. Observations tend to be the longest feedback comments (average 15.4 words). When the two Faculties are compared, two different feedback styles are evident, with Fac1 providing more advice, query, and observation style feedback than Fac2, and Fac2 providing more praise and critique than Fac1.
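The cross-tabulation of annotated WFI categories against metadata mentioned in the Results could be sketched roughly as follows. This is a minimal illustration only, not the authors' tooling: the sample comments, the tuple layout, the function names, and the placeholder codes `A-x`, `O-x`, and `P-x` are all assumptions made for the example; only the codes `P1A` and `C1A` and the labels Fac1 and Fac2 appear in the abstract.

```python
from collections import Counter, defaultdict

# Hypothetical annotated comments: (faculty, function, code, text).
# "P1A" and "C1A" come from the abstract; every other value is invented.
COMMENTS = [
    ("Fac1", "advice", "A-x", "Consider adding a second source here."),
    ("Fac1", "observation", "O-x",
     "This section moves from theory to a worked case study."),
    ("Fac1", "praise", "P1A", "Strong idea."),
    ("Fac2", "praise", "P1A", "Good point about sampling."),
    ("Fac2", "praise", "P-x", "Well done."),
    ("Fac2", "critique", "C1A", "The content here is too thin."),
]


def crosstab(comments):
    """Count comments per faculty and feedback function."""
    table = defaultdict(Counter)
    for faculty, function, _code, _text in comments:
        table[faculty][function] += 1
    return table


def code_share(comments, function, code):
    """Fraction of comments with the given function that carry the given code."""
    codes = [c for _f, fn, c, _t in comments if fn == function]
    return codes.count(code) / len(codes) if codes else 0.0


def mean_length(comments, function):
    """Average whitespace-delimited word count for one feedback function."""
    lengths = [len(t.split()) for _f, fn, _c, t in comments if fn == function]
    return sum(lengths) / len(lengths) if lengths else 0.0


if __name__ == "__main__":
    for faculty, counts in sorted(crosstab(COMMENTS).items()):
        print(faculty, dict(counts))
    print("P1A share of praise:", code_share(COMMENTS, "praise", "P1A"))
    print("Mean observation length:", mean_length(COMMENTS, "observation"))
```

On real data, the same three summaries (function counts per Faculty, the share of one code within a function, and mean comment length per function) correspond to the kinds of figures reported above.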
January 2018
Abstract
Background: Research incorporating large data sets and data and text mining methodologies is making initial contributions to writing studies. In writing program administration (WPA) work, one could best characterize the body of publications as small but growing, led by such work as Moxley and Eubanks’ 2015 “On Keeping Score: Instructors' vs. Students' Rubric Ratings of 46,689 Essays” and Arizona State University’s Science of Learning & Educational Technology (SoLET) Lab. Given the information that large-scale textual analysis can provide, it seems incumbent on program administrators to explore ways to make regular and aggressive use of such opportunities to give both students and instructors more resources for learning and development. This project is one attempt to add to this corpus of work; the sample for the study consisted of 17,534 pieces of student writing representing 141,659 discrete comments on that writing, with 58,300 unique words out of over 8.25 million total words written. This data is used to examine trends in the program’s instructor commentary over five years’ time. By doing so, this study revisits a fundamental task of writing instruction, responding to student writing, and from the data’s results considers how large writing programs with constant turnover of graduate teaching assistants (GTAs) might manage their ongoing instructor professional development and how those GTAs will improve their ability to teach and respond to writing.
Literature Review: Researchers have attempted to unpack and understand the task of instructor commentary for several decades; the published literature demonstrates a complex and occasionally ambivalent relationship with this central task of writing instruction. Recent scholarship has moved from the small-scale studies long used by the field to implement large-scale examinations of the instruction occurring in writing programs.
Research Questions: Three questions guided the inquiry:
1. Does the work of new instructors (MA1s) more closely resemble the lexicon of novice or experienced responders to student writing?
2. How does the new instructors’ work compare to that of more experienced (PHD1 or INS) instructors in the program throughout their time?
3. How does their work evolve over a four-semester longitudinal time frame (as MA1 or MA2 experience levels) in the first-year writing program?
[Please note that the abbreviations used above and throughout the article to designate instructor experience levels are as follows: MA1 (first-year master’s students); MA2 (second-year master’s students); PHD1 (first-year doctoral students); INS (instructors: those with 3 or more years’ experience teaching and who are not currently pursuing an additional degree; nearly all of these individuals held a Master’s degree)].
Methodology: This study extends the work of Anson and Anson (2017), who first surveyed writing instructors and program administrators to create wordlists that survey respondents associated with “high-quality” and “novice” responses, and then examined a corpus of nearly 50,000 peer responses produced at a single university to learn to what extent instructors and student peers adopted this lexicon. Specifically, the study analyzes a corpus of instructor comments to students using the Anson and Anson wordlists associated with principled and novice commentary to see if new writing instructors align more closely with the concepts represented in either list during their first semester in the program. It then tracks four cohorts for evolution and change in their vocabulary of feedback over their next three semesters in the program; the study also compares the vocabulary used in their comments to that used by experienced instructors in the program over the same time.
Results: The study found that from the outset, the new instructors (MA1) incorporated more of the principled response terms than the novice response terms. Overall, in comparing the MA1 instructors with the most experienced group (INS), the results reveal three important findings about the feedback of both MA1s and INSs in this program.
1. While there are some differences in commentary as seen via examination of the two lexicons, the differences are perhaps less than one might assume.
2. The cohorts do increase their use of the principled terms as they move through the two years’ appointment in the program, but few of the increases demonstrate statistical significance.
3. Few of the terms from either the novice or principled lexicon, with the exception of terms that also appear in the assignment descriptions, what I label as “content terms,” appear frequently in the overall corpus.
Discussion: Based on the results, the instructors in this program had acquired a more consistent vocabulary, but not primarily one based on Anson and Anson’s two lexicons. Instead, the most frequent and commonly used terms seem to come from a more local “canon,” that is, one based on the assignment descriptions and course outcomes. Regardless of whether the acquisition of a common vocabulary came from more global concepts or an assignment-based local canon, using common terms is something that Nancy Sommers (1982) saw as contributing to “thoughtful commentary” on student writing.
As no one has previously studied how quickly new instructors acquire a professional vocabulary for responding to student writing, it is hard to know whether or not the results of this particular group of instructors would be considered “typical.” However, it may well be that the context of this writing program contributed to a more accelerated acquisition.
Conclusions: Working with the lexicons developed via Anson and Anson’s survey is a useful starting point for understanding more of what our instructors actually do when responding to student writing, as well as for identifying critical differences in our instructors’ comments. The lexicons, though, only provide us with a subset of expected (thus acceptable) terms included in commentary: terms that afford students the opportunity to act upon receiving them via revision or transfer.
Directions for Future Research: Additional research is necessary to expand and refine the lexicons and their impact on student writing. One possibility is to return to the current data set to engage in additional lexical analysis of both the novice and principled lexicons as well as the overall frequency tables to understand how terms are used in the context of response by the various instructor groups. Differences in the application of the terms might help us understand why comments might be labeled as more or less helpful to writers. Another strategy is to examine the data in terms of markers of stance; finally, topic modeling could be used to locate more subtle differences in the instructor comments that are not as easily identifiable with lexical analysis. Such examinations could serve as a baseline for broadening the study out to other sets of assignments and commentary, perhaps helping us build a set of threshold concepts for talking about writing with our students. Ultimately, it is important to replicate and expand Anson and Anson’s survey to other stakeholder groups. As with much research on the teaching of writing, we default to the group most accessible to us: other writing professionals. Replicating this survey with other stakeholders (graduate teaching assistants, undergraduate students at both lower and upper division levels) could help us understand whether or not a gap exists in understanding what constitutes good feedback from the various stakeholders.
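The wordlist comparison described in the Methodology of this abstract could be sketched along the following lines. This is a rough illustration under stated assumptions, not the study's actual procedure: the two wordlists are placeholders and are NOT the Anson and Anson (2017) lexicons, the sample comments are invented, and normalizing to hits per 1,000 tokens is one plausible choice that the abstract does not specify. Only the level labels MA1 and INS come from the source.

```python
import re
from collections import Counter

# Placeholder wordlists for illustration only; these are NOT the
# Anson and Anson (2017) lexicons, whose contents are not given here.
PRINCIPLED = {"audience", "purpose", "evidence"}
NOVICE = {"awkward", "grammar", "wrong"}

# Hypothetical instructor comments tagged with experience level.
COMMENTS = [
    ("MA1", "Think about your audience and purpose here."),
    ("MA1", "Awkward sentence."),
    ("INS", "The evidence supports your purpose well."),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_rates(comments, lexicon):
    """Return lexicon hits per 1,000 tokens for each instructor level."""
    hits, totals = Counter(), Counter()
    for level, text in comments:
        tokens = tokenize(text)
        totals[level] += len(tokens)
        hits[level] += sum(1 for token in tokens if token in lexicon)
    return {
        level: 1000 * hits[level] / totals[level]
        for level in totals
        if totals[level]
    }


if __name__ == "__main__":
    print("principled:", lexicon_rates(COMMENTS, PRINCIPLED))
    print("novice:", lexicon_rates(COMMENTS, NOVICE))
```

Grouping the same rates by semester as well as by level would give the kind of longitudinal view the study describes for its four cohorts.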