Assessing Writing

14 articles
Topic: race and writing

April 2026

  1. Assessing fairness in finetuned scoring models with demographically restricted training data
    Abstract

    The increasing adoption of automated essay scoring (AES) in high-stakes educational contexts necessitates careful examination of potential biases within these systems. This study investigates how the demographic composition of training data influences fairness in AES systems developed from finetuned large language models (LLMs). Using the PERSUADE corpus of 26,000 student essays, we conducted a systematic analysis using demographically restricted training sets to isolate the impact of training data demographics on LLM-AES performance. Each demographically restricted training set comprised essays written by a single racial/ethnic group. Four variants of a Longformer-based AES were developed: one trained on demographically balanced data and three trained on demographically restricted datasets. An initial analysis of the human ratings indicated that demographic factors significantly predict human essay scores (marginal R² = 0.125), a pattern paralleled in national writing assessment data. LLM-AES systems trained on demographically restricted data exhibited small systematic biases (marginal R² = 0.043). However, the LLM trained on balanced data showed minimal demographic bias, suggesting that representative training data can effectively prevent amplification of demographic disparities beyond those present in human ratings. These results highlight both the importance and the limitations of training data diversity in achieving fair assessment outcomes.

    • 12.5% of variance in human essay ratings was explained by demographics.
    • We constructed demographically restricted training sets to isolate bias.
    • Balanced training data minimized LLM-AES bias across demographic groups.
    • LLM-AES systems trained on demographically restricted data showed more bias.
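
    A minimal sketch of the setup the abstract describes: one demographically restricted training set per racial/ethnic group, a balanced set drawing equally from each group, and a Longformer regression head for holistic scoring. The file name and column names are hypothetical, and the fine-tuning details are not given in the abstract.

    ```python
    # Sketch: demographically restricted vs. balanced training sets for a
    # Longformer-based AES regressor. File and column names are hypothetical.
    import pandas as pd
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    df = pd.read_csv("persuade_essays.csv")  # hypothetical corpus export

    # One training set per racial/ethnic group (demographically restricted).
    restricted = {group: sub for group, sub in df.groupby("race_ethnicity")}

    # A balanced set: sample the same number of essays from each group.
    n = min(len(sub) for sub in restricted.values())
    balanced = pd.concat([sub.sample(n=n, random_state=0) for sub in restricted.values()])

    # Longformer with a single regression output for essay scores.
    tokenizer = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")
    model = AutoModelForSequenceClassification.from_pretrained(
        "allenai/longformer-base-4096",
        num_labels=1,               # scalar holistic score
        problem_type="regression",  # MSE loss during fine-tuning
    )
    # Fine-tuning (e.g., with transformers.Trainer) is then run once per
    # training set, yielding the four model variants the study compares.
    ```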

    doi:10.1016/j.asw.2026.101032

January 2025

  1. Examining the use of academic vocabulary in first-year ESL undergraduates’ writing: A corpus-driven study in Hong Kong
    Abstract

    A good command of academic vocabulary is important for academic success in higher education. However, research has primarily focused on the receptive academic vocabulary knowledge of L2 learners while devoting relatively limited attention to their productive use of such vocabulary and its impact on writing quality. To address this gap, we analysed the problem-solution essays written by 168 first-year undergraduates in Hong Kong, focusing on the relationship between their use of academic words in the Academic Vocabulary List (AVL) and the overall quality of their writing. We also explored the relationship between the size of students’ receptive academic vocabulary and the frequency of its use in writing. Findings revealed that high-scoring essays contained a greater density and diversity of academic vocabulary than low-scoring essays, with greater frequency of words in the 1–500 and 501–1000 tiers of the AVL significantly predicting better writing quality. The essays also showed a significant relationship between the participants’ receptive academic vocabulary size and the diversity of academic words used in writing. However, no significant relationship was observed between receptive academic vocabulary size and the density of academic words used. We highlight the implications of these findings for EAP teaching and research.

    • Problem-solution essays written by undergraduates in Hong Kong were analysed.
    • Density and diversity of academic vocabulary (AV) predict L2 writing quality.
    • Learners’ receptive AV size significantly relates to AV diversity in their writing.
    • Only words from two tiers of the AVL significantly predicted writing scores.
    • A holistic and tiered approach to assessing AV use is important.
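
    One way to compute the density, diversity, and tier measures the abstract reports, as a rough sketch. The AVL file, its format, and the tokenization are assumptions; the study’s actual operationalization is not specified here.

    ```python
    # Sketch: density, diversity, and tier counts of Academic Vocabulary List
    # (AVL) words in one essay. Assumes a hypothetical "avl_ranked.txt" with
    # one AVL word per line, ordered by frequency rank.
    import re

    with open("avl_ranked.txt", encoding="utf-8") as f:
        words = [w.strip().lower() for w in f if w.strip()]
    rank = {w: i + 1 for i, w in enumerate(words)}

    def avl_profile(essay: str) -> dict:
        tokens = re.findall(r"[a-z']+", essay.lower())
        hits = [t for t in tokens if t in rank]
        return {
            "density": len(hits) / len(tokens) if tokens else 0.0,  # AVL tokens / all tokens
            "diversity": len(set(hits)),                            # distinct AVL types
            "tier_1_500": sum(rank[t] <= 500 for t in hits),
            "tier_501_1000": sum(500 < rank[t] <= 1000 for t in hits),
        }
    ```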

    doi:10.1016/j.asw.2024.100913

October 2024

  1. Effects of a genre and topic knowledge activation device on a standardized writing test performance
    Abstract

    The aim of this article was twofold: first, to introduce a design for a writing test intended for application in large-scale assessments of writing, and second, to experimentally examine the effects of employing a device for activating prior knowledge of topic and genre as a means of controlling construct-irrelevant variance and enhancing validity. An authentic, situated writing task was devised, offering students a communicative purpose and a defined audience. Two devices were utilized for the cognitive activation of topic and genre knowledge: an infographic and a genre model. The participants in this study were 162 fifth-grade students from Santiago de Chile, with 78 students assigned to the experimental condition (with activation device) and 84 students assigned to the control condition (without activation device). The results show that the odds of demonstrating good writing ability are higher for students in the experimental group, even when controlling for text transcription ability, considered a predictor of writing. These findings hold implications for the development of large-scale tests of writing guided by principles of educational and social justice.

    • Genre and topic knowledge are forms of prior knowledge relevant to writing.
    • Higher odds of better writing for students exposed to prior knowledge activation.
    • Results support the use of prior knowledge activation in standardized assessment.
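
    The reported effect is an odds comparison with a covariate, which maps onto a logistic regression of the following form. The data file and variable names are hypothetical.

    ```python
    # Sketch: logistic model of writing ability on condition, controlling for
    # text transcription ability. Column names are hypothetical:
    # good_writing (0/1), condition (1 = activation device, 0 = control),
    # transcription (covariate).
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("writing_test_results.csv")  # hypothetical data file

    fit = smf.logit("good_writing ~ condition + transcription", data=df).fit()
    print(fit.summary())
    # Odds ratios; a value > 1 for `condition` favors the experimental group.
    print(np.exp(fit.params))
    ```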

    doi:10.1016/j.asw.2024.100898

July 2024

  1. Navigating innovation and equity in writing assessment
    doi:10.1016/j.asw.2024.100873

January 2023

  1. A multi-measure approach for lexical diversity in writing assessments: Considerations in measurement and timing
    doi:10.1016/j.asw.2022.100688

July 2022

  1. Diversity of Advanced Sentence Structures (DASS) in writing predicts argumentative writing quality and receptive academic language skills of fifth-to-eighth grade students
    doi:10.1016/j.asw.2022.100649
  2. Investigating whether a flemma count is a more distinctive measurement of lexical diversity
    doi:10.1016/j.asw.2022.100640

January 2022

  1. Appropriateness as an aspect of lexical richness: What do quantitative measures tell us about children's writing?
    Abstract

    Quantitative measures of vocabulary use have added much to our understanding of first and second language writing development. This paper argues for measures of register appropriateness as a useful addition to these tools. Developing an idea proposed by Durrant and Brenchley (2019), it explores what such measures can tell us about vocabulary development in the L1 writing of school children in England and critically examines how results should be interpreted. It shows that significant patterns of discipline- and genre-specific vocabulary development can be identified for measures related to four distinct registers, though the strongest patterns are found for vocabulary associated with fiction and academic writing. Follow-up analyses showed that changes across year groups were primarily driven, not by the nature of individual words, but by the overall quantitative distribution of register-specific vocabulary, suggesting that the traditional distinction between measures of lexical diversity and lexical sophistication may not be helpful for understanding development in this context. Closer analysis of academic vocabulary showed development of distinct vocabularies in Science and English writing in response to sharply differing communicative needs in those disciplines, suggesting that development in children’s academic vocabulary should not be seen as a single coherent process.
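
    A register-appropriateness measure of the kind discussed can be sketched as the share of a text’s tokens that belong to each register’s characteristic vocabulary. The four tiny wordlists below are hypothetical stand-ins for corpus-derived lists.

    ```python
    # Sketch: proportion of a text's tokens drawn from each of four reference
    # registers. The wordlists are illustrative placeholders only.
    import re

    REGISTER_WORDS = {
        "fiction": {"whisper", "gloomy", "stumble"},
        "academic": {"analyse", "hypothesis", "significant"},
        "news": {"report", "official", "announce"},
        "conversation": {"yeah", "stuff", "okay"},
    }

    def register_profile(text: str) -> dict:
        tokens = re.findall(r"[a-z']+", text.lower())
        total = len(tokens) or 1
        return {reg: sum(t in words for t in tokens) / total
                for reg, words in REGISTER_WORDS.items()}
    ```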

    doi:10.1016/j.asw.2021.100596

July 2021

  1. Examining lexical features and academic vocabulary use in adolescent L2 students’ text-based analytical essays
    Abstract

    A rich and complex vocabulary is a crucial component of high-quality writing for academic purposes. However, use of academic vocabulary can be challenging for adolescent L2 writers who are still developing their academic language proficiency. Thus, understanding the lexical needs of adolescent L2 students in composing academic essays is pivotal to supporting this population in their endeavor to become proficient academic writers. This study investigates the lexical features of adolescent L2 students’ text-based analytical essays and analyzes the extent to which lexical density, lexical diversity, and lexical sophistication predict the quality of their writing. The computational tools Coh-Metrix and VocabProfiler were used to obtain quantitative measures of lexical density, diversity, and sophistication. The results indicate that the essays (n = 70), on average, have (1) low lexical density, (2) more repetition of words, indicating less diversity compared to grade-level estimates, and (3) a higher percentage of basic words and a lower percentage of academic words. 44% of the AWL words in the essays come from the source text and prompt. The results of a hierarchical multiple regression indicate that the use of academic vocabulary is a predictor of writing quality. The study has important pedagogical implications for classroom practice at the secondary school level.
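
    A hierarchical regression of this kind enters predictor blocks stepwise and reads off the change in R² at each step. A sketch with hypothetical column names:

    ```python
    # Sketch: hierarchical multiple regression of essay quality on lexical
    # measures. Column names are hypothetical; each block adds predictors and
    # the R-squared change shows that block's unique contribution.
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("essay_lexical_measures.csv")  # hypothetical data file

    blocks = [
        "quality ~ lexical_density",
        "quality ~ lexical_density + lexical_diversity",
        "quality ~ lexical_density + lexical_diversity + academic_word_pct",
    ]
    prev = 0.0
    for formula in blocks:
        fit = smf.ols(formula, data=df).fit()
        print(f"{formula}: R2 = {fit.rsquared:.3f} (change = {fit.rsquared - prev:.3f})")
        prev = fit.rsquared
    ```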

    doi:10.1016/j.asw.2021.100540

January 2021

  1. Lexical density and diversity in dissertation abstracts: Revisiting English L1 vs. L2 text differences
    doi:10.1016/j.asw.2020.100511
  2. Investigating minimum text lengths for lexical diversity indices
    doi:10.1016/j.asw.2020.100505

October 2019

  1. Making our invisible racial agendas visible: Race talk in Assessing Writing, 1994–2018
    doi:10.1016/j.asw.2019.100425

January 2019

  1. The influence of lexical features on teacher judgements of ESL argumentative essays
    Abstract

    Numerous studies have examined the relationship between lexical features of students’ compositions and judgements of text quality. However, the degree to which the quality of vocabulary in students’ essays influences teachers’ assessment of other textual characteristics is relatively unexplored. This experimental study investigates the influence of lexical features on teachers’ judgements of English as a second language (ESL) argumentative essays. Using analytic and holistic rating scales, English pre-service teachers (N = 37) in Switzerland assessed four essays of different proficiency levels in which the levels of lexical diversity and sophistication had been experimentally varied. Lexical diversity was manipulated using measures from the Coh-Metrix software (MTLD and D), and differing levels of lexical sophistication were obtained using the word-range measure from the Tool for the Automatic Analysis of Lexical Sophistication (TAALES). The results suggested that texts with greater lexical diversity and sophistication were assessed more positively with respect to their overall quality as well as the analytic criteria ‘grammar’ and ‘frame of essay’. The implications of this study for classroom practice and teacher education are discussed.
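
    MTLD, one of the diversity indices named above, is well defined (McCarthy & Jarvis, 2010): reading through the text, a "factor" completes each time the running type-token ratio falls to 0.72, any leftover stretch contributes a partial factor, and MTLD is the token count divided by the factor count, averaged over forward and reverse passes. A minimal sketch:

    ```python
    # Sketch: MTLD (Measure of Textual Lexical Diversity). A factor completes
    # whenever the running type-token ratio (TTR) reaches the 0.72 threshold;
    # leftover text adds a partial factor of (1 - TTR) / (1 - 0.72).
    def _mtld_pass(tokens, threshold=0.72):
        factors, types, count = 0.0, set(), 0
        for tok in tokens:
            count += 1
            types.add(tok)
            if len(types) / count <= threshold:
                factors += 1.0
                types, count = set(), 0
        if count:  # partial factor for the remainder
            factors += (1.0 - len(types) / count) / (1.0 - threshold)
        return len(tokens) / factors if factors else 0.0

    def mtld(tokens):
        # Average of forward and reverse passes, as in the original measure.
        return (_mtld_pass(tokens) + _mtld_pass(tokens[::-1])) / 2.0
    ```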

    doi:10.1016/j.asw.2018.12.003

January 2012

  1. Linguistic discrimination in writing assessment: How raters react to African American “errors,” ESL errors, and standard English errors on a state-mandated writing exam
    doi:10.1016/j.asw.2011.10.001