Assessing Writing

30 articles
Topic: grammar and mechanics

April 2026

  1. Pursuing fair writing assessment: Halo effects in primary school foreign language writing in grade six
    Abstract

    Assessing the writing competence of pupils learning English as a foreign language (EFL) at primary school is associated with specific challenges because of learners’ limited language resources. This study investigates the extent to which characteristics of their texts trigger so-called halo effects. Halo effects are an assessment bias in which the quality of one feature unintentionally influences the evaluation of other aspects. The study examines halo effects across nine aspects of text quality (communicative effect, level of detail, coherence, cohesion, complexity of syntax and grammar, correctness of syntax and grammar, vocabulary, orthography, and punctuation), based on a random sample of narrative texts from a sixth-grade corpus. Two hundred pre-service teachers assessed four randomly assigned texts. Halo effects were calculated by comparison to expert ratings using multi-level regression analyses. Results show that orthography and vocabulary were the two main triggers of halo effects. Punctuation also triggered some halo effects, but to a smaller extent. The assessment of communicative effect and of the complexity and correctness of syntax and grammar was not determined by the corresponding text quality but was dominated by other criteria. Results highlight the importance of being aware of halo effects when assessing young EFL learners’ texts and emphasise the need for suitable training measures.

    • Analysis of halo effects across nine aspects of text quality.
    • Random sample of narrative texts from a sixth-grade EFL corpus.
    • Orthography and vocabulary are the two main triggers of halo effects.
    • Punctuation also triggers halo effects but to a smaller extent.
    • Halo effects call for awareness and targeted training.

    doi:10.1016/j.asw.2026.101036
  2. From spelling to content: The influence of spelling quality on text assessment
    doi:10.1016/j.asw.2026.101014
  3. How do L2 writing subskills interact hierarchically? Insights from diagnostic classification models
    Abstract

    This study examined the hierarchical structure among second/foreign language (L2) writing subskills using a Hierarchical Diagnostic Classification Model (HDCM). A pool of 500 essays composed by English as a Foreign Language (EFL) students was assessed by four experienced EFL teachers using the Empirically-derived Descriptor-based Diagnostic (EDD) checklist. Based on a literature review and the expertise of three content experts, several models were developed to reflect various hierarchical interactions among L2 writing subskills, including linear, divergent, convergent, independent, unstructured, mixed, and higher-order. Comparison of the models showed the presence of an unstructured interaction among L2 writing subskills, indicating that content is the foundational subskill for the mastery of vocabulary, grammar, organization, and mechanics. Higher mastery classes were also associated with higher educational levels, greater frequency of English use, and longer exposure to the L2. Understanding the hierarchical relationships among L2 writing subskills can improve targeted instructional strategies and assessment practices.

    • Hierarchical DCMs are a constrained version of existing DCMs.
    • Models were developed to show hierarchical interactions among L2 writing subskills.
    • An unstructured interaction among L2 writing subskills was identified.
    • Higher mastery classes were associated with higher educational levels.
    • The classes were associated with greater English use and longer L2 exposure.

    doi:10.1016/j.asw.2026.101029

January 2026

  1. The effects of online resource use on L2 learners’ computer-mediated writing processes and written products
    Abstract

    While previous studies on online resource use in L2 writing have focused on overall writing quality, limited attention has been paid to its effects on linguistic complexity and real-time writing processes. Addressing this gap, the present study explored how online resource use influences both the processes and products of L2 writing. Forty-nine intermediate L2 learners completed two computer-mediated argumentative writing tasks, either with or without the use of online resources. Writing behaviors were captured via keystroke logging and screen recording, and analyzed for search activity, fluency, pausing, and revision quantity. Cognitive processes were examined through stimulated recall interviews, and written products were evaluated for both quality and linguistic complexity. The results showed that participants spent an average of 14% of task time using online resources, with considerable individual variation. Mixed-effects modeling revealed that resource use facilitated the production of more sophisticated words, with marginal influence on writing quality or syntactic complexity. Resource use was also associated with longer between-word pauses, fewer within-word pauses, and reduced revisions. These findings highlight the potential of online resource use to enhance the authenticity of L2 writing assessment tasks without compromising test validity, while encouraging the use of more advanced vocabulary in writing.

    • Learners spent 14% of the total writing task time using online resources.
    • Online resource use had no significant impact on L2 writing quality.
    • Online resource use improved lexical sophistication, not syntactic complexity.
    • Online resource use reduced within-word pauses and aided spelling retrieval.
    • Online resource use led to fewer revisions but did not affect fluency.

    doi:10.1016/j.asw.2025.100994
  2. How reliable and valid is peer evaluation in adolescents’ L2 argumentative writing?
    Abstract

    Peer evaluation is widely recognized for its educational benefits; however, its reliability and validity, particularly among adolescent second-language (L2) writers at the early stages of English language and literacy development, remain insufficiently explored. This explanatory sequential mixed-methods study investigated the reliability and validity of peer evaluation in English argumentative writing among 35 Grade 10 and 37 Grade 12 students from a public high school in Beijing, China. Twelve of the participating students (six at each grade) were interviewed about the validity, reliability, and value of peer evaluation. The findings indicated that peer evaluations demonstrated high levels of reliability and validity, with peer-assessed writing scores closely aligning with inter-teacher assessments. Notably, variations were observed among Grade 10 students, particularly in the evaluation of lower-order writing skills, such as grammar and vocabulary, which exhibited reduced validity. These results underscore the potential of peer evaluation in assessing higher-order content-level writing across varying levels of L2 English writing proficiency. The study also highlights areas where adolescent L2 writers may require additional support to enhance the effectiveness of peer evaluation practices in English argumentative writing. Implications for improving English argumentative writing instruction and refining peer evaluation strategies in high school L2 English classrooms are discussed.

    • Peer evaluation shows high reliability, similar to inter-teacher rating.
    • Peer evaluation works well for higher-order skills in L2 argumentative writing.
    • 10th graders struggled with evaluating lower-order skills like grammar.
    • 12th graders evaluated lower- and higher-order skills with greater validity than 10th graders.

    doi:10.1016/j.asw.2025.100992
  3. Is it beneficial to strive for perfection in writing?: Exploring the relationship between perfectionism, motivational regulation, and second language (L2) writing performance
    Abstract

    Perfectionism, a personality trait characterized by the pursuit of flawlessness and high personal standards, and motivational regulation, the strategies through which individuals manage their motivational states, have received limited attention in second language (L2) writing. Framed within social cognitive theory, this study examines how two dimensions of perfectionism—perfectionistic strivings and perfectionistic concerns—relate to writing performance (syntactic complexity, accuracy, lexical complexity, and fluency) and how motivational regulation sub-strategies (interest enhancement, self-talk, and emotional control) mediate these relationships. Data from 689 university students in China were analyzed using questionnaires and argumentative writing samples. Results indicated that perfectionistic strivings positively predicted syntactic complexity, accuracy, and lexical complexity, while perfectionistic concerns negatively predicted these dimensions; neither dimension significantly affected fluency. Crucially, motivational regulation sub-strategies partially mediated the relations between perfectionism and writing performance. These findings underscore the importance of distinguishing perfectionism dimensions and targeting motivational regulation strategies to improve L2 writing. Implications for instruction and directions for future longitudinal research are discussed.

    • Perfectionistic strivings and concerns affect writing via motivational regulation.
    • Strivings improve syntax, accuracy, and lexical complexity; concerns hinder them.
    • Most motivational regulation sub-strategies mediate perfectionism’s impact on CALF.
    • Perfectionism influences writing through motivational regulation.

    doi:10.1016/j.asw.2025.101012

October 2025

  1. Assessing L2 writing formality using syntactic complexity indices: A fuzzy evaluation approach
    doi:10.1016/j.asw.2025.100973
  2. Judgment accuracy in primary school EFL writing assessment: Do text characteristics matter?
    Abstract

    Assessing the writing competence of pupils learning English as a foreign language (EFL) at primary school is challenging. This study aimed at examining a largely unexplored topic, namely the role of text characteristics in writing assessment, and analysed judgment accuracy differentiated by nine aspects of text quality (communicative effect, level of detail, coherence, cohesion, complexity of syntax and grammar, correctness of syntax and grammar, vocabulary, orthography, and punctuation). Two hundred pre-service teachers assessed four randomly assigned texts from learners in grade six. Their assessment was compared to the existing ratings of two experts from a previous study. We found a relative judgment accuracy between r = .34 and .60 for the nine assessment criteria, with vocabulary being assessed significantly more accurately than almost all other criteria. Orthography, complexity and correctness of syntax and grammar, and punctuation were rated significantly more accurately than cohesion, level of detail, communicative effect, and coherence. The pre-service teachers assessed most criteria more strictly and with higher variability than the experts. The results suggest that teacher education should offer pre-service teachers concrete opportunities to practise writing assessment, implement activities to strengthen the assessment of content- and structure-related criteria, and help them adjust their assessment rigour.

    • Judgment accuracy in the assessment of primary school EFL learners’ texts.
    • Relative judgment accuracy between r = .34 and .60 for the different criteria.
    • Significant differences in relative judgment accuracy between assessment criteria.
    • Linguistic text qualities are assessed with more accuracy than content- and structure-related aspects.
    • Pre-service teachers are more rigorous and heterogeneous in rating than experts.

    doi:10.1016/j.asw.2025.100957
  3. The development of syntactic complexity in integrated writing: A focus on fine-grained measures
    doi:10.1016/j.asw.2025.100983

July 2025

  1. The impact of self-revision, machine translation, and ChatGPT on L2 writing: Raters’ assessments, linguistic complexity, and error correction
    doi:10.1016/j.asw.2025.100950

January 2025

  1. A meta-analysis of relationships between syntactic features and writing performance and how the relationships vary by student characteristics and measurement features
    Abstract

    Students’ proficiency in constructing sentences impacts the writing process and writing products. Linguistic demands in writing differ in terms of both student characteristics and measurement features. To identify various syntactic demands considering these features, we conducted a meta-analysis examining the relationships between syntactic features (complexity and accuracy) and writing performance (quality, productivity, and fluency) and the moderating effects of both student characteristics and measurement features. A total of 109 studies (871 effect sizes; 24,628 participants) met the inclusion criteria. Results showed a weak relationship for both syntactic accuracy (r = .25) and complexity (r = .16). Writer characteristics (grade level and language proficiency) and measurement features (writing genre, writing outcome, whether the writing task was text-based, and type of syntactic complexity measure) were significant moderators for certain syntactic features. The findings highlight the importance of writer and measurement factors when considering the relationships between linguistic features in writing and writing performance. Implications are discussed regarding the selection of syntactic features in assessing language use in writing, gaps in the literature, and significance for writing instruction and assessment.

    • Aimed to depict the relationships between syntactic features and writing performance.
    • Found weak relationships between syntactic features and writing outcomes.
    • Relationships vary as a function of student characteristics and measurement features.
    • Noun phrase complexity might be more valid than some traditional syntactic complexity measures.
    • Findings have important implications for writing assessments.

    doi:10.1016/j.asw.2024.100909
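
    The pooled coefficients reported in the abstract above (e.g., r = .25 for syntactic accuracy) are the kind of estimate a meta-analysis obtains by averaging correlations on the Fisher z scale. A minimal fixed-effect sketch in Python — illustrative only, not the authors' procedure, and the `(r, n)` inputs below are hypothetical:

    ```python
    import math

    def pool_correlations(studies):
        """Fixed-effect pooling of correlations via Fisher's z-transform.

        `studies` is a list of (r, n) pairs; each study is weighted by
        n - 3, the inverse variance of its z-transformed correlation.
        """
        weighted = [(math.atanh(r), n - 3) for r, n in studies]
        z_bar = sum(z * w for z, w in weighted) / sum(w for _, w in weighted)
        return math.tanh(z_bar)  # back-transform to the r scale

    # Hypothetical (r, n) pairs for three studies:
    print(pool_correlations([(0.25, 120), (0.16, 300), (0.30, 80)]))
    ```

    A random-effects model, as typically used in published meta-analyses of this kind, would additionally add a between-study variance component to each weight.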

July 2024

  1. Modeling relationships among large-grained, fine-grained absolute syntactic complexity and assessed L2 writing quality: An SEM approach
    doi:10.1016/j.asw.2024.100875
  2. How syntactic complexity indices predict Chinese L2 writing quality: An analysis of unified dependency syntactically-annotated corpus
    doi:10.1016/j.asw.2024.100847
  3. EFL students' syntactic complexity development in argumentative writing: A latent class growth analysis (LCGA) approach
    Abstract

    The study explored EFL students' development of syntactic complexity by employing the Latent Class Growth Analysis (LCGA) approach. A total of 214 tertiary EFL students from Southwest China were invited to write four argumentative essays over an academic semester. The unconditional models of LCGA were utilized to explore the optimal latent classes of students' development trajectories of syntactic complexity. The conditional models of LCGA were employed to investigate the predictive effect of English proficiency on the optimal latent classes. Results of the unconditional models revealed distinct latent classes of development trajectories for six of the syntactic complexity indices but not for the remaining ones, which offers tentative evidence for the heterogeneity of L2 development trajectories. Results of the conditional models showed that English proficiency did not predict membership in these latent classes. These results are discussed, and implications for L2 instruction are offered.

    doi:10.1016/j.asw.2024.100877

April 2024

  1. Is the variation in syntactic complexity features observed in argumentative essays produced by B1 level EFL learners in Finland and Pakistan attributable exclusively to their L1?
    Abstract

    This study explored the syntactic complexity features of English learners at the B1 level of the Common European Framework of Reference (CEFR) (CoE, 2001) from both Pakistan and Finland. The learners in question were taught English as a Foreign Language (EFL) using different pedagogical methods. The study took into account various factors, including the learners' proficiency level, age, and grade, as well as variations in their native language. To assess the impact of the learners' native language and of pedagogical methods on syntactic complexity features, twelfth-grade EFL students from upper-secondary schools in both nations were given identical instructions and time limits to complete an English academic essay on the same topic. The study used the L2 Syntactic Complexity Analyzer (L2SCA) to extract fourteen syntactic complexity features, and Mann-Whitney U tests were used to analyze the differences in these features between the two groups. The study revealed significant differences between Finnish and Pakistani EFL learners, attributable to variations in their native language and to the effects of pedagogical methods on syntactic complexity features. The implications of this study extend to language testing and assessment, the CEFR framework, and pedagogy in both Finland and Pakistan.

    doi:10.1016/j.asw.2024.100839
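
    The group comparison in the study above rests on the Mann-Whitney U test, whose statistic reduces to a rank sum over the pooled samples. A self-contained Python sketch — illustrative, not the study's code, and the sample values below are hypothetical:

    ```python
    def mann_whitney_u(xs, ys):
        """U statistic for group xs via rank sums; ties get average ranks."""
        combined = sorted((v, i) for i, v in enumerate(xs + ys))
        ranks = [0.0] * len(combined)
        i = 0
        while i < len(combined):
            j = i
            # extend j over a block of tied values
            while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
                j += 1
            avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tie block
            for k in range(i, j + 1):
                ranks[combined[k][1]] = avg_rank
            i = j + 1
        rank_sum = sum(ranks[:len(xs)])  # xs occupy original indices 0..len(xs)-1
        return rank_sum - len(xs) * (len(xs) + 1) / 2

    # Hypothetical per-text complexity values for two learner groups:
    print(mann_whitney_u([8.1, 7.4, 9.0, 6.8], [6.2, 5.9, 7.0, 6.5]))
    ```

    In practice a library routine such as `scipy.stats.mannwhitneyu` would also supply the p-value; the sketch only shows where the statistic itself comes from.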
  2. Assessing writing and spelling interest and self-beliefs: Does the type of pictorial support affect first and third graders’ responses?
    Abstract

    An array of pictorial supports (e.g., emojis, geometrical figures, animals) is often used in studies assessing young students’ writing motivation with Likert scales. However, although these images may influence the students’ responses, sufficient rationales for these choices are often absent from the studies. To the best of our knowledge, the present study is the first to investigate two different types of pictorial support (circles vs. faces) in Likert scales assessing first and third graders’ writing interest, self-concept, and spelling interest and self-efficacy. The samples consist of 2197 first graders (mean age 6.8 years) and 1740 third graders (mean age 8.4 years). Results show statistically significant differences among the scales, indicating that when face-scales are used, first graders skip motivation items more often and students in both grades avoid the minimum values of the scale more often. Gender differences are also found, indicating that when face-scales are used, boys in third grade avoid maximum values more often and girls in both grades avoid the minimum values more often. These findings suggest that circle-scales are more appropriate than face-scales for measuring young students’ writing and spelling interest and self-beliefs.

    doi:10.1016/j.asw.2024.100833

January 2024

  1. A mixed Rasch model analysis of multiple profiles in L2 writing
    Abstract

    The present study used the Mixed Rasch Model (MRM) to identify multiple profiles in L2 students’ writing with regard to several linguistic features, including content, organization, grammar, vocabulary, and mechanics. To this end, a pool of 500 essays written by English as a foreign language (EFL) students was rated by four experienced EFL teachers using the Empirically-derived Descriptor-based Diagnostic (EDD) checklist. The ratings were subjected to MRM analysis. Two distinct profiles of L2 writers emerged from the analyzed sample: (a) Sentence-Oriented and (b) Paragraph-Oriented L2 Writers. Sentence-Oriented L2 Writers tend to focus on linguistic features, such as grammar, vocabulary, and mechanics, at the sentence level and try to utilize these subskills to generate a written text. In contrast, Paragraph-Oriented Writers are inclined to move beyond the boundaries of a sentence and attend to the structure of a whole paragraph, using higher-order features such as content and organization. The two profiles were further examined to capture their unique features. Finally, the theoretical and pedagogical implications of identifying L2 writing profiles and suggestions for further research are discussed.

    doi:10.1016/j.asw.2023.100803

April 2023

  1. The predictive powers of fine-grained syntactic complexity indices for letter writing proficiency and their relationship to pragmatic appropriateness
    doi:10.1016/j.asw.2023.100707

January 2023

  1. Assessing the writing quality of English research articles based on absolute and relative measures of syntactic complexity
    doi:10.1016/j.asw.2022.100692

October 2022

  1. Integrated writing and its correlates: A meta-analysis
    Abstract

    Integrated tasks are increasing in popularity, either replacing or complementing independent writing-only tasks in writing assessments. This shift has generated considerable research interest in the underlying construct and features of integrated writing (IW) performances. However, owing to the complexity of the IW construct, there are conflicting findings about whether, and to what extent, various language skills and IW text features correlate with IW scores. To understand the construct of IW, we conducted a meta-analysis synthesizing correlation coefficients between scores of IW performances and (1) other language skills and (2) text quality features of IW. We also examined factors that may moderate the correlation of IW scores with these two groups of correlates. We found that (1) reading and writing skills showed stronger correlations with IW scores than listening did; and (2) text length showed the strongest correlation, followed by source integration, organization, and syntactic complexity, with lexical complexity showing the smallest. Several IW task features affected the magnitude of the correlations. The results support the view that IW is a construct that is independent from, albeit related to, other language skills, and that IW task features may affect the construct of IW.

    doi:10.1016/j.asw.2022.100662

July 2022

  1. Diversity of Advanced Sentence Structures (DASS) in writing predicts argumentative writing quality and receptive academic language skills of fifth-to-eighth grade students
    doi:10.1016/j.asw.2022.100649

April 2022

  1. The trajectory of syntactic complexity development in L1 Chinese narrative writings of primary school children: A systematic 5-year longitudinal study
    doi:10.1016/j.asw.2022.100622
  2. Automated writing evaluation: Does spelling and grammar feedback support high-quality writing and revision?
    doi:10.1016/j.asw.2022.100608

January 2022

  1. Revisiting the predictive power of traditional vs. fine-grained syntactic complexity indices for L2 writing quality: The case of two genres
    doi:10.1016/j.asw.2021.100597

January 2021

  1. Syntactic complexity in L2 learners’ argumentative writing: Developmental stages and the within-genre topic effect
    doi:10.1016/j.asw.2020.100506

October 2020

  1. ‘I will go to my grave fighting for grammar’: Exploring the ability of language-trained raters to implement a professionally-relevant rating scale for writing
    doi:10.1016/j.asw.2020.100488

July 2019

  1. Error analysis and diagnosis of ESL linguistic accuracy: Construct specification and empirical validation
    doi:10.1016/j.asw.2019.05.002

January 2019

  1. The influence of lexical features on teacher judgements of ESL argumentative essays
    Abstract

    Numerous studies have examined the relationship between lexical features of students’ compositions and judgements of text quality. However, the degree to which the quality of vocabulary in students’ essays influences teachers’ judgements of other textual characteristics remains relatively unexplored. This experimental study investigates the influence of lexical features on teachers’ judgements of English as a second language (ESL) argumentative essays. Using analytic and holistic rating scales, English pre-service teachers (N = 37) in Switzerland assessed four essays of different proficiency levels in which the levels of lexical diversity and sophistication had been experimentally varied. Coh-Metrix software was used to manipulate the level of lexical diversity, as measured by MTLD and D, and the Tool for the Automatic Analysis of Lexical Sophistication (TAALES) was used to obtain differing levels of lexical sophistication, as measured by word range. The results suggested that texts with greater lexical diversity and sophistication were assessed more positively regarding their overall quality as well as the analytic criteria ‘grammar’ and ‘frame of essay’. The implications of this study for classroom practice and teacher education are discussed.

    doi:10.1016/j.asw.2018.12.003

January 2018

  1. Analysis of syntactic complexity in secondary education EFL writers at different proficiency levels
    doi:10.1016/j.asw.2017.11.002

July 2016

  1. Searching for differences and discovering similarities: Why international and resident second-language learners’ grammatical errors cannot serve as a proxy for placement into writing courses
    doi:10.1016/j.asw.2016.05.001