Assessing Writing
149 articles
January 2020
-
Linking TOEFL iBT® writing rubrics to CEFR levels: Cut scores and validity evidence from a standard setting study
Abstract
English writing is a key competence for higher education success. However, research on the assessment of writing skills in English as a foreign language in European upper secondary education (i.e. beyond year 9) remains scarce. The Common European Framework of Reference (CEFR) describes language proficiency on a scale of six ascending levels (A1-C2). For writing skills at the end of secondary education in Europe, the common standard is vantage level B2. In this study, experts from Germany and Switzerland linked upper secondary students’ writing profiles elicited in a constructed response test (integrated and independent essays from the TOEFL iBT®) to CEFR levels. Standard setting methodology (a modified examinee paper selection/performance profile approach) was used to establish the linkages. The study reports the methodology and procedure of the standard setting process and discusses the procedural and internal validity of resulting cut scores. It also applies the cut scores to a large sample of upper secondary students in Germany and Switzerland to gain evidence for external and consequential validity.
July 2019
April 2019
January 2019
-
Abstract
Numerous studies have examined the relationship between lexical features of students’ compositions and judgements of text quality. However, the degree to which teachers’ judgements are influenced by the quality of vocabulary in students’ essays with regard to their assessment of other textual characteristics is relatively unexplored. This experimental study investigates the influence of lexical features on teachers’ judgements of English as a second language (ESL) argumentative essays. Using analytic and holistic rating scales, English pre-service teachers (N = 37) in Switzerland assessed four essays of different proficiency levels in which the levels of lexical diversity and sophistication had been experimentally varied. Coh-Metrix software was used to manipulate the level of lexical diversity, as measured by MTLD and D, and the Tool for the Automatic Analysis of Lexical Sophistication (TAALES) software was used to obtain differing levels of lexical sophistication, as measured by word range. The results suggested that texts with greater lexical diversity and sophistication were assessed more positively concerning their overall quality as well as the analytic criteria ‘grammar’ and ‘frame of essay’. The implications of this study for classroom practice and teacher education are discussed.
October 2018
July 2018
April 2018
-
Going online: The effect of mode of delivery on performances and perceptions on an English L2 writing test suite
Abstract
In response to changing stakeholder needs, large-scale language test providers have increasingly considered the feasibility of delivering paper-based examinations online. Evidence is required, however, to determine whether online delivery of writing tests results in changes to writing performance reflected in differential test scores across delivery modes, and whether test-takers hold favourable perceptions of online delivery. The current study aimed to determine the effect of delivery mode on the two writing tasks (reading-into-writing and extended writing) within the Trinity College London Integrated Skills in English (ISE) test suite across three proficiency levels (CEFR B1-C1). A total of 283 test-takers (107 at ISE I/B1, 109 at ISE II/B2, and 67 at ISE III/C1) completed both writing tasks in both paper-based and online modes. Test-takers also completed a questionnaire to gauge perceptions of the impact, usability and fairness of the delivery modes. Many-facet Rasch measurement (MFRM) analysis of scores revealed that delivery mode had no discernible effect, except on the reading-into-writing task at ISE I, where the paper-based mode was slightly easier. Test-takers generally held more positive perceptions of the online delivery mode, although technical problems were reported. Findings are discussed with reference to the need for further research into interactions between delivery mode, task and level.