-
Bramley (2019)
The effect of adaptivity on the reliability coefficient in adaptive comparative judgement
Assessment in Education: Principles, Policy & Practice
-
Chambers (2022)
Exploring the validity of comparative judgement: Do judges attend to construct-irrelevant…
-
Cohen (1988)
Statistical power analysis for the behavioral sciences
-
Council of Europe (2009)
Relating language examinations to the Common European Framework of Reference for Languages: Learning, teaching, assessment (CEFR): A Manual
-
Crossley (2023)
Crowd-sourcing human ratings of linguistic production
Proceedings of the Annual Meeting of the Cognitive Science Society
-
Douglas (2023)
Data quality in online human-subjects research: Comparisons between MTurk, Prolific, Clou…
-
Ericsson (1993)
Protocol analysis: verbal reports as data
-
Everitt, B.S., & Skrondal, A. (2010). The Cambridge dictionary of statistics. 〈http://196.43.179.6:8080/xmlui…
-
Granger (2020)
-
Gravetter, F.J., & Wallnau, L.B. (2017). Statistics for the Behavioral Sciences (10th ed.). Cengage. 〈https:/…
-
Green (2020)
Exploring language assessment and testing: language in action
-
Han (2022)
A comparative judgment approach to assessing Chinese Sign Language interpreting
-
Jones, I. (2022). Sirt functions [Computer software]. 〈https://github.com/NoMoreMarking/sirt/blob/main/R/sirt…
-
Jones (2023)
Comparative judgement in education research
International Journal of Research & Method in Education
-
Jones (2015)
The problem of assessing problem solving: can comparative judgement help?
Educational Studies in Mathematics
-
Jones, I., & Inglis, M. (2023). The validity of comparative judgement: A comment on Kelly, Richardson and Isa…
-
Jones (2016)
Fifty years of A-level mathematics: Have standards changed?
British Educational Research Journal
-
Kelly (2022)
Critiquing the rationales for using comparative judgement: A call for clarity
Assessment in Education: Principles, Policy & Practice
-
Landis (1977)
The measurement of observer agreement for categorical data
-
Landrieu (2022)
Assessing the quality of argumentative texts: examining the general agreement between dif…
-
Lesterhuis (2022)
Validity of comparative judgment scores: How assessors evaluate aspects of text quality w…
-
Lesterhuis (2018)
When teachers compare argumentative texts: Decisions informed by multiple complex aspects…
L1-Educational Studies in Language and Literature
-
Messick (1989)
Validity
In Educational measurement
-
Paquot (2022)
Crowdsourced adaptive comparative judgment: A community-based solution for proficiency rating
-
Park (2022)
Proficiency reporting practices in research on second language acquisition: Have we made …
-
Peer (2022)
Data quality of platforms and panels for online behavioral research
Behavior Research Methods
-
Pinot de Moira (2022)
The classification accuracy and consistency of comparative judgement of writing compared …
-
R Core Team. (2023). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Co…
-
Robitzsch, A. (2023). sirt: Supplementary Item Response Theory Models. 〈https://CRAN.R-project.org/package=sirt〉.
-
Şahin (2021)
Feasibility of using comparative judgement and student judges to assess writing performan…
Journal of Pedagogical Research
-
Sims (2020)
Rubric rating with MFRM versus randomly distributed comparative judgment: A comparison of…
Educational Measurement: Issues and Practice
-
Stadthagen-González (2018)
Using two-alternative forced choice tasks and Thurstone’s law of comparative judgments fo…
Linguistic Approaches to Bilingualism
-
Steedle (2016)
Evaluating comparative judgment as an approach to essay scoring
Applied Measurement in Education
-
Thomas (1994)
Assessment of L2 proficiency in second language acquisition research
-
Thomas (2006)
Research synthesis and historiography
Synthesizing Research on Language Learning and Teaching
-
Thurstone (1927)
A law of comparative judgment
-
Thwaites et al. (2024)
Assessing Writing
-
Thwaites (2024)
Comparative judgment for advancing research in applied linguistics
Research Methods in Applied Linguistics
-
Thwaites, P., Kollias, C., & Paquot, M. (under review). Testing crowdsourcing as a means of recruitment for t…
-
Thwaites, P., Vandeweerd, N., & Paquot, M. (2024). Crowdsourced comparative judgement for evaluating learner …
-
Tremblay (2011)
Proficiency assessment standards in second language acquisition research: “Clozing” the Gap
Studies in Second Language Acquisition
-
van Daal (2019)
Validity of comparative judgement to assess academic writing: Examining implications of i…
Assessment in Education: Principles, Policy & Practice
-
Vandeweerd (2024)
Using crowdsourced comparative judgement and rubric-based rating to grade texts in the IC…
Learner Corpus Research 2024
-
Verhavert (2019)
A meta-analysis on the reliability of comparative judgement
Assessment in Education: Principles, Policy & Practice
-
Verhavert (2018)
Scale separation reliability: What does it mean in the context of comparative judgment?
Applied Psychological Measurement
-
Vidal Rodeiro (2022)
Moderation of non-exam assessments: Is Comparative Judgement a practical alternative?
-
Wheadon (2020)
A comparative judgement approach to the large-scale assessment of primary writing in England
Assessment in Education: Principles, Policy & Practice