AI-ASSISTED SCORING OF FOREIGN LANGUAGE PROFICIENCY EXAMS: PROBLEMS AND SOLUTIONS

Kamalova Shakhlo Nugmanovna

Vol. 1 No. 2 (2026), Articles

Published 2026-03-03

Kamalova Shakhlo Nugmanovna

Senior lecturer at Sarbon University in Tashkent

Keywords

AI scoring; automated essay scoring; speech assessment; language testing; validity; fairness; bias auditing; human-in-the-loop; CEFR; high-stakes exams.

How to Cite

Kamalova, S. N. (2026). AI-ASSISTED SCORING OF FOREIGN LANGUAGE PROFICIENCY EXAMS: PROBLEMS AND SOLUTIONS. International Conference on Education, Psychology and Humanities, 1(2), 35-45. https://www.econferencia.com/index.php/10/article/view/253

Abstract

The use of artificial intelligence (AI) to score foreign language proficiency exams is expanding due to the need for scalability, faster turnaround, and cost efficiency—especially in large-scale, high-stakes contexts. AI-assisted scoring commonly includes automated marking of selected-response items, automated essay scoring (AES) for writing, and speech technologies (automatic speech recognition and pronunciation/prosody models) for speaking assessments. Despite operational benefits, AI scoring introduces critical challenges: construct underrepresentation, bias and fairness risks across accents and demographic groups, limited explainability, vulnerability to gaming, domain shift across prompts and test forms, and governance issues related to data privacy and accountability. This article synthesizes key problems associated with AI-based scoring and proposes a solutions framework centered on construct validity, human-in-the-loop moderation, rigorous psychometric calibration, continuous bias auditing, robust security controls, and transparent candidate-facing policies (including appeal mechanisms). The paper argues that AI can be used responsibly in language assessment only when it is embedded within a defensible assessment design that prioritizes validity, reliability, and equity.
