How ETS Human Raters Use the TOEFL Speaking Rubric to Evaluate Responses

The TOEFL Speaking section assesses a test-taker’s ability to communicate effectively in English. Human raters from ETS (Educational Testing Service) evaluate these responses using well-defined TOEFL Speaking rubric guidelines. Understanding how raters apply the rubric criteria can help test-takers prepare more effectively.

The Role of Human Raters

Human raters are extensively trained by ETS to provide holistic evaluations of non-native speakers’ responses to TOEFL Speaking questions. These US-based professionals ensure that each test is assessed fairly and consistently according to the TOEFL speaking rubric standards.

Calibration and Daily Preparation

Every day before scoring begins, human raters undergo a calibration process to ensure their evaluations align with ETS standards. Calibration involves reviewing sample responses and their corresponding scores to maintain scoring accuracy and consistency. This daily practice helps raters stay sharp and ensures they apply the TOEFL Speaking rubric consistently.
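One way to picture calibration is as an agreement check between a rater's scores and the reference scores attached to the sample responses. The sketch below is only an illustration: the function name, the sample scores, and the idea of reporting exact and adjacent (within one point) agreement are assumptions, not ETS's published procedure.

```python
def calibration_agreement(rater_scores, reference_scores):
    """Return (exact, adjacent) agreement rates between two score lists.

    Hypothetical helper: 'adjacent' means the two scores differ by at most 1.
    """
    if not rater_scores or len(rater_scores) != len(reference_scores):
        raise ValueError("score lists must be non-empty and the same length")
    pairs = list(zip(rater_scores, reference_scores))
    exact = sum(1 for r, ref in pairs if r == ref)
    adjacent = sum(1 for r, ref in pairs if abs(r - ref) <= 1)
    return exact / len(pairs), adjacent / len(pairs)


# Illustrative scores for four sample responses (made-up values).
exact_rate, adjacent_rate = calibration_agreement([3, 2, 4, 1], [3, 3, 4, 3])
print(exact_rate, adjacent_rate)
```

A rater whose agreement rates fall short of whatever threshold the program sets would review more samples before scoring live responses; the threshold itself is not specified here.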

Rubric Components for Independent Speaking Tasks

For the Independent Speaking tasks, raters use a TOEFL Speaking rubric that focuses on three key areas:

Delivery

  • Clarity and Fluency: Raters assess how clearly and fluently the test-taker speaks. This includes pronunciation, intonation, and pacing. Effective delivery means speaking in a natural and smooth manner, without frequent pauses or hesitations.
  • Ease of Understanding: The speech should be easy to follow, with appropriate stress and rhythm.

Language Use

  • Grammar and Vocabulary: Raters look for accurate and varied grammatical structures and vocabulary. The use of complex sentences and a rich vocabulary demonstrates a higher level of proficiency.
  • Precision: The language used should precisely convey the intended meaning.

Topic Development

  • Relevance and Coherence: Responses should be relevant to the prompt and well-organized. Ideas should be logically connected and clearly presented.
  • Support and Detail: Providing specific examples and details to support the main ideas is crucial for a high score.

Rubric Components for Integrated Speaking Tasks

For the Integrated Speaking tasks, the TOEFL Speaking rubric focuses on similar criteria but with additional emphasis on the integration of information:

Delivery

  • Clarity and Fluency: As with the Independent tasks, clarity and fluency are critical. The speech should be smooth and natural.
  • Comprehensibility: The response should be easy to understand, with appropriate use of stress and intonation.

Language Use

  • Grammar and Vocabulary: Accurate and varied language use is essential. The response should demonstrate control over complex grammatical structures and appropriate vocabulary.
  • Integration of Sources: Effective paraphrasing and summarizing of information from reading and listening materials are key.

Topic Development

  • Relevance and Coherence: Responses must be relevant and coherent, with clear connections between ideas.
  • Integration and Accuracy: Raters look for accurate representation and integration of information from the reading and listening passages. Proper summarization and synthesis of the content are important.

Rater Guidelines and Procedures

To ensure fairness and objectivity:

  • Anonymity: Raters do not know the identities of test-takers. This anonymity helps prevent bias.
  • Limited Exposure: No single rater can score more than two responses from the same test-taker, ensuring diverse perspectives in scoring.
  • Task Inputs: Raters have access to a reference sheet (a “cheat sheet”) that summarizes the task inputs. This helps them determine whether the test-taker’s response is relevant and on-topic under the TOEFL Speaking rubric guidelines.
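The limited-exposure rule above can be read as a constraint on how responses are distributed among raters. The following is a minimal sketch of one possible assignment scheme, assuming a simple round-robin over a rater pool; the function name, the rater labels, and the anonymized IDs are hypothetical, and ETS's actual distribution system is not described in this article.

```python
from collections import defaultdict


def assign_raters(responses, raters, max_per_rater=2):
    """Assign each (test_taker_id, task_id) response to a rater.

    No rater receives more than `max_per_rater` responses from the same
    test-taker. Raters only ever see the anonymized test_taker_id.
    """
    load = defaultdict(int)  # (rater, test_taker_id) -> responses assigned
    assignment = {}
    for i, (taker, task) in enumerate(responses):
        # Rotate the starting rater so work is spread across the pool.
        for offset in range(len(raters)):
            rater = raters[(i + offset) % len(raters)]
            if load[(rater, taker)] < max_per_rater:
                load[(rater, taker)] += 1
                assignment[(taker, task)] = rater
                break
        else:
            raise ValueError(f"not enough raters for test-taker {taker}")
    return assignment


# Four responses from one anonymized test-taker, shared between two raters.
responses = [("anon-001", task) for task in (1, 2, 3, 4)]
print(assign_raters(responses, ["rater-A", "rater-B"]))
```

With four responses and a cap of two, at least two different raters end up scoring each test-taker, which is the "diverse perspectives" effect the rule aims for.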

Training and Calibration of Human Raters

ETS provides extensive training to human raters, including familiarization with the TOEFL speaking rubric guide, practice scoring of sample responses, and regular calibration sessions to maintain scoring consistency. Raters are periodically evaluated to ensure they adhere to the scoring standards.

Combining SpeechRater and Human Rater Scores

In addition to human raters, ETS’s automated scoring system, SpeechRater, also evaluates TOEFL Speaking responses. SpeechRater uses advanced speech recognition and natural language processing technologies to assess various aspects of spoken language.

How SpeechRater Works

SpeechRater evaluates responses based on features similar to those assessed by human raters, such as pronunciation, fluency, vocabulary, and grammar. It analyzes the acoustic properties of speech and the linguistic content to provide an objective score. Some of the specific features analyzed by SpeechRater include:

  • Pronunciation Accuracy: How closely the speech matches standard American English pronunciation.
  • Fluency: The smoothness and pace of speech, including the presence of hesitations or pauses.
  • Vocabulary and Grammar: The complexity and accuracy of the language used.

Combining Scores into a Scaled Score

The final TOEFL Speaking score is a combination of scores from human raters and SpeechRater. Here’s how the process typically works:

  • Initial Scoring: Each speaking response is scored by multiple human raters and SpeechRater.
  • Score Normalization: To ensure fairness, scores from human raters and SpeechRater are normalized. This means they are adjusted to account for any differences in scoring tendencies.
  • Score Integration: The normalized scores from both human raters and SpeechRater are combined to form a composite score. This composite score is then converted into a scaled score out of 30.
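The three steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ETS's actual formula: the leniency offset used for normalization, the 50/50 weighting of human and automated scores, and the linear mapping onto the 0 to 30 scale are all hypothetical, and every function name is invented for the example. The only fixed points are the 0 to 4 per-task range used by the public ETS Speaking rubrics and the 0 to 30 reported scale.

```python
from statistics import mean

MAX_TASK_SCORE = 4.0  # public ETS Speaking rubrics rate each task from 0 to 4
MAX_SCALED = 30       # the reported Speaking score runs from 0 to 30


def normalize(raw, leniency_offset=0.0):
    """Subtract a scorer's (hypothetical) leniency offset, then clamp to 0-4."""
    return min(MAX_TASK_SCORE, max(0.0, raw - leniency_offset))


def composite_task_score(human, machine, human_weight=0.5):
    """Blend normalized human and automated scores; the weight is an assumption."""
    return human_weight * human + (1.0 - human_weight) * machine


def scaled_score(task_scores):
    """Map the mean 0-4 task score linearly onto 0-30 (a simplification)."""
    return round(mean(task_scores) / MAX_TASK_SCORE * MAX_SCALED)


# Illustrative (human, automated) raw score pairs, one per task.
raw_pairs = [(3.0, 3.4), (4.0, 3.6), (3.0, 2.8), (3.0, 3.2)]
composites = [
    composite_task_score(normalize(h, 0.2), normalize(m, 0.0))
    for h, m in raw_pairs
]
print(scaled_score(composites))
```

The actual conversion from task scores to the scaled score may not be linear, so treat the last step as a rough approximation only.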

Benefits of Combining Scores

Combining human and automated scores leverages the strengths of both methods:

  • Human Insight: Human raters bring nuanced understanding and the ability to evaluate complex aspects of communication that may be challenging for automated systems.
  • Consistency and Objectivity: SpeechRater adds a layer of consistency and objectivity, reducing the potential for human bias and variability in scoring.

How My Speaking Score Uses SpeechRater

My Speaking Score leverages SpeechRater’s technology to evaluate test-takers using a criterion-referenced framework. This proprietary technology assesses responses on four primary dimensions:

  • Delivery: How clearly and fluently the test-taker speaks.
  • Language Use: The accuracy and complexity of grammatical structures and vocabulary.
  • Topic Development: The relevance and coherence of the response.
  • Integration: Effective paraphrasing and summarizing of information from reading and listening materials.

Norm-Referenced Data on 12 Dimensions

In addition to the primary dimensions, My Speaking Score provides norm-referenced data on 12 dimensions that loosely align with the TOEFL Speaking rubric:

  • Speaking Rate: The speed at which the test-taker speaks.
  • Sustained Speech: The ability to speak continuously without unnecessary pauses.
  • Pause Frequency: The frequency of pauses in the speech.
  • Distribution of Pauses: How pauses are distributed throughout the speech.
  • Phrase Length: The length of phrases spoken.
  • Repetitions: The number of repeated words or phrases.
  • Rhythm: The natural flow and rhythm of the speech.
  • Vowels: Pronunciation and clarity of vowel sounds.
  • Vocabulary Depth: The range and depth of vocabulary used.
  • Vocabulary Diversity: The variety of vocabulary used.
  • Grammatical Accuracy: The correctness of grammatical structures.
  • Discourse Coherence: The logical flow and coherence of the response.
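To make a few of these dimensions concrete, the sketch below computes rough stand-ins for speaking rate, pause frequency, phrase length, repetitions, and vocabulary diversity from word-level timestamps, and then places one value against a reference group, which is what "norm-referenced" means in practice. These are simplified, assumed definitions chosen for illustration; they are not the feature definitions used by SpeechRater or My Speaking Score, and the 0.5-second pause threshold and the sample data are made up.

```python
def fluency_features(words, pause_threshold=0.5):
    """Compute simple fluency measures.

    words: list of (token, start_sec, end_sec) tuples in time order.
    """
    if not words:
        raise ValueError("need at least one word")
    duration = words[-1][2] - words[0][1]
    if duration <= 0:
        raise ValueError("timestamps must span a positive duration")
    tokens = [w[0].lower() for w in words]
    # Silent gaps between consecutive words; long gaps count as pauses.
    gaps = [nxt[1] - cur[2] for cur, nxt in zip(words, words[1:])]
    pauses = [g for g in gaps if g >= pause_threshold]
    minutes = duration / 60.0
    return {
        "speaking_rate_wpm": len(tokens) / minutes,
        "pauses_per_minute": len(pauses) / minutes,
        # Phrases are runs of words separated by pauses.
        "mean_phrase_length": len(tokens) / (len(pauses) + 1),
        # Immediate word repetitions such as "I I like".
        "repetitions": sum(1 for a, b in zip(tokens, tokens[1:]) if a == b),
        # Type-token ratio as a crude proxy for vocabulary diversity.
        "type_token_ratio": len(set(tokens)) / len(tokens),
    }


def percentile_rank(value, reference):
    """Percentage of reference values at or below `value`."""
    return 100.0 * sum(1 for r in reference if r <= value) / len(reference)


sample = [("I", 0.0, 0.2), ("I", 0.3, 0.5), ("like", 0.6, 1.0), ("tea", 2.0, 3.0)]
features = fluency_features(sample)
print(features)
print(percentile_rank(features["speaking_rate_wpm"], [60, 90, 120, 150]))
```

A real system would draw the timestamps from a speech recognizer and compare against a large pool of scored responses rather than a four-item list.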


Conclusion

Understanding how ETS human raters and SpeechRater apply the TOEFL Speaking rubric to evaluate responses provides valuable insight for test-takers. By focusing on clarity, fluency, grammatical accuracy, and coherent topic development, you can improve your performance in the TOEFL Speaking section. Combining human and automated scores supports a fair and comprehensive assessment, leading to a reliable scaled score out of 30. Tools like My Speaking Score can provide detailed feedback and help you improve across multiple dimensions, boosting your chances of achieving a high TOEFL Speaking score.

Frequently Asked Questions

What are the main components of the TOEFL Speaking rubric?

The main components are Delivery, Language Use, and Topic Development for both Independent and Integrated Speaking tasks.


How do human raters keep scoring fair and consistent?

Human raters undergo daily calibration, score anonymized responses, and score no more than two responses from any single test-taker.

How does SpeechRater evaluate responses?

SpeechRater evaluates responses based on pronunciation, fluency, vocabulary, and grammar, providing an objective score that complements human raters’ evaluations.