A method for adaptive scoring of responses to constructed response
questions is disclosed. Adaptive scoring may be used to apply evaluator
time in such a way that a predetermined reliability level is reached with
the least possible use of evaluator time, including adjusting the number
of responses graded and/or the number of evaluators grading each response.
A score may be calculated after grading a subset of a test taker's
responses to the constructed response questions. A probability or an
error estimate is calculated and compared to a threshold value. Grading
may be discontinued based on the comparison. A score may be calculated
based on a predetermined number of ratings for the test taker's response
to a constructed response question. A probability that the score is within a
predetermined range of what the score would be if all the responses were
graded is calculated. If the probability is less than a threshold value,
the number of ratings is increased.
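
The stopping rule described above may be illustrated with a minimal sketch. It rests on assumptions not stated in the disclosure: the score is taken to be the mean of the graded responses, the error estimate is taken to be the standard error of that mean, and the names adaptive_score, max_error, and min_items are hypothetical.

```python
import math
from statistics import mean, stdev


def adaptive_score(item_scores, max_error=0.25, min_items=3):
    """Grade responses one at a time and stop once the error estimate is small.

    item_scores: non-empty iterable yielding one score per graded response
                 (hypothetical input).
    max_error:   threshold on the standard error of the mean (assumed
                 error estimate, not taken from the disclosure).
    min_items:   minimum number of responses graded before stopping is
                 allowed (at least 2, so the standard deviation is defined).
    Returns (score_estimate, number_of_responses_graded).
    """
    graded = []
    for score in item_scores:
        graded.append(score)
        if len(graded) < min_items:
            continue
        # Assumed error estimate: standard error of the mean score so far.
        standard_error = stdev(graded) / math.sqrt(len(graded))
        if standard_error <= max_error:
            break  # Threshold met: discontinue grading.
    return mean(graded), len(graded)
```

Under these assumptions, a test taker whose graded responses agree closely is scored after only min_items responses, while a test taker with widely varying scores has every response graded. The same comparison could instead drive the number of ratings per response, adding ratings while the threshold is not met.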