The upper and lower bounds were selected because an interval of plus or minus three standard deviations around the mean contains more than 99% of a normally distributed population.
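The 99% figure can be checked numerically. The sketch below assumes the scores follow a normal distribution, which the text does not state explicitly; for an arbitrary distribution, Chebyshev's inequality guarantees only about 89% within three standard deviations. The function name `normal_coverage` is illustrative.

```python
import math


def normal_coverage(k: float) -> float:
    """Fraction of a normal population lying within k standard deviations of the mean."""
    # For a normal distribution, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


coverage_3sd = normal_coverage(3.0)
print(f"{coverage_3sd:.4f}")  # 0.9973
```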
Note that scaling does not affect the psychometric properties of a test; it is something that occurs after the assessment process (and equating, if present) is completed.
The number of right answers or the sum of item scores (where partial credit is given) is assumed to be the appropriate and sufficient measure of current performance status.
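As a minimal illustration of this scoring rule, the raw score is simply the sum of the per-item scores. The item values below are hypothetical and serve only to show the dichotomous and partial-credit cases.

```python
def raw_score(item_scores):
    """Return the raw test score as the sum of per-item scores."""
    return sum(item_scores)


# Hypothetical data: 1 = right, 0 = wrong.
number_right = raw_score([1, 0, 1, 1, 0])

# Hypothetical data with partial credit on some items.
with_partial_credit = raw_score([1, 0.5, 0, 2, 1])

print(number_right)         # 3
print(with_partial_credit)  # 4.5
```

Note that two test takers with the same raw score may have answered entirely different items correctly, which this single number does not reveal.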
In the first place, a correct answer can be reached through memorization, without any deep understanding of the underlying content or conceptual structure of the problem posed.
This departure should depend on the level of psycholinguistic maturity of the student choosing or giving the answer in the vernacular in which the test is written.
[3] Such extraction processes, the Rasch model for instance, are standard practice for item development among professionals.
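For readers unfamiliar with it, the dichotomous Rasch model expresses the probability of a correct answer as a logistic function of the difference between a person's ability and an item's difficulty. The sketch below shows only that standard formula; the function and parameter names are illustrative, and it does not represent how any particular testing program estimates these parameters.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# When ability equals difficulty, the probability of success is one half.
print(rasch_probability(0.0, 0.0))  # 0.5
```

The probability rises above one half as ability exceeds item difficulty and falls below it otherwise.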
This commentary suggests that the current scoring procedure conceals the dynamics of the test-taking process and obscures the capabilities of the students being assessed.
This RSE approach provides an interpretation of every answer, whether right or wrong, that indicates the likely thought processes used by the test taker.
[5] Among other findings, this chapter reports that the recoverable information explains two to three times more of the test variability than the right answers alone.