Automatic item generation

[3][4][5] More recently, neural networks, including large language models such as the GPT family, have been used successfully to generate items automatically.

[8] Characteristics measured by psychological and educational tests include academic abilities, school performance, intelligence, and motivation.

Achieving measurement quality standards, such as test validity, is one of the most important objectives for psychologists and educators.

For this reason, incidentals are believed to produce only slight differences among the item parameters of the isomorphs.
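The idea of isomorphs that differ only in incidentals can be illustrated with a minimal Python sketch. It is not taken from any of the generators cited in this article: the template, the name and object lists, and the function `generate_isomorphs` are all hypothetical. The sketch assumes a simple arithmetic item whose structure (the operands and the operation) is held fixed while surface features vary.

```python
import random

# Hypothetical item model: the arithmetic structure stays fixed,
# while names and objects (treated here as incidentals) vary.
TEMPLATE = ("{name} has {a} {obj} and buys {b} more. "
            "How many {obj} does {name} have now?")
NAMES = ["Ana", "Ben", "Chen"]
OBJECTS = ["apples", "pencils", "marbles"]


def generate_isomorphs(a, b, n, seed=0):
    """Return n items sharing the same operands but different surface features."""
    rng = random.Random(seed)
    # Every (name, object) pair is one possible surface variation.
    combos = [(name, obj) for name in NAMES for obj in OBJECTS]
    chosen = rng.sample(combos, n)  # sampled without repetition
    return [
        {"stem": TEMPLATE.format(name=name, obj=obj, a=a, b=b), "key": a + b}
        for name, obj in chosen
    ]


items = generate_isomorphs(a=3, b=4, n=3)
for item in items:
    print(item["stem"], "->", item["key"])
```

All generated items share the same answer key because only surface features change; whether their empirical item parameters are in fact close would still have to be checked with response data.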

[17][18] A test of melodic discrimination developed with the aid of the computational model Rachman-Jun 2015[19] was administered to participants in a 2017 trial.

[20] Ferreyra and Backhoff-Escudero[21] generated two parallel versions of the Basic Competences Exam (Excoba), a general test of educational skills, using a program they developed called GenerEx.

Holling, Bertling, and Zeuch[28] used probability theory to automatically generate mathematical word problems with expected difficulties.
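As a rough illustration of how a probability word problem and its answer key might be generated together, consider the following Python sketch. It does not reproduce the procedure of the cited study: the urn scenario, the function `urn_item`, and the `steps` field (a crude, assumed stand-in for a difficulty-related feature) are hypothetical.

```python
from fractions import Fraction
from math import comb


def urn_item(red, blue, draws):
    """Build one word problem: probability that all drawn balls are red."""
    total = red + blue
    # Drawing without replacement: favourable subsets over all subsets.
    key = Fraction(comb(red, draws), comb(total, draws))
    stem = (f"An urn contains {red} red and {blue} blue balls. "
            f"{draws} balls are drawn without replacement. "
            f"What is the probability that all of them are red?")
    # 'steps' is an assumed proxy for difficulty, not an estimated parameter.
    return {"stem": stem, "key": key, "steps": draws}


item = urn_item(red=3, blue=2, draws=2)
print(item["stem"])
print("Key:", item["key"])
```

In practice, any such feature would need to be related to observed item difficulties before it could be used to predict them.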

Holling, Blank, Kuchenbäcker, and Kuhn[31] conducted a similar study with statistical word problems, but without using AIG.

[35] A study that identified sources of measurement bias related to response elimination strategies for figural matrix items concluded that distractor salience encourages the use of such strategies, and that this knowledge could be incorporated into AIG to improve the construct validity of these items.

[36] The same group used AIG to study differential item functioning (DIF) and gender differences associated with mental rotation.

He concluded that GeomGen was more suitable for AIG because IRT principles can be incorporated during item generation.

[39] In a parallel research project using GeomGen, Arendasy and Sommer[40] found that variation of the perceptual organization of items could influence the performance of respondents depending on their ability levels and that it had an effect on several psychometric quality indices.

Four-rule-based figural analogy stem automatically generated with the IMak package (for more information, see Blum & Holling, 2018).