09 SES 10 C, Methodological Issues in Tests and Assessments
Reasoning tests are popular components of the assessment toolbox for selection and admission into higher education and employment (Leighton, 2004; Stanovich, Sá, & West, 2004). Abstract reasoning tests tap into a core reasoning ability (Carpenter, Just, & Shell, 1990) with tasks that require the examinee to generate and apply rules (Wüstenberg, Greiff, & Funke, 2012), but that require neither explicit content-specific prior knowledge nor specific language skills (Raven, 2000). Traditionally, test construction and assembly have been the product of creative item writing and post-hoc psychometric evaluation, without explicit consideration of cognitive theory (Hunt, Frost, & Lunneborg, 1973). Yet abstract reasoning is, in principle, ideally suited to modern test design (e.g., Embretson, 1998; Mislevy, Almond, & Lukas, 2003), which combines cognitive theory with a more systematic approach to the construction and assembly of test items.
Objective and Research Questions. This study is part of a larger project aimed at reverse engineering an existing abstract reasoning test from a modern test design perspective, in order to set up a virtual item bank that does not store individual items but instead uses automatic item generation rules based on cognitive complexity (see, e.g., Gierl & Haladyna, 2013). The current study is one step towards such a virtual item bank, with research questions focusing on (i) identifying the cognitively relevant item features (i.e., “radicals”) that affect the behaviour of the test and of the participants and (ii) identifying the merely “cosmetic”, cognitively irrelevant item features (i.e., “incidentals”).
The test. The abstract reasoning test is composed of testlets. Each testlet consists of items that relate to the same problem situation, from which the examinee must derive the set of rules needed to solve the individual items. A testlet is structured around a problem set with a varying number of rows, each consisting of a specified input stimulus configuration, an activated set of action buttons, and the resulting output stimulus configuration. The problem set allows the examinee to derive the transformation that is applied to the input when a specific action button is activated. This rule knowledge is necessary to solve the connected items. An item consists of a single row with a specified input stimulus configuration, the activated set of action buttons for that item, and four alternative output stimulus configurations, from which the examinee has to select the correct one.
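The testlet structure described above can be sketched as a small data model. This is only an illustrative sketch: the class names (`Row`, `Item`, `Testlet`), the field names, and the coding of a stimulus configuration as a tuple of symbol names are assumptions made for the example and are not taken from the test itself.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# Assumption: a stimulus configuration is coded as an ordered tuple of symbol names.
Config = Tuple[str, ...]


@dataclass(frozen=True)
class Row:
    """One row of the problem set: input, activated buttons, resulting output."""
    input_config: Config
    buttons: FrozenSet[str]
    output_config: Config


@dataclass(frozen=True)
class Item:
    """One item: input, activated buttons, and four alternative outputs."""
    input_config: Config
    buttons: FrozenSet[str]
    alternatives: Tuple[Config, Config, Config, Config]
    correct_index: int  # position of the correct alternative (0 to 3)

    def is_correct(self, choice: int) -> bool:
        return choice == self.correct_index


@dataclass(frozen=True)
class Testlet:
    """A problem set plus the items that depend on the rules derived from it."""
    problem_set: Tuple[Row, ...]
    items: Tuple[Item, ...]


# Hypothetical example: button "A" swaps the two symbols of the configuration.
example = Testlet(
    problem_set=(
        Row(("triangle", "circle"), frozenset({"A"}), ("circle", "triangle")),
    ),
    items=(
        Item(
            input_config=("square", "circle"),
            buttons=frozenset({"A"}),
            alternatives=(
                ("square", "circle"),
                ("circle", "square"),
                ("circle", "circle"),
                ("square", "square"),
            ),
            correct_index=1,
        ),
    ),
)
```

Keeping the problem set and its items in one object mirrors the dependency in the test: the items cannot be solved without the rows they share.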
Theoretical framework. A rational task analysis of the abstract reasoning test suggests an artificial intelligence algorithm (see Newell & Simon, 1972) that consists of four core steps: (1) Inventorisation: all characteristics of the input and output stimulus configurations in the problem set are registered; (2) Matching: an input/output dissimilarity matrix is computed; (3) Rule finding: computationally, this is similar to solving a system of equations, or to a greedier variant that uses elimination; (4) Rule application: the derived rules are applied to the input of an item. The test has some characteristics built in by design that can be connected directly to this algorithm and to the related (i) cognitive load of the stimulus material and (ii) cognitive complexity of the rules that need to be derived. An example of the former is as simple as the number of symbols in the input stimulus configuration. An example of the latter is whether or not the transformation caused by a specific action button can be derived on its own (i.e., independently of the other action buttons in the problem set). Some theoretically irrelevant item features can also be defined, such as the type of symbol used in a stimulus configuration (e.g., triangle or circle).
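The four steps can be illustrated with a toy implementation. The sketch below rests on strong simplifying assumptions that are not part of the abstract: each stimulus configuration is coded as a vector of integers, each action button adds a fixed but unknown offset vector, and the effects of several buttons simply add up. All function names are hypothetical. Under these assumptions, rule finding reduces to the greedy elimination variant mentioned in step (3).

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

Vector = Tuple[int, ...]
# A problem-set row: (input configuration, activated buttons, output configuration)
Row = Tuple[Vector, FrozenSet[str], Vector]


def subtract(a: Vector, b: Vector) -> Vector:
    """Element-wise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))


def find_rules(problem_set: List[Row]) -> Dict[str, Vector]:
    """Steps 1 to 3: register rows, compute differences, eliminate greedily."""
    # Steps 1 and 2: register every row and its input/output difference.
    pending = [(set(buttons), subtract(out, inp)) for inp, buttons, out in problem_set]
    rules: Dict[str, Vector] = {}
    progress = True
    while progress:
        progress = False
        for buttons, diff in pending:
            unknown = buttons - set(rules)
            if len(unknown) != 1:
                continue  # already solved, or not yet derivable from this row
            # Step 3: remove the effect of the known buttons; the rest is the new rule.
            residual = diff
            for known in buttons & set(rules):
                residual = subtract(residual, rules[known])
            rules[unknown.pop()] = residual
            progress = True
    return rules


def apply_rules(inp: Vector, buttons: FrozenSet[str], rules: Dict[str, Vector]) -> Vector:
    """Step 4: apply the offsets of all activated buttons to the input."""
    result = inp
    for button in buttons:
        result = tuple(r + d for r, d in zip(result, rules[button]))
    return result


def solve_item(inp: Vector, buttons: FrozenSet[str],
               alternatives: Sequence[Vector], rules: Dict[str, Vector]) -> int:
    """Return the index of the alternative that matches the predicted output."""
    return list(alternatives).index(apply_rules(inp, buttons, rules))


# Hypothetical problem set: "A" can be derived on its own, "B" only after "A".
problem_set = [
    ((1, 0), frozenset({"A"}), (2, 0)),
    ((0, 3), frozenset({"A", "B"}), (1, 1)),
]
rules = find_rules(problem_set)
answer = solve_item((2, 5), frozenset({"B"}), [(3, 5), (2, 3), (2, 7), (3, 3)], rules)
```

In this toy problem set, button "A" appears alone in a row and can be derived on its own, whereas "B" only becomes derivable once "A" is known. This is the kind of dependency between action buttons that the framework treats as a source of cognitive complexity.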
Carpenter, P. A., Just, M. A., & Shell, P. (1990). What one intelligence test measures: A theoretical account of the processing in the Raven Progressive Matrices Test. Psychological Review, 97(3), 404–431. http://doi.org/10.1037/0033-295X.97.3.404
De Boeck, P., & Wilson, M. (2004). Explanatory item response models: A generalized linear and nonlinear approach. New York: Springer.
Embretson, S. E. (1998). A cognitive design system approach to generating valid tests: Application to abstract reasoning. Psychological Methods, 3(3), 380–396. http://doi.org/10.1037/1082-989X.3.3.380
Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data (Rev. ed.). Cambridge: The MIT Press.
Gierl, M. J., & Haladyna, T. M. (2013). Automatic item generation: An introduction. In M. J. Gierl & T. M. Haladyna (Eds.), Automatic item generation: Theory and practice. New York: Routledge.
Hart, S. G., & Staveland, L. E. (1988). Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In P. A. Hancock & N. Meshkati (Eds.), Human Mental Workload. Amsterdam: North Holland Press.
Hunt, E., Frost, N., & Lunneborg, C. (1973). Individual Differences in Cognition: A New Approach to Intelligence. Psychology of Learning and Motivation - Advances in Research and Theory, 7(C), 87–122. http://doi.org/10.1016/S0079-7421(08)60066-3
Leighton, J. P. (2004). The Assessment of Logical Reasoning. In J. P. Leighton & R. J. Sternberg (Eds.), The Nature of Reasoning (pp. 291–312). Cambridge: Cambridge University Press.
Mislevy, R. J., Almond, R. G., & Lukas, J. F. (2003, July). A Brief Introduction to Evidence-centered Design.
Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.
R Core Team. (2014). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from http://www.r-project.org/
Raven, J. (2000). Psychometrics, cognitive ability, and occupational performance. Review of Psychology, 7(1-2), 51–74. Retrieved from http://mjesec.ffzg.hr/revija.psi/vol 07 no 1-2 2000/Raven_2000-7-1-2.pdf
Stan Development Team. (2015). Stan Modeling Language Users Guide and Reference Manual, Version 2.8.0. Retrieved from http://mc-stan.org/
Stanovich, K. E., Sá, W. C., & West, R. F. (2004). Individual Differences in Reasoning. In J. P. Leighton & R. J. Sternberg (Eds.), The Nature of Reasoning (pp. 375–409). Cambridge: Cambridge University Press.
Wüstenberg, S., Greiff, S., & Funke, J. (2012). Complex problem solving - More than reasoning? Intelligence, 40, 1–14. http://doi.org/10.1016/j.intell.2011.11.003