Can domain-specific problem-solving competence be validly measured using computer-based assessment (CBA)?
The core question of our study is whether occupation-specific competence, defined as on-the-job problem-solving ability (domain-specific problem-solving competence) within technical systems, can be measured using a computer simulation.
Numerous studies have addressed the evidence for convergent validity of computer simulation-based test scores in relation to on-the-job performance or other measures or constructs (Mullen, Charlton, Devlin, & Bédard, 2011, pp. 13-3–13-6; Riley, 2008). A closer look at these studies reveals heterogeneous results. Zendejas, Brydges, Wang, and Cook (2013) and Cook, Brydges, Zendejas, Hamstra, and Hatala (2013) argue that the “methodological and reporting quality of assessment studies leaves much room for improvement”. For the field of surgery, Aucar, Groch, Troxel, and Eubanks (2005) point out that the number of subjects varies strongly between studies and that only few data have been “obtained so far that examine the validity of simulators” (Aucar et al., 2005, p. 88).
For the occupation of electronics technician for automation technology (ETAT), our literature search found no study to date that addresses the evidence for convergent validity of computer-simulation-based test scores in relation to on-the-job performance.
In accordance with Kane's (2013) argument-based approach to validity, we developed items and an instrument to measure domain-specific problem-solving competence.
Problem-solving competence was measured with a set of realistic problems focusing on domain-specific activities such as troubleshooting a programmable logic controller (PLC) (Kallies, Hägele, & Zinke, 2014). Item development was based on the structure of the control program, following Benda (2008, p. 145). Accordingly, items concerning the operating mode, the step chain and the output routine were developed. In total, eight troubleshooting scenarios were generated.
As a reference for measuring on-the-job problem-solving competence (troubleshooting), we used a real automation system similar to the one used in the final exam for electronics technicians for automation technology. The system consists of real industrial components: a touch panel to control the system, a PLC, and sensors and actuators.
According to Funke and Reuschenbach (2011), a computer simulation has to fulfil two requirements: display realism and response-system realism. Following this approach, our computer simulation replicates the real automation system exactly. Furthermore, the apprentices can interact with the system (PLC, actuators, sensors) and the control program in real time (Walker et al., 2016).
If domain-specific problem-solving competence can be measured using computer-based simulations, then the results of the competence assessment on the job and in the virtual environment of a computer-based scenario should be comparable. We assume that there are no differences between the two types of assessment. This point is important for finding a feasible testing regimen (that is, one that is highly valid with respect to the job environment, highly reliable and objective, and cost-effective) for assessing competences in a large-scale format suitable for testing a large number of apprentices. Beyond that, the finding could inform the construction of high-quality learning environments and the improvement of school-based or job-based assessment procedures. In a large-scale assessment, on-the-job observation methods cannot be used (e.g. because of problems of representativeness for the individual status of occupational competency and problems of inter-rater reliability), so other solutions have to be considered and empirically validated: virtual assessment environments, specifically computer-based approaches.
In our study, two samples were generated by random assignment. The first sample solved problems one to four in the computer simulation and problems five to eight on the real automation system. The second sample solved them the other way round: problems one to four on the real system and problems five to eight in the simulation. The total sample comprises 308 apprentices in the 3rd and 4th year of the German dual apprenticeship track for ETAT, 154 in each sample.
To analyze the data, we used structural equation modeling (SEM). First, we applied confirmatory factor analyses (CFA) to each group and examined whether troubleshooting on the real and on the computer-simulated automation system measures the same dimension (i.e. domain-specific problem-solving competence). Second, we used a multi-group CFA invariance analysis (weak and strong invariance) to determine whether the psychometric properties (difficulty and discrimination) of the indicators (problems) differ by test mode (real vs. computer simulation). Because the items are ordinal, the WLSMV estimator was chosen (Muthén & Muthén, 1998-2010). To test invariance, we used the delta approach (Muthén & Asparouhov, 2002). Based on Satorra-Bentler-adjusted χ² values, the DIFFTEST option in Mplus was applied to compare the different multi-group invariance models. To evaluate the models, we used several criteria (ΔWRMR, ΔCFI and ΔRMSEA) (Byrne & Stewart, 2006; Rutkowski & Svetina, 2013).
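The invariance steps can be made concrete with the usual factor model for ordinal indicators. The notation below is an assumption introduced for illustration and does not appear in the original: i indexes the eight problems, g the two groups, j the apprentices and c the ordinal score categories; λ denotes the loading (discrimination), τ the thresholds (difficulty), η the latent problem-solving competence and y* a continuous latent response underlying the observed score y.

```latex
% Notation is an assumption for illustration, not taken from the abstract.
% i = problem (1..8), g = group (1, 2), j = apprentice, c = score category
\begin{align}
  y^{*}_{ijg} &= \lambda_{ig}\,\eta_{jg} + \varepsilon_{ijg}, \\
  y_{ijg} &= c \quad \text{if} \quad \tau_{ig,c} < y^{*}_{ijg} \le \tau_{ig,c+1}, \\
  \text{weak invariance:} \quad & \lambda_{i1} = \lambda_{i2}
      \quad \text{for all } i, \\
  \text{strong invariance:} \quad & \lambda_{i1} = \lambda_{i2}, \;
      \tau_{i1,c} = \tau_{i2,c} \quad \text{for all } i, c.
\end{align}
```

Because each group works on a given problem in a different test mode, equality of λ and τ across groups corresponds to equal discrimination and equal difficulty of that problem in the real and the simulated setting.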
The analyses confirmed our hypothesis: the test scores for troubleshooting on the real and on the computer-simulated automation system correlated highly (Groups 1 and 2: r ≥ .85). Strengthening the evidence for convergent validity, all items satisfy the invariance assumptions. This means that they represent problem-solving competence in the same manner in both settings and that they do not differ in difficulty, regardless of whether they were administered on the real system or in the computer simulation.
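As a reminder of what the reported coefficient expresses, the following sketch computes a Pearson correlation between two score vectors. It is an illustration under stated assumptions only: the function name and the toy scores are made up here and are not study data, and the text does not say whether the reported r is a correlation of observed scores or of latent factors, so the simpler observed-score version is shown.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Made-up toy scores for six hypothetical apprentices (NOT study data):
# number of problems solved in the simulation and on the real system.
sim_scores = [1, 2, 2, 3, 4, 4]
real_scores = [1, 1, 2, 3, 3, 4]
r = pearson_r(sim_scores, real_scores)
```

A value of r close to 1 indicates that apprentices are ranked almost identically in both test modes, which is the pattern the convergent-validity argument relies on.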
For large-scale assessments in vocational education and training, the results show that computer-based simulations enable competence assessments of high psychometric quality and testing economy. Additionally, the study implies that on-the-job performance and performance in the computer-simulated assessment draw on very similar cognitive components and processes, which suggests that computer-based environments are suitable for teaching domain-specific competences. Here, we focused on domain-specific problem-solving competence for the occupation of electronics technician for automation technology. It is the task of further research to assess whether this finding can be generalized to other systems and contexts.
Practical implications for teaching can also be drawn: the computer simulation can be used in high-quality learning environments. Furthermore, log files recording behavior patterns during troubleshooting in the computer simulation can be analyzed and used to improve domain-specific problem-solving competence.
Aucar, J. A. et al. (2005). A review of surgical simulation with attention to validation methodology. Surgical Laparoscopy Endoscopy & Percutaneous Techniques, 15(2), 82–89.
Benda, D. (2008). Das große Handbuch Fehlersuche in elektronischen Schaltungen: Lesen und Auswerten von Schaltungsunterlagen, Fehlersuche mit Methode, Messen und Prüfen mit dem Oszilloskop. Poing: Franzis.
Byrne, B. M., & Stewart, S. M. (2006). TEACHER'S CORNER: The MACS Approach to Testing for Multigroup Invariance of a Second-Order Structure: A Walk Through the Process. Structural Equation Modeling: A Multidisciplinary Journal, 13(2), 287–321.
Cook, D. A. et al. (2013). Technology-enhanced simulation to assess health professionals: A systematic review of validity evidence, research methods, and reporting quality. Academic Medicine: Journal of the Association of American Medical Colleges, 88(6), 872–883.
Funke, J., & Reuschenbach. (2011). Einsatz technischer Mittel in der psychologischen Diagnostik. In L. F. Hornke, M. Amelang, & M. Kersting (Eds.), Enzyklopädie der Psychologie, Methodologie und Methoden, Psychologische Diagnostik: Bd. 3. Leistungs-, Intelligenz- und Verhaltensdiagnostik (pp. 595–631). Göttingen [u.a.]: Hogrefe.
Kallies, H., Hägele, T., & Zinke, G. (2014). Betriebsuntersuchungen zur Analyse betrieblicher Tätigkeiten von Mechatronikern und Mechatronikerinnen sowie Elektronikern und Elektronikerinnen für Automatisierungstechnik. Bonn: Bundesinstitut für Berufsbildung (BIBB).
Kane, M. T. (2013). Validating the Interpretations and Uses of Test Scores. Journal of Educational Measurement, 50(1), 1–73.
Mullen, N., Charlton, J., Devlin, A., & Bédard, M. (2011). Simulator Validity: Behaviors Observed on the Simulator and on the Road. In D. L. Fisher, M. Rizzo, J. Caird, & J. D. Lee (Eds.), Handbook of driving simulation for engineering, medicine, and psychology (pp. 13-1–13-18). CRC Press.
Muthén, B. O., & Asparouhov, T. (2002). Latent Variable Analysis With Categorical Outcomes: Multiple-Group And Growth Modeling In Mplus. Mplus Web Notes: No. 4, Version 5, December 9, 2002.
Muthén, L. K., & Muthén, B. O. (1998-2010). Mplus user's guide.
Riley, R. H. (Ed.). (2008). Manual of Simulation in Healthcare. Oxford Univ. Press.
Rutkowski, L., & Svetina, D. (2013). Assessing the Hypothesis of Measurement Invariance in the Context of Large-Scale International Surveys. Educational and Psychological Measurement, 74(1), 31–57.
Walker, F. et al. (2016). Berufsfachliche Kompetenzen von Elektronikern für Automatisierungstechnik: Kompetenzdimensionen, Messverfahren und erzielte Leistungen (KOKO EA). In K. Beck, M. Landenberger, & F. Oser (Eds.), Wirtschaft - Beruf - Ethik: Vol. 32. Technologiebasierte Kompetenzmessung in der beruflichen Bildung. Ergebnisse aus der BMBF-Förderinitiative ASCOT (pp. 139–170). Bielefeld: WBV.
Zendejas, B. et al. (2013). Patient outcomes in simulation-based medical education: A systematic review. Journal of General Internal Medicine, 28(8), 1078–1089.