Session Information
09 SES 08 B, Theoretical and Methodological Issues in Testing and Measurement (Part 1)
Paper Session
Contribution
In the Italian system, assessment data on primary, lower middle, and high-school students are collected yearly by the National Institute for the Evaluation of the Educational System (INVALSI). Before these data are collected, INVALSI questionnaires are calibrated using the outcomes of pretesting sessions. These preliminary data are analysed with standard Item Response Theory (IRT) models, such as the Rasch (1961) model.
In this paper, we focus on data collected on middle-school students; these data are becoming increasingly relevant in the Italian educational context, and their collection will become compulsory in the near future. We aim to study whether the assumptions of the IRT models used by INVALSI are met for the “live” data collected by the Institute. In particular, we focus on the assumption of unidimensionality, which characterizes the most widely used IRT models. The data come from a nationally representative sample of 27,592 students in 1,305 schools (one class is sampled in each school) and refer to the students’ performance on the national reading comprehension test administered in June 2009. The methodology is based on a sequence of likelihood ratio tests between pairs of models belonging to a class of multidimensional latent class IRT models studied by Bartolucci (2007); see also Goodman (1974), Lazarsfeld and Henry (1968), Martin-Löf (1973), Verhelst (2001), and Bartolucci and Forcina (2005).
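As a rough illustration of a single step in such a sequence, the sketch below computes a likelihood ratio statistic for two nested models from their maximized log-likelihoods. It assumes that the usual chi-square approximation applies, with degrees of freedom equal to the difference in the number of free parameters. The function and argument names are hypothetical and are not taken from INVALSI or from Bartolucci (2007).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    df = int(df)
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite sum of half-integer powers.
    total = math.erfc(math.sqrt(half))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (i - 0.5) / math.gamma(i + 0.5)
    return total


def likelihood_ratio_test(loglik_restricted, loglik_general,
                          n_par_restricted, n_par_general):
    """Return (deviance, df, p_value) for a restricted model nested in a general one."""
    deviance = 2.0 * (loglik_general - loglik_restricted)
    df = n_par_general - n_par_restricted
    return deviance, df, chi2_sf(deviance, df)
```

In this setting the restricted model would be, for example, a unidimensional model and the general model one in which the items are split into two or more dimensions; a small p-value would then speak against unidimensionality.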
According to the assumption of unidimensionality, the difference between two subjects in responding to a set of items depends on a single latent trait, which corresponds to the ability measured by the items. If unidimensionality does not hold, the conclusions reached on the basis of a unidimensional IRT model may be misleading, and summarizing the test performance of a subject through a single score is no longer sensible. Several authors have dealt with testing unidimensionality in connection with the Rasch model (Glas and Verhelst, 1995; Verhelst, 2001). One of the main contributions is due to Martin-Löf (1973), who developed a likelihood ratio test of the hypothesis that the Rasch model holds for the whole set of items against the hypothesis that it holds separately for two disjoint subsets of items defined in advance. A major problem of this test is that it implicitly assumes that the items discriminate equally well between subjects. The test may therefore lead to wrong conclusions when items have different discriminating power, as often happens. In this contribution, we address these issues through multidimensional IRT models (Bartolucci, 2007) in which (i) a two-parameter logistic parametrization (Birnbaum, 1968) may also be used for the probability of success in responding to an item, given the ability; and (ii) the latent traits are represented through a random vector with a discrete distribution, each support point of which identifies a different latent class in the population of students.
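To make points (i) and (ii) concrete, the following minimal sketch computes the probability of a response pattern under a latent class model with a two-parameter logistic item response function. It assumes that each item measures exactly one latent dimension; all names (`weights`, `thetas`, `dims`, `gammas`, `betas`) are illustrative assumptions rather than notation from the paper.

```python
import math


def success_prob(theta_c, dim_j, gamma_j, beta_j):
    """2PL probability of a correct response to one item for a given latent class.

    theta_c : ability levels of the class, one value per latent dimension
    dim_j   : index of the dimension the item is assumed to measure
    gamma_j : discriminating power of the item (fixed at 1 in a Rasch-type model)
    beta_j  : difficulty of the item
    """
    eta = gamma_j * (theta_c[dim_j] - beta_j)
    return 1.0 / (1.0 + math.exp(-eta))


def pattern_prob(y, weights, thetas, dims, gammas, betas):
    """Marginal probability of a 0/1 response pattern y.

    The discrete latent distribution is given by class weights (summing to 1)
    and one ability vector per class; responses are assumed to be independent
    given the class.
    """
    total = 0.0
    for pi_c, theta_c in zip(weights, thetas):
        cond = 1.0
        for y_j, d_j, g_j, b_j in zip(y, dims, gammas, betas):
            p = success_prob(theta_c, d_j, g_j, b_j)
            cond *= p if y_j == 1 else 1.0 - p
        total += pi_c * cond
    return total
```

Setting every entry of `dims` to the same index gives a unidimensional model, and setting every `gammas` entry to 1 gives a Rasch-type parametrization, so the restricted models used in a comparison are special cases of the same function.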
Method
Expected Outcomes
References
Bartolucci, F., & Forcina, A. (2005). Likelihood inference on the underlying structure of IRT models. Psychometrika, 70, 31–43.
Bartolucci, F. (2007). A class of multidimensional IRT models for testing unidimensionality and clustering items. Psychometrika, 72, 141–157.
Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F.M. Lord & M.R. Novick (Eds.), Statistical theories of mental test scores (pp. 395–479). Reading, MA: Addison-Wesley.
Goodman, L.A. (1974). Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika, 61, 215–231.
Lazarsfeld, P.F., & Henry, N.W. (1968). Latent structure analysis. Boston: Houghton Mifflin.
Martin-Löf, P. (1973). Statistiska modeller. Anteckningar från seminarier läsåret 1969–1970, utarbetade av Rolf Sundberg. Obetydligt ändrat nytryck, October 1973. Stockholm: Institutet för Försäkringsmatematik och Matematisk Statistik vid Stockholms Universitet.
Rasch, G. (1961). On general laws and the meaning of measurement in psychology. Proceedings of the IV Berkeley Symposium on Mathematical Statistics and Probability, 4, 321–333.
Verhelst, N.D. (2001). Testing the unidimensionality assumption of the Rasch model. Methods of Psychological Research Online, 6, 231–271.