Does Reading Questionnaires Aloud Improve the Validity of Student Responses?

Author(s):

Rolf Strietholt (presenting / submitting), Cornelia Gresch

Conference:

ECER 2013

Network:

09. Assessment, Evaluation, Testing and Measurement

Format:

Paper

Session Information

09 SES 07 B, Issues in Test Construction and Validation

Paper Session

Time:

2013-09-11

17:15-18:45

Room:

D-310

Chair:

Andres Sandoval-Hernandez

Contribution

A major finding from recent large-scale assessments of student achievement is that, all around the world, a substantial proportion of students can barely read. This holds not only at the end of primary school but also at the end of secondary school (Mullis, Martin, Kennedy, & Foy, 2007; OECD, 2010). Against this background, it is a concern that in large-scale assessments students usually complete the background questionnaires individually and without assistance. The present study therefore investigates whether the quality of scales can be improved by reading questionnaires aloud to the students.

The importance of language has been studied in the context of both student assessment and survey research. Although survey experts repeatedly emphasize the importance of question wording and recommend using simple syntax and avoiding unfamiliar terms (e.g. Bradburn, Sudman, & Wansink, 2004), many questionnaires are quite demanding. The field of test development faces similar challenges. Here, test developers have suggested making linguistically demanding tests more accessible by reading them aloud. Experimental studies suggest that poor readers in particular can benefit from this administration mode (e.g. Meloy, Deville, & Frisbie, 2002; Randall & Engelhard, 2010; Wolf, Kim, Kao, & Rivera, 2009). In other words, students’ test responses were not affected by linguistic complexity when the tests were read to them, and the factorial structure became one-dimensional. Consequently, the test results are more valid measures of the construct being assessed. For background questionnaires, however, there is no research on how reading them aloud to students affects data quality.

The main aim of the current study is to estimate the effect of reading background questionnaires aloud on the data quality of scales. We focus on the factorial validity of the responses to a complex scale. To this end, we conducted an experiment in which we compare the responses of students to whom the questionnaire is read aloud (treatment group) with the responses of students who read it on their own (control group).

The study is part of the National Educational Panel Study (NEPS) in Germany (Blossfeld, Rossbach, & von Maurice, 2011). The project analyzes educational processes from early childhood to late adulthood. Its main purpose is to collect longitudinal data on educational processes, decisions, competences, and returns to education throughout the life course. For our study, we invited about 50 schools from the lower secondary track in four German states to participate. In most German states, tracking starts after grade 4, and students from this track generally perform lowest in large-scale tests (Naumann, Artelt, Schneider, & Stanat, 2010). The four states represent different geographical parts of Germany and include both rural and urban areas. About half of the schools agreed to participate, yielding a total of 664 grade 5 students (323 girls, 341 boys, M_age=12.3 years, SD_age=0.9 years) from 27 schools.

Method

Within each school, we assign participants at random to a treatment group (TG) or a control group (CG). The test administrators in the CG then instruct the students to complete the questionnaire individually, whereas in the TG the administrator reads the questionnaire aloud using a standardized script. We analyze students’ responses to Rosenberg’s Global Self-Esteem (RGSE) scale. The scale comprises five positively and five negatively worded items. A number of studies have examined the factorial validity of the RGSE scale, and researchers have found that model fit improves after introducing a method factor for the negatively worded items (e.g. DiStefano & Motl, 2006; Marsh, 1996). This finding indicates that some students misunderstand the negatively worded items. If reading aloud makes the negatively worded questions more accessible, then the factor loadings on the general self-esteem (GSE) factor will be larger in the TG and the factor loadings on the method factor will be larger in the CG. To test this hypothesis, we use multi-group CFA and model a GSE factor along with a second, orthogonal method factor for the negatively worded items. We test for group differences by comparing a model in which the factor loadings are constrained to be equal across both groups with a model in which they are freely estimated.
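
The measurement model described above can be written out as follows. This is only a sketch: the symbols are our own notation, introduced here for illustration, and the item intercept is a standard CFA component that the abstract does not mention explicitly.

```latex
% Response of student i to item j in group g, where g is TG or CG.
% d_j is an indicator: d_j = 1 if item j is negatively worded, d_j = 0 otherwise.
\[
x_{ij}^{(g)} = \tau_j^{(g)} + \lambda_j^{(g)}\,\eta_i^{(g)}
             + d_j\,\gamma_j^{(g)}\,\xi_i^{(g)} + \varepsilon_{ij}^{(g)},
\qquad
\operatorname{Cov}\bigl(\eta_i^{(g)},\,\xi_i^{(g)}\bigr) = 0
\]

% Constrained model: all loadings are equal across the two groups.
\[
\lambda_j^{(\mathrm{TG})} = \lambda_j^{(\mathrm{CG})},
\qquad
\gamma_j^{(\mathrm{TG})} = \gamma_j^{(\mathrm{CG})}
\qquad \text{for all items } j
\]
```

Here \eta is the GSE factor, \xi is the method factor, \lambda_j and \gamma_j are the corresponding loadings, \tau_j is the item intercept, and \varepsilon is the residual. In this notation, the hypothesis stated above is that, in the model without equality constraints, the loadings \lambda_j are larger in the TG and the loadings \gamma_j are larger in the CG.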

Expected Outcomes

A first tentative result of the study is that, in the less constrained model, the factor loadings on the GSE factor tend to be higher in the TG (reading aloud), which indicates that reading items aloud produces more valid responses. This is most obvious for the negatively worded items. In other words, students struggle more when the questionnaires are self-administered, and reading aloud can improve the validity of the responses. The findings raise further questions for studies in school contexts. What are the consequences of changing the administration mode within a panel study (e.g. reading aloud in lower grades and switching to self-administration in higher grades)? Another issue concerns the comparability of data obtained with different administration modes. Our study suggests that the administration mode should not be neglected when interpreting the results from different studies.

References

Blossfeld, H.-P., Rossbach, H. G., & von Maurice, J. (2011). Education as a Lifelong Process: The German National Educational Panel Study (NEPS). Zeitschrift für Erziehungswissenschaft (Sonderheft). 14. Wiesbaden: VS Verlag für Sozialwissenschaften.

Crawford, L., & Tindal, G. (2004). Effects of a Read-Aloud Modification on a Standardized Reading Test. Exceptionality, 12(2), 89-106.

DiStefano, C., & Motl, R. W. (2006). Further investigating method effects associated with negatively worded items on self-report surveys. Structural Equation Modeling, 13(3), 440-464.

Marsh, H. W. (1996). Positive and negative global self-esteem: A substantively meaningful distinction or artifactors? Journal of Personality and Social Psychology, 70(4), 810-819.

Meloy, L. L., Deville, C., & Frisbie, D. A. (2002). The Effect of a Read Aloud Accommodation on Test Scores of Students With and Without a Learning Disability in Reading. Remedial and Special Education, 23(4), 248-255.

Naumann, J., Artelt, C., Schneider, W., & Stanat, P. (2010). Lesekompetenz von PISA 2000 bis PISA 2009 [Reading Literacy from PISA 2000 to PISA 2009]. In E. Klieme, C. Artelt, J. Hartig, N. Jude, O. Köller, M. Prenzel, W. Schneider & P. Stanat (Eds.), PISA 2009. Bilanz nach einem Jahrzehnt [PISA 2009. Review after one decade] (pp. 23-71). Münster: Waxmann.

Randall, J., & Engelhard, G. (2010). Performance of Students With and Without Disabilities Under Modified Conditions. Using Resource Guides and Read-Aloud Test Modifications on a High-Stakes Reading Test. The Journal of Special Education, 44(2), 79-93.

Wolf, M. K., Kim, J., Kao, J. C., & Rivera, N. M. (2009). Examining the effectiveness and validity of glossary and read-aloud accommodations for English language learners in a math assessment (CRESST Report 766). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST).

Author Information

Rolf Strietholt (presenting / submitting)

TU Dortmund

Institute for School Development Research

Münster

Cornelia Gresch

WZB, Germany
