Twenty Years of Research on Test Reliability in China in the New Century

陈虹熹, 温忠麟, 叶宝娟, 蔡保贞, 方杰

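Summary: With the application of confirmatory factor analysis models, reliability research has entered a new stage of development. In the first 20 years of the new century, research on test reliability in China followed three main lines of development. The first is the development of reliability based on confirmatory factor models, including the homogeneity coefficient, composite reliability, and maximum reliability. The second is the extension of data types, including the reliability of two-level and longitudinal data. The third is the extension of the uses of reliability, such as rater reliability and coder reliability. For an ordinary test (one in which the measurement errors of the items are uncorrelated), if the α coefficient is high enough, the reliability is high enough; otherwise, use composite reliability. If the composite reliability of every variable in a statistical model is very high (above 0.95), modeling with manifest variables gives results that differ little from modeling with latent variables; otherwise, latent variable modeling is preferable.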

Subject classification: Education science; scientific research

Keywords: reliability; α coefficient; homogeneity coefficient; composite reliability; interval estimation

Recommended citation: 陈虹熹, 温忠麟, 叶宝娟, 蔡保贞, 方杰. Twenty Years of Research on Test Reliability in China in the New Century [EB/OL]. (2023-03-28)[2025-10-18]. https://chinaxiv.org/abs/202303.09602.

Abstract: With the application of confirmatory factor analysis, research on reliability has entered a new stage. In the first two decades of the 21st century, studies on test reliability (including point estimation and interval estimation) in mainland China show three main lines of development. The first line is the shift from research centered on the α coefficient to reliability research based on confirmatory factor models, including the homogeneity coefficient, composite reliability, maximum reliability, single-indicator reliability, and the reliability of whole-item-set scores. Studies have shown that the α coefficient is still useful. In most cases, the α coefficient is a lower bound of the reliability of the composite score (total or average score), so as long as the α coefficient is high enough, the test reliability is at least as high. However, the α coefficient cannot be used to measure the homogeneity or the internal consistency of a test. The homogeneity coefficient based on the bi-factor model can be used to measure the homogeneity of a multidimensional scale, and composite reliability can be used to measure internal consistency (if consistency is understood as consistency within each dimension). Furthermore, the Delta method can be used to estimate confidence intervals for the various reliability coefficients. The second line is the expansion of the types of data collected by scales (or questionnaires), from single-level data to multi-level and longitudinal data. Whether the scale is unidimensional or multidimensional, a multi-level confirmatory factor model is recommended for calculating the reliability of multi-level data. For longitudinal data, the test reliability developed on the basis of the linear mixed model is recommended; longitudinal data can also be treated as a special case of two-level data for reliability analysis.
The third line is the extended use of reliability, involving rater reliability, coder reliability, attribute-level classification consistency in cognitive diagnostic assessment, and the reliability of difference scores. In addition, research on reliability generalization and reliability meta-analysis has appeared. For a common test whose item errors can reasonably be assumed uncorrelated, the following procedure of reliability analysis is recommended. When the α coefficient is high enough, report the α coefficient; otherwise, calculate the composite reliability on the basis of the factor model. If the composite reliability is high enough, report the composite reliability; otherwise, the test reliability is considered unacceptable. If the composite reliability of every variable in a statistical model is very high (over 0.95), modeling with composite scores does not differ much from modeling with latent variables. Otherwise, it is better to use latent variable modeling.
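
The recommended procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a unidimensional test under a one-factor model with factor variance 1 and uncorrelated errors. The function names, the loadings, and the 0.70 cutoff for "high enough" are hypothetical choices made for illustration only, since the abstract does not state a threshold.

```python
import numpy as np


def cronbach_alpha(cov: np.ndarray) -> float:
    """Coefficient alpha of the total score, from a k x k item covariance matrix."""
    k = cov.shape[0]
    return k / (k - 1) * (1.0 - np.trace(cov) / cov.sum())


def composite_reliability(loadings: np.ndarray, error_vars: np.ndarray) -> float:
    """Composite reliability of the total score under a one-factor model
    (assumed: factor variance 1, uncorrelated measurement errors)."""
    true_var = loadings.sum() ** 2
    return true_var / (true_var + error_vars.sum())


def reliability_to_report(alpha: float, composite: float, threshold: float = 0.70):
    """Decision procedure from the abstract; the threshold is a hypothetical cutoff."""
    if alpha >= threshold:
        return ("alpha", alpha)
    if composite >= threshold:
        return ("composite reliability", composite)
    return ("unacceptable", composite)


# Hypothetical standardized loadings for a four-item test (illustrative values).
loadings = np.array([0.9, 0.7, 0.5, 0.3])
error_vars = 1.0 - loadings ** 2

# Model-implied item covariance matrix: lambda * lambda' + diag(theta).
cov = np.outer(loadings, loadings) + np.diag(error_vars)

alpha = cronbach_alpha(cov)                       # about 0.677
cr = composite_reliability(loadings, error_vars)  # about 0.709
decision = reliability_to_report(alpha, cr)

print(f"alpha = {alpha:.3f}, composite reliability = {cr:.3f}, report: {decision[0]}")
```

With these unequal loadings, α falls below the composite reliability, consistent with its role as a lower bound, so the sketch falls through to reporting composite reliability. For raw item scores, the covariance matrix could be obtained with `np.cov(scores, rowvar=False)`; the loadings and error variances would come from a fitted factor model, which this sketch does not estimate.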
