Statistical Modelling 9 (2009), 99–118

On the estimation of the misclassification table for finite count data with an application in caries research

Emmanuel Lesaffre
Department of Biostatistics,
Erasmus Medical Centre,
Rotterdam
The Netherlands
and
L-Biostat,
Catholic University Leuven,
Leuven
Belgium
Email: emmanuel.lesaffre@med.kuleuven.be

Helmuth Küchenhoff
Department of Statistics,
Ludwig-Maximilians-Universität,
München
Germany

Samuel M Mwalili
Statistics and Actuarial Science,
Jomo Kenyatta University of Agriculture and Technology
Kenya

Dominique Declerck
School of Dentistry,
Catholic University of Leuven
Belgium

Abstract:

We consider the correction for misclassification of possibly corrupted finite count data in epidemiological studies. In general, the misclassification probabilities are estimated from a validation study and then used to correct for the distortion. However, the validation study is most often quite small, so that the misclassification probabilities, if based on the multinomial distribution, are either impossible to calculate or estimated with high variability. To increase efficiency, we propose the double binomial (DB) approach, which is based on the fact that, to determine a count, the examiner needs to evaluate all items that make up that count. We suggest various extensions of the DB approach which might better mimic the scoring behaviour of the examiner relative to a gold standard. We evaluate the performance of our approaches in estimating the misclassification probabilities, in comparison with the multinomial approach, both analytically and in a simulation study. Finally, the practical use of our methods is illustrated with an oral health survey examining caries experience in 7-year-old Flemish children involving 16 dental examiners.

Keywords:

Count data; logistic regression; misclassification; prevalence; response error
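
Illustrative sketch:

The abstract does not spell out the DB model, so the following is only a minimal sketch under an assumed reading: each of the items behind a count is scored independently, a truly affected item is scored as affected with probability `sensitivity`, and a truly sound item is scored as sound with probability `specificity`. The observed count is then the sum of two independent binomial variables (true positives plus false positives). The function names and the choice of a single common sensitivity and specificity are hypothetical and serve illustration only.

```python
from math import comb


def binom_pmf(n: int, k: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p); returns 0 outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def db_misclassification_table(n_items: int, sensitivity: float, specificity: float):
    """Assumed DB model: entry [t][o] is P(observed count = o | true count = t).

    Hypothetical helper, not taken from the paper.
    """
    table = []
    for t in range(n_items + 1):
        row = []
        for o in range(n_items + 1):
            # a = true positives among the t affected items; the remaining
            # o - a scores must be false positives among the n_items - t sound items.
            prob = sum(
                binom_pmf(t, a, sensitivity)
                * binom_pmf(n_items - t, o - a, 1.0 - specificity)
                for a in range(0, min(t, o) + 1)
            )
            row.append(prob)
        table.append(row)
    return table
```

Under this assumption the whole table is driven by two parameters, whereas an unrestricted multinomial table for counts 0 to J has J(J + 1) free probabilities. That is one way to see why a small validation study might support the DB table but not the multinomial one.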