Efficient regression analysis with ranked-set sampling

Chen, Zehua & Wang, You-Gan (2004) Efficient regression analysis with ranked-set sampling. Biometrics, 60(4), pp. 997-1004.

This article is motivated by a lung cancer study in which a regression model is involved and the response variable is too expensive to measure, whereas the predictor variable can be measured easily at relatively negligible cost. This situation occurs quite often in medical studies, quantitative genetics, and ecological and environmental studies. Using the idea of ranked-set sampling (RSS), we develop sampling strategies that can reduce the cost and increase the efficiency of the regression analysis in this situation. The developed method is applied retrospectively to a lung cancer study, where the aim is to investigate the association between smoking status and three biomarkers: polyphenol DNA adducts, micronuclei, and sister chromatid exchanges. Optimal sampling schemes under different optimality criteria, namely A-, D-, and integrated mean square error (IMSE)-optimality, are considered in the application. With set size 10 in RSS, the improvement of the optimal schemes over simple random sampling (SRS) is substantial. For instance, under the IMSE-optimal scheme, the IMSEs of the estimated regression functions for the three biomarkers are reduced to about half of those incurred by using SRS.
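
To make the sampling idea concrete, the following is a minimal sketch of balanced ranked-set sampling in which units are ranked on the cheap predictor and only the selected units would have the expensive response measured. It is an illustration under stated assumptions, not the authors' procedure: the function name `ranked_set_sample`, the `(x, unit_id)` data layout, and the balanced allocation (one unit per rank per cycle) are all hypothetical choices, and the sketch does not implement the A-, D-, or IMSE-optimal schemes mentioned in the abstract. For simplicity, sets are drawn independently, so a unit may appear in more than one set.

```python
import random


def ranked_set_sample(population, set_size, cycles, rng):
    """Balanced ranked-set sample, ranking on the cheap predictor x.

    population: list of (x, unit_id) pairs, where x is cheap to observe.
    Returns the (x, unit_id) pairs chosen for the expensive response
    measurement: set_size * cycles units in total.
    """
    selected = []
    for _ in range(cycles):
        for rank in range(set_size):
            # Draw a fresh set of `set_size` units and order them by x
            # (tuples sort on x first, then on unit_id to break ties).
            candidates = sorted(rng.sample(population, set_size))
            # Keep only the unit holding the current rank; the other
            # units in the set are discarded without measuring y.
            selected.append(candidates[rank])
    return selected


# Hypothetical usage: 1000 units with a standard-normal predictor,
# set size 10 as in the abstract, and 2 cycles (20 measured units).
rng = random.Random(0)
population = [(rng.gauss(0.0, 1.0), i) for i in range(1000)]
sample = ranked_set_sample(population, set_size=10, cycles=2, rng=rng)
```

Each measured unit costs `set_size` cheap predictor observations but only one expensive response measurement, which is the trade-off the abstract describes.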

Impact and interest:

9 citations in Scopus
7 citations in Web of Science®

ID Code: 90573
Item Type: Journal Article
Refereed: Yes
Keywords: lung cancer study, optimal design, ranked-set sampling, regression, sampling efficiency, lung-cancer, lymphocytes, frequency, cost
DOI: 10.1111/j.0006-341X.2004.00255.x
ISSN: 0006-341X
Divisions: Current > QUT Faculties and Divisions > Science & Engineering Faculty
Deposited On: 23 Nov 2015 22:48
Last Modified: 23 Nov 2015 22:48
