The effect of preferential sampling on sampling variance

Clifford, D., Kuhnert, P., Dobbie, M., Baldock, J., Harch, B., McKenzie, N.J., Wheeler, I., & McBratney, A.B. (2012) The effect of preferential sampling on sampling variance. In Minasny, B., Malone, B.P., & McBratney, A.B. (Eds.) Digital Soil Assessments and Beyond: Proceedings of the 5th Global Workshop on Digital Soil Mapping 2012, Sydney, Australia. CRC Press.


We examine variations of standard probability designs that preferentially sample sites according to how easy they are to access. Preferential sampling designs deliver unbiased estimates of the mean and of the sampling variance and ease the burden of data collection, but at what cost to design efficiency? Depending on the application, preferential sampling can either increase or decrease sampling variance. We carry out a simulation study to gauge its effect when sampling Soil Organic Carbon (SOC) values in a large agricultural region in south-eastern Australia, where preferential sampling can reduce travel distance by up to 16%. Our study is based on a dataset of predicted SOC values produced from a data-mining exercise. We consider three designs and two ways of determining ease of access. The overall conclusion is that sampling performance deteriorates as the strength of preferential sampling increases, because regions of high SOC are harder to access, so our designs inadvertently target regions of low SOC. The good news, however, is that Generalised Random Tessellation Stratified (GRTS) sampling designs are less badly affected than the others, and GRTS remains an efficient design compared with its competitors.
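
The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation: it assumes a simpler with-replacement design with the Hansen-Hurwitz estimator rather than the three designs studied, and the toy `soc` and `access` values, the `strength` parameter, and all function names are hypothetical. It shows that the estimator stays unbiased for any positive selection probabilities, while its exact variance grows when easy access coincides with low SOC.

```python
import random


def selection_probs(access, strength):
    """Per-draw selection probabilities proportional to access**strength.

    strength = 0 gives equal-probability sampling; larger values favour
    easy-to-access sites more strongly (hypothetical weighting scheme).
    """
    weights = [a ** strength for a in access]
    total = sum(weights)
    return [w / total for w in weights]


def draw(p, n, rng):
    """Draw n site indices with replacement using probabilities p."""
    return rng.choices(range(len(p)), weights=p, k=n)


def hh_mean(sample_idx, y, p):
    """Hansen-Hurwitz estimate of the population mean from the drawn sites."""
    return sum(y[i] / p[i] for i in sample_idx) / (len(sample_idx) * len(y))


def hh_variance(y, p, n):
    """Exact sampling variance of hh_mean for n independent draws."""
    total = sum(y)
    spread = sum(pi * (yi / pi - total) ** 2 for yi, pi in zip(y, p))
    return spread / (n * len(y) ** 2)


# Toy population (assumed values): sites with high SOC are harder to access.
soc = [1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
access = [1.0, 0.9, 0.8, 0.6, 0.4, 0.2]

rng = random.Random(0)
for strength in (0.0, 0.5, 1.0):
    p = selection_probs(access, strength)
    sample = draw(p, 3, rng)
    print(strength, hh_mean(sample, soc, p), hh_variance(soc, p, 3))
```

In this toy setting the variance at `strength = 1.0` exceeds the equal-probability variance at `strength = 0.0`, which mirrors the direction of the reported result. Reversing the `access` list, so that high-SOC sites are the easy ones, is one way to see the opposite effect.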

ID Code: 72747
Item Type: Book Chapter
Keywords: Australia, Data collection, Data sets, Design efficiency, Efficient designs, Probability design, Sampling design, Simulation studies, Soil organic carbon, Unbiased estimates, Soils, Design
ISBN: 9780415621557
Divisions: Current > QUT Faculties and Divisions > Science & Engineering Faculty
Deposited On: 12 Jun 2014 00:32
Last Modified: 27 Oct 2015 16:00
