DHS Survey Design:
Frequently Asked Questions
Most DHS surveys are representative* at the national level, for urban
and rural areas, and for the first administrative level subdivisions,
which are usually called regions, zones, provinces, governorates, or
states. There is growing interest in providing DHS data for even
lower administrative levels, such as counties and districts.
*In this document, “representative at a specific domain level” means
that most of the survey results/indicators can be produced at
the level of that domain with good precision.
What factors determine sample size?
Sample sizes for DHS surveys are based on the number of survey
domains (usually subnational units such as regions), the precision
requirements for priority indicators, and the budget. Generally, The
DHS Program samplers design surveys with fertility and childhood
mortality estimates in mind.
To calculate fertility with an adequate level of precision, for
example, samplers include between 800 women per subnational
region in countries with a high total fertility rate (TFR) and
1,000 women per region in countries with a low TFR. The
subnational region is usually administrative level 1 (admin 1). In
other words, a high-TFR country with 10 regions needs to interview at least
8,000 women in order to have estimates of fertility and child
mortality with reasonably small standard errors for each of the
10 regions. In countries with more than 10 regions, one option
to keep the sample size down and reduce costs is to group some
regions to form study domains. A total sample size of about 10,000
women is ideal to maintain cost efficiency and high data quality.
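The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the sketch treats the 800 and 1,000 figures quoted above as fixed per-region minimums, ignoring intermediate TFR levels, other precision requirements, and budget.

```python
# Per-region minimums quoted in the text; everything else here is illustrative.
WOMEN_PER_REGION = {"high": 800, "low": 1000}  # keyed by TFR level


def minimum_women(n_domains: int, tfr_level: str) -> int:
    """Minimum number of women to interview across all study domains."""
    if n_domains < 1:
        raise ValueError("n_domains must be at least 1")
    return n_domains * WOMEN_PER_REGION[tfr_level]


def domains_within_target(target_women: int, tfr_level: str) -> int:
    """Largest number of study domains that fits within a target total sample."""
    return target_women // WOMEN_PER_REGION[tfr_level]


# A high-TFR country with 10 regions, as in the example above.
print(minimum_women(10, "high"))

# How many domains fit within a total of about 10,000 women?
print(domains_within_target(10_000, "high"))
print(domains_within_target(10_000, "low"))
```

Under these assumptions, a country with more regions than `domains_within_target` returns would need to group regions into study domains to stay near the 10,000 total.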
How precise are these estimates?
All survey sampling strategies are subject to sampling error. The
DHS Program designs samples to provide national and subnational
estimates with a reasonable relative standard error. The larger the
sample size, the smaller the relative standard error on any given
indicator will be. The standard errors at the admin 1 level are
larger than those at the national level; standard errors at the
admin 2 level are larger than those at the admin 1 level. For this
reason, interpretation of subnational trends and comparison of
subnational units should be undertaken with caution unless the total
sample is very large. Appropriate significance testing is necessary to
confirm changes over time or true differences between subnational areas.
DHS Survey Design: Sample Size
Considerations
Validity:
Increasing sample size is a valid practice if funding and human
resources are sufficient for a larger survey. All sampling strategies
are subject to sampling error; the larger the sample, the smaller the
relative standard error will be.
Impact on cost:
Sample size is the single largest driver of survey cost, as it affects
all elements of the survey process, from hiring and training of staff
to processing of data, report writing, and dissemination.
Impact on quality:
Large sample sizes can overburden the implementing agency and
survey management staff and lead to poorer data quality because of
the challenges in data collection and overall survey management.
A large survey requires additional coordination and leadership
and should be undertaken by an experienced implementing agency
with robust data quality checks in place.
The Demographic and Health Surveys Project is implemented by ICF and funded by the United States Agency for International
Development (USAID). www.dhsprogram.com
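The link between sample size and relative standard error, and the kind of significance test mentioned under "How precise are these estimates?", can be illustrated with a short Python sketch. It makes several assumptions that are not from this document: the indicator is a proportion, its standard error follows the textbook formula inflated by a design effect (`deff`), the two estimates being compared are independent, and all numbers are made up. In practice, standard errors would come from the survey's own complex-design variance estimation.

```python
import math


def relative_standard_error(p: float, n: int, deff: float = 1.0) -> float:
    """Relative standard error of an estimated proportion p from n observations.

    deff is the design effect (1.0 corresponds to simple random sampling).
    """
    se = math.sqrt(deff * p * (1.0 - p) / n)
    return se / p


def z_statistic(est_a: float, se_a: float, est_b: float, se_b: float) -> float:
    """z statistic for the difference between two independent estimates."""
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


def differs_at_5_percent(est_a: float, se_a: float, est_b: float, se_b: float) -> bool:
    """Two-sided test at the 5% level using the normal critical value 1.96."""
    return abs(z_statistic(est_a, se_a, est_b, se_b)) > 1.96


# Hypothetical indicator with a true value of 30% and a design effect of 1.5.
rse_national = relative_standard_error(0.30, 8000, deff=1.5)
rse_region = relative_standard_error(0.30, 800, deff=1.5)
print(f"national RSE: {rse_national:.3f}, regional RSE: {rse_region:.3f}")

# Two hypothetical regional estimates, each with a standard error of 0.02.
print(differs_at_5_percent(0.30, 0.02, 0.34, 0.02))
print(differs_at_5_percent(0.30, 0.02, 0.38, 0.02))
```

Because the standard error shrinks with the square root of the sample size, a region with one tenth of the national sample has a relative standard error about 3.2 times larger, which is why an apparent gap between two regions may not be statistically significant.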
What options exist to provide data at lower administrative levels?
Full and shortened questionnaires: Depending on the priority indicators for the survey, a compromise
can sometimes be reached wherein many indicators are available at the second administrative level, while
others are available only for admin 1. For example, in the 2014 Kenya DHS the total sample size was about 40,000 households,
which is very large for a DHS. Many indicators were available for all 47 counties, while those indicators that
require larger sample sizes (TFR, childhood mortality rates) were only available for each of the 8 regions.
Though the total sample size was very large, only half of the households received a full questionnaire, while
the other half received a shortened questionnaire.
Over-sampling in focus areas: Sometimes stakeholders have data needs that are specific to a few
districts; perhaps they want to monitor progress in key intervention areas. In this case, those districts can be
over-sampled without expanding the entire sample to the second administrative level. This helps to maintain
a manageable and cost-effective sample size while still meeting specific data needs.
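Over-sampling changes how results must be combined: over-sampled districts have to be weighted down when producing national estimates. The sketch below illustrates the idea in Python with made-up numbers and hypothetical function names. It assumes a simplified weight equal to a stratum's population share divided by its sample share; real survey weights typically also reflect cluster-level selection probabilities and nonresponse.

```python
def design_weights(population: dict, sample: dict) -> dict:
    """Simplified weight per stratum: population share / sample share."""
    total_pop = sum(population.values())
    total_n = sum(sample.values())
    return {
        s: (population[s] / total_pop) / (sample[s] / total_n)
        for s in population
    }


def weighted_mean(estimates: dict, sample: dict, weights: dict) -> float:
    """Combine stratum-level estimates into one overall weighted estimate."""
    num = sum(estimates[s] * sample[s] * weights[s] for s in estimates)
    den = sum(sample[s] * weights[s] for s in estimates)
    return num / den


# Hypothetical country: focus districts hold 10% of the population
# but receive half of the interviews.
population = {"focus": 100_000, "rest": 900_000}
sample = {"focus": 1_000, "rest": 1_000}
estimates = {"focus": 0.60, "rest": 0.40}  # made-up indicator values

weights = design_weights(population, sample)
national = weighted_mean(estimates, sample, weights)
unweighted = sum(estimates[s] * sample[s] for s in sample) / sum(sample.values())
print(f"weighted: {national:.2f}, unweighted: {unweighted:.2f}")
```

In this example the unweighted average overstates the national value because the focus districts are over-represented in the sample; the weights restore their true population share.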
Larger sample sizes: It is also possible to implement a standard DHS that is representative at the
second administrative level, which, in some countries, requires a sample size of over 50,000 women. This
can be undertaken only when financial and human resources are sufficient to properly manage the survey
without threatening the overall data quality. Very large sample sizes present unique logistical challenges for
training, staff management, data and lab processing, and provision of supplies. Management of this type of
survey requires a highly skilled and organized implementing agency working in coordination with external technical
assistance.
Small area estimation: Another alternative for producing estimates at lower subnational levels relies
on small area estimation (SAE) techniques and statistical modeling using ancillary geographic information.
If these techniques are acceptable, the DHS survey scope can be limited to data collection for admin 1
estimates only and these data along with covariate data from external sources can be used to estimate
indicators at lower administrative levels.
Currently, The DHS Program applies statistical modeling with ancillary geographic information to create
interpolated surfaces for lower-level estimates. The DHS Program routinely produces modeled surfaces for
12 key indicators. These maps are presented on a 5 x 5 km grid, which can be aggregated to represent the
needed subnational units, along with confidence intervals for the aggregated indicators. This approach does
not provide estimates that are as accurate as high-quality survey data and is currently limited to a small set of
indicators. However, if appropriate for the specific data need, interpolated surfaces may offer an alternative
to very large sample sizes. (References: https://dhsprogram.com/pubs/pdf/SAR15/SAR15.pdf,
https://dhsprogram.com/pubs/pdf/SAR14/SAR14.pdf).
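Aggregating gridded values to subnational units can be sketched as follows. This is a minimal Python illustration, not a description of The DHS Program's procedure: it assumes each grid cell carries a modeled indicator value, a population count, and the name of the unit it falls in, and it assumes a population-weighted average is the desired aggregate. All names and numbers are hypothetical, and the sketch does not reproduce the confidence intervals mentioned above, which depend on the model's uncertainty output.

```python
from collections import defaultdict


def aggregate_cells(cells):
    """Population-weighted mean of grid-cell values for each unit.

    cells: iterable of (unit_name, indicator_value, population) tuples.
    Units whose cells have zero total population are omitted.
    """
    weighted_sum = defaultdict(float)
    population = defaultdict(float)
    for unit, value, pop in cells:
        weighted_sum[unit] += value * pop
        population[unit] += pop
    return {
        unit: weighted_sum[unit] / population[unit]
        for unit in weighted_sum
        if population[unit] > 0
    }


# Made-up grid cells: (unit, modeled indicator value, population in the cell).
grid_cells = [
    ("District A", 0.20, 1_000),
    ("District A", 0.40, 3_000),
    ("District B", 0.10, 500),
]

district_estimates = aggregate_cells(grid_cells)
for district, value in sorted(district_estimates.items()):
    print(f"{district}: {value:.2f}")
```

Weighting by population rather than by area keeps sparsely populated cells from dominating the district value; whether that choice suits a given indicator is a judgment for the analyst.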