Assessing the human condition: capture-recapture techniques

Allows accurate counts of those difficult to reach populations

Evaluating the human condition occurs in many disciplines, for example, epidemiology, sociology, political sciences, criminology, and market research. Despite advances in these fields, progress has been sluggish compared with that in the "hard" sciences. A primary force for rapid developments in these sciences has been the discovery and use of new technologies (for example, the polymerase chain reaction, electron microscopy, carbon-14 dating), which increase the precision of measurement and reduce costs, resulting in a rapid accumulation of knowledge.(1,2) Human population science has society as its laboratory and "counting humans" as its basis. Counting techniques, however, have changed little this century. The use of capture-recapture techniques could bring about a paradigm shift in how counting is done in all the disciplines that assess human populations.

Historically, the main approach to evaluating human populations has been to find the members of a community with a characteristic of interest and count them; for example, researchers have counted people with a particular disease (epidemiology), income level (economics), and party affiliation (political science). This approach is rooted in the belief that one needs to count and classify everyone to know about them. Complete enumeration, though, is costly and inefficient. Alternatives such as sampling a small group and extrapolating the results to a region or nation have been developed. These techniques may, however, be slow, costly, limited, and "foreign" to the people who need the data for policy, for example, governments.

Governments typically cannot wait for population scientists to come up with answers to their urgent questions. Instead they extract data from vast repositories of routinely collected lists of people categorised according to social, medical, or demographic factors. But because these lists may be incomplete, the conclusions may be flawed. Could the technique of capture-recapture provide an answer to this impasse of accurate but limited data versus inaccurate but broad based data?

Counting is not limited to humans. Animal population scientists share many goals with human population scientists, but in terms of the data they have collected the animal scientists are way ahead. This is because animal ecologists recognized that a complete count of wildlife was impossible and quickly scrapped human demography's goal of complete enumeration. Instead, they developed intuitive estimators of the population based on incomplete sampling, namely capture-recapture.(3)

It works like this. If you wanted to ascertain the number of fish in the Sea of Galilee you would go out and catch fish, tag them, and then release them. On subsequent days you would net fish again and note the number of tagged fish in the catch. By using a simple formula one can estimate the total number of fish, with confidence intervals surrounding the estimate. This approach collects samples (lists) and looks for tags (duplicates) and from this determines the degree of undercounting. The sample is then adjusted for the degree of ascertainment. Further advances include log linear modelling (to evaluate and control for the degrees of dependency among samples) and "open" system models (which permit migration in and out).(3-5)
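The fish example can be sketched in a few lines of code. The text does not say which "simple formula" is meant; this sketch assumes the classic two-sample Lincoln-Petersen estimator, together with Chapman's nearly unbiased variant and a normal-approximation 95% confidence interval, as one common textbook choice. The function name, parameter names, and catch sizes are hypothetical, and the calculation assumes a closed population, tags that are never lost, and independent catches.

```python
import math


def two_sample_estimate(first_catch, second_catch, tagged_in_second, z=1.96):
    """Estimate the size of a closed population from two catches.

    first_catch      -- fish caught, tagged, and released on the first day (M)
    second_catch     -- fish caught on a later day (C)
    tagged_in_second -- fish in the later catch already carrying a tag (R)
    z                -- normal quantile for the interval (1.96 for about 95%)
    """
    M, C, R = first_catch, second_catch, tagged_in_second
    if R <= 0:
        raise ValueError("no tagged fish recaptured: the estimate is undefined")
    if R > min(M, C):
        raise ValueError("recaptures cannot exceed either catch")

    # Lincoln-Petersen: the tagged share of the second catch (R / C)
    # is assumed to equal the tagged share of the whole lake (M / N).
    lincoln_petersen = M * C / R

    # Chapman's variant reduces the small-sample bias of the estimate above.
    chapman = (M + 1) * (C + 1) / (R + 1) - 1
    variance = ((M + 1) * (C + 1) * (M - R) * (C - R)) / ((R + 1) ** 2 * (R + 2))
    half_width = z * math.sqrt(variance)

    return {
        "lincoln_petersen": lincoln_petersen,
        "chapman": chapman,
        "ci": (chapman - half_width, chapman + half_width),
    }


# Hypothetical numbers: 200 fish tagged, 150 netted later, 30 of them tagged.
result = two_sample_estimate(200, 150, 30)
print(result)
```

The fewer tagged fish that turn up in the second catch, the larger the estimated population and the wider the interval around it.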

Much of what we know about the size, distribution, and characteristics of wildlife populations is based on this and other approaches to counting with incomplete enumeration. As a result, we know considerably more about the global numbers of eagles, sperm whales, and bison (6) than we know about the number and distribution of people who are unemployed, sick, or hungry in our societies.

Demography also has a long history of evaluating undercounting and to a limited extent has employed capture-recapture methods to adjust counts.(7) However, there still lingers from demography (for example, the census and vital statistics) the fundamental belief that anything less than counting every person is "imperfect." Our animal ecologist friends would argue that trying to count everyone is a noble but futile and expensive goal.(8)

Using capture-recapture techniques as a primary means of monitoring the human condition could bring substantial benefits. With readily accessible or newly collected lists, broad and inexpensive measures of events shaping humankind can be obtained at both the community and the national level. Human population scientists have avoided using such methods mainly because they believe that the low and variable degrees of ascertainment of lists yield "shoddy" data and therefore flawed conclusions. Yet estimates of bird, fish, and mosquito populations show that the degree of undercounting can be estimated precisely and used to adjust for the degree of ascertainment. These estimates are more accurate than those derived from available lists, either alone or aggregated. We must therefore break away from two basic tenets of human population scientists: that undercounting is bad and that we need to count everyone.(9)

Capture-recapture would be useful in any discipline that counts people. To cite one example, most countries routinely collect data on occupational injuries. These data lists are usually incomplete, yet important policy decisions are based on them. Typically, occupational injuries are identified to governments by multiple sources. These sources are pooled together, the duplicates taken out, and the names aggregated into a single list which forms the basis of the published "occupational injury statistics." Many other examples of incomplete government lists exist (for example, those of unemployed people, disabled people, and places treating patients with cancer).

Most academics scoff at using multiple data sources provided by government because the sources' degree of ascertainment varies. By using information on the duplicate data, capture-recapture techniques can formally measure the degree of undercounting in the individual sources. Estimates could then be adjusted for the degree of undercounting, so that the statistics move beyond the aggregated "count" and closer to the "truth."
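A minimal sketch of this idea for two sources follows. It assumes that records in the two lists can be matched exactly on a shared identifier, that the sources report independently, and that the population is closed; the function name, the keys of the result, and the record identifiers are hypothetical. The simple two-sample (Lincoln-Petersen) estimator is used here purely for illustration, and dependent sources would call for the log linear models mentioned earlier.

```python
def ascertainment_from_two_lists(source_a, source_b):
    """Estimate the undercount of two overlapping lists of the same events.

    source_a, source_b -- iterables of record identifiers (for example,
                          an ID assigned to each injured worker)
    """
    a, b = set(source_a), set(source_b)
    duplicates = a & b            # records reported by both sources
    aggregated = a | b            # the usual published list: pooled, de-duplicated
    if not duplicates:
        raise ValueError("no duplicates: the total cannot be estimated")

    # Two-sample estimate of the true number of events.
    estimated_total = len(a) * len(b) / len(duplicates)

    return {
        "aggregated_count": len(aggregated),
        "estimated_total": estimated_total,
        # Share of all estimated events that each list actually captured.
        "ascertainment_a": len(a) / estimated_total,
        "ascertainment_b": len(b) / estimated_total,
        "ascertainment_aggregated": len(aggregated) / estimated_total,
    }


# Hypothetical lists: 60 and 50 records, 20 of which appear on both.
list_a = range(1, 61)
list_b = range(41, 91)
print(ascertainment_from_two_lists(list_a, list_b))
```

In this made-up case the aggregated list holds 90 names, yet the estimated total is 150 events, so even the pooled count would capture only about 60% of them.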

Another perceived disadvantage of this approach is that the criteria for entry in a particular list may not be consistent. Although the criteria for listing someone as socialist, jobless, disadvantaged, Asian, art patron, or occupationally injured may vary considerably within and between lists, assessing the sensitivity and specificity of the individual items on the lists is possible. Once determined, estimates derived from capture-recapture can then be adjusted for the diagnostic accuracy of the lists.

This is not to say that capture-recapture is perfect: 30 to 40 years of work has been needed to evaluate the method in animals. The assumptions now need to be assessed in human populations, but, given our current knowledge, the techniques offer a viable alternative or companion to current methods. The ramifications are immense: for the first time we could have widespread, accurate, and cost effective assessments of people's conditions. For both population scientists and governments, statistics would be more accurate and cheaper. This could lead to a new approach towards measurements in society and applying accurate knowledge to policy.

In future, capture-recapture could be coupled with some of the remarkable advances in global telecommunications.(10,11) Accurate tele-monitoring of humans could be available from community level to global level on an almost daily basis. The accuracy of the data that inform government decisions on public health and welfare could dramatically increase while costs fall. Two papers recently published in the BMJ give an idea of this method's exciting potential to count difficult to reach populations: female streetworking prostitutes in Glasgow (12) and homeless people in Westminster (p 27).(13)

I thank Dr Janice Dorman for her help. My work was supported by the National Institutes of Health (grant # 5 R01 DK24021).

Department of Epidemiology
Graduate School of Public Health,
University of Pittsburgh,
Pittsburgh, PA 15261
1 Hall SS. How technique is changing science. Science 1992;257:244-9.
2 Kuhn TS. The structure of scientific revolutions. Chicago: University
of Chicago Press, 1970.
3 Seber GAF. The estimation of animal abundance and related parameters.
2nd ed. London: Charles Griffin, 1982.
4 Bishop YMM, Fienberg SE, Holland PW. Discrete multivariate analysis:
theory and practice. Cambridge, MA: MIT Press, 1975:229-56.
5 Cormack R. Log linear models for capture-recapture. Biometrics 1989,
6 LaPorte RE, McCarty DJ, Tull ES, Tajima N. Counting birds, trees
and NCDs. Lancet 1992;339:494-5.
7 Sekar CC, Deming WE. On a method of estimating birth and death rates
and the extent of registration. Journal of the American Statistical
Association 1949;44:101-15.
8 McCarty DJ, Tull ES, Moy CS, LaPorte RE. Ascertainment corrected
rates: applications of capture-recapture methods. Int J Epidemiol 1993;22:559-65.
9 LaPorte RE, McCarty DJ, Songer TJ, Bruno G, Tajima N. Disease
monitoring. Lancet 1993;341:1416.
10 LaPorte RE, Gooch WA, Gamboa C, Tajima N. International disease
counting form. Lancet 1993;342:1416.
11 Gore A. Infrastructure for the global village. Sci Am 1991;
12 McKeganey N, Barnard M, Leyland A, Coote I, Follet E. Female
streetworking prostitutes and HIV infection in Glasgow. BMJ 1992;
13 Fisher N, Turner SE, Pugh R, Taylor C. Estimating numbers of
homeless and homeless mentally ill people in north east Westminster
by using capture-recapture analysis. BMJ 1994;308:27-30.