International Database on Longevity

Validation procedures

The IDL contains thoroughly validated information on individuals of extreme ages. Data collection is performed in such a way that age ascertainment bias is avoided.

Age validation procedures vary across countries depending on the sources of information that are available in each country. Birth or baptism records are available in some countries, while in others, the validation is performed by checking early census records.

In the initial database collection (2010), the quality of the validation procedures used to produce each country's data was assigned to one of two categories: fully validated, the more reliable level; and carefully checked, the less reliable level. Cases in which the individual’s early life documents were validated were classified as fully validated, while all other validated cases were classified as carefully checked.

In the current version of the IDL, the quality of the validation procedures used is not noted, because in some situations it was difficult to establish the formal criteria needed to distinguish between the two levels of validation. Instead, the information about the documents used to validate age in each country is provided in standardized, country-specific metadata files, and is included in the data files.

The lists of semi-supercentenarian cases were compiled in the same way as the corresponding lists of supercentenarians in all of the countries except England and Wales, France, and the USA. In France and in England and Wales, the lists of semi-supercentenarians contain all of the known candidates, but the validation of semi-supercentenarians was done on a sample basis (so-called sample validation). Thus, in these countries, the age-specific probability that a candidate is successfully validated was estimated not by conducting an exhaustive validation of all candidates, but by validating only the candidates in a randomly selected sample. For instance, in England and Wales, 22.5% of the female deaths and 100% of the male deaths at ages 105-109 were validated.

In France, the validation of semi-supercentenarians was undertaken at various times. The records from the 2004 data extract from the Registre National d'Identification des Personnes Physiques (RNIPP) were the first to be validated. Because of the large number of records available (2031 cases), exhaustive checks were performed only for the oldest semi-supercentenarians (107-109 years old). By contrast, only one-half of the cases of individuals reaching age 106 and one-third of the cases of individuals reaching age 105 (randomly selected) were checked. This process resulted in a sample of 1051 cases among the cohorts who were supposedly born in 1883-1897 and who died in 1988-2002 in France (Metropolitan France and French overseas départements, except Mayotte). Of these records, 1043 were validated. Later, using the 2014 RNIPP data extract, a random sample of 100 cases (99 of which were validated) was selected by choosing 20 records from each one-year age group. This 100-record sample is provided in the updated IDL, together with the complete list of known semi-supercentenarian candidates in France.

The only validated cases provided for the United States are from a sample that was randomly drawn from the population. Thus, the list of candidates was limited to a sample representing only around 10% of men aged 105-109 and women aged 108-109 in the US population.
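
As a rough illustration of the French sample validation described above, the following Python sketch encodes the sampling fractions stated for the 2004 RNIPP extract and computes the share of checked records that were validated. The names `SAMPLING_FRACTION`, `sampling_weight`, and `validation_rate` are hypothetical and are not part of the IDL. The text reports only aggregate counts, so the rates computed here are pooled across ages rather than age-specific.

```python
from fractions import Fraction

# Share of candidates checked at each age in the 2004 RNIPP extract,
# as stated in the text: one-third at 105, one-half at 106, all at 107-109.
SAMPLING_FRACTION = {
    105: Fraction(1, 3),
    106: Fraction(1, 2),
    107: Fraction(1),
    108: Fraction(1),
    109: Fraction(1),
}


def sampling_weight(age):
    """Number of candidates represented by one checked record at this age.

    This is the inverse of the sampling fraction (an assumed, standard
    way to reweight a stratified random sample).
    """
    return 1 / SAMPLING_FRACTION[age]


def validation_rate(validated, checked):
    """Share of checked records whose age was successfully validated."""
    return validated / checked


# Aggregate counts reported in the text.
rate_2004 = validation_rate(1043, 1051)  # 2004 extract sample
rate_2014 = validation_rate(99, 100)     # 2014 extract sample

print(f"2004 sample: {rate_2004:.4f}")
print(f"2014 sample: {rate_2014:.4f}")
```

Under these assumptions, a checked record at age 105 would stand in for three candidates and one at age 106 for two, while records at ages 107-109 represent only themselves.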

Age ascertainment bias

The IDL emphasizes that the data on validated semi- and supercentenarians included in the database are free of age ascertainment bias. This is an important point, because some kinds of identification procedures may be more likely to register people who are, for example, 115 years old than people who are merely 110 years old. Media coverage, for instance, is more common for the oldest cases, so individuals who died at younger ages (i.e., shortly after their 110th birthday) may be underrepresented in the press. This kind of age bias in the inclusion of semi- and supercentenarians in a database can result in serious misestimates of mortality patterns. Consequently, the IDL aims to compile lists of validated semi- and supercentenarians for which the probability of cases being identified is age-independent.
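
The effect of age-dependent identification can be illustrated with a small Python sketch. All of the numbers below are invented for illustration and are not IDL data: a hypothetical group of 1000 people reaching age 110, with deaths at 113 and above lumped into age 113, and hypothetical detection probabilities. The function names are likewise assumptions.

```python
def death_probability_at_110(deaths_by_age):
    """Share of everyone reaching age 110 who dies before age 111."""
    total = sum(deaths_by_age.values())
    return deaths_by_age[110] / total


def observe(deaths_by_age, detection_prob):
    """Expected number of cases that end up in a database, by age at death."""
    return {age: n * detection_prob[age] for age, n in deaths_by_age.items()}


# Hypothetical true deaths by age at death (113 includes 113 and above).
true_deaths = {110: 500, 111: 250, 112: 125, 113: 125}

# Identification that favors the oldest cases (e.g. media-based lists).
age_dependent = {110: 0.5, 111: 0.75, 112: 1.0, 113: 1.0}

# Identification that misses cases, but equally at every age.
age_independent = {110: 0.5, 111: 0.5, 112: 0.5, 113: 0.5}

q_true = death_probability_at_110(true_deaths)
q_biased = death_probability_at_110(observe(true_deaths, age_dependent))
q_unbiased = death_probability_at_110(observe(true_deaths, age_independent))

print(f"true: {q_true:.3f}")
print(f"age-dependent detection: {q_biased:.3f}")
print(f"age-independent detection: {q_unbiased:.3f}")
```

In this toy setting, incomplete but age-independent identification leaves the estimated death probability at age 110 unchanged, whereas identification that favors older cases understates it.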