Epi 101: Indirect Age Adjustment by Hand and in R

Last time, we talked about direct age adjustment. In direct age adjustment, you know what the age-specific death counts and rates are. Because you’re comparing two populations, you need to account for the differences in the populations’ age distributions. For that, we use a reference population to which the two (or more) populations in question are compared.

In indirect age adjustment, you have a population for which you have a total number of events (e.g. deaths), but you don’t have the age-specific counts. Something like this:

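| Age Group | Population | Deaths |
|---|---|---|
| 0-15 | 15,000 | Unknown |
| 16-35 | 20,000 | Unknown |
| 36-55 | 21,000 | Unknown |
| 56-75 | 30,000 | Unknown |
| Over 75 | 50,000 | Unknown |
| Total Deaths: | | 2,550 |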

This is rarely the case, right? We usually have the ages of the people who die. But this could be the case if, for example, the age-specific counts are unreliable or too small. Or if the total number is more of an estimate than an actual headcount.

Like with direct age adjustment, we get a standard population and add its age-specific population and age-specific deaths to our table:

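| Age Group | Population | Deaths | Standard Population | Std. Pop. Deaths |
|---|---|---|---|---|
| 0-15 | 15,000 | Unknown | 20,000 | 200 |
| 16-35 | 20,000 | Unknown | 30,000 | 1,500 |
| 36-55 | 21,000 | Unknown | 35,000 | 2,450 |
| 56-75 | 30,000 | Unknown | 37,000 | 3,700 |
| Over 75 | 50,000 | Unknown | 28,000 | 4,200 |
| Total Deaths: | | 2,550 | | 12,050 |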

Next, we get the age-specific death rates from the standard population (its deaths divided by its population in each age group), and multiply them by the age-specific population counts of the population in question to get the expected deaths:

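| Age Group | Population | Deaths | Standard Population | Std. Pop. Deaths | Std. Pop. Rate | Expected Deaths |
|---|---|---|---|---|---|---|
| 0-15 | 15,000 | Unknown | 20,000 | 200 | 0.01 | 150 |
| 16-35 | 20,000 | Unknown | 30,000 | 1,500 | 0.05 | 1,000 |
| 36-55 | 21,000 | Unknown | 35,000 | 2,450 | 0.07 | 1,470 |
| 56-75 | 30,000 | Unknown | 37,000 | 3,700 | 0.10 | 3,000 |
| Over 75 | 50,000 | Unknown | 28,000 | 4,200 | 0.15 | 7,500 |
| Totals: | | Deaths: 2,550 | | Deaths: 12,050 | | Expected Deaths: 13,120 |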

As you can see, the expected deaths (13,120) are much higher than the observed deaths (2,550). The Standardized Mortality Ratio (SMR) is 2,550 divided by 13,120, which equals 0.194. This tells us that the population had only about 19% of the deaths we would expect. This could be a good thing, meaning that your population is experiencing fewer deaths than the standard population’s rates would predict. Or it could mean that, just as you don’t have age-specific death counts, you might be missing other deaths. It’s up to you to interpret this.
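Written out as a formula, the calculation looks like the sketch below. The symbols are shorthand introduced here for illustration: r_i is the standard population’s death rate in age group i, n_i is the size of that age group in the population in question, O is the observed deaths and E is the expected deaths. The age-specific figures are the same ones used in the R vectors further down.

```latex
\begin{align*}
E &= \sum_i r_i \, n_i \\
  &= (0.01)(15{,}000) + (0.05)(20{,}000) + (0.07)(21{,}000) + (0.10)(30{,}000) + (0.15)(50{,}000) \\
  &= 150 + 1{,}000 + 1{,}470 + 3{,}000 + 7{,}500 = 13{,}120 \\[4pt]
\mathrm{SMR} &= \frac{O}{E} = \frac{2{,}550}{13{,}120} \approx 0.194
\end{align*}
```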

One final bit of math is the indirectly standardized rate, where you multiply the SMR above by the standard population’s crude death rate (12,050/150,000, which is about 0.0803). Multiplying 0.194 by 0.0803 gives you 0.0156, or about 15.6 deaths per 1,000 people. This is the death rate (also called the Adjusted Mortality Rate, AMR) that you would see in the population in question once you remove the effect of age. For comparison, the crude rate of the population in question is 2,550/136,000, which equals 0.01875, or about 18.8 deaths per 1,000 people.
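The same step can be laid out as a short worked equation. The subscript "std" is just a label chosen here for the standard population, and the values are rounded for display.

```latex
\begin{align*}
\text{crude rate}_{\text{std}} &= \frac{12{,}050}{150{,}000} \approx 0.0803 \\[4pt]
\mathrm{AMR} &= \mathrm{SMR} \times \text{crude rate}_{\text{std}}
             \approx 0.1944 \times 0.0803 \approx 0.0156
             \quad (\text{about } 15.6 \text{ per } 1{,}000) \\[4pt]
\text{crude rate}_{\text{study}} &= \frac{2{,}550}{136{,}000} = 0.01875
             \quad (\text{about } 18.8 \text{ per } 1{,}000)
\end{align*}
```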

What About R?

Just like we did last time with direct age standardization, let’s fire up RStudio and start by loading the library this post needs:

library(epitools) # Provides ageadjust.indirect(), used near the end of this post

Next, we create our data:

age_groups <- c("0-15", "16-35", "36-55", "56-75", ">75") # Labels for Age Groups
population <- c(15000,20000,21000,30000,50000) # Population by age group of population of interest
observed <- 2550 # Observed deaths in the population of interest
std_pop <- c(20000,30000,35000,37000, 28000) # Population by age group of Standard Population
std_pop_sum <- sum(std_pop) # Total population of the standard population
std_count <- c(200,1500,2450,3700,4200) # Observed deaths in standard population
std_sum_count <- sum(std_count) # Total observed deaths in standard population
std_rate <- std_count/std_pop # Age specific death rates in standard population
std_crude_rate <- std_sum_count/std_pop_sum # Crude death rate in the standard population

Now, we create a data frame “df” and put the data we created above into it. We also create an additional variable “expected” and fill that in by multiplying the standard population’s age-specific rates by the population in question:

df <- data.frame(age_groups,population,std_pop,std_count, std_rate)
df$expected <- std_rate*population
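Before moving on, it can help to check the data frame against the hand calculation. The snippet below is a small self-contained sketch in base R that rebuilds the same data frame and totals the expected deaths; the name total_expected is introduced here only for illustration.

```r
# Same data as the vectors defined above, repeated so this runs on its own
age_groups <- c("0-15", "16-35", "36-55", "56-75", ">75")
population <- c(15000, 20000, 21000, 30000, 50000) # Population of interest
std_pop    <- c(20000, 30000, 35000, 37000, 28000) # Standard population
std_count  <- c(200, 1500, 2450, 3700, 4200)       # Deaths in standard population
std_rate   <- std_count / std_pop                  # Age-specific standard rates

df <- data.frame(age_groups, population, std_pop, std_count, std_rate)
df$expected <- df$std_rate * df$population         # Expected deaths per age group

print(df)                                          # One row per age group
total_expected <- sum(df$expected)                 # Should agree with the by-hand total
total_expected
```

If the total does not come out at 13,120, the usual culprit is a vector entered in a different age-group order than the others.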

Finally, two quick operations to calculate the SMR and the AMR:

smr <- observed/sum(df$expected) # Standardized mortality ratio
amr <- smr*std_crude_rate*1000 # Adjusted mortality rate per 1,000

There is also a quick way to do all of the above and get 95% confidence intervals at the same time, using ageadjust.indirect() from the epitools package:

indirect <- ageadjust.indirect(observed, # The number of events observed in the population of interest
                         population, # The population by age group in the population of interest
                         std_count, # The number of events observed in the standard population
                         std_pop, # The population by age group in the standard population
                         stdrate = std_rate, # The age specific death rates in the standard population
                         conf.level = 0.95) # The confidence level for your confidence intervals

round(indirect$sir,3) # Standardized Mortality Ratio (sir)
round(indirect$rate*1000,2) # Adjusted Mortality Rate per 1,000 (adj.rate)

From this, you get the same values as we did above (both by hand and with R), but you also get the 95% confidence intervals:

> round(indirect$sir,3) # Standardized Mortality Ratio (sir)
 observed       exp       sir       lci       uci 
 2550.000 13120.000     0.194     0.187     0.202 
> round(indirect$rate*1000,2) # Adjusted Mortality Rate per 1,000 (adj.rate)
crude.rate   adj.rate        lci        uci 
     18.75      15.61      15.02      16.23 
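If you are curious where those intervals come from, they can be reproduced in base R. The sketch below assumes a normal approximation on the log scale of the SMR, with a standard error of 1/sqrt(observed deaths). That is one common approach and it agrees with the output above to the digits shown, but treat it as an illustration rather than a description of the package’s internals. The names z, se_log_smr, smr_ci and amr_ci are introduced here for illustration.

```r
observed       <- 2550            # Observed deaths in the population of interest
expected       <- 13120           # Total expected deaths from the table above
std_crude_rate <- 12050 / 150000  # Crude death rate in the standard population

smr <- observed / expected        # Standardized mortality ratio

# Assumption: log(SMR) is roughly normal with SE = 1 / sqrt(observed)
z          <- qnorm(0.975)        # Critical value for a 95% interval
se_log_smr <- 1 / sqrt(observed)

smr_ci <- exp(log(smr) + c(-1, 1) * z * se_log_smr) # Lower and upper SMR limits
amr_ci <- smr_ci * std_crude_rate * 1000            # Same limits on the rate per 1,000

round(smr_ci, 3)
round(amr_ci, 2)
```

With only 2,550 observed deaths driving the standard error, the interval is narrow. A population with far fewer events would give a much wider one.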

And there you have it. Two ways to do this in R. You can now save the R file (or download the file I used by clicking here). If you have a lot of data, and you don’t want to write out the individual vectors at the beginning, then I suggest you learn the basics of importing data into R and then use that with the packages mentioned above.
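As a starting point for the import route, here is a minimal sketch using base R’s read.csv(). The file and its column names (age_group, population, std_pop, std_count) are hypothetical and only for illustration. The snippet writes a temporary CSV first so that it runs on its own; with real data you would skip that step and point read.csv() at your own file.

```r
# Hypothetical CSV layout, written to a temporary file so the example is runnable
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "age_group,population,std_pop,std_count",
  "0-15,15000,20000,200",
  "16-35,20000,30000,1500",
  "36-55,21000,35000,2450",
  "56-75,30000,37000,3700",
  ">75,50000,28000,4200"
), csv_path)

dat <- read.csv(csv_path, stringsAsFactors = FALSE) # One row per age group
observed <- 2550                                    # Total observed deaths (entered by hand)

dat$std_rate <- dat$std_count / dat$std_pop         # Age-specific standard rates
dat$expected <- dat$std_rate * dat$population       # Expected deaths per age group

smr <- observed / sum(dat$expected)                 # Standardized mortality ratio
round(smr, 3)
```

Keeping everything in one data frame means the age groups stay lined up row by row, which is easier to check than five separate vectors.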

Big thanks to these folks and these folks for their examples on how to use dsr and epitools, respectively.

This work is licensed under a Creative Commons Attribution 4.0 International License.