Episensr 0.9.2 now available on CRAN

Bug fixes were brought to episensr R package, regarding the use of distributions and computations of odds and risk ratios in probsens.conf function for the probabilistic sensitivity analysis for unmeasured confounder. Improvement on the use of distributions was also brought to other probsens series of functions. Let’s run the example from Modern Epidemiology by Rothman, Greenland & Lash, on page 365-366.

This example is taken from a paper by Greenland et al. (1994), looking at the association between occupational resins exposure and lung cancer mortality, where smoking is an unmeasured potential confounder. The 2-by-2 table is the following:

Resins exposed Resins unexposed
Cases 45 94
Controls 257 945

After adjustment for age and year at death, Greenland et al. found an OR of 1.77 (95% CI from 1.18 to 2.64). Is smoking, for which they had no data about, had an effect on this association? How large the association between resins and smoking had to be to remove the association between resins and lung cancer association after adjustment for smoking? For this, you have to know 3 bias parameters: the association between smoking and lung cancer, the prevalence of smoking among those unexposed to resins, and the prevalence of smoking in those exposed to resins.

Prior probability distributions are given to each of these bias parameters. Prevalences of smoking in those exposed to resins, and of smoking in those unexposed to resins receive prior distributions that are uniform between 0.40 and 0.70. Prior distribution for the odds ratio associating smoking with lung cancer mortality is log-normal with 95% limits of 5 and 15. How to find the parameters of this distribution? Its mean is [ln(15) + ln(5)] / 2 = 2.159, and its standard deviation is [ln(15) - ln(5)] / 3.92 = 0.28, and it looks like this:

set.seed(123)
x <- rlnorm(10000, meanlog = 2.159, sdlog = 0.28)
quantile(x, c(0.025, 0.975))
##      2.5%     97.5% 
##  4.979672 14.959055
plot(density(x))
Log-normal distribution with meanlog = 2.159 and sdlog = 0.28.

Figure 1: Log-normal distribution with meanlog = 2.159 and sdlog = 0.28.

Now, let’s run episensr.conf with 50,000 iterations:

library(episensr)

set.seed(123)
res <- probsens.conf(matrix(c(45, 94, 257, 945),
                            dimnames = list(c("Cases+", "Cases-"),
                                            c("Res+", "Res-")),
                            nrow = 2, byrow = TRUE),
                     reps = 50000,
                     prev.exp = list("uniform", c(.4, .7)),
                     prev.nexp = list("uniform", c(.4, .7)),
                     risk = list("log-normal", c(2.159, .28)))
res
## --Observed data-- 
##          Outcome: Cases+ 
##        Comparing: Res+ vs. Res- 
## 
##        Res+ Res-
## Cases+   45   94
## Cases-  257  945
## 
##                                      2.5%    97.5%
## Observed Relative Risk: 1.646999 1.182429 2.294094
##    Observed Odds Ratio: 1.760286 1.202457 2.576898
## ---
##                                            Median 2.5th percentile
##            RR (SMR) -- systematic error: 1.591116         1.197359
## RR (SMR) -- systematic and random error: 1.586537         1.025504
##            OR (SMR) -- systematic error: 1.761771         1.246469
## OR (SMR) -- systematic and random error: 1.758665         1.050075
##                                          97.5th percentile
##            RR (SMR) -- systematic error:          2.098129
## RR (SMR) -- systematic and random error:          2.459242
##            OR (SMR) -- systematic error:          2.485622
## OR (SMR) -- systematic and random error:          2.956335

The median adjusted OR is 1.76 [1.25,2.49]. The ratio of the two 95% CI bounds is 2.49/1.25 = 1.99. The ratio from the conventional 95% confidence limits was 2.64/1.18 = 2.24. These 2 ratios are pretty, and therefore our uncertainty about confounding is similar to our uncertainty about random error.

You can even plot the bias-adjusted OR including both systematic and random error:

library(ggplot2)
ggplot(res$sim.df, aes(x = tot.ORadj.smr)) +
    geom_histogram(binwidth = .1, colour = "blue", fill = "white") +
    xlab("OR adjusted for confounding and random error")
Distribution of the 50,000 confounder-adjusted odds ratios.

Figure 2: Distribution of the 50,000 confounder-adjusted odds ratios.