Bug fixes were brought to the episensr R package regarding the use of distributions and the computation of odds and risk ratios in the probsens.conf function, which performs probabilistic sensitivity analysis for an unmeasured confounder. Improvements to the use of distributions were also made in a series of other functions.
Let’s run the example from Modern Epidemiology by Rothman, Greenland & Lash, pages 365-366.
This example is taken from a paper by Greenland et al. (1994), looking at the association between occupational resins exposure and lung cancer mortality, where smoking is an unmeasured potential confounder. The 2-by-2 table is the following:
|          | Resins exposed | Resins unexposed |
|----------|----------------|------------------|
| Cases    | 45             | 94               |
| Controls | 257            | 945              |
After adjustment for age and year at death, Greenland et al. found an OR of 1.77 (95% CI from 1.18 to 2.64). Did smoking, for which they had no data, have an effect on this association? How large would the association between resins and smoking have to be to remove the resins-lung cancer association after adjustment for smoking? To answer this, you have to know 3 bias parameters: the association between smoking and lung cancer, the prevalence of smoking among those unexposed to resins, and the prevalence of smoking among those exposed to resins.
Prior probability distributions are given to each of these bias parameters. The prevalences of smoking in those exposed to resins and in those unexposed to resins receive prior distributions that are uniform between 0.40 and 0.70. The prior distribution for the odds ratio associating smoking with lung cancer mortality is log-normal with 95% limits of 5 and 15. How do we find the parameters of this distribution? Its mean is [ln(15) + ln(5)] / 2 = 2.159, its standard deviation is [ln(15) - ln(5)] / 3.92 = 0.28, and it looks like this:
```r
set.seed(123)
x <- rlnorm(10000, meanlog = 2.159, sdlog = 0.28)
quantile(x, c(0.025, 0.975))
```

```
##      2.5%     97.5%
##  4.979672 14.959055
```
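To double-check the derivation of these parameters, here is a short sketch in Python (R works just as well) that reconstructs meanlog and sdlog from the 95% limits of 5 and 15, then back-transforms to confirm the limits are recovered:

```python
import math

# 95% of a log-normal lies within meanlog +/- 1.96 * sdlog on the log scale,
# so the log-limits span 2 * 1.96 = 3.92 standard deviations.
meanlog = (math.log(15) + math.log(5)) / 2   # midpoint of the log-limits
sdlog = (math.log(15) - math.log(5)) / 3.92  # half-width divided by 1.96

# Back-transforming should recover the stated 95% limits.
lower = math.exp(meanlog - 1.96 * sdlog)
upper = math.exp(meanlog + 1.96 * sdlog)
print(round(meanlog, 3), round(sdlog, 3))  # 2.159 0.28
print(round(lower, 2), round(upper, 2))    # 5.0 15.0
```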
Now, let’s run
probsens.conf with 50,000 iterations:
```r
library(episensr)
set.seed(123)
res <- probsens.conf(matrix(c(45, 94, 257, 945),
                            dimnames = list(c("Cases+", "Cases-"),
                                            c("Res+", "Res-")),
                            nrow = 2, byrow = TRUE),
                     reps = 50000,
                     prev.exp = list("uniform", c(.4, .7)),
                     prev.nexp = list("uniform", c(.4, .7)),
                     risk = list("log-normal", c(2.159, .28)))
res
```
```
## --Observed data--
##          Outcome: Cases+
##        Comparing: Res+ vs. Res-
##
##        Res+ Res-
## Cases+   45   94
## Cases-  257  945
##
##                                       2.5%     97.5%
## Observed Relative Risk: 1.646999  1.182429  2.294094
##    Observed Odds Ratio: 1.760286  1.202457  2.576898
## ---
##                                            Median  2.5th percentile
## RR (SMR) -- systematic error:            1.591116          1.197359
## RR (SMR) -- systematic and random error: 1.586537          1.025504
## OR (SMR) -- systematic error:            1.761771          1.246469
## OR (SMR) -- systematic and random error: 1.758665          1.050075
##                                          97.5th percentile
## RR (SMR) -- systematic error:                     2.098129
## RR (SMR) -- systematic and random error:          2.459242
## OR (SMR) -- systematic error:                     2.485622
## OR (SMR) -- systematic and random error:          2.956335
```
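To see the mechanics behind these numbers, here is a minimal sketch in Python of one way a single simulation draw can adjust the observed risk ratio for the unmeasured confounder, using the classical external-adjustment bias factor for a binary confounder. This is an illustration, not episensr's exact algorithm: the package's SMR standardization differs in detail, so the results will not match the output above exactly.

```python
import random

random.seed(123)

RR_OBS = 1.647  # crude risk ratio from the 2-by-2 table above

def adjusted_rr_draw():
    """One probabilistic draw: sample the three bias parameters,
    then divide the observed RR by the resulting bias factor."""
    p1 = random.uniform(0.4, 0.7)               # smoking prevalence, resins-exposed
    p0 = random.uniform(0.4, 0.7)               # smoking prevalence, resins-unexposed
    rr_cd = random.lognormvariate(2.159, 0.28)  # smoking-lung cancer association
    # Classical confounding bias factor for a binary confounder
    bias = (rr_cd * p1 + (1 - p1)) / (rr_cd * p0 + (1 - p0))
    return RR_OBS / bias

draws = sorted(adjusted_rr_draw() for _ in range(50000))
median = draws[len(draws) // 2]  # median of the adjusted distribution
```

Note that when p1 equals p0 the bias factor is 1 and no adjustment occurs: with identical priors on the two prevalences, the adjustment mostly spreads the distribution rather than shifting it.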
The median adjusted OR is 1.76 [1.25, 2.49]. The ratio of the two 95% CI bounds is 2.49/1.25 = 1.99. The ratio from the conventional 95% confidence limits was 2.64/1.18 = 2.24. These two ratios are pretty close, and therefore our uncertainty about confounding is similar to our uncertainty about random error.
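The interval-width comparison above is just this arithmetic, here written out in Python:

```python
# Ratio of upper to lower 95% limits: a scale-free gauge of interval width.
bias_adjusted = 2.49 / 1.25  # simulation limits, systematic error only
conventional = 2.64 / 1.18   # conventional limits from Greenland et al.
print(round(bias_adjusted, 2), round(conventional, 2))  # 1.99 2.24
```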
You can even plot the bias-adjusted OR including both systematic and random error:
```r
library(ggplot2)
ggplot(res$sim.df, aes(x = tot.ORadj.smr)) +
  geom_histogram(binwidth = .1, colour = "blue", fill = "white") +
  xlab("OR adjusted for confounding and random error")
```