%\VignetteIndexEntry{Years of life lost (YLL)} \SweaveOpts{results=verbatim, keep.source=TRUE, include=FALSE, eps=FALSE} \documentclass[a4paper, twoside, 12pt]{report} \newcommand{\Title}{Years of Life Lost (YLL) to disease:\\Diabetes in DK as example} \newcommand{\Tit}{YLL} \newcommand{\Version}{2} \newcommand{\Dates}{June 2024} \newcommand{\Where}{SDC} \newcommand{\Homepage}{\url{http://bendixcarstensen.com/Epi}} \newcommand{\Faculty}{\begin{tabular}{rl} Bendix Carstensen & Steno Diabetes Center Copenhagen, Herlev, Denmark\\ & {\small \& Department of Biostatistics, University of Copenhagen} \\ & \texttt{b@bxc.dk}\\ & \url{http://BendixCarstensen.com} \\[1em] \end{tabular}} \input{topreport} <>= options(width=90, SweaveHooks=list(fig=function() par(mar = c(3, 3, 1, 1), mgp = c(3, 1, 0) / 1.6, las = 1, bty = "n"))) @ % \renewcommand{\rwpre}{./yll} <>= anfang <- Sys.time() cat("Start time:", format(anfang, "%F, %T"), "\n") @ % \chapter{Theory and technicalities} This vignette for the \texttt{Epi} package describes the probabilistic and demographic background for and technical implementation of the \texttt{erl} and \texttt{yll} functions that computes the expected residual life time and years of life lost in an illness-death model. \section{Years of life lost (YLL)} \ldots to diabetes or any other disease for that matter. The general concept in calculation of ``years lost to\ldots'' is the comparison of the expected lifetime between two groups of persons; one with and one without disease (in this example DM). The expected lifetime is the area under the survival curve, so basically the exercise requires that two survival curves that are deemed relevant be available. The years of life lost is therefore just the area between the survival curves for those ``Well'', $S_W(t)$, and for those ``Diseased'', $S_D(t)$: \[ \YLL = \int_0^\infty S_W(t) - S_D(t) \dif t \] The time $t$ could of course be age, but it could also be ``time after age 50'' and the survival curves compared would then be survival curves \emph{conditional} on survival till age 50, and the YLL would be the years of life lost for a 50 year old person with diabetes relative to a 50 year old person without diabetes. If we are referring to the expected lifetime we will more precisely use the label expected residual lifetime, ERL. \section{Constructing the survival curves} YLL can be computed in two different ways, depending on the way the survival curve and hence the expected lifetime of a person \emph{without} diabetes is computed: \begin{itemize} \item Assume that the ``Well'' persons are \emph{immune} to disease --- using only the non-DM mortality rates throughout for calculation of expected life time. \item Assume that the ``Well'' persons \emph{can} acquire the disease and thereby see an increased mortality, thus involving all three rates shown in figure \ref{fig:states}. \end{itemize} The former gives a higher YLL because the comparison is to persons assumed immune to DM (and yet with the same mortality as non-immune prior to diagnosis), the latter gives a more realistic picture of the comparison of group of persons with and without diabetes at a given age that can be interpreted in the real world. The differences can be illustrated by figure \ref{fig:states}; the immune approach corresponds to an assumption of $\lambda(t)=0$ in the calculation of the survival curve for a person in the ``Well'' state. Calculation of the survival of a diseased person already in the ``DM'' state is unaffected by assumptions about $\lambda$. We can illustrate the states and transitions using \texttt{boxes}: <>= library(Epi) TM <- matrix(NA, 4, 4) rownames(TM) <- colnames(TM) <- c("Well", "DM", "Dead", "Dead(DM)") TM[1, 2:3] <- TM[2, 4] <- 1 TM zz <- boxes(TM, boxpos = list(x = c(20, 80, 20, 80), y = c(80, 80, 20, 20)), wm = 1.5, hm = 4) @ % We can edit the output from \texttt{boxes} to get the proper annotation of the transition rates: <>= zz$Arrowtext <- c(expression(lambda(a)), expression(mu[W](a)), expression(mu[D][M](a,d))) boxes.MS(zz) @ % \insfig{states}{0.7}{Illness-death model describing diabetes incidence and -mortality and functions of age and duration} \subsection{Total mortality --- a shortcut?} A practical crude shortcut could be to compare the ERL in the diabetic population to the ERL for the \emph{entire} population (that is using the total mortality ignoring diabetes status). Note however that this approach also counts the mortality of persons that acquired the disease earlier, thus making the comparison population on average more ill than the population we aim at, namely those well at a given time, which only then become more gradually ill. How large these effects are can however be empirically explored, as we shall do later. \subsection{Disease duration} In the exposition above there is no explicit provision for the effect of disease duration, but if we were able to devise mortality rates for any combination of age and duration, this could be taken into account. There are however severe limitations in this as we in principle would want to have duration effects as long as the age-effects --- in principle for all $(a, d)$ where $d\leq A$, where $A$ is the age at which we condition. So even if we were only to compute ERL from age, say, 40 we would still need duration effects up to 60 years (namely to age 100). The incorporation of duration effects is in principle trivial from a computational point of view, but we would be forced to entertain models predicting duration effects way beyond what is actually observed disease duration in any practical case. \subsection{Computing integrals} The practical calculations of survival curves, ERL and YLL involves calculation of (cumulative) integrals of rates and functions of these as we shall see below. This is easy if we have a closed form expression of the function, so its value may be computed at any time point --- this will be the case if we model rates by smooth parametric functions. Computing the (cumulative) integral of a function is done as follows: \begin{itemize} \item Compute the value of the function (mortality rate for example) at the midpoints of a sequence of narrow equidistant intervals --- for example one- or three month intervals of age, say. \item Take the cumulative sum of these values multiplied by the interval length --- this will be a very close approximation to the cumulative integral evaluated at the end of each interval. \item If the intervals are really small (like 1/100 year), the distinction between the value at the middle and at the end of each interval becomes irrelevant. \end{itemize} Note that in the above it is assumed that the rates are given in units corresponding to the interval length --- or more precisely, as the cumulative rates over the interval. \section{Survival functions in the illness-death model} The survival functions for persons in the ``Well'' state can be computed under two fundamentally different scenarios, depending on whether persons in the ``Well'' state are assumed to be immune to the disease ($\lambda(a)=0$) or not. \subsection{Immune approach} In this case both survival functions for person in the two states are the usual simple transformation of the cumulative mortality rates: \[ S_W(a) = \exp\left(-\int_0^a\!\!\mu_W(u) \dif u \right), \qquad S_D(a) = \exp\left(-\int_0^a\!\!\mu_D(u) \dif u \right) \] \subsubsection{Conditional survival functions} If we want the \emph{conditional} survival functions given survival to age $A$, say, they are just: \[ S_W(a|A) = S_W(a)/S_W(A), \qquad S_D(a|A) = S_D(a)/S_D(A) \] \subsection{Non-immune approach} For a diseased person, the survival function in this states is the same as above, but the survival function for a person without disease (at age 0) is (see figure \ref{fig:states}): \[ S(a) = \ptxt{Well}\!(a) + \ptxt{DM}\!(a) \] In the appendix of the paper \cite{Carstensen.2008c} is an indication of how to compute the probability of being in any of the four states shown in figure \ref{fig:states}, which I shall repeat here: In terms of the rates, the probability of being in the ``Well'' box is simply the probability of escaping both death (at a rate of $\mu_W(a)$) and diabetes (at a rate of $\lambda(a)$): \[ \ptxt{Well}(a) = \exp\left(\!-\int_0^a\!\!\mu_W(u)+\lambda(u) \right) \dif u \] The probability of being alive with diabetes at age $a$, is computed given that diabetes occurred at age $s$ ($s>= data(DMepi) @ % The dataset \texttt{DMepi} contains diabetes events, deaths and person-years for persons without diabetes and deaths and person-years for persons with diabetes, classified by age (\texttt{A}) and calendar year (\texttt{P}): <<>>= str(DMepi) head(DMepi) @ % For each combination of sex, age, period and date of birth in 1 year age groups, we have the person-years in the ``Well'' (\texttt{Y.nD}) and the ``DM'' (\texttt{Y.DM}) states, as well as the number of deaths from these (\texttt{D.nD}, \texttt{D.DM}) and the number of incident diabetes cases from the ``Well'' state (\texttt{X}). In order to compute the years of life lost to diabetes and how this has changed over time, we fit models for the mortality and incidence of both groups (and of course, separately for men and women). The models we use will be age-period-cohort models \cite{Carstensen.2007a} providing estimated mortality rates for ages 0--99 and dates 1.1.1996--1.1.2016. First we transform the age and period variables to reflect the mean age and period in each of the Lexis triangles. We also compute the total number of deaths and amount of risk time, as we are going to model the total mortality as well. Finally we restrict the dataset to ages over 30 only: <<>>= DMepi <- transform(subset(DMepi, A > 30), A = A + 0.5, P = P + 0.5, D.T = D.nD + D.DM, Y.T = Y.nD + Y.DM) head(DMepi) @ % With the correct age and period coding in the Lexis triangles, we fit models for the mortalities and incidences. Note that we for comparative purposes also fit a model for the \emph{total} mortality, ignoring the <<>>= # Knots used in all models (a.kn <- seq(40, 95, , 6)) (p.kn <- seq(1997, 2015, , 4)) (c.kn <- seq(1910, 1976, , 6)) # Check the number of events between knots ae <- xtabs(cbind(D.nD, D.DM, X) ~ cut(A, c(30, a.kn, Inf)) + sex, data=DMepi) ftable(addmargins(ae, 1), col.vars=3:2) pe <- xtabs(cbind(D.nD, D.DM, X) ~ cut(P, c(1990, p.kn, Inf)) + sex, data=DMepi) ftable(addmargins(pe, 1), col.vars=3:2) ce <- xtabs(cbind(D.nD, D.DM, X) ~ cut(P-A, c(-Inf, c.kn, Inf)) + sex, data=DMepi) ftable(addmargins(ce, 1), col.vars=3:2) # Fit an APC-model for all transitions, separately for men and women mW.m <- glm(cbind(D.nD, Y.nD) ~ -1 + Ns( A, knots=a.kn, int=TRUE) + Ns(P , knots=p.kn, ref=2005) + Ns(P - A, knots=c.kn, ref=1950), family = poisreg, data = subset(DMepi, sex=="M")) mD.m <- update(mW.m, cbind(D.DM, Y.DM) ~ .) mT.m <- update(mW.m, cbind(D.T , Y.T ) ~ .) lW.m <- update(mW.m, cbind(X , Y.nD) ~ .) # Model for women mW.f <- update(mW.m, data = subset(DMepi, sex == "F")) mD.f <- update(mD.m, data = subset(DMepi, sex == "F")) mT.f <- update(mT.m, data = subset(DMepi, sex == "F")) lW.f <- update(lW.m, data = subset(DMepi, sex == "F")) @ % \section{Residual life time and years lost to DM} We now collect the estimated years of life lost classified by method (immunity assumption or not), sex, age and calendar time: <<>>= a.ref <- 30:90 p.ref <- 1996:2016 aYLL <- NArray(list(type = c("Imm", "Tot", "Sus"), sex = levels(DMepi$sex), age = a.ref, date = p.ref)) str(aYLL) system.time( for(ip in p.ref) { nd <- data.frame(A = seq(30, 90, 0.2)+0.1, P = ip, Y.nD = 1, Y.DM = 1, Y.T = 1) muW.m <- ci.pred(mW.m, nd)[, 1] muD.m <- ci.pred(mD.m, nd)[, 1] muT.m <- ci.pred(mT.m, nd)[, 1] lam.m <- ci.pred(lW.m, nd)[, 1] muW.f <- ci.pred(mW.f, nd)[, 1] muD.f <- ci.pred(mD.f, nd)[, 1] muT.f <- ci.pred(mT.f, nd)[, 1] lam.f <- ci.pred(lW.f, nd)[, 1] aYLL["Imm", "M", , paste(ip)] <- yll(int=0.2, muW.m, muD.m, lam=NULL, A=a.ref, age.in=30, note=FALSE)[-1] aYLL["Imm", "F", , paste(ip)] <- yll(int=0.2, muW.f, muD.f, lam=NULL, A=a.ref, age.in=30, note=FALSE)[-1] aYLL["Tot", "M", , paste(ip)] <- yll(int=0.2, muT.m, muD.m, lam=NULL, A=a.ref, age.in=30, note=FALSE)[-1] aYLL["Tot", "F", , paste(ip)] <- yll(int=0.2, muT.f, muD.f, lam=NULL, A=a.ref, age.in=30, note=FALSE)[-1] aYLL["Sus", "M", , paste(ip)] <- yll(int=0.2, muW.m, muD.m, lam=lam.m, A=a.ref, age.in=30, note=FALSE)[-1] aYLL["Sus", "F", , paste(ip)] <- yll(int=0.2, muW.f, muD.f, lam=lam.f, A=a.ref, age.in=30, note=FALSE)[-1] }) round(ftable(aYLL[, , seq(1, 61, 10), ], col.vars=c(3, 2)), 1) @ % We now have the relevant points for the graph showing YLL to diabetes for men and women by age, and calendar year, both under the immunity and susceptibility models for the calculation of YLL. <>= plyll <- function(wh, xtxt){ par(mfrow = c(1, 2), mar = c(3, 3, 1, 1), mgp = c(3, 1, 0) / 1.6, bty = "n", las = 1) matplot(a.ref, aYLL[wh, "M", , ], type="l", lty=1, col="blue", lwd=1:2, ylim=c(0, 12), xlab="Age", ylab=paste0("Years lost to DM", xtxt), yaxs="i") abline(v=50, h=1:11, col=gray(0.7)) text(90, 11.5, "Men", col="blue", adj=1) text(40, aYLL[wh, "M", "40", "1996"], "1996", adj=c(0, 0), col="blue") text(43, aYLL[wh, "M", "44", "2016"], "2016", adj=c(1, 1), col="blue") matplot(a.ref, aYLL[wh, "F", , ], type="l", lty=1, col="red", lwd=1:2, ylim=c(0, 12), xlab="Age", ylab=paste0("Years lost to DM", xtxt), yaxs="i") abline(v=50, h=1:11, col=gray(0.7)) text(90, 11.5, "Women", col="red", adj=1) text(40, aYLL[wh, "F", "40", "1996"], "1996", adj=c(0, 0), col="red") text(43, aYLL[wh, "F", "44", "2016"], "2016", adj=c(1, 1), col="red") } plyll("Imm", " - immunity assumption") @ % <>= plyll("Tot", " - total mortality refernce") @ % <>= plyll("Sus", " - susceptibility assumed") @ % \begin{figure}[h] \centering \includegraphics[width=\textwidth]{yll-imm} \caption{Years of life lost to DM: the difference in expected residual life time at different ages between persons with and without diabetes, assuming the persons without diabetes at a given age remain free from diabetes (immunity assumption --- not reasonable). The lines refer to date of evaluation; the top lines refer to 1996-1-1 the bottom ones to 2016-1-1. Blue curves are men, red women.} \label{fig:imm} \end{figure} \begin{figure}[h] \centering \includegraphics[width=\textwidth]{yll-sus} \caption{Years of life lost to DM: the difference in expected residual life time at different ages between persons with and without diabetes, allowing the persons without diabetes at a given age to contract diabetes and thus be subject to higher mortality. The lines refer to date of evaluation; the top lines refer to 1996-1-1 the bottom ones to 2016-1-1. Blue curves are men, red women.} \label{fig:sus} \end{figure} \begin{figure}[h] \centering \includegraphics[width=\textwidth]{yll-tot} \caption{Years of life lost to DM: the difference in expected residual life time at different ages between persons with and without diabetes. Allowance for susceptibility is approximated by using the total population mortality instead of non-DM mortality. The lines refer to date of evaluation; the top lines refer to 1996-1-1 the bottom ones to 2016-1-1. Blue curves are men, red women.} \label{fig:tot} \end{figure} From figure \ref{fig:sus} we see that for men aged 50 the years lost to diabetes has decreased from 6.5 to 4.5 years and for women from 4 to 4 years; so a greater improvement for women. <>= ende <- Sys.time() cat(" Start time:", format(anfang, "%F, %T"), "\n") cat(" End time:", format( ende, "%F, %T"), "\n") cat("Elapsed time:", round(difftime(ende, anfang, units = "mins"), 2), "minutes\n") @ % \bibliographystyle{plain} \begin{thebibliography}{1} \bibitem{Carstensen.2007a} B~Carstensen. \newblock Age-{P}eriod-{C}ohort models for the {L}exis diagram. \newblock {\em Statistics in Medicine}, 26(15):3018--3045, 2007. \bibitem{Carstensen.2008c} B~Carstensen, JK~Kristensen, P~Ottosen, and K~Borch-Johnsen. \newblock The {D}anish {N}ational {D}iabetes {R}egister: {T}rends in incidence, prevalence and mortality. \newblock {\em Diabetologia}, 51:2187--2196, 2008. \end{thebibliography} \addcontentsline{toc}{chapter}{References} \end{document}