Basic Assumptions and Equations
(a.) Mortality Data Base: historical age-specific cancer rates.
U.S. cancer mortality numbers and populations recorded 1900-2006 by the U.S. Census Bureau (1900-1935) and the U.S. Public Health Service (1936-2006) have been matched and organized with regard to gender, ethnicity, calendar interval of birth, "h" (ten years: 1800-09, 1810-19, ...), calendar year interval of death, "y" (five years: 1900-04, 1905-09, ...), and age at death interval, "t" (five years: 0-4, ..., 100-104) (Supporting Information Table S1). These data allow computation of raw age-specific lifetime mortality rates, OBS(h,t), as the number of deaths by the observed cause divided by the number of persons alive at the beginning of the one-year interval "t". Thus OBS(h,t) is an approximation of the conditional probability that a person would have died of the observed cause given that he or she was still alive. However, cancer models predict incidence rates, INC(h,t), as a calculated approximation, CAL(h,t), of conditional rates of deaths absent covariant factors such as competing forms of death or the effect of medical intervention in the age/time interval observed. Transforming observed raw mortality rates, OBS(h,t), to estimates of incidence rates, INC(h,t), requires correction for several sources of bias. In extreme old age (t = 100-104) death rates approach ~0.3 per year and must have reduced the number of deaths by the observed cause. Correction for this bias consists of determining the total raw mortality rate for each five-year age interval, TOT(h,t), and defining the coincidence-corrected mortality rate at the third or middle year, OBS*(h,t), as OBS(h,t) / [1 - TOT(h,t) + OBS(h,t)]. Accounting for historically improving five-year survival rates, SUR(h,t), is also required for some cancers such as colorectal cancer. The expected incidence rate, INC(h,t), adjusted for these considerations is:
INC(h,t) = OBS(h,t) / ( [1 - SUR(h,t)] [1 - TOT(h,t) + OBS(h,t)] )    Equation 1
Diagnostic errors at death may also be expected and these would vary among cancer types, age at death, historical year of reporting etc. so that INC(h,t) as defined here is an approximation and its uncertainties must be considered in comparing predictions of models, CAL(h,t), to incidence represented by INC(h,t).
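The correction in Equation 1 can be sketched in a few lines of Python. This is a minimal illustration only: the function name `incidence_rate` and the numerical inputs are hypothetical and are not taken from the mortality data base.

```python
def incidence_rate(obs, tot, sur):
    """Equation 1: INC = OBS / ((1 - SUR) * (1 - TOT + OBS)).

    obs: raw mortality rate for the observed cause, OBS(h,t)
    tot: total raw mortality rate from all causes, TOT(h,t)
    sur: five-year survival fraction, SUR(h,t), in [0, 1)
    """
    if not 0.0 <= sur < 1.0:
        raise ValueError("sur must lie in [0, 1)")
    coincidence = 1.0 - tot + obs  # denominator of the OBS* correction
    if coincidence <= 0.0:
        raise ValueError("1 - tot + obs must be positive")
    return obs / ((1.0 - sur) * coincidence)


# Hypothetical inputs chosen only to exercise the formula.
example = incidence_rate(obs=0.002, tot=0.05, sur=0.4)
```

With SUR = 0 and TOT = OBS the function returns OBS unchanged, which is a quick check that both corrections act only when survival or competing mortality is present.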
(b.) Algebraic elements of the two-stage model.
Limitation of initiation mutations to the fetal/juvenile stem cell doublings.
Growth of normal fetal/juvenile stem cells is here modeled as a series of "a" net binomial doublings (a = 0, 1, 2, ..., $a_{max}$) in which "n" required initiation mutations, i, j, ..., n, occur in any order at constant mutation rates $R_i, R_j, \ldots, R_n$ per doubling. The number of newly initiated stem cells in doubling period "a" is $(\prod_n R_i)\, a^{(n-1)}\, 2^{a}$. In the fetal/juvenile model, organogenic stem cells are posited to reach maturity, represented by "$a_{max}$" doublings with high constant mutation rates, and then to undergo metamorphosis to maintenance stem cells with no additional net cell growth and much lower mutation rates.
Assuming each of the ~$10^7$ adult colonic crypts to be represented at juvenile/adult metamorphosis by a single metakaryotic stem cell, the number of net doublings at maturity, $a_{max}$, is about 23.25, i.e., $10^7 \approx 2^{23.25}$. The metakaryotic mutator/hypermutable stem cell lineage of human organ anlagen appears to begin in gestational week 4-5 with creation of two metakaryotic stem cells from symmetrical amitosis of a single precursor embryonic mitotic stem cell at a = 0. At birth, a colon contains ~$2^{20}$ colonic crypts, each containing a basal metakaryotic stem cell; thus at birth, a ~ 20, and at maturity, a = $a_{max}$ ~ 23.25.
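The doubling arithmetic above can be checked directly. The variable names in this short sketch are illustrative; the crypt counts are the approximate figures quoted in the text.

```python
import math

crypts_adult = 1e7        # approximate number of adult colonic crypts
crypts_birth = 2 ** 20    # approximate number of crypts at birth

# Net doublings needed to reach each crypt count from a single stem cell.
a_max = math.log2(crypts_adult)     # close to 23.25
a_birth = math.log2(crypts_birth)   # exactly 20

# Doublings remaining between birth and maturity under these figures.
postnatal_doublings = a_max - a_birth
```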
Promotion mutations during preneoplastic stem cell doublings.
After initiation in any fetal/juvenile doubling "a", growth of preneoplastic stem cells as a colony is modeled as a series of "g - a" binomial doublings (g - a = 0, 1, 2, ...) in which "m" required promotion mutations, A, B, ..., m, occur at constant mutation rates $R_A, R_B, \ldots, R_m$ per doubling. The expected number of newly initiated stem cells in preneoplastic doubling period "g - a" is $(\prod_m R_A)\, (g-a)^{(m-1)}\, 2^{(g-a)}$. Under these assumptions the number of organogenic doublings "a" at initiation and the number of preneoplastic doublings "g - a" after initiation sum to "g", which is a very useful continuous variable because it describes the age of humans in terms of continuous stem cell doublings through fetal/juvenile and then preneoplastic growth. In each organogenic doubling interval "a" new preneoplastic colonies are created (initiated) and these colonies grow until promotion and subsequent death remove them. The extinction of preneoplastic colonies at "a" and at "g - a" is driven by the supra-exponential term $\exp[-\prod_m R_A\, (g-a)^{(m-1)}\, 2^{(g-a)}]$.
If all persons have the same numbers and rates of "n" required initiation and "m" required promotion oncomutations, and all initiated cells grow at the same average rate as preneoplastic stem cells (homogeneous risk) without any synchronously competitive forms of mortality, the expected number of promotional events at the binomial doubling age interval "g", V(g), may be represented as:
$V(g) = \prod_n R_i \sum_{a=0}^{a_{max}} a^{(n-1)}\, 2^{a}\, \frac{d\left(1 - \exp\left[-\prod_m R_A\, (g-a)^{(m-1)}\, 2^{(g-a)}\right]\right)}{d(g-a)}$    Equation 2
This process is illustrated in Figure 2, in which the contribution to promotion at age "g" from initiation at each organogenic doubling "a" is shown to rise and fall with "g - a". The sum of these terms from initiations in all organogenic doubling intervals "a" approximates well the observed lifetime incidence rate of many cancer types, including colorectal cancer: it increases sub-exponentially, reaches a maximum in old age and declines appreciably in extreme old age. The earliest initiations of fetal organogenesis drive the tumor incidence rate of juveniles and young adults; the initiations of adolescent organogenesis drive the tumor incidence rate in extreme old age.
Under these conditions the expected number of newly promoted lesions through the end of any doubling interval "g", CAL(g), is:
$\mathrm{CAL}(g) = 1 - e^{-V(g)}$    Equation 3
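Equations 2 and 3 can be evaluated numerically with a short Python sketch. It assumes that the sum runs over integer doublings a = 0, 1, ..., floor($a_{max}$), that a colony contributes nothing before its own initiation (g < a), and that the products of initiation and promotion rates are supplied as single numbers; the function names and all parameter values are hypothetical and are not fitted to any data.

```python
import math


def promotion_rate(x, m, prod_prom):
    """d/dx of 1 - exp(-h(x)), where h(x) = prod_prom * x**(m-1) * 2**x.

    x is the number of preneoplastic doublings, g - a.
    """
    if x < 0:
        return 0.0  # assumption: no promotion before initiation
    h = prod_prom * x ** (m - 1) * 2.0 ** x
    slope = x ** (m - 1) * math.log(2.0)
    if m > 1:
        slope += (m - 1) * x ** (m - 2)
    dh = prod_prom * 2.0 ** x * slope  # h'(x) by the product rule
    return dh * math.exp(-h)


def v_of_g(g, n, m, prod_init, prod_prom, a_max):
    """Equation 2, summed over integer organogenic doublings a."""
    total = 0.0
    for a in range(int(a_max) + 1):
        total += a ** (n - 1) * 2.0 ** a * promotion_rate(g - a, m, prod_prom)
    return prod_init * total


def cal_of_g(g, n, m, prod_init, prod_prom, a_max):
    """Equation 3: CAL(g) = 1 - exp(-V(g))."""
    return -math.expm1(-v_of_g(g, n, m, prod_init, prod_prom, a_max))


# Hypothetical parameters, for illustration only.
params = dict(n=2, m=2, prod_init=1e-10, prod_prom=1e-6, a_max=23.25)
curve = [(g, cal_of_g(float(g), **params)) for g in range(24, 60, 4)]
```

Each term of the sum rises while h(x) is small and collapses once the exponential dominates, which is the rise and fall with "g - a" described for the individual contributions.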
Age of death, "t", and doubling age of promotion, "g".
Cancer mortality data corrected for coincident deaths within the year of death, OBS*(h,t), and its derived estimate of incidence, INC(h,t), are calculated in five-year age-of-death intervals 5-9, ..., 100-104 years such that deaths in any age interval are plotted at the mid interval. CAL(h,g) is, however, approximated as the instantaneous rate of promotion at the end of each stem cell doubling interval "g".
To account for the difference between age at promotion and death we adopt Armitage and Doll's estimate of 2.5 yr. Death at age t = 72.5 is thus attributed to promotion at age t = 70.
The relationship between human age at death in years, "t", and stem cell doubling age at promotion, "g", is then defined if there is a constant average preneoplastic stem cell annual doubling rate, "μ". Given the age of maturity for males as 16.5 yr at g = $a_{max}$:
$g = \mu\,(t - 16.5 - 2.5) + a_{max} = \mu\,(t - 19) + a_{max}$    Equation 5
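Equation 5 is a linear change of variable and can be written as a one-line function. In this sketch the function name and the value of μ in the usage line are hypothetical; the defaults for maturity age, the promotion-to-death lag and $a_{max}$ are the values given in the text.

```python
def doubling_age(t, mu, a_max=23.25, maturity_age=16.5, lag=2.5):
    """Equation 5: g = mu * (t - maturity_age - lag) + a_max.

    t:   age at death in years
    mu:  preneoplastic stem cell doublings per year (assumed constant)
    lag: years between promotion and death (Armitage and Doll's 2.5 yr)
    """
    return mu * (t - maturity_age - lag) + a_max


# Death at 72.5 yr is attributed to promotion at 70 yr, i.e. 53.5 yr
# of preneoplastic growth after maturity at 16.5 yr.
g_example = doubling_age(72.5, mu=0.2)  # mu = 0.2 is a hypothetical value
```

At t = 19 the function returns $a_{max}$ for any μ, corresponding to promotion at the age of maturity.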
Stratification of risks in the population.
We represent the fraction of the population in whom all of the potential conditions necessary for cancer death are present as "F". The corresponding fraction in which any necessary condition is absent is represented as (1-F). Stratification need not, however, be an "all or none" phenomenon. Stratification with regard to mutation rates in fetal/juvenile expansion has been noted for both mitochondrial and nuclear genes. The use of "F" in this present report serves as a first approximation of stratification from any underlying cause. Equation 4 rewritten to account for stratification in this way creates the model:
$\mathrm{CAL}(g) = \frac{F\,(1 - e^{-V(g)})}{F + (1-F)\, e^{\int V(g)\, dg}}$, with the integral evaluated from g = 0 to g.    Equation 6
Competing synchronous forms of mortality.
Epidemiological observations have also demonstrated that forms of cancer may share environmental or inherited risk factors with one another, e.g. breast and ovarian cancers, in which the death rates increase synchronously with age. The term "f" has been introduced to represent the fraction of persons that die of the observed cause among the set of mortal diseases with shared risks and synchronous changes in death rates. Equation 6 rewritten to account for both stratification and a hypothetical synchronous competing form of mortality that shares risk factors with the observed disease creates the model:
$\mathrm{CAL}(g) = \frac{F\,(1 - e^{-V(g)})}{F + (1-F)\, e^{\int \frac{1}{f} V(g)\, dg}}$, with the integral evaluated from g = 0 to g.    Equation 7
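Equations 6 and 7 can be evaluated together, since setting f = 1 in Equation 7 recovers Equation 6. The sketch below assumes V(g) is available as a Python callable and approximates the integral from 0 to g with the trapezoidal rule; the function name, the step count and the constant V used in the usage lines are hypothetical choices for illustration.

```python
import math


def stratified_cal(v, g, F, f=1.0, steps=2000):
    """Equation 7 (Equation 6 when f = 1).

    v: callable returning V(g)
    F: fraction of the population at risk
    f: fraction dying of the observed cause among synchronous
       competing diseases with shared risk factors
    """
    if g > 0:
        dg = g / steps
        # Trapezoidal approximation of the integral of V from 0 to g.
        inner = sum(v(k * dg) for k in range(1, steps))
        integral = dg * (0.5 * (v(0.0) + v(g)) + inner)
    else:
        integral = 0.0
    numerator = F * -math.expm1(-v(g))
    denominator = F + (1.0 - F) * math.exp(integral / f)
    return numerator / denominator


# Hypothetical constant V(g), used only to exercise the formula.
toy_v = lambda g: 0.1
unstratified = stratified_cal(toy_v, 5.0, F=1.0)         # equals 1 - exp(-V)
stratified = stratified_cal(toy_v, 5.0, F=0.5)           # Equation 6
with_competition = stratified_cal(toy_v, 5.0, F=0.5, f=0.5)  # Equation 7
```

With F = 1 the denominator is 1 and the expression reduces to Equation 3; lowering either F or f reduces CAL(g), because the exponential term in the denominator grows as the at-risk fraction is depleted.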