Basic Assumptions and Equations
(a.) Mortality Data Base: historical age-specific cancer rates.
U.S. cancer mortality numbers
and populations recorded 1900-2006 by the U.S. Census Bureau (1900-1935) and
the U.S. Public Health Service (1936-2006) have been matched and organized with
regard to gender, ethnicity, calendar interval of birth, “h” (ten years: 1800-09, 1810-19, …), calendar year interval
of death, “y” (five years: 1900-04, 1905-09, …), and age at death interval, “t” (five years: 0-4, …, 100-104) (Supporting
Information Table S1). These data allow computation of raw age-specific lifetime
mortality rates, OBS(h,t), as the number of deaths by the observed cause
divided by the number of persons alive at the beginning of the one-year
interval “t”. Thus OBS(h,t) is an
approximation of the conditional probability that a person would have died of
the observed cause given that he or she was still alive. However, cancer models
predict incidence rates, INC(h,t), as a calculated approximation, CAL(h,t), of
conditional rates of deaths absent covariant factors such as competing forms of
death or the effect of medical intervention in the age/time interval observed. Transforming
observed raw mortality rates, OBS(h,t), to estimates of incidence rates,
INC(h,t), requires correction for several sources of bias. In extreme old age
(t = 100-104) death rates from all causes approach ~0.3 per year and must have
reduced the number of deaths recorded for the observed cause. Correction for this
bias consists of determining the total raw mortality rate for each five-year age
interval, TOT(h,t), and defining the coincidence-corrected mortality rate at the
third or middle year, OBS*(h,t), as OBS(h,t)/[1 - TOT(h,t) + OBS(h,t)]. Accounting for
historically improving five-year survival rates, SUR(h,t), is also required for
some cancers such as colorectal cancers. The expected incidence
rate, INC(h,t), adjusted for these considerations is:
$$\mathrm{INC}(h,t) = \frac{\mathrm{OBS}(h,t)}{[1-\mathrm{SUR}(h,t)]\,[1-\mathrm{TOT}(h,t)+\mathrm{OBS}(h,t)]} \qquad \text{(Equation 1)}$$
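The correction in Equation 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `incidence_rate` and the numerical inputs are hypothetical, and the values are not taken from Table S1.

```python
def incidence_rate(obs, tot, sur=0.0):
    """Equation 1: INC = OBS / ([1 - SUR] * [1 - TOT + OBS]).

    obs: raw mortality rate for the observed cause, OBS(h,t)
    tot: total raw mortality rate from all causes, TOT(h,t)
    sur: five-year survival fraction, SUR(h,t); 0 means no survivors
    """
    if not 0.0 <= sur < 1.0:
        raise ValueError("sur must lie in [0, 1)")
    # Fraction of persons not removed by coincident deaths from other causes
    coincidence = 1.0 - tot + obs
    if coincidence <= 0.0:
        raise ValueError("1 - tot + obs must be positive")
    return obs / ((1.0 - sur) * coincidence)


# Hypothetical inputs for illustration only (not data from Table S1)
print(incidence_rate(obs=0.002, tot=0.30, sur=0.40))
```

With `sur=0` the function returns the coincidence-corrected rate OBS*(h,t) alone, so the two corrections can be inspected separately.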
Diagnostic errors at death
may also be expected and these would vary among cancer types, age at death,
historical year of reporting etc. so that INC(h,t) as defined here is an
approximation and its uncertainties must be considered in comparing predictions
of models, CAL(h,t), to incidence represented by INC(h,t).
(b.) Algebraic elements of the two-stage model.
Limitation of initiation mutations to the fetal/juvenile stem cell doublings.
Growth of normal fetal/juvenile stem cells is here modeled as a series of “a” net
binomial doublings (a = 0, 1, 2, …, amax) in which “n” required initiation
mutations, i, j, …, n, occur in any order at constant mutation rates
$R_i, R_j, \ldots, R_n$ per doubling. The number of newly
initiated stem cells in doubling period “a” is $(\prod_n R_i)\, a^{n-1}\, 2^{a}$. In the
fetal/juvenile model, organogenic stem cells are posited to reach maturity,
represented by “amax” doublings with high constant mutation rates, and to undergo
metamorphosis to maintenance stem cells with no additional net cell growth and
much lower mutation rates.
Assuming each of the ~$10^{7}$
adult colonic crypts to be represented at juvenile/adult metamorphosis by a
single metakaryotic stem cell, the number of net doublings at maturity, amax,
is about 23.25, i.e., $10^{7} \approx 2^{23.25}$. The
metakaryotic mutator/hypermutable stem cell lineage of human organ anlagen
appears to begin in gestational week 4-5 with creation of two metakaryotic stem
cells from symmetrical amitosis of a single precursor embryonic mitotic stem
cell at a = 0. At birth, a colon contains ~$2^{20}$ colonic crypts,
each containing a basal metakaryotic stem cell; thus at birth, a ~ 20, and at
maturity, a = amax ~ 23.25.
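The doubling counts quoted above follow directly from the crypt numbers. A short check, using only the crypt counts stated in the text (the variable names are illustrative):

```python
import math

crypts_adult = 1e7          # ~10^7 adult colonic crypts, as stated in the text
crypts_birth = 2 ** 20      # ~2^20 crypts at birth, as stated in the text

# Net doublings needed to grow from one stem cell to the adult crypt count
a_max = math.log2(crypts_adult)
a_birth = math.log2(crypts_birth)

print(round(a_max, 2), a_birth)
```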
Promotion mutations during preneoplastic stem cell doublings.
After initiation in any fetal/juvenile doubling “a”, growth of
preneoplastic stem cells as a colony is modeled as a series of “g - a” binomial
doublings (g - a = 0, 1, 2, …) in which “m” required promotion mutations
(A, B, …, m) occur at constant mutation rates $R_A, R_B, \ldots, R_m$ per
doubling. The expected number of newly promoted stem cells in
preneoplastic doubling period “g - a”
is $(\prod_m R_A)\,(g-a)^{m-1}\,2^{g-a}$. Under these assumptions the number of organogenic
doublings “a” at initiation and
the number of preneoplastic doublings “g - a” after initiation sum to “g”,
which is a very useful continuous variable because it describes the age of
humans in terms of continuous stem cell doublings through fetal/juvenile and
then preneoplastic growth. In each organogenic doubling interval “a” new
preneoplastic colonies are created (initiated), and these colonies grow until
promotion and subsequent death remove them. The
extinction of preneoplastic colonies at “a” and at “g - a” is driven
by the supra-exponential term $\exp[-(\prod_m R_A)\,(g-a)^{m-1}\,2^{g-a}]$.
If all persons have the same
numbers and rates of “n” required
initiation and “m” required
promotion oncomutations and all initiated cells grow at the same average rate
as preneoplastic stem cells (homogeneous risk) without any synchronously
competitive forms of mortality, the expected number of promotional events at the
binomial doubling age interval “g”, V(g), may be represented as:
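$$V(g) = \Bigl(\prod_n R_i\Bigr) \sum_{a=0}^{a_{\max}} a^{n-1}\, 2^{a}\; \frac{d\Bigl(1-\exp\bigl[-(\prod_m R_A)\,(g-a)^{m-1}\,2^{g-a}\bigr]\Bigr)}{d(g-a)} \qquad \text{(Equation 2)}$$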
This process is illustrated
in Figure 2, in which the contribution to promotion at age “g” from initiation
at each organogenic doubling “a” is shown to rise and fall with “g - a”. The sum
of these terms from initiations in all organogenic doubling intervals “a”
approximates well the observed lifetime incidence rate of many cancer types,
including colorectal cancer: it increases sub-exponentially, reaches a maximum
in old age and declines appreciably in extreme old age. The earliest
initiations of fetal organogenesis drive the tumor incidence rate of juveniles
and young adults; the initiations of adolescent organogenesis drive the tumor
incidence rate in extreme old age.
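The rise and fall of each term, and their sum V(g), can be sketched numerically. This is a minimal sketch under stated assumptions rather than the authors' implementation: the exponent in Equation 2 is taken to be $h(x) = (\prod_m R_A)\,x^{m-1}2^{x}$ with $x = g-a$, the derivative is evaluated analytically as $h'(x)e^{-h(x)}$, colonies contribute nothing before initiation ($x \le 0$), and the sum runs over integer doublings a = 0 … floor(amax). The names `promotion_rate_term`, `V`, `prod_Ri` and `prod_RA` and all rate values are hypothetical.

```python
import math


def promotion_rate_term(x, m, prod_RA):
    """d/dx of 1 - exp(-h(x)) with h(x) = prod_RA * x**(m-1) * 2**x, x = g - a."""
    if x <= 0:
        return 0.0  # assumption: no promotion before the colony exists
    h = prod_RA * x ** (m - 1) * 2.0 ** x
    if h > 700.0:
        return 0.0  # exp(-h) underflows; the colony is effectively extinct
    dh = prod_RA * 2.0 ** x * ((m - 1) * x ** (m - 2) + math.log(2.0) * x ** (m - 1))
    return dh * math.exp(-h)


def V(g, n, m, prod_Ri, prod_RA, a_max=23.25):
    """Sum of Equation 2 over integer organogenic doublings a = 0..floor(a_max)."""
    total = 0.0
    for a in range(int(math.floor(a_max)) + 1):
        total += a ** (n - 1) * 2.0 ** a * promotion_rate_term(g - a, m, prod_RA)
    return prod_Ri * total


# Hypothetical parameters for illustration only
for x in (10, 20, 30):
    print(x, promotion_rate_term(x, m=2, prod_RA=1e-8))
print(V(40.0, n=2, m=2, prod_Ri=1e-8, prod_RA=1e-8))
```

With these illustrative rates a single term peaks near x = 20 doublings and is negligible by x = 30, which is the rise-and-fall behavior described for each organogenic doubling “a”.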
Under these conditions the expected number of newly promoted lesions through the
end of any doubling interval “g”, CAL(g), is:
$$\mathrm{CAL}(g) = 1 - e^{-V(g)} \qquad \text{(Equation 3)}$$
Age of death, “t”, and doubling age of promotion, “g”.
Cancer mortality data corrected
for coincident deaths within the year of death, OBS*(h,t), and its derived
estimate of incidence, INC(h,t), are calculated in five-year age-of-death intervals
(5-9, …, 100-104 years) such that deaths in any age interval are plotted at the
mid-interval. CAL(h, g) is, however, approximated as the instantaneous rate of
promotion at the end of each stem
cell doubling interval “g”.
To account for the difference
between age at promotion and death we adopt Armitage and Doll’s estimate of
2.5 yr. Death at age t = 72.5 is thus attributed to promotion at age t = 70.
The relationship between
human age at death in years, “t”,
and stem cell doubling age at promotion, “g”, is then defined if there is a constant average
preneoplastic stem cell annual doubling rate, “μ”. Given the age of maturity for males as 16.5 yr at g = amax:
$$g = \mu\,(t - 16.5 - 2.5) + a_{\max} = \mu\,(t - 19) + a_{\max} \qquad \text{(Equation 5)}$$
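The conversion from age at death to doubling age at promotion is a one-line linear map. The sketch below assumes ages at or beyond 19 yr (maturity plus the 2.5 yr lag); the function name `doubling_age` and the doubling rate used in the example are hypothetical, not values from the paper.

```python
def doubling_age(t_death, mu, a_max=23.25, maturity_age=16.5, lag=2.5):
    """Equation 5: g = mu * (t - 16.5 - 2.5) + a_max.

    t_death: age at death in years
    mu: average preneoplastic stem cell doublings per year (assumed constant)
    """
    return mu * (t_death - maturity_age - lag) + a_max


# Hypothetical doubling rate for illustration only
print(doubling_age(72.5, mu=0.2))
```

At t = 19 yr the function returns amax for any value of mu, which matches the definition of maturity at g = amax once the 2.5 yr lag is removed.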
Stratification of risks in the population.
We represent the fraction of
the population in whom all of the potential conditions necessary for cancer
death are present as “F”. The corresponding fraction in which any necessary
condition is absent is represented as (1-F). Stratification need
not, however, be an “all or none” phenomenon. Stratification with regard to
mutation rates in fetal/juvenile expansion has been noted for both
mitochondrial and nuclear genes. The use of “F” in this report
serves as a first approximation to stratification from any underlying cause.
Equation 4 rewritten to account for stratification in this way creates the
model:
$$\mathrm{CAL}(g) = \frac{F\,\bigl(1-e^{-V(g)}\bigr)}{F + (1-F)\,\exp\!\Bigl(\int_{0}^{g} V(g')\,dg'\Bigr)} \qquad \text{(Equation 6)}$$
Competing synchronous forms of mortality.
Epidemiological observations
have also demonstrated that forms of cancer may share environmental or
inherited risk factors with one another, e.g. breast and ovarian cancers, in which
the death rates increase synchronously with age. The term “f” has been
introduced to represent the fraction of persons that die of the observed cause
among the set of mortal diseases with shared risks and synchronous changes in
death rates. Equation 6 rewritten to account for both
stratification and a hypothetical synchronous competing form of mortality
sharing risk factors with the observed disease creates the model:
$$\mathrm{CAL}(g) = \frac{F\,\bigl(1-e^{-V(g)}\bigr)}{F + (1-F)\,\exp\!\Bigl(\int_{0}^{g} \tfrac{1}{f}\,V(g')\,dg'\Bigr)} \qquad \text{(Equation 7)}$$
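The stratified model can be evaluated on a grid of doubling ages once V(g) is tabulated. The sketch below is an illustration under stated assumptions, not the authors' procedure: it reads the model as CAL(g) = F(1 - exp(-V(g))) / [F + (1 - F) exp((1/f) ∫ V dg)] with the integral running from 0 to g, approximates that integral with the trapezoid rule, and expects the grid to start at g = 0. The function name `cal_stratified` and the input values are hypothetical. Setting f = 1 gives the stratification-only form (Equation 6), and F = 1 recovers CAL(g) = 1 - exp(-V(g)).

```python
import math


def cal_stratified(g_grid, v_values, F, f=1.0):
    """Evaluate the stratified model at each grid point.

    g_grid: increasing doubling ages, assumed to start at 0
    v_values: V(g) tabulated at the same points
    F: fraction of the population at risk
    f: fraction dying of the observed cause among synchronous competing causes
    """
    cal = []
    integral = 0.0  # running value of (1/f) * integral of V from 0 to g
    for k, (g, v) in enumerate(zip(g_grid, v_values)):
        if k > 0:
            # Trapezoid rule over the interval ending at this grid point
            integral += 0.5 * (v + v_values[k - 1]) * (g - g_grid[k - 1]) / f
        numerator = F * (1.0 - math.exp(-v))
        denominator = F + (1.0 - F) * math.exp(integral)
        cal.append(numerator / denominator)
    return cal


# Hypothetical tabulated values for illustration only
g_grid = [0.0, 1.0, 2.0]
v_values = [0.1, 0.1, 0.1]
print(cal_stratified(g_grid, v_values, F=0.5))
print(cal_stratified(g_grid, v_values, F=0.5, f=0.5))
```

The exponential in the denominator grows quickly with the cumulative integral, so for very large integrals `math.exp` can overflow; a production version would need to rescale the expression.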