Blood Journal
Leading the way in experimental and clinical research in hematology

Geriatric assessment predicts survival for older adults receiving induction chemotherapy for acute myelogenous leukemia

Heidi D. Klepin,1 Ann M. Geiger,2 Janet A. Tooze,2 Stephen B. Kritchevsky,3 Jeff D. Williamson,3 Timothy S. Pardee,1 Leslie R. Ellis,1 and Bayard L. Powell1

1Comprehensive Cancer Center of Wake Forest University, Winston-Salem, NC; 2Division of Public Health Sciences, and 3Section on Gerontology and Geriatric Medicine, Wake Forest School of Medicine, Winston-Salem, NC

Presented in part at the annual meeting of the American Society of Hematology, 2010.

Key Points

  • Geriatric assessment, with a focus on cognitive and physical function, improves prediction of survival among older adults treated for AML.

  • Use of geriatric assessment may inform trial design and interventions to improve outcomes for older adults with AML.


Abstract

We investigated the predictive value of geriatric assessment (GA) on overall survival (OS) for older adults with acute myelogenous leukemia (AML). Consecutive patients ≥ 60 years with newly diagnosed AML and planned intensive chemotherapy were enrolled at a single institution. Pretreatment GA included evaluation of cognition, depression, distress, physical function (PF) (self-reported and objectively measured), and comorbidity. Objective PF was assessed using the Short Physical Performance Battery (SPPB; timed 4-m walk, chair stands, standing balance) and grip strength. Cox proportional hazards models were fit for each GA measure as a predictor of OS. Among 74 patients, the mean age was 70 years, and 78.4% had an Eastern Cooperative Oncology Group (ECOG) score ≤ 1. OS was significantly shorter for participants who screened positive for impairment in cognition and objectively measured PF. Adjusting for age, gender, ECOG score, cytogenetic risk group, myelodysplastic syndrome, and hemoglobin, impaired cognition (Modified Mini-Mental State Exam < 77) and impaired objective PF (SPPB < 9) were associated with worse OS. GA methods, with a focus on cognitive function and PF, improve risk stratification and may inform interventions to improve outcomes for older AML patients.


Introduction

Acute myelogenous leukemia (AML) is largely a disease of older adults, characterized by disproportionately worse survival associated with increasing age.1-3 Although selected patients can benefit from standard induction chemotherapy, as a group they experience increased treatment-related toxicity and decreased survival.1-6 There is controversy regarding treatment recommendations for older patients, in part because of their heterogeneity.7-9 Improved assessment strategies are needed to discriminate between those older adults who are fit for intensive therapies and those who are vulnerable and may experience excess toxicity.

Risk stratification for older adults with AML has primarily focused on tumor biology, oncology performance status (Eastern Cooperative Oncology Group [ECOG] or Karnofsky) and chronologic age.10-14 Older chronologic age has been a consistent predictor of poor outcomes, contributing to the controversy surrounding optimal treatment strategies for older adults.8 Chronologic age, however, is a surrogate marker for specific impairments that increase vulnerability to treatment toxicity and poor outcomes. Although heterogeneity of tumor biology in younger and older patients with AML has received much attention, few studies have focused on measurement of underlying impairments that better reflect physiologic age and reserve capacity during treatment.

A brief geriatric assessment (GA), including evaluation of cognitive function, psychological state, physical function (PF), and comorbid disease, could identify patients most vulnerable to the side effects of AML chemotherapy.15 In older non-AML patients, GA can identify problems that may interfere with cancer treatment16-18 and can predict chemotherapy toxicity and survival.19-22 This type of assessment is recommended by the National Comprehensive Cancer Network guidelines for “Senior Adult Oncology”,10 but is not routinely used in clinical practice in part because of the lack of evidence in specific tumor types.

To date, no studies have evaluated the predictive value of GA among newly diagnosed older adults with AML. Having previously demonstrated the feasibility of performing a bedside GA among older adults hospitalized for induction chemotherapy,23 we wanted to assess the predictive value of impairments detected by GA on overall survival (OS) among older adults receiving intensive induction chemotherapy for AML.

Materials and methods

Study design and population

Between January 2009 and January 2011, we conducted a single-institution prospective cohort study enrolling consecutive patients aged ≥ 60 years with newly diagnosed, pathologically confirmed AML. The minimum age was 60 to align our results with previous AML trials.11,24 Additional eligibility criteria were: inpatient status, candidate for induction chemotherapy per treating physician, capacity to sign informed consent, and ambulatory (ECOG score 0-3). Patients who required intensive care unit support at initial evaluation or had prior therapy for AML were ineligible. The analysis cohort consisted of participants who received intensive (nonhypomethylating-based) induction chemotherapy including anthracycline and/or cytarabine to increase homogeneity and facilitate comparisons with existing literature.4,5 The treating physician chose the chemotherapy regimen, reflecting usual care; this was determined before enrollment and performance of the GA.

Subjects were enrolled within 5 days of initial hospitalization date. Bedside GA was performed on the inpatient ward by the study nurse at enrollment. The study nurse followed published procedures for administration and scoring of each assessment measure detailed in the following sections. This study was approved by the Institutional Review Board of Wake Forest University Health Sciences, and all participants provided written informed consent in accordance with the Declaration of Helsinki.

GA measures

Details of the GA battery were previously published.23 The GA was designed using previously validated, standardized, mostly survey-based measures to maximize comparisons to other geriatric patient populations and optimize reproducibility of results. The assessment measures do not require a specialized training background for administration and have predefined administration and scoring algorithms. Cognitive function was assessed using the 100-point Modified Mini-Mental State (3MS) Exam, a validated screening tool to assess global cognition.25,26 Depressive symptoms were assessed using the 20-item 60-point Center for Epidemiologic Studies Depression Scale (CES-D).27,28 We used the Distress Thermometer, a single-item rating from 0 (no distress) to 10 (extreme distress) to evaluate distress.29-31 PF was assessed using both self-report and objective measurements. Self-reported PF was assessed with the Pepper Assessment Tool for Disability,32-34 which includes subscales assessing mobility, instrumental and basic activities of daily living.35 We were interested in self-reported function at the time of treatment (reflecting both prediagnosis PF and disease burden) and recalled “prediagnosis” PF (defined as 6 months before treatment). Subjects were asked to report functional ability using the same survey questions for both of these time points. Higher scores indicate worse functioning. Subjects were considered impaired if they reported difficulty with one or more items in any subscale.

Objective physical performance measurements included hand grip strength and the Short Physical Performance Battery (SPPB). Grip strength predicts mortality, functional limitations, and disability in geriatric populations and was measured in both hands using an adjustable, hydraulic grip strength dynamometer.36 The SPPB evaluates lower extremity function and predicts future disability, hospitalizations, and mortality among elderly patients, with demonstrated reliability across diverse older adult populations.37-45 Training for standardized administration is publicly available.46 The SPPB comprises a short walk (4 m), repeated chair stands, and a balance test. Each component was scored from 0 to 4 (0 = unable to complete the test; 4 = highest performance level), with the total summed score ranging from 0 to 12.37 Comorbidity burden was recorded using the validated Hematopoietic Cell Transplantation Comorbidity Index score.47,48
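The SPPB scoring rule described above can be sketched in a few lines. This is only an illustration, not the study's scoring software: the function and argument names are hypothetical, and the cutoff of 9 is the impairment threshold (SPPB < 9) used elsewhere in this report.

```python
def sppb_total(walk: int, chair_stands: int, balance: int) -> int:
    """Sum the three SPPB component scores (each 0-4) into a 0-12 total."""
    components = {"walk": walk, "chair_stands": chair_stands, "balance": balance}
    for name, score in components.items():
        if not 0 <= score <= 4:
            raise ValueError(f"{name} score must be between 0 and 4, got {score}")
    return sum(components.values())


def is_impaired_sppb(total: int, cutoff: int = 9) -> bool:
    """Flag impaired objective physical function (total below the cutoff)."""
    return total < cutoff


# Illustrative component scores only, not study data.
total = sppb_total(walk=3, chair_stands=2, balance=3)
print(total, is_impaired_sppb(total))
```

A total of 9 or more would be classified as not impaired under this cutoff.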


Demographic (age, gender, race/ethnicity) and laboratory data (eg, hemoglobin), tumor characteristics, and treatment were collected from the medical record. Tumor-specific variables included: white blood cell count, lactate dehydrogenase level at admission, history of prior myelodysplastic syndrome (MDS), and cytogenetic risk group categorized from the diagnostic bone marrow biopsy according to the Southwest Oncology Group classification.2,3 Cytogenetic risk group was categorized as favorable/intermediate or unfavorable because few patients had a favorable classification. Admission height and weight were used to calculate body mass index. The treating attending physician's estimate of the patient’s ECOG performance score at admission was recorded.2 In analyses, ECOG performance score was categorized as good functional status (score ≤ 1) or poor functional status (score > 1).


The primary outcome of this analysis was OS, defined as the time from the date of beginning induction chemotherapy to the date of death or, for censored patients, the date of last follow-up. Exploratory outcomes included complete remission (CR), early mortality, and cause of death among patients with late mortality. CR was defined as morphologic leukemia-free state, including <5% blasts in the bone marrow, no blasts with Auer rods, and no persistent extramedullary disease, and was inclusive of patients with incomplete platelet count recovery (<100 000/μL) who were transfusion independent (CRi).49 Early mortality was defined as death within 30 days of initiation of induction chemotherapy. Cause of death was explored for patients who survived 30 days and categorized as resulting from complications of relapse (leukemia present at the time of death) or complications of treatment (relapse-free).
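As a rough sketch of how these outcome definitions translate into analysis variables, the snippet below derives a follow-up time and event indicator per patient and flags early mortality. The function names and example dates are hypothetical, and treating "within 30 days" as day 30 inclusive is an assumption, not something the text specifies.

```python
from datetime import date
from typing import Optional, Tuple


def overall_survival(induction_start: date,
                     death_date: Optional[date],
                     last_follow_up: date) -> Tuple[int, bool]:
    """Return (days of follow-up, death observed) for one patient.

    Patients without a recorded death are censored at last follow-up.
    """
    if death_date is not None:
        return (death_date - induction_start).days, True
    return (last_follow_up - induction_start).days, False


def is_early_death(days: int, died: bool, window_days: int = 30) -> bool:
    """Early mortality: death within the window (assumed inclusive of day 30)."""
    return died and days <= window_days


# Illustrative dates only, not study data.
days, died = overall_survival(date(2009, 3, 1), date(2009, 3, 21), date(2009, 3, 21))
print(days, died, is_early_death(days, died))
```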

Statistical analyses

Means and frequencies were used to describe the baseline characteristics of participants and performance on GA measures. Time to GA was described using medians (overall and by impairment category) and compared using the Wilcoxon rank sum test. CR/CRi rates and associated 95% CIs were estimated as the proportion of patients who achieved CR/CRi at the follow-up assessment. Patients who died before the follow-up assessment were considered not to have achieved CR. Rates and 95% CIs for 30-day mortality were estimated for all patients and by 3MS (<77) and SPPB score (<9). OS was estimated using the Kaplan-Meier method. The log-rank test was used to compare survival by 3MS (<77) and SPPB score (<9). After assessing the proportionality assumption and the forms of the covariates using sums of cumulative martingale residuals50,51 and Akaike’s information criterion to compare models, Cox proportional hazards models were fit for each GA measure as a predictor of OS, in unadjusted models and models controlling for age, gender, hemoglobin, ECOG score, prior MDS, and cytogenetic risk group. Sensitivity analyses were done, adding chemotherapy type to the adjusted models to determine if the hazard ratios for the GA variables were attenuated. We also estimated the hazard ratios in adjusted Cox regression models treating SPPB and 3MS as continuous variables. To assess the incremental impact of cognitive and PF variables on predicting survival, we used Integrated Discrimination Improvement as described for survival analysis by Chambless et al52 and implemented in the RiskPredictionParams SAS Macro (macro version 8, Chapel Hill, NC). The variables included in the model were those associated with survival in our analysis and included in Kantarjian’s predictive model for early mortality.8 Internal cross-validation of the model was assessed using 1000 bootstrap replicates implemented in the RiskPredictionParams SAS macro.
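For readers unfamiliar with the Kaplan-Meier method named above, the following is a minimal pure-Python sketch of the product-limit estimator. The analyses in this study were run in SAS; the function names and the toy follow-up data here are hypothetical and serve only to show the mechanics.

```python
from collections import Counter
from typing import List, Optional, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return (event time, estimated survival probability) pairs."""
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # deaths and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving this time.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which estimated survival is 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy data (months, death observed); not study data.
times = [2, 3, 3, 5, 8, 12]
events = [True, True, False, True, False, True]
curve = kaplan_meier(times, events)
print(curve, median_survival(curve))
```

Patients censored at a given time are counted as at risk for deaths occurring at that same time, which is the usual convention.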

χ² and Fisher’s exact tests were used to investigate relationships between baseline 3MS and SPPB and exploratory outcomes of 30-day mortality and CR/CRi. Competing risk modeling was used to investigate a differential effect of impairment in GA measures by cause of death.53 All analyses were conducted at a two-sided α-level of 0.05 using SAS statistical software, version 9.2 (SAS Institute Inc., Cary, NC).


Results

Screening and enrollment are illustrated in Figure 1. Among the 74 consecutive patients enrolled, the mean age was 70 (SD 6.2) years, 54% were male, and 96% were white (Table 1). Median laboratory values indicated anemia, high levels of lactate dehydrogenase, and leukopenia. Only 4% had favorable cytogenetic abnormalities, and approximately one-quarter had preceding MDS. Most patients (70%) received standard induction therapy with anthracycline, cytarabine, ± etoposide. The remainder received alternative anthracycline- or cytarabine-based induction regimens. Anthracycline and cytarabine doses ranged within accepted standards per National Comprehensive Cancer Network guidelines (see Table 1 footnote).

Figure 1

Study screening, eligibility, and enrollment.

Table 1

Characteristics of older patients initiating induction chemotherapy for AML (N = 74)

All enrolled patients participated in GA. Missing data were minimal. One patient failed to complete the 3MS, and seven patients could not perform grip testing, primarily because of arthritis. The median time from admission to administration of GA was 2 days. Mean baseline GA scores are reported in Table 2. As a group, these patients presented with depressive symptoms, distress, and impaired PF. Overall, the low SPPB scores characterize the population as physically frail.37 Between 20% and 70% of patients met criteria for impairment on individual GA measures, despite good overall oncology performance scores (Table 2).

Table 2

Baseline GA measure scores among older adults initiating induction chemotherapy for AML (N = 74)

The cohort median OS was 11 months, with a CR/CRi rate of 64% and 30-day mortality rate of 15%. Causes of death for patients who survived at least 30 days (N = 63) but died during study follow-up (N = 39) were divided between complications of relapse (N = 25) and complications of treatment (N = 14).

Among standard clinical characteristics, only cytogenetic risk group, prior MDS, and baseline hemoglobin were significantly associated with OS (Table 3). Chronologic age and ECOG score were not associated with OS. By contrast, among GA variables, both cognitive function and objective PF were associated with OS (Figures 2 and 3). Specifically, patients with poor cognitive function (3MS score <77) at baseline had a median OS of 5.2 months vs 15.6 months for those with better cognitive function (3MS score ≥77) (P = .002). Similarly, patients with low physical performance at baseline (SPPB score <9) had a median OS of 6.0 months compared with 16.8 months for those with better physical performance (SPPB score ≥9) (P = .018). Comorbidity burden and psychological health measures, including depression and distress, were not associated with survival, nor was self-reported PF at either time point.

Table 3

Association between clinical characteristics, baseline GA measures, and OS among older adults with AML (N = 73)

Table 4

Explanatory power of baseline variables to predict OS

Figure 2

Impaired baseline cognitive function is associated with worse OS among older adults treated for AML (N = 73). Median survival differed using log-rank testing.

Figure 3

Impaired physical performance is associated with worse OS among older adults treated for AML (N = 74). Median survival differed using log-rank testing.

In multivariable analyses, when controlling for a common set of covariates (age, gender, ECOG performance status, cytogenetic risk group, prior MDS, and hemoglobin), patients with impaired cognition had a 2.5-fold higher risk of death than those without impaired cognitive function (HR 2.5, 95% CI 1.2-5.5) (Table 3). Similarly, patients with impaired objective physical performance on SPPB testing had an approximately twofold higher risk of death than those with better physical performance (HR 2.2, 95% CI 1.1-4.6). These HRs were not attenuated in sensitivity analyses adjusted for chemotherapy type as a covariate; 3MS and SPPB remained significant predictors of OS (HR 3.0, 95% CI 1.3-6.9 and HR 2.3, 95% CI 1.03-5.1, respectively). When considered as continuous variables, for each 5-point improvement in 3MS score, the hazard of death decreased by 26% (HR 0.74, 95% CI 0.62-0.91). For each 2-point improvement in the SPPB score, the hazard of death decreased by 15% (HR 0.85, 95% CI 0.72-1.01, P = .06). We found no relationship between timing of GA from date of admission and baseline SPPB (P = .29) or 3MS score (P = .16) to suggest confounding of this relationship, with an estimated median time to GA of 2 days for those who were impaired or not impaired in either measure.
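The percentage reductions quoted for the continuous models follow directly from the hazard ratios, as the short sketch below shows. The rescaling helper assumes the score enters the Cox model as a single linear term on the log-hazard scale, which is an assumption about the model form rather than something stated here; the function names are hypothetical.

```python
def percent_hazard_reduction(hr: float) -> float:
    """Percent decrease in hazard implied by a hazard ratio below 1."""
    return (1 - hr) * 100


def rescale_hr(hr: float, from_units: float, to_units: float) -> float:
    """Convert an HR per `from_units` of a covariate to an HR per `to_units`.

    Assumes a linear effect on the log-hazard scale, so the log HR scales
    proportionally with the size of the increment.
    """
    return hr ** (to_units / from_units)


# Reported HRs: 0.74 per 5-point 3MS improvement, 0.85 per 2-point SPPB improvement.
print(round(percent_hazard_reduction(0.74)))  # 3MS, per 5 points
print(round(percent_hazard_reduction(0.85)))  # SPPB, per 2 points
print(rescale_hr(0.74, from_units=5, to_units=1))  # implied HR per 1-point 3MS change
```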

Table 4 illustrates the explanatory value of GA measures on OS. The integrated discrimination improvement can be interpreted as the proportion of variance explained by the model, similar to r² in linear regression.52 Standard clinical characteristics including those proposed by the Kantarjian predictive model for early mortality8 (age, gender, cytogenetic risk group, ECOG score, prior MDS, hemoglobin, creatinine, comorbidity) explained only 21% of the variability in OS. The addition of cognitive function and PF explained an additional 12% (a 60% relative increase in predictive power).
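The relative increase can be checked from the rounded percentages given in the text. The symbols below are introduced only for this illustration, and the unrounded values from Table 4 are not reproduced here, so the result is approximate:

```latex
\[
\frac{\text{additional variability explained}}{\text{variability explained by base model}}
  = \frac{0.12}{0.21} \approx 0.57
\]
```

That is, roughly a 60% relative increase, with about 33% of the variability in OS explained in total (21% + 12%).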

In exploratory analyses, differences in attainment of CR/CRi were seen by baseline SPPB score (≥9 = 70% vs <9 = 57%, P = .23) and 3MS score (≥77 = 67% vs <77 = 57%, P = .41), but were not statistically significant. Rates of 30-day mortality were also higher among impaired patients, but did not achieve statistical significance (3MS, P = .14; SPPB, P = .5). In particular, in patients with SPPB <9, the 30-day mortality estimate was 18.9% (95% CI 8.0-35.2%) vs 10.8% (95% CI 3.0-25.4%) for SPPB ≥9. Among patients who screened positive for cognitive impairment, 30-day mortality was 23.8% (95% CI 8.2-47.2%) vs 9.6% (95% CI 3.0-21.0%) for 3MS ≥77. Finally, we did not detect a difference in the hazard ratios for impairment in 3MS or SPPB by cause of death.
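The interval method used for these mortality estimates is not stated in the text. As a hedged illustration, the sketch below computes an exact (Clopper-Pearson) binomial interval, one common choice for small samples. The count of 7 deaths among 37 patients is inferred from the reported 18.9% and is an assumption, as are the function names.

```python
from math import comb
from typing import Callable, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def _solve(f: Callable[[float], float], target: float) -> float:
    """Bisection for p in [0, 1] where the decreasing function f(p) equals target."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> Tuple[float, float]:
    """Exact two-sided (1 - alpha) confidence interval for the proportion x/n."""
    # Lower limit: p such that P(X >= x) = alpha/2, i.e. P(X <= x-1) = 1 - alpha/2.
    lower = 0.0 if x == 0 else _solve(lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper limit: p such that P(X <= x) = alpha/2.
    upper = 1.0 if x == n else _solve(lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


# Assumed counts (7 of 37), inferred from the reported 18.9%; not taken from the tables.
low, high = clopper_pearson(7, 37)
print(f"{7 / 37:.1%} (95% CI {low:.1%}-{high:.1%})")
```

The width of such an interval in a group of this size shows why the observed differences in 30-day mortality did not reach statistical significance.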


Discussion

This study demonstrates the prognostic significance of GA in the evaluation of older adults with AML. In these patients, considered fit for intensive chemotherapy by standard oncology assessment, we found significant impairment and heterogeneity in physical, cognitive, and psychological health. In our cohort, objectively measured PF and cognitive function were more important than chronologic age in predicting survival. Measuring these two clinical characteristics alone increased the predictive power of our model by 60% to explain differences in OS.

Our study adds to the published literature proposing risk stratification models for older adults with AML using age, ECOG performance status, and clinical variables heavily weighted toward tumor biology. For example, the prognostic model proposed by Kantarjian and colleagues includes age >80 years, complex karyotype, poor ECOG performance status (>1), and elevated creatinine (>1.3 mg/dL).54 Rollig and colleagues developed a model that included chronologic age, karyotype, NPM1 mutational status, white blood cell count, lactate dehydrogenase levels, and CD4 expression.55 The next step in risk stratification is to better understand which specific age-related impairments or conditions (comorbidity, PF, cognitive function) drive the poor outcomes seen with increased chronologic age. When we added measurement of physical performance and cognitive function to a base model (Table 4) that incorporates clinically available variables from the published predictive model for early mortality by Kantarjian,8 the ability to explain differences in OS improved significantly in our cohort. Importantly, this model also highlights that two-thirds of the variability in survival remains unexplained.

This study builds on a growing body of evidence suggesting that a multidimensional GA can predict chemotherapy toxicity and survival in older cancer patients. Large cohort studies in heterogeneous older cancer populations have suggested that GA assessments add to standard clinical evaluations in predicting treatment outcomes.20-22,56,57 However, the screening tools best suited to identify vulnerability will differ by patient populations. Our study is unique in focusing specifically on a homogeneous AML population, using a multidimensional approach to identify which impairments are most important and validated screening tools developed in geriatric populations.

Our study specifically highlights objectively measured physical and cognitive function as key predictors of vulnerability among “good performance status” older adults with AML. Objectively measured PF appears more sensitive than self-report measures in predicting survival after induction therapy. In noncancer geriatric populations, impaired physical performance (such as slow walking speed) has been a strong and consistent predictor of future disability and mortality, and improves outcome prediction over self-report alone.40-42,44,45,58 Among hospitalized older adults, Volpato et al found that patients with moderate impairment in physical performance (SPPB score 5-7) had a 2.6 times higher risk of death or rehospitalization after discharge compared with patients with better physical performance (SPPB score 8-12).45 We previously reported an association between physical performance and OS in a heterogeneous cohort of older adults diagnosed with cancer.59 A similar measure of physical performance (the Timed Up and Go Test)60 was an independent predictor of 6-month mortality among a heterogeneous cohort of older patients treated with first-line chemotherapy.57

Unlike a previous study that showed an association between self-reported impairment in instrumental activities of daily living and survival,61 we found no association between self-reported function and survival. Our results may differ because our participants were already highly selected for intensive therapy and the range of self-reported impairments may be more limited. Self-report measures may not accurately capture functional capacity. Patients in our study recalled less functional impairment 6 months before diagnosis, but recalled functional status was not associated with survival. Thus, our results suggest that objective measurement of function at the time of treatment using more sensitive measures (ie, SPPB) is a better predictor of treatment outcome than recalled functional status.

Importantly, this finding also supports the investigation of interventions to target physical vulnerability. Mechanisms by which impaired physical performance may lead to worse survival in AML include increased risk of infectious complications related to inactivity, falls, and accelerated deconditioning that prohibits delivery of consolidation therapy needed for cure. Testing interventions such as exercise during and after chemotherapy could decrease risks associated with low physical performance at presentation and potentially improve outcomes.62

We also demonstrate the prognostic implications of cognitive screening. Among our highly selected cohort of patients, a substantial proportion screened positive for impaired cognitive function on a standardized screening test. Cognitive impairment is often underrecognized among hospitalized patients and has been associated with increased mortality among older adults.63,64 Similar screening data were reported in non-AML elderly cancer cohorts.22,57 Little information is available specifically on cognitive function among older adults with AML. A study of 54 nonelderly patients with AML/MDS documented impaired performance on a battery of cognitive tests in up to 40% of the patients before treatment.65 This is consistent with our findings.

Cognitive test score may identify a patient who either has, or is at risk for, delirium. Delirium, particularly the hypoactive form, is frequently unrecognized but is a known independent risk factor for mortality among hospitalized older patients with other medical conditions.66 Further research is needed to elucidate the relationship between baseline cognitive performance and subsequent delirium during treatment of AML. If delirium is identified as a mechanism by which baseline cognitive function predicts worse survival, interventions directed at its prevention and treatment may improve outcomes for older patients.67 Most important, our analysis and others support the need to integrate cognitive screening in the pretreatment workup for older cancer patients.20,56,57

Comorbidity did not independently predict survival in our study, although it was a risk factor for both treatment-related mortality and survival in other studies.48,68,69 There are several potential explanations for this discordance. First, our patient population was already highly selected, with a low prevalence of major comorbid conditions known to complicate induction therapy. Second, comorbidity may be a surrogate marker for functional status. In our study, we used a sensitive measure of PF, which may have decreased the independent predictive value of our comorbidity measure. Third, our study may have been underpowered to detect the association between the comorbidity index score and OS.

Finally, our exploratory analyses of 30-day mortality, attainment of CR/CRi, and cause of death among patients with late mortality suggest that impairment in SPPB or 3MS increases vulnerability to death from any cause. However, this study was not powered to distinguish differences in hazard ratios by cause of death. Larger studies are needed to elucidate these relationships and the specific mechanisms by which these impairments affect survival. This will be critical to inform interventions targeting specific vulnerabilities.

Our study has several limitations. This is a relatively small, single-institution cohort, which limits generalizability of the findings, although the results are hypothesis-generating. The GA was designed to be nurse-administered, and its length (30 minutes) may make it impractical in a clinical setting. Our study is underpowered to evaluate the predictive value of GA measures on treatment-related mortality outcomes; therefore, these associations are reported as exploratory. Finally, molecular data (eg, FLT3 and NPM1 mutational status) were not available and will be important variables to consider in validation studies.

Next steps include multisite validation of the feasibility and predictive value of performing GA for older adults with AML. This work is under way in the cooperative group setting. Identification of the most efficient screening tools will be necessary to translate GA into clinical practice and individualize treatment decision-making. Finally, interventions targeting physical and cognitive function may improve treatment tolerance and ultimately modify the negative association between increasing age and poor prognosis.


Contribution: H.D.K. designed the study, interpreted data, and wrote the paper; A.M.G., S.B.K., and J.D.W. assisted in study design, interpretation of data, and manuscript writing; B.L.P. assisted in study design, data collection, interpretation of data, and manuscript writing; J.A.T. performed analyses, interpreted data, and assisted in manuscript writing; and T.S.P. and L.R.E. assisted in data collection and manuscript writing.

Conflict-of-interest disclosure: The authors declare no competing financial interests.

Correspondence: Heidi D. Klepin, Comprehensive Cancer Center of Wake Forest University, Medical Center Blvd, Winston-Salem, NC, 27157; e-mail: hklepin{at}


Special thanks to the patients and their families and to Rose Fries, Christy Thompson, Jill Hyson, and Vivian Grubbs for their support.

This work was supported by an American Society of Hematology-Association of Specialty Professors Junior Faculty Scholar Award in Clinical/Translational Research (supported by the American Society of Hematology, Atlantic Philanthropies, John A. Hartford Foundation, and the Association of Specialty Professors) (H.D.K.) and the Wake Forest University Claude D. Pepper Older Americans Independence Center (P30 AG-021332) (H.D.K.). Current support includes a Paul Beeson Career Development Award in Aging Research K23AG038361 (supported by National Institute on Aging, American Federation for Aging Research, The John A. Hartford Foundation, and The Atlantic Philanthropies) (H.D.K.) and The Gabrielle's Angel Foundation for Cancer Research (H.D.K.).


  • The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

  • Submitted December 4, 2012.
  • Accepted March 23, 2013.

