Personalized Modeling of Disease Evolution in CLL: Does Statistical Significance Translate into Predictive Accuracy?

Elisavet Chatzilari, Panagiotis Baliakas, Aliki Xochelli, Anastasios Maronidis, Anna Vardi, Mattias Mattsson, Karin Larsson, Vassiliki Douka, Michail Iskas, George Karavalakis, Apostolia Papalexandri, Carsten Niemann, Marco Montillo, Achilles Anagnostopoulos, David Oscier, Sarka Pospisilova, Frederic Davi, Niki Stavroyianni, Paolo Ghia, Anastasia Hadzidimitriou, Richard Rosenquist, Spiros Nikolopoulos, Kostas Stamatopoulos and Yannis Kompatsiaris


The remarkable clinical heterogeneity of CLL has prompted several initiatives to develop prognostic models that stratify patients into subgroups with distinct outcomes. However, despite progress, the resulting prognostic models, mostly based on Cox regression analysis, have not been adopted in everyday clinical practice, mainly because they fail to provide sufficiently accurate predictions on a per-patient basis. Here, we addressed prognostication among Binet stage A CLL cases with a novel approach using Adaboost, an ensemble learning algorithm based on decision trees. Adaboost jointly considers all available parameters to provide a specific prediction for each patient, unlike Cox regression models, which are based on identifying parameters with independent prognostic significance. In addition, Adaboost models are completely automated and require minimal time for training and prediction generation, in contrast to Cox models, which are manually trained and require significantly more time for prediction generation. Both Cox regression and Adaboost models were evaluated for their predictive accuracy, i.e., the number of patients successfully assigned to their true risk group divided by the total number of patients. For the development of the prognostic models, 5-fold cross-validation was used: the patients were divided into 5 equally sized subgroups and, in each round, 4 of the 5 subgroups were used to train the Cox regression and Adaboost models, while the 5th was kept as the validation cohort to which the models were applied. The study cohort included 789 Binet A CLL patients with available data regarding gender, age, immunogenetic profile, CD38 expression, Döhner model cytogenetic aberrations and treatment status, with a median follow-up of 8.5 years (range 0-40.5 years, at least 5 years for untreated cases).
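The evaluation scheme described above (5-fold cross-validation scored by predictive accuracy) can be sketched roughly as follows. This is an illustrative outline only, not the authors' pipeline: the function names, the random seed, the pooling of held-out predictions into a single accuracy figure, and the `majority_class_trainer` placeholder are all assumptions introduced here. In practice `train_fn` would wrap an Adaboost or Cox-derived classifier.

```python
import random

def five_fold_indices(n_patients, k=5, seed=0):
    """Shuffle patient indices and split them into k near-equal folds."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    return [indices[i::k] for i in range(k)]

def predictive_accuracy(true_groups, predicted_groups):
    """Patients assigned to their true risk group / total patients."""
    correct = sum(t == p for t, p in zip(true_groups, predicted_groups))
    return correct / len(true_groups)

def cross_validated_accuracy(features, true_groups, train_fn, k=5, seed=0):
    """Train on k-1 folds, predict the held-out fold, pool all predictions."""
    folds = five_fold_indices(len(true_groups), k, seed)
    predicted = [None] * len(true_groups)
    for held_out in range(k):
        train_idx = [i for f, fold in enumerate(folds)
                     if f != held_out for i in fold]
        # train_fn is hypothetical: it returns a callable mapping one
        # patient's features to a risk group label ("HR", "IR" or "LR").
        model = train_fn([features[i] for i in train_idx],
                         [true_groups[i] for i in train_idx])
        for i in folds[held_out]:
            predicted[i] = model(features[i])
    return predictive_accuracy(true_groups, predicted)

def majority_class_trainer(train_features, train_groups):
    """Placeholder 'model': always predicts the most frequent training label."""
    most_common = max(set(train_groups), key=train_groups.count)
    return lambda patient_features: most_common
```

With 789 patients, `five_fold_indices(789)` yields four folds of 158 and one of 157, so the subgroups are equal up to one patient.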
Patients were subdivided into 3 groups: (i) high risk (HR): time-to-first-treatment (TTFT) <2 years, n=215 (27%); (ii) intermediate risk (IR): TTFT ≥2 years and <5 years, n=151 (20%); and (iii) low risk (LR): no need for treatment within 5 years from diagnosis, n=422 (53%). Adaboost assigned 326 (41.5%), 0 (0%) and 463 (58.5%) cases to the HR, IR and LR groups, respectively. On multivariate analysis, unmutated IGHV genes (U-CLL), subset #2 assignment and CD38 expression emerged as independently predictive of shorter TTFT; in contrast, the adverse-prognosis cytogenetic aberrations del(17p) and del(11q) did not retain significance (p=0.06 and 0.052, respectively), likely due to their strong association with U-CLL. Applying Cox regression models based on the significant independent parameters, patients were classified as follows: (i) HR: U-CLL and/or assignment to stereotyped subset #2 (n=357, 45%); (ii) IR: mutated IGHV genes (M-CLL) and high CD38 expression (CD38+) (n=41, 5%); and (iii) LR: M-CLL and low CD38 expression (CD38-) (n=397, 50%). Prediction accuracies were 58.2% and 61.1% for the Cox regression and the Adaboost model, respectively (McNemar's test: p<0.0025). Both models often failed to identify patients belonging to the IR group. Further, we gave the same clinico-biological parameters used for the development of the prognostic models to 7 trained hematologists and asked them to assign each patient included in the study to one of the 3 risk groups. Among the trained hematologists, individual prediction accuracies ranged from 51.2% to 58.4%, with an average of 54.6%; discriminating between the HR and IR groups was particularly challenging. In conclusion, Adaboost achieved a small, yet statistically significant, improvement in predictive accuracy over both Cox regression and expert judgment, suggesting its potential for clinical testing.
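McNemar's test, used above to compare the two models, depends only on the patients that one model classifies correctly and the other does not. The sketch below shows the exact (binomial) two-sided form of the test. The abstract does not state which variant of the test was applied, and it does not report the discordant counts, so the function names and any inputs passed to them are illustrative assumptions rather than study data.

```python
from math import comb

def discordant_counts(true_groups, pred_a, pred_b):
    """Count patients that exactly one of the two models classified correctly."""
    b = sum(a == t and p != t for t, a, p in zip(true_groups, pred_a, pred_b))
    c = sum(a != t and p == t for t, a, p in zip(true_groups, pred_a, pred_b))
    return b, c  # (A right and B wrong, A wrong and B right)

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts.

    Under the null hypothesis of equal accuracy, each discordant patient
    is equally likely to favour either model, so min(b, c) follows a
    Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Because concordant patients (both models right, or both wrong) drop out of the statistic, a modest gap in overall accuracy can still reach significance when the disagreements are lopsided.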
However, the predictive accuracy of both the Adaboost and Cox regression approaches remains unsatisfactory, highlighting that further development is required to provide robust personalized predictive modeling, and suggesting that statistical significance does not automatically translate into clinical utility. This indicates the need to incorporate disease- and host-related parameters not yet evaluated for their prognostic/predictive value in CLL in order to refine risk stratification, thus meaningfully empowering physicians in clinical decision-making.

Disclosures Niemann: Janssen: Consultancy; Roche: Consultancy; Gilead: Consultancy; Novartis: Other: Travel grant. Ghia: Janssen Pharmaceuticals: Research Funding.

* Asterisk with author names denotes non-ASH members.