Peer-Reviewed

Using Evolutionary Trees for the Colorectal Cancer Prognosis Prediction

Received: 2 January 2021     Accepted: 29 January 2021     Published: 10 February 2021
Abstract

The measurement of tree similarity based on structure comparison has long been used in diverse fields. We applied the evolutionary tree method to study the cancer genome. Cancer evolutionary trees, which represent cancer diversity, provide information on clonal evolution and on the clinical outcome of cancer patients. This study considered 107 colorectal cancer (CRC) patients whose cancer tissues underwent deep targeted sequencing. The evolutionary tree of each patient was reconstructed from genome sequencing data based on the variant allele frequencies (VAFs) of point mutations and small insertions or deletions (indels). The main purpose of this study was to predict cancer recurrence. We mapped the structure of each cancer evolutionary tree to a rooted tree and developed a canonical-form transformation that resolves tree isomorphism, so that each patient's tree has a single, unambiguous structure. We proposed an algorithm that compares tree similarity by calculating a cost between evolutionary structure trees. The cost was calculated using the node position, tree size (number of nodes), tree height, node depth, number of descendants of the node (the size of the subtree rooted at the node), and the relationship of the node to other nodes. After the tree similarity comparison, the patients were clustered into two groups through k-means clustering. The clustering information indicated that the evolutionary structure trees were associated with gender and tumor invasion stage. Several machine-learning strategies, including random forest, support vector machine (SVM), bagging, and boosting, were used to predict cancer recurrence in these patients. Our results revealed that combining the clustering information of evolutionary structure trees with clinical information improved prediction performance compared with using clinical information alone, and that tree similarity comparison can help in the prognostic analysis of cancer patients.
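The abstract does not spell out the canonical-form transformation or the cost function, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a rooted tree stored as a dictionary from node id to child ids (a hypothetical representation) and uses the standard sorted-parenthesis (AHU-style) encoding, under which isomorphic rooted trees receive the same string. It also computes the structural quantities the abstract names as inputs to the cost: node depth, subtree size, and tree height.

```python
from typing import Dict, List

# Hypothetical representation: node id -> list of child ids.
# Leaves may be omitted from the dictionary.
Tree = Dict[int, List[int]]


def canonical_form(tree: Tree, root: int) -> str:
    """Return a parenthesis string that is identical for isomorphic rooted trees.

    Each subtree is encoded recursively and sibling encodings are sorted,
    so the order in which children were stored does not matter.
    """
    child_codes = sorted(canonical_form(tree, c) for c in tree.get(root, []))
    return "(" + "".join(child_codes) + ")"


def node_descriptors(tree: Tree, root: int) -> Dict[int, Dict[str, int]]:
    """Compute depth and subtree size (node plus descendants) for every node."""
    info: Dict[int, Dict[str, int]] = {}

    def visit(node: int, depth: int) -> int:
        size = 1  # count the node itself
        for child in tree.get(node, []):
            size += visit(child, depth + 1)
        info[node] = {"depth": depth, "subtree_size": size}
        return size

    visit(root, 0)
    return info


def tree_height(tree: Tree, root: int) -> int:
    """Height of the tree, taken as the maximum node depth (root has depth 0)."""
    return max(d["depth"] for d in node_descriptors(tree, root).values())


if __name__ == "__main__":
    # Two trees that differ only in the order of the root's children.
    tree_a = {0: [1, 2], 1: [3]}
    tree_b = {0: [1, 2], 2: [3]}
    print(canonical_form(tree_a, 0) == canonical_form(tree_b, 0))
    print(node_descriptors(tree_a, 0))
    print(tree_height(tree_a, 0))
```

Comparing canonical strings gives a quick equality check for tree shape; a graded similarity such as the cost described above would additionally need a rule for weighting differences in these descriptors, which the abstract leaves unspecified.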

Published in Computational Biology and Bioinformatics (Volume 9, Issue 1)
DOI 10.11648/j.cbb.20210901.11
Page(s) 1-14
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2021. Published by Science Publishing Group

Keywords

Cancer Evolutionary Trees, Colorectal Cancer, Evolutionary Structure Trees, Canonical-form Transformation, Tree Similarity Comparison

Cite This Article
  • APA Style

    Hung-Yu Yan, Dun-Wei Cheng, Peng-Chan Lin, Hsin-Hung Chou, Meng-Ru Shen, et al. (2021). Using Evolutionary Trees for the Colorectal Cancer Prognosis Prediction. Computational Biology and Bioinformatics, 9(1), 1-14. https://doi.org/10.11648/j.cbb.20210901.11

    ACS Style

    Hung-Yu Yan; Dun-Wei Cheng; Peng-Chan Lin; Hsin-Hung Chou; Meng-Ru Shen, et al. Using Evolutionary Trees for the Colorectal Cancer Prognosis Prediction. Comput. Biol. Bioinform. 2021, 9(1), 1-14. doi: 10.11648/j.cbb.20210901.11

    AMA Style

    Hung-Yu Yan, Dun-Wei Cheng, Peng-Chan Lin, Hsin-Hung Chou, Meng-Ru Shen, et al. Using Evolutionary Trees for the Colorectal Cancer Prognosis Prediction. Comput Biol Bioinform. 2021;9(1):1-14. doi: 10.11648/j.cbb.20210901.11

  • BibTeX

    @article{10.11648/j.cbb.20210901.11,
      author = {Hung-Yu Yan and Dun-Wei Cheng and Peng-Chan Lin and Hsin-Hung Chou and Meng-Ru Shen and Sun-Yuan Hsieh},
      title = {Using Evolutionary Trees for the Colorectal Cancer Prognosis Prediction},
      journal = {Computational Biology and Bioinformatics},
      volume = {9},
      number = {1},
      pages = {1-14},
      doi = {10.11648/j.cbb.20210901.11},
      url = {https://doi.org/10.11648/j.cbb.20210901.11},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.cbb.20210901.11},
     year = {2021}
    }
    

  • RIS

    TY  - JOUR
    T1  - Using Evolutionary Trees for the Colorectal Cancer Prognosis Prediction
    AU  - Hung-Yu Yan
    AU  - Dun-Wei Cheng
    AU  - Peng-Chan Lin
    AU  - Hsin-Hung Chou
    AU  - Meng-Ru Shen
    AU  - Sun-Yuan Hsieh
    Y1  - 2021/02/10
    PY  - 2021
    N1  - https://doi.org/10.11648/j.cbb.20210901.11
    DO  - 10.11648/j.cbb.20210901.11
    T2  - Computational Biology and Bioinformatics
    JF  - Computational Biology and Bioinformatics
    JO  - Computational Biology and Bioinformatics
    SP  - 1
    EP  - 14
    PB  - Science Publishing Group
    SN  - 2330-8281
    UR  - https://doi.org/10.11648/j.cbb.20210901.11
    VL  - 9
    IS  - 1
    ER  - 

Author Information
  • Institute of Medical Informatics, National Cheng Kung University, Tainan, Taiwan

  • Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan

  • Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan; Department of Oncology, Department of Genomics Medicine, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan

  • Department of Computer Science and Information Engineering, National Chi Nan University, Nantou, Taiwan

  • Department of Obstetrics and Gynecology, National Cheng Kung University, Department of Pharmacology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan

  • Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan; Institute of Manufacturing Information Systems and Institute of Medical Informatics, National Cheng Kung University, Tainan, Taiwan
