CloudPhylo

Name CloudPhylo
Type Phylogeny and Molecular Evolution
Developers Xingjian Xu, Zhang Zhang
Description CloudPhylo is a fast, scalable Spark-based tool for reconstructing phylogenetic trees from large-scale datasets. Phylogeny reconstruction methods fall roughly into two categories: alignment-based and alignment-free. Alignment-based methods require multiple sequence alignments to be computed beforehand. Alignment-free methods do not, which makes them naturally better suited to cloud computing infrastructure, and they can take whole genome sequences as input, using all of the data without bias when inferring a phylogenetic tree. Among alignment-free methods, CVTree has proven to be robust, producing accurate phylogenetic trees for prokaryotes. In essence, it normalizes each whole genome sequence into a high-dimensional composition vector and then evaluates the evolutionary relationship between any two genomes by estimating the cosine distance between their vectors. CloudPhylo employs the statistical model of CVTree but uses Spark to parallelize the algorithm in a fine-grained manner. In addition to the Spark version (CloudPhylo-Spark), we also implemented a Hadoop version (CloudPhylo-Hadoop) so that the performance of the two can be evaluated on real data.
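The composition-vector idea described above can be illustrated with a minimal Python sketch. This is not CloudPhylo's code and it simplifies the CVTree model: it uses plain normalized k-mer frequencies as the vector, whereas the published CVTree method, as far as we understand it, also corrects for background k-mer frequencies. The function names, the default `k=3`, and the use of `1 - cosine similarity` as the distance are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def composition_vector(sequence, k=3):
    """Return normalized k-mer frequencies as a sparse dict (hypothetical helper).

    Simplification: no background correction is applied here.
    """
    if len(sequence) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two sparse vectors (assumed distance form)."""
    dot = sum(x * v.get(kmer, 0.0) for kmer, x in u.items())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(genomes, k=3):
    """Compute the distance for every unordered pair of named sequences."""
    vectors = {name: composition_vector(seq, k) for name, seq in genomes.items()}
    return {
        (a, b): cosine_distance(vectors[a], vectors[b])
        for a, b in combinations(sorted(vectors), 2)
    }
```

Each vector depends on a single genome and each distance on a single pair of vectors, so both steps can be computed independently of one another. That independence is what makes this kind of method a natural fit for a parallel framework such as Spark.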
Download https://ngdc.cncb.ac.cn/biocode/tools/BT000006
Article https://academic.oup.com/bioinformatics/article/33/3/438/2585051?login=false
Cite Count 10
