GitHub - flu-crew/phylo-rs: Library for phylogenetic trees

flu-crew/phylo-rs

 
 


Phylo

Phylo is a fast, extensible, general-purpose, and WebAssembly-capable library for phylogenetic analysis and inference written in Rust. Phylo-rs leverages the memory safety, speed, and native WebAssembly support offered by Rust to provide a robust set of memory-efficient data structures and basic algorithms, ranging from tree manipulation with SPR to the computation of tree statistics such as phylogenetic diversity.

A note on implementation

Implementing tree-like structures in Rust can be complex and time-intensive. Implementing tree traversals and operations on tree structures (recursive or otherwise) can also be a substantial task. This crate aims to implement most such methods as easily derivable traits, so you don't have to implement them from scratch where they are not needed.

We also provide a struct, so you don't have to implement one...
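
To make the idea of derivable traits concrete, here is a minimal sketch using only the standard library. The trait `RootedTree`, its methods, and the `VecTree` type are hypothetical names chosen for illustration and are not the phylo-rs trait hierarchy. The point is that a type supplies two small required methods and receives the traversal and a leaf count as provided methods.

```rust
/// Hypothetical trait for illustration; not the phylo-rs API.
trait RootedTree {
    /// Required: id of the root node.
    fn root(&self) -> usize;
    /// Required: ids of the children of `node`, left to right.
    fn children(&self, node: usize) -> Vec<usize>;

    /// Provided: post-order traversal built only from the required methods.
    fn postorder(&self) -> Vec<usize> {
        let mut out = Vec::new();
        // Each stack entry records whether the node's children were already pushed.
        let mut stack = vec![(self.root(), false)];
        while let Some((node, expanded)) = stack.pop() {
            if expanded {
                out.push(node);
            } else {
                stack.push((node, true));
                // Push children in reverse so the leftmost child is visited first.
                for child in self.children(node).into_iter().rev() {
                    stack.push((child, false));
                }
            }
        }
        out
    }

    /// Provided: number of nodes without children.
    fn num_leaves(&self) -> usize {
        self.postorder()
            .into_iter()
            .filter(|&n| self.children(n).is_empty())
            .count()
    }
}

/// A user-defined tree: `children[i]` lists the children of node `i`.
struct VecTree {
    children: Vec<Vec<usize>>,
}

impl RootedTree for VecTree {
    fn root(&self) -> usize {
        0
    }
    fn children(&self, node: usize) -> Vec<usize> {
        self.children[node].clone()
    }
}

/// Example tree: 0 -> (1, 2), 1 -> (3, 4).
fn example_tree() -> VecTree {
    VecTree {
        children: vec![vec![1, 2], vec![3, 4], vec![], vec![], vec![]],
    }
}
```

Calling `example_tree().postorder()` visits the two leaves under node 1 first, then node 1, then node 2, and the root last.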

Using phylo

Most of the functionality is implemented in [crate::tree::simple_rtree].

The [crate::tree::ops] module deals with phylogenetic analyses that require tree mutations, such as SPR and NNI.

The [crate::tree::simulation] module is used to simulate random trees.

The [crate::tree::io] module is used to read trees from various encodings.

The [crate::tree::distances] module is used to compute various types of distances between nodes in a tree and between trees.

The [crate::iter] module is a helper module that provides tree traversals and iterations.

Building trees

The simplest way to build a tree is to create a tree with a root node and then add children to the nodes already in the tree:

use phylo::prelude::*;

let mut tree = PhyloTree::new(1);

let new_node = PhyloNode::new(2);
tree.add_child(tree.get_root_id(), new_node);
let new_node = PhyloNode::new(3);
tree.add_child(tree.get_root_id(), new_node);
let new_node: PhyloNode = PhyloNode::new(4);
tree.add_child(2, new_node);
let new_node: PhyloNode = PhyloNode::new(5);
tree.add_child(2, new_node);
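
The calls above address nodes by integer id. As a rough, self-contained illustration of that idea, the sketch below rebuilds the same five-node tree with a hypothetical `IdTree` type written against the standard library only. It is not the `PhyloTree` implementation, and the error handling (rejecting unknown parents and duplicate ids) is an assumption made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical id-addressed tree for illustration; not the `PhyloTree` type.
struct IdTree {
    root: usize,
    parent: HashMap<usize, Option<usize>>,
    children: HashMap<usize, Vec<usize>>,
}

impl IdTree {
    /// Creates a tree that contains only the root node.
    fn new(root: usize) -> Self {
        IdTree {
            root,
            parent: HashMap::from([(root, None)]),
            children: HashMap::from([(root, Vec::new())]),
        }
    }

    /// Attaches `child` under `parent`; fails on unknown parents or duplicate ids.
    fn add_child(&mut self, parent: usize, child: usize) -> Result<(), String> {
        if self.children.contains_key(&child) {
            return Err(format!("node {} is already in the tree", child));
        }
        match self.children.get_mut(&parent) {
            Some(kids) => kids.push(child),
            None => return Err(format!("parent {} is not in the tree", parent)),
        }
        self.children.insert(child, Vec::new());
        self.parent.insert(child, Some(parent));
        Ok(())
    }

    /// Ids of all nodes without children, in ascending order.
    fn leaves(&self) -> Vec<usize> {
        let mut ids: Vec<usize> = self
            .children
            .iter()
            .filter(|(_, kids)| kids.is_empty())
            .map(|(&id, _)| id)
            .collect();
        ids.sort_unstable();
        ids
    }
}

/// Rebuilds the tree from the snippet above: 1 -> (2, 3), 2 -> (4, 5).
fn build_example() -> Result<IdTree, String> {
    let mut tree = IdTree::new(1);
    tree.add_child(tree.root, 2)?;
    tree.add_child(tree.root, 3)?;
    tree.add_child(2, 4)?;
    tree.add_child(2, 5)?;
    Ok(tree)
}
```

Keeping parent and child links in maps keyed by id means a node can be found without walking the tree, at the cost of keeping the two maps consistent on every mutation.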

Reading and writing trees

This library can build trees from strings (or files) encoded in the Newick format:

use phylo::prelude::*;

let input_str = String::from("((A:0.1,B:0.2),C:0.6);");
let tree = PhyloTree::from_newick(input_str.as_bytes()).unwrap();
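
For readers unfamiliar with the format, the following is a minimal sketch of how a Newick string maps to nodes. It is a small recursive-descent parser written with the standard library only; the `Node` struct and the functions `parse_newick` and `parse_subtree` are hypothetical and unrelated to how `from_newick` is implemented. It handles nested parentheses, labels, and `:length` suffixes, and ignores quoting and comments.

```rust
/// One parsed node: label (empty if absent), optional branch length, child indices.
#[derive(Debug, Default)]
struct Node {
    label: String,
    length: Option<f32>,
    children: Vec<usize>,
}

/// Parses a Newick string into a flat node list and returns it with the root index.
fn parse_newick(s: &str) -> Result<(Vec<Node>, usize), String> {
    let chars: Vec<char> = s.trim().chars().collect();
    let mut nodes = Vec::new();
    let mut pos = 0;
    let root = parse_subtree(&chars, &mut pos, &mut nodes)?;
    if chars.get(pos) != Some(&';') {
        return Err(format!("expected ';' at position {}", pos));
    }
    Ok((nodes, root))
}

/// Parses one subtree starting at `pos`; children are stored before their parent.
fn parse_subtree(chars: &[char], pos: &mut usize, nodes: &mut Vec<Node>) -> Result<usize, String> {
    let mut node = Node::default();
    if chars.get(*pos) == Some(&'(') {
        loop {
            *pos += 1; // skip '(' on the first pass and ',' on later passes
            node.children.push(parse_subtree(chars, pos, nodes)?);
            match chars.get(*pos).copied() {
                Some(',') => continue,
                Some(')') => {
                    *pos += 1;
                    break;
                }
                _ => return Err(format!("expected ',' or ')' at position {}", pos)),
            }
        }
    }
    // Optional label: everything up to the next structural character.
    while let Some(&c) = chars.get(*pos) {
        if matches!(c, ':' | ',' | ')' | '(' | ';') {
            break;
        }
        node.label.push(c);
        *pos += 1;
    }
    // Optional branch length after ':'.
    if chars.get(*pos) == Some(&':') {
        *pos += 1;
        let start = *pos;
        while let Some(&c) = chars.get(*pos) {
            if matches!(c, ',' | ')' | ';') {
                break;
            }
            *pos += 1;
        }
        let text: String = chars[start..*pos].iter().collect();
        let value = text
            .trim()
            .parse::<f32>()
            .map_err(|e| format!("bad branch length {:?}: {}", text, e))?;
        node.length = Some(value);
    }
    nodes.push(node);
    Ok(nodes.len() - 1)
}
```

Parsing `"((A:0.1,B:0.2),C:0.6);"` with this sketch yields five nodes: the leaves A, B, and C with their branch lengths, one unlabeled internal node, and an unlabeled root stored last.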

Traversing trees

Several traversals are implemented to visit nodes in a particular order, including pre-order (depth-first), breadth-first, and post-order. A traversal returns an [Iterator] of nodes or NodeIDs in the order in which they are to be visited.

use phylo::prelude::*;

let input_str = String::from("((A:0.1,B:0.2),C:0.6);");
let tree = PhyloTree::from_newick(input_str.as_bytes()).unwrap();

let dfs_traversal = tree.dfs(tree.get_root_id()).into_iter();
let bfs_traversal = tree.bfs_ids(tree.get_root_id());
let postfix_traversal = tree.postord_ids(tree.get_root_id());
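
To show how the three visiting orders differ, here is a small standard-library sketch on the same tree shape, ((A,B),C). The functions `preorder`, `bfs`, and `postorder` and the child-list representation are illustrative assumptions, not the iterators that phylo-rs returns.

```rust
use std::collections::VecDeque;

/// Child lists for ((A,B),C): 0 = root, 1 = (A,B), 2 = A, 3 = B, 4 = C.
fn example_children() -> Vec<Vec<usize>> {
    vec![vec![1, 4], vec![2, 3], vec![], vec![], vec![]]
}

/// Depth-first pre-order: a node is visited before its children.
fn preorder(children: &[Vec<usize>], root: usize) -> Vec<usize> {
    let mut out = Vec::new();
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        out.push(node);
        // Reverse so the leftmost child is popped first.
        for &child in children[node].iter().rev() {
            stack.push(child);
        }
    }
    out
}

/// Breadth-first: nodes are visited level by level.
fn bfs(children: &[Vec<usize>], root: usize) -> Vec<usize> {
    let mut out = Vec::new();
    let mut queue = VecDeque::from([root]);
    while let Some(node) = queue.pop_front() {
        out.push(node);
        queue.extend(children[node].iter().copied());
    }
    out
}

/// Post-order: a node is visited after all of its children.
fn postorder(children: &[Vec<usize>], root: usize) -> Vec<usize> {
    fn visit(children: &[Vec<usize>], node: usize, out: &mut Vec<usize>) {
        for &child in &children[node] {
            visit(children, child, out);
        }
        out.push(node);
    }
    let mut out = Vec::new();
    visit(children, root, &mut out);
    out
}
```

Post-order is the natural choice for bottom-up computations such as summing branch lengths per subtree, while pre-order suits top-down computations such as node depths.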

Comparing trees

Several metrics taking into account topology and branch lengths are implemented to compare trees with each other:

use phylo::prelude::*;

fn depth(tree: &PhyloTree, node_id: usize) -> f32 {
    tree.depth(node_id) as f32
}

let newick_1 = "((A:0.1,B:0.2):0.6,(C:0.3,D:0.4):0.5);";
let newick_2 = "((D:0.3,C:0.4):0.5,(B:0.2,A:0.1):0.6);";

let mut tree_1 = PhyloTree::from_newick(newick_1.as_bytes()).unwrap();
let mut tree_2 = PhyloTree::from_newick(newick_2.as_bytes()).unwrap();

tree_1.precompute_constant_time_lca();
tree_2.precompute_constant_time_lca();

tree_1.set_zeta(depth);
tree_2.set_zeta(depth);


let ca = tree_1.ca(&tree_2);
let cophen = tree_1.cophen_dist(&tree_2, 2);
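
As background for the calls above, the sketch below works through one common definition of a cophenetic distance: each tree is summarized by the depth of the lowest common ancestor of every pair of taxa (including each taxon paired with itself), and two trees are compared by the L^p norm of the difference of those vectors. Whether phylo-rs computes exactly this is an assumption on my part. The `Tree` type, the naive `lca`, and the use of edge counts as depth are hypothetical choices for illustration and do not reflect the constant-time LCA precomputation used above.

```rust
use std::collections::BTreeMap;

/// Tiny rooted tree: `parent[i]` is the parent of node i (None for the root);
/// `leaves` maps taxon names to node ids, iterated in name order.
struct Tree {
    parent: Vec<Option<usize>>,
    leaves: BTreeMap<&'static str, usize>,
}

impl Tree {
    /// Number of edges between `node` and the root.
    fn depth(&self, mut node: usize) -> usize {
        let mut d = 0;
        while let Some(p) = self.parent[node] {
            node = p;
            d += 1;
        }
        d
    }

    /// Naive LCA: lift the deeper node, then climb both until they meet.
    fn lca(&self, mut a: usize, mut b: usize) -> usize {
        let (mut da, mut db) = (self.depth(a), self.depth(b));
        while da > db {
            a = self.parent[a].unwrap();
            da -= 1;
        }
        while db > da {
            b = self.parent[b].unwrap();
            db -= 1;
        }
        while a != b {
            a = self.parent[a].unwrap();
            b = self.parent[b].unwrap();
        }
        a
    }

    /// Depth of the LCA for every taxon pair (i <= j), taxa sorted by name.
    fn cophenetic_vector(&self) -> Vec<f64> {
        let ids: Vec<usize> = self.leaves.values().copied().collect();
        let mut v = Vec::new();
        for i in 0..ids.len() {
            for j in i..ids.len() {
                v.push(self.depth(self.lca(ids[i], ids[j])) as f64);
            }
        }
        v
    }
}

/// L^p norm of the difference of the two cophenetic vectors (same taxa assumed).
fn cophenetic_distance(t1: &Tree, t2: &Tree, p: i32) -> f64 {
    let sum: f64 = t1
        .cophenetic_vector()
        .iter()
        .zip(t2.cophenetic_vector())
        .map(|(x, y)| (x - y).abs().powi(p))
        .sum();
    sum.powf(1.0 / p as f64)
}

fn tree(parent: Vec<Option<usize>>, leaves: &[(&'static str, usize)]) -> Tree {
    Tree { parent, leaves: leaves.iter().copied().collect() }
}

/// ((A,B),(C,D)) with 0 = root, 1 = (A,B), 2 = (C,D).
fn balanced() -> Tree {
    let parent = vec![None, Some(0), Some(0), Some(1), Some(1), Some(2), Some(2)];
    tree(parent, &[("A", 3), ("B", 4), ("C", 5), ("D", 6)])
}

/// ((D,C),(B,A)): the same topology as `balanced` with children swapped.
fn balanced_reordered() -> Tree {
    let parent = vec![None, Some(0), Some(0), Some(1), Some(1), Some(2), Some(2)];
    tree(parent, &[("D", 3), ("C", 4), ("B", 5), ("A", 6)])
}

/// (((A,B),C),D) with 0 = root, 1 = ((A,B),C), 2 = D, 3 = (A,B), 4 = C.
fn caterpillar() -> Tree {
    let parent = vec![None, Some(0), Some(0), Some(1), Some(1), Some(3), Some(3)];
    tree(parent, &[("A", 5), ("B", 6), ("C", 4), ("D", 2)])
}
```

Under this definition the two Newick strings used above, which differ only in the order of children, are at distance zero when depth is counted in edges, whereas a balanced and a caterpillar tree on the same four taxa are not.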

Examples

The following snippets are code examples of some phylogenetic analyses. You can find these in the examples directory of the repository. To visualize the outputs of the examples, please first install the requirements using the following command:

pip install -r examples/visualization/requirements.txt

Quantifying Phylogenetic Diversity

Quantifying the Phylogenetic Diversity of a set of trees using the Faith Index:

#[cfg(feature = "non_crypto_hash")]
use fxhash::FxHashMap as HashMap;
#[cfg(not(feature = "non_crypto_hash"))]
use std::collections::HashMap;

use itertools::Itertools;
use phylo::prelude::*;
use std::fs::read_to_string;

fn main() {
    // Map each clade directory name to a map from year to tree.
    // The year is the part of each file name before the first '-'.
    let paths: HashMap<_, _> = std::fs::read_dir("examples/phylogenetic-diversity/trees")
        .unwrap()
        .map(|x| {
            (
                x.as_ref().unwrap().file_name().into_string().unwrap(),
                std::fs::read_dir(x.unwrap().path())
                    .unwrap()
                    .map(|f| {
                        (
                            f.as_ref().unwrap().file_name().into_string().unwrap()
                                .split("-").map(|x| x.to_string()).collect_vec()[0].clone(),
                            PhyloTree::from_newick(read_to_string(f.unwrap().path()).unwrap().as_bytes()).unwrap(),
                        )
                    })
                    .collect::<HashMap<_, _>>(),
            )
        })
        .collect();

    for (clade, trees) in paths.iter() {
        println!("Clade: {}", clade);
        // Per-year diversity values for this clade, in year order.
        let mut pds = vec![];
        for year in 2015..2023 {
            match trees.get(&year.to_string()) {
                Some(t) => {
                    // Sum of all branch lengths; nodes without a weight count as 0.
                    let pd = t.get_nodes().map(|n| n.get_weight().unwrap_or(0.0)).sum::<f32>();
                    println!("{}: {}", year, pd);
                    pds.push(pd);
                }
                // No tree for this year: report a diversity of 0.
                None => {
                    println!("{}: {}", year, 0.0);
                    pds.push(0.0);
                }
            };
        }
    }
}

The variations in Phylogenetic Diversity can be visualized using the python script examples/visualization/pd.py.
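
The example above sums every branch length in a tree. A related quantity is the diversity of a subset of taxa, taken here as the total length of the branches on the paths from those taxa to the root, with shared branches counted once. The sketch below illustrates that computation with the standard library only; `WeightedTree` and `faith_pd` are hypothetical names, and the rooted convention (paths run all the way to the root) is an assumption of this example rather than something taken from phylo-rs.

```rust
use std::collections::HashSet;

/// `parent[i]` is the parent of node i (None for the root);
/// `length[i]` is the length of the branch above node i.
struct WeightedTree {
    parent: Vec<Option<usize>>,
    length: Vec<f32>,
}

/// Sum of branch lengths on the union of root paths of `taxa`.
fn faith_pd(tree: &WeightedTree, taxa: &[usize]) -> f32 {
    let mut counted = HashSet::new();
    let mut total = 0.0;
    for &leaf in taxa {
        let mut node = leaf;
        // Climb towards the root; stop early once we reach a branch that was
        // already counted, because everything above it was counted as well.
        while let Some(p) = tree.parent[node] {
            if !counted.insert(node) {
                break;
            }
            total += tree.length[node];
            node = p;
        }
    }
    total
}

/// ((A:0.1,B:0.2):0.6,(C:0.3,D:0.4):0.5) with
/// 0 = root, 1 = (A,B), 2 = (C,D), 3 = A, 4 = B, 5 = C, 6 = D.
fn example_tree() -> WeightedTree {
    WeightedTree {
        parent: vec![None, Some(0), Some(0), Some(1), Some(1), Some(2), Some(2)],
        length: vec![0.0, 0.6, 0.5, 0.1, 0.2, 0.3, 0.4],
    }
}
```

With all four taxa selected, the result equals the sum of every branch length, which is the quantity the example program prints per year.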

Visualizing Phylogenetic Tree Space

Here, we compute all pairwise distances between a set of trees (described as RF distances; the code below calls the `ca` method):

use indicatif::{ProgressBar, ProgressIterator, ProgressStyle};
use itertools::Itertools;
use phylo::prelude::*;
use std::fs::read_to_string;

fn main() {
    // Read the ten tree files; each line holds one Newick tree.
    // Each entry is (file index, line index, tree).
    let trees = (1..11)
        .progress()
        .map(|x| {
            read_to_string(format!("examples/pairwise-distances/r{x}-preprocessed.trees"))
                .unwrap()
                .lines()
                .enumerate()
                .map(|(y, z)| (x, y, PhyloTree::from_newick(z.as_bytes()).unwrap()))
                .collect_vec()
        })
        .flatten()
        .collect_vec();

    // One progress tick per unordered pair of trees.
    let bar = ProgressBar::new((trees.len() * (trees.len() - 1) / 2) as u64);
    bar.set_style(
        ProgressStyle::with_template("[{elapsed_precise}] {bar:40.cyan/blue} {pos:>7}/{len:7} {msg} [eta: {eta}]")
            .unwrap()
            .progress_chars("##-"),
    );

    trees.iter().combinations(2).map(|v| (v[0], v[1])).for_each(|(x, y)| {
        // Output fields: both file indices, both line indices, then the distance.
        let out = format!("{}-{}-{}-{}-{}\n", x.0, y.0, x.1, y.1, x.2.ca(&y.2));
        println!("{}", out);
        bar.inc(1);
    });
    bar.finish();
}

The tree space can be visualized using the python script examples/visualization/tree-space.py.
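
For reference, a rooted Robinson-Foulds distance can be sketched as the number of clusters (sets of taxa below an internal node) that appear in exactly one of the two trees. The code below is a standard-library illustration of that definition together with an all-pairs loop; the `Shape` enum and the functions `clusters`, `rooted_rf`, and `pairwise_rf` are hypothetical and are not the phylo-rs implementation.

```rust
use std::collections::BTreeSet;

/// Minimal nested tree shape: a named leaf or an internal node with children.
enum Shape {
    Leaf(&'static str),
    Node(Vec<Shape>),
}

type Cluster = BTreeSet<&'static str>;

/// Returns the taxa below `t` and records the cluster of every internal node in `out`.
fn clusters(t: &Shape, out: &mut BTreeSet<Cluster>) -> Cluster {
    match t {
        Shape::Leaf(name) => BTreeSet::from([*name]),
        Shape::Node(kids) => {
            let mut all = Cluster::new();
            for kid in kids {
                all.extend(clusters(kid, out));
            }
            out.insert(all.clone());
            all
        }
    }
}

/// Number of clusters present in exactly one of the two trees.
fn rooted_rf(a: &Shape, b: &Shape) -> usize {
    let (mut ca, mut cb) = (BTreeSet::new(), BTreeSet::new());
    clusters(a, &mut ca);
    clusters(b, &mut cb);
    ca.symmetric_difference(&cb).count()
}

/// Distances for all unordered pairs (i, j) with i < j.
fn pairwise_rf(trees: &[Shape]) -> Vec<(usize, usize, usize)> {
    let mut out = Vec::new();
    for i in 0..trees.len() {
        for j in (i + 1)..trees.len() {
            out.push((i, j, rooted_rf(&trees[i], &trees[j])));
        }
    }
    out
}

fn leaf(name: &'static str) -> Shape {
    Shape::Leaf(name)
}

fn node(kids: Vec<Shape>) -> Shape {
    Shape::Node(kids)
}

/// ((A,B),(C,D))
fn balanced() -> Shape {
    node(vec![node(vec![leaf("A"), leaf("B")]), node(vec![leaf("C"), leaf("D")])])
}

/// ((D,C),(B,A)): same topology as `balanced`, children swapped.
fn balanced_reordered() -> Shape {
    node(vec![node(vec![leaf("D"), leaf("C")]), node(vec![leaf("B"), leaf("A")])])
}

/// (((A,B),C),D)
fn caterpillar() -> Shape {
    node(vec![node(vec![node(vec![leaf("A"), leaf("B")]), leaf("C")]), leaf("D")])
}
```

Because clusters are stored as sorted sets, the order of children does not affect the result, so `balanced()` and `balanced_reordered()` are at distance zero.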

To run the code examples, run the following:

cargo run --example phylogenetic-diversity
cargo run --example pairwise-distances

