Empirical Example

The purpose of this notebook is to demonstrate ipcoal simulations on a topology inferred from empirical data. We provide recommendations for how to scale units from a time-calibrated phylogeny to use in coalescent simulations, and how to incorporate biological information about species, such as generation times and population sizes, to perform more realistic simulations.

Simulating coalescent genealogies and sequences on a parameterized species tree model using ipcoal can provide a null expectation for the amount of discordance that you expect to observe across different nodes of a species tree, and can even be used as a posterior predictive tool for phylogenetic analyses.

[1]:
import numpy as np
import pandas as pd
import ipcoal
import toytree
import toyplot
colormap = toyplot.color.brewer.map("BlueRed", reverse=True)

Mammal phylogeny data set

In this example we use published data for mammals. We use a time-calibrated maximum clade credibility (MCC) phylogeny from Upham et al. (2019) as a species tree hypothesis; geographic range areas from the PanTHERIA database as a proxy for effective population sizes; and generation time estimates from the Pacifici et al. (2014) data set, which fills in many values that are missing from PanTHERIA by using mean values among close relatives.

[2]:
# load the phylogenetic data (big tree, takes a few seconds)
TREE_URL = (
    "https://github.com/eaton-lab/ipcoal/blob/master/"
    "notebooks/mammal_dat/MamPhy_fullPosterior_BDvr_DNAonly"
    "_4098sp_topoFree_NDexp_MCC_v2_target.tre?raw=true"
)
tree = toytree.tree(TREE_URL, tree_format=10)
print(tree.ntips, "tips in the Upham mammal tree")
4100 tips in the Upham mammal tree
[3]:
# load the mammal biological data (e.g., geo range)
PANTH_URL = (
    "https://github.com/eaton-lab/ipcoal/blob/master/"
    "notebooks/mammal_dat/PanTHERIA_1-0_WR05_Aug2008.txt?raw=true"
)
panthdf = pd.read_csv(PANTH_URL, sep='\t')
print(panthdf.shape[0], "taxa in PanTHERIA database")
5416 taxa in PanTHERIA database
[4]:
# load the generation time data
GT_URL = (
    "https://github.com/eaton-lab/ipcoal/blob/master/"
    "notebooks/mammal_dat/5734-SP-2-Editor.csv?raw=true"
)
gentimedf = pd.read_csv(GT_URL)
print(gentimedf.shape[0], "taxa in Pacifici gentime database")
5427 taxa in Pacifici gentime database

Filtering and selecting taxa

We first trim the data down to include only taxa that are shared among all three data sources and that have no missing biological data. This reduces the data set to 3121 taxa. Geographic range areas (georange) are in units of square kilometers and generation times (gentime) are in units of years.

[5]:
# subselect species names and geo range columns from pantheria
sppdata = panthdf.loc[:, ['MSW05_Binomial', '26-1_GR_Area_km2']]

# rename sppdata columns
sppdata.columns = ["species", "georange"]
[6]:
# make column to record tree tip label names
sppdata["treename"] = np.nan

# dict map: {gen}_{spp} to {gen}_{spp}_{fam}_{order}
tipdict = {i.rsplit("_", 2)[0]: i for i in tree.get_tip_labels()}

# record whether species in pantheria is in the tree tip labels
for idx in sppdata.index:

    # match data names to tree names which have underscores
    name = sppdata.species[idx]
    name_ = name.replace(" ", "_")

    # record treename if it is in the database
    if name_ in tipdict:
        sppdata.loc[idx, "treename"] = tipdict[name_]
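The name matching in the cell above depends on how tip labels are shortened. The following is a minimal sketch of that step in plain Python, assuming tip labels follow the {genus}_{species}_{FAMILY}_{ORDER} pattern; the name "Mustela_fakeus" is a made-up example of a species absent from the tree.

```python
# tip labels in the {genus}_{species}_{FAMILY}_{ORDER} format
tip_labels = [
    "Mustela_nivalis_MUSTELIDAE_CARNIVORA",
    "Neovison_vison_MUSTELIDAE_CARNIVORA",
]

# drop the last two underscore-delimited fields (family and order)
tipdict = {label.rsplit("_", 2)[0]: label for label in tip_labels}

# database names use spaces, so replace them with underscores before lookup
key = "Mustela nivalis".replace(" ", "_")
match = tipdict.get(key)

# a name that is not in the tree returns None (made-up species name)
missing = tipdict.get("Mustela_fakeus")

print(match, missing)
```

Splitting from the right keeps the genus and species together even though both are joined by the same underscore character as the family and order fields.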
[7]:
# add gentime values to all species matching to names in Pacifici data set
sppdata["gentime"] = np.nan
for idx in gentimedf.index:

    # get generation time in units of years
    species, gent = gentimedf.loc[idx, ["Scientific_name", "GenerationLength_d"]]
    mask = sppdata.species == species
    sppdata.loc[mask.values, "gentime"] = gent / 365.
[8]:
# set missing data (-999) to NaN
sppdata[sppdata == -999.000] = np.nan

# remove rows where georange, gentime, or treename is missing
mask = sppdata.georange.notna() & sppdata.gentime.notna() & sppdata.treename.notna()
sppdata = sppdata.loc[mask, :]

# reorder and reset index for dropped rows
sppdata.sort_values(by="species", inplace=True)
sppdata.reset_index(drop=True, inplace=True)

# show first ten sorted rows
sppdata.head(10)
[8]:
   species                 georange   treename                     gentime
0  Abeomelomys sevia       53261.73   Abeomelomys_sevia_MURIDA...  1.710684
1  Abrocoma bennettii      54615.98   Abrocoma_bennettii_ABROC...  2.829928
2  Abrocoma boliviensis    5773.97    Abrocoma_boliviensis_ABR...  2.829928
3  Abrocoma cinerea        381391.02  Abrocoma_cinerea_ABROCOM...  2.829928
4  Abrothrix andinus       722551.83  Abrothrix_andinus_CRICET...  1.614762
5  Abrothrix hershkovitzi  1775.72    Abrothrix_hershkovitzi_C...  1.614762
6  Abrothrix illuteus      35359.55   Abrothrix_illuteus_CRICE...  1.614762
7  Abrothrix jelskii       506394.71  Abrothrix_jelskii_CRICET...  1.614762
8  Abrothrix lanosus       43016.67   Abrothrix_lanosus_CRICET...  1.614762
9  Abrothrix longipilis    423823.71  Abrothrix_longipilis_CRI...  1.614762

Filter the tree to include only taxa in the data table

[9]:
# find names in tree but not in data table
names_in_data = set(sppdata.treename)
names_in_tree = set(tree.get_tip_labels())
names_to_remove = names_in_tree.difference(names_in_data)
[10]:
# drop the tips from the tree not in data table
ftree = tree.drop_tips(names_to_remove)
print(len(ftree), "tips in filtered tree (ftree)")
3121 tips in filtered tree (ftree)

Convert geographic ranges to Ne values

Here we generate Ne values within a selected range (min_Ne to max_Ne), scaled linearly by geographic range area: the species with the largest range is assigned max_Ne, and any value that falls below min_Ne is raised to min_Ne. The distribution is plotted as a histogram with a log-scaled y-axis. Many taxa have small Ne; few have very large Ne.

[11]:
# transform georange into Ne values within selected range
max_Ne = 1000000
min_Ne = 1000

# set Ne values in range scaled by geographic ranges
Ne = max_Ne * (sppdata.georange / sppdata.georange.max())
Ne = [max(min_Ne, i) for i in Ne]
sppdata["Ne"] = np.array(Ne, dtype=int)

# show 10 random samples
sppdata.sample(10)
[11]:
      species                 georange     treename                     gentime    Ne
2475  Rattus tunneyi          689460.95    Rattus_tunneyi_MURIDAE_R...  1.465093   10937
809   Eidolon dupreanum       594211.19    Eidolon_dupreanum_PTEROP...  6.000000   9426
1429  Marmosa mexicana        905454.61    Marmosa_mexicana_DIDELPH...  1.557911   14364
2934  Tamiasciurus mearnsi    2768.05      Tamiasciurus_mearnsi_SCI...  3.982922   1000
1268  Leopardus pardalis      16040024.42  Leopardus_pardalis_FELID...  8.251716   254465
2386  Pseudomys fumeus        36320.16     Pseudomys_fumeus_MURIDAE...  2.423789   1000
1162  Hylobates pileatus      140831.03    Hylobates_pileatus_HYLOB...  15.000000  2234
1614  Microtus xanthognathus  2255416.57   Microtus_xanthognathus_C...  1.023627   35780
674   Cynogale bennettii      170977.09    Cynogale_bennettii_VIVER...  5.000000   2712
2226  Philetor brachypterus   747045.55    Philetor_brachypterus_VE...  5.621003   11851
[12]:
# plot a histogram of Ne values
a, b = np.histogram(sppdata.Ne, bins=25)
toyplot.bars((a, b), height=300, yscale="log", ylabel="bin count", xlabel="Ne");
[Figure: histogram of Ne values. x-axis: Ne (0 to 1,000,000); y-axis: bin count (log scale).]
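The scaling used in this section can be checked by hand on a few numbers. The sketch below repeats the same arithmetic in plain Python on three made-up range areas; the function name scale_to_ne and the input values are illustrative assumptions, not part of ipcoal or the data set.

```python
def scale_to_ne(georanges, min_ne=1_000, max_ne=1_000_000):
    """Scale range areas linearly so the largest maps to max_ne, floored at min_ne."""
    largest = max(georanges)
    return [max(min_ne, int(max_ne * (area / largest))) for area in georanges]


# made-up geographic range areas in km^2
areas = [100.0, 250_000.0, 1_000_000.0]
ne_values = scale_to_ne(areas)
print(ne_values)
```

Because the scaling is linear and range areas span several orders of magnitude, every species whose range is smaller than min_Ne / max_Ne of the largest range ends up at the floor value. This is why many sampled rows show an Ne of exactly 1000.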

Set Ne and g values for tip and ancestral nodes on the tree object

ipcoal can accept different Ne and g values to use in simulations, and the easiest way to set variable values across different parts of the tree is to map the values to the tree object that ipcoal accepts as an argument. We only have estimates of Ne and g for species that are alive today, but it would be useful to also include estimates for ancestral nodes in the species tree. Here we use a simple ancestral state reconstruction based on Brownian motion to infer states for ancestral nodes.

[13]:
# make a copy of the filtered tree
tree_ng = ftree.copy()

# dictionaries mapping names to values
dict_ne = {sppdata.treename[i]: sppdata.Ne[i] for i in range(sppdata.shape[0])}
dict_gt = {sppdata.treename[i]: sppdata.gentime[i] for i in range(sppdata.shape[0])}

# set values on nodes of the tree for all species (tips)
tree_ng = tree_ng.set_node_values("Ne", dict_ne)
tree_ng = tree_ng.set_node_values("g", dict_gt)

# estimate and set values on ancestral nodes as well.
tree_ng = tree_ng.pcm.ancestral_state_reconstruction("g")
tree_ng = tree_ng.pcm.ancestral_state_reconstruction("Ne")
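To give some intuition for what a Brownian motion reconstruction does, here is a minimal sketch for a single ancestor with two descendant tips. Under Brownian motion the variance of a trait grows in proportion to edge length, so the estimate is a mean of the two tip values weighted by inverse edge lengths. This is an illustration under that assumption only; it is not necessarily the algorithm toytree runs internally, and the function name and values are made up.

```python
def bm_ancestor_estimate(x1, v1, x2, v2):
    """Mean of two tip values, each weighted by the inverse of its edge length."""
    w1, w2 = 1.0 / v1, 1.0 / v2
    return (w1 * x1 + w2 * x2) / (w1 + w2)


# made-up sister species: generation times (years) and edge lengths (My)
equal = bm_ancestor_estimate(2.0, 1.0, 4.0, 1.0)    # equal edges: simple mean
unequal = bm_ancestor_estimate(2.0, 1.0, 4.0, 3.0)  # closer tip counts for more
print(equal, unequal)
```

With equal edge lengths the ancestor sits midway between its descendants; with unequal lengths it is pulled toward the tip on the shorter edge.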

Plot tree with Ne and g values

Let’s plot just a subset of taxa to start, since this is much easier to visualize than the entire tree. Here we select only the taxa in the genus Mustela. The tree plot shows variation in Ne using the thickness of edges, and generation times are shown by the color of nodes, from blue to red, representing shorter to longer times. The ts='p' drawing option automatically pulls the Ne information from the nodes of the tree to draw the edge thickness.

[14]:
# make a tree copy
atree = tree_ng.copy()

# get ancestor of all tips that have 'Mustela' in their name
mrca_node_idx = atree.get_mrca_idx_from_tip_labels(wildcard="Mustela_")

# get the TreeNode object of this subtree
node = atree.get_feature_dict("idx")[mrca_node_idx]

# create as a new Toytree
subtree = toytree.tree(node)

# scale the tree height from millions of years to years
subtree = subtree.mod.node_scale_root_height(subtree.treenode.height * 1e6)
[15]:
subtree.draw(
    ts='p',
    edge_type='p',
    node_sizes=10,
    node_labels=False,
    node_colors=[
        colormap.colors(i, 0.1, 10) for i in subtree.get_node_values('g', 1, 1)
    ],
    width=400,
    height=600,
);
[Figure: Mustela subtree (15 tips, including Neovison vison). Edge thickness shows Ne and node color shows generation time; time axis in years, from 0 to about 8.1 million.]

Convert edge lengths from time to generations

Time in years is converted to units of generations by dividing each edge length by the generation time for that edge (g, recorded in years per generation). After this conversion the crown root age of the Mustela tree is about 2.4M generations from the furthest tip in the tree. This tree object (ttree) now carries Ne values mapped to its nodes and edge lengths in units of generations, which together represent the data on population sizes and generation time differences among species and their ancestors. This is the tree we will use for our ipcoal simulations.

[16]:
# divide the edge lengths (in absolute time) by the generation time
ttree = subtree.set_node_values(
    "dist",
    {i.name: i.dist / i.g for i in subtree.get_feature_dict()}
)
[17]:
ttree.draw(
    ts='p',
    edge_type='p',
    tip_labels_align=True,
    tip_labels=[i.rsplit("_", 2)[0] for i in ttree.get_tip_labels()],
    node_labels=False,
    node_sizes=0,
    width=400,
    height=400,
);
[Figure: Mustela subtree with edge lengths in units of generations; axis from 0 to about 2.4 million generations.]
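The unit conversion in this section is a single division per edge. As a worked example under made-up values, the sketch below converts two edges of equal duration but different generation times; the function name and the edge values are hypothetical.

```python
def edge_in_generations(dist_years, gen_time_years):
    """Convert an edge length in years to generations (years / years per generation)."""
    if gen_time_years <= 0:
        raise ValueError("generation time must be positive")
    return dist_years / gen_time_years


# made-up edges: (length in years, generation time in years per generation)
edges = {"short_lived": (1.2e6, 2.0), "long_lived": (1.2e6, 8.0)}
ngens = {name: edge_in_generations(d, g) for name, (d, g) in edges.items()}
print(ngens)
```

Two lineages that have been separated for the same number of years can therefore differ several-fold in the number of generations available for coalescence, which is why the converted tree is no longer ultrametric.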

Examine the distribution of genealogy lengths

Here we finally start to simulate data. ipcoal inherits the tree topology, edge lengths, and Ne values from the tree object to set up the coalescent simulation parameters. We leave the remaining parameters (e.g., mutation rate and recombination rate) at their default values. Let’s first simulate a large chromosome and examine the lengths of genealogical blocks across the genome. If the blocks are very short, then at the scale of our data set recombination is likely to have occurred within the length of a single locus.

[21]:
# initialize the ipcoal model object
mod = ipcoal.Model(ttree, seed=333)

# simulate 1 chromosome 1Mb in length
mod.sim_trees(nloci=1, nsites=1e6)

# print the average length of a non-recombined region (tidx)
print(mod.df.nbps.mean())
131.49243918474687
[22]:
canvas, axes, mark = toyplot.bars(
    np.histogram(mod.df.nbps, 50, range=[0, 1000]),
    width=300,
    height=300,
    xlabel="gene tree length (bp)",
    ylabel="bin count",
    label="The size of non-recombined genomic blocks",
)
[Figure: histogram titled "The size of non-recombined genomic blocks". x-axis: gene tree length (bp), 0 to 1000; y-axis: bin count.]
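The block lengths summarized above are just the distances between consecutive recombination breakpoints. A minimal sketch of that calculation in plain Python, using made-up breakpoint positions, also reports the fraction of blocks shorter than a 500 bp marker:

```python
from statistics import mean


def block_lengths(breakpoints):
    """Lengths of the intervals between consecutive sorted breakpoints."""
    return [end - start for start, end in zip(breakpoints, breakpoints[1:])]


# made-up breakpoint positions (bp) along a 1500 bp region
breakpoints = [0, 41, 309, 1000, 1130, 1500]
lengths = block_lengths(breakpoints)

# average block length and the share of blocks shorter than 500 bp
avg = mean(lengths)
frac_short = sum(n < 500 for n in lengths) / len(lengths)
print(lengths, avg, frac_short)
```

Reporting the fraction of short blocks alongside the mean is useful because block lengths are strongly right-skewed, so the mean alone can hide how many blocks fall below the size of a typical locus.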

Simulate unlinked loci (e.g., UCE)

The figure above should give us some concern. The average length of a non-recombined region is far less than 500 bp, so a typical phylogenetic marker, such as a UCE, may actually span multiple distinct genealogies (i.e., it is effectively a concatenation). If you think recombination is suppressed in your marker region (e.g., a UCE), you could lower the per-site per-generation recombination rate (and similarly you may want to lower the mutation rate). Here we set both an order of magnitude lower than the default arguments in ipcoal.

[23]:
# initialize the ipcoal model object
mod = ipcoal.Model(ttree, seed=333, recomb=1e-10, mut=1e-9)

# simulate many 1000bp loci
mod.sim_loci(nloci=50, nsites=1000)

# show the dataframe of genealogy results
mod.df.head()
[23]:
   locus  start  end   nbps  nsnps  tidx  genealogy
0  0      0      407   407   7      0     ((Mustela_kathiah_MUSTEL...
1  0      407    1000  593   10     1     ((Mustela_kathiah_MUSTEL...
2  1      0      41    41    1      0     ((Neovison_vison_MUSTELI...
3  1      41     309   268   7      1     ((Neovison_vison_MUSTELI...
4  1      309    1000  691   10     2     ((Neovison_vison_MUSTELI...
[24]:
# how many genealogies are inside a locus on average?
mod.df.groupby("locus").apply(len).mean()
[24]:
1.62
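The per-locus count above is the number of table rows that share a locus index. As a sketch of the same calculation without pandas, the example below counts the five rows shown in the table head (locus, start, end); the result applies only to those five rows, not to the full simulation.

```python
from collections import Counter

# (locus, start, end) for the five rows shown in the table head above
rows = [
    (0, 0, 407),
    (0, 407, 1000),
    (1, 0, 41),
    (1, 41, 309),
    (1, 309, 1000),
]

# count how many genealogies (rows) fall inside each locus
per_locus = Counter(locus for locus, _start, _end in rows)

# average number of genealogies per locus across these two loci
mean_per_locus = sum(per_locus.values()) / len(per_locus)
print(dict(per_locus), mean_per_locus)
```

A value above 1 means that at least some loci contain an internal recombination breakpoint, so a gene tree inferred from the whole locus summarizes more than one genealogy.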
[25]:
# a dictionary of tree drawing styles that we will use
kwargs = {
    "tip_labels_align": True,
    "tip_labels_style": {"font-size": "9px"},
}
[26]:
# randomly sample 8 genealogy indices
rand = mod.df.sample(8).index

# draw linked genealogies
toytree.mtree(mod.df.genealogy[rand]).draw_tree_grid(ncols=4, nrows=2, **kwargs);
[Figure: grid (4 columns by 2 rows) of eight randomly sampled genealogies of the 15 Mustela and Neovison tips.]

Examine expected variation in inferred gene trees

The substitution process is modeled on top of the simulated genealogies to produce sequence data, and ML gene trees are inferred from the resulting sequence data. It can be useful to compare inferred trees to the true genealogies to find which parts of the species tree are most difficult to infer, i.e., which splits are most affected not only by genealogical variation but also by gene tree estimation error caused by low information content or homoplasy.
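One common way to quantify the difference between an inferred gene tree and its true genealogy is the Robinson-Foulds distance, which counts the bipartitions found in one tree but not the other. The sketch below is a self-contained illustration on made-up five-tip trees written as nested tuples; it does not use the toytree or ipcoal API, and all names in it are hypothetical.

```python
def collect_clades(tree, found):
    """Return the leaf set of a nested-tuple tree; append every clade to found."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(collect_clades(child, found) for child in tree))
    found.append(leaves)
    return leaves


def splits(tree):
    """Set of non-trivial bipartitions, each stored as an unordered pair of sides."""
    found = []
    all_leaves = collect_clades(tree, found)
    result = set()
    for clade in found:
        other = all_leaves - clade
        if len(clade) > 1 and len(other) > 1:
            result.add(frozenset([clade, other]))
    return result


def rf_distance(tree1, tree2):
    """Number of bipartitions present in exactly one of the two trees."""
    return len(splits(tree1) ^ splits(tree2))


# made-up true genealogy and inferred gene tree that differ in one grouping
true_tree = ((("A", "B"), "C"), ("D", "E"))
inferred = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(true_tree, inferred), rf_distance(true_tree, true_tree))
```

Storing each bipartition as an unordered pair of sides makes the comparison independent of where a tree is rooted, which matters here because the inferred gene trees are unrooted.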

As a next step you would likely want to take these inferred gene trees and estimate a species tree using ASTRAL or a similar tool that takes gene trees as input.

[27]:
# infer a gene tree for each LOCUS
mod.infer_gene_trees()
[28]:
# show resulting true genealogies and inferred gene trees
mod.df.head()
[28]:
   locus  start  end   nbps  nsnps  tidx  genealogy                    inferred_tree
0  0      0      407   407   7      0     ((Mustela_kathiah_MUSTEL...  ((Mustela_frenata_MUSTEL...
1  0      407    1000  593   10     1     ((Mustela_kathiah_MUSTEL...  ((Mustela_frenata_MUSTEL...
2  1      0      41    41    1      0     ((Neovison_vison_MUSTELI...  (Mustela_frenata_MUSTELI...
3  1      41     309   268   7      1     ((Neovison_vison_MUSTELI...  (Mustela_frenata_MUSTELI...
4  1      309    1000  691   10     2     ((Neovison_vison_MUSTELI...  (Mustela_frenata_MUSTELI...
[30]:
# draw inferred unrooted gene trees
rand = mod.df.sample(8).index
toytree.mtree(mod.df.inferred_tree[rand]).draw_tree_grid(ncols=4, nrows=2, **kwargs);
[Figure: grid (4 columns by 2 rows) of eight randomly sampled inferred (unrooted) gene trees.]
[56]:
# get consensus of all inferred ML trees
ctree = toytree.mtree(mod.df.inferred_tree).get_consensus_tree()

# reroot tree on node 24
ctree = ctree.root(ctree.get_feature_dict("idx")[24].get_leaf_names())

# draw the consensus tree
ctree.draw(
    ts='n',
    edge_type='p',
    node_labels="support",
    use_edge_lengths=False,
);
[Figure: rerooted consensus tree of the inferred gene trees, with support values shown as node labels.]
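Support values on a consensus tree are typically the percentage of input trees that contain a given split. The sketch below shows that counting step on four made-up trees written as nested tuples. It illustrates the general idea only and makes no claim about how toytree computes its consensus; the function names are hypothetical.

```python
from collections import Counter


def leaf_sets(tree, out):
    """Return the leaf set of a nested-tuple tree; append every clade to out."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.append(leaves)
    return leaves


def split_support(trees):
    """Percentage of trees containing each non-trivial split.

    Each split is keyed by the side that excludes the alphabetically first
    tip, so the same split is counted once however a tree is rooted.
    """
    counts = Counter()
    for tree in trees:
        found = []
        all_leaves = leaf_sets(tree, found)
        ref = min(all_leaves)
        tree_splits = set()
        for clade in found:
            side = all_leaves - clade if ref in clade else clade
            if 1 < len(side) < len(all_leaves) - 1:
                tree_splits.add(side)
        counts.update(tree_splits)
    return {split: 100 * n / len(trees) for split, n in counts.items()}


# four made-up gene trees on the same five tips
trees = [
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "C"), "B"), ("D", "E")),
    ((("A", "B"), "D"), ("C", "E")),
]
support = split_support(trees)
print(support[frozenset("DE")], support[frozenset("CE")])
```

Low support for a split in such a summary can reflect either true genealogical discordance or gene tree estimation error, which is why comparing against the known simulated genealogies is informative.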

Simulate on a larger clade

Let’s scale up to Mustelidae, the whole family containing weasels, ferrets, otters, etc.

[44]:
# make a tree copy
atree = tree_ng.copy()

# get ancestor of all tips that have 'MUSTELIDAE' in their name
mrca_node_idx = atree.get_mrca_idx_from_tip_labels(wildcard="MUSTELIDAE")

# make new toytree from selected node
node = atree.get_feature_dict("idx")[mrca_node_idx]
subtree = toytree.tree(node)

# scale the tree height from millions of years to years
subtree = subtree.mod.node_scale_root_height(subtree.treenode.height * 1e6)
[46]:
canvas, axes = subtree.draw(
    ts='p',
    edge_type='p',
    node_sizes=10,
    node_labels=False,
    node_colors=[
        colormap.colors(i, 0.1, 10) for i in subtree.get_node_values('g', 1, 1)
    ],
    width=800,
    height=600,
);

# set y axis ticks at 2My intervals and fit long tip names
axes.y.ticks.locator = toyplot.locator.Explicit(
    range(0, int(2e7), int(2e6)),
    [str(int(i / 1e6)) for i in range(0, int(2e7), int(2e6))],
)
axes.y.domain.min = -20e6
[Figure: Mustelidae subtree. Edge thickness shows Ne and node color shows generation time; time axis labeled in millions of years, with ticks every 2 My from 0 to 18.]
[47]:
# convert time to generations
ttree = subtree.set_node_values(
    "dist",
    {i.name: i.dist / i.g for i in subtree.get_feature_dict()}
)
[48]:
ttree.draw(
    ts='p',
    edge_type='p',
    tip_labels_align=True,
    node_labels=False,
    node_sizes=0,
    width=800,
    height=600,
);
[Figure: Mustelidae subtree with edge lengths in units of generations; axis from 0 to about 4.9 million generations.]