How to conduct a maximum likelihood analysis on Gyra

For this to work, you need X-forwarding turned on for your connection to gyra.

To determine if X-forwarding is on, open a terminal, log on to gyra, and issue the command:

$ xclock

You should see a clock if it's working.
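
A quick non-graphical check is also possible: when X-forwarding is active, ssh normally sets the DISPLAY environment variable in the remote shell. The sketch below only reports whether DISPLAY is set. The function name x_forwarding_hint is made up for illustration, and a set DISPLAY does not guarantee that windows can actually be opened, so xclock remains the real test.

```shell
# Report whether DISPLAY is set in the current shell.
# x_forwarding_hint is a hypothetical helper name, not a command installed on gyra.
x_forwarding_hint() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "DISPLAY is set to ${DISPLAY}: X-forwarding is probably on"
    else
        echo "DISPLAY is not set: X-forwarding is probably off"
    fi
}
```

After pasting the function into your shell on gyra, run x_forwarding_hint with no arguments.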

Determine the optimal model for your data

Maximum likelihood analyses require that you specify the model of evolution under which your data evolved: it is the likelihood of the data under the model of evolution (including the tree) that is being optimised.

The optimal model for DNA or protein data can be determined using Modelgenerator, a Java application. In a text editor, write a queue submission file as follows, replacing <your data file>, <your email address>, and <job name> (the job name should be fewer than 9 characters and should not start with a number).

#!/bin/bash
#$ -N <job name>
#$ -cwd
#$ -o modelgen-log
#$ -j y
#$ -S /bin/bash
#$ -M <your email address>
#$ -m bae
source ~/.bash_profile
/usr/java/latest/bin/java -jar /share/apps/bin/modelgenerator.jar <your data file> 4

Save the file as modelgen.sh and submit the file to the queue on gyra:

$ qsub modelgen.sh
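
The job-name rule (fewer than 9 characters, not starting with a number) can be checked before running qsub. This is a minimal sketch of that rule only. The function name valid_job_name is hypothetical, and the queue software may impose further restrictions that are not checked here.

```shell
# Check a job name against the rule stated in this guide:
# fewer than 9 characters and not starting with a digit.
# valid_job_name is a hypothetical helper, not part of the queue software.
valid_job_name() {
    jobname=$1
    # Reject empty names and names of 9 or more characters.
    if [ -z "$jobname" ] || [ "${#jobname}" -ge 9 ]; then
        return 1
    fi
    # Reject names whose first character is a digit.
    case $jobname in
        [0-9]*) return 1 ;;
    esac
    return 0
}
```

For example, valid_job_name modelgen && qsub modelgen.sh submits only if the name passes the check.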

You will receive an email when the analysis is finished. In the log file modelgenerator0.out, find the model selected by the Akaike Information Criterion 1 (AIC1), e.g. “Model Selected: GTR+I+G”. This is the optimal model for your data.
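
To locate that line without scrolling through the log, something like the following may help. It assumes, from the example above, that the relevant lines contain the text "Model Selected:". The log may contain one such line per selection criterion, so the sketch lists every match with its line number and leaves it to you to pick the AIC1 one. The function name list_selected_models is made up for illustration.

```shell
# List every "Model Selected:" line in a log file, prefixed by its line number.
# Assumption: the log marks results with the literal text "Model Selected:".
# list_selected_models is a hypothetical helper name.
list_selected_models() {
    grep -n 'Model Selected:' "$1"
}
```

Usage: list_selected_models modelgenerator0.out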

Maximum likelihood optimal tree search analysis

Here we are going to run 200 optimisations using RAxML in parallel on 10 processors.

Write a queue submission file called raxml.sh as follows, replacing the bracketed placeholders (<...>) with your values:

#!/bin/bash
#$ -N <job name>
#$ -cwd
#$ -o raxml-log
#$ -j y
#$ -S /bin/bash
#$ -M <your email address>
#$ -m bae
#$ -pe orte 10
source ~/.bash_profile
mpirun -np 10 raxmlHPC-MPI -s <your data file> -n run1 -m <your model> -N 200 -d

To determine the value of <your model>, see the RAxML documentation and/or the built-in help:

$ raxml -h
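
As a rough illustration of that translation for DNA data: RAxML's -m names for DNA are GTR-based, for example GTRGAMMA (gamma-distributed rate heterogeneity) and GTRGAMMAI (gamma plus a proportion of invariant sites). The sketch below maps only the +G and +I parts of a Modelgenerator result to those two names and ignores the substitution matrix. The function name raxml_dna_model is hypothetical, protein models are not handled, and the result should be checked against the help output for the version installed on gyra.

```shell
# Map the rate-heterogeneity part of a DNA model name (e.g. "GTR+I+G")
# to a RAxML -m value. Only +G and +I+G are handled; anything else fails.
# raxml_dna_model is a hypothetical helper, not part of RAxML.
raxml_dna_model() {
    case $1 in
        *+I+G*|*+G+I*) echo "GTRGAMMAI" ;;
        *+G*)          echo "GTRGAMMA" ;;
        *)
            echo "no obvious RAxML equivalent for '$1'; check the RAxML help" >&2
            return 1
            ;;
    esac
}
```

For example, raxml_dna_model 'GTR+I+G' prints GTRGAMMAI.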

Submit the job to the queue:

$ qsub raxml.sh

Maximum likelihood bootstrap analysis

Here we are going to run 300 bootstrap replicates in parallel on 10 processors.

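The script below reuses the run name run1, and RAxML refuses to overwrite existing output files that carry the same run name, so run it from a different directory than the tree search (or change the name after -n). Write a queue submission file called raxmlboot.sh as follows, replacing the bracketed placeholders (<...>) with your values: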

#!/bin/bash
#$ -N <job name>
#$ -cwd
#$ -o raxmlboot-log
#$ -j y
#$ -S /bin/bash
#$ -M <your email address>
#$ -m bae
#$ -pe orte 10
source ~/.bash_profile
mpirun -np 10 raxmlHPC-MPI -s <your data file> -n run1 -m <your model> -f i -N 300 -b 8743947329473 -u 2

Submit the job to the queue:

$ qsub raxmlboot.sh
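
Both RAxML scripts request 10 slots with "-pe orte 10" and start 10 processes with "mpirun -np 10". If you change the processor count, the two numbers presumably need to change together. The sketch below checks that they agree in a submission file. The function name slots_match is hypothetical, and the patterns assume the lines are written exactly as in the scripts above.

```shell
# Succeed only if the "#$ -pe orte N" slot request and the "mpirun -np N"
# process count in a submission script are the same number.
# slots_match is a hypothetical helper, not part of the queue software.
slots_match() {
    pe=$(sed -n 's/^#\$ -pe orte \([0-9][0-9]*\).*/\1/p' "$1" | head -n 1)
    np=$(sed -n 's/^mpirun -np \([0-9][0-9]*\).*/\1/p' "$1" | head -n 1)
    # Fail if the slot request is missing or the two numbers differ.
    [ -n "$pe" ] && [ "$pe" = "$np" ]
}
```

For example, slots_match raxmlboot.sh && qsub raxmlboot.sh submits only when the two counts agree.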

Analysing the results of the bootstrap analysis

Typically, bootstrap analyses are presented as 50% majority-rule consensus trees. RAxML does not create that tree for you; instead, the tree from each replicate is saved to the file RAxML_bootstrap.run1, which in this case will contain 300 trees.
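
Before building the consensus, it can be worth confirming that all replicates finished. The sketch below counts trees under the assumption that the file holds one Newick tree per line, each ending in a semicolon. The function name count_trees is made up for illustration.

```shell
# Count the trees in a file, assuming one Newick tree per line,
# each terminated by a semicolon.
# count_trees is a hypothetical helper name.
count_trees() {
    grep -c ';' "$1"
}
```

For this analysis, count_trees RAxML_bootstrap.run1 should print 300.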

To make the consensus tree, issue the command:

$ makeConsensusTree.py -d <your data file> -t RAxML_bootstrap.run1

The consensus tree will be saved to a file named consensus.tree.

Open the tree in FigTree, adjust its appearance as needed, and save it.