International Journal of Systematic and Evolutionary Microbiology vol. 56, part 7, pp. 1565 - 1575
Appendix 1. Computer program for conducting AIBIMM analyses
We have developed the computer program PhyloMode for conducting AIBIMM analyses. The program can be downloaded free of charge from http://www.matforsk.no/web/sampro.nsf/downloadE/Microbial_community. PhyloMode was written in C# in the Microsoft Visual Studio .NET programming environment. We used .NET charting libraries from ZedGraph (http://sourceforge.net/projects/zedgraph) and multivariate statistical .NET libraries from CenterSpace Software (http://www.centerspace.net).
The PhyloMode program contains two basic modules. The first module transforms DNA sequences into multimer data (n = 1 to n = 6). The input is a file in FASTA format (each sequence begins with a single-line description, which is marked by a leading '>'). The output from this module can be exported as tab-delimited text for advanced multivariate statistical analyses in software packages such as The Unscrambler (CAMO Inc.; http://www.camo.com). The second module performs principal component analysis (PCA) and provides 2D visualization of both the score and loading plots. Finally, the program has an option for creating dendrograms from the principal component data using single, centroid or complete linkage. The linkage data are exported in a format compatible with TreeView (http://taxonomy.zoology.gla.ac.uk/rod/treeview.html), a free-of-charge software package for drawing phylogenetic trees.
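The multimer transformation performed by the first module can be illustrated with a minimal Python sketch. This is not the PhyloMode source code: the function names `read_fasta` and `multimer_frequencies` are hypothetical, and the sketch assumes overlapping windows, relative (rather than absolute) frequencies, and that windows containing ambiguous bases are skipped. None of these choices is specified in the text above.

```python
from itertools import product


def read_fasta(lines):
    """Parse FASTA text lines into a list of (description, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def multimer_frequencies(seq, n):
    """Relative frequency of each of the 4**n DNA multimers of length n.

    Assumption: windows overlap (step of one base), and windows containing
    characters other than A, C, G, T are ignored.
    """
    multimers = ["".join(p) for p in product("ACGT", repeat=n)]
    counts = dict.fromkeys(multimers, 0)
    for i in range(len(seq) - n + 1):
        window = seq[i:i + n]
        if window in counts:
            counts[window] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in multimers]
```

Each sequence then becomes one row of the data matrix, with one column per multimer; a row can be written as tab-delimited text with `"\t".join(str(v) for v in multimer_frequencies(seq, n))`.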
Appendix 2. Step-by-step details of bilinear PCA modelling
PCA is a method for extracting/computing a set of components that explain as much of the variability of a dataset (here denoted by X) as possible. The method goes as follows:
It can be shown that, if we organize the scores and loadings for the first A components in matrices T and P, the original data can be modelled as
$X = TP^{T} + E$
where E denotes noise, i.e. that part of the dataset which is not considered further.
The scores and loadings for the first couple of components are usually plotted in scatter plots in order to reveal information about relationships among objects and variables, respectively.
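The bilinear model can be made concrete with a small pure-Python sketch. It uses the NIPALS algorithm, which is one common way to compute principal components; the text does not say which algorithm PhyloMode uses, so this choice is an assumption, as are the column mean-centering step and the function names `center_columns` and `nipals_pca`.

```python
def center_columns(X):
    """Subtract each column mean (assumed preprocessing, not stated in the text)."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]


def nipals_pca(X, n_components, tol=1e-12, max_iter=1000):
    """Return (T, P, E) such that centered X = T P^T + E.

    T holds the scores (one row per object), P the loadings (one row per
    variable) and E the residual left after n_components components.
    """
    E = center_columns(X)
    n, m = len(E), len(E[0])
    T = [[0.0] * n_components for _ in range(n)]
    P = [[0.0] * n_components for _ in range(m)]
    for a in range(n_components):
        # Start from the residual column with the largest sum of squares.
        j0 = max(range(m), key=lambda j: sum(row[j] ** 2 for row in E))
        t = [row[j0] for row in E]
        if sum(v * v for v in t) == 0.0:
            break  # nothing left to explain
        for _ in range(max_iter):
            tt = sum(v * v for v in t)
            # Loadings: project the columns of E onto t, scale to unit length.
            p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
            norm = sum(v * v for v in p) ** 0.5
            p = [v / norm for v in p]
            # Scores: project the rows of E onto p.
            t_new = [sum(E[i][j] * p[j] for j in range(m)) for i in range(n)]
            change = sum((u - v) ** 2 for u, v in zip(t_new, t))
            t = t_new
            if change <= tol * sum(v * v for v in t):
                break
        # Store the component and remove it from the residual (deflation).
        for i in range(n):
            T[i][a] = t[i]
            for j in range(m):
                E[i][j] -= t[i] * p[j]
        for j in range(m):
            P[j][a] = p[j]
    return T, P, E
```

With the multimer frequencies as X, the first two columns of T give the coordinates of a score plot (one point per sequence) and the first two columns of P give those of a loading plot (one point per multimer).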