Multigene approach

Leaving aside the question of a "proper" algorithm, it should be admitted that both monophyletic and polyphyletic arrangements are only poorly supported by the 16S rRNA-derived signal. In other words, the 16S rRNA gene, the most frequently used marker in bacterial phylogeny, cannot resolve the relationships among P-symbionts. This observation is not surprising, because such insufficiency of rRNA genes is frequently observed at various phylogenetic levels in many groups of organisms. Various protein-coding genes have been used as an alternative source of phylogenetic information (Degnan et al., 2004; Casiraghi et al., 2005; Moran et al., 2005a; Baldo et al., 2006; Fukatsu et al., 2007). Although they can yield better phylogenetic resolution at particular nodes, depending on their evolutionary tempo, they do not provide any fundamental advantage if used in single-gene matrices.

The only way of overcoming the lack of reliable information is to extend the dataset with additional sequences and employ the multigene approach. Although seemingly simple and straightforward, the method of adding new genes is not free of potential troubles. The typical bacterial genome is a flexible assemblage of genes undergoing frequent structural changes (Snel et al., 2005). Some of these processes may hinder the selection of suitable universal markers among hundreds of possible candidates. For example, loss of genes leads to the absence of a given phylogenetic marker in some bacterial lineages. This situation may be particularly frequent in symbiotic bacteria, which undergo rapid and dramatic loss of many genes; substantial reduction of genome size can be seen in all of the completely sequenced genomes of P-symbionts (Shigenobu et al., 2000; Akman et al., 2002; Gil et al., 2003; Nakabachi et al., 2006; Wu et al., 2006) and has been observed even in the presumably young symbiotic lineage Sodalis glossinidius (Toh et al., 2006). Moreover, different nutritional constraints in various host-symbiont associations lead to differential preservation or loss of various sets of genes in different symbionts. Thus, on their hypothetical pathways from free-living bacterium to highly specialized symbionts, Buchnera and Wigglesworthia reduced their genomes to approximately 583 and 621 coding genes, respectively (Shigenobu et al., 2000; Akman et al., 2002); only 69% of these genes are shared by both lineages (Akman et al., 2002). Similar functional complementarity between two different symbiotic genomes, although not based on complete genome sequences, was recently reported for the genera Sulcia and Baumannia (Wu et al., 2006).
If such small genomes are to be analyzed together with free-living bacteria, genes have to be identified that are present in all of the included genomes. To make the situation even more complicated, successful identification of homologous genes is only one prerequisite and does not in itself guarantee a consistent phylogenetic signal. At least two additional processes may disturb phylogenetic reconstruction. Duplications are a known and much feared source of paralogs, which are further inherited during the speciation process. A random sampling of paralogs from different lineages during the phylogenetic analysis can be a source of serious topological inaccuracies. Finally, even worse phylogenetic inconsistencies may arise due to horizontal gene transfer (HGT), a process that introduces phylogenetically distant xenologs into bacterial genomes.
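The first two requirements above (presence in every genome and avoidance of paralogs) can be illustrated with a minimal Python sketch. It assumes that genes have already been assigned to homologous families by some upstream step; the genome names, family labels, and the function `single_copy_core` are hypothetical and serve only as an illustration. Xenologs introduced by HGT are not detected by this kind of filter.

```python
from collections import Counter


def single_copy_core(genomes):
    """Return gene families present exactly once in every genome.

    genomes: dict mapping a genome name to a list of gene-family labels,
    one label per gene copy (all labels here are hypothetical).
    """
    counts = {name: Counter(genes) for name, genes in genomes.items()}
    # Families found in all genomes: intersection of the per-genome sets.
    shared = set.intersection(*(set(c) for c in counts.values()))
    # Discard families with more than one copy in any genome,
    # since these may contain paralogs.
    return sorted(f for f in shared if all(c[f] == 1 for c in counts.values()))


# Toy input: one larger free-living genome and two reduced symbiont genomes.
genomes = {
    "free_living": ["famA", "famB", "famB", "famC", "famD"],
    "symbiont_1": ["famA", "famB", "famC"],
    "symbiont_2": ["famA", "famB", "famD"],
}
core = single_copy_core(genomes)
print(core)
```

In this toy input, famC and famD are each lost in one symbiont and famB is duplicated in the free-living genome, so only famA remains as a candidate marker. The example shows how quickly differential gene loss and duplication can shrink the usable set.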

The significance of duplication and HGT for phylogenetic inference in bacteria has not been fully elucidated. Generally, it is supposed that duplications in prokaryotes are less deleterious than in eukaryotic organisms. By contrast, HGT is often detected in bacteria and has sometimes even been considered one of the main forces shaping bacterial genomes. However, current views on this issue are largely dependent on the methods used to estimate overall HGT frequency (Lerat et al., 2003; Susko et al., 2006; Doolittle and Bapteste, 2007). For example, a conservative view, with vertical inheritance playing a predominant role in the bacterial genome structure, has been voiced by Lerat et al. (2003). These authors assessed the overall compatibility of individual single-gene matrices with selected topologies. To achieve this, they postulated phylogenetic congruence as a null hypothesis and used the Shimodaira-Hasegawa test (Shimodaira and Hasegawa, 1999) to identify an HGT by its rejection. Comparing 13 genomes of γ-Proteobacteria, they showed that the universally present orthologs suffer only a negligible frequency of HGT: of 205 genes included in the analysis, 203 produced mutually compatible topologies. When used for phylogenetic inference within a concatenated matrix, this respectable set of genes produced a monophyletic and well-supported branch of Buchnera + Wigglesworthia that was preserved even after removal of the AT-rich codons. On the other hand, such a low level of HGT has recently been questioned by Susko et al. (2006). They adopted methods from functional genomics to visualize the congruency within the core gene set suggested by Lerat et al. (2003) and concluded that around 10% of the genes may have resulted from HGT.
In their discussion, they further postulate that using congruence as a null hypothesis and searching for significant incongruence necessarily leads to underestimation of the HGT level. A similar opinion about the considerable occurrence of HGT has also been reached by Doolittle and Bapteste (2007), using a source of evidence entirely different from phylogenetics. Considering the whole spectrum of HGT frequency estimates, stretching from almost zero (Ge et al., 2005) to more than 60% (Lerat et al., 2005; Dagan and Martin, 2007), it is hard to assess the possible effect of this phenomenon on the selection of a suitable set of phylogenetic markers.
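The logic of screening genes with congruence as the null hypothesis can be sketched in a few lines of Python. This is only an illustration of the decision rule, not the procedure of Lerat et al. (2003): the per-gene p-values are invented placeholders that would in practice come from a topology test such as the Shimodaira-Hasegawa test, and the function name `flag_incongruent` and the 0.05 threshold are assumptions.

```python
def flag_incongruent(p_values, alpha=0.05):
    """Return genes that reject congruence with the reference topology.

    p_values: dict mapping a gene name to the p-value of a topology test
    whose null hypothesis is congruence (values below are placeholders).
    """
    return sorted(gene for gene, p in p_values.items() if p < alpha)


# Invented p-values for four hypothetical genes.
p_values = {"gene1": 0.62, "gene2": 0.03, "gene3": 0.41, "gene4": 0.18}
flagged = flag_incongruent(p_values)
fraction = len(flagged) / len(p_values)

# Figures quoted in the text: 203 of 205 genes gave compatible topologies,
# so 2 of 205 (about 1%) were incompatible.
lerat_fraction = (205 - 203) / 205

print(flagged, fraction, round(lerat_fraction, 4))
```

A gene that is not flagged has merely failed to reject congruence. As Susko et al. (2006) argue, a rule of this kind therefore tends to underestimate the level of HGT, particularly for genes carrying a weak phylogenetic signal.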
