S    Nuclear    3,720
RNA polymerase II    Pol II    Nuclear    4,360

Other increasingly employed genes: various tRNAs, cytochrome c, wingless (Wg), rhodopsin, etc.

pattern suggested by the overall body of evidence; it is convergent.

There are an intimidating number of methods for reconstructing phylogeny. This is almost universally done with computers today because of the vast amount of data available. However, different programs have been written to solve particular problems of analysis, so many have been proposed simply to analyze specific varieties of data (e.g., protein, morphological, DNA sequences) or to analyze these data under differing assumptions of how evolution must operate. Many programs are quite easy to use (or misuse) and have led some investigators to employ a "hit-and-run" approach to phylogeny reconstruction. Regardless of methodology, traditional complications for phylogeny reconstruction, such as hybridization, introgression, and lateral gene transfer, remain as problems. Today, analyzing large numbers of taxa and DNA sequences has led to even more algorithms for inferring phylogeny.

But again, at their simplest, these programs seek some degree of agreement between the characters being studied, referred to as congruence. But how is congruence measured?

Parsimony

Parsimony criteria are used as tree-building methods and have certainly been among the most widely used means of reconstructing evolutionary history. One of the most basic tenets of this method is that the preferred hypothesis of evolutionary relationships is the one that requires the fewest evolutionary changes, or steps. Such hypotheses also represent the most congruent arrangement of all of the characters in an analysis: they maximize the support for homology and minimize the support for homoplasy. Generalized parsimony operates throughout science and is based on the philosophical principle of Occam's Razor, which simply states that "all things being equal, the simplest explanation is the best explanation." In other words, the simplest means of explaining the observations is better than any explanation requiring numerous ad hoc hypotheses. However, there are other forms of parsimony, such as Wagner parsimony, Dollo parsimony, and Camin-Sokal parsimony, which incorporate a priori models of character change, such as reversals being more likely than independent, convergent origins.
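As a rough illustration of counting "steps," the sketch below scores two competing trees for a small, made-up character matrix. It assumes unordered characters and uses Fitch's counting procedure, which is only one of several parsimony criteria and is not named in the text above; the function names, the nested-tuple tree encoding, and the data are all hypothetical.

```python
def fitch_steps(tree, states):
    """Minimum number of changes for one unordered character on a rooted binary tree.

    tree:   nested 2-tuples with taxon names (str) at the tips
    states: dict mapping taxon name -> observed character state
    """
    def visit(node):
        if isinstance(node, str):              # tip: its state set is what was observed
            return {states[node]}, 0
        left, right = node
        left_set, left_steps = visit(left)
        right_set, right_steps = visit(right)
        shared = left_set & right_set
        if shared:                             # descendants agree: no extra step here
            return shared, left_steps + right_steps
        return left_set | right_set, left_steps + right_steps + 1

    return visit(tree)[1]


def tree_length(tree, matrix):
    """Total steps over all characters; matrix maps taxon -> string of states."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_steps(tree, {taxon: row[i] for taxon, row in matrix.items()})
        for i in range(n_chars)
    )


# Hypothetical data: four taxa scored for four binary characters.
matrix = {"A": "0011", "B": "0010", "C": "1100", "D": "1101"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(tree_length(tree_1, matrix), tree_length(tree_2, matrix))
```

Under this criterion the tree requiring fewer total steps is preferred; here the first three characters are congruent with tree_1, while the fourth is homoplasious on it.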

Maximum Likelihood

The idea of incorporating evolutionary models as well as a desire to impose statistical inference led to the development of likelihood methods for phylogeny reconstruction. This first became important with the increase in biochemical and molecular data and the limited number of character states (i.e., the four nucleotides) and the aforementioned difficulties. In particular, circumstances were identified in which parsimony failed to find the "correct" evolutionary history (Felsenstein, 1973, 1978, 1981, 1983). When evolutionary rates in different lineages are extremely dissimilar and evolutionary change is large (i.e., the rates of mutation are high), then the probability of arriving at the same nucleotide not by descent but by chance becomes high. This has been referred to as the Felsenstein Zone.

TABLE 1.3. Most Frequently Employed Ranks in Zoological Nomenclature

Kingdom                  Animalia
Phylum                   Arthropoda
Subphylum                Mandibulata
Superclass               Panhexapoda
Epiclass                 Hexapoda
Class                    Insecta (Ectognatha)
Subclass                 Dicondylia
Superorder               Hymenopterida
Order                    Hymenoptera
Suborder                 Apocrita
Superfamily (-oidea)     Apoidea
Family (-idae)           Halictidae
Subfamily (-inae)        Halictinae
Tribe (-ini)             Augochlorini
Genus                    Augochlora
Species                  nigrocyanea Cockerell

There are no standardized terminations in zoology for names above the rank of superfamily (ICZN, 1999).

Maximum likelihood estimation starts with an aligned set of sequences that are constrained onto a starting hypothesis of phylogeny, which is frequently derived from a simple parsimony analysis. The tree is arbitrarily rooted, and the likelihood of particular sites is computed as the sum of the probabilities of every possible reconstruction at ancestral states given a particular model of nucleotide substitution. The likelihood of the entire tree is calculated as a product of the likelihood of each individual site. The objective is to end up with the simplest model to explain evolution. Then, the "best" likelihood settings are used to run the phylogenetic analysis, which employs differential weighting of characters as specified by the employed models. The method strives to discover the tree or set of trees that gives the highest probability of a data set being derived from it, and not the probability of a tree being derived from a data set (which is a common misconception of likelihood methods).
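The per-site sum over ancestral states and the product over sites can be sketched as follows. This is a minimal illustration, not how any particular program is implemented: it assumes the Jukes-Cantor substitution model with equal base frequencies, fixed branch lengths, and a made-up nested-list tree encoding, none of which comes from the text. The recursion is Felsenstein's pruning scheme, which gives the same value as summing over every ancestral reconstruction explicitly.

```python
import math

BASES = "ACGT"


def jc69(parent, child, t):
    """Jukes-Cantor probability of `child` at the end of a branch of length t, given `parent`."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if parent == child else 0.25 - 0.25 * decay


def conditional(node, site):
    """Map each base b to P(tip states below node | node has base b)."""
    if isinstance(node, str):                  # a tip, given by its taxon name
        return {b: 1.0 if b == site[node] else 0.0 for b in BASES}
    result = {b: 1.0 for b in BASES}
    for child, t in node:                      # internal node: list of (child, branch length)
        below = conditional(child, site)
        for b in BASES:
            # Sum over every possible state at the lower end of this branch.
            result[b] *= sum(jc69(b, c, t) * below[c] for c in BASES)
    return result


def site_likelihood(tree, site):
    """Sum over root states, each weighted by an assumed base frequency of 1/4."""
    at_root = conditional(tree, site)
    return sum(0.25 * at_root[b] for b in BASES)


def log_likelihood(tree, alignment):
    """Log of the product of site likelihoods, computed as a sum of logs."""
    length = len(next(iter(alignment.values())))
    return sum(
        math.log(site_likelihood(tree, {taxon: seq[i] for taxon, seq in alignment.items()}))
        for i in range(length)
    )


# Hypothetical four-taxon alignment and two candidate trees, all branches of length 0.1.
alignment = {"A": "AAGT", "B": "AAGT", "C": "CCGT", "D": "CCGT"}
tree_ab_cd = [([("A", 0.1), ("B", 0.1)], 0.1), ([("C", 0.1), ("D", 0.1)], 0.1)]
tree_ac_bd = [([("A", 0.1), ("C", 0.1)], 0.1), ([("B", 0.1), ("D", 0.1)], 0.1)]

print(log_likelihood(tree_ab_cd, alignment), log_likelihood(tree_ac_bd, alignment))
```

The tree with the higher log-likelihood is the one under which the observed alignment is more probable. A real analysis would also optimize branch lengths and model parameters rather than fixing them.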

Despite the apparent appeal of maximum likelihood methods, they are not without problems. Likelihood methods require an a priori probabilistic model of evolutionary process. Statistical methods of phylogeny inference require a hypothesis of the processes of evolutionary change. With their small, finite number of possible character states, DNA sequences lend themselves well to such methodology. There is widespread acceptance of probabilistic models of evolution for nucleotide sequence data. According to Swofford et al. (1996), parsimony methods ignore information about branch length, whereas maximum likelihood methods, which assume that repeated changes are more likely on longer branches, do not ignore length. The most common forms of models for evolutionary change are those pertaining to base-pair frequencies in DNA/RNA; transition:transversion ratios; probabilities of amino acid substitutions in proteins (i.e., synonymous versus nonsynonymous codons); and the shape of complex macromolecules (e.g., DNA/RNA loops).
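To make one of these model ingredients concrete, the sketch below tallies transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine, or the reverse) between two aligned sequences. The function name and the sequences are hypothetical, and a raw count like this is only a crude starting point, since it ignores multiple substitutions at the same site.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Return (transitions, transversions) between two aligned DNA sequences."""
    transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue                            # identical site, gap, or ambiguity code
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


ts, tv = count_ts_tv("AACGT", "GACTA")
print(ts, tv)
```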

While some attempts have been made to apply likelihood methods to morphological data, probabilistic models of morphological change are considered to have little basis. What biological justification can be made for suggesting a particular probabilistic model of anatomical evolution, such as from the large hind wings of Mecoptera to the halteres of flies? Moreover, computationally, the calculation of probabilities is extremely time consuming, though computing ability improves every year. Some scientists still consider maximum likelihood advantageous because as sample sizes increase toward infinity and the models become more realistic, the results are believed to converge on some idea of the "truth" [and interestingly, appear to converge on a simple parsimony analysis (Goloboff, 2003)].

A recent advance in model-based phylogenetic reconstruction has been Bayesian analysis (e.g., Huelsenbeck et al., 2001). Rather than seeking only the single most likely tree topology, Bayesian methods combine the likelihood with a priori (prior) probabilities to estimate a posteriori (posterior) probabilities for trees under models of nucleotide evolution. Use of posterior probabilities eases the use of complex, ideally more realistic, models of molecular evolution and allows vastly larger data sets than were feasible under strict likelihood estimation.
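The relationship between likelihood and posterior probability can be sketched for a handful of candidate trees. This is a toy illustration under stated assumptions: the candidate set is small enough to enumerate, the log-likelihood values are invented, and the prior is uniform unless one is supplied. Practical Bayesian programs instead sample trees with Markov chain Monte Carlo, because the number of possible trees is far too large to list.

```python
import math


def posterior_probabilities(log_likelihoods, priors=None):
    """Posterior for each candidate tree: likelihood x prior, normalized to sum to 1."""
    names = list(log_likelihoods)
    if priors is None:                          # assumed uniform prior over the candidates
        priors = {name: 1.0 / len(names) for name in names}
    peak = max(log_likelihoods.values())        # subtract the maximum to avoid underflow
    weights = {
        name: math.exp(log_likelihoods[name] - peak) * priors[name] for name in names
    }
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical log-likelihoods for three candidate topologies.
posterior = posterior_probabilities({"tree_1": -1203.4, "tree_2": -1205.4, "tree_3": -1210.0})
for name, probability in posterior.items():
    print(name, round(probability, 3))
```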

TAXONOMY, NOMENCLATURE, AND CLASSIFICATION

If the names are lost the knowledge also disappears. -J. C. Fabricius, 1778, Philosophia Entomologica

The first part of knowledge is getting the names right. -Chinese proverb

Nomina si nescis, perit et cognitio rerum. [Who knows not the names, knows not the subject.]2 -Linnaeus, 1773, Critica Botanica

2 Adapted by Linnaeus from Isidore of Seville's (ca. A.D. 560-636; Patron Saint of Students) phrase, Nisi enim nomen scieris, cognitio rerum perit, in his Origines seu Etymologiae: Liber I.

The first reaction upon encountering a physical object is to recall its name or to name it. Every human culture names the world around it, developing folk classifications for the biotic and abiotic realms. The human mind operates by circumscribing the features of relevant objects, often subconsciously, and then applying a name to refer to them. Folk taxonomies were of great importance; they served to distinguish the edible from the poisonous or the cuddly from the threatening. Such naming was eventually standardized and codified to form the formal taxonomy of binomial nomenclature in biology. A taxon (plural taxa) refers to a group of organisms at any rank (e.g., Pelecinus and Musca are taxa at the rank of genus). The principal ranks in an ascending series are species, genus, family, order, class, phylum, and kingdom (although numerous other intercalated ranks and informal categories are also frequently used: Table 1.3). Classification is the codification of the results of systematic studies, using taxonomic principles.

All accumulated information on a species is tied to a scientific name, a name that serves as the link between what has been learned in the past and what we add today to the body of knowledge. For instance, how do we know that the unique data on nesting biology of the honey bee reported by Gerstäcker (1863) refer to the same species to which we refer today? What if Gerstäcker misidentified the species? What if he was actually mixing data from two species under a single reference? Unlike in other sciences, the history of biological nomenclature is of paramount importance, and the correct application of scientific names ensures that these names will be stable. Names provide a means of universal communication and serve as a reference database for information storage, retrieval, documentation, and use. For this purpose, taxonomy is governed by rules of nomenclature, a precise system used by systematists that concerns itself with pragmatic methods for naming taxa.

The rules of nomenclature used to assign names to taxa are a strictly utilitarian, pragmatic part of taxonomy. The rules of nomenclature have nothing to do with theories about relationships (phylogenetic work determines those), species concepts, or logical consistency (i.e., parsimony). In fact, the rules of nomenclature are not intended to impair or interfere with a taxonomist's judgment about how taxa should be circumscribed or which taxa should be named. The rules of nomenclature are simply intended to specify how taxa should be named. In instances where more than one name has been proposed, they specify which name should be used or has priority. The rules of nomenclature are codified, legal documents; International Commissions deliberate upon any changes in the rules and pass judgment on petitions for exceptions to the rules in special circumstances.

Rules of nomenclature have not always existed. Even in Linnaeus' time, there was no regulation for how to name things formally. However, as more and more species were discovered and scientists expanded biological disciplines (taxonomic and otherwise), chaos and confusion began to develop as different names were applied to the same species or identical names were given to radically different taxa. In the absence of such rules, the scientific literature was becoming utterly confusing, and in some instances meaningless, because there was no stability to the use of names. The benefits of having a formal taxonomy were being lost. To ensure the continuity and conveyance of biological information, rules were established, and the International Commissions were formed. There are presently three independent sets of rules of nomenclature: Botanical, Zoological, and Bacteriological. Naturally, the names of insects are governed by the International Code of Zoological Nomenclature (ICZN, 1999).

Scientific names have a codified format. Each animal has the binomial name (genus and species), the name of its author, and the year it was established: for example, Apis mellifera Linnaeus, 1758, or Genus Zorotypus Silvestri, 1912, or Family Megachilidae Latreille, 1802. The author name indicates who proposed that part of the name and in what year. In the aforementioned examples, Linnaeus proposed the species name mellifera in 1758 as a taxon in the genus Apis; Silvestri proposed the generic name Zorotypus in 1912; and Latreille proposed the family name Megachilidae in 1802. The author's name is always capitalized (sometimes abbreviated for famous individuals, e.g., Linnaeus is often abbreviated as "L.") and never underlined or italicized. This allows us to keep track of the name and to recognize when someone has proposed an identical name for a different biological taxon that would lead to confusion. Such identical names are called homonyms. For example, Eversmann in 1852 established the name Bombus modestus for a species of bumble bee living in Eurasia (the name was thus, Bombus modestus Eversmann, 1852). Numerous other bumble bee species were also established in Bombus, and it was later discovered that in 1861 Smith had described a completely different species from Mexico which he had also called Bombus modestus, unaware that an identical name already existed. (Indeed, a third B. modestus was proposed by Cresson in 1863!) These names are homonyms. Eversmann's name is older and is thus called a senior homonym while Smith's name is the junior homonym. These species have distinct biologies and distributions, and these data are referenced by their names. However, it is confusing that they have identical names, so how do we objectively decide which one to call B. modestus and which should be named something else?
This decision has nothing to do with the biological distinctions between the species but merely how we will reference and access the data associated with each. The nomenclatural Principle of Priority states that the oldest available name is the valid name for the taxon. Thus, in the example of B. modestus, the name established by Smith had to be replaced to avoid confusion with an already existing, identical name. This correction was made by Dalla Torre (1890), and the name of the second species is now B. trinominatus. This system of priority simultaneously provides a simple, objective basis for deciding among competing, different names for the same species. Any names applied to the same species that are proposed subsequent to the first available name are junior synonyms. Priority avoids more subjective aspects like the quality of the original description because these are open to extreme differences of interpretation as to what makes a "good" versus "bad" original description.
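The Bombus modestus example can be expressed as a small data-handling sketch: group identical binomials, then order each group by date so that the oldest name is the senior homonym. The record type and function name are hypothetical, and comparing by year alone is a simplifying assumption; the Code itself works with actual dates of publication and with whether a name is available at all.

```python
from collections import defaultdict
from typing import NamedTuple


class Name(NamedTuple):
    genus: str
    species: str
    author: str
    year: int


def homonym_groups(names):
    """Group identical binomials proposed by different authors, oldest first."""
    groups = defaultdict(list)
    for name in names:
        groups[(name.genus, name.species)].append(name)
    return {
        binomial: sorted(group, key=lambda n: n.year)
        for binomial, group in groups.items()
        if len(group) > 1                       # a single use of a binomial is not a homonym
    }


names = [
    Name("Bombus", "modestus", "Smith", 1861),
    Name("Bombus", "modestus", "Eversmann", 1852),
    Name("Bombus", "modestus", "Cresson", 1863),
    Name("Apis", "mellifera", "Linnaeus", 1758),
]

groups = homonym_groups(names)
senior, *juniors = groups[("Bombus", "modestus")]
print("senior homonym:", senior.author, senior.year)
print("junior homonyms:", [(n.author, n.year) for n in juniors])
```

The junior homonyms are the names that need replacement names, as happened with Smith's species.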

Every species name occurs in a combination, specifically in combination with some generic name (hence binomial nomenclature). As systematics continues to refine our understanding of relationships among organisms, the taxonomic placement of species is often changed. Thus, although nomenclature seeks to stabilize taxonomy, it necessarily recognizes that taxonomy is, by its nature, dynamic. Species are often moved from one genus to another (i.e., their combination is altered from their original assignment). This poses the difficulty of historically tracing the data when a given species name has shifted. For example, the metallic green sweat bee Andrena metallica Fabricius, 1793, was moved to the genus Augochloropsis. In zoological nomenclature, the simple system of placing parentheses around the author's name indicates a change in combination; for example, "Andrena metallica Fabricius, 1793" becomes "Augochloropsis metallica (Fabricius, 1793)." Taxonomic catalogues are the repositories for such data: tracking the usage of names since their inception and unifying them with the body of scientific literature tied to the taxon. Such compendia are the most fundamental works on any group of organisms (e.g., Herman, 2001) and the first reference sought at the opening of any line of inquiry. With a group as diverse as insects, taxonomic catalogues are essential but lag far behind our needs.
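The parentheses convention is mechanical enough to sketch in a few lines. The function and parameter names below are hypothetical; the only rule encoded is the one described above, that the author and year are enclosed in parentheses when the current genus differs from the genus of original assignment.

```python
def cite(genus, species, author, year, original_genus):
    """Format a species name, parenthesizing the authority if the combination changed."""
    authority = f"{author}, {year}"
    if genus != original_genus:                 # moved out of its original genus
        authority = f"({authority})"
    return f"{genus} {species} {authority}"


print(cite("Andrena", "metallica", "Fabricius", 1793, original_genus="Andrena"))
print(cite("Augochloropsis", "metallica", "Fabricius", 1793, original_genus="Andrena"))
```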

Tracking the names is only a small part of the problem, which is becoming increasingly manageable with databases. While we can readily ascertain what Fabricius wrote, how do we know exactly what he was looking at and that it is conspecific with what we may today be studying in the field? For that, we must examine Fabricius' type specimens. As in all arenas of biology (e.g., ecology, physiology), taxonomy has its system of vouchers. However, unlike other fields, the nomenclatural system of vouchering is widely misunderstood and erroneously cited as an explanation for taxonomy being an outdated, Victorian pursuit. The confusion ironically centers on the name of the vouchers used in nomenclature. The concept of typification, or the application of nomenclatural types, is confused with the older concept of Platonic types or archetypes. These concepts were at one time inseparable, but the two are no longer related. Earlier in the history of taxonomy it was believed that a single specimen was selected to represent each species and that this individual best approximated the eidos or ideal form of the taxon. Today, the nomenclatural type serves the purpose of vouchering biological data and determining species identity when published descriptions are inadequate.

There is a suite of rules in nomenclature for the designation of a type specimen, and types consist of two general kinds: name-bearing (or primary) types and non-name-bearing (or secondary) types. Primary types include holotypes, lectotypes, and neotypes, while all other types (e.g., paratypes) are secondary types. A holotype is a unique, name-bearing type specimen designated by the original author. The holotype is the single individual of a species that serves as the voucher for a given species name. All taxa conspecific with the holotype must use the name associated with that holotype. A holotype can be designated only by the original author and in the publication in which that author established the name. It is the original author's voucher for his/her species. Paratypes are additional specimens that were examined and designated by the original author in the original publication as being likely to be conspecific with the holotype, and they frequently represent the range of variation known at that time. While this distinction between the holotype and paratypes may appear on the surface to be no different than the older concept of archetypes, the selection of a single specimen to hold the name is strictly pragmatic. Should future studies discover that the original author confused two or more species in the original publication, to which should we apply the proposed name and to which should we give another name? As we discussed earlier about the more frequent discovery of cryptic species, determining the name of a species frequently relies on examining types. What the original author believed to be a range of variation in a single species may now be understood to be discrete sets of variation for two species. The set of individuals that are conspecific with the holotype gets the original name, along with any paratypes or newly discovered material that is similarly conspecific.
The other paratypes and newly discovered specimens become the type series for a different species. Recall that early in the formalization of taxonomy there were no rules of nomenclature; as such, early authors frequently did not designate types. However, often their original series of specimens can be located in museums. In order to stabilize the current and future application of those names, a type is subsequently selected from their original series of specimens. While the holotype indicates that the original author made the selection of the name-bearing type, the lectotype is reserved for those name-bearing types designated by later authors studying the original series of early scientific names. The lectotype is identical to a holotype except that the original author did not distinguish that specimen from his original series of specimens. Similarly, holotypes may be lost or destroyed; many types have been destroyed during European wars. In such instances, a new unique individual is selected to serve as the name-bearing type for the species; these are called neotypes.

Higher groups also have types: Genera have type species, and families have type genera. These vouchers similarly serve to stabilize the usage of these names in the same manner that they stabilize the usage of species names. If a genus is broken up into two or more genera as the result of a phylogenetic study, the name of the original composite genus goes to the new group containing its type species, and new names are required for the other sets of species.
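The rule that a name follows its type can be sketched for the genus-splitting case. Everything here is hypothetical, including the function name and the placeholder genus and species names; the sketch assumes a phylogenetic study has already sorted the species into groups and simply reports which group retains the original generic name.

```python
def names_after_split(genus, type_species, groups):
    """Pair each species group with the old genus name, or None where a new name is needed."""
    holders = [group for group in groups if type_species in group]
    if len(holders) != 1:
        raise ValueError("the type species must fall in exactly one group")
    return [(genus if type_species in group else None, sorted(group)) for group in groups]


# Placeholder genus "GenusX" with type species "alpha", split into three groups.
result = names_after_split(
    "GenusX",
    "alpha",
    [{"alpha", "beta"}, {"gamma"}, {"delta", "epsilon"}],
)
for name, species in result:
    print(name or "(new genus name required)", species)
```

The same logic applies one level down: when a species is split, the set of specimens that includes the holotype keeps the original species name.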

Numerous other rules of nomenclature exist, all designed to form the database of life's diversity. More details on the historical development of zoological nomenclature are provided by Melville (1995), while the complete set of rules can be found in the most recent edition of the International Code of Zoological Nomenclature (ICZN, 1999).

The current developments in information technology are creating a revolution in systematics and taxonomy. Now it is possible to access, link, and synthesize data for individual species on scales never before imagined. However, at the same time taxonomy is under attack. Technological arguments have been put forward that taxonomy is lingering in the past, failing to discover taxa at a rate keeping up with human-induced extinctions. Such arguments include a variety of misplaced attacks. "Taxonomy is lost in ancient Latin and Greek." What would it matter if it were in French, Spanish, Chinese, or Farsi? By using ancient forms of Latin and Greek, taxonomy avoids the vagaries of nationalism. "Taxonomy should automate and use numbers in place of names for easy use by computers." Computers serve us - not the other way around. A computer cares not whether we refer to an organism as Drosophila melanogaster or as taxon 2789.63; the machine will read both the same. More importantly, there is meaning in names. Ideally, the author of a species constructed the name to be descriptive, referring to a distinguishing feature or location of the species. Some names are even poetic; for example, the fossil butterfly Prodryas persephone, so exquisitely preserved in rock, is named for the Greek goddess Persephone, the beautiful daughter of Zeus who was abducted by Hades to be his queen of the Underworld. Computers can handle data organized by name the same as by number, but human language and cognition simply make words more recognizable and understandable. Indeed, we all have numbers assigned to us (e.g., phone, social security, passport numbers), yet we consider our traditional names to best reflect our identity and so we retain them. Names describe; they give identity.
