Tree model

Popularized by the German linguist August Schleicher in 1853,[1][2] the tree model has been a common method of describing genetic relationships between languages since the first attempts to do so.

Then, in a series of tracts published in 1684 that expressed skepticism concerning various beliefs, especially Biblical ones, Sir Thomas Browne wrote:[8] "Though the earth were widely peopled before the flood ... yet whether, after a large dispersion, and the space of sixteen hundred years, men maintained so uniform a language in all parts, ... may very well be doubted."

[11] Browne reports a number of reconstructive activities by the scholars of the times: [12] "The learned Casaubon conceiveth that a dialogue might be composed in Saxon, only of such words as are derivable from the Greek ... Verstegan made no doubt that he could contrive a letter that might be understood by the English, Dutch, and East Frislander ... And if, as the learned Buxhornius contendeth, the Scythian language as the mother tongue runs throughout the nations of Europe, and even as far as Persia, the community on many words, between so many nations, hath more reasonable traduction and were rather derivable from the common tongue diffused through them all, than from any particular nation, which hath also borrowed and holdeth but at second hand."

In that same revolutionary century in Britain, James Howell published Volume II of Epistolae Ho-Elianae, quasi-fictional letters to various important persons in the realm containing valid historical information.

[citation needed] On February 2, 1786, Sir William Jones delivered his Third Anniversary Discourse to the Asiatic Society as its president on the topic of the Hindus.

In it he applied the logic of the tree model to three languages, Greek, Latin and Sanskrit, but for the first time in history on purely linguistic grounds, noting "a stronger affinity, both in the roots of the verbs and in the forms of grammar, than could possibly have been produced by accident; ...." He went on to postulate that they sprang from "some common source, which, perhaps, no longer exists."

Young begins by pointing out Adelung's indebtedness to Conrad Gesner's Mithridates, de Differentiis Linguarum of 1555 and other subsequent catalogues of languages and alphabets.

Adelung's additional classes were the Tataric (which would later be known as the disputed family Altaic), the African and the American, which depend on geography and a presumed descent from Eden.

[citation needed] Young's designation, successful in English, was only one of several candidates proposed between 1810 and 1867: indo-germanique (Conrad Malte-Brun, 1810), japetisk (Rasmus Christian Rask, 1815), Indo-Germanisch (Julius Klaproth, 1823), indisch-teutsch (F. Schmitthenner, 1826), sanskritisch (Wilhelm von Humboldt, 1827), indokeltisch (A. F. Pott, 1840), arioeuropeo (Graziadio Isaia Ascoli, 1854), Aryan (Max Müller, 1861) and aryaque (H. Chavée, 1867).

(Klaproth, for example, the author of the successful German-language candidate, Indo-Germanisch, who criticised Jones for his uncritical method, knew Chinese, Japanese, Tibetan and a number of other languages with their scripts.)

As hope of finding it gradually died, they fell back on the growing concept of common Indo-European spoken by nomadic tribes on the plains of Eurasia. Although they made a good case that this language can be deduced by the methods of comparative linguistics, that is not in fact how they obtained it.

The model builds on earlier conceptions of William Jones, Franz Bopp and August Schleicher, adding the exceptionlessness of the sound laws and the regularity of the process.

Darwin criticises the synchronic method devised by Linnaeus, suggesting that it be replaced by a "natural arrangement" based on evolution.

Since the adoption of the family tree metaphor by the linguists, the concept of evolution had been proposed by Charles Darwin and was generally accepted in biology.

It became the prime goal of taxonomy to discover the lineages and alter the classification to reflect them, which it did under the overall guidance of the Nomenclature Codes, rule books kept by international organizations to authorize and publish proposals to reclassify species and other taxa.

To discover a cladistic relationship, researchers relied on as large a number of morphological similarities among species as could be defined and tabulated.

Greenberg formulated large tables of characteristics of hitherto neglected languages of Africa, the Americas, Indonesia and northern Eurasia and typed them according to their similarities.
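The idea of tabulating characteristics and grouping languages by similarity can be illustrated with a minimal sketch. It does not reproduce Greenberg's actual procedure: the language names, the traits and the choice of the Jaccard index as the similarity measure are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical feature table: language -> set of traits it exhibits.
# Names and traits are invented for illustration only.
features = {
    "lang_a": {"tone", "noun_classes", "svo"},
    "lang_b": {"tone", "noun_classes", "sov"},
    "lang_c": {"case_marking", "sov"},
}


def jaccard(x, y):
    """Shared traits divided by all traits seen in either language (0.0 to 1.0)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def similarity_table(table):
    """Return {(lang1, lang2): similarity} for every unordered pair of languages."""
    return {
        (p, q): jaccard(table[p], table[q])
        for p, q in combinations(sorted(table), 2)
    }


similarities = similarity_table(features)
for pair, value in similarities.items():
    print(pair, value)
```

In this toy table, lang_a and lang_b share two of four traits, so they would be typed together before either is grouped with lang_c.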

[22] The comparative method has been used by historical linguists to piece together tree models utilizing discrete lexical, morphological, and phonological data.

[citation needed] In the late 20th century, linguists began using software intended for biological classification to classify languages.

Since a variety represents an abstraction from the totality of linguistic features, information may be lost when data (from a map of isoglosses) are translated into a tree.

[citation needed] The limitations of the tree model, in particular its inability to handle the non-discrete distribution of shared innovations in dialect continua, have been addressed through the development of non-cladistic (non-tree-based) methodologies.

For example, according to Zuckermann (2009:63),[26] "Israeli", his term for Modern Hebrew, which he regards as a Semito-European hybrid, "demonstrates that the reality of linguistic genesis is far more complex than a simple family tree system allows."

The purpose of phylogenetic software is to generate cladograms, a special kind of tree in which the links only bifurcate; that is, at any node in the same direction only two branches are offered.

It then constructs a cladogram based on degrees of similarity; for example, hypothetical languages, a and b, which are closest only to each other, are assumed to have a common ancestor, a-b.
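The joining step described above can be sketched as follows. This is a simple average-linkage clustering used as a stand-in; actual phylogenetic packages apply other criteria, and the function name `build_cladogram`, the toy languages a to d and their similarity scores are hypothetical.

```python
def build_cladogram(similarity):
    """Repeatedly join the two most similar clusters into a strictly binary tree.

    `similarity` maps frozenset({x, y}) -> score for every pair of leaves.
    Similarity between clusters is the average over their leaf pairs
    (an assumption of this sketch, not a claim about any real package).
    """
    leaves = sorted({name for pair in similarity for name in pair})
    # Each cluster is (subtree, tuple of the leaf names it contains).
    clusters = [(name, (name,)) for name in leaves]

    def score(c1, c2):
        pairs = [similarity[frozenset((x, y))] for x in c1[1] for y in c2[1]]
        return sum(pairs) / len(pairs)

    while len(clusters) > 1:
        # Pick the pair of clusters with the highest average similarity.
        i, j = max(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: score(clusters[ij[0]], clusters[ij[1]]),
        )
        # The new node stands for the assumed common ancestor of the two clusters.
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical similarity scores between four invented languages.
toy = {
    frozenset(("a", "b")): 0.9,
    frozenset(("a", "c")): 0.4,
    frozenset(("b", "c")): 0.5,
    frozenset(("a", "d")): 0.1,
    frozenset(("b", "d")): 0.2,
    frozenset(("c", "d")): 0.3,
}
tree = build_cladogram(toy)
print(tree)
```

Because a and b are closest only to each other, they are joined first under an assumed ancestor a-b; c then joins that group, and d attaches last. Every internal node has exactly two branches, as a cladogram requires.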

A new cladogram resulted from any change, which suggested that the method was not capturing the underlying evolution of languages but only reflecting the extemporaneous judgements of the researchers.

[28] Despite their care to code the best qualitative characters in sufficient numbers, the researchers could obtain no perfect phylogenies for some groups, such as Germanic and Albanian within Indo-European.
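Why a perfect phylogeny may fail to exist can be shown with a simplified check. The sketch below assumes two-state (binary) characters, for which a perfect phylogeny exists only if no pair of characters shows all four state combinations (the "four-gamete" condition). The researchers' own character coding was richer than this, and the character names and values here are invented.

```python
from itertools import combinations


def pair_compatible(col1, col2):
    """Four-gamete test: two binary characters conflict only if all four
    combinations (0,0), (0,1), (1,0), (1,1) occur across the languages."""
    return len(set(zip(col1, col2))) < 4


def incompatible_pairs(matrix):
    """List every pair of characters that cannot both fit a single tree."""
    names = sorted(matrix)
    return [
        (p, q)
        for p, q in combinations(names, 2)
        if not pair_compatible(matrix[p], matrix[q])
    ]


# Hypothetical characters: each tuple gives the state in four invented
# languages, always listed in the same order.
characters = {
    "c1": (1, 1, 0, 0),
    "c2": (1, 0, 0, 0),
    "c3": (1, 0, 1, 0),
}

conflicts = incompatible_pairs(characters)
print(conflicts)
```

An empty result would mean the binary characters admit a perfect phylogeny; any conflicting pair means at least one character must have changed more than once or spread by contact.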

Inspecting the results, the researchers excluded the non-feasible interfaces until a list of only feasible networks remained, which could be arranged in order of compatibility score.

[citation needed] The researchers began with five candidate trees for Indo-European, lettered A-E, one generated from the phylogenetic software, two modifications of it and two suggested by Craig Melchert, a historical linguist and Indo-Europeanist.

The trees differed mainly in the placement of the most ambiguous group, the Germanic languages, and Albanian, which did not have enough distinctive characters to place it exactly.

[32] Subsequent generation of networks found that all incompatibilities could be resolved with a minimum of three contact edges except for Tree E. As it did not have a high compatibility, it was excluded.
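Ranking candidate trees by compatibility can be sketched under simplifying assumptions. Here a character counts as compatible with a tree when the minimum number of state changes on that tree (computed with Fitch's parsimony algorithm) equals the number of states minus one. The trees, characters and function names are hypothetical, and the sketch covers plain binary trees only, not networks with contact edges.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for a binary tree.

    `tree` is either a leaf name (str) or a pair (left, right).
    `states` maps each leaf name to its character state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    # No shared state: one extra change is needed at this node.
    return left_set | right_set, left_cost + right_cost + 1


def compatibility_score(tree, characters):
    """Count the characters that evolve on the tree without any repeated change."""
    score = 0
    for states in characters.values():
        _, changes = fitch(tree, states)
        if changes == len(set(states.values())) - 1:
            score += 1
    return score


# Hypothetical data: three binary characters over four invented languages.
characters = {
    "c1": {"a": 1, "b": 1, "c": 0, "d": 0},
    "c2": {"a": 1, "b": 0, "c": 0, "d": 0},
    "c3": {"a": 1, "b": 1, "c": 1, "d": 0},
}
candidates = {
    "tree_x": (("a", "b"), ("c", "d")),
    "tree_y": (("a", "c"), ("b", "d")),
}

ranked = sorted(
    candidates,
    key=lambda name: compatibility_score(candidates[name], characters),
    reverse=True,
)
print(ranked)
```

A tree with a low score leaves many characters unexplained by descent alone, which is the kind of ground on which a candidate might be set aside.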

Cladistic representation of the Mayan linguistic family, going back 4000 years. (The numbers represent proposed historical dates in the Common Era).
Family tree of Biblical tribes
Garden of Eden, home of the Ursprache
Kashmir (red), Adelung's location of Eden
Schleicher's tree model
Classification of African language families
A phylogenetic network, one of many posited by the CPHL. The phylogenetic tree appears in black lines. The contact edges are the red lines. Here there are three, the most parsimonious number required to generate a feasible network for Indo-European.