The context for this article is the lack of evidence for the traditional view of the origin of English: that English came from Europe and replaced Welsh. But this is nonsense, because it requires a total genocide of a complete population of Welsh-speaking Britons, leaving almost no Welsh speakers. Otherwise how does one explain the almost total absence of Welsh place names in Britain? That clearly did not happen, because there just is not the evidence to support it. A massacre of this scale would leave material evidence in abundance. It hasn't! So it didn't happen. That leads to one inescapable conclusion: the English always spoke a form of English even before the Romans invaded.

To summarise the previous articles: There is no evidence for the "Celts" being in Britain and Caesar explicitly says that the Celts were a tribe in France. There is no evidence for a genocide in England, so it must be assumed that whatever language was there before did not change dramatically. Likewise, there is no evidence for a wholesale change of French. It's nonsense to suggest a supposed Welsh-speaking France started speaking French (a language closely related to Latin) when the non-Latin Franks arrived. Gaulish is nonsense. The few words that supposedly form this language do not fit the required Welsh-speaking language. Instead this failure is hidden using the "Celtic" myth, which allows those looking to translate these words to turn to Irish. I show it's as easy to find the origin of these words in French, and so the idea they derive from Irish does not wash. Finally, in the last article I outlined how I see the languages in Britain before the Roman invasion.

Now in this article I go back much further, to the Ice Age, to suggest a possible means by which we arrived at the pre-Roman distribution of languages in Britain and also Europe.

And before I forget: thanks to all those who contribute to the discussion on Britarch who have prompted me to write this.

How did Welsh, Irish/Scottish Gaelic, Cornish etc. reach the areas where they were spoken in post-Roman times in Britain?

This question was raised by David Petts (and thanks for prompting this). I've previously said I just don't believe in the migration myth and the story of genocidal Anglo-Saxons completely wiping out a previous Welsh-speaking British race and leaving almost no trace at all of their language in England. (The only trace which is worth taking seriously is "Avon", Welsh "abon" (river) or "aber" (estuary), but "Avon" is so close to the Germanic-derived "Havern", or Harbour, that there is arguably a better origin in a Germanic language.)

However, whilst it is easy to highlight the lack of material evidence for the necessary genocide, it is less easy to answer the question: "when then did the various languages arrive?".


First, I'm going to have to suggest a new terminology, because there is no evidence for any Celts in Britain and, if the Britons were Germanic speaking, we cannot use "Brythonic" to mean "Welsh-like". Gaelic on the other hand is readily understood. So I will often use "Welsh-like" to refer to Breton, Cornish, Welsh, Cumbrian and also possibly Pictish. (These have falsely been referred to as "P-Celtic" - which is to my mind as daft as calling them "P-Klingon".)

Welsh as a term has problems, as it appears to be Anglo-Saxon for "foreigner/invader". So I would prefer something from Welsh. The Welsh call themselves "Cymry" and the name "Cumbria" is derived from the same root. So, a sensible name to use would be "Cumbric", as it is readily understood and correctly pronounced by non-Welsh speakers.

The third group is my proposed Germanic-like language speaking group whose closest recorded language is "Anglo-Saxon" - a name coined from the post-Roman elite invader group from whom we have the earliest texts, but one which then became synonymous with a European origin. So, I don't as yet have a specific name for it.

The origins of Cumbric (Welsh-like) and Gaelic

We have two linguistic groups which are some considerable linguistic distance apart. Serva & Petroni, in their paper "Indo-European languages tree by Levenshtein distance", provide a graph converting "Levenshtein distance" to years, from which it would appear possible to derive a date for language splits (as I've added in red). This is achieved by using known historical splits in language, such as the point when Iceland was settled around 1100BP.

This would appear to support a split between the Gaelic and Cumbric groups at something like 2500BC. Likewise the split between English and other Germanic groups would be around 1000BC, and the split between Breton and Welsh would be the same 1000BC.
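
To make the idea of a "Levenshtein distance" concrete, here is a minimal Python sketch. It is not the code used by Serva & Petroni: the sample word pairs, the normalisation by the longer word, the straight-line conversion to years and the calibration distance of 0.25 are all assumptions made purely for illustration. Only the settlement of Iceland around 1100BP comes from the text above.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-letter insertions, deletions or
    substitutions needed to turn word a into word b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter word in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete a letter
                               current[j - 1] + 1,       # insert a letter
                               previous[j - 1] + cost))  # swap a letter
        previous = current
    return previous[-1]


def normalised_distance(a: str, b: str) -> float:
    """Edit distance scaled to 0..1 by the longer word (an assumption)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def language_distance(words_a, words_b) -> float:
    """Average normalised distance over paired words with the same meaning."""
    pairs = list(zip(words_a, words_b))
    if not pairs:
        raise ValueError("need at least one word pair")
    return sum(normalised_distance(x, y) for x, y in pairs) / len(pairs)


# Calibration point: the date is from the text, the distance is hypothetical.
ICELAND_SPLIT_YEARS = 1100      # Iceland settled around 1100BP
ICELAND_SPLIT_DISTANCE = 0.25   # made-up value, NOT read from the paper


def years_since_split(distance: float,
                      calib_distance: float = ICELAND_SPLIT_DISTANCE,
                      calib_years: float = ICELAND_SPLIT_YEARS) -> float:
    """Straight-line scaling from one known split (a simplifying assumption;
    the curve in the paper need not be a straight line)."""
    return distance * calib_years / calib_distance


# Illustrative word pairs only; a real comparison needs a long, fixed list.
english = ["water", "night", "milk"]
german = ["wasser", "nacht", "milch"]
d = language_distance(english, german)
print(f"distance = {d:.2f}, implied split = {years_since_split(d):.0f} years ago")
```

With only three words and an invented calibration the printed date means nothing in itself; the point is simply that the whole dating rests on how distance is scaled to time.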


Fig 1: Linguistic separation of European and Indian Languages
from Serva & Petroni "Indo-European languages tree by Levenshtein distance"
Dates in red were not in paper and have been added.

That makes it possible that the arrival of the Gaelic-Cumbric group, with Gaelic located in Ireland and Cumbric mainly in western Britain, is linked in some way to the Atlantic Megalithic culture. The map below shows areas where megaliths are found, but it may be misleading, because one of the most striking features of the megalithic monuments is that they simply do not appear in any significant concentrations on the European mainland. Instead the great civic centres of the European megalithic culture are to be found huddled along the Atlantic seaboard, primarily in Ireland, Britain and Brittany.

Fig 2: Atlantic Megalithic Culture
Fig 3: Areas of Cumbric languages

This means there is a close relationship between megalithic culture and areas which were later occupied by Cumbric and Gaelic speaking populations.

Whether or not the arrival of this megalithic culture also means that Gaelic and Cumbric languages arrived depends on the scale between "Levenshtein distance" and time, but it cannot be denied that the megaliths appear to be absent from much of the area which has a first recorded language of Germanic origin.

The Language Tree

The whole idea of "families" of languages appears to derive directly from a period when similar ideas were being used in other areas, notably the idea of a "family of animals".

Fig 4 Tree of Invertebrates showing how different groups
have split over time (Source: Wikipedia)

Originally (pre-Darwin), these "families" were more a way of filing animals. So, e.g. "Reptiles" referred to any animal which was cold-blooded like lizards. So, all the reptiles were filed together, all the birds, all the insects etc.

So, originally these were groups only intended as a handy way of organising animals in big collections. Then along came Darwin, who around 1859 suggested that these groups were not just random likenesses but had a cause, and the old "they are like each other" became "they share a common ancestor".

So, the "family tree" of life might have started as a convenient way to organise information about living things, but it turned out to be a true representation of how animals evolve because (for the most part) animals take all their genetics from their parents.

So, a few early species are the ancestors of all animals sharing group traits, such as the invertebrates shown in Fig 4. Moreover, because genetic mutations arise pretty much at random, the rate of change of DNA is fairly constant and so the change in animals is almost a "clock".

However, when the linguists tried to use the same "family tree" analogy for languages, this implied the same strict rule that "language is only inherited", which is not true, and so the idea of a clock based on the rate of divergence of languages does not work.

The problem is that this "family tree" concept requires that all language is inherited from a single "parent" as if each new generation received its entire vocabulary from their parents in the same way as we receive our entire DNA.

Of course that is baloney (Etymology: a variant of bologna sausage, perhaps influenced by blarney). Yes, most of our language comes from parents and in this sense is inherited, but also each generation takes language elements from its surroundings and if that surrounding includes foreign languages with new words for new items, our language is clearly obtained both by inheritance and also by "diffusion" between languages with close geographical links.

But it's also true that we are constantly adjusting our language to match those we talk to. In this way, the very act of using a language tends to enforce a common usage, which tends to "hold together" a language. In the extreme, such as Britain since television, telephones and other media, the differences between regional accents are being reduced, as we all communicate with a much more regionally diverse group, which tends to enforce commonality across much larger geographical areas.

So, unlike the "tree of life", the "tree of language" shows divergence, as new elements are added to the language (e.g. new technology needs new words), but it also shows convergence, as geographically close groups communicating with each other tend to pull the languages together.

So, the traditional idea that "languages diverge from one ancient ancestor" is false, because communication between groups tends to pull the languages together, so that in the extreme, such as Norman French and Anglo-Saxon, the two languages amalgamate into one. That means two languages that are "close" could have either:

  1. Diverged - they both came from one original language and over time differences increased the linguistic distance between them,
  2. Converged - so that they were originally two very different languages which for some reason began being used together and so had enough exchange of words and other language features that eventually they grew to be alike.
Fig 5 Divergence and convergence leads to Mosaic of difference

So, there are two forces in linguistics:


Divergence is a force which, like genetics, tends to increase the diversity of a language. The rate of change depends on the rate at which new words come into use. This will depend on the rate of change of things like technology, fashion and religion, or, if groups move, on the need for new names for new geography and animals. If we assume technology change is constant, this would be analogous to the rate of change of DNA.


Convergence is a force entirely unlike genetics - because these changes are not inherited but are learnt from our environment.

The rate of convergence between two groups is dependent on the amount of direct and indirect communication between the groups.

As such we expect convergence to decrease with distance, so that we have an equation of the form:

Total Divergence = Time x (Divergence - Convergence / Distance)
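
As a rough illustration of how this equation behaves, here is a minimal Python sketch. The function name, the parameter names and every number in it are hypothetical and in arbitrary units; nothing here has been measured, it simply evaluates the formula as written.

```python
def total_divergence(time: float, divergence: float,
                     convergence: float, distance: float) -> float:
    """Total Divergence = Time x (Divergence - Convergence / Distance).

    All quantities are in arbitrary, made-up units (an assumption).
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return time * (divergence - convergence / distance)


# Hypothetical rates: the same two forces acting on near and far neighbours.
TIME = 100.0
DIVERGENCE = 2.0
CONVERGENCE = 4.0

for distance in (1.0, 2.0, 8.0):
    result = total_divergence(TIME, DIVERGENCE, CONVERGENCE, distance)
    print(f"distance {distance}: total divergence {result}")
```

A negative result means that, under these assumed rates, convergence outweighs divergence and the two languages are being pulled together. A positive result means they are drifting apart, and the drift grows as the groups sit further from each other.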

This should lead to a mosaic of languages as shown in Fig 5, with each colour representing an area where language tends to converge, while outside these areas the dominant force becomes divergence. However, even though languages will be diverging, there will be a continuous supply