Centum and satem Languages

Deborah Anderson
Department of Linguistics
University of California, Berkeley

In a lecture given in 1786, Sir William Jones, a judge of the Supreme Court in Bengal and founder of the Asiatic Society, noted the strong relationship between the verbal roots and grammatical forms of Sanskrit, Greek, and Latin. This similarity, he remarked, could not have been produced by accident; these languages must have originated from a common source. He added that Gothic, Celtic, and Old Persian may have come from the same origin. Others before him had noted the similarity between Sanskrit and other languages by comparing words across languages. Though he was not the first, Jones is often credited with the birth of Indo-European linguistics for eloquently stating that a common source, later identified as Proto-Indo-European, was the ancestor of these related languages.

The discovery of sound laws in the 1860's helped to establish the foundation of comparative Indo-European linguistics. It was these regularly occurring sound laws that allowed comparisons to be made; exceptions to the laws needed to be explained. Today the study of IE linguistics draws on work done in phonetics, dialectology, typology, and other fields, but the basis of comparison still rests on the set of correspondences between the languages.

An important Indo-European isogloss

Examining the words for 'hundred' in various Indo-European languages reveals an important pattern:

Lang. Family       Language            Word for 'hundred'

Indo-Iranian       Sanskrit            satam [acute on s and last a]
                   Avestan             satem [e is upside down]
Baltic             Lithuanian          simtas [hacek on first s,
                                               squiggly line above m]
Slavic             Old Church Slavic   suto [short mark above u]

Italic             Latin               centum
Greek              Greek               hekaton [acute on o]
Celtic             Old Irish           cet [long mark over the e]
                   Welsh               cant
Germanic           English             hund-red  (Note: original k-sound
                                         becomes a sound represented here by
                                         an h via a regular process in
                                         Germanic.)
Tocharian          Tocharian           kant [umlaut over a]

In Sanskrit, Avestan, Lithuanian, and Old Church Slavic the initial consonant appears as an s- (or sh-) sound (a sibilant), whereas Greek, Latin, Old Irish, Welsh, English, and Tocharian have a k- sound (a velar or a palato-velar) or a sound that developed from one. This correspondence, mirrored in many other word sets, was identified as an important Indo-European isogloss (a boundary line that can be drawn based upon a particular linguistic feature): Indo-Iranian, Baltic, Slavic, Albanian, and Armenian have a sibilant for PIE *k', whereas Greek, Latin, Celtic, Germanic, and Tocharian maintain the k- sound. Those languages with the s- (sh-) sound are classified as satem (after the 'hundred' word in Avestan); those which have a k- sound are the centum languages (after the Latin word).
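The classification just described can be sketched in a few lines of Python. This is only a toy illustration built on the 'hundred' words from the table above, written in plain ASCII with the diacritics dropped. The function name classify, the lookup sets, and the rule for skipping the initial he- of Greek hekaton (so that the k is examined) are assumptions made for this sketch, not part of any standard tool or of the comparative method itself.

```python
# Words for 'hundred' from the table above (ASCII, diacritics omitted).
HUNDRED = {
    "Sanskrit": "satam",
    "Avestan": "satem",
    "Lithuanian": "simtas",
    "Old Church Slavic": "suto",
    "Latin": "centum",
    "Greek": "hekaton",
    "Old Irish": "cet",
    "Welsh": "cant",
    "English": "hundred",
    "Tocharian": "kant",
}

# Assumption for this sketch: in Greek hekaton the relevant consonant is
# the k after the initial he-, so that element is skipped before testing.
SKIP_INITIAL = {"Greek": "he"}

# Spellings that count as a sibilant outcome or as a k-type outcome.
# English h is included because it continues an earlier k-sound
# (see the note in the table).
SIBILANT_SPELLINGS = {"s"}
VELAR_SPELLINGS = {"c", "k", "h"}


def classify(language, word):
    """Return 'satem', 'centum', or 'unclassified' for one 'hundred' word."""
    skipped = SKIP_INITIAL.get(language, "")
    if skipped and word.startswith(skipped):
        word = word[len(skipped):]
    first = word[0]
    if first in SIBILANT_SPELLINGS:
        return "satem"
    if first in VELAR_SPELLINGS:
        return "centum"
    return "unclassified"


if __name__ == "__main__":
    for language, word in HUNDRED.items():
        print(f"{language:18} {word:10} {classify(language, word)}")
```

A single word is of course not enough to classify a language in practice; the grouping rests on the same correspondence recurring across many word sets.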

Note that Tocharian, found in far western China, is a centum language as is Hittite (found in Anatolia) so that a strict satem = east, centum = west rule-of-thumb doesn't work.

The original form of the word for 'hundred' in Proto-Indo-European was *(d)kmtom [k with an acute above it or k' can be used; dot under m; acute on o], which shows that the centum group has actually retained the original sound of the velar but the satem group has changed the sound; it moved the articulation forward in the mouth.

The satem/centum grouping holds fairly well for the outcomes of other dorsals (that is, all kinds of k-sounds) in Indo-European. The example above demonstrates the outcome for PIE *k' [k with an acute above it or k' can be used]. By looking at various correspondences, a table can be created showing the various outcomes in the different languages (adapted from Beekes 1995: 110). The reconstructed Proto-Indo-European form is on the left, with the outcomes that appear in cognate words to the right. (The variant outcomes listed below depend largely upon preceding or following sounds or position in a word, particularly initial position. For details on the particular environments, see Beekes.)

Series One: Velars / Palato-velars

       S A T E M                      C E N T U M
PIE    Skt   Av    OCS   Lith  Arm    Toch.   Hitt.  Greek   Latin   OIr   Gothic
*k'    s'    s     s     s/    s      k, s/   k      k       c       c     h, g
*g'    j     z     z     z/    c      k, s/   k      g       g       g     k
*g'h   h     z     z     z/    j, z   k, s/   k      kh      h, g    g     g

A second series has been postulated, the plain velars. However, no IE language clearly retains all three series. (There is some debate about whether Albanian retains all three.) As reflected in the chart below, the satem languages have either a velar or a sibilant, while the centum languages have either a velar (or palato-velar) or a labiovelar. The plain velars occur only in certain environments, i.e., only after *u and *s and before *r and *a, so they appear to be conditioned variants of the other series.

Series Two: Plain Velars

S A T E M                         C E N T U M                         
PIE  Skt  Av  OCS  Lith  Arm      Toch.  Hitt.  Greek  Latin   OIr   Gothic
*g      outcomes as below             outcomes as above 
          (labiovelars)           (velars / palato-velars)                                

A third series is well attested, the labiovelars, which combine the velar with a labial element (represented by the superscripted w). Note that in the satem languages the labial element is lost, so that here, once again, it is the satem languages that differ from the reconstructed Proto-Indo-European forms.

Series Three: Labiovelars

        S A T E M                        C E N T U M
PIE     Skt   Av    OCS      Lith  Arm   Toch.  Hitt.  Greek     Latin      OIr  Gothic
*kw}    k,c   k,c   k,c/,c   k     k'    k,s/   ku     p,t,k     qu,c       c    hw
*gw}    g,j   g,j   g,z/,dz  g     k     k,s/   ku     b,d,g     gu,g,v     b    q
*gw}h   gh,h  g,j   g,z/,dz  g     g,j/  k,s/   ku     ph,th,kh  gu,g,v,f-  g    g,gw,w
[' Acute on previous letter; / is hacek on previous letter; } superscript previous letter.]
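The overall pattern behind the three tables can be summarized schematically: each branch ends up with two series where three are reconstructed. The following Python sketch is a deliberately simplified model under that assumption. It uses the ASCII notation of this page without the } marker (k' for the palato-velar, kw for the labiovelar), and the function names and the "S:" tag marking a sibilant-type outcome are hypothetical labels invented for illustration. The actual reflexes in each language are those given in the tables, not the output of this code.

```python
# The three reconstructed dorsal series, in simplified ASCII notation.
PALATO_VELARS = ["k'", "g'", "g'h"]
PLAIN_VELARS = ["k", "g", "gh"]
LABIOVELARS = ["kw", "gw", "gwh"]
ALL_DORSALS = PALATO_VELARS + PLAIN_VELARS + LABIOVELARS


def centum_outcome(pie):
    """Centum treatment: palato-velars fall together with the plain
    velars, and labiovelars keep their labial element."""
    return pie.replace("'", "")


def satem_outcome(pie):
    """Satem treatment: palato-velars give sibilant-type sounds (tagged
    'S:' here as a placeholder), and labiovelars lose the labial
    element, falling together with the plain velars."""
    if "'" in pie:
        return "S:" + pie
    return pie.replace("w", "")


if __name__ == "__main__":
    for pie in ALL_DORSALS:
        print(f"*{pie:4} centum: {centum_outcome(pie):4} satem: {satem_outcome(pie)}")
```

In this model both branches reduce nine reconstructed sounds to six distinct outcomes, but they do so by merging different pairs of series, which is what the isogloss captures.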

The satem/centum distinction is evident in the outcomes listed above and is an important isogloss. What does it indicate in terms of IE origins and IE distribution? As observed above, the centum languages retain the PIE articulation better than the satem group: the palato-velars in the centum group did not become sibilants, and the labial element of the labiovelars was retained. In dialect geography, the more conservative elements are retained in the geographic periphery, away from the central area where innovation is taking place (in this case, where the satem languages are). Using the satem/centum isogloss as a guide, Indo-Iranian, Baltic, Slavic, Armenian, and Albanian would form one central, innovating area. However, it is important to take other isoglosses into consideration as well in arriving at an adequate model for the PIE situation.

© 1998 by Deborah Anderson

For further information:

Beekes, Robert S.P. Comparative Indo-European Linguistics: an introduction. Amsterdam/Philadelphia: John Benjamins, 1995. [A useful handbook now out in paperback. Centum / satem discussion is found on pages 109-113 and 129.]

Sihler, Andrew L. New Comparative Grammar of Greek and Latin. New York and Oxford: Oxford University Press, 1995. [This topic is discussed on pages 7, 151-154.]

