Wednesday, 19 October 2011

Phonemic Writing System



Writing system should be based on phoneme rather than syllable

A phoneme is the smallest unit of sound in a language. The phonemes of a language are finite in number, and that number is reasonably small. A writing system based on phonemes can therefore be simple, compact and straightforward. In writing systems based on syllables, the numerous combinations of phonemes that form syllables make the system complex, elaborate and convoluted. Here ‘syllable’ is used in a general sense to mean a conjunct consonant, usually consisting of 1 to 3 consonants, followed by a vowel. In English, the writing system is based on letters, but it is not fully phonetic. In many Indian languages, the writing system is syllabic. Since syllables are composed of phonemes, these syllabic writing systems are phonetic. The scripts of South Asian languages such as Kannada, Sanskrit, Tamil, Telugu, Malayalam, Gujarati, Gurmukhi, Bengali, Oriya, Sinhalese, Tibetan and Burmese (Myanmar) are examples of writing systems that are syllabic and phonetic.
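
To give a rough sense of how quickly syllable combinations outgrow the phoneme inventory, here is a minimal Python sketch. It assumes the simplified syllable shape described above (1 to 3 consonants followed by a vowel) and treats every combination as possible, which no real language does. The function name and the inventory sizes are illustrative assumptions, not figures for any particular language.

```python
def count_syllable_shapes(consonants: int, vowels: int, max_cluster: int = 3) -> int:
    """Upper bound on distinct syllables of the form C..CV.

    Assumes every cluster of 1..max_cluster consonants can precede every vowel.
    """
    clusters = sum(consonants ** k for k in range(1, max_cluster + 1))
    return clusters * vowels


# Hypothetical inventory: 30 consonants and 10 vowels (40 phonemes in all).
phoneme_count = 30 + 10
syllable_count = count_syllable_shapes(30, 10)
print(phoneme_count, syllable_count)
```

Even as a loose upper bound, the number of possible syllable shapes comes out thousands of times larger than the number of phonemes, which is the point made above about syllable-based systems.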

Each letter should correspond to one and only one unique phoneme and vice versa 

A strictly phonetic writing system is highly desirable. Such a system removes the burden of remembering the spelling of each word. When we encounter a name or a new word in a non-phonetic writing system, we need to learn both its spelling and its pronunciation. With a purely phonetic writing system, the problem of pronunciation disappears. For a writing system to be phonetic, it is essential that one letter represents one phoneme only and, conversely, that one phoneme is represented by one letter only. One-to-one correspondence between phonemes and letters is the basic norm of a phonetic writing system. Lack of this correspondence results in ambiguity and is against the spirit of a phonetic script. For example, in English the letter ‘C’ is sometimes used to represent ‘k’ (velar plosive) and sometimes ‘s’ (alveolar fricative), although the letters ‘K’ and ‘S’ already exist for this purpose. Similarly, the letters ‘Q’, ‘W’, ‘X’ and ‘Y’ of English also break the norms of a phonetic writing system. Such duality or duplication in representing phonemes is to be avoided in phonetic writing systems.
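
The one-to-one requirement can be checked mechanically. The following is a minimal sketch, assuming the spelling conventions of a script are available as a list of (letter, phoneme) pairs; the function name and the tiny sample data are hypothetical and only mirror the ‘C’ example above.

```python
from collections import defaultdict


def correspondence_violations(pairs):
    """Return letters with several phonemes and phonemes with several letters.

    `pairs` is an iterable of (letter, phoneme) tuples.
    """
    by_letter = defaultdict(set)
    by_phoneme = defaultdict(set)
    for letter, phoneme in pairs:
        by_letter[letter].add(phoneme)
        by_phoneme[phoneme].add(letter)
    ambiguous_letters = {l: sorted(p) for l, p in by_letter.items() if len(p) > 1}
    duplicated_phonemes = {p: sorted(l) for p, l in by_phoneme.items() if len(l) > 1}
    return ambiguous_letters, duplicated_phonemes


# Simplified, illustrative sample: 'c' stands for both 'k' and 's'.
sample = [("c", "k"), ("c", "s"), ("k", "k"), ("s", "s")]
ambiguous, duplicated = correspondence_violations(sample)
print(ambiguous)   # letters that stand for more than one phoneme
print(duplicated)  # phonemes written with more than one letter
```

A script satisfies the norm only when both returned dictionaries are empty.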

Writing system should be free from considerations that are truly extraneous

It is desirable to keep certain linguistic considerations separate from the writing system when they are truly extraneous to a pure phonetic writing system. Let the writing system be used purely for recording spoken language faithfully and correctly. Since we can understand spoken language, we can understand it without any ambiguity when it is recorded in a phonetic writing system[s1]. Punctuation marks, which add to the clarity of writing, are welcome.

Minimum number of letters required to represent all the phonemes of a given language fully and effectively

In a phonetic writing system, the number of letters cannot exceed the number of phonemes of the language; it may be equal or smaller. It is equal when each phoneme is assigned a separate letter. Where certain phonemes form a group, one common letter can be assigned to the group and diacritics used to distinguish its members. In many Indian languages, some consonants are aspirated. A diacritic can be used to indicate that a consonant is aspirated. Like the alphabets of many Indian languages, the Kannada alphabet contains ten such pairs, each consisting of an unaspirated and an aspirated consonant: ಕ ಖ, ಗ ಘ, ಚ ಛ, ಜ ಝ, ಟ ಠ, ಡ ಢ, ತ ಥ, ದ ಧ, ಪ ಫ, ಬ ಭ. Of these ten pairs, the ಡ, ದ, ಪ and ಬ pairs use a diacritic to mark the aspirated consonant. This diacritic could have been used for the remaining pairs as well. Using a diacritic for all the aspirated consonants would give uniformity and consistency. Kannada Sahitya Parishath has made a proposal in this direction[s2].

As far as possible, one glyph should be used for a diacritic that performs one particular function. Functional overloading of a glyph is better avoided, since it results in a lack of uniformity and consistency and may be a little confusing. Likewise, using more than one glyph for a diacritic that performs a single function should be avoided. For example, in Kannada the diacritic for long vowels uses more than one glyph. The function of the diacritic is simply to indicate that the vowel is lengthened; one glyph would have been sufficient for consistency and uniformity.

If a feature of a writing system is used on several different occasions, a diacritic can be introduced to bring in uniformity and consistency without sacrificing the phonetic character of the writing system. In Kannada there are 5 groups of consonants called vargeeya vyanjanas. Each group is called a varga and consists of 5 consonants. The diacritic anuswaara is used to represent the fifth consonant of the corresponding varga. Using the anuswaara in this way simplifies writing without causing any ambiguity: ಕೊಂಕು, ಸಂಘ, ತೆಂಗು, ಕೊಂಚ, ಮಂಜು, ಬಂಟ, ಗಂಡ, ಕಂತೆ, ಹಿಂದಿ, ಕಂಪು, ನಿಂಬೆ, ತಿಂಮ, ಗಂಭೀರ, ಅಂತ್ಯ, ಇಂಪು, ಉಂಟು, ಓಂಕಾರ.
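
The anuswaara rule is regular enough to be expressed in a few lines of code. The sketch below is an illustration under stated assumptions, not a complete treatment of Kannada orthography: it assumes Unicode-encoded Kannada text, it only handles an anuswaara immediately followed by a varga consonant, and the names `expand_anusvara`, `VARGA_RANGES` and `NASAL_FOR` are made up for this example.

```python
ANUSVARA = "\u0c82"  # Kannada sign anusvara
VIRAMA = "\u0ccd"    # Kannada sign virama (marks a consonant without a vowel)

# The five vargas as Unicode code point ranges; the last letter of each is its nasal.
VARGA_RANGES = [
    range(0x0C95, 0x0C9A),  # ka-varga
    range(0x0C9A, 0x0C9F),  # ca-varga
    range(0x0C9F, 0x0CA4),  # Ta-varga (retroflex)
    range(0x0CA4, 0x0CA9),  # ta-varga (dental)
    range(0x0CAA, 0x0CAF),  # pa-varga
]
NASAL_FOR = {chr(cp): chr(r[-1]) for r in VARGA_RANGES for cp in r}


def expand_anusvara(text: str) -> str:
    """Replace anusvara + varga consonant with the varga's nasal + virama."""
    out = []
    for i, ch in enumerate(text):
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == ANUSVARA and nxt in NASAL_FOR:
            out.append(NASAL_FOR[nxt] + VIRAMA)
        else:
            out.append(ch)
    return "".join(out)


# "gaMDa" written with anusvara: GA, ANUSVARA, DDA.
word = "\u0c97\u0c82\u0ca1"
print(expand_anusvara(word))
```

Because the nasal is fully determined by the consonant that follows, the shorter spelling with the anuswaara loses no information, which is why it causes no ambiguity.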

Flow of writing should strictly follow the flow of speaking.

The direction of writing in a writing system should be completely unidirectional. Many languages are written from left to right. Some languages have bidirectional writing systems; Arabic, Urdu, Farsi and Hebrew are examples. In Urdu, the usual direction of writing is from right to left, but numbers are written from left to right. When text from a language written left to right is embedded in Urdu, that text runs from left to right, and when the Urdu text resumes the direction reverts to right to left.

Many South Asian languages are written from left to right. However, in some of them the writing system is not strictly unidirectional. In Sanskrit, Tamil, Malayalam, Gujarati, Punjabi, Bengali, Oriya, Sinhalese and some other languages, writing does not strictly follow the order of the phonemes in speech. Some examples: Sanskrit - नि, र्वा (ni, rvA); Bengali - কে, কৈ, কো, কৌ (ke, kai, ko, kau); Malayalam - കെ, കേ, കൈ, കൊ, കോ, കൌ (ke, kE, kai, ko, kO, kau); Kannada - ಕ್ಷೋಭೆ, ಸ್ತ್ರೀ (kShObhe, strI[s3]). In नि, the diacritic for the vowel is written first and the consonant that precedes the vowel in speech is written after it. In र्वा, the consonant ‘r’, which precedes the consonant ‘v’ in speech, is written after the following consonant and the vowel. In Bengali and Malayalam, the diacritics for the first two vowels shown are written before the consonant that precedes them in speech; for the remaining vowels shown, the diacritics are written partly before (pre-base glyph) and partly after (post-base glyph) the consonant concerned. The examples cited here are from Sanskrit, Bengali, Malayalam and Kannada, but similar features are found in many scripts descended from the Brahmi script, such as Assamese, Gujarati, Gurmukhi, Hindi, Oriya, Punjabi, Sinhalese and Tamil. In a true phonemic writing system, the order in which the phonemes are pronounced should be followed strictly, so that reading what is written is simple and straightforward. Programs that process text (sorting, indexing and searching, OCR for recognition and retrieval of printed documents, and so on) can then be coded in a straightforward way, and such programs can be robust and less error prone.
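
For text that is already stored as Unicode, part of this problem is handled at the encoding level: the code points of these scripts are stored in pronunciation order, and the visual reordering is left to the font and rendering software. The short Python sketch below only illustrates that observation using the standard `unicodedata` module; the helper name `code_point_names` is an assumption for this example, and it says nothing about OCR or handwriting, where the visual order still has to be dealt with.

```python
import unicodedata


def code_point_names(text: str) -> list[str]:
    """List the Unicode names of the code points in storage order."""
    return [unicodedata.name(ch) for ch in text]


# Devanagari "ni": the vowel sign is drawn to the left of the consonant,
# but the stored sequence is consonant first, vowel sign second.
devanagari_ni = "\u0928\u093f"
# Bengali "ke": the vowel sign is likewise drawn before the consonant.
bengali_ke = "\u0995\u09c7"

print(code_point_names(devanagari_ni))
print(code_point_names(bengali_ke))
```

Sorting and searching over such stored text can therefore work on the phonemic sequence directly, while reading the printed page still requires the reader, or an OCR program, to undo the visual reordering.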

Each letter should have equal weight in its visibility and size.

Many South Asian languages use syllabic writing systems. A consonant is joined with the following consonant or consonants until a vowel occurs, and the resulting conjunct consonant is called a syllable. Generally, the number of consonants preceding the vowel is 1 to 3. In Tamil, every consonant of a syllable except the last is marked as being devoid of a vowel by a diacritic, usually a dot above the letter, and the last consonant of the syllable is rightly written along with the diacritic of the vowel that follows it. Thus the size of the characters remains the same in Tamil.

In Sanskrit and some other Indian languages whose scripts are descendants of the Brahmi script, each consonant of the syllable except the last is clipped at the end and joined to the following consonant to form the conjunct consonant; this method of writing syllables makes reading less comfortable. Let us look at a few examples from Sanskrit: धन्यवादः, अस्ति, उष्णोदकम्, अन्यत्र, भविष्यन्ति. Clipping letters in this way makes reading less facile and makes the coding of OCR character recognition and other programs convoluted.

In Kannada and Telugu, the following method is adopted for writing conjunct consonants. The first consonant of the syllable is written in full, along with the diacritic of the vowel that occurs at the end of the syllable (!), and the following consonants of the syllable are written below it using shapes meant particularly for this purpose. These shapes may or may not resemble the normal consonant letters, so a separate set of consonant diacritics is needed. Here are a few such words: ಸೂಕ್ಷ್ಮ, ಭಕ್ಷ್ಯ, ಪ್ರಾಶಸ್ತ್ಯ, ಸ್ವಾತಂತ್ರ್ಯ, ತೀಕ್ಷ್ಣ, ಲಕ್ಷ್ಮಿ, ಕ್ಷೋಭೆ, ಸ್ತ್ರೀ, ಅಸ್ತ್ರ, ಶಾಸ್ತ್ರಿ, ವಕ್ತ್ರ, ವೈಶಿಷ್ಟ್ಯ, ನಿಷ್ಕ್ರಮಣ. This method has some problems. The second and subsequent consonants are small, which makes them less comfortable to read. Text printed in a small font size is difficult to read because the consonants written below others are smaller still, and they also take up more line space. Displaying such syllables on pixel-based digital display boards takes more pixels, resulting in larger display boards. The method also makes the coding of OCR text recognition programs complex and error prone.

Shapes of letters should be simple and not be intricate or clustered.

The characters used for the letters of an alphabet should be made up of simple strokes and should be easy to distinguish from one another. Intricate and clustered characters are better avoided. A uniform density of strokes across characters is desirable. Such letter shapes make reading easy and allow smaller font sizes to be used in print and on pixel display boards. The capital letters of English, A B D E F G H I J K L M N O P R S T U V Z, are good examples. They are made of simple lines, are not intricate or clustered, and are quite distinct from one another. They are therefore easy to read, OCR text recognition programs for them are easy to code, text in small font sizes remains easy to read, and smaller pixel display boards suffice.

If we observe the shapes of the letters in the alphabets of some Indian languages, we can see that they are built from one or two basic strokes, with additional strokes making each letter distinct from all the others. Forming letters from a few basic strokes gives each alphabet a distinctive look, so the language can be made out easily even by someone who does not know the alphabet fully. For example, many Sanskrit letters have one vertical stroke and one horizontal stroke at the top. Similarly, most letters of the Oriya language have one comparatively large circular segment, with the additional strokes concentrated mostly at the bottom. In Bengali, Sanskrit and Gujarati, the diacritics for the vowels u, U, Ru, RU, Lu and LU (Bengali - কু, কূ, কৃ, কৄ; Sanskrit - कु, कू, कृ, कॄ, कॢ, कॣ; Gujarati - K×ú, KÚú, KÝú, Kàú) are written in small size below the character of the main consonant. Similarly, in Kannada and Telugu the diacritics for Ru, RU and ai (Kannada - ಕೃ, ಕೄ, ಕೈ; Telugu - కృ, కౄ, కై) are written in small size below the character of the main consonant. The diacritics for the vowels u, U and Ru in Malayalam (കു, കൂ, കൃ) behave the same way: they start below the character of the consonant concerned and have smaller circular shapes. As discussed earlier, the conjunct consonants of Telugu and Kannada also place smaller vowelless consonant diacritics below the normal letter. The font size used in print or on a pixel display board has to be chosen so that the finest detail of a letter is shown clearly; consequently, a bigger font size and a bigger display board are required. In addition, more line space is required to accommodate the parts of letters that are written below the normal baseline.


Why discuss the characteristics of a writing system?

Language is living and responsive to change. Some old words become obsolete, some change in pronunciation and meaning, new words are formed, and words from other languages and from various branches of human knowledge (legal, scientific, engineering, technical and many more) are imported with or without modification. Society influences the changes in a language, and the needs and aspirations of society are met by changes in and growth of the language. Writing systems have evolved over a long period of time and have been modified many times. Many of our Indian scripts, including Bengali, Malayalam, Kannada, Sanskrit, Tamil and Telugu, have evolved from the Brahmi script to their current versions. Indian writing systems are phonemic, but they are alphasyllabaries. A vowel occurring at the beginning of a syllable is written in its main character form. When a vowel occurs after one or more consonants, the result is a conjunct consonant, and a glyph is formed for it using diacritics for its consonants and for the vowel following them. This feature, and the disadvantage of representing a conjunct consonant in this way, has been discussed in the preceding paragraphs. Thousands of years ago, or even a few centuries ago, poets and litterateurs were interested in excelling in proficiency and in exhibiting their profound knowledge in their literary works, and were proud of their highly classical attitude towards them. Ease of use and simplicity were never a consideration for them. Now, with changed times, there is a need for language to be simple, easy and accessible to one and all. Fast and facile reading is desirable. A language should address the needs and aspirations of the entire society that uses it. There is a felt need to reform Indian writing systems and make them simpler and easier, for the benefit of the widest range of people using the language.

For using computers in OCR text recognition, text to speech, speech to text and other text processing work, a simpler writing system is desirable. A phonemic writing system that is strictly phonetic is the better choice and is to be preferred. There is a need to reform existing writing systems. There are also many languages in India without any writing system, and easy and efficient writing systems need to be developed for them. Identifying the characteristics of a good phonemic writing system helps in both tasks.

Existing writing systems - Reformation

Reforming an existing writing system is a challenging task. The usual human tendency is to adjust to the present system and live with its compromises. In a reformation, reason must prevail so that the shortcomings endured in everyday use are overcome.

In favor of reformation

A writing system should be strictly phonemic. English, though widely used, is not phonemic. The avoidable burden of spelling that differs from pronunciation is a problem yet to be solved.

Though the Indian writing systems that are descendants of the Brahmi script are phonetic, much can be gained if these scripts are based on phonemes rather than on syllables.

Reluctance and resistance are the stumbling blocks, and lame excuses are put forth. What is needed now is a sincere and rational effort to appreciate the advantages and benefits of reforming the writing system. Efforts should be made to find ways and means of successfully bringing about reforms in writing systems and popularizing them.

Against reformation

Reformations that appear rational and beneficial do not get the popular support they require. The society using a language is so widespread that any change to the existing system is practically unwieldy. Reformations render the whole treasure of existing books, databases and repertoire of human knowledge obsolete in their present written form. A lot of effort and money may have to be spent to bring them into conformity with the proposed reformation.

Reformation of existing writing systems

A lot of deliberation is necessary. The pros and cons of the proposed reformations need to be studied carefully. In some cases, compromises on the reformations considered necessary may be unavoidable, particularly to achieve public acceptance.

 

Languages which do not have writing systems

There are many languages that do not have any script. Developing an ideal writing system for such a language is less of a challenge than reforming an existing writing system. A simple and efficient writing system can be devised with diligent effort.

 

 


[s1] Susana Doval Suárez, The English Spelling Reform in the Light of the Works of Richard Mulcaster and John Hart, University of Santiago de Compostela.

[s2] Kannada Nudi, October 1, 1942, Kannada Sahitya Parishattu.

[s3] Prof. R.K. Joshi and Keyur Shroff (National Center for Software Technology, Juhu, Mumbai, India - 400 049) and Dr. S. P. Mudur (Concordia University, Canada), A Phonemic Code Based Scheme for Effective Processing of Indian Languages.
