Differences

This shows you the differences between two versions of the page.

wiki:orthography [2019/08/21 21:00] chuck
wiki:orthography [2023/03/15 04:22] (current) chuck
Line 14: Line 14:
  
 Examples are given to show how the text is normalized; counterexamples are exceptions which are not normalized. Examples are given to show how the text is normalized; counterexamples are exceptions which are not normalized.
 +
 +Normalized spellings may not represent what is generally considered to be "correct" Sanskrit; however, they do reflect orthographic practices as attested in manuscripts. 
  
 === geminated t === === geminated t ===
Line 20: Line 22:
  * replaces -tt(h)- after consonantal/vocalic r, i, and pa with -t(h)-; replaces -tt(h)- preceding r, v, or y within a word with -t(h)-  * replaces -tt(h)- after consonantal/vocalic r, i, and pa with -t(h)-; replaces -tt(h)- preceding r, v, or y within a word with -t(h)-
  
-== examples == + * __examples__ 
- + arttha => artha 
-arttha => artha + saṃskṛtta => saṃskṛta 
-saṃskṛtta => saṃskṛta + prākritta => prākrita 
-prākritta => prākrita + tattvam => tatvam 
-tattvam => tatvam + pattram => patram 
-pattram => patram + * __counterexamples__ 
- + atty annam (source: [[http://www.sanskrit-linguistics.org/dcs/index.php?contents=texte&PhraseID=461691|DCS]]) 
-== counterexamples == +
- +
-atty annam (source: [[http://www.sanskrit-linguistics.org/dcs/index.php?contents=texte&PhraseID=461691|DCS]]) +
  
 === geminated consonants after r === === geminated consonants after r ===
-<code pcre>/(?<=[rṛṙ]|[rṛ]\s)(]kgcjṭḍṇdnpbmyvl)\1/\1/</code>+<code pcre>/(?<=[rṛṙ]|[rṛ]\s)([kgcjṭḍṇdnpbmyvl])\1/\1/</code>
  
  * replaces doubled consonants (excluding t) after consonantal/vocalic r with a single consonant (Cf. [[https://www.sanskritdictionary.com/panini/8-4-46|Aṣṭādhyāyī 8.4.46]])  * replaces doubled consonants (excluding t) after consonantal/vocalic r with a single consonant (Cf. [[https://www.sanskritdictionary.com/panini/8-4-46|Aṣṭādhyāyī 8.4.46]])
  
-== examples == + * __examples__ 
- + arddha => ardha 
-arddha => ardha + dharmma => dharma 
-dharmma => dharma + pṛcchati => pṛchati
-pṛcchati => pṛchati+
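
A minimal Python sketch of the rule above, using the corrected pattern from the newer revision. Python's `re` module rejects a look-behind whose alternatives differ in width, so the single PCRE look-behind is split into two separate look-behinds here. That split, the NFC normalization step, and the function name `degeminate_after_r` are assumptions of this sketch and are not taken from the wiki's own filter code.

```python
import re
import unicodedata

# PCRE original: (?<=[rṛṙ]|[rṛ]\s)([kgcjṭḍṇdnpbmyvl])\1
# Python's re needs fixed-width look-behinds, so the two alternatives
# are written as two separate look-behind assertions (an adaptation).
GEMINATE_AFTER_R = re.compile(
    unicodedata.normalize(
        "NFC", r"(?:(?<=[rṛṙ])|(?<=[rṛ]\s))([kgcjṭḍṇdnpbmyvl])\1"
    )
)

def degeminate_after_r(text: str) -> str:
    """Collapse a doubled consonant (other than t) that follows r or ṛ."""
    # NFC keeps letters such as ṛ as single code points,
    # so the character classes match them as intended.
    text = unicodedata.normalize("NFC", text)
    return GEMINATE_AFTER_R.sub(r"\1", text)

for word in ("arddha", "dharmma", "pṛcchati"):
    print(word, "=>", degeminate_after_r(word))
```

Because t is not in the character class, a form such as arttha passes through this rule unchanged and is left to the geminated t rule.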
  
 === geminated aspirated consonants === === geminated aspirated consonants ===
Line 48: Line 46:
  * replaces -jjh-, -tth-, -ṭṭh-, and -ddh- with -jh-, -th-, -ṭh-, and -dh- respectively (Cf. [[https://www.sanskritdictionary.com/panini/8-4-47|Aṣṭādhyāyī 8.4.47]])  * replaces -jjh-, -tth-, -ṭṭh-, and -ddh- with -jh-, -th-, -ṭh-, and -dh- respectively (Cf. [[https://www.sanskritdictionary.com/panini/8-4-47|Aṣṭādhyāyī 8.4.47]])
  
-examples+__examples__
  * attha => atha  * attha => atha
  * daddhi => dadhi  * daddhi => dadhi
Line 55: Line 53:
 <code pcre>/(?:ṃ[lśs]|nn)(?!\S)/n/</code> <code pcre>/(?:ṃ[lśs]|nn)(?!\S)/n/</code>
  
- * replaces -ṃl, -ṃś, -ṃs, and -nn with -n (Cf. [[https://www.sanskritdictionary.com/panini/8-3-7|Aṣṭādhyāyī 8.3.7]], etc.)+ * replaces final -ṃl, -ṃś, -ṃs, and -nn with -n (Cf. [[https://www.sanskritdictionary.com/panini/8-3-7|Aṣṭādhyāyī 8.3.7]], etc.)
  
-== examples == + * __examples__  
- + gacchaṃs tu => gacchan tu 
-gacchaṃs tu => gacchan tu + puruṣānn atti => puruṣān atti
-puruṣānn eva => puruṣān eva +
- +
-== counterexamples == +
- +
-aṃśa +
-annam+
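
The final-nasal pattern can be tried out directly in Python, since it uses only a negative look-ahead. The sketch below is illustrative: the function name `normalize_final_nasal` and the NFC normalization step are assumptions, not part of the wiki's filter code. It also shows why the word-internal forms aṃśa and annam are untouched: the look-ahead `(?!\S)` only succeeds before whitespace or the end of the string.

```python
import re
import unicodedata

# Same pattern as the wiki's PCRE filter: /(?:ṃ[lśs]|nn)(?!\S)/n/
FINAL_NASAL = re.compile(
    unicodedata.normalize("NFC", r"(?:ṃ[lśs]|nn)(?!\S)")
)

def normalize_final_nasal(text: str) -> str:
    """Replace word-final -ṃl, -ṃś, -ṃs, and -nn with -n."""
    # Normalize to NFC so that ṃ and ś are single code points.
    text = unicodedata.normalize("NFC", text)
    return FINAL_NASAL.sub("n", text)

for phrase in ("gacchaṃs tu", "puruṣānn atti", "aṃśa", "annam"):
    print(phrase, "=>", normalize_final_nasal(phrase))
```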
  
 === internal nasal variants === === internal nasal variants ===
Line 72: Line 64:
  * replaces nasals preceding certain consonants with an anusvāra (this regular expression is the opposite of rule [[https://www.sanskritdictionary.com/panini/8-4-58|Aṣṭādhyāyī 8.4.58]], so as to be more efficient)  * replaces nasals preceding certain consonants with an anusvāra (this regular expression is the opposite of rule [[https://www.sanskritdictionary.com/panini/8-4-58|Aṣṭādhyāyī 8.4.58]], so as to be more efficient)
  
-== examples ==+ * __examples__ 
 + * nandita => naṃdita 
 + * yuñjati => yuṃjati 
 + 
 +==== Script/scribe specific filters ==== 
 + 
 +Some normalization filters require a tag in your TEI header, because they only apply to certain scripts or specific scribal practices. When these filters are activated, they will only apply to those transcriptions which have the corresponding tag. 
 + 
 +=== pṛṣṭhamātrā vowels === 
 +<code pcre>/ê/e/ /î/i/ /ô/o/ /û/u/</code> 
 + 
 +In transcriptions of Devanāgarī sources, pṛṣṭhamātrā vowels are transcribed as ê, aî, ô, and aû (see [[wiki:transcription|Transcription conventions]]). 
 + 
 + * These filters require ''@mainLang="sa-Deva"'' in the ''<textLang>'' tag. 
 + 
 +=== valapalagilaka === 
 +<code pcre>/ṙ/r/</code> 
 + 
 +In transcriptions of Telugu sources, the valapalagilaka reph is transcribed as ṙ (see [[wiki:transcription|Transcription conventions]]). 
 + 
 + * This filter requires ''@mainLang="sa-Telu"'' in the ''<textLang>'' tag. 
 + 
 +=== ṭh written as ṭ === 
 +<code pcre>/ṭh/ṭ/</code> 
 + 
 +In some Devanāgarī manuscripts, it is common for ṭh to be written as ṭ. 
 + 
 + * This filter requires a ''<scriptNote>'' tag with the ''@xml:id="script-ṭha-ṭa"''.
 + 
 +=== b written as v === 
 +<code pcre>/b(?!h)/v/</code> 
 + 
 +In some scripts, b is not distinguished from v. 
 + 
 + * This filter requires a ''<scriptNote>'' tag with the ''@xml:id="script-ba-va"''.
 + 
 +=== dbh written as bhd === 
 +<code pcre>/bh(\s?)d(?!h)/d\1bh/</code>
  
-nandita => naṃdita +In some Devanāgarī manuscripts, the conjunct dbh is written as bhd.
-yuñjati => yuṃjati+
  
 + * This filter requires a ''<scriptNote>'' tag with the ''@xml:id="script-dbha-bhda"''.
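
The tag-dependent behaviour of these filters can be sketched in Python as a lookup from tag value to substitution rules. Everything structural here is hypothetical: the `SCRIPT_FILTERS` registry, the `apply_script_filters` function, and the idea of passing the `@mainLang` and `@xml:id` values as a plain set of strings are assumptions made for illustration. Only the patterns themselves come from the filters listed above.

```python
import re
import unicodedata

def nfc(s: str) -> str:
    """Normalize to NFC so letters with diacritics are single code points."""
    return unicodedata.normalize("NFC", s)

# Hypothetical registry: tag value -> list of (pattern, replacement).
# Keys stand in for @mainLang values and <scriptNote> @xml:id values.
SCRIPT_FILTERS = {
    nfc("sa-Deva"): [("ê", "e"), ("î", "i"), ("ô", "o"), ("û", "u")],
    nfc("sa-Telu"): [("ṙ", "r")],
    nfc("script-ṭha-ṭa"): [("ṭh", "ṭ")],
    nfc("script-ba-va"): [(r"b(?!h)", "v")],
    nfc("script-dbha-bhda"): [(r"bh(\s?)d(?!h)", r"d\1bh")],
}

def apply_script_filters(text: str, active_tags: set) -> str:
    """Apply only the filters whose tag is present for this transcription."""
    text = nfc(text)
    active = {nfc(tag) for tag in active_tags}
    for tag, rules in SCRIPT_FILTERS.items():
        if tag not in active:
            continue  # filter stays off without the corresponding tag
        for pattern, replacement in rules:
            text = re.sub(nfc(pattern), nfc(replacement), text)
    return text

print(apply_script_filters("dhaṙma", {"sa-Telu"}))
print(apply_script_filters("abhduta", {"script-dbha-bhda"}))
print(apply_script_filters("abhduta", set()))
```

With no tags supplied, the text is returned unchanged apart from NFC normalization, which mirrors the statement that these filters apply only to transcriptions carrying the corresponding tag.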