The Indian Script Code for Information Interchange (ISCII) defines a common encoding system for ten different Indian scripts derived from Brahmi. The encodings are mostly single-byte, but feature a number of two-byte combinations:
- The nukta diacritic (़) normally adds a dot to the preceding character, but is also used as a generic code extension mechanism for rarer characters (e.g., Devanagari ॐ is encoded as ँ with nukta).
- A letter preceded by an ATR (attribute) character (0xEF) switches to a different script or encodes formatting information.
- The EXT (extension) code (0xF0) changes the meaning of the following character, for instance to encode Vedic characters as specified in Annex G (for Devanagari only).
- A virama (halant) may be doubled to indicate an explicit halant, or followed by a nukta to indicate a soft halant.
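The two-byte combinations listed above can be sketched as a small tokenizer that groups raw bytes into one- and two-byte units. This is only an illustration, not how any browser implements it: the helper name `split_iscii_units` is hypothetical, the ATR and EXT values come from the list above, and the halant (0xE8) and nukta (0xE9) positions are assumptions taken from the ISCII-91 code chart.

```python
from typing import List

HALANT = 0xE8  # assumed position of the virama/halant in the ISCII-91 chart
NUKTA = 0xE9   # assumed position of the nukta in the ISCII-91 chart
ATR = 0xEF     # attribute prefix, as given in the text
EXT = 0xF0     # extension prefix, as given in the text


def split_iscii_units(data: bytes) -> List[bytes]:
    """Split raw ISCII bytes into one- and two-byte units (hypothetical helper)."""
    units = []
    i = 0
    while i < len(data):
        current = data[i]
        following = data[i + 1] if i + 1 < len(data) else None
        if following is None:
            width = 1  # last byte: nothing left to combine with
        elif current in (ATR, EXT):
            width = 2  # prefix byte plus the byte it modifies
        elif current == HALANT and following in (HALANT, NUKTA):
            width = 2  # explicit halant or soft halant
        elif current > 0xA0 and current != NUKTA and following == NUKTA:
            width = 2  # letter or sign followed by nukta
        else:
            width = 1  # plain single-byte character
        units.append(data[i:i + width])
        i += width
    return units
```

Turning each unit into Unicode would additionally need a per-script mapping table, which is left out of this sketch.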
IE has an implementation of ISCII Devanagari (PDF including the two Vedic characters present in the IE implementation) under its own MIME label. The invisible consonant at 0xD9 is missing (mapped to a question mark), and formatting instructions are ignored, but script switching is supported (this applies to Punjabi and Gujarati as well).
The MIME charset label x-iscii-pa selects IE’s implementation of ISCII Punjabi (PDF including an undocumented nukta combination for ੜ). The new letter ਲ਼ (precomposed ਲ with nukta), not mentioned in the ISCII specification, is included. The separator । is replaced by an ASCII full stop. Undefined bytes are filled with characters from adjacent positions.
ISCII Gujarati (PDF) is linked to the MIME label x-iscii-gu.
Internet Explorer appears to support the remaining seven Brahmic scripts in ISCII as well, viz. Oriya, Bengali, Assamese, Telugu, Kannada, Malayalam and Tamil.
Apple has defined encodings for three major Brahmic scripts. The Macintosh encodings are closely related to the corresponding ISCII encodings. Some characters are added to columns 8 and 9 (undefined in ISCII), a number of trivial nukta combinations are excluded (i.e., they are decoded not as precomposed characters but instead as base letters followed by nukta), and no ATR or EXT sequences are defined.
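The difference between a precomposed nukta letter and a base letter followed by nukta can be checked with Unicode normalization. The sketch below uses Python's standard `unicodedata` module; the helper name `is_canonically_equivalent` is hypothetical, and the two sample letters (Devanagari U+0958 and Gurmukhi U+0A36) were chosen only as illustrations of letters that have a canonical base-plus-nukta decomposition.

```python
import unicodedata


def is_canonically_equivalent(a: str, b: str) -> bool:
    """Compare two strings after canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Devanagari: precomposed QA versus KA followed by the nukta sign.
devanagari_precomposed = "\u0958"
devanagari_decomposed = "\u0915\u093c"

# Gurmukhi: precomposed SHA versus SA followed by the nukta sign.
gurmukhi_precomposed = "\u0a36"
gurmukhi_decomposed = "\u0a38\u0a3c"

print(is_canonically_equivalent(devanagari_precomposed, devanagari_decomposed))
print(is_canonically_equivalent(gurmukhi_precomposed, gurmukhi_decomposed))
```

For letters like these, a decoder that emits the base letter plus nukta produces text that is canonically equivalent to the precomposed form, so the choice mainly matters for byte-for-byte comparisons and round-tripping.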
Macintosh Devanagari (PDF) has the MIME label x-mac-devanagari and enjoys support in Safari (which appears to have a problem with virama+virama as well as a number of nonsensical combinations with virama or nukta) and Firefox (which does not implement special Unicode mappings for any of the nukta combinations, but nevertheless renders them correctly). Firefox still substitutes U+FFFD for the delete character (0x7F) in this encoding.
Macintosh Gurmukhi (PDF) is denoted by the MIME charset label x-mac-gurmukhi. In addition to what has already been mentioned for Devanagari, Safari maps ਸ਼ to two characters (ਸ and nukta), whereas Firefox maps it to nothing.
The MIME label x-mac-gujarati is used for Mac Gujarati (PDF). The implementation notes for Devanagari apply here as well.