ISO-8859 encodings

ISO 8859 defines a set of European 96-character graphic character sets intended to be allocated to positions 160..255 in an 8-bit encoding vector.

The remaining positions 0..159 are usually mapped to the corresponding Unicode characters U+0..U+9F (which the tests below assume). The alternative practice of mapping the control characters 127..159 to U+FFFD instead (indicated by lavender numbers in the test results below) has been largely abandoned following a recent bug fix in Firefox.
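The usual mapping of positions 0..159 can be checked against any decoder. The following is a minimal sketch in Python, assuming its built-in codecs (which are not among the browsers tested here) follow the same convention; the function name is made up for illustration.

```python
def control_range_is_identity(encoding: str) -> bool:
    """Return True if every byte 0..159 decodes to the code point of the same value."""
    for byte in range(160):
        # Decode one byte at a time so each position is checked on its own.
        char = bytes([byte]).decode(encoding)
        if ord(char) != byte:
            return False
    return True


print(control_range_is_identity("iso-8859-1"))
```

A decoder that follows the U+FFFD practice instead would fail this check at the first remapped control position.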

Many Windows encodings are loosely based on ISO 8859 encodings. Some are true supersets (ignoring the control characters in the range U+80..U+9F); in those cases most browsers decode documents labelled as ISO 8859/x according to the corresponding Windows y instead.
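Whether a Windows encoding agrees with an ISO 8859 part outside the control range can be tested by comparing positions 160..255 only. This is a sketch using Python's codec tables as a stand-in for a browser's; the helper names are hypothetical, and it only checks that the two graphic sets coincide, which is the part of the superset claim that matters when positions 128..159 are ignored.

```python
def high_half(encoding: str) -> str:
    """Decode bytes 0xA0..0xFF; unassigned positions become U+FFFD."""
    return bytes(range(0xA0, 0x100)).decode(encoding, errors="replace")


def agrees_outside_c1(windows: str, iso: str) -> bool:
    # Positions 128..159 are deliberately left out of the comparison.
    return high_half(windows) == high_half(iso)


for windows, iso in [("cp1252", "iso-8859-1"),
                     ("cp1254", "iso-8859-9"),
                     ("cp1250", "iso-8859-2")]:
    print(windows, iso, agrees_outside_c1(windows, iso))
```

In Python's tables the first two pairs agree and the third does not, which fits the sections below, where only some parts are reported as being decoded as a Windows encoding.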

ISO 8859/1: Latin alphabet No. 1

iso-8859-1: 227 (1 + 27); 225 (3 + 27); 227 (1 + 27); 223 (27 + 5)
PDF. Refs: ECMA-94 ISO-IR 100

All four browsers decode ISO 8859/1 as its superset Windows 1252.
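The practical effect of decoding ISO 8859/1 as Windows 1252 is confined to positions 128..159. A small sketch, assuming Python's `cp1252` table, lists how many of those positions carry graphic characters and which ones the table leaves unassigned (other decoders may handle the unassigned positions differently).

```python
c1_range = range(0x80, 0xA0)

# Decode each position separately; unassigned ones come back as U+FFFD.
decoded = [bytes([b]).decode("cp1252", errors="replace") for b in c1_range]

graphic = [c for c in decoded if c != "\ufffd"]
unassigned = [hex(b) for b, c in zip(c1_range, decoded) if c == "\ufffd"]

print(len(graphic), "graphic characters, e.g.", graphic[0])
print("unassigned in this table:", unassigned)
```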

ISO 8859/2: Latin alphabet No. 2

iso-8859-2: 254 (1); 252 (3); 221 (1 + 33); 255
PDF. Refs: ECMA-94 ISO-IR 101

ISO 8859/3: Latin alphabet No. 3

iso-8859-3: 254 (1); 245 (3 + 7); 221 (1 + 33); 255
PDF. Refs: ECMA-94 ISO-IR 109

ISO 8859/4: Latin alphabet No. 4

iso-8859-4: 254 (1); 252 (3); 221 (1 + 33); 255
PDF. Refs: ECMA-94 ISO-IR 110

ISO 8859/5: Latin/Cyrillic alphabet

iso-8859-5: 254 (1); 252 (3); 221 (1 + 33); 255
PDF. Refs: ECMA-113 ISO-IR 144

ISO 8859/6: Latin/Arabic alphabet

iso-8859-6: 254 (1); 207 (3 + 45); 221 (1 + 33); 255
asmo-708: 254 (1); 180 (2 + 67 + 6); 221 (1 + 33); 255
PDF. Refs: ECMA-114 ASMO-708 ISO-IR 127

Most browsers take asmo-708 to mean this encoding, apart from Internet Explorer, which maps it to DOS code page 708 instead.
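How a label such as asmo-708 is resolved depends entirely on the decoder's alias table. As an illustration outside the browsers tested, this sketch asks Python's codec registry what it does with the label; it assumes the registry lists `asmo_708` as an alias, which it does in current CPython.

```python
import codecs

# Look up both labels and compare the codec each one resolves to.
asmo = codecs.lookup("asmo-708")
iso = codecs.lookup("iso-8859-6")
print(asmo.name, iso.name)

# Two Arabic letters (alef, lam) at their ISO 8859/6 positions.
sample = bytes([0xC7, 0xE4])
print(sample.decode("asmo-708") == sample.decode("iso-8859-6"))
```

A decoder that mapped the label to DOS code page 708 instead would resolve the two labels to different codecs.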

ISO 8859/7: Latin/Greek alphabet

iso-8859-7: 254 (1); 244 (3 + 5 + 3); 222 (1 + 32); 255
PDF. Ref.: ISO-IR 227
Subset (– ‘€’ ‘₯’ ‘ͺ’): ELOT-128 ECMA-118 ISO-IR 126

Internet Explorer substitutes U+2BD ‘modifier letter reversed comma’ and U+2BC ‘modifier letter apostrophe’ for the visually similar U+2018 ‘left single quotation mark’ and U+2019 ‘right single quotation mark’. Furthermore, Internet Explorer uses an earlier version of ISO 8859/7 that does not include the euro sign, the drachma sign and the iota subscript.
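The positions involved can be inspected directly. The sketch below assumes Python's `iso-8859-7` codec reflects the current version of the standard; the byte positions come from that codec's table, and `normalise_greek_quotes` is a hypothetical helper for text that was decoded with the substituted modifier letters.

```python
import unicodedata

# Quotation marks, then the three characters missing from the earlier version.
for byte in (0xA1, 0xA2, 0xA4, 0xA5, 0xAA):
    char = bytes([byte]).decode("iso-8859-7")
    print(f"0x{byte:02X} -> U+{ord(char):04X} {unicodedata.name(char)}")

# Map the modifier letters back to the quotation marks they stand in for.
QUOTE_FIX = str.maketrans({"\u02bd": "\u2018", "\u02bc": "\u2019"})


def normalise_greek_quotes(text: str) -> str:
    return text.translate(QUOTE_FIX)
```

Note that the helper would also rewrite modifier letters that were intended as such, so it is only appropriate for text known to come from such a decoder.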

ISO 8859/8: Latin/Hebrew alphabet

iso-8859-8: 254 (1); 213 (3 + 3 + 36); 222 (1 + 32); 255
PDF. Refs: SI 1311 ECMA-121 ISO-IR 198
Superset (+ ‘€’ ‘₪’ LRO RLO PDF LRE RLE): SI 1311:2002 ISO-IR 234
Subset (– LRM RLM): ISO-IR 138
Minimal subset: ISO-IR 164

Internet Explorer incorrectly substitutes U+203E ‘overline’ for the visually similar U+AF ‘macron’. Furthermore, Internet Explorer uses an earlier version of ISO 8859/8 that does not include the directionality characters U+200E ‘left-to-right mark’ and U+200F ‘right-to-left mark’.
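A decoder can be checked for both issues by looking at three positions. This sketch assumes Python's `iso-8859-8` codec as the reference table; the byte positions shown are taken from that table.

```python
import unicodedata

# 0xAF is the macron; 0xFD and 0xFE are the two directionality marks.
for byte in (0xAF, 0xFD, 0xFE):
    char = bytes([byte]).decode("iso-8859-8")
    print(f"0x{byte:02X} -> U+{ord(char):04X} {unicodedata.name(char)}")
```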

ISO 8859/9: Latin alphabet No. 5

iso-8859-9: 229 (1 + 25); 227 (3 + 25); 221 (1 + 33); 255
PDF. Refs: ECMA-128 ISO-IR 148

Some browsers decode ISO 8859/9 as its superset Windows 1254.

ISO 8859/10: Latin alphabet No. 6

iso-8859-10: 254 (1); 254 (1); 255
PDF. Refs: ECMA-144 ISO-IR 157

Internet Explorer does not recognise this encoding at all.

ISO 8859/11: Latin/Thai alphabet

iso-8859-11: 237 (1 + 9 + 8); 235 (3 + 9 + 8); 222 (1 + 9 + 23); 223 (9 + 23)
PDF. Ref.: ISO-IR 166
Subset (– NBSP): TIS-620

All four browsers decode ISO 8859/11 as its superset Windows 874.
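The relationship between TIS-620, ISO 8859/11 and Windows 874 can be sketched with Python's codec tables, which are assumed here to match the standards: the three should agree on positions 161..255 and differ in whether position 160 (NBSP) is assigned. The helper names are made up for illustration.

```python
def thai_range(encoding: str) -> str:
    """Decode bytes 0xA1..0xFF; unassigned positions become U+FFFD."""
    return bytes(range(0xA1, 0x100)).decode(encoding, errors="replace")


def has_nbsp(encoding: str) -> bool:
    """Return True if byte 0xA0 decodes to U+00A0 in this encoding."""
    try:
        return b"\xa0".decode(encoding) == "\u00a0"
    except UnicodeDecodeError:
        return False


labels = ("tis-620", "iso-8859-11", "cp874")
same_thai = len({thai_range(label) for label in labels}) == 1
print("Thai range identical:", same_thai)
for label in labels:
    print(label, "assigns NBSP:", has_nbsp(label))
```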

ISO 8859/13: Latin alphabet No. 7 (Baltic Rim)

iso-8859-13: 254 (1); 252 (3); 221 (1 + 33); 255
PDF. Ref.: ISO-IR 179

ISO 8859/14: Latin alphabet No. 8 (Celtic)

iso-8859-14: 254 (1); 221 (1 + 33); 255
PDF. Ref.: ISO-IR 199

Internet Explorer does not recognise this encoding at all.

ISO 8859/15: Latin alphabet No. 9

iso-8859-15: 254 (1); 252 (3); 221 (1 + 33); 255
PDF. Ref.: ISO-IR 203

ISO 8859/16: Latin alphabet No. 10

iso-8859-16: 254 (1); 254 (1); 255
PDF. Refs: SR 14111 (SR 13411??) ISO-IR 226

Internet Explorer does not recognise this encoding at all.

