Character Set Tables

The following links lead to character set tables in a uniform format. Each character is included literally, with its code shown in four ways (decimal, row/column, octal, hexadecimal) and its name taken from the corresponding standard if there is one, otherwise its Unicode name, or failing that a short-form name. Each table is an HTML file that announces its character set, so the characters should appear correctly in your Web browser if it supports HTML character-set declarations of the following form:

<META http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

in which the "charset" name is taken from the IANA / MIME registry.
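As a rough illustration of the table layout described above, the following Python sketch builds a small HTML page that carries the charset declaration and lists each code in decimal, position, octal, and hexadecimal. The function names, the exact column layout, and the column-first xx/yy position notation (borrowed from ISO/IEC 2022) are assumptions made for this example; the real tables are generated by C programs and may differ in detail.

```python
import html
import unicodedata


def format_code(code):
    """Return a byte value as decimal, column/row, octal, and hexadecimal."""
    col, row = divmod(code, 16)  # 16 rows per column in an 8-bit code table
    return f"{code:03d}  {col:02d}/{row:02d}  {code:03o}  {code:02X}"


def make_table(charset, first=0xA0, last=0xFF):
    """Build an HTML page listing bytes first..last of a single-byte charset.

    `charset` must be a name that Python's codecs accept; the same name is
    written into the META declaration (assumed to be a valid IANA name).
    """
    lines = []
    for code in range(first, last + 1):
        try:
            ch = bytes([code]).decode(charset)
        except UnicodeDecodeError:
            continue  # position not assigned in this character set
        name = unicodedata.name(ch, "(no Unicode name)")
        # html.escape only matters if the range includes <, >, or &.
        lines.append(f"{format_code(code)}  [{html.escape(ch)}]  {name}")
    head = (f'<META http-equiv="Content-Type" '
            f'content="text/html; charset={charset}">')
    body = "\n".join(lines)
    page = (f"<html><head>\n{head}\n</head><body><pre>\n"
            f"{body}\n</pre></body></html>\n")
    # Encode in the announced character set so each character is literal.
    return page.encode(charset)
```

For instance, `open("latin1.html", "wb").write(make_table("iso-8859-1"))` would write such a page to a (hypothetical) file named latin1.html.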

If the characters do not display correctly in your browser, then either your browser does not understand the declaration, or it does not support that character set, or you don't have an appropriate font. Even so, you can still save the file and use it locally.

If you save a table, you can use it (you might want to keep only the part between <pre> and </pre>) to test character-set aware software. For example, if you save it on a host, then make a terminal connection (ssh, telnet, dialup, whatever) from your desktop computer to the host, you can see if your character-set definitions are working, and/or if you are using an appropriate font.
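Keeping only the part between <pre> and </pre> can be done by hand in an editor, or with a few lines of Python along the following lines. This is only a sketch: the function names are made up for the example, and it assumes the saved file contains a single <pre> section. It works on raw bytes, so the 8-bit characters reach the terminal exactly as they are stored in the file.

```python
import re
import sys


def extract_pre(data):
    """Return the bytes between the first <pre> and the next </pre>.

    Works on raw bytes so that the 8-bit characters are passed through
    untouched, whatever the character set of the saved table.
    """
    match = re.search(rb"<pre[^>]*>(.*?)</pre>", data,
                      re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("no <pre>...</pre> section found")
    return match.group(1).strip(b"\r\n")


def show_table(path):
    """Write the table portion of a saved file to the terminal unchanged."""
    with open(path, "rb") as f:
        table = extract_pre(f.read())
    sys.stdout.buffer.write(table + b"\n")
```

Running `show_table("table.html")` on the host (the file name is hypothetical) over your terminal connection then shows whether each character arrives and displays as expected.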

Note that nonprintable characters such as Soft Hyphen are likely to occupy no space in the display. Even though the brackets appear to be empty, there really is a character between them.
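If you want to convince yourself that the brackets are not empty, a short Python check such as the one below can help. It is a sketch only: the function name is invented, and it assumes each table line holds exactly one character between the first "[" and its closing "]".

```python
import unicodedata


def bracketed_char(line, charset):
    """Decode one table line and describe the character after the first '['.

    Returns the character, its Unicode code point, and its Unicode
    general category (for example 'Cf' for invisible format characters).
    """
    text = line.decode(charset)
    ch = text[text.index("[") + 1]
    return ch, f"U+{ord(ch):04X}", unicodedata.category(ch)
```

For a Latin-1 line containing the byte 0xAD between the brackets, this reports U+00AD with category Cf (a format character), which is why Soft Hyphen may take up no space on screen.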

Just a few tables to begin with. If you need one that's not here, let me know and I'll add it.

Table                               IANA/MIME     Script    Remarks
ISO 8859-1 Latin Alphabet 1         iso-8859-1    Latin     West Europe
ISO 8859-2 Latin Alphabet 2         iso-8859-2    Latin     East Europe
ISO 8859-3 Latin Alphabet 3         iso-8859-3    Latin     West Europe / Turkey
ISO 8859-4 Latin Alphabet 4         iso-8859-4    Latin     North and West Europe
ISO 8859-5 Latin/Cyrillic Alphabet  iso-8859-5    Cyrillic
ISO 8859-6 Latin/Arabic Alphabet    iso-8859-6    Arabic
ISO 8859-7 Latin/Greek Alphabet     iso-8859-7    Greek
ISO 8859-8 Latin/Hebrew Alphabet    iso-8859-8    Hebrew
ISO 8859-15 Latin Alphabet 9        iso-8859-15   Latin     West Europe
PC Code Page 437                    ibm437        Latin     West Europe
PC Code Page 850                    ibm850        Latin     West Europe
PC Code Page 852                    ibm852        Latin     East Europe
PC Code Page 856                    (none)        Cyrillic
PC Code Page 861                    ibm861        Latin     Iceland
PC Code Page 862                    ibm862        Hebrew
PC Code Page 866                    ibm866        Cyrillic
Microsoft Windows Code Page 1250    windows-1250  Latin     East Europe
Microsoft Windows Code Page 1251    windows-1251  Cyrillic
Microsoft Windows Code Page 1252    windows-1252  Latin     West Europe

You can find plain-text (not embedded in HTML) versions of these tables (and many more) in the Kermit FTP archive: ftp://kermit.columbia.edu/kermit/charsets/; transfer them in BINARY mode only. For any pair of files xxx.c and xxx.txt, the first is a C program to generate the table, the second is the table itself.
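If you prefer to script the transfer, Python's standard ftplib can fetch a file in binary mode, roughly as sketched below. The helper names are invented for this example, the file name you pass in is up to you (no particular file in the archive is assumed to exist), and nothing here checks whether the archive is still reachable.

```python
from ftplib import FTP
from urllib.parse import urlsplit


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory)."""
    parts = urlsplit(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError(f"not an FTP URL: {url!r}")
    return parts.hostname, parts.path


def fetch_binary(url, filename, dest):
    """Download `filename` from the directory at `url` in binary mode."""
    host, directory = split_ftp_url(url)
    with FTP(host) as ftp:
        ftp.login()  # anonymous login
        ftp.cwd(directory)
        with open(dest, "wb") as out:
            # retrbinary switches the transfer to TYPE I (image/binary),
            # so no end-of-line or character translation is applied.
            ftp.retrbinary(f"RETR {filename}", out.write)
```

A call would look like `fetch_binary("ftp://kermit.columbia.edu/kermit/charsets/", name, name)`, where `name` is whichever table file you are after.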


Frank da Cruz, The Kermit Project, Columbia University, March 2003