Unicode and Max

Unicode Support in Max

Unicode is a standard for encoding characters in any language. For example, the character A is represented by the code 65. But there are many more characters than just those found in the Roman alphabet. Before Unicode, separate encodings were invented for each language, and it was often impossible to determine which encoding a document used. Another advantage of Unicode is that it permits text to be shared between different operating systems and applications. In the past, Windows used a different method of encoding text than the Mac, and it was often difficult to write text on one platform and read it on another.

As opposed to older encodings, in which each character was represented with eight bits and could range from 0 to 255, Unicode permits characters to be stored in up to 32 bits, offering room for more than a million characters (the largest valid code is 0x10FFFF). In practice, most characters in common use have codes that fit in 16 bits.
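
The relationship between characters and their numeric codes can be explored outside of Max. The following is a minimal sketch in Python, not part of Max itself; the helper name describe and the sample characters are chosen only for illustration. It reports the code of each character and the number of bits needed to hold that code.

```python
# Illustrative only: Python is used here to inspect Unicode codes.
# The helper name "describe" is hypothetical, not a Max or Unicode API.

def describe(ch):
    """Return (code, bits needed to store that code) for one character."""
    code = ord(ch)  # ord() gives the Unicode code of a single character
    return code, code.bit_length()

samples = ["A", "\u00e9", "\u65e5", "\U0001F600"]  # A, e-acute, a CJK character, an emoji

for ch in samples:
    code, bits = describe(ch)
    print(f"U+{code:04X} (decimal {code}) needs {bits} bits")
```

The first three samples fit in 16 bits or fewer, while the emoji does not, which is why a 16-bit unit is enough for most text but not for all of it.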

To view or edit a document in Unicode it is necessary to know the format in which the characters are stored. Max 5 documents are stored as UTF-8. This is a method of packing each character into a sequence of one to four eight-bit bytes, which saves space for mostly Roman text and is often more convenient to implement. If you want to look at a Max 5 file in another application, you may need to tell the application that the file is encoded in UTF-8, as many applications cannot determine the text encoding of a document automatically.
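
As a rough illustration of why the encoding must be stated, the following Python sketch writes a short string as UTF-8 and reads it back twice: once as UTF-8 and once as Latin-1, an older eight-bit encoding. The file name example.txt and the helper utf8_length are assumptions made for this example; the file stands in for any UTF-8 text file and is not an actual Max document.

```python
import os
import tempfile

def utf8_length(ch):
    """Return how many bytes UTF-8 uses to store one character (hypothetical helper)."""
    return len(ch.encode("utf-8"))

# Characters further from the Roman alphabet take more bytes in UTF-8.
for ch in ["A", "\u00e9", "\u65e5", "\U0001F600"]:
    print(f"U+{ord(ch):04X} takes {utf8_length(ch)} byte(s) in UTF-8")

text = "A \u00e9 \u65e5"

# Write the text as UTF-8 to a temporary file (example.txt is a stand-in name).
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Reading with the correct encoding recovers the original text.
with open(path, "r", encoding="utf-8") as f:
    correct = f.read()

# Reading the same bytes as Latin-1 treats every byte as one character,
# so each multi-byte character turns into several unrelated ones.
with open(path, "r", encoding="latin-1") as f:
    garbled = f.read()

print(correct == text)   # the UTF-8 read matches what was written
print(garbled == text)   # the Latin-1 read does not
```

An application that guesses the wrong encoding behaves like the second read: Roman characters survive, but everything else is displayed incorrectly.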

Max can share Unicode text with other applications via the clipboard. Some older applications (such as Eudora on the Mac) may not understand Unicode, so non-Roman characters may not be converted properly. You may notice this when cutting and pasting text into Max from such applications.

Converting the text in older Max documents to Unicode describes how text was encoded in older Max documents and how you can control converting extended characters.