Unicode is a standard for encoding the characters of any language. For example,
the character A is represented by the code 65. But there are many more characters
than just those found in the Roman alphabet. Before Unicode, encodings were
invented for each language, and it was often impossible to determine the
encoding used in a document. Another advantage of Unicode is that it permits text
to be shared between different operating systems and applications. In the past,
Windows used a different method of encoding text than the Mac, and it was often
difficult to write text on one platform and read it on another.
Unlike older encodings, in which each character was represented with eight
bits and could only take a value from 0 to 255, Unicode assigns characters
values up to 0x10FFFF, which are often stored in 32 bits, leaving room for more
than a million characters. In practice, most characters in common use fit in
16 bits.
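The relationship between a character and its numeric code can be checked in a few lines. The following is a minimal Python sketch and is not part of Max; the helper name describe is hypothetical and was chosen only for this illustration.

```python
def describe(ch):
    """Return the Unicode code label and the number of bits its value needs.

    Hypothetical helper for illustration only; it is not a Max function.
    """
    code = ord(ch)  # ord() gives the numeric Unicode value of one character
    return f"U+{code:04X}", code.bit_length()


# The letter A has the code 65 and fits easily in eight bits.
letter_a = describe("A")

# A musical G clef (U+1D11E) needs more than 16 bits.
g_clef = describe("\U0001D11E")
```

Here the letter A needs only seven bits, while the G clef needs seventeen, which is why a fixed 16-bit size is not enough for every character.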
To view or edit a document in Unicode it is necessary to know the format in
which the characters are stored. Max 5 documents are stored as UTF-8. This is a
method of packing each character into a sequence of one to four eight-bit
bytes, which saves space and is often more convenient to implement. If you want
to look at a Max 5 file in
another application for some reason, you may need to tell the application that
the file is encoded in UTF-8, as many applications cannot determine the text
encoding of a document automatically.
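As a rough illustration of how UTF-8 packs characters into bytes, and of telling a program explicitly that a file is UTF-8, here is a minimal Python sketch. The function names and the file example.txt are assumptions made for this example; the file is a temporary stand-in and not a real Max document.

```python
import os
import tempfile


def utf8_length(ch):
    """Number of eight-bit bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))


def read_text_as_utf8(path):
    """Read a file while stating the encoding explicitly instead of guessing."""
    with open(path, encoding="utf-8") as f:
        return f.read()


# Roman letters take one byte; other characters take two, three, or four.
lengths = [utf8_length(ch) for ch in ("A", "é", "日", "\U0001D11E")]

# Write a small UTF-8 file (a stand-in, not a real Max file) and read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "wb") as f:
        f.write("café 日本".encode("utf-8"))
    round_trip = read_text_as_utf8(path)
```

Passing the encoding explicitly plays the same role as choosing UTF-8 in another application's open dialog: the bytes on disk are unchanged, only their interpretation is fixed.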
Max can share Unicode text with other applications via the clipboard. Some
older applications (such as Eudora on the Mac) may not understand Unicode, so
non-Roman characters may not be converted properly. You may notice this when
cutting and pasting text into Max from other applications.
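The kind of damage described above can be imitated by forcing text through an older eight-bit encoding. This is only a sketch under stated assumptions: it uses Python's mac_roman codec as a stand-in for a pre-Unicode application, the helper name through_legacy_app is hypothetical, and it does not reproduce what Max or Eudora actually do with the clipboard.

```python
def through_legacy_app(text, legacy_encoding="mac_roman"):
    """Simulate text passing through an application limited to one old encoding.

    Hypothetical helper: characters the legacy encoding cannot represent
    are replaced with "?" rather than raising an error.
    """
    raw = text.encode(legacy_encoding, errors="replace")
    return raw.decode(legacy_encoding)


survives = through_legacy_app("café")  # every character exists in Mac Roman
damaged = through_legacy_app("日本語")  # none of these characters do
```

Accented Roman characters come through intact, while the Japanese text is reduced to question marks.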
"Converting the text in older Max documents to Unicode" describes how text
was encoded in older Max documents and how you can control the conversion of
extended characters.