Multilingual Messages in Eudora


Requirements: Mac OS 10.2+, Eudora 5.1+, AppleScript 1.9.1+
Although this article uses Chinese text to illustrate the problem of non-ASCII and multilingual text in Eudora, the solutions provided are not specific to CJK text.
NB. If you have Eudora’s default font set to a non-Roman font such as Taipei, Beijing, Geneva CY etc., this should be changed.
If you have tried using Eudora with Mac OS X for sending messages in Chinese or in multilingual text, you will probably have discovered apparently insuperable problems of all kinds and may well have given up the struggle and switched regretfully to another mailer, with all the sacrifices that entails.  In this article I will attempt to analyse the problems and to provide a solution to all of them.  To follow this article in precise detail you need to be using an intelligent browser such as OmniWeb or Safari.  Internet Explorer and others will behave differently, though the problem is the same.
1. No font
眾鳥高飛盡
孤雲去獨閒
相看兩不厭
唯有敬亭山

2. Apple LiGothic
眾鳥高飛盡
孤雲去獨閒
相看兩不厭
唯有敬亭山

3. BiauKai
眾鳥高飛盡
孤雲去獨閒
相看兩不厭
唯有敬亭山

4. Osaka
眾鳥高飛盡
孤雲去獨閒
相看兩不厭
唯有敬亭山

The above little poem 敬亭獨坐 by 李白 will serve to illustrate the problem.  Our first exercise is to try copying the four different representations of the poem into a Eudora message window.  It will help in the understanding of the problem if you have the formatting toolbar visible in composition windows, which you can get by pasting this URI into Eudora and clicking it:
<x-eudora-setting:188=y>.
Now make new messages and paste the four versions into the messages.  The result ought to look something like this:
Pasted into Eudora
Notice that the only versions that have pasted correctly are 2 and 3, those which have a Chinese font specified.  The version with no font pastes as unrepresentable Unicode characters (“?”) and the Osaka version has two characters missing because these are not in the Japanese character set.  The poem is written in utf-8 encoding like this in every instance:

眾鳥高飛盡
孤雲去獨閒
相看兩不厭
唯有敬亭山
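
To make "utf-8 encoding" concrete, here is a minimal sketch that encodes the first line of the poem and prints the resulting bytes. Python is used only for illustration and is not part of the Eudora tools described in this article. Each hanzi in this line becomes three bytes.

```python
# Encode the first line of the poem as utf-8 and inspect the bytes.
line = "眾鳥高飛盡"

utf8_bytes = line.encode("utf-8")

# Every character in this line lies in the range U+0800..U+FFFF,
# so each one is encoded as exactly three bytes.
for ch in line:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} {ch} -> {encoded.hex(' ')}")

print(len(line), "characters,", len(utf8_bytes), "bytes")
```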

When we copy something to the clipboard, the clipboard takes in as much information as it is given about the item copied and holds this in a record.  If there is enough style information on the clipboard to convert the Unicode text to styled international text in legacy character sets, then it is possible to paste useful data into an application like Eudora that does not deal with true Unicode.  A character in a Unicode-intelligent application such as TextEdit, Nisus Express or Safari will be properly displayed without any font information, so you can copy it from any of them and paste it directly into any other; but if you try pasting it into Eudora, Tex-Edit Plus or BBEdit, you will get just “?”.

Eudora can only display characters that exist in the legacy character sets and she does so by converting whatever valid data is received into the first suitable graphic character.  In the case of Traditional Chinese that character is likely to be found in the shift_JIS character set and the encoding of the character will most likely be shift_JIS.  Any sizeable chunk of traditional hanzi displayed in Eudora is likely to be a mixture of characters in shift_JIS and big5 encoding, and a chunk of simplified hanzi is likely to be a mixture of characters in shift_JIS and gb encoding.
This means that it is impossible to reply to a message using any single legacy character set.
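
The "first suitable character set" behaviour can be imitated with a rough sketch. Everything here is an assumption made for illustration: the function name, the order in which the encodings are tried, and the use of Python's `shift_jis` and `big5` codecs, which are close to but not identical with the legacy Mac encodings. It is not Eudora's actual logic.

```python
# Hypothetical model of "first suitable character set": try the legacy
# encodings in a fixed order and keep the first one that can represent
# the character. The order below is an assumption, not Eudora's code.
LEGACY_ORDER = ["shift_jis", "big5"]


def first_suitable_charset(ch, order=LEGACY_ORDER):
    """Return the first codec in `order` that can encode `ch`, or None."""
    for codec in order:
        try:
            ch.encode(codec)
            return codec
        except UnicodeEncodeError:
            continue
    return None


poem = "眾鳥高飛盡孤雲去獨閒相看兩不厭唯有敬亭山"
plan = {ch: first_suitable_charset(ch) for ch in poem}

for ch, codec in plan.items():
    print(ch, codec)

# More than one codec in use means that no single legacy charset label
# describes the whole message.
print(sorted({codec for codec in plan.values() if codec}))
```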
Pasted into Eudora
In the picture above you see a message (apparently in Chinese) that I sent using a special table that specifies "big5" as the character set, and superimposed on it you see what I, and other recipients, would receive.  If I paste the source of the message into Eudora or a classic text editor and change the font to Taipei, I will get utter garbage except for the two characters 眾 and 閒.  However, if I change the font to Osaka, only those two characters appear as garbage.
From: John Delacour <jd@bd8.com>
Subject: Re: A Poem
Content-Type: text/plain; charset="big5" ; format="flowed"

<x-flowed big5>At 10:06 pm +0100 19/10/03, John Delacour wrote:
>  ≤≥íπçÇîÚ·∂
>  å«â_ã釒∂¢
>  ëää≈ô_ïsâ}
>  óBóLåhí‡éR
</x-flowed>
From: John Delacour <jd@bd8.com>
Subject: Re: A Poem
Content-Type: text/plain; charset="big5" ; format="flowed"
At 10:06 pm +0100 19/10/03, John Delacour wrote:
>  イウ鳥高飛盡
>  孤雲去獨カ「
>  相看兩不厭
>  唯有敬亭山 
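
The two displays above are the same bytes read in different ways. The sketch below rebuilds the first quoted line under stated assumptions: the byte values are inferred by reading the garbled characters as Mac Roman, and Python's `big5` and `shift_jis` codecs stand in for the Taipei and Osaka fonts.

```python
# Bytes of the first quoted line, inferred from the two displays above
# (an assumption: the garbled characters are read as Mac Roman).
raw = bytes.fromhex("b2b3 92b9 8d82 94f2 e1b6")

# What a Roman font shows for these bytes.
print(raw.decode("mac_roman"))

# The first two bytes are big5, the remaining eight are shift_JIS.
print(raw[:2].decode("big5") + raw[2:].decode("shift_jis"))

# Decoding everything as big5, as the header claims, cannot give the
# original line; errors="replace" marks the bytes that do not fit.
print(raw.decode("big5", errors="replace"))
```

This is why the message is labelled big5 yet only two of its characters survive when it is read as big5.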
These are just a few examples of the problems we are faced with.  The only certain way of getting 100% big5 text in a Eudora message is either to type it in or to paste valid text copied from a classic text application or an application that does not use Unicode.  As to replying to messages received, there is effectively no way using legacy character sets.
So what is the solution?  The solution is to use Unicode and to send messages with the charset set to “utf-8”.  In order to do this we need a number of special tools.  The key is the clipboard, which contains all the information needed to create a utf-8 string from the text copied.  Whereas Eudora can make no sense of Unicode (utf-16) text, the utf-8 transformation of Unicode into 8-bit characters presents no problem.
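
The difference between the two forms is easy to see in a short sketch (Python, for illustration only; the sample string is my own). In utf-16 even plain ASCII letters take two bytes, one of them zero, whereas utf-8 leaves ASCII untouched and represents everything else with bytes of 0x80 and above, which an 8-bit application can carry without understanding them.

```python
text = "Li Bai 李白"

utf16 = text.encode("utf-16-be")
utf8 = text.encode("utf-8")

# utf-16 uses two bytes even for ASCII letters, so zero bytes appear.
print(utf16.hex(" "))

# utf-8 leaves ASCII untouched and uses bytes >= 0x80 for the hanzi.
print(utf8.hex(" "))

# Does either form contain a zero byte?
print(b"\x00" in utf16, b"\x00" in utf8)
```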
In the download indicated below you will find a file named utf-8 table [8808] which you must put in the folder that contains your Eudora Settings file (normally ~/Documents/Eudora Folder/).  The script “Paste clip -> utf8” belongs in Eudora’s Scripts folder.  If you are not sure where that is, select “Open Scripts Folder” from the Scripts menu.
Download Eudora utf-8 Tools
When you have put these files in the proper places, quit Eudora and relaunch.  You will then find the script in your Scripts menu and the transliteration table in the Message::Change::Transliteration sub-menu.  You will not normally have to choose this table manually since the script, besides converting the text of the message, also sets the encoding for sending.
You are now ready to begin.  Create a new message and type or paste from any source some styled Chinese, Japanese or Russian text into the message.  If you like you can mix languages.  So long as the text is readable in the Eudora message window there will be no difficulty.  Now select the body of the message and type command-c to copy it.  Next select “Paste clip->utf8” from the Scripts menu and watch what happens.
The picture below shows on the left a toolbar button linked to the script in the Scripts menu.  How to do this will be described below.  The Chinese text in the message is selected and copied to the clipboard.  When the button is pressed (or the script run from the menu) this text is converted to what looks like styled garbage.  Don't panic! This is how it should be.  The text that you copied has been converted to Unicode utf-8.  You can now, if you like, remove the styling from the message by selecting all the text and typing command-option-t.  It will then look like the text on the right.
From now on you must not add to or change the text of the message (unless you understand what you are doing) but you can add or remove styles, colours, quote bars etc. if you like.  Just don't add anything or remove anything from the encoded text itself.
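
Why the encoded text looks like garbage, and why it must not be edited, can be shown with a small sketch. It assumes that the message window shows each utf-8 byte as one Mac Roman character; the helper name `recover` is hypothetical, and this is not the script from the download.

```python
original = "眾鳥高飛盡"

# utf-8 bytes shown one character per byte (Mac Roman assumed here).
garbled = original.encode("utf-8").decode("mac_roman")
print(garbled)


def recover(shown):
    """Reverse the display step and decode the bytes as utf-8."""
    return shown.encode("mac_roman").decode("utf-8")


print(recover(garbled))

# Deleting a single character from the encoded text leaves an
# incomplete utf-8 sequence, and strict decoding then fails.
try:
    recover(garbled[1:])
    damaged_ok = True
except UnicodeDecodeError:
    damaged_ok = False
print(damaged_ok)
```

The garbage is therefore fully reversible as long as every byte is left in place.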
Three messages
When you ran the script the “transliteration table” of the message was also changed to add the proper information in the headers that will be sent.  This is most important: otherwise your recipients would get just a string of garbage.
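
The header in question is the charset parameter of Content-Type. As a sketch of what a correctly labelled utf-8 message carries, the following uses Python's standard `email` package; it illustrates the headers only and has nothing to do with how Eudora or the transliteration table produces them.

```python
from email.message import EmailMessage

body = "眾鳥高飛盡\n孤雲去獨閒\n相看兩不厭\n唯有敬亭山"

msg = EmailMessage()
msg["Subject"] = "A Poem"
# Declaring the charset is what lets the recipient's mailer decode
# the body; without it the bytes would be shown as garbage.
msg.set_content(body, charset="utf-8")

print(msg["Content-Type"])
print(msg.get_content_charset())
```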
You can now send the message to yourself and see what the result is like.

To sum up, here are the stages in the process:
  1. Copy some readable text from any source.  This might be a Classic application, a Cocoa application such as TextEdit, the Eudora message itself or any source which is displaying the text in the proper characters.
  2. With your outgoing message frontmost, run the script “Paste clip->utf8” from a toolbar button or from the Scripts menu to insert the contents of the clipboard, encoded in utf-8, into the message.
  3. Add or remove any styling etc. as required, taking care to make no modifications to the text itself.
  4. Send the message, with or without styles, with a Bcc. to yourself so that you can tell whether you've made any mistakes.
The references to styles in steps 3 and 4 of the list above are best disregarded, since you are likely to get problems with styled multilingual messages unless you know exactly how Eudora behaves.  I therefore do not advise you to send styled multilingual messages until Eudora is able to deal with them properly in all cases.
Further details will be added to this page in due course.  Please let me know of any problems you encounter.