Unicode Support in Scribe
Date: 19/6/2002
I've been playing around with a Unicode version of the compose control. So far it seems to be working quite well. What I want to do is make the Unicode control available as a plugin, kinda like the HTML plugin, for the meantime, until I'm sure it's a good replacement for the existing control.
Internally the old control used 8-bit chars and a codepage; the new control uses UTF-16 internally and exports to UTF-8 or UTF-16 depending on the API call you use. This [theoretically] allows for emails written in Unicode to be displayed.
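
To give a rough idea of what the export side involves, here is a minimal sketch of a UTF-16 to UTF-8 conversion. This is not the control's actual code: the function name Utf16ToUtf8 is made up for illustration, it assumes a C++11 compiler with char16_t, and replacing unpaired surrogates with U+FFFD is just one reasonable policy.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of LGI or Scribe): converts a buffer of
// UTF-16 code units into a UTF-8 byte string.
// Assumption: unpaired surrogates are replaced with U+FFFD.
std::string Utf16ToUtf8(const std::u16string &in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); i++)
    {
        std::uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF)
        {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF)
            {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                i++;
            }
            else
            {
                cp = 0xFFFD; // high surrogate with no partner
            }
        }
        else if (cp >= 0xDC00 && cp <= 0xDFFF)
        {
            cp = 0xFFFD; // low surrogate with no preceding high surrogate
        }

        if (cp < 0x80)
        {
            // 1 byte: plain ASCII
            out += static_cast<char>(cp);
        }
        else if (cp < 0x800)
        {
            // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else
        {
            // 4 bytes: anything outside the Basic Multilingual Plane
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The point to notice is that one character can take 1 or 2 units in UTF-16 and 1 to 4 bytes in UTF-8, so the output buffer can't be sized by simply counting input units.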

I'm keen to try my hand at supporting ISO-2022-JP and other similar encodings. Not sure what's required, but I know Unicode support is a really good start.

The problem of representing 2 separate codepages, one from the original email and one from the reply, can also be solved through Unicode. Some people have been vocal about this over the last few weeks, so it will be addressed soon.

Several translators have asked about a Unicode format for the Scribe.lr file, and I agree, it needs to be Unicode. What I think I'll do is add UTF-8 support to LgiRes as a second format for saving the translations. The existing ".lr" format will remain the same, and a new ".lr8" format will be created to contain a UTF-8 version of the resource format. The existing Scribe.lr will then be released in .lr8 format (there's a bit of work to do in converting it), after which you can edit the .lr8 file in LgiRes or in your own UTF-8 editor, safe in the knowledge that you can use whatever Unicode string you like and it will show up in Scribe fine. Well, eventually that is; I'm sure it'll take a while to get everything converted over ;)

This will mean some rather sweeping changes to LGI. Already there is a new API on the GFont object for writing and sizing UTF-16 text. A lot of functions will gain wide versions to cope with Unicode, and I'll have to write a lot of basic string functions in wide versions as well. I hear that a lot of the built-in wide char functions in Windows (like wcslen, wcscpy etc.) are stubbed out incorrectly in Win9x. So in the interest of public safety and cross-platformability I'm going to write my own UTF-16 versions.
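
As a sketch of what those hand-rolled functions might look like, here are UTF-16 equivalents of wcslen and wcscpy. The names StrlenW and StrcpyW are placeholders, not LGI's real API, and the code assumes a C++11 compiler with char16_t (on an older compiler a 16-bit unsigned typedef would do the same job). Tolerating null pointers is a choice made here, not something the C library versions do.

```cpp
#include <cstddef>

// Hypothetical UTF-16 string helpers (placeholder names, not LGI's API).
// Both work on 16-bit code units terminated by a zero unit.

// Returns the number of 16-bit code units before the terminator.
// Note: a character outside the BMP (a surrogate pair) counts as 2.
std::size_t StrlenW(const char16_t *s)
{
    if (!s)
        return 0; // assumption: treat a null pointer as an empty string

    const char16_t *e = s;
    while (*e)
        e++;
    return static_cast<std::size_t>(e - s);
}

// Copies src, including the terminator, into dst and returns dst.
// The caller must make sure dst has room for StrlenW(src) + 1 units.
char16_t *StrcpyW(char16_t *dst, const char16_t *src)
{
    if (!dst || !src)
        return dst; // assumption: do nothing on null input

    char16_t *d = dst;
    while ((*d++ = *src++) != 0)
    {
        // the copy happens in the loop condition
    }
    return dst;
}
```

Because these don't depend on the platform's wchar_t, they behave the same on systems where wchar_t is 16 bits and on systems where it is 32 bits.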

 