Basic Unicode readiness testing for your application

Unicode is a complex, constantly evolving standard, but that is no reason to skip the basic testing that can uncover hidden bugs.

Here is a small Unicode string that you can use to test how ready your application is to deal with Unicode text. Use it to test the following areas:

  • filename: save and load files using this string as part of the filename. Also try a long path of more than 260 characters, to find problems caused by older Windows APIs that enforce this path length limit.
  • text input: paste the string and see whether the application displays it wrongly.
    • input cursor: first check how the cursor moves through the string in Notepad on Windows 7, then check that your application behaves the same way. If you see strange cursor movements or characters being decomposed, you are doing something wrong.
    • selection: as above, check Notepad first and then check that your application selects text the same way.
  • rendering: if your application renders text anywhere, use the string to check that it renders well.
    • text size: are the CJK characters too small to be recognized?
    • bad rendering: an empty rectangle usually indicates a missing glyph (the required font is not installed). This is not very dangerous, since nobody has all the fonts, but if you see question marks or other strange characters you may have a real problem. It is best if your application supports font fallback when it displays text, so that the missing glyph sign does not appear.
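
The filename checks from the list above can be automated. Here is a minimal sketch in Python, assuming a temporary directory is an acceptable place to test. SAMPLE is a hypothetical stand-in (it is not the test string from this article), and the function names are made up for illustration. Names are compared after NFC normalization because some file systems change the normalization form of a filename.

```python
import os
import tempfile
import unicodedata

# Hypothetical sample, NOT the test string from this article: an accented
# Latin letter, two CJK characters and one character outside the BMP.
SAMPLE = "test-\u00e9-\u4e2d\u6587-\U0001F600"


def _nfc(text):
    """Normalize to NFC so names can be compared across file systems."""
    return unicodedata.normalize("NFC", text)


def filename_round_trip(fragment):
    """Save and load a file whose name contains `fragment`.

    Returns True if the file can be written, shows up in the directory
    listing under the same (normalized) name, and can be read back.
    """
    name = "unicode_" + fragment + ".txt"
    try:
        with tempfile.TemporaryDirectory() as folder:
            path = os.path.join(folder, name)
            with open(path, "w", encoding="utf-8") as handle:
                handle.write("ok")
            listed = [_nfc(entry) for entry in os.listdir(folder)]
            with open(path, encoding="utf-8") as handle:
                content = handle.read()
            return _nfc(name) in listed and content == "ok"
    except (OSError, UnicodeError):
        return False


def long_path_round_trip(fragment, min_length=260):
    """Same idea, but inside nested folders whose path is at least
    `min_length` characters long."""
    try:
        with tempfile.TemporaryDirectory() as folder:
            path = folder
            while len(path) < min_length:
                path = os.path.join(path, "dir_" + fragment)
            os.makedirs(path)
            target = os.path.join(path, "file.txt")
            with open(target, "w", encoding="utf-8") as handle:
                handle.write("ok")
            with open(target, encoding="utf-8") as handle:
                return handle.read() == "ok"
    except (OSError, UnicodeError):
        return False


if __name__ == "__main__":
    print("filename:", filename_round_trip(SAMPLE))
    print("long path:", long_path_round_trip(SAMPLE))
```

A False result only tells you that something in the chain (your code, the runtime or the file system) refused the name; on Windows the long path check may also depend on how the system is configured, so treat this as a starting point and not as a verdict.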

I will post a text file encoded as UTF-8 (with BOM) that contains the test string, because WordPress cuts the article off at the first character outside the Unicode BMP.
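
If you want to build such a file yourself, or find out which characters in a string fall outside the BMP, something like the following Python sketch should do. SAMPLE is again a hypothetical stand-in for the real test string and the function names are my own; the "utf-8-sig" codec is the standard Python codec that writes the BOM when encoding and strips it when decoding.

```python
import codecs

# Hypothetical sample, NOT the test string from this article.
SAMPLE = "A\u00e9\u4e2d\U0001F600"


def non_bmp_characters(text):
    """Return the characters whose code point is above U+FFFF."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


def encode_with_bom(text):
    """Encode text as UTF-8 with a leading byte order mark."""
    return text.encode("utf-8-sig")


def write_test_file(path, text):
    """Write text to `path` as UTF-8 with BOM."""
    with open(path, "w", encoding="utf-8-sig") as handle:
        handle.write(text)


if __name__ == "__main__":
    data = encode_with_bom(SAMPLE)
    print("starts with BOM:", data.startswith(codecs.BOM_UTF8))
    print("outside BMP:", [hex(ord(ch)) for ch in non_bmp_characters(SAMPLE)])
```

Characters outside the BMP are the ones that need two code units in UTF-16, which is why they are a good way to find code that assumes one code unit per character.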

Let me know if this helped you, and whether you know of additional tests that I could include in this basic test.
