PackageDescription: X11UnicodeClipboard


X11 Unicode Clipboard

Last published: September 19, 2012 by 'stevek'

Defines 0 Classes
Extends 1 Classes


When using the X clipboard/selection, VMs up to at least 7.9 can only transfer ISO-8859-1 characters successfully. This package fixes reading, so that any Unicode character can be read from the X clipboard. It also partially fixes writing: Unicode characters that cannot be sent as-is are escaped in the standard X format for unsupported characters. These escaped characters look like \u2aff and cannot be pasted directly, but they are easily converted to the real characters, e.g. by pasting them into the third field of this page and pressing Convert: http://unicode.online-toolz.com/tools/text-unicode-entities-convertor.php

Details:
The VM interacts with the X clipboard using XA_STRING, which is ISO-8859-1 encoded, with LF in place of CR. When external applications are asked to put non-ISO-8859-1 characters on an XA_STRING, they represent each one as \u####, where #### is the lower-case hex of the Unicode code point. As no other escaping is done, there is no way to distinguish a genuine '\u1234' character sequence from a converted U+1234 character. We assume every \u#### is an escape and convert it using Regex11 (this could be done without Regex11, albeit more verbosely; feel free to change it).
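
The conversion described above can be sketched as follows. This is an illustration in Python, not the package's Smalltalk/Regex11 code. The function names are hypothetical, and the sketch assumes that an escape is always exactly four lower-case hex digits and that code points above U+FFFF are out of scope, since the description does not say how those are represented.

```python
import re

# Assumed escape shape: backslash, 'u', exactly four lower-case hex digits.
_ESCAPE = re.compile(r"\\u([0-9a-f]{4})")


def unescape_xa_string(raw: bytes) -> str:
    """Decode XA_STRING bytes as read from the clipboard (hypothetical helper)."""
    text = raw.decode("iso-8859-1")
    # Treat every \u#### as an escape, as the package does.
    text = _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
    # XA_STRING carries LF where the image expects CR.
    return text.replace("\n", "\r")


def escape_for_xa_string(text: str) -> bytes:
    """Encode text for writing as XA_STRING (hypothetical helper)."""
    out = []
    for ch in text.replace("\r", "\n"):
        cp = ord(ch)
        if cp <= 0xFF:
            out.append(ch)               # representable in ISO-8859-1
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04x}")   # escaped as lower-case hex
        else:
            # Assumption: the source does not cover this case.
            raise ValueError(f"code point U+{cp:X} is outside this sketch")
    return "".join(out).encode("iso-8859-1")
```

Because a literal backslash is not itself escaped, text that genuinely contains the six characters \u1234 comes back from a write-then-read round trip as the single character U+1234. That is the ambiguity noted above.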