[BRLTTY] about descchar binding

Samuel Thibault samuel.thibault at ens-lyon.org
Sat Jan 31 07:49:19 EST 2009


Samuel Thibault, on Sat 31 Jan 2009 13:18:47 +0100, wrote:
> Dave Mielke, on Sat 31 Jan 2009 03:02:24 -0500, wrote:
> > [quoted lines by 高生旺 on 2009/01/31 at 15:52 +0800]
> > 
> > >For Chinese, may I add more information to descchar?
> > 
> > Right now we use the description associated with the character as supplied by 
> > the Unicode database. It seems to be rather generic for the ideographic 
> > characters, and, therefore, doesn't seem to be very helpful. For the benefit of 
> > other readers, the Unicode character 9000, for example, has, for its 
> > description, the rather useless phrase "CJK UNIFIED IDEOGRAPH-9000".
> > 
> > Does each of those characters have a specific meaning, or can a character have 
> > one meaning in one language and another meaning in another language?
> > 
> > Do you have a list of what each of those characters actually means?
> 
> That list already exists: there is a more descriptive field in Unicode,
> for instance for U+9000 it says:
> 
> `Definition in English: step back, retreat, withdraw
> Mandarin Pronunciation: TUI4
> Cantonese Pronunciation: teoi3
> Japanese On Pronunciation: TAI TON
> Japanese Kun Pronunciation: SHIRIZOKU SHIRIZOKERU
> Tang Pronunciation: *tuə̀i
> Korean Pronunciation: THOY'

I haven't found how to get them from libicu yet, but from libgucharmap
these are returned by gucharmap_get_unicode_k{Definition, Cantonese,
Mandarin, Tang, Korean, JapaneseKun, JapaneseOn}.
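
As a rough illustration of where such fields come from, here is a minimal
sketch that reads them straight from Unihan-style data, assuming the
tab-separated "U+XXXX<TAB>field<TAB>value" layout of the Unihan database
files. The sample lines reuse the values quoted above, and parse_unihan
is a hypothetical helper name, not part of libicu or libgucharmap.

```python
# Sample data in the Unihan layout: code point, field name, value,
# separated by tabs. Values are the ones quoted in this thread.
SAMPLE = """\
# comment lines start with '#'
U+9000\tkDefinition\tstep back, retreat, withdraw
U+9000\tkMandarin\tTUI4
U+9000\tkCantonese\tteoi3
"""


def parse_unihan(text):
    """Hypothetical helper: map code point -> {field name: value}."""
    table = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        codepoint, field, value = line.split("\t", 2)
        # codepoint looks like "U+9000"; drop the "U+" and parse as hex
        table.setdefault(int(codepoint[2:], 16), {})[field] = value
    return table


fields = parse_unihan(SAMPLE)
print(fields[0x9000]["kDefinition"])
```

The field names (kDefinition, kMandarin, kCantonese) match the suffixes of
the gucharmap functions named above.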

This seems to be quite specific to CJK; I haven't found anything similar
for other scripts. Maybe there could be an additional preference that lets
the user choose which of these fields (the definition, or a given
language's pronunciation) replaces the generic CJK UNIFIED IDEOGRAPH name.
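
To make the preference idea concrete, here is a small sketch of how such a
lookup might behave. It is only an assumption about one possible design:
the UNIHAN table is hard-coded with the values quoted earlier, and
describe and preferred_fields are hypothetical names, not anything that
exists in BRLTTY.

```python
import unicodedata

# Hypothetical per-character data, using the values quoted in this thread.
UNIHAN = {
    0x9000: {
        "kDefinition": "step back, retreat, withdraw",
        "kMandarin": "TUI4",
        "kCantonese": "teoi3",
    },
}


def describe(char, preferred_fields, table=UNIHAN):
    """Return the first preferred field available for char.

    Falls back to the generic Unicode character name when none of the
    preferred fields is known for this character.
    """
    entry = table.get(ord(char), {})
    for field in preferred_fields:
        if field in entry:
            return entry[field]
    return unicodedata.name(char, "U+%04X" % ord(char))


print(describe("\u9000", ["kMandarin", "kDefinition"]))
print(describe("\u9000", ["kKorean"]))
```

With this shape, a user who lists kMandarin first gets the Mandarin
reading, and a character with no matching field still gets its generic
Unicode name.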

Samuel

