[BRLTTY] RTL languages and BRLTTY (was Re: BRLTTY in GRUB)

Dave Mielke dave at mielke.cc
Fri Mar 16 21:29:59 EDT 2012


[quoted lines by Vladimir 'φ-coder/phcoder' Serbinenko on 2012/03/16 at 22:58 +0100]

>The problem is that there are additionally several Unicode bidi
>control characters, which make it impossible to restore the original
>order reliably. If you see an alef to the left of a beth, it might be
>either the Hebrew word ba' (come) or the beginning of the alphabet
>used in some generally left-to-right text with overrides.
>Another example: let's say we have a table with this title line:
>2ELTIT 1ELTIT
>        2         1
>This is how the table would look on screen. But without knowing the
>element boundaries and without control characters you would reverse
>the first line:
>TITLE1 TITLE2
>But then the second line is purely numerical, so it stays the way it is:
>         2         1
>Now our table is obviously broken: the titles no longer line up with
>their columns.
>Ideally we need some way to reliably get the non-reordered version
>while allowing the visual screen to be reordered.
>Another issue with Hebrew is that the braille character depends on
>both the dagesh and the sin dot. Both are usually omitted in normal
>text. There are two types of dagesh: lene and forte. The former
>depends only on the consonant in question and the preceding vowel.
>Unfortunately, vowels aren't written either. But in Hebrew both
>dagesh forte and vowels depend on the grammatical form and not only
>marginally on the root. Most of the roots in Hebrew are three-letter
>ones. So a possible approach is to have a list of forms that imply a
>dagesh forte or lene on one of their letters, with the root
>consonants replaced by a wildcard that can match any Hebrew consonant.
>This doesn't work for loanwords, though, where the choice between
>putting a dagesh or not is usually decided based on the original
>writing or pronunciation.
>The same applies to the two uses of yud: as a vowel or a consonant.
>Shin with a sin dot is another story, it's a complete letter which
>can't be inferred from grammar but is a possible root letter.
>Fortunately it's a rare letter. An example I can come up with offhand
>is "Israel". In Hebrew it's usually written ישראל. Full vocalisation
>would be יִשְׂרָאֵל, where you can clearly see the sin dot. So we need a
>rule for Israel. I see no exception file for Hebrew. Could someone
>render "Israel" from the vocalised version I pasted and add it as a
>rule for the non-vocalised one?
>In Yiddish, which is a Germanic language that uses the Hebrew script,
>grammar doesn't help to disambiguate between those letters;
>vocalisation on ambiguous letters is more common but isn't universal.
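
To make the "rule for Israel" request concrete, here is a minimal Python
sketch of an exception lookup keyed on the unvocalised spelling. It is
only an illustration: the names strip_points, EXCEPTIONS and
restore_points are hypothetical, this is not BRLTTY's table format, and
it assumes the vocalised form is stored with the usual Unicode Hebrew
points (all of which have general category Mn).

```python
import unicodedata

SIN_DOT = "\u05c2"  # HEBREW POINT SIN DOT

def strip_points(word):
    """Drop combining marks (vowel points, dagesh, shin/sin dots)."""
    return "".join(ch for ch in word if unicodedata.category(ch) != "Mn")

# Vocalised "Israel": yod+hiriq, shin+sheva+sin dot, resh+qamats,
# alef+tsere, lamed. Written with escapes to avoid display reordering.
ISRAEL_POINTED = "\u05d9\u05b4\u05e9\u05b0\u05c2\u05e8\u05b8\u05d0\u05b5\u05dc"

# Hypothetical exception list: unvocalised spelling -> vocalised spelling.
EXCEPTIONS = {strip_points(w): w for w in [ISRAEL_POINTED]}

def restore_points(word):
    """Return the vocalised form if the word is a known exception."""
    return EXCEPTIONS.get(word, word)

plain = "\u05d9\u05e9\u05e8\u05d0\u05dc"   # unvocalised "Israel"
pointed = restore_points(plain)
print(SIN_DOT in pointed)                  # the sin dot is available again
```

With the points restored, a braille table that distinguishes shin from
sin has what it needs; words not in the list pass through unchanged.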

My knowledge of the intricacies of Unicode isn't good enough to understand all
of this yet, but, at the very least, you make a good argument for rendering in
original order. That leaves us free to eventually get it right.
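
The table example can be reproduced with a short Python sketch. It is a
deliberately naive model, not the Unicode bidi algorithm: to_visual and
naive_visual_to_logical are hypothetical helpers, and the heuristic
"reverse any screen line that contains right-to-left letters" is an
assumption standing in for whatever a screen reader might try without
access to the original order.

```python
import unicodedata

def has_strong_rtl(text):
    """True if any character has a strong right-to-left bidi class."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)

def to_visual(row):
    """Lay out one table row right to left: the first column ends up on
    the right, and each Hebrew cell is shown in reversed character order."""
    cells = [c[::-1] if has_strong_rtl(c) else c for c in row]
    return " ".join(reversed(cells))

def naive_visual_to_logical(line):
    """Hypothetical heuristic: reverse the whole line if it has RTL text."""
    return line[::-1] if has_strong_rtl(line) else line

# Logical order: two Hebrew titles (alef-bet-gimel, dalet-he-vav), then
# one numeric cell under each title.
logical_rows = [["\u05d0\u05d1\u05d2", "\u05d3\u05d4\u05d5"], ["1", "2"]]

visual = [to_visual(row) for row in logical_rows]
recovered = [naive_visual_to_logical(v).split(" ") for v in visual]

print(recovered[0] == logical_rows[0])  # title row is restored
print(recovered[1] == logical_rows[1])  # numeric row keeps its columns swapped
```

The title row comes back in logical order, but the purely numeric row
is left as displayed, so its columns no longer match the titles. Having
the original order available avoids the guess entirely.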

-- 
Dave Mielke           | 2213 Fox Crescent | The Bible is the very Word of God.
Phone: 1-613-726-0014 | Ottawa, Ontario   | 2011 May 21 is the End of Salvation.
EMail: dave at mielke.cc | Canada  K2A 1H7   | http://Mielke.cc/now.html
http://FamilyRadio.com/                   | http://Mielke.cc/bible/