[BRLTTY] The unknown-character sign (26)

Mario Lang mlang at delysid.org
Mon Nov 24 14:27:36 EST 2008


Dave Mielke <dave at mielke.cc> writes:

>>I think in 6-dot uncontracted text-table world, the question mark

Er, I meant to type "8-dot uncontracted text-table world", of course.

>>approach is OK, because in that world the user assumes every
>>char on the screen consumes one cell.  In contracted braille however,
>>the user knows what they read does not correspond to
>>what's on screen in terms of character width.  That, and the
>>fact that contracted braille overloads so many symbols, allows/requires
>>us to break with the question mark idiom at least for some languages.
>
> If we assume that most contraction tables use six dots then we should be
> okay with the default unknown character being all eight.

As explained above, I am not so much concerned with changing the current
representation in text-tables.  What I am looking for is a way for 6-dot
contracted braille tables to override the default for invalid characters.

> What do you think about presenting an unknown character as its escape 
> sequence?

I think we are looking at two different types of unknown characters, aren't we?
There are those that have no mapping in either the contraction or the
text-table, and there are those which we fail to map back to a proper Unicode
code point because the character wasn't in the font currently being used.
To come back to your question, I think it only makes sense to
represent an unknown character as its escape sequence if we actually know
what character it was, and only while 6-dot contracted mode is active,
because only in 6-dot contracted braille will the user assume that the
character width can differ from the number of cells used.  So yes, maybe
it would be useful to
have contracted braille print unknown characters with an escape sequence,
especially given this comment at the end of BrailleTables/en-us-g2.ctb:
# Windows mail programs use all sorts of weird characters for apostrophes.
# I would prefer to have all characters above 126 just show up as a
# backslash and two hex digits.
Maybe extend the wish from "above 126" (which comes from pre-Unicode days)
to "undefined in either the contraction or the text-table"?
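
A minimal sketch of that extended rule, in Python rather than BRLTTY's C.
Everything here is an assumption for illustration: the function name, the
use of plain dicts as stand-ins for the two tables, and the exact escape
format (the table comment only asks for "a backslash and two hex digits";
the wider forms for larger code points are my guess at a natural extension).

```python
def escape_unknown(ch, contraction_table, text_table):
    """Return a backslash escape for ch, or None if either table defines it.

    Both tables are hypothetical dicts keyed by character; this is not
    BRLTTY's real data structure.
    """
    if ch in contraction_table or ch in text_table:
        return None  # defined somewhere, so no escape sequence is needed

    code = ord(ch)
    if code <= 0xFF:
        return "\\x%02X" % code   # two hex digits, as in the table comment
    if code <= 0xFFFF:
        return "\\u%04X" % code   # assumed form for the rest of the BMP
    return "\\U%08X" % code       # assumed form for everything above it


if __name__ == "__main__":
    # A Windows-style curly apostrophe that neither table defines.
    print(escape_unknown("\u2019", {"a": "1"}, {"A": "17"}))
```

This only covers the case where we know which character it was; it says
nothing about characters that could not be mapped back from the font.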

However, this still leaves us with the other unknown character
case, the one where the font couldn't be used to map back to the real
character value.  In that case, I still think the contraction table
author should be able to define what should be printed, instead of relying on
what the text-table ultimately defines.

> What do you think about eliminating the text table as a fallback and
> insisting that the contraction table define all of its characters?

I am not sure this is useful, especially since a user might *want* to mix
different text/contraction tables.  Also, if a contraction table
really only uses 6-dot output, it does seem to make sense to be able to
fall back to the characters defined in a text-table that use dot7 or dot8.
This way, excessive duplication of character definitions can be avoided.
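
The lookup order I have in mind could be sketched roughly like this (again
Python, not BRLTTY code).  The dict layout, the "unknown" key, and the
function name are all hypothetical; cells are written as strings of dot
numbers, so "26" is dots 2 and 6 and "12345678" is all eight dots.

```python
DEFAULT_UNKNOWN = "12345678"  # assumed last-resort sign: all eight dots


def lookup_cells(ch, contraction, text):
    """Resolve ch to a dot pattern using contraction table, then text table.

    Each table is a hypothetical dict with a "chars" mapping and an
    optional "unknown" entry giving that table's unknown-character sign.
    """
    # 1. The contraction table's own definition wins.
    if ch in contraction["chars"]:
        return contraction["chars"][ch]
    # 2. Otherwise fall back to the text table (which may use dot7/dot8).
    if ch in text["chars"]:
        return text["chars"][ch]
    # 3. Unknown in both: let the contraction table override the sign,
    #    and only then use whatever the text table (or the default) says.
    return contraction.get("unknown") or text.get("unknown", DEFAULT_UNKNOWN)


if __name__ == "__main__":
    contraction = {"chars": {"a": "1"}, "unknown": "26"}
    text = {"chars": {"A": "17"}}
    for ch in ("a", "A", "\u2019"):
        print(repr(ch), lookup_cells(ch, contraction, text))
```

With this ordering a 6-dot contraction table needs no copies of the dot7/dot8
definitions, yet it still controls what an undefined character looks like.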

-- 
CYa,
  ⡍⠁⠗⠊⠕ | Debian Developer <URL:http://debian.org/>
  .''`. | Get my public key via finger mlang/key at db.debian.org
 : :' : | 1024D/7FC1A0854909BCCDBE6C102DDFFC022A6B113E44
 `. `'
   `-      <URL:http://delysid.org/>  <URL:http://www.staff.tugraz.at/mlang/>

