[BRLTTY] reading PDF files with BRLTTY: character conversion

Zachary Kline zkline at speedpost.net
Sun Apr 24 23:07:31 EDT 2011


Dear Jason,
pdftotext itself can do this. Investigate its character set options (the -enc option selects the output encoding); if you convert to ASCII, you'll find the ligatures properly replaced by plain letters.  At least, such has been my experience.
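If you would rather post-process text that pdftotext has already written, a short Python filter is one possible approach. This is only a sketch under assumptions: expand_ligatures is a made-up name, and it assumes the only characters that need expanding are the Latin ligatures at U+FB00 through U+FB06 (ff, fi, fl, ffi, ffl, long s-t, st). It relies on Unicode compatibility decomposition (NFKD) from the standard unicodedata module, and it leaves every other character, including accented letters, untouched.

```python
import unicodedata

# Assumed scope: the Latin ligature block U+FB00..U+FB06 only.
FIRST_LIGATURE = "\ufb00"
LAST_LIGATURE = "\ufb06"


def expand_ligatures(text: str) -> str:
    """Return text with Latin ligature characters replaced by plain letters.

    Each ligature code point is expanded through its Unicode compatibility
    decomposition (NFKD); all other characters are passed through unchanged.
    """
    return "".join(
        unicodedata.normalize("NFKD", ch)
        if FIRST_LIGATURE <= ch <= LAST_LIGATURE
        else ch
        for ch in text
    )
```

To convert a whole file, read it as UTF-8, pass the contents through expand_ligatures, and write the result back out; the output stays UTF-8, so it remains searchable and any non-ASCII text survives.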
Hope this helps,
Zack.

On Apr 24, 2011, at 8:02 PM, Jason White wrote:

> I've noticed that when the version of pdftotext from poppler-utils is used, or
> pdftohtml is used (which also depends on Poppler), certain sequences of
> letters are often represented as single characters in the output. This is due
> to kerning and ligatures in the typesetting, which result in the use of
> special glyphs in the font.
> 
> To read such files with BRLTTY, is there a tool which can expand these
> characters into the appropriate letter sequences?
> 
> Someone mentioned to me that iconv should be able to do this, but I haven't
> been able to find the right options. The input would be a text file produced
> by pdftotext, and the output a text file with the characters correctly
> represented as letters.
> 
> Alternatively, I suppose BRLTTY could be made to do this, given suitable
> tables; but I would rather convert the text files themselves, to aid in
> searching.


