Texinfo / parse-info stuff

Subject: Texinfo / parse-info stuff
From: Rupert Swarbrick
Date: Sat, 04 May 2013 17:57:24 +0100
>> (snip: clisp failing miserably to print latin1 text for me)
>
> FWIW, this works for me on my Mac, using clisp 2.49. "zurück" is printed
> (correctly with two dots over the second u).

How confusing! Well, with clisp at least I can dig a bit deeper to see
what's going wrong on my system. I'll spend a bit of time on it now, I
think.

> Not sure about LANG=es.  I see
>
> los par??metros
>
> with all of ccl, clisp and cmucl.  I don't know if that's correct or not.

Ah, well this looks like a bug in our transcoding. If you use Emacs, or
another text editor that gives you control over coding systems, visit
doc/info/es.utf8/maxima.info-1 and tell the editor that it should be
utf-8 text. The first line of the build_info text has this mangled data,
which looks like we didn't convert it correctly from latin1.
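
As a rough sketch of one way this kind of mangling can arise (Python
purely for illustration; convert_latin1_to_utf8 is a made-up name and
this is not the actual build code): if a latin1-to-utf-8 conversion is
applied to data that is already utf-8, every non-ASCII character comes
out as two characters, which would match the "par??metros" above.

```python
# Hypothetical illustration only; this is not the Maxima build code.

def convert_latin1_to_utf8(data: bytes) -> bytes:
    """Interpret data as latin1 and re-encode it as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


original = "los par\u00e1metros"  # "los parámetros"

# Correct use: the input really is latin1.
good = convert_latin1_to_utf8(original.encode("latin-1"))

# Suspected failure mode (an assumption): the input was already UTF-8,
# so the two bytes of the accented letter are each treated as a latin1
# character and encoded again.
mangled = convert_latin1_to_utf8(original.encode("utf-8"))

print(good.decode("utf-8"))     # accented letter is a single character
print(mangled.decode("utf-8"))  # accented letter has become two characters
print(len(good), len(mangled))  # the mangled version is two bytes longer
```

Whether the build actually does this, or goes wrong in some other way,
would need checking against the real conversion step.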

I *think* this bug is orthogonal to the lisp info-parsing stuff.

> But sbcl (1.0.49) prints ?? for "?".  Same results when I use
> LANG=es_ES:UTF-8.

I think that you were probably being given the utf-8 data both times? Or
maybe we're not correctly transcoding from latin1 data in
doc/info/es/maxima.info-1 to your utf-8 terminal. Hmm.

> Also, I can't build with ecl anymore.  It complains:
>
>
> ;      - Compiling source file
> ;        "/Volumes/share2/src/sourceforge/maxima/maxima-mac/src/locale.lisp"
> ;;;
> ;;; Compiling /Volumes/share2/src/sourceforge/maxima/maxima-mac/src/locale.lisp.
> ;;; OPTIMIZE levels: Safety=2, Space=0, Speed=3, Debug=2
> ;;;
> ;;; Compiling (DEFVAR *LOCALE-DEFNS* ...).
>
> Cannot find the external symbol LATIN-1 in #<"SI" package>.
>
> Perhaps my version of ecl is too old (11.1)?

Well, I'm only on 11.1.1, it seems, so that's a bit surprising. Looking
at
http://ecls.sourceforge.net/new-manual/ch19.html#ansi.streams.formats, I
see that a unicode build of ECL is required though. Is it possible that
this is what's missing? If so, I guess we've got to work out a sensible
fallback. I can't spot a ":unicode" entry in *features* here, so this
might be slightly nontrivial...

> Overall, it seems parse-info works fine.  I didn't measure whether
> this new scheme is faster than perl.  Both are fast enough on
> my machines that I don't care.
>
> I wonder if we don't also need to set the external format for the
> output stream.  I would think that we would be correct most of the time if
> we could set the output stream to use utf8 always.  All of my
> terminals are set for utf8.

I agree that we need to think about it, but I would sort of expect the
lisp implementation to sort this out: My mental model is that if my lisp
reads in a character (rather than byte) stream, it converts each
character into its favoured internal coding system. Then, when it prints
the character, it can look at the locale data ($LANG etc.)  to figure
out what the terminal is expecting. Presumably there are optimisations
to avoid converting if input and output formats are equal and both
differ from the internal representation, but that's an implementation
detail. But maybe my mental model is wrong. Hmm.
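
To make that mental model concrete, here is a minimal sketch in Python
(not Lisp, and not a claim about what any lisp implementation really
does). The helper names encoding_from_locale and transcode are invented
for the example, and the locale parsing is deliberately simplistic: it
just takes whatever follows the dot in a value like "de_DE.UTF-8" and
falls back to ASCII otherwise.

```python
# Sketch of the "decode on input, encode for the terminal on output" model.
# All names here are hypothetical; real implementations may differ.

def encoding_from_locale(lang: str, default: str = "ascii") -> str:
    """Return the codeset part of a POSIX-style locale string."""
    base = lang.split("@", 1)[0]        # drop a modifier such as "@euro"
    if "." in base:
        return base.split(".", 1)[1]    # "de_DE.UTF-8" -> "UTF-8"
    return default                      # "C", "es", ... carry no codeset


def transcode(raw: bytes, input_encoding: str, lang: str) -> bytes:
    """Decode raw bytes to characters, then encode for the given locale."""
    text = raw.decode(input_encoding)   # bytes -> internal characters
    out_encoding = encoding_from_locale(lang)
    # Characters the terminal encoding cannot represent become "?".
    return text.encode(out_encoding, errors="replace")


latin1_input = b"zur\xfcck"             # "zurück" stored as latin1

print(transcode(latin1_input, "latin-1", "de_DE.UTF-8"))
print(transcode(latin1_input, "latin-1", "C"))
```

In this model a UTF-8 locale gets the two-byte encoding of the umlaut,
while a locale with no usable codeset gets a "?" in its place, which is
at least reminiscent of the output people are reporting.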


Rupert