On Tue, 7 Dec 2010, Raymond Toy wrote:
< On 12/6/10 11:28 PM, Robert Dodier wrote:
< > On Mon, Dec 6, 2010 at 5:23 PM, Raymond Toy <toy.raymond at gmail.com> wrote:
< >
< >> Can't say anything about the modified build_index since I don't seem to
< >> have that, but I do agree that read-info-text needs to be modified to
< >> open the file with element-type unsigned-byte 8 and the result of
< >> read-sequence needs to be converted from an array of octets to a string
< >> using the correct encoding. All lisps have some way of doing this, but
< >> I'm not sure maxima currently knows what the encoding of the file is.
< > I disagree that read-info-text needs to be changed; at this point
< > I don't see any evidence that it is doing something wrong.
< > Given the offset and length, it reads the junk in the file.
< > The offset is certainly wrong, and maybe the length too,
< > but that's a problem to fix in build_index.pl.
<
< The part about the offset and length is ok in read-info-text. There's a
< mismatch between what is read, what's in the string, and how it gets
< output. Perhaps maxima gets away with it because we don't (I think) set
< the external format for the output streams so the format is presumably
< latin-1. Reading the octets from the stream produces the right utf-8
< octets that will get sent out as is to the terminal, so it looks like
< everything is ok. But internally, the string is not right.
<
< > I say this after looking at the output of
< > od -Ad -w25 -c maxima.info-1
< > and looking for the offsets for "expand" and "additive" in
< > that file. The stuff in the od output at those offsets is indeed
< > what's shown by ? expand and ? additive.
< >
< > The offset for "expand" is way off the mark; it should be
< > 287010 from what I can tell, but it's shown as 33336 in
< > maxima-index.lisp. I don't think that can be attributed
< > to the difference between bytes and characters.
< Yes, this is what I see. And expandwrt has the same offset.
I have just committed the changes to build_index.pl.
The new script re-creates the English-language Lisp info hashes. When I
generate the German info files in de with UTF-8 encoding and then
run the new script to generate the Lisp info hashes, I
get the correct beginning for
? expand
and several others.
However, I am seeing a few extra characters of overflow text at the
end of the docstring.
This suggests to me that the offsets and lengths are now correct.
I believe that file-position is going to the correct position in
the info file, but that read-sequence is reading in length characters
rather than length bytes. In a UTF-8 file, every multi-byte character
in the entry would then make the read run that much further past the
end of the entry.
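
To make the suspected effect concrete, here is a small sketch in Python
rather than Lisp. It is only an analogy for what read-sequence would do on
a character stream, not Maxima's code; the sample strings and variable
names are invented for illustration. The length is recorded in bytes, as I
assume the index does, and then used once as a byte count and once as a
character count.

```python
import io

# Invented stand-in for a UTF-8 info file: one entry, then the next one.
entry = "Größe: vereinfacht Ausdrücke.\n"
following = "Nächster Eintrag\n"
data = (entry + following).encode("utf-8")

offset = 0
# Assumption: the index stores the entry length in bytes.
length = len(entry.encode("utf-8"))

# Read `length` bytes from a binary stream, then decode: exactly the entry.
binary = io.BytesIO(data)
binary.seek(offset)
by_bytes = binary.read(length).decode("utf-8")

# Read `length` characters from a decoding stream: this overshoots by one
# character for every extra byte used by a multi-byte character.
text = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", newline="")
by_chars = text.read(length)

# Whatever was read beyond the entry is the overflow.
overflow = by_chars[len(entry):]
print(repr(by_bytes))
print(repr(overflow))
```

The entry contains three two-byte characters, so the character-based read
picks up three characters of the following entry, which looks like the
kind of overflow described above.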
I would appreciate it if someone would test this.
Leo