error trying to build de documentation

Subject: error trying to build de documentation
From: Robert Dodier
Date: Mon, 6 Dec 2010 11:43:14 -0700

When I wrote the perl script to compute the offsets,
there was some Lisp-inspired craziness about byte offsets vs character offsets.
I seem to recall that the offsets are character offsets,
but there is something else which is a byte count ... maybe the
amount of stuff to read is a number of bytes, not a number of characters.
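
To make the byte-vs-character distinction concrete, here is a minimal
sketch in Python (not the perl script or the Lisp reader). The sample
text is made up for illustration and is not taken from the info files.
It shows how the two kinds of offset drift apart once a multibyte
character appears before the target:

```python
# Illustrative text only; not taken from the Maxima info files.
text = "función: expand\nDescripción de expand.\n"
target = "expand"

# Offset counted in characters (what a decoded string sees).
char_offset = text.index(target)

# Offset counted in bytes/octets (what a seek on the raw file sees).
data = text.encode("utf-8")
byte_offset = data.index(target.encode("utf-8"))

# Each non-ASCII character before the target adds to the difference.
drift = byte_offset - char_offset

print("character offset:", char_offset)
print("byte offset:     ", byte_offset)
print("drift:           ", drift)
```

With ASCII-only text the two offsets agree, so a mismatch would only
show up in translated documentation.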

I'm pretty sure I did try reading some files with multibyte characters
(Spanish & Portuguese, I guess) so I wouldn't throw out the
existing offset/count stuff just yet. Unfortunately I won't
have time to investigate for a few days, maybe someone else
can take a look at it.

best, Robert Dodier

On 12/6/10, Raymond Toy <toy.raymond at gmail.com> wrote:
> On 12/6/10 11:37 AM, Leo Butler wrote:
>>
>> On Mon, 6 Dec 2010, Raymond Toy wrote:
>>
>> < On 12/6/10 1:10 AM, Robert Dodier wrote:
>> < > Yeah, I see the problem with the incorrect indexing too.
>> < > Could be looking in the correct file at the incorrect offset,
>> < > or the incorrect file at the correct offset, or
>> < > both the file and offset are incorrect. I didn't
>> < > look at it carefully.
>> < I don't read perl very well, but could the problem be that
>> < build_index.pl is reading the info files with a utf-8 encoding?  This is
>> < the right encoding, but won't that totally mess up the index in
>> < maxima-index.lisp?  I'm pretty sure the indices in maxima-index.lisp are
>> < octet offsets, not character offsets.
>>
>>  I was inclined to believe this, but I don't think the problem is here.
>>  I re-wrote the build_index.pl to use the right encoding (and speed it
>>  up), but this doesn't affect the problem.
>>
>>  Indeed, if you open maxima.info-1 in an emacs buffer, put point at
>>  (point-min) and (goto-char 288618), you will arrive in the middle
>>  of the `expand' documentation. So the char vs. byte counts are quite
>>  close. Accessing online help for `expand'
>>  puts you in the midst of the docstring for `example'.
> But, from looking at read-info-text in cl-info.lisp, the octet count has
> to be exact because read-info-text moves to the exact offset in the file
> and reads some number of octets.  So, close isn't enough.  From tracing
> read-info-text on "? expand", the offset is 33623, but the documentation
> for expand starts at offset 288346.
>
> (33623 was obtained from maxima-index.lisp.)
>
> So calling read-info-text with the correct offset produces the correct
> documentation (more or less).
>>  Even more peculiarly, ? expandwrt displays the same string as ? expand,
>>  but the offsets differ.
> Because maxima-index.lisp says the offsets are the same.
>>
>>  Based on all this, I tend to think the problem lies in the lisp
>>  function reading the info files.
> You are also correct about this.  read-info-text opens the file with
> some default encoding.   I'm not exactly sure what file-position does in
> various lisps for encoded files.  If file-position moves to the
> specified octet, then that's ok.  But then we use read-sequence.
> Read-sequence doesn't support any kind of encoding, so the returned
> string will probably be messed up.
>
> I think what we need to do here is open the file as a binary file of
> octets, move to the correct offset and read in the desired number of
> octets into an array.  Then this array needs to be converted to a string
> using the correct encoding.  (Most lisps have some kind of
> octets-to-string function.)
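
The approach described in the paragraph above (open as octets, seek,
read a fixed number of octets, then decode) might look roughly like
this. It is a sketch in Python rather than Lisp; the function name
read_info_text is only a stand-in that mirrors the Lisp name, and the
file content and the utf-8 default are assumptions for the demo:

```python
import os
import tempfile


def read_info_text(path, byte_offset, byte_count, encoding="utf-8"):
    """Read byte_count octets starting at byte_offset, then decode them.

    Hypothetical Python stand-in for the Lisp function discussed here.
    """
    with open(path, "rb") as f:   # binary mode: positions are octets
        f.seek(byte_offset)       # move to the exact octet offset
        raw = f.read(byte_count)  # read octets, not characters
    return raw.decode(encoding)   # convert to a string only at the end


# Made-up content with multibyte characters before and inside the entry.
content = "Introducción\n\x1f\nFunction: expand (expr)\nExpande la expresión.\n"
data = content.encode("utf-8")
start = data.index(b"Function: expand")  # octet offset of the entry
count = len(data) - start                # octet count of the entry

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    path = tmp.name
try:
    entry = read_info_text(path, start, count)
finally:
    os.remove(path)

print(entry)
```

Because both the offset and the count are in octets, the index only
has to agree with the raw file, and the encoding matters only in the
final decode step.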
>
> Does this make sense to you?
>
> Ray
