more about build-index+cl-ppcre branch & encodings

On Wed, 2 Mar 2011, Robert Dodier wrote:

< I've updated my sandbox to revision 9c49048 and built Maxima.
< I'm seeing the same behavior today as I did a day or two ago;
< titles & content are displayed correctly in ISO-8859 locales;
< in UTF-8 locales, titles are correct and content is messed up.
< 
< I guess that the encoding for the content is set incorrectly.
< I don't know how the encoding for the titles could be correct
< and the content incorrect.

Because they use different functions to write their output.
The output to *standard-output* is being written with the
wrong encoding for you (but not for me). Could you try Ray's
cmucl fix, please?
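
To illustrate how two output paths can disagree, here is a minimal
sketch in Python (not Maxima's Lisp code). Every name in it is
hypothetical, and the assumption that the content path always writes
ISO-8859-1 is made up for the example; it is not a diagnosis of what
build-index.lisp or cmucl actually does. It only shows that one path
honouring the locale and another ignoring it reproduces the symptom:
both correct in ISO-8859 locales, only the titles correct in UTF-8.

```python
# Hypothetical sketch: none of these names come from Maxima.
TEXT = "Funci\u00f3n"  # sample text with one non-ASCII character (o-acute)


def write_title(text: str, locale_encoding: str) -> bytes:
    """Path A: encode with whatever encoding the locale asks for."""
    return text.encode(locale_encoding)


def write_content(text: str, locale_encoding: str) -> bytes:
    """Path B (assumed for illustration): ignore the locale and
    always emit ISO-8859-1 bytes."""
    return text.encode("iso-8859-1")


def terminal_shows(raw: bytes, locale_encoding: str) -> str:
    """Decode the bytes the way a terminal in that locale would,
    substituting U+FFFD for anything it cannot decode."""
    return raw.decode(locale_encoding, errors="replace")


if __name__ == "__main__":
    for enc in ("iso-8859-1", "utf-8"):
        title = terminal_shows(write_title(TEXT, enc), enc)
        content = terminal_shows(write_content(TEXT, enc), enc)
        print(enc, "title ok:", title == TEXT, "content ok:", content == TEXT)
```

Under these assumptions the title survives in both locales, while the
content is only intact when the locale happens to match the encoding
that path B hard-codes.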

< 
< As it happens, the code for the existing describe system
< in src/cl-info.lisp doesn't bother with encodings at all;
< it falls on the Lisp implementation to figure out the encoding.
< That scheme displays titles & content correctly in ISO-8859
< and UTF-8 locales so far as I know.

It would be nice if you would test this supposition, so we
can know for certain.

< That suggests that the encoding stuff in src/build-index.lisp
< could be simplified. Just a guess.
 
 And now we come full circle, back to Ray's initial idea.
 
 Leo
