>From mailnull Wed May 22 20:40:46 2013
Received-SPF: pass (sog-mx-4.v43.ch3.sourceforge.com: domain of math.utexas.edu designates 146.6.25.7 as permitted sender) client-ip=146.6.25.7; envelope-from=maxima-bounces at math.utexas.edu; helo=ironclad.mail.utexas.edu;
Date: Wed, 22 May 2013 20:39:26 +0000
From: Leo Butler <l_butler at users.sourceforge.net>
CC: <maxima at math.utexas.edu>
Content-Type: text/plain; charset="utf-8"
>From mailnull Wed May 22 18:34:43 2013
From: Robert Dodier <robert.dodier at gmail.com>
Date: Wed, 22 May 2013 18:32:53 +0000
Content-Type: text/plain; charset="utf-8"
On 2013-05-22, Leo Butler <l_butler at users.sourceforge.net> wrote:
> But both ecl and gcl choke,
Well, ECL should be able to process UTF-8 characters. How did you launch
it? I'm pretty sure I've tried it with ECL by launching a UTF-8 xterm
and then executing Maxima + ECL in that and it works fine. Also
something like 'LANG=foo.UTF-8 maxima -l ecl'.
That does not work for me with ECL 11.1.1 from the Debian testing
repo. The issue appears to be with this version of ECL: if the
encoding is set on the command line, ECL barfs.
> Maxima 5.30.0 http://maxima.sourceforge.net
> using Lisp GNU Common Lisp (GCL) GCL 2.6.7 (a.k.a. GCL)
> Distributed under the GNU Public License. See the file COPYING.
> Dedicated to the memory of William Schelter.
> The function bug_report() provides bug reporting information.
> (%i1) ρ:1;
> incorrect syntax: \201 is not an infix operator
> \317\201
> ^
Well, this is understandable -- GCL doesn't see the whole UTF-8
character, instead a sequence of 2 characters \317 and \201. \317 is
nonalphabetic according to ALPHA-CHAR-P, therefore it's treated as a
separate token from the next one (\201), then the parser barfs on \201
since it's not an operator.
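
The byte-level view described here can be checked outside Lisp. The
following is a minimal Python sketch, assuming the character entered at
the prompt was Greek small rho (U+03C1), which is what the byte pair
\317 \201 decodes to as UTF-8. The variable names are illustrative only.

```python
# Assumption: the input character is Greek small rho (U+03C1).
ch = "\u03c1"

# UTF-8 encodes it as two bytes; a reader without UTF-8 support
# sees each byte as a separate character.
encoded = ch.encode("utf-8")
octal_bytes = ["\\%o" % b for b in encoded]
print(" ".join(octal_bytes))

# Decoding the two bytes together recovers the single character.
decoded = bytes([0o317, 0o201]).decode("utf-8")
print(decoded == ch)
```

The first print shows the same octal escapes that appear in the GCL
error message.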
Ok, the error message explains as much. The point is that the GCL
reader happily interns a symbol whose symbol-name consists of 2
characters \317\201:
>(coerce (symbol-name 'ρ) 'list)
(#\317 #\201)
So I could hack a "utf8-enabled" Maxima parser by redefining alphabetp
or *alphabet*, like so
(%i1) :lisp (setf *alphabet* (append '(#\317 #\201) *alphabet*))
(\317 \201 _ %)
(%i1) ρ : 1;
(%o1) 1
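
Why extending *alphabet* helps can be illustrated with a toy model. The
Python sketch below is not Maxima's parser: tokenize and its
extra_alphabet parameter are hypothetical names, and the base alphabet
(ASCII letters plus _ and %) is an assumption. It models one idea only:
bytes treated as alphabetic are glued into a single identifier, while
any other byte becomes a token of its own.

```python
import string

def tokenize(data, extra_alphabet=b""):
    """Toy scanner (hypothetical, not Maxima's): runs of alphabetic
    bytes form one identifier; other non-space bytes stand alone."""
    alphabet = (set(string.ascii_letters.encode())
                | set(b"_%")
                | set(extra_alphabet))
    tokens, current = [], bytearray()
    for b in data:
        if b in alphabet:
            current.append(b)
            continue
        if current:
            # A non-alphabetic byte ends the identifier in progress.
            tokens.append(bytes(current))
            current = bytearray()
        if not bytes([b]).isspace():
            tokens.append(bytes([b]))
    if current:
        tokens.append(bytes(current))
    return tokens

# Assumption: the input is "rho : 1 ;" with rho encoded as UTF-8.
source = "\u03c1:1;".encode("utf-8")
split_tokens = tokenize(source)
joined_tokens = tokenize(source, extra_alphabet=b"\317\201")
print(split_tokens)
print(joined_tokens)
```

Without the extra bytes, \317 and \201 come out as two separate tokens,
as in the error above; with them, the pair forms one identifier.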
I have written a hack to enable (selected) wide characters in Lisps
that are not UTF-8 aware, as I suggested. I put it on GitHub; I'm not
sure whether it merits being put in Maxima's contrib directory.
git clone git://github.com/leo-butler/utf8-hack.git
There is a README with examples. It has worked fine for me with both
GCL and ECL. Unfortunately, the GitHub web server does not do justice
to the README file; it is best read offline.
Leo