Next: String Processing, Previous: String Input and Output, Up: stringproc [Contents][Index]
Characters are strings of length 1.
Prints information about the current external format of the Lisp reader
and in case the external format encoding differs from the encoding of the
application which runs Maxima adjust_external_format
tries to adjust
the encoding or prints some help or instruction.
adjust_external_format
returns true
when the external format has
been changed and false
otherwise.
Functions like cint, unicode, octets_to_string and string_to_octets need UTF-8 as the external format of the Lisp reader to work properly over the full range of Unicode characters.
Examples (Maxima on Windows, March 2016):
Using adjust_external_format
when the default external format
is not equal to the encoding provided by the application.
1. Command line Maxima
In case a terminal session is preferred it is recommended to use Maxima compiled
with SBCL. Here Unicode support is provided by default and calls to
adjust_external_format
are unnecessary.
If Maxima is compiled with CLISP or GCL it is recommended to change
the terminal encoding from CP850 to CP1252.
adjust_external_format
prints some help.
CCL reads UTF-8 while the terminal input is CP850 by default.
CP1252 is not supported by CCL. adjust_external_format
prints instructions for changing the terminal encoding and external format
both to iso-8859-1.
2. wxMaxima
In wxMaxima SBCL reads CP1252 by default but the input from the application is UTF-8 encoded. Adjustment is needed.
Calling adjust_external_format
and restarting Maxima
permanently changes the default external format to UTF-8.
(%i1)adjust_external_format(); The line (setf sb-impl::*default-external-format* :utf-8) has been appended to the init file C:/Users/Username/.sbclrc Please restart Maxima to set the external format to UTF-8. (%i1) false
Restarting Maxima.
(%i1) adjust_external_format(); The external format is currently UTF-8 and has not been changed. (%i1) false
Returns true
if char is an alphabetic character.
To identify a non-US-ASCII character as an alphabetic character the underlying Lisp must provide full Unicode support. E.g. a German umlaut is detected as an alphabetic character with SBCL in GNU/Linux but not with GCL. (In Windows Maxima, when compiled with SBCL, must be set to UTF-8. See adjust_external_format for more.)
Example: Examination of non-US-ASCII characters.
The underlying Lisp (SBCL, GNU/Linux) is able to convert the typed character into a Lisp character and to examine.
(%i1) alphacharp("ü"); (%o1) true
In GCL this is not possible. An error break occurs.
(%i1) alphacharp("u"); (%o1) true (%i2) alphacharp("ü"); package stringproc: ü cannot be converted into a Lisp character. -- an error.
Returns true
if char is an alphabetic character or a digit
(only corresponding US-ASCII characters are regarded as digits).
Note: See remarks on alphacharp.
Returns the US-ASCII character corresponding to the integer int
which has to be less than 128
.
See unicode for converting code points larger than 127
.
Examples:
(%i1) for n from 0 thru 127 do ( ch: ascii(n), if alphacharp(ch) then sprint(ch), if n = 96 then newline() )$ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z
Returns true
if char_1 and char_2 are the same character.
Like cequal
but ignores case which is only possible for non-US-ASCII
characters when the underlying Lisp is able to recognize a character as an
alphabetic character. See remarks on alphacharp.
Returns true
if the code point of char_1 is greater than the
code point of char_2.
Like cgreaterp
but ignores case which is only possible for non-US-ASCII
characters when the underlying Lisp is able to recognize a character as an
alphabetic character. See remarks on alphacharp.
Returns true
if obj is a Maxima-character.
See introduction for example.
Returns the Unicode code point of char which must be a
Maxima character, i.e. a string of length 1
.
Examples: The hexadecimal code point of some characters (Maxima with SBCL on GNU/Linux).
(%i1) obase: 16.$ (%i2) map(cint, ["$","£","€"]); (%o2) [24, 0A3, 20AC]
Warning: It is not possible to enter characters corresponding to code points larger than 16 bit in wxMaxima with SBCL on Windows when the external format has not been set to UTF-8. See adjust_external_format.
CMUCL doesn’t process these characters as one character.
cint
then returns false
.
Converting a character to a code point via UTF-8-octets may serve as a workaround:
utf8_to_unicode(string_to_octets(character));
See utf8_to_unicode, string_to_octets.
Returns true
if the code point of char_1 is less than the
code point of char_2.
Like clessp
but ignores case which is only possible for non-US-ASCII
characters when the underlying Lisp is able to recognize a character as an
alphabetic character. See remarks on alphacharp.
Returns true
if char is a graphic character but not a space character.
A graphic character is a character one can see, plus the space character.
(constituent
is defined by Paul Graham.
See Paul Graham, ANSI Common Lisp, 1996, page 67.)
(%i1) for n from 0 thru 255 do ( tmp: ascii(n), if constituent(tmp) then sprint(tmp) )$ ! " # % ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~
Returns true
if char is a digit where only the corresponding
US-ASCII-character is regarded as a digit.
Returns true
if char is a lowercase character.
Note: See remarks on alphacharp.
The newline character (ASCII-character 10).
The space character.
The tab character.
Returns the character defined by arg which might be a Unicode code point or a name string if the underlying Lisp provides full Unicode support.
Example: Characters defined by hexadecimal code points (Maxima with SBCL on GNU/Linux).
(%i1) ibase: 16.$ (%i2) map(unicode, [24, 0A3, 20AC]); (%o2) [$, £, €]
Warning: In wxMaxima with SBCL on Windows it is not possible to convert code points larger than 16 bit to characters when the external format has not been set to UTF-8. See adjust_external_format for more information.
CMUCL doesn’t process code points larger than 16 bit.
In these cases unicode
returns false
.
Converting a code point to a character via UTF-8 octets may serve as a workaround:
octets_to_string(unicode_to_utf8(code_point));
See octets_to_string, unicode_to_utf8.
In case the underlying Lisp provides full Unicode support the character might be
specified by its name. The following is possible in ECL, CLISP and SBCL,
where in SBCL on Windows the external format has to be set to UTF-8.
unicode(name)
is supported by CMUCL too but again limited to 16 bit
characters.
The string argument to unicode
is basically the same string returned by
printf
using the "~@c" specifier.
But as shown below the prefix "#\" must be omitted.
Underlines might be replaced by spaces and uppercase letters by lowercase ones.
Example (continued): Characters defined by names (Maxima with SBCL on GNU/Linux).
(%i3) printf(false, "~@c", unicode(0DF)); (%o3) #\LATIN_SMALL_LETTER_SHARP_S (%i4) unicode("LATIN_SMALL_LETTER_SHARP_S"); (%o4) ß (%i5) unicode("Latin small letter sharp s"); (%o5) ß
Returns a list containing the UTF-8 code corresponding to the Unicode code_point.
Examples: Converting Unicode code points to UTF-8 and vice versa.
(%i1) ibase: obase: 16.$ (%i2) map(cint, ["$","£","€"]); (%o2) [24, 0A3, 20AC] (%i3) map(unicode_to_utf8, %); (%o3) [[24], [0C2, 0A3], [0E2, 82, 0AC]] (%i4) map(utf8_to_unicode, %); (%o4) [24, 0A3, 20AC]
Returns true
if char is an uppercase character.
Note: See remarks on alphacharp.
This option variable affects Maxima when the character encoding provided by the application which runs Maxima is UTF-8 but the external format of the Lisp reader is not equal to UTF-8.
On GNU/Linux this is true when Maxima is built with GCL
and on Windows in wxMaxima with GCL- and SBCL-builds.
With SBCL it is recommended to change the external format to UTF-8.
Setting us_ascii_only
is unnecessary then.
See adjust_external_format for details.
us_ascii_only
is false
by default.
Maxima itself then (i.e. in the above described situation) parses the UTF-8 encoding.
When us_ascii_only
is set to true
it is assumed that all strings
used as arguments to string processing functions do not contain Non-US-ASCII characters.
Given that promise, Maxima avoids parsing UTF-8 and strings can be processed more efficiently.
Returns a Unicode code point corresponding to the list which must contain the UTF-8 encoding of a single character.
Examples: See unicode_to_utf8.
Next: String Processing, Previous: String Input and Output, Up: stringproc [Contents][Index]