Universal read_data function



On June 8, 2011, Paul Bowyer wrote:
----------------------------------------
Evidently there is something going on with text file EOL conversion that
happens in Thunderbird. I tried opening the attached data files directly
with okteta by right-clicking on the attachment and selecting okteta,
but Thunderbird uses some internal program/function to transfer the
data. When I examined the data file in okteta I'd find the EOL chars to
be either Unix LF or, in the case of the Macintosh file, non-existent. I
also tried using Notepad (comes with the Windows emulator, Wine) and was
unable to get the EOL chars the way you described them. I finally used
kwrite as the opener after I had pre-configured it for the proper EOL
sequences, but even then I had to manually modify a couple of the files
by inserting spaces at the proper locations so I could then use okteta
to convert those inserted spaces to CRs.

After I got the EOL sequences set up according to those you described, I
ran eol_chars and here are the results:
-------------
printfile("/home/pfb/ndata9.dat")$
eol_chars("/home/pfb/ndata9.dat");

printfile("/home/pfb/line1w.txt")$
eol_chars("/home/pfb/line1w.txt");

printfile("/home/pfb/line1u.txt")$
eol_chars("/home/pfb/line1u.txt");

printfile("/home/pfb/line1m.txt")$
eol_chars("/home/pfb/line1m.txt");

2 , 4.8, -3/4, "xyz", -2.8e-9

2 , 4.8, -3/4, "xyz", -2.8e-9

2 , 4.8, -3/4, "xyz", -2.8e-9

(%o4) [13,10]
ABCD
(%o6) [13,10]
ABCD
(%o8) [10]
ABCD
(%o10) [13]
----------------
So it looks like you have a working solution to the EOL situation that
covers all of the platforms Maxima runs on.

By looking at the Maxima documentation, I wouldn't have thought of using
lfreeof as a way to discover character numbers in a string. So I learned
something new about Maxima that may come in handy one day.
-----------------------------------------
Hi Paul,

The next part of the experiment is to take those three one
line files with known different eol chars and open each as
a stream and then use the maxima function readline.

The Maxima help manual description of readline is irritatingly
vague:
---
  readline (stream)

Returns a string containing the characters from the current position in 
stream up to the end of the line or false if the end of the file is 
encountered.

----
(notice that 'end of line' is not defined there.)

If you look at the Lisp code for readline in stringproc.lisp, it uses
the Lisp function read-line, and the Common Lisp Cookbook description
of read-line is:
----
READ-LINE will read one line from a stream (which defaults to standard 
input) the end of which is determined by either a newline character or the 
end of the file. It will return this line as a string without the trailing 
newline character. (Note that READ-LINE has a second return value which is 
true if there was no trailing newline, i.e. if the line was terminated by 
the end of the file.) READ-LINE will by default signal an error if the end 
of the file is reached. You can inhibit this by supplying NIL as the second 
argument. If you do this, READ-LINE will return NIL if it reaches the end of 
the file.
-----------------
So my literal reading of this is that the CL read-line will return all the
chars up to the newline char (which I interpret to mean LF, decimal 10).
For a file whose lines end in CR LF, this would mean that Maxima's readline
would return the CR (decimal 13) char as well as the first four chars "ABCD".
---------------------------
So basically, I don't know who or what to believe, and thus this
experiment. Does the behavior of read-line depend on the Lisp version in a
way that will affect certain users of a universal read_data function?
(In that case I would have to write my own homemade read_line.)
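
A minimal sketch of what such a homemade wrapper might look like, assuming
the only problem is a CR (decimal 13) left at the end of the string that
readline returns. The name read_line_trimmed is hypothetical and not part
of Maxima; the sketch uses only the stringproc functions readline, strimr,
and ascii:

```maxima
load ("stringproc")$

/* read_line_trimmed is a hypothetical helper, not a Maxima built-in.
   It calls readline and then removes any trailing CR (decimal 13)
   characters that the underlying Lisp read-line may have left in place.
   Like readline, it returns false at end of file. */
read_line_trimmed (stream) := block ([line],
  line : readline (stream),
  if line = false
    then false
    else strimr (ascii (13), line))$
```

It would be used in place of readline, e.g. al : read_line_trimmed (ss);
after ss : openr ("line1w.txt");. Note that this sketch does not cover a
CR-only file on a Lisp that treats only LF as the end of a line: there
readline would presumably return the whole file as one string, which could
then be broken up with split (line, ascii (13)).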
------------------------------------
Anyway, if your operating system and Lisp version cooperate as
hoped for, you should get the following results:

------------------
(%i1) display2d:false$
(%i3) load(eol_chars);
(%o3) "c:/work2/eol_chars.mac"
(%i4) printfile("line1w.txt")$
ABCD
(%i5) eol_chars ("line1w.txt");
(%o5) [13,10]
(%i6) file_length ("line1w.txt");
(%o6) 6
(%i7) ss : openr ("line1w.txt");
(%o7) ?\#\<input\ stream\ line1w\.txt\>
(%i8) al : readline (ss);
(%o8) "ABCD"
(%i9) slength(al);
(%o9) 4
(%i10) close(ss);
(%o10) true
(%i11) printfile ("line1u.txt")$
ABCD
(%i12) eol_chars ("line1u.txt");
(%o12) [10]
(%i13) file_length ("line1u.txt");
(%o13) 5
(%i14) ss : openr ("line1u.txt");
(%o14) ?\#\<input\ stream\ line1u\.txt\>
(%i15) al : readline (ss);
(%o15) "ABCD"
(%i16) slength (al);
(%o16) 4
(%i17) close(ss);
(%o17) true
(%i18) printfile ("line1m.txt")$
ABCD
(%i19) eol_chars ("line1m.txt");
(%o19) [13]
(%i20) file_length ("line1m.txt");
(%o20) 5
(%i21) ss : openr ("line1m.txt");
(%o21) ?\#\<input\ stream\ line1m\.txt\>
(%i22) al : readline (ss);
(%o22) "ABCD"
(%i23) slength (al);
(%o23) 4
(%i24) close (ss);
(%o24) true
-------------------------
Ted