Universal read_data function



On May 31, Paul Bowyer wrote:
------------------------
My reason for thinking I needed to use Windows text file standards in the 
data files was because I copy/pasted them from your email messages. If I 
were creating them from scratch on a Linux box, I'd opt for the default LF 
that is standard for Linux text files.

When I try to write utility functions, I try to make them robust so they 
don't fail when things aren't absolutely perfect. It made sense to me to 
handle the case where Windows text file standards were used since you were 
working on a Windows machine. I wasn't trying to be a nuisance by 
continually marking up your code and I hope I didn't upset you, but please 
forgive me if I did.

Paul
-------------------------------------
Hi Paul,

I never get upset, and can only be flattered by your interest in
my faltering efforts at Maxima code.

The current version of read_data (which has changed: see
below) cares not a whit about end of line chars, so that should
never be the issue here. The important thing is that the
file to be read does not contain spurious extra end of
line chars, and that is why I advise looking at the file with
a utility such as notepad2, which clearly shows the
locations and types of end of line chars (shift+control+9,
which is a toggle).

(By the way, when you write data to a stream opened
with openw, using printf as in the manual examples,
the end of line chars are LF (unix).)
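
As a minimal sketch of that kind of write (the file name
"mydata.dat" is only a placeholder, and depending on your
Maxima version you may need load(stringproc) first so that
printf is available):

```maxima
/* "mydata.dat" is a hypothetical file name used only for illustration */
s : openw ("mydata.dat")$

/* ~a prints each argument, ~% ends the line */
printf (s, "~a ~a ~a~%", 1, 2, 3)$
printf (s, "~a ~a ~a~%", 4, 5, 6)$

close (s)$
```

Looking at the resulting file in an editor that displays end of
line chars is an easy way to check what was actually written.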

The NEW version allows the 'data-sep-string' to be "text",
(which is a hack), in which case all lines are read
in as strings without splitting, as is appropriate for
a purely text file which contains spaces and punctuation
marks.

A related change: if the four arg version is
used, supplying a list of line numbers such as
[2,4], then those lines (here lines 2 and 4) are
read into separate sublists as a whole, as one
string for the whole line, doing no splitting.
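
The call forms described above can be sketched as follows. This
assumes read_data has already been loaded into the session, and
"mydata.dat" is a placeholder file name, not a file from this
thread:

```maxima
/* "mydata.dat" is a hypothetical file name used only for illustration */

/* one arg: spaces and/or commas act as data separators */
d1 : read_data ("mydata.dat")$

/* three args: ";" as data separator, mult set to false */
d2 : read_data ("mydata.dat", ";", false)$

/* "text" in the second slot: every line is kept as one string */
d3 : read_data ("mydata.dat", "text")$

/* four args: split on spaces, but keep lines 2 and 4 whole */
d4 : read_data ("mydata.dat", " ", true, [2,4])$
```

If the file is not found, read_data displays a message and
returns false, so it is worth checking the returned value
before using it.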

---------------------------------------
The present complete syntax and code are then:
-------------------------------------------------------------
/*********** read_data  ****************************/
  /*  if only a file name is given, then  the
    data separators can be an arbitrary mixture
    of spaces and commas, but the commas are
    converted to spaces, so strings with spaces
    will choke the code if you only provide the
    filename, or you provide (filename," ").



    syntax: read_data(filename,data-sep-string,mult,line-list)

      for example, read_data(filename,";",false) puts ";"
      in the second slot and false in the third slot.
      (mult is set to true by default.)

      The data separator string can be anything
      recognised by split, and the boolean parameter
      mult is used by split.


      In addition, the data-sep-string can be "text",
      in which case *all* lines of the stream are read
      in as individual strings.

      Thus the syntax read_data(filename,"text") does
      no line splitting.

     The most complicated four arg syntax has the
     form
       read_data (filename, " ", true, [2,4] )

     for example. Here space is used as the data
     separator for the lines that are split (ie.,
     all lines other than 2 and 4), while lines 2
     and 4 are read into separate sublists as a
     whole, as one string for the whole line, doing
     no splitting for lines 2 and 4.
           */


 /* new 5-29 */

 read_data([%v]) :=
    block ([%s,%r,%l,%filename,%dsep,%mult:true,
                 %mix:false,  %whole:[],%ln],

     %filename : part (%v,1),

     if not stringp (%filename)
       then ( disp (" file name must be a Maxima string "),
              return (false)),

    /* file_search returns the path found, or false */
    if file_search (%filename) = false then
      (disp (" file not found "),return (false)),

    if length (%v) = 1 then %mix : true
       else if length(%v) = 2 then %dsep : part (%v,2)
       else if length (%v) = 3
              then (%dsep : part (%v,2), %mult : part (%v,3))
       else
      (%dsep : part (%v,2), %mult : part (%v,3),%whole : part(%v,4)),



    %s : openr (%filename),
    %r : [],
    %ln : 0,

    while (%l : readline(%s)) # false do
       ( %ln : %ln + 1,
         if %dsep = "text" then
            %r : cons (%l,%r)
         else if member (%ln,%whole) then
            %r : cons (%l,%r)
         else if %mix then
            %r : cons (map(parse_string, split(ssubst (" ",",",%l))), %r)
         else %r : cons (map(parse_string, split(%l,%dsep,%mult)), %r)),

    close (%s),
    reverse (%r))$
------------------------------------------------
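
For anyone who wants to see what the splitting step does to a
single line, here is a small sketch that can be pasted into a
session on its own. The sample strings are made up for
illustration, and it assumes the stringproc functions split and
ssubst are available:

```maxima
/* made-up sample line mixing commas and spaces */
line : "1, 2,3  4"$

/* filename-only case: commas become spaces, then split on spaces */
tokens : split (ssubst (" ", ",", line));

/* each token string is then parsed into a Maxima expression */
map (parse_string, tokens);

/* explicit separator with mult = false: adjacent separators
   are not merged into one */
split ("a;;b", ";", false);
```

This is also why strings containing spaces choke the default
case: each space-separated piece is handed to parse_string
separately.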

Ted