detect unix or dos line endings



On May 26 Paul Bowyer wrote:
---------------------------------------
Try:
ss : openr("/home/pfb/ndata1.dat");
ln : flength (ss);

/*Get next-to-last char and see if it's a CR*/
fposition(ss,ln-1);
cs : string(?read\-char(ss,nil));
cln : slength(cs);
cint (charat (cs,cln));

/*Get last char and see if it's a LF*/
fposition(ss,ln);
cs : string(?read\-char(ss,nil));
cln : slength(cs);
cint (charat (cs,cln));

close(ss);
-------------------
Thanks for the suggestions, Paul.
I use your idea of calling the file length
function (flength) to pin down the analysis
below.

It seems to me that your code would
only work if you knew ahead of time
that the file contained only one line (which
of course you might well know).
==============

On May 26 Raymond Toy wrote:
-------------------------------------------
    Edwin> (%i14) cs : string(?read\-char(ss,nil));

You want false, here, not nil.  nil is the maxima symbol nil, which
will be treated as true by READ-CHAR.  "false" will get converted to
CL:NIL, which is what you want to pass to READ-CHAR.

So

  cs: string(?read\-char(ss,false));

will return false when you reach the end of the file instead of
signaling an eof error.
----------------------------------

Thanks, Ray, for giving me the maxima to lisp translation of false:

(%i1) display2d:false$
(%i2) get_char_raw(%s) := string(?read\-char(ss,false))$
(%i3) :lisp $_
((MDEFINE) (($GET_CHAR_RAW) $%S) (($STRING) ((READ-CHAR) $SS NIL)))

It would be nice if there were a list of such translations available
somewhere.

However, I realized that by analyzing the various possibilities I could
predict ahead of time when I might run into an end of file, simply by
comparing the file length with the length of the first line.
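
To make the length comparison concrete, here is a minimal sketch. The name
second_char_readable is hypothetical and not part of file_info; fl stands for
the value of flength on the open stream, and ll for slength of the first line
as returned by readline, assumed to exclude the end-of-line characters.

```maxima
/* Hypothetical helper (illustration only, not part of file_info):
   after the first line and one end-of-line character, is there at
   least one more character left to read?
   fl = file length in characters, ll = length of the first line. */
second_char_readable(fl, ll) := is(fl >= ll + 2)$

second_char_readable(12, 11);  /* 11-char line plus one EOL char: false */
second_char_readable(13, 11);  /* 11-char line plus CR LF: true */
```

When this returns false, a second read after the first end-of-line character
would run past the end of the file, so it should be skipped.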

I use this analysis for the code below:

==============================

(%i1) load(file_info);
(%o1) "c:/work2/file_info.mac"

(%i2) fundef (file_info);

(%o2) file_info(%filename):=block([%s,%fl,%ll,%EOL,%info,%next,%e1],
                if not file_search(%filename)
                    then (disp(" file not found "),return(false)),
                %s:openr(%filename),%fl:flength(%s),%ll:slength(readline(%s)),
                %EOL:[],%info:[%fl],fposition(%s,1+%ll),
                %EOL:cons(cint_val(%s),%EOL),
                if %fl < 5+%ll
                    then (if %fl = 2+%ll
                              then (%EOL:cons(cint_val(%s),%EOL),
                                    %EOL:reverse(%EOL)))
                    else (%next:cint_val(%s),
                          if not lfreeof([10,13],%next)
                              then (%EOL:cons(%next,%EOL),
                                    %EOL:reverse(%EOL))),
                if length(%EOL) = 1
                    then (%e1:part(%EOL,1),
                          if %e1 = 10 then %info:cons("UNIX",%info)
                              else (if %e1 = 13 then %info:cons("MAC",%info)
                                        else %info:cons(%e1,%info)))
                    else (if part(%EOL,1) = 13 and part(%EOL,2) = 10
                              then %info:cons("WINDOWS",%info)
                              else %info:flatten(cons(%EOL,%info))),close(%s),
                %info)

(%i3) fundef (cint_val);

(%o3) cint_val(%ss):=cint(charat(string(?read\-char(%ss)),3))

(%i4) printfile ("ndata1w.dat")$
2.3e9 "Abc"

/* here we test three one-line files with Windows, Unix, and Mac endings */
(%i5) file_info ("ndata1w.dat");
(%o5) ["WINDOWS",13]
(%i6) file_info ("ndata1u.dat");
(%o6) ["UNIX",12]
(%i7) file_info ("ndata1m.dat");
(%o7) ["MAC",12]


(%i8) printfile ("ndata2w.dat")$
2 , 4.8, -3/4, "xyz", -2.8e-9

3 22.2  7/8 "abc" 4.4e10

/* here we test three two line files */

(%i9) file_info ("ndata2w.dat");
(%o9) ["WINDOWS",57]
(%i10) file_info ("ndata2u.dat");
(%o10) ["UNIX",55]
(%i11) file_info ("ndata2m.dat");
(%o11) ["MAC",55]
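
The naming step at the end of file_info can also be looked at in isolation.
The following is only a sketch: eol_name is a hypothetical helper, not part
of the code above, and it assumes the character codes found right after the
first line have already been collected into a list in file order.

```maxima
/* Hypothetical helper (illustration only): name the line-ending
   convention from the character codes that follow the first line.
   13 = carriage return (CR), 10 = line feed (LF). */
eol_name(codes) :=
  if codes = [13, 10] then "WINDOWS"
  elseif codes = [10] then "UNIX"
  elseif codes = [13] then "MAC"
  else "UNKNOWN"$

eol_name([13, 10]);   /* "WINDOWS" */
eol_name([10]);       /* "UNIX" */
eol_name([13]);       /* "MAC" */
```

Unlike file_info, this sketch returns "UNKNOWN" for any other list rather than
passing the raw codes back to the caller.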


Ted Woollett

P.S. For Windows users: Notepad2 makes it very easy to change
end-of-line characters.