On 2013-03-22, Rupert Swarbrick <rswarbrick at gmail.com> wrote:
> A note about byte offsets vs. char offsets: The perl script was very
> careful to compute and store byte offsets. We then used "file-position"
> on the lisp side (with a character stream) to get to the relevant point
> in the file. I haven't carefully checked, but I presume that there are
> lisps where this did the wrong thing. For example, on SBCL it seems that
> file-position counts characters rather than bytes.
Since FILE-POSITION's result is implementation-dependent according to
CLHS, there is potentially a mismatch between the offset computed by the
Perl script and some implementation of FILE-POSITION. I don't know of
any implementation where that actually goes wrong; in particular,
Maxima + SBCL works as expected with UTF-8 characters.
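
To make the potential mismatch concrete, here is a minimal sketch in
Python (not Maxima code; the sample text and variable names are made up
for illustration). It shows that the byte offset and the character
offset of the same spot diverge as soon as a multibyte UTF-8 character
precedes it:

```python
# Hypothetical sample text: one non-ASCII character before the target.
text = "café\nindex entry\n"
encoded = text.encode("utf-8")

# Offset of the word "index", counted in characters and in bytes.
char_offset = text.index("index")
byte_offset = encoded.index(b"index")

# "é" occupies two bytes in UTF-8, so the two offsets differ by one.
print(char_offset, byte_offset)
```

A reader that counts characters but is handed the byte offset would
land one character too far into the file here.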
> With the new code, we compute file-position and then use file-position
> (on the same implementation!), so we needn't care about what it
> represents, just that it's a monotonic function of, well, the file
> position(!). I think this is probably an improvement.
Agreed.
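
As an illustration only, the same round-trip idea can be sketched in
Python, whose text streams also document the value returned by tell()
as an opaque number. The in-memory file and its contents below are
assumptions made for the example, not anything from the Maxima sources:

```python
import io

# Hypothetical file contents containing a multibyte UTF-8 character.
data = "café\nindex entry\n".encode("utf-8")
stream = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", newline="")

# Skip the first line, then record the position as an opaque value.
stream.readline()
saved_position = stream.tell()

# Later: rewind, jump back to the recorded position, and read from there.
stream.seek(0)
stream.seek(saved_position)
line = stream.readline()
print(repr(line))
```

The recorded position is only ever handed back to the same stream
implementation that produced it, so it does not matter whether it counts
bytes, characters, or something else.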
For the record I don't have any problem with replacing Perl scripts with
Lisp code, as long as the replacement is an improvement, or at least not
any worse.
best
Robert Dodier