inconsistent definition of "sort" (and what about stability?)
Subject: inconsistent definition of "sort" (and what about stability?)
From: Oliver Kullmann
Date: Sat, 12 Nov 2011 17:51:28 +0000
Hello,
while trying various Lisp's (not finished yet) a problem showed up which
we had for a long time, but could ignore, due to just using
one Lisp, namely what is the definition of "sort"?:
First it says
Sorts a list <L> according to a predicate `P' of two arguments,
such that `<P> (<L>[k], <L>[k + 1])' is `true' for any two
successive elements.
This *excludes* the use of orderlessp, which is nevertheless the
key example, since orderlessp is a *strict* order, i.e. "<",
not "<=":
orderlessp(1,1);
false
According to the sentence above one needed to use "<=", not "<", or
otherwise sort would be undefined on a list of equal elements, say
sort([1,1]) would be undefined.
In the light of the discussion below, I guess what is meant here is
P(L[k+1],L[k]) is false.
Furthermore it says
If the predicate <P> is not a total order on the elements of <L>,
then `sort' might run to completion without error, but the result
is undefined. `sort' complains if the predicate evaluates to
something other than `true' or `false'.
However "total order" is itself unspecified here, since it is not
said (and in fact, as shown above, inconsistent), whether a strict
(irreflexive) order or an "normal" (reflexive) order is meant.
Since orderlessp should be the prime example, I guess a "strict
total order" is meant. This is also algorithmically sensible,
since only when x<y shall we swap x,y, not already when x<=y.
Still a "strict total order" can mean different things.
Best, one takes the C++ standard as example, where their axiomatisation
of < as a "strict weak ordering" is
- transitive
- irreflexive
- the relation x~y given by not(x<y) and not(y<x) is an
equivalence relation.
This is most general, and avoids the use of "=".
If "=" however shall be involved, then likely you want to replace
the third condition by the stronger condition
- exactly one of x<y, x=y, y<x.
Finally, there is the problem of *stability", i.e., whether the order
of equal (resp. equivalent) elements is kept (stable) or not.
One should have two functions, like in C++, where "sort" is not
stable (thus faster!), while one has also "stable_sort".
With the given Maxima-specification, which doesn't mention
stability, one is in the worst situation:
- the implementation can be stable (slower)
- but one can't rely on it!
Thus in effect one needs to write own functions "nsort" and
"nsort_stable".
I hope my requests are clear:
1. What is the intended specification of sort? As I speculate above,
based on a strict ordering? If so, then using the "strict weak ordering"
(more general, not using "="), or the trichotomy condition?
2. It would be good if it could be specified whether "sort" shall be
stable or not. And if it is not intended as stable, then it would
be good of sort_stable would be provided.
Thanks for your attention!
Oliver