inconsistent definition of "sort" (and what about stability?)
Subject: inconsistent definition of "sort" (and what about stability?)
From: Oliver Kullmann
Date: Sun, 20 Nov 2011 01:28:11 +0000
At
http://www.lispworks.com/documentation/HyperSpec/Body/f_sort_.htm
it says
Predicate should return true if and only if the first argument is strictly less than the second (in some appropriate sense).
(It's sloppy in the sense that "appropriate sense" it not defined.)
Also
http://www.gnu.org/software/emacs/manual/html_node/cl/Sorting-Sequences.html
says
. predicate should return true (non-nil) if and only if its first argument is less than (not equal to) its second argument.
You can also look into Knuth's "The Art of Computer Programming".
Most precisely it's in the C++ standard.
Basing sorting on strict weak ordering (see http://en.wikipedia.org/wiki/Strict_weak_order)
is also done at
http://en.wikipedia.org/wiki/Sorting
Oliver
On Sun, Nov 20, 2011 at 01:18:33AM +0000, Oliver Kullmann wrote:
> On Sat, Nov 19, 2011 at 07:46:26PM -0500, Stavros Macrakis wrote:
> > On Sat, Nov 19, 2011 at 18:32, Robert Dodier <robert.dodier at gmail.com> wrote:
> >
> > Comments in line:
> > ?
> >
> > ?-- Function: sort (<L>, <P>)
> >
> > ? ? Sorts a list <L> according to a predicate `P' of two arguments,
> > ? ? such that `<P>(<L>[k + 1], <L>[k])' is `false' for any two
> > ? ? successive elements.
> >
> >
> > I'm afraid this definition is incorrect. ?After all,?L= [1,1] is sorted
> > according to "<=", but P(L[2],L[1]) is true. ?And what is the purpose of saying
> > "any two successive elements" when we already have L[k+1] and L[k]?? Also, "L"
> > is being used in the same sentence to mean two very different things: the input
> > list and the output list. ?All in all, very confusing. ?Why not something
> > simpler like:
> >
>
> No, the definition is correct: "<=" can not be used by "sort", but yields
> undefined behaviour.
>
> How to construct a good example, to demonstrate this?
> The best I'm aware of at the moment is to show that different Lisp's
> yield different results:
>
> cmp(x,y):=is(first(x) <= first(y));
> sort([[1,1],[1,2]],cmp);
>
> Ecl yields
> [[1,1],[1,2]]
> Sbcl yields
> [[1,2],[1,1]]
>
> However with the correct definition
> cmp(x,y) := is(first(x) < first(y));
> both yield
> [[1,1],[1,2]]
>
> > ? ? Sorts a list <L> using an ordering function P of two arguments, which
> > defines a "less than or equals" relation. ?More precisely, if P(a,b) and P(b,a)
> > are both true, then the order of a and b remains the same as in the input
> > ("stable sort"). ?Otherwise, a comes before b if P(a,b).
> >
>
> If both P(a,b) and P(b,a) are true, then we have undefined behaviour.
> ?
> > (This also makes the discussion of stable sort below unnecessary.)
> >
> >
> > ?The predicate may be specified as the name
> > ? ? of a function or binary infix operator, or as a `lambda'
> > ? ? expression. ?If specified as the name of an operator, the name is
> > ? ? enclosed in "double quotes".
> >
> >
> > Adding an example might help: e.g. "<".?
> >
>
> Whatever you do, one can not have "<=" and "<" at the same time.
> ?
> >
> > ? ? It is assumed the predicate <P> is a strict total order on the
> > ? ? elements of <L>.
> >
> >
> > No, it is assumed that P is total, but not strict -- i.e. it is like <=, not <.
> > ?(Otherwise we wouldn't have to specify stability.)
> >
>
> Yes, it is strict. Sorting always considers strict orders. For strict orders
> equivalence x~y is defined as not(x<y) and not(y<x). Stability concerns
> equivalent elements.
>
> >
> > ?If not, `sort' might run to completion without
> > ? ? error, but the result is undefined. ?`sort' complains if the
> > ? ? predicate evaluates to something other than `true' or `false'.
> >
> > ?
> > Instead of "complains", how about "gives an error"? ?"Complains" could mean
> > "prints a warning".
> > ?
> >
> > ? ? `sort' is a stable sort: if two elements <x> and <y> are
> > ? ? equivalent in the sense that `<P>(<x>, <y>)' and `<P>(<y>, <x>)'
> > ? ? are both `false', then the relative order of <x> and <y> in <L> is
> > ? ? preserved by `sort'.
> >
> >
> > No, this is incorrect. ?If P(x,y) and P(y,x) are both false, then P is not a
> > total order (strict or otherwise), and all bets are off.
> >
>
> This is called a "strict weak order", and is fully sufficient for all
> sorting purposes.
>
> If your statement above would be true, then sort([1,1]) would be undefined ---
> given that orderlessp is used!
>
> orderlessp(1,1);
> false?
>
> >
> > ? ? `sort (<L>)' is equivalent to `sort (<L>, orderlessp)'. ?That is,
> > ? ? the default sorting order is ascending, as determined by
> > ? ? `orderlessp'. ? ?All Maxima atoms and expressions are comparable
> > ? ? under `orderlessp'.
> >
>
> Oliver
--
Dr. Oliver Kullmann
Department of Computer Science
College of Science, Swansea University
Faraday Building, Singleton Park
Swansea SA2 8PP, UK
http://cs.swan.ac.uk/~csoliver/