inconsistent definition of "sort" (and what about stability?)



Given Oliver's and Robert's comments, how about the following language:

 -- Function: sort (<L>, <P>)
>  -- Function: sort (<L>)
>
> Sorts a list <L> using a predicate function <P> of two arguments which
> defines a strict weak order.  Thus, "<" is a valid P, but "<=" is not.  P(a,b)
> must return either true or false for any choice of a and b in <L>. If
> <P>(a,b) is true, then a will appear before b in the result.  If neither
> <P>(a,b) nor <P>(b,a) is true, then a and b are equivalent, and will appear
> in the same order as in the input (that is, 'sort' is a stable sort).
>

> If both P(a,b) and P(b,a) are true, then P is not a valid sort predicate,
> and the result is undefined, though 'sort' will generally not signal an
> error.
>
> The predicate function may be specified as the name of a function (e.g.
> 'orderlessp), a quoted binary infix operator (e.g. "<"), or as a `lambda' expression
> (e.g. lambda([a,b],a[1]<b[1]) ).
>
> The sorted list is returned as a new object; the argument <L> is not
> modified.
>

sort(<L>) is equivalent to sort(<L>,'orderlessp)

Common predicates used with sort include:

* orderlessp -- canonical ordering applicable to all Maxima objects
* ordergreaterp -- the negation of orderlessp
* ordermagnitudep -- orders by numerical magnitude when known; objects of
unknown magnitude come at the end, in orderlessp order.  Does not consult
the 'assume' database.
* "<" -- orders by numerical magnitude; does consult the 'assume' database.
 Does not allow objects of unknown magnitude.

>
> ----------------------------------
>
> On Sat, Nov 19, 2011 at 20:28, Oliver Kullmann <O.Kullmann at swansea.ac.uk>wrote:
>
>> At
>>
>> http://www.lispworks.com/documentation/HyperSpec/Body/f_sort_.htm
>>
>> it says
>>
>> Predicate should return true if and only if the first argument is
>> strictly less than the second (in some appropriate sense).
>>
>> (It's sloppy in the sense that "appropriate sense" it not defined.)
>>
>> Also
>>
>>
>> http://www.gnu.org/software/emacs/manual/html_node/cl/Sorting-Sequences.html
>>
>> says
>>
>> . predicate should return true (non-nil) if and only if its first
>> argument is less than (not equal to) its second argument.
>>
>> You can also look into Knuth's "The Art of Computer Programming".
>> Most precisely it's in the C++ standard.
>>
>> Basing sorting on strict weak ordering (see
>> http://en.wikipedia.org/wiki/Strict_weak_order)
>> is also done at
>> http://en.wikipedia.org/wiki/Sorting
>>
>> Oliver
>>
>> On Sun, Nov 20, 2011 at 01:18:33AM +0000, Oliver Kullmann wrote:
>> > On Sat, Nov 19, 2011 at 07:46:26PM -0500, Stavros Macrakis wrote:
>> > > On Sat, Nov 19, 2011 at 18:32, Robert Dodier <robert.dodier at gmail.com>
>> wrote:
>> > >
>> > > Comments in line:
>> > >
>> > >
>> > >      -- Function: sort (<L>, <P>)
>> > >
>> > >         Sorts a list <L> according to a predicate `P' of two
>> arguments,
>> > >         such that `<P>(<L>[k + 1], <L>[k])' is `false' for any two
>> > >         successive elements.
>> > >
>> > >
>> > > I'm afraid this definition is incorrect.  After all, L= [1,1] is
>> sorted
>> > > according to "<=", but P(L[2],L[1]) is true.  And what is the purpose
>> of saying
>> > > "any two successive elements" when we already have L[k+1] and L[k]?
>> Also, "L"
>> > > is being used in the same sentence to mean two very different things:
>> the input
>> > > list and the output list.  All in all, very confusing.  Why not
>> something
>> > > simpler like:
>> > >
>> >
>> > No, the definition is correct: "<=" can not be used by "sort", but
>> yields
>> > undefined behaviour.
>> >
>> > How to construct a good example, to demonstrate this?
>> > The best I'm aware of at the moment is to show that different Lisp's
>> > yield different results:
>> >
>> > cmp(x,y):=is(first(x) <= first(y));
>> > sort([[1,1],[1,2]],cmp);
>> >
>> > Ecl yields
>> > [[1,1],[1,2]]
>> > Sbcl yields
>> > [[1,2],[1,1]]
>> >
>> > However with the correct definition
>> > cmp(x,y) := is(first(x) < first(y));
>> > both yield
>> > [[1,1],[1,2]]
>> >
>> > >     Sorts a list <L> using an ordering function P of two arguments,
>> which
>> > > defines a "less than or equals" relation.  More precisely, if P(a,b)
>> and P(b,a)
>> > > are both true, then the order of a and b remains the same as in the
>> input
>> > > ("stable sort").  Otherwise, a comes before b if P(a,b).
>> > >
>> >
>> > If both P(a,b) and P(b,a) are true, then we have undefined behaviour.
>> >
>> > > (This also makes the discussion of stable sort below unnecessary.)
>> > >
>> > >
>> > >      The predicate may be specified as the name
>> > >         of a function or binary infix operator, or as a `lambda'
>> > >         expression.  If specified as the name of an operator, the
>> name is
>> > >         enclosed in "double quotes".
>> > >
>> > >
>> > > Adding an example might help: e.g. "<".
>> > >
>> >
>> > Whatever you do, one can not have "<=" and "<" at the same time.
>> >
>> > >
>> > >         It is assumed the predicate <P> is a strict total order on the
>> > >         elements of <L>.
>> > >
>> > >
>> > > No, it is assumed that P is total, but not strict -- i.e. it is like
>> <=, not <.
>> > >  (Otherwise we wouldn't have to specify stability.)
>> > >
>> >
>> > Yes, it is strict. Sorting always considers strict orders. For strict
>> orders
>> > equivalence x~y is defined as not(x<y) and not(y<x). Stability concerns
>> > equivalent elements.
>> >
>> > >
>> > >      If not, `sort' might run to completion without
>> > >         error, but the result is undefined.  `sort' complains if the
>> > >         predicate evaluates to something other than `true' or `false'.
>> > >
>> > >
>> > > Instead of "complains", how about "gives an error"?  "Complains"
>> could mean
>> > > "prints a warning".
>> > >
>> > >
>> > >         `sort' is a stable sort: if two elements <x> and <y> are
>> > >         equivalent in the sense that `<P>(<x>, <y>)' and `<P>(<y>,
>> <x>)'
>> > >         are both `false', then the relative order of <x> and <y> in
>> <L> is
>> > >         preserved by `sort'.
>> > >
>> > >
>> > > No, this is incorrect.  If P(x,y) and P(y,x) are both false, then P
>> is not a
>> > > total order (strict or otherwise), and all bets are off.
>> > >
>> >
>> > This is called a "strict weak order", and is fully sufficient for all
>> > sorting purposes.
>> >
>> > If your statement above would be true, then sort([1,1]) would be
>> undefined ---
>> > given that orderlessp is used!
>> >
>> > orderlessp(1,1);
>> >   false
>> >
>> > >
>> > >         `sort (<L>)' is equivalent to `sort (<L>, orderlessp)'.  That
>> is,
>> > >         the default sorting order is ascending, as determined by
>> > >         `orderlessp'.    All Maxima atoms and expressions are
>> comparable
>> > >         under `orderlessp'.
>> > >
>> >
>> > Oliver
>>
>> --
>> Dr. Oliver Kullmann
>> Department of Computer Science
>> College of Science, Swansea University
>> Faraday Building, Singleton Park
>> Swansea SA2 8PP, UK
>> http://cs.swan.ac.uk/~csoliver/
>>
>
>