inconsistent definition of "sort" (and what about stability?)
Subject: inconsistent definition of "sort" (and what about stability?)
From: Stavros Macrakis
Date: Sun, 20 Nov 2011 17:20:16 -0500
Given Oliver's and Robert's comments, how about the following language:
-- Function: sort (<L>, <P>)
> -- Function: sort (<L>)
>
> Sorts a list <L> using a predicate function <P> of two arguments which
> defines a strict weak order. Thus, "<" is a valid P, but "<=" is not. P(a,b)
> must return either true or false for any choice of a and b in <L>. If
> <P>(a,b) is true, then a will appear before b in the result. If neither
> <P>(a,b) nor <P>(b,a) is true, then a and b are equivalent, and will appear
> in the same order as in the input (that is, 'sort' is a stable sort).
>
> If both P(a,b) and P(b,a) are true, then P is not a valid sort predicate,
> and the result is undefined, though 'sort' will generally not signal an
> error.
>
> The predicate function may be specified as the name of a function (e.g.
> 'orderlessp), a quoted binary infix operator (e.g. "<"), or as a `lambda' expression
> (e.g. lambda([a,b],a[1]<b[1]) ).
>
> The sorted list is returned as a new object; the argument <L> is not
> modified.
>
sort(<L>) is equivalent to sort(<L>,'orderlessp)
Common predicates used with sort include:
* orderlessp -- canonical ordering applicable to all Maxima objects
* ordergreaterp -- the negation of orderlessp
* ordermagnitudep -- orders by numerical magnitude when known; objects of
unknown magnitude come at the end, in orderlessp order. Does not consult
the 'assume' database.
* "<" -- orders by numerical magnitude; does consult the 'assume' database.
Does not allow objects of unknown magnitude.
>
> ----------------------------------
>
> On Sat, Nov 19, 2011 at 20:28, Oliver Kullmann <O.Kullmann at swansea.ac.uk>wrote:
>
>> At
>>
>> http://www.lispworks.com/documentation/HyperSpec/Body/f_sort_.htm
>>
>> it says
>>
>> Predicate should return true if and only if the first argument is
>> strictly less than the second (in some appropriate sense).
>>
>> (It's sloppy in the sense that "appropriate sense" it not defined.)
>>
>> Also
>>
>>
>> http://www.gnu.org/software/emacs/manual/html_node/cl/Sorting-Sequences.html
>>
>> says
>>
>> . predicate should return true (non-nil) if and only if its first
>> argument is less than (not equal to) its second argument.
>>
>> You can also look into Knuth's "The Art of Computer Programming".
>> Most precisely it's in the C++ standard.
>>
>> Basing sorting on strict weak ordering (see
>> http://en.wikipedia.org/wiki/Strict_weak_order)
>> is also done at
>> http://en.wikipedia.org/wiki/Sorting
>>
>> Oliver
>>
>> On Sun, Nov 20, 2011 at 01:18:33AM +0000, Oliver Kullmann wrote:
>> > On Sat, Nov 19, 2011 at 07:46:26PM -0500, Stavros Macrakis wrote:
>> > > On Sat, Nov 19, 2011 at 18:32, Robert Dodier <robert.dodier at gmail.com>
>> wrote:
>> > >
>> > > Comments in line:
>> > >
>> > >
>> > > -- Function: sort (<L>, <P>)
>> > >
>> > > Sorts a list <L> according to a predicate `P' of two
>> arguments,
>> > > such that `<P>(<L>[k + 1], <L>[k])' is `false' for any two
>> > > successive elements.
>> > >
>> > >
>> > > I'm afraid this definition is incorrect. After all, L= [1,1] is
>> sorted
>> > > according to "<=", but P(L[2],L[1]) is true. And what is the purpose
>> of saying
>> > > "any two successive elements" when we already have L[k+1] and L[k]?
>> Also, "L"
>> > > is being used in the same sentence to mean two very different things:
>> the input
>> > > list and the output list. All in all, very confusing. Why not
>> something
>> > > simpler like:
>> > >
>> >
>> > No, the definition is correct: "<=" can not be used by "sort", but
>> yields
>> > undefined behaviour.
>> >
>> > How to construct a good example, to demonstrate this?
>> > The best I'm aware of at the moment is to show that different Lisp's
>> > yield different results:
>> >
>> > cmp(x,y):=is(first(x) <= first(y));
>> > sort([[1,1],[1,2]],cmp);
>> >
>> > Ecl yields
>> > [[1,1],[1,2]]
>> > Sbcl yields
>> > [[1,2],[1,1]]
>> >
>> > However with the correct definition
>> > cmp(x,y) := is(first(x) < first(y));
>> > both yield
>> > [[1,1],[1,2]]
>> >
>> > > Sorts a list <L> using an ordering function P of two arguments,
>> which
>> > > defines a "less than or equals" relation. More precisely, if P(a,b)
>> and P(b,a)
>> > > are both true, then the order of a and b remains the same as in the
>> input
>> > > ("stable sort"). Otherwise, a comes before b if P(a,b).
>> > >
>> >
>> > If both P(a,b) and P(b,a) are true, then we have undefined behaviour.
>> >
>> > > (This also makes the discussion of stable sort below unnecessary.)
>> > >
>> > >
>> > > The predicate may be specified as the name
>> > > of a function or binary infix operator, or as a `lambda'
>> > > expression. If specified as the name of an operator, the
>> name is
>> > > enclosed in "double quotes".
>> > >
>> > >
>> > > Adding an example might help: e.g. "<".
>> > >
>> >
>> > Whatever you do, one can not have "<=" and "<" at the same time.
>> >
>> > >
>> > > It is assumed the predicate <P> is a strict total order on the
>> > > elements of <L>.
>> > >
>> > >
>> > > No, it is assumed that P is total, but not strict -- i.e. it is like
>> <=, not <.
>> > > (Otherwise we wouldn't have to specify stability.)
>> > >
>> >
>> > Yes, it is strict. Sorting always considers strict orders. For strict
>> orders
>> > equivalence x~y is defined as not(x<y) and not(y<x). Stability concerns
>> > equivalent elements.
>> >
>> > >
>> > > If not, `sort' might run to completion without
>> > > error, but the result is undefined. `sort' complains if the
>> > > predicate evaluates to something other than `true' or `false'.
>> > >
>> > >
>> > > Instead of "complains", how about "gives an error"? "Complains"
>> could mean
>> > > "prints a warning".
>> > >
>> > >
>> > > `sort' is a stable sort: if two elements <x> and <y> are
>> > > equivalent in the sense that `<P>(<x>, <y>)' and `<P>(<y>,
>> <x>)'
>> > > are both `false', then the relative order of <x> and <y> in
>> <L> is
>> > > preserved by `sort'.
>> > >
>> > >
>> > > No, this is incorrect. If P(x,y) and P(y,x) are both false, then P
>> is not a
>> > > total order (strict or otherwise), and all bets are off.
>> > >
>> >
>> > This is called a "strict weak order", and is fully sufficient for all
>> > sorting purposes.
>> >
>> > If your statement above would be true, then sort([1,1]) would be
>> undefined ---
>> > given that orderlessp is used!
>> >
>> > orderlessp(1,1);
>> > false
>> >
>> > >
>> > > `sort (<L>)' is equivalent to `sort (<L>, orderlessp)'. That
>> is,
>> > > the default sorting order is ascending, as determined by
>> > > `orderlessp'. All Maxima atoms and expressions are
>> comparable
>> > > under `orderlessp'.
>> > >
>> >
>> > Oliver
>>
>> --
>> Dr. Oliver Kullmann
>> Department of Computer Science
>> College of Science, Swansea University
>> Faraday Building, Singleton Park
>> Swansea SA2 8PP, UK
>> http://cs.swan.ac.uk/~csoliver/
>>
>
>