Announcing statistical inference package 'stats'

From: Mario Rodriguez
Date: Tue, 21 Nov 2006 13:36:31 +0100

Hello Robert,

> (1) I don't think it's appropriate to modify global variables by
>     loading the stats package. This can lead to bad surprises,
>     e.g. interactions with other packages, or unexpected results.
> 
> (2) About numer in particular, numer : true defeats one of Maxima's
>     major features. We really shouldn't discourage people from
>     exploiting Maxima's capability to do exact integer and rational
>     arithmetic.
> 
>     If some functions in the stats package need to convert non-floats
>     to floats, then (LET (($NUMER T)) ..) or block([numer : true], ...)
>     is the way to go.
> 

I understand objection (1). But I think that a typical user of this
package will be mostly interested in floating point results; if these
are given in rational form, he's obliged to write '%,numer' most of the
time. Also, nobody needs a p-value with sixteen digits, which is why I
restrict fpprintprec to 7. Moreover, with the global variables 'numer'
and 'fpprintprec' set to their default values, the displayed
inference_result object is very ugly.

I propose a third alternative. Let's define two new global variables,
'stats_numer' (default true) and 'stats_fpprint' (default 7), and leave
'numer' and 'fpprintprec' unchanged globally.
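
A rough sketch of how this could look, only to illustrate the idea.
'stats_show' is a hypothetical helper, not a function of the package,
and it converts with float() instead of binding 'numer', which is a
simplification on my part. The point is that fpprintprec is bound only
inside the block, so the user's global setting is not touched.

```maxima
/* Package-level settings with the defaults proposed above. */
stats_numer : true $
stats_fpprint : 7 $

/* stats_show is a hypothetical helper, used here for illustration only.
   It prints expr according to the package settings; fpprintprec is
   bound locally, so the global value is restored on exit. */
stats_show (expr) :=
  block ([fpprintprec : stats_fpprint],
    print (if stats_numer then float (expr) else expr),
    'done);

stats_show (1/3);
fpprintprec;   /* still whatever the user had set globally */
```

A user who prefers exact results would set stats_numer : false and get
rational output without any other global variable being changed.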

> (3) I think it is a good idea to present the results in an
>     inference_result object. I like the way the results are presented
>     in a nice format by a display function.

I like it too, but the original idea of porting this from R to Maxima is
not mine ;)

>     A possibility here is to use the existing (though not quite
>     finished) defstruct code to construct the inference_result objects.
>     Then the methods for accessing fields within a structure don't
>     need to be duplicated.

This is not related to the stats package, but months ago I was studying
how to use 'defstruct' in the distrib package, to make it similar to the
Mathematica style of defining distributions; for example, the idea was
to write something like

cdf(1/2, normal_distribution(0,1));

instead of

cdf_normal(1/2,0,1);

but I wasn't sure about the benefits of this syntax, and gave up.
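
For illustration only, the calling style above can be imitated without
defstruct by leaving normal_distribution(0,1) as an unevaluated term and
dispatching on its operator. The 'cdf' wrapper below is hypothetical and
is not what I had implemented; cdf_normal is the existing function from
the distrib package.

```maxima
load (distrib) $

/* Hypothetical generic cdf: d is an unevaluated term such as
   normal_distribution (m, s); dispatch on its operator. */
cdf (x, d) :=
  if op (d) = 'normal_distribution
    then cdf_normal (x, first (d), second (d))
    else error ("cdf: unknown distribution", d);

/* Both calls should give the same result. */
cdf (1/2, normal_distribution (0, 1));
cdf_normal (1/2, 0, 1);
```

Other distributions would need one more branch each, which is where a
real structure mechanism such as defstruct could be more convenient.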


> (4) I think the written documentation is very good; every share package
>     should have such nice documentation. I'll make some minor revisions
>     to the texinfo file.

Please do.

> (5) I recommend renaming shapiro_wilk_test --> test_normality and
>     making shapiro_wilk an option (since there are other normality
>     tests)
> 
> (6) I recommend renaming dif_means_test --> means_difference_test
>     or means_diff_test
> 
> (7) I recommend renaming simple_linear_reg --> linear_regression
>     or simple_linear_regression
> 
> (8) (MAYBE) Test functions could be renamed in big-endian style,
>     to give these similar functions names which are more similar.
>     It's a minor point.
> 
>     mean_test --> test_mean
>     means_difference_test --> test_means_difference
>     variance_test --> test_variance
>     variance_ratio_test --> test_variance_ratio
>     sign_test --> test_sign
>     signed_rank_test --> test_signed_rank
>     normality_test --> test_normality


Ok, I'll put changes (5), (6), (7), and (8) in my todo list.

Thanks for your comments. I'm interested in reading these and other
opinions before writing more tests.

Mario

-- 
Mario Rodriguez Riotorto
www.biomates.net
