Table of Contents

Name

cleave - an analysis of variance (anova) program (ERP statistics)

Synopsis

cleave [ factorname1 factorname2 ... ] < input > output

Description


HOW TO USE CLEAVE
by Tim Herron             January 30, 2005 Version
-----------------------------------------------------
TABLE OF CONTENTS
INTRODUCTION
0. THE BASICS OF USING CLEAVE
        CALLING CLEAVE
        ANOVA DESIGNS
        SPECIFYING INPUT FILES
1. OPTIONAL FEATURES AND HOW TO INVOKE THEM
2. SAMPLE CLEAVE OUTPUT
3. NOTES ON USING CLEAVE’S FEATURES
        USING VARIOUS ANOVA DESIGNS AND PRODUCING F VALUES
        USING BOX-GEISSER-GREENHOUSE CORRECTION VALUES
        USING SOURCE TREATMENT MAGNITUDES
        USING POWER/SUBJECT TABLES
        USING POST-HOC AND PAIRWISE TESTS
        DETECTION OF OUTLIERS
        DETECTING AND CORRECTING ANOVA DATA SET DESIGNS
4. PROGRAM LIMITATIONS
5. REFERENCES
6. ERROR AND INFORMATIONAL MESSAGES
7. CONTACT INFORMATION
-----------------------------------------------------
INTRODUCTION
CLEAVE is a UNIX-style program which performs Analysis of
Variance (ANOVA) computations on text files containing
experimental data.  The program should work for balanced
and proportional designs with crossed data sets of
up to 15 fixed and/or random factors, each having
arbitrarily many levels.  Any factor can be of the
"between" or "within" (repeated measures) variety.
CLEAVE is intended to be a very fast ANOVA program
which can handle very large data sets.  As such,
the program is rather streamlined and does not spend
effort on a user-friendly interface.
Nonetheless, we aim for elegant design and flexibility.
Creation of the CLEAVE program was inspired by the desire
to remove two particular ANOVA assumptions which often do
not hold in experiments: (a) equality of factor variances,
and (b) sphericity of factor covariances.  Much effort has
been taken to adjust for situations when these two
assumptions do not hold.
A secondary goal for the program is to provide the user
with important complementary information which augments
the basic ANOVA F test, which by itself does not give the
user much information about an experiment’s factors’
effects.  Information such as magnitude of effect, power
calculations, and pairwise significance tests help the
user understand the importance of each factor or factor
combination in determining the experimental output.
Reporting this kind of auxiliary information is quickly
becoming a requirement in many fields in order to ensure
the publication of an experiment’s results.
CLEAVE (or "cleave.exe" for Dos/Windows) uses a configuration
file "cleave.cnf" to decide which of several optional
computations to perform on the experimental data.  In addition
to computing the usual sums of squares, F and probability values
on the data, CLEAVE can perform the following useful algorithms:
(1) CLEAVE computes corrections for factor (co)variance anomalies.
(2) CLEAVE can compute treatment magnitude effects.
(3) Some post-hoc test statistics can be computed.
(4) CLEAVE computes post-hoc power values.
(5) CLEAVE can handle some designs with random factors.
Finally, we include the source code for the CLEAVE program
so that the user can modify the program to include more
features (or just to squash bugs).
This document has 8 parts (not including this intro):
0) The basics on how to use CLEAVE.
1) A very brief discussion of the optional features and
   how to invoke them using the configuration file.
2) A walk through of an example output which highlights the
   optional features’ output.
3) Notes on using CLEAVE features and their use in
   analyzing statistical data.
4) Computational and statistical limitations of CLEAVE.
5) References which could help you get the most from CLEAVE.
6) Error and Informational Messages in CLEAVE
7) Contact information so that you can complain to me.
Probably the fastest way of reading this document is to read
sections 0) and then 2) to understand how to run the program
and read its output.  Then one can skim section 1) and read
section 3) to learn how and why to invoke CLEAVE’s special
features.  Finally, 4)-6) are available as reference.
0. THE BASICS OF USING CLEAVE
CALLING CLEAVE
CLEAVE takes as input text files containing experimental
data in column format and produces as output a text file
containing the results of the ANOVA analysis.  Ordinary
text editors can be used to produce the input and likewise
to edit the output.  A typical invocation of CLEAVE looks
as follows on your command line:
/home/tjherron>cleave <anova_data >anova_output
where "anova_data" is a space- or tab-delimited file
containing the experimental data (one outcome per row),
and "anova_output" is text file output.  Note that CLEAVE
uses the standard input and output sources, and so if
one wants to use files on a hard disk (or some other
media source), the user needs to specify that using the
command-line redirection characters "<" and ">".  Of
course one can also use pipes ("|") to specify input and
output files.  E.g. if one is happy to allow the output
to whiz by them on the computer screen, one can type:
/home/tjherron>cleave <anova_data
but it would be more pleasant to type
/home/tjherron>cleave <anova_data | more
in order to view the output file one screen at a time.
ANOVA DESIGNS
The following kinds of ANOVA designs can be processed
by CLEAVE’s algorithms:
1) Fixed, Random and Mixed designs
2) Highly Multi-Way ANOVA designs
3) Between, Within (Repeated Measures), and Mixed designs
4) Balanced and Proportional Designs
Any combination of these designs can be fully handled with
a few exceptions.  The following two types of designs
cannot have their F statistics computed by CLEAVE.
First, proportional designs cannot have random
factors in them (i.e. Random or Mixed designs) not
counting the subject factor (= the first column, which is
always random).  Second, unbalanced designs of any type
will not have F statistics computed by CLEAVE, except for
certain kinds of very slightly unbalanced designs.
For these two types of experimental design, the only
output that CLEAVE produces are the pairwise comparisons
that are scheduled by "cleave.cnf" because those
computations are independent of the F statistic that
CLEAVE is designed to compute.
CLEAVE can correctly detect whether a factor is a within
factor or between factor even when there are certain 
types of errors in the data set which cause a data set
to be imbalanced.  Further, if the data set is nearly
balanced or proportional (perhaps due to a typo in the
data set or a script which generates a data set file),
CLEAVE can attempt to correct the flaw as well as
inform the user which data is missing or is in excess.
Finally, and unfortunately, CLEAVE cannot currently
handle nested designs of any sort, nor can it properly
detect that a design is nested.  The pairwise
comparison results that it prints out, however, may
well be accurate, but whether they are always good
is not known at present (CLEAVE attempts to guard
against those cases in which the pairwise comparison
results may be misleading - but it is not foolproof).
SPECIFYING INPUT FILES
Specifying the input file is straightforward.  The user
needs to produce a text file which contains the same
number of columns on each line, in the following format.
The final column must contain the experimental outcomes,
the first column must contain the identifiers for the
experimental subjects, and each of the middle columns
contain the levels for one unique factor being tracked
in the experiment.  Each line of the input file contains
information about one experimental outcome, e.g.
14      Left    3       75.4
records that the 14th experimental subject had an outcome
of 75.4, where "Left" was the level of the first factor
and "3" was the level of the 2nd factor.  Columns can
be separated using spaces or tabs in any combination,
and the separators do not have to be consistent from
line to line (or even within a line, for that matter).  If
the user wishes to include whitespace in a level’s
name, then quotes (double or single) can be used to
include it into the level’s name, e.g.:
Jim      "First Group"    Sleep       -4.2
Note that in the above one can include a single quote
inside a pair of delimiting double quotes (and vice
versa).
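For concreteness, here is a short hypothetical excerpt from
such an input file for a design with one "between" factor
(Group) and one "within" factor (Task); the names and numbers
are made up purely for illustration:
1     Treatment   "Task A"    12.1
1     Treatment   "Task B"     9.7
2     Control     "Task A"    11.4
2     Control     "Task B"    10.2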
We have included the following example text files for
illustration of what a text input file can look like:
proportional  - a 2x4x3 proportional ANOVA design with
                one varying factor
bogartz6      - a 3x3x3 ANOVA design with one repeated
                measures factor (and 2 "between" factors).
repeats       - a 4x5 repeated measures ANOVA design
                which relies on CLEAVE to create a
                REPEATS factor
CLEAVE reads in the input files and figures out the
following properties of the experimental design:
a) how many factors there are
b) the number and names of each factor’s levels
c) which factors to ignore (any factor with only 1 level)
d) whether a factor is a repeated measures factor
e) whether a factor is proportionally varying or is ’flat’
f) whether the ANOVA design is balanced or unbalanced
g) whether to introduce a new factor for duplicate
   data lines (a "REPEATS" factor)
The user needs to specify only two basic things about the
experiment.  First, the user can specify the factor names
on the command line before the input and output files
are specified, as e.g.
/home/tjherron>cleave Side Violations <anova_data >anova_output
specifies that the first factor is called "Side" and the
second factor is named "Violations".  If the user does not
specify factor names, CLEAVE assumes that the first factor
is called "A", the second, "B", etc.
Second, the user needs to specify which factors are random
factors and which are fixed factors.  The default is to assume
that all non-subject factors are fixed.  One edits the (text)
file "cleave.cnf" and puts the appropriate multi-bit index
in the following line(s) appearing in "cleave.cnf":
1                 Indicates which factors are random variables
                  = Sum_{c = random factor column number} 2^{c}
                  This should always be odd since the subjects
                  column (column 1) is always a random factor
The default above, "1", assumes that all factors are fixed
except for the initial subject factor (which is always random).
Column c of the input file contributes 2^(c-1) to the index,
so the subject column (column 1) always contributes 1, which
is why the value must be odd.  If, e.g., the user knows that
the first and third factors (i.e. the second and fourth columns
in the input file) are the only random factors out of the 4
appearing in the input file, then s/he would place 11
(= 1*1 + 1*2 + 0*4 + 1*8 + 0*16) in place
of 1 above in "cleave.cnf".
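As a quick check on this arithmetic, here is a small Python
sketch (the helper name is ours, not part of CLEAVE) which
reproduces the index above:
def random_factor_mask(random_columns):
    # random_columns: 1-based input-file column numbers that
    # hold random factors; column 1 (subjects) is always random.
    mask = 1                     # subjects column -> bit 0
    for c in random_columns:
        mask |= 1 << (c - 1)     # column c -> bit (c - 1)
    return mask

print(random_factor_mask([2, 4]))    # prints 11, as above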
The other parts of the file "cleave.cnf" control the options
that can be turned on in CLEAVE, and these will be described
in the following sections.
1. OPTIONAL FEATURES AND HOW TO INVOKE THEM
In addition to having the cleave program in your working path,
you will want to have the "cleave.cnf" file accessible to the
program in your current directory.  Inside the configuration
file, which you can edit with any text editor, there are
many program options.  The following sections inside the
configuration file are general program options useful for
even the most basic ANOVA:
---
General Parameter Section
10                Maximum number of ANOVA Factors
                  [Can use a maximum of 15 unless you change
                   variable MAXFACTOR in cleave.h and recompile]
1024              Maximum number of Levels for each Factor
Section concerning Significance Levels
0.10              Lowest Significance "p" Level  "Sig[0]"
0.05              Middle Significance "p" Level  "Sig[1]"
0.01              Highest Significance "p" Level "Sig[2]"
---
The first section is useful in managing the amount of memory
that CLEAVE uses (the smaller both parameters are the better).
The second section is useful for telling CLEAVE when to alert
you to statistically significant results by indicating them
with "*"s in various places.
More specialized CLEAVE options are outlined below.
First, CLEAVE can make adjustments to deal with (co)variance
variations within factor levels in repeated measures designs.
To invoke these adjustments, go into the "cleave.cnf" file
and look for the lines:
---
Section concerning Lack of Uniform Data Variance/Covariance
1                 Between-Factors Box Correction Computed?
                   Yes = 1;  No = 0
1                 G-G Computed only When Needed?  Yes = 1;  No = 0
4                 Compute Geisser Greenhouse Epsilons for
                   Interactions of at Most This Order (0 = None)
---
These allow the program to compute Geisser-Greenhouse and related
epsilon factors - these factors attempt to take into account
distortion produced by correlation amongst factor levels.  They do
this by reducing the effective degrees of freedom used by the F
values that the CLEAVE program computes.  The first parameter above
allows the user to compute Geisser-Greenhouse-like epsilons for
non-repeated measures factors.  Two such epsilons are computed for
each factor, one to correct the F’s numerator degrees of freedom
and the other to correct the denominator df.
The parameter labelled "G-G Computed only When Needed" is a way
to restrict degree-of-freedom adjustments to the times when
the unadjusted "p" values are judged to be significant.  Recall
that in the "cleave.cnf" file the user can select 3 levels of
significance; the above parameter restricts Geisser-Greenhouse
adjustments to those cases in which the unadjusted "p" value is
less than the largest (least stringent) significance level
(= "Sig[0]").
The third parameter above allows the user to restrict adjustments
of degrees of freedom to those main and interaction terms of
a certain degree or less.  This is useful because it can take
a very long time to compute the Geisser-Greenhouse adjustments.
(See the Limitations section of this document for more information
on the speed of computing Geisser-Greenhouse epsilons).
The results of the computational adjustments appear in the
second half of the CLEAVE output where the "F values" are computed.
Second, CLEAVE enhances the ANOVA output by computing treatment
magnitude effects - specifically partial omega squared and
partial nu squared - within each of the source sections where
F values are computed.  In addition, if one of the above
Geisser-Greenhouse options has been selected and computed for
a given source, then the partial omega squared treatment
magnitude will be recomputed taking into account the Geisser-
Greenhouse correction for nonspherical covariances.  One can
choose to compute treatment magnitudes by changing the
"cleave.cnf" file at the section:
---
Section concerning Treatment Magnitudes and Power
1                 Treatment Magnitudes Computed   On = 1; Off = 0
1                 Treatment Magnitudes List       On = 1; Off = 0
2                 Compute Power/# of Subjects Table  On = 1; Off = 0
                               (to do only when F is significant = 2)
---
In addition, after the F values for each source factor (main
effects and all interactions) along with the new additions have
been printed, CLEAVE can generate a convenient listing of
each of the source factors by partial omega squared treatment
magnitude and significance level that has been exceeded.  And the
program indicates whether the Geisser-Greenhouse correction has
been used in computing both values.  See the Example section of
this document to see what a treatment magnitude list looks like.
Third, note that in the above section, one can choose to compute
a power/subjects table from the treatment magnitudes.  This table
provides a post-hoc analysis that can be used in experiment trial
runs to help determine how many subjects are likely to be needed
to run a successful experiment (one where statistical significance
is attained if there is real experimental effect).  The output
format of the table is easy to use (see the Example section),
and the table’s output is adjusted using the Geisser-Greenhouse
factor if the latter parameter is computed for the source factor
in question.
Fourth, there are a number of post-hoc tests and values that can
be computed.  The options are detailed in "cleave.cnf" as:
---
Section concerning Post-Hoc Tests
1                 Scheffe Post-Hoc Test Computed  On = 1; Off = 0
1                 Compute Pairwise Significance Comparisons for
                     Interaction Terms of at most this Order
2                 Pairwise Comparison Type: 0 base code w/ options:
                     Control is +1, All-Pairwise is +2, and to use
                     Plain Joint Factors is +4 (default value is 2)
0                 Use Bonferroni (=0) or Sidak (=1) Pairwise
                     Probabilities inside Pairwise Tests
0                 Perform Simultaneous (=0) or Sequential (=1)
                     Pairwise Significance Tests
1                 Indicate which Significance Level to use (Sig[x])
                  in determining pairwise familywise significance
1                 Indicates whether or not to use one pooled error
                  in computing all pairwise tests:  1 = Yes , 0 = No
---
One post-hoc test value that can be computed for each of the source
sections is the Scheffe Test, which controls experimentwise error
for any number of specific linear combination tests on the
source levels.  As such it provides a very high barrier to making
experimentwise errors.  As with treatment magnitudes, if one of
the Geisser-Greenhouse options has been selected by the user and
those epsilons are computed for a given source, the Scheffe value
is recomputed in light of the source factor’s covariance effect.
The other options tell CLEAVE to compute one or two of 16 different
pairwise "t" tests that can help identify a pair of
factor levels which are significantly different from one another.
First, one can tell CLEAVE to compute all pairwise differences
between a control level and all other levels, or to compute all
possible pairwise differences (or you can compute both).  Second,
one can choose to use Bonferroni or Sidak probability corrections
with which to control the familywise error.  Third, one
can use a sequential (= step-wise, "Holm") analysis to improve the
chance of finding significant pairwise level differences.
Fourth, one can choose to compute pairwise differences on the
ordinary joint factors' level means or using the ANOVA model's
interaction level means.  Finally, as with the Geisser-Greenhouse
correction, one can restrict CLEAVE's pairwise comparisons to
interaction source terms of at most a certain order.
The last two parameters that the user can set to control pairwise
comparison tests are to choose which of the three significance
values to use as the familywise significance level, and to choose
whether or not to use a pooled variance error in all of the pairwise
comparisons (rather than using the variances computed directly
from each level of the source term pairs).
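To make these choices concrete, here is a small Python sketch
(ours, not CLEAVE code) of the per-comparison significance
levels involved; with a familywise level of 0.05 and 3
comparisons, the Bonferroni value is the 0.0167 that appears
in the sample output of the next section:
def bonferroni_alpha(alpha, m):
    # Per-comparison level for m simultaneous tests.
    return alpha / m                         # 0.05/3 = 0.0167

def sidak_alpha(alpha, m):
    # Slightly less conservative than Bonferroni.
    return 1.0 - (1.0 - alpha) ** (1.0 / m)  # ~0.0170 for m = 3

def holm_rejections(pvalues, alpha):
    # Sequential ("Holm") testing: sort the p values and test
    # each against a level that grows as hypotheses are rejected.
    rejected = []
    for i, p in enumerate(sorted(pvalues)):
        if p > alpha / (len(pvalues) - i):
            break
        rejected.append(p)
    return rejected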
The next section of the configuration file deals with
random factors that might appear in the experimental design.
---
Section concerning Factor Types
1                 Indicates which factors are random variables
                  = Sum_{c = random factor column number} 2^{c}
                  This should always be odd since the subjects
                  column (column 1) is always a random factor
0                 Satterthwaite F Computation Type
                  Use approximations of effect and error = 0
                  Use approximations in denominator only = 1
0                 Compute Pairwise Post-Hoc Results for Sources
                  w/ Random Components.    0 = No, 1 = Yes
---
We already saw in the previous section how to use the first
subsection entry to specify which factors are random factors.
The second parameter which can be set tells CLEAVE exactly how
to compute the quasi-F values which sometimes must be computed
in order to compare a source term’s effect to the appropriate
error term.  The default "0" requests that CLEAVE use quasi-Fs
in both the numerator and denominator, and has the advantage
that the F value cannot be negative, while a "1" tells CLEAVE
to use quasi-F calculations in the denominator (error) term only.
The third parameter allows the user to compute pairwise post-
hoc tables for source terms with random components - which
ordinarily the user will not want since the levels of such a
source will not be all that interesting (they were just
randomly selected, after all).
---
Section concerning Repairing/Detecting Problematic Data Sets
0.01              Percentage (in %) of missing data that can be
                  interpolated to balance a design
                  (use 0.0 to turn this feature off)
0.01              Percentage (in %) of excess "entangled" data
                  that can be excised to balance a design.
                  (use 0.0 to turn this feature off)
24                Maximum Number of bins for the data histogram
                  for detecting outliers (minimum of 1)
0                 Simply average repeated data values (=1)
                  or create a new REPEATS factor      (=0)
---
This penultimate section of the "cleave.cnf" file helps deal with
data sets which are deficient in some way.  The first two
parameters help the user deal with slightly imbalanced data sets
- ones with a little too much or too little data.  For both,
the user selects a percentage value which tells CLEAVE to
interpolate or delete data points provided that the number of
missing or excess points, respectively, is less than the
specified percentage (compared to the amount of present data).
E.g. in the template configuration above, 0.01% is the threshold
selected for each, and so if CLEAVE detects that a data file has
fewer than 1 in 10,000 data points missing from being a balanced
data set, CLEAVE will attempt to interpolate the missing points
by using the cell means and standard deviations.
The third parameter above ("24") tells CLEAVE the maximum size
of the histogram which appears at the start of the output
file.  E.g. one might wish to increase this size to help 
detect the exact placement of an outlier data point.  Finally,
the fourth parameter listed above instructs CLEAVE on how to
treat duplicate data lines - those lines appearing in the
data set which are identical save for the outcome value in the
last column.  CLEAVE can either simply average all identical
duplicates' values together or create a new factor to divide
the duplicate data into separate cells.
Finally, the last section of parameters helps the user control the
output by selecting factor interaction orders:
---
Section concerning Restricting data output via interaction orders
6                 Order of Interaction Levels of Basic Statistics
                  to Output (0 = output them all [default])
0                 Order of Interaction (and above) to classify
                  as part of the error term (=0 to turn off)
4                 Order of Interaction at which to compute GG epsilons
                  using an approximation routine.
---
The first parameter can be used to shorten output considerably
when dealing with highly multi-way ANOVA designs - ones where
the joint factor means are of little interest.  The second
parameter is used to specify that the ANOVA model you desire
chops off fixed effects at a certain interaction order and
thereby pools those sums of squares into appropriate error
terms.  The final parameter is used to speed up the computation
of high order interaction GG epsilons by using an approximate
routine that is exact only for uniformly covarying levels.
In the next section we see how most of these options come
together to form the output of CLEAVE.
2. SAMPLE CLEAVE OUTPUT
The enhanced output of CLEAVE is demonstrated in the
following example from the output file using the command
"cleave Position <repeats >output" with the "cleave.cnf"
file configured as in the previous section:
*****************************************************************************
CLEAVE: Duplicate data lines...creating new REPEATS factor.
CLEAVE      (c) Timothy Herron       January 30, 2005 Version
Data Histogram
----------------------------------------------------
Bin Left Edge     # of Data Points    Bin Right Edge
----------------------------------------------------
            2     3                              3.5
          3.5     7                                5
            5     18                             6.5
          6.5     7                                8
            8     12                             9.5
          9.5     4                               11
           11     7                             12.5
         12.5     2                               14
----------------------------------------------------
Source: Grand Mean
Positio Repeats      N         Mean      Std Dev   Normed Ranges
                    60       7.2500       2.7778       **|**    
Subjects:
S1                  12       6.3333       2.8391       **|**    
S2                  12       7.1667       2.7579       **|**    
S3                  12       7.0833       2.8431        *|**    
S4                  12       7.5000       2.7136        *|**    
S5                  12       8.1667       2.8868        *|**    
Source: Position 
Positio Repeats      N         Mean      Std Dev   Normed Ranges
Center              20       7.3000       2.6378       **|**    
Left                20       6.1000       2.2688       **|**    
Right               20       8.3500       3.0310        *|**    
Source: Repeats 
Positio Repeats      N         Mean      Std Dev   Normed Ranges
        1           15       4.2667       1.0998       **|**    
        2           15       6.4000       1.2984       **|**    
        3           15       7.4000       1.3522       **|**    
        4           15       10.933       1.7099       **|**    
Source: Position Repeats 
Positio Repeats      N         Mean      Std Dev   Normed Ranges
Center  1            5       4.2000      0.83666        *|*     
Center  2            5       6.4000      0.54772        *|*     
Center  3            5       7.6000       1.1402        *|*     
Center  4            5       11.000      0.70711        *|*     
Left    1            5       3.6000       1.1402        *|*     
Left    2            5       5.2000      0.83666        *|*     
Left    3            5       6.4000      0.89443       **|*     
Left    4            5       9.2000      0.83666        *|*     
Right   1            5       5.0000       1.0000        *|*     
Right   2            5       7.6000       1.1402        *|*     
Right   3            5       8.2000       1.4832        *|*     
Right   4            5       12.600       1.3416        *|*     
    FACTOR     LEVELS       TYPE   VARIABLE  DIMENSION    BALANCE
----------------------------------------------------------------------------
  SUBJECTS          5                Random    Crossed            
  Position          3     Within      Fixed    Crossed    Uniform 
   Repeats          4     Within      Fixed    Crossed    Uniform 
    Values         60       Data 
----------------------------------------------------------------------------
                                SS      df       Eta^2 (R^2)
Total Sum Squared:          455.25      59
S/                      21.3333333       4            0.0469
SOURCE                          SS      df             MS        F    
  p
Position                      50.7       2          25.35    33.99  0.0001
***
PS/                     5.96666667       8      0.7458333
Partial Omega^2:    0.8919         Scheffe Test p=0.050:      8.92
Partial Eta^2:      0.8947         Eta^2 (R^2):       0.1114
Lower Bound Epsilon:              0.5000                            0.0043
***
Box-Geisser-Greenhouse Epsilon:   0.5376                            0.0033
***
Huynh-Feldt Epsilon:              0.5772                            0.0025
***
                                   Scheffe p=0.050 (GG):     14.42
Power(GG)=>  0.50     0.70     0.80     0.90     0.95     0.99   
Sig=0.100       2        2        2        3        3        3    subjects
Sig=0.050       3        3        3        3        3        4    subjects
Sig=0.010       4        4        4        5        5        5    subjects
Pairwise Comparisons; Familywise Error: 0.0500 ; Bonferroni Prob.: 0.0167
    0.0500  1                             1  
     -1.15  2  0.0044*                    2  
      1.10  3  0.0088*  0.0001*           3  
                   1        2        3    
               Cente     Left    Right    
SOURCE                          SS      df             MS        F    
  p
Repeats                 348.183333       3       116.0611   125.85  0.0000
***
RS/                     11.0666667      12      0.9222222
Partial Omega^2:    0.9690         Scheffe Test p=0.050:     10.47
Partial Eta^2:      0.9692         Eta^2 (R^2):       0.7648
Lower Bound Epsilon:              0.3333                            0.0004
***
Box-Geisser-Greenhouse Epsilon:   0.4687                            0.0000
***
Huynh-Feldt Epsilon:              0.6465                            0.0000
***
                                   Scheffe p=0.050 (GG):     17.24
Power(GG)=>  0.50     0.70     0.80     0.90     0.95     0.99   
Sig=0.100       2        2        2        2        2        2    subjects
Sig=0.050       2        2        2        2        2        2    subjects
Sig=0.010       2        3        3        3        3        3    subjects
Pairwise Comparisons; Familywise Error: 0.0500 ; Bonferroni Prob.: 0.0083
     -2.98  1                                      1  
    -0.850  2   6e-06*                             2  
     0.150  3   9e-08*  0.0036*                    3  
      3.68  4   2e-11*   1e-09*   2e-08*           4  
                   1        2        3        4    
                   1        2        3        4    
SOURCE                          SS      df             MS        F    
  p
PR                      5.96666667       6      0.9944444     1.98  0.1080
   
PRS/                    12.0333333      24      0.5013889
Partial Omega^2:    0.1973         Scheffe Test p=0.050:     15.05
Partial Eta^2:      0.3315         Eta^2 (R^2):       0.0131
--------------------------------------------------------------------------
         TREATMENT EFFECTS IN ORDER OF SIGNIFICANCE AND THEN SIZE     
   
                  Partial  Significance     Error      Error        Eta
  
Source            Omega^2     Levels        Level      Types      Squared

--------------------------------------------------------------------------
Repeats            0.9690     0.0100*      0.9222    Subjects     0.76482
Position           0.8919     0.0100*      0.7458    Subjects     0.11137
PR                 0.1973     1.0000       0.5014    Subjects     0.01311
         Cumulative R^2 (Eta^2) Due to All Source Terms: 0.8893
   * = Significance Levels Modified by Box-Geisser-Greenhouse Epsilons
*****************************************************************************
We will analyze the above output for the factor
labelled "Position" to see CLEAVE’s features.  Each
output file divides into three main parts: basic
information, then statistical information, and then
a summary list at the end.
CLEAVE: Duplicate data lines...creating new REPEATS factor.
Note that CLEAVE alerts us to the fact that it has
detected and generated a new factor to take care of
data rows with duplicate factor levels (including
the subject levels).
CLEAVE then prints out basic information about the
data, starting with a histogram of all of the input
data:
Data Histogram
----------------------------------------------------
Bin Left Edge     # of Data Points    Bin Right Edge
----------------------------------------------------
            2     3                              3.5
          3.5     7                                5
            5     18                             6.5
          6.5     7                                8
            8     12                             9.5
          9.5     4                               11
           11     7                             12.5
         12.5     2                               14
----------------------------------------------------
The histogram’s purpose is primarily to help the
user detect outliers and also provides some
information on the distribution of the data error.
In this case we see that most of the data points
lie in the 4 bins lying in the range from 3.5 to
9.5, and that there is a rough symmetry to the
data distribution.
The second section of the output prints out some basic
information about each of the main and interaction
effects in the ANOVA design.  In the example above,
the main effect "Position" is seen to have 3 levels,
"Center", "Left", and Right" and each levels
outcomes’s mean and standard deviation is reported:
Source: Position 
Positio Repeats      N         Mean      Std Dev   Normed Ranges
Center              20       7.3000       2.6378       **|**    
Left                20       6.1000       2.2688       **|**    
Right               20       8.3500       3.0310        *|**    
The Normed Ranges column provides information on the
normalized (in std. dev. units) maximum and minimum
data values in the Center, Left, and Right positions,
respectively.  The Ranges are intended to help the
user detect the presence of outliers, which will
likely make the graph lopsided (e.g. **|*****, which
would indicate that the maximum data value is greater
than 5 standard deviations from the mean).  In the
above Normed Ranges, no data point in any of the
3 positions is greater than 2 standard deviations
above or below the mean ("|" indicates the mean).
After basic information is printed about both main and
interaction terms, the statistical info is printed.
The first part of CLEAVE’s statistical output is the
summary information about the factors in the design:
    FACTOR     LEVELS       TYPE   VARIABLE  DIMENSION    BALANCE
----------------------------------------------------------------------------
  SUBJECTS          5                Random    Crossed            
  Position          3     Within      Fixed    Crossed    Uniform 
   Repeats          4     Within      Fixed    Crossed    Uniform 
    Values         60       Data 
----------------------------------------------------------------------------
This informs the user whether each factor is a within or
a between factor and how many levels the program detected
within each factor.  CLEAVE also tested the data and
found that the levels are crossed.  Finally, CLEAVE
figures out whether a factor is uniformly balanced, is
nonuniformly proportional (i.e. the data is a so-called
proportional design experiment), or is totally unbalanced.
Next, CLEAVE prints out the total sums of squares
and degrees of freedom for this particular ANOVA design:
                                SS      df       Eta^2 (R^2)
Total Sum Squared:          455.25      59
S/                      21.3333333       4            0.0469
In a pure repeated measures design, as is the case here,
the total SS and df should equal the summed SS's and
df's of all of the factor combinations that follow,
respectively (and we see this is true:
455 = 21+51+6+348+11+6+12 and 59 = 4+2+8+3+12+6+24).
CLEAVE also computes the SS and df due to the subjects,
as well as printing the Eta^2 value of the subjects
(21.3333333/455.25 = 0.0469) for comparison to later Eta^2
values.
CLEAVE then produces various statistics for main
effects and then interactions in increasing order
of interaction size (second order, then third
order, etc.).  The omnibus F test of the main effect
of factor "POSITION" is recorded in the lines:
SOURCE                          SS      df             MS        F    
  p
Position                      50.7       2          25.35    33.99  0.0001
***
PS/                     5.96666667       8      0.7458333
These lines are recognizable from any ANOVA program - they
indicate the sum of squares, degrees of freedom and
mean squares of the main effect "POSITION" and its
proper error term "PS/" (the slash "/" is the usual
sign which indicates nesting of between factors
inside of the subject factor - in this case there are
no between factors nested).  We see that the above
F value is highly significant (by the "***" beside
the p value, which indicates that the p value is
less than "Sig[2]").
The next two lines:
Partial Omega^2:    0.8919         Scheffe Test p=0.050:      8.92
Partial Eta^2:      0.8947         Eta^2 (R^2):       0.1114
record some of the treatment effects and one post-hoc test
value associated with the particular source named above
("Position").
Partial Omega^2 and Eta^2 values can be used to estimate the
effect that the indicated source factor combination (or main
effect) has on the experiment output value.  Partial omega
squared is the ratio of the variance due to the source level
means to the sum of the source level mean variance plus the
variance due to the error term.  Partial eta squared is a
simpler value and is just the ratio of the sum of squares due
to the source factors to the sum of sums of squares of the
source and error factors.  Finally, R^2 is the ratio of source
sum squared (SS) to the total ANOVA sums squared.
See later on in the documentation for ways to use these
three treatment magnitude values.  In the above example it
appears that the factor "Position" produces treatment magnitude
effects which are quite strong: 1.0 is the maximum value for
both partial values, and the 0.1114 R^2 says that more than
11% of the total variation is due to the POSITION factor.
In addition, the Scheffe post-hoc values were computed for
source "POSITION" above.  This critical F value can be used
to conduct any number of linear post-hoc tests on the factor
levels to check which levels, if any, significantly affect
the outcome of the experiment.  The way to use Scheffe values
is to run CLEAVE again on an altered experimental data set,
a process which we explain later in this documentation.
The next three lines:
Lower Bound Epsilon:              0.5000                            0.0043
***
Box-Geisser-Greenhouse Epsilon:   0.5376                            0.0033
***
Huynh-Feldt Epsilon:              0.5772                            0.0025
***
give estimates of the various Box correction
factors for the immediately preceding F value.  Each of the
values can be used to reduce both of the degrees of freedom
of the sums of squares in order to more accurately estimate
the probability that the null hypothesis is true.
The Lower Bound estimate is the most conservative value,
while the Geisser-Greenhouse is the maximum likelihood
estimate of the Box correction.  Finally, the Huynh-Feldt
estimate attempts to correct the bias that the Geisser-
Greenhouse epsilon has when the "true" Box correction is
near 1 [H-F should only be used when G-G > 0.8, if used
at all].
For example, if we want to be very conservative, we can use
the Lower Bound estimate in the above example to reduce the
effect ("POSITION") and the error ("PS/") degrees of freedom
to be 0.5*2 = 1 and 0.5*8 = 4, respectively.  In this
case, the computed F value of 33.99 gives a probability of
maintaining the null hypothesis of 0.0043, which we compute
at the right of the Lower Bound Epsilon line for convenience.
(use this "p" value instead of the "0.0001" that you
originally obtained in the above example).
The next source-associated line reads:
                                   Scheffe p=0.050 (GG):     14.42
CLEAVE recomputes the Scheffe post-hoc values by factoring 
in the Geisser-Greenhouse correction.  Using this value is 
preferred over using the uncorrected value computed above
because this new value takes into account the distortion
to the source distribution introduced by covariance
effects (and unequal variance effects).
Following this we find that CLEAVE prints out a power table:
Power(GG)=>  0.50     0.70     0.80     0.90     0.95     0.99   
Sig=0.100       2        2        2        3        3        3    subjects
Sig=0.050       3        3        3        3        3        4    subjects
Sig=0.010       4        4        4        5        5        5    subjects
The interpretation of this power table is simple: the top
row lists the desired power that the user wishes to achieve
in a future experimental run: say 0.80 or 80%.  And the
first column lists the three possible levels of significance
that the user might use in that future run: say 0.05.  Then,
CLEAVE predicts that the user will need to run 3 subjects
in the experiment to have an 80% chance of attaining a 0.05
significance level assuming that Factor "POSITION"’s effect
in future runs of the experiment is approximately as it was
in this experimental run (where we used 5 subjects - so we
perhaps wasted some time by testing too many subjects).
Note that the "(GG)" in the above table indicates that
the table has been corrected for source covariance
anamolies by using the Box-Greenhouse-Geisser epsilon.
CLEAVE makes this correction by using the corrected
(GG) degrees of freedom above.
The next section of the output is where the post-hoc
pairwise factor level comparisons are listed:
Pairwise Comparisons; Familywise Error: 0.0500 ; Bonferroni Prob.: 0.0167
    0.0500  1                             1  
     -1.15  2  0.0044*                    2  
      1.10  3  0.0088*  0.0001*           3  
                   1        2        3    
               Cente     Left    Right    
What we see is a simple table which lists the t-test
probabilities run on all pairs of the three factor levels of
factor "POSITION".  The Bonferroni correction is used to control
the familywise error, and we find that each of the factor levels
is significantly different from each other level (all comparisons
performed simultaneously).  We can see this by noting that
all normalized level means (the values 0.05, -1.15, and 1.10)
differ from one another by more than the approximate Bonferroni
distance, which is 0.824, and that the more precisely computed
p values in the table (0.0044, 0.0088, and 0.0001) indicate
that every pair is significantly different from every other.
Lastly, CLEAVE produces a final section summarizing important
data values at the end of the output:
--------------------------------------------------------------------------
         TREATMENT EFFECTS IN ORDER OF SIGNIFICANCE AND THEN SIZE     
   
                  Partial  Significance     Error      Error        Eta
  
Source            Omega^2     Levels        Level      Types      Squared

--------------------------------------------------------------------------
Repeats            0.9690     0.0100*      0.9222    Subjects     0.76482
Position           0.8919     0.0100*      0.7458    Subjects     0.11137
PR                 0.1973     1.0000       0.5014    Subjects     0.01311
         Cumulative R^2 (Eta^2) Due to All Source Terms: 0.8893
   * = Significance Levels Modified by Box-Geisser-Greenhouse Epsilons
This list is an ordered list of ANOVA sources which tells us the
estimated treatment magnitudes (source effects) and significance
levels achieved by each source.  The only sources listed are
those whose partial omega squared is greater than zero, and the
sources are listed in order of significance level, and then
partial omega squared order.  Note that the significance levels
are divided into 4 classes, highest significance (0.01),
medium significance (0.05), lesser significance (0.10), and not
significant (1.0).  Also, CLEAVE indicates whether or not each
significance value was computed by taking into account the
Box-Geisser-Greenhouse factor.
The Error Level column records the value of the error term’s
mean squared value that was used to compute the row’s
partial omega squared.  These values provide a good check on
whether the list of partial omega squared values can really
be used for comparisons or not by seeing if the error level
values are reasonably constant (which is an assumption
of the standard ANOVA procedure).
At the bottom of the list CLEAVE displays the cumulative R^2
value as computed from all three source terms for which R^2
values were computed.  Nearly 89% of the total SS is
accounted for by these three source terms - as
opposed to being contained in error terms (or due to the
subjects themselves).
The intent of this list is to provide the CLEAVE user a
convenient summary of treatment magnitudes so that the user
can gain some idea of which main and interaction terms
really matter to the experimental outcome.  Here we see that
the main effects are both important, while the interaction
effect is not so important - though a partial omega squared
of 0.1973 is not so trivial and might be significant with a
few more subjects in the experiment.  Finally, we again note
that 89% of the total sums squared is accounted for by the
main and interaction terms, indicating that the signal to
noise ratio in the data produced in the experiment is rather
high.
3. NOTES ON USING CLEAVE’S FEATURES
USING VARIOUS ANOVA DESIGNS AND PRODUCING F VALUES
For most ANOVA designs that the program CLEAVE can handle,
the basic computations are the same - compute the
sums of squares (SS) and degree of freedom values for
all relevant factor combinations by using the proportional
design equations found in many ANOVA references.  Then
mean squares (MS) and F ratios can be computed - the ratio
of the MSs for the source and its error term - and thus
"p" values computed using standard equations for the
F distribution’s cdf function.
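As a minimal sketch of that last step (in Python with SciPy,
neither of which is part of CLEAVE), the "p" value is the upper
tail of the F distribution at the computed ratio; the numbers
below are the "Position" values from the sample output:
from scipy.stats import f

ms_effect, df_effect = 25.35, 2        # source "Position"
ms_error,  df_error  = 0.7458333, 8    # its error term "PS/"
F = ms_effect / ms_error               # = 33.99
p = f.sf(F, df_effect, df_error)       # upper tail, ~0.0001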
The only relevant exception to this story is when there
are multiple random factors other than the SUBJECTS factor
in the design.  In the case where there are two or more
random factors not in a source effect under consideration,
then in order to test the hypothesis that the effect’s
variation implies no real effect, CLEAVE must use at least
4 MS values to compute a quasi-F value and subsequent p
value.  In fact, if there are n>1 random variables not
included in the source term, then CLEAVE uses the classic
Satterthwaite formulas to compute a quasi-F value which
uses 2^n MS/df pairs.  Quasi-F statistics are not true F
statistics, but when used along with the degrees of
freedom as specified by Satterthwaite, they approximate
the correct distribution by using an F distribution which
has the correct distribution’s first two moments.  The user
can tell when CLEAVE makes such a computation by seeing the
header:
SOURCE             Satterthwaite       df             MS   Quasi-F     
 p
In this case no SS values are displayed since it takes at
least 4 to compute the MS (and df) values of the numerator
and denominator.
The user has a choice of computing one of two quasi-F
values.  The first way is to choose to let MS(effect) be
the numerator of the quasi-F and the other MS’s be in the
denominator.  The other choice is to use (2^n)/2 MSs in both
the numerator and the denominator of the quasi-F.  The
advantage of the latter scenario is that only addition is
used to combine the appropriate MS's.  Most of the
theoretical work has been done on the second, addition-only
quasi-F computation, and it is the best one to use according
to some authors (it provides "p" values that are closer to
being correct).
authors recommend the first Satterthwaite method - modifying
the denominator only - because computed "p" values are
competitive with the second method and because the source
MS is not tampered with.  However, the first method can
end up with a negative denominator - which makes little
numerical sense for a mean squared.  In that case CLEAVE
will switch to using the addition-only method.
The output of the Satterthwaite equation looks like the
following:
SOURCE             Satterthwaite       df             MS   Quasi-F     
 p
A                    Numerator:       2.5       1.215278      1.29  0.3377
   
->ABC               Denominator:      7.9      0.9444444
where factors "B" and "C" are both random factors.  Notice
that there can be fractional degrees of freedom in the
denominator (and the numerator using the second method).
Note that the first column says that the source factor is
"A" and "->ABC" indicates that the highest order interaction
term used by Satterthwaite is "ABC".
The second method is the default method for CLEAVE, but the
first method can be chosen by altering the cleave.cnf lines
0                 Satterthwaite F Computation Type
                  Use approximations of effect and error = 0
                  Use approximations in denominator only = 1
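To illustrate the mechanics, here is a Python sketch (the
function is ours and the MS/df values are placeholders, not
CLEAVE internals) of the addition-only quasi-F, which puts sums
of mean squares in the numerator and denominator and gives each
sum an effective df via Satterthwaite's formula:
from scipy.stats import f

def satterthwaite_df(ms_list, df_list):
    # Effective df for a sum of mean squares (Satterthwaite).
    total = sum(ms_list)
    return total ** 2 / sum(m * m / d
                            for m, d in zip(ms_list, df_list))

# Hypothetical MS/df values for an effect A with random B and C:
num_ms, num_df = [2.10, 0.95], [1, 4]    # e.g. MS(A) + MS(ABC)
den_ms, den_df = [1.40, 1.10], [2, 2]    # e.g. MS(AB) + MS(AC)
quasi_F = sum(num_ms) / sum(den_ms)
p = f.sf(quasi_F,
         satterthwaite_df(num_ms, num_df),
         satterthwaite_df(den_ms, den_df))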
USING BOX-GEISSER-GREENHOUSE CORRECTION VALUES
The Box correction should be computed whenever there
is significant unevenness in variance or covariance of
factor levels typical of repeated measures designs.
The Box correction value is used to compute both
the Geisser-Greenhouse and Huynh-Feldt epsilons, which
are parameters for approximating a so-called quasi-F
distribution using an F distribution.  In fact, uneven
(co)variance values cause the mean square statistics
of a main or interaction effect to stray from an ideal
chi-squared distribution, and the two epsilons above
are ways of approximating this deviant distribution by
adjusting the degrees of freedom of the chi-squared
distribution (they do this by trying to match the
first two moments of a true chi-squared
distribution to that of the distorted distribution).
The F distribution which results when using the
Geisser-Greenhouse epsilon (by dividing by the error
mean square adjusted again using the epsilon) is
usually a conservative approximation and thus can be
trusted for use in identifying significant factors.
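For reference, here is a short Python sketch (ours, not the
CLEAVE source) of the standard Box/Geisser-Greenhouse epsilon
computed from the sample covariance matrix of the k repeated
measures levels; the Lower Bound value is simply 1/(k-1):
import numpy as np

def gg_epsilon(S):
    # S: k x k sample covariance matrix of the level scores.
    k = S.shape[0]
    # Double-center the covariance matrix.
    D = (S - S.mean(axis=0, keepdims=True)
           - S.mean(axis=1, keepdims=True) + S.mean())
    # Box's epsilon lies between 1/(k-1) and 1.
    return np.trace(D) ** 2 / ((k - 1) * np.sum(D ** 2))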
A widely recommended way to use the Geisser-Greenhouse
options is as follows (this is sometimes
called the Geisser-Greenhouse algorithm):
1) See if the null hypothesis is refuted using the
unadjusted F value probability (at the desired level of
significance).  If not, you are done (applying any of the
epsilons can only make the null hypothesis harder to
refute).  But if the null hypothesis IS refuted, go to step 2.
2) Use the Lower Bound estimate to see if the null hypothesis
can be refuted.  If so, you are done, since using this
epsilon assumes the worst-case correlation.  If not, go to
step 3.
3) Use the Geisser-Greenhouse epsilon to see if the null
hypothesis can be refuted.  Either way, you are done.
4) Optionally, after step 3, if the Geisser-Greenhouse
epsilon value is greater than, say, 0.7, and you can’t quite
get step 3 to refute the null hypothesis (e.g. p = 0.057),
then use the Huynh-Feldt epsilon to see whether the null
hypothesis can be refuted with it.  If so, then be prepared
to convince your paper referee that the H-F epsilon is a
kosher method to use to wring significance out of your
borderline-significant experimental data.
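In code form, steps 1) through 3) amount to the following
small Python sketch (the function and its arguments are ours,
for illustration only):
def gg_algorithm(p_unadjusted, p_lower_bound, p_gg, alpha=0.05):
    # Step 1: stop if the unadjusted test is not significant.
    if p_unadjusted >= alpha:
        return False
    # Step 2: the worst-case (Lower Bound) epsilon suffices.
    if p_lower_bound < alpha:
        return True
    # Step 3: decide with the Geisser-Greenhouse epsilon.
    return p_gg < alpha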
However, the 4 steps just mentioned were formulated back
in the day when computation was more expensive than it is
today, and so the idea was to avoid computing the GG
epsilon values, if one could, even for a significant F
result.  Given the demonstrated reliability (though it
is a bit overly conservative) of the Geisser-Greenhouse
modification to the degrees of freedom for any F, a
reasonable approach is to simply always use the GG epsilons
- in effect you are dispensing with the ANOVA assumption
of homogeneity of (co)variances.  Ignoring the Huynh-Feldt
epsilon is reasonable, too: though it corrects large but
overly conservative GG epsilons, the actual effect on the
resulting "p" value is not that great (so the user loses
little power by just sticking with the Geisser-Greenhouse
epsilons), and there have been studies in which the H-F
epsilon produces a liberal p value test.
The algorithm which computes the Geisser-Greenhouse epsilon
relies on one assumption which may be of interest to the
user: in the case where there are "between" variables not
included in the source factor, the algorithm just averages
together the covariance matrices within each separate group
(of all the auxiliary between factors) in order to compute
the Box correction factor.  If, in fact, one has reason to
believe that there may be significant differences in these
averaged covariance matrices, then the G-G epsilon might not
provide a fair adjustment to the "p" values.  However, the
lower bound epsilon is as it claims to be even in this case,
so it provides a conservative "p" value to the user.
In the case that the user has non-subject random variables
included in the experimental design, CLEAVE is able to use
Geisser-Greenhouse and Lower Bound adjustments to the
p values even in these cases.  What this requires is
computing a separate epsilon (G-G or L-B) for each of the
MS/df pairs used in computing the unadjusted F value.  Thus,
we end up with a separate epsilon for the numerator and
denominator of the computed F value which is used to adjust
- separately in this case - the degrees of freedom of the
numerator and denominator of the F or quasi-F.  CLEAVE
shows the user the results in the following format:
SOURCE                          SS      df             MS        F    
  p
B                       6.88888889       2       3.444444    12.40  0.0193
** 
AB                      1.11111111       4      0.2777778
Lower Bound Epsilons:            (num: 0.500, den: 0.250)           0.1762
   
Box-Geisser-Greenhouse Epsilons: (num: 0.625, den: 0.324)           0.1344 
  
Here factor "A" is a random variable and so there are
seperate epsilons for source term "B" and error term "AB"
(numerator and denominator respectively).  In the case
where a quasi-F value is computed, the numerator
epsilon and denominator epsilon are both composites of
the epsilons used to adjust each "df" appearing in the
Satterthwaite df equation.
In addition, the user can compute Geisser-Greenhouse-like
epsilons for randomized ANOVA designs (non-repeated
measures experiments containing pure between factors).  The
theoretical underpinning is the same as it is for within
factors designs - see Box (1954) - so that epsilons are
used to correct degrees of freedom and thus to compensate
for variance heterogeneity.  And, in fact, we have done
Monte Carlo simulations which show that the corrected
degrees of freedom, under various heterogeneity scenarios,
does well in making corrected "p" values reflect their
stated significance levels with balanced ANOVA designs,
which are CLEAVE’s specialty.  See the file "randbox.txt"
for a brief description of the simulations and the results.
If the user chooses to compute Box epsilons for pure
between factors main effect and interaction terms,
CLEAVE will produce the following kind of result (the
following example is reproduced in the file bogartz6.out
and concerns the between main effect labeled "B"):
SOURCE                          SS      df             MS        F    
  p
B                       75.2839506       2       37.64198    17.03  0.0001
***
S/BC                    39.7777778      18       2.209877
Lower Bound Epsilons:            (num: 0.500, den: 0.111)           0.0540
*  
Box-Geisser-Greenhouse Epsilons: (num: 0.988, den: 0.423)           0.0016 ***
As with repeated measures designs and Geisser-Greenhouse
epsilons, CLEAVE produces multiple epsilons, one a "worst
case" pair of epsilons, and another a best estimate (Box)
pair of epsilons.  But as in the random factors case,
for between factors the Box epsilons are different for
the numerator degree of freedom and the denominator
degree of freedom.  So, for example, the Lower Bound
Epsilon "p" value of 0.0540 was computed by using the F
value of 17.03 along with the adjusted degrees of freedom
2*0.500 (numerator) and 18*0.111 (denominator).  But the
above factor "B" had little likely heterogeneity of
variance as indicated by the Box epsilon of 0.988 for
the numerator and subsequent adjusted "p" value of 0.0016.
One can use the between factor’s Box epsilons in the same
way that one uses Geisser Greenhouse corrections in the
repeated factors case: as part of the Geisser-Greenhouse
algorithm or always using them.  The latter policy is made
more attractive as only variance vectors, not covariances
matrices, need to be computed for pure between factor
terms.
Similar independent epsilons are computed when the error
term is constituted, in part, of high-order interaction
terms (e.g. when cells only contain 1 value and the
highest order interaction term is the error).  This can
happen because epsilons are never computed for such high-
order interactions used in the error term even when the
epsilons are computed for the effect terms.
One last comment.  In our lab, we have noticed that when
processing highly multi-way data using CLEAVE, many
high-order interaction factors show significant GG correction
epsilons (i.e. they are closer to the lower bound value than
to 1 [1 = a true "F" dist]).  To us, this implies that when
analyzing highly factorial experiments, we would be remiss
to not use the Box-Geisser-Greenhouse option when looking for
significant experimental factors.  This is especially true
when one realizes that with multi-way and/or large data sets,
it is hard to get a feel for whether ANOVA’s standard "equal-
variance" or "covariance sphericity" assumptions hold for all
factor interaction terms.  If you always use the Box options
as standard policy, then you have to worry less about those
two assumptions (don’t worry - you’ve still got normality and
other linearity assumptions to keep you on your toes...).
USING SOURCE TREATMENT MAGNITUDES
When perusing the usual results of a multi-way ANOVA, an
important question to ask is whether the significant (or
nearly significant) results one finds using an omnibus F
test have large or small effects on the outcome of the
experiment.  This question is not obvious to answer from
the F or p values alone because, with enough subjects in
the experiment, even a trifling effect can be made to pass
a severe significance test.
CLEAVE includes three measures which can help
answer this question.  Partial omega squared is the ratio
of the variance estimate due to the effect divided by
that same variance plus the error variance.  And partial
eta squared is the ratio of the effect's sum of squares to
that same sum of squares plus the error term's sum of
squares.  Partial omega squared provides a direct look at
the source term's effect within the linear ANOVA model
equation, while partial eta^2 is analogous to a regression
coefficient, estimating how important a term is to the
multilinear regression equation.  Third, eta squared,
which is often called R squared, is the ratio of the source
SS (sum of squares) to the total SS and shows the user what
fraction of the total variation is due to the source term
being considered.  The two "partial" values are only
ratios of the source (effect) term to the effect plus
(local) noise, whereas the R squared value is a ratio
of the source magnitude to the total sum-of-squares
magnitude.
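For concreteness, here are common textbook forms of the
three measures (a Python sketch; CLEAVE's exact formulas,
especially for repeated measures and random factor designs,
may differ in detail):

    def partial_eta_sq(ss_effect, ss_error):
        return ss_effect / (ss_effect + ss_error)

    def eta_sq(ss_effect, ss_total):    # often called R squared
        return ss_effect / ss_total

    def partial_omega_sq(ss_effect, df_effect, ms_error, n_total):
        # variance-component estimate; can come out negative
        # (see the discussion later in this subsection)
        return (ss_effect - df_effect * ms_error) / \
               (ss_effect + (n_total - df_effect) * ms_error)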
All three treatment magnitude values are valuable
to a researcher because they strip away the degrees of
freedom which are intrinsic in the "F" and "p" values
computed for each source, and the presence of those df's
makes comparing source terms' "F" or "p" values a *very*
bad idea.
In the linear model generally assumed when performing an
ANOVA, partial omega^2 is a relative estimate of the variance
(sum of squares of the effect) of each of the factors or
factor interactions.  Thus, the larger this value is, the
less likely the null hypothesis is true.  Further, it is
sometimes kosher to compare the treatment magnitudes of
different source terms to one another.  If one source
term's partial omega^2 is an order of magnitude larger than
that of another source term, then it is likely that the
first source term has a larger effect on the experimental
value than does the latter.  Similar remarks apply to
partial eta squared.
In contrast to the two "partial" magnitude values, R^2 tells
the user in a simple way how the amount of variation due to
the effect compares with the amount of variation due to
other source effects.  And at the end of the treatment
magnitude list, the user can see how much of the total
sum-of-squares variance is due to all of the source terms
whose F values were computed - the balance is sum-of-squares
variance due to error terms, which are computed earlier but
do not always appear in the column of error terms.
It is not as straightforward to use these values as it is to
use the "F" and "p" values computed for each source term.
However, in some fields it is known that a researcher
should look for treatment magnitude effects of a certain
size in order for those factors to be claimed significant
to the outcome of the experiment.  For example, Keppel
states that in psychological experiments, decades of
experience show that partial omega^2 values can be loosely
categorized into small, medium and large effects by looking
for values in the vicinity of 0.01, 0.06, and 0.11,
respectively.
The size of partial omega squared (and partial eta^2)
depends upon the ratio of "signal to noise" that you see in
the experiments that you are running.  If there is a lot
of random fluctuation in your experiments compared to those
in another field, then your treatment effect values will, in
general, be uniformly smaller.  That is why CLEAVE makes no
attempt to classify the absolute size of treatment magnitudes
and instead just treats them as comparative values in its
ordering of source effects at the end of the output file.
There is a reason for caution even with this approach,
however.  Using partial omega^2 values comparatively requires
the standard ANOVA assumption that the error level stays
"constant" across subjects and factors (= the epsilon that
appears in the usual linear ANOVA model).  That is why it
pays to look at the "Error Level" column in the ordering of
treatment magnitudes.  One should look to see if these
values are distributed in a vaguely normal manner.  If not,
then the ordering of partial omega squared values might be
of little use, and, in fact, the omnibus ANOVA results
should be taken with a grain of salt since there might be
some subject-factor interaction lurking in the error terms.
However, there is one way in which the absolute values of
the treatment magnitudes can make a big difference.
Because CLEAVE's "power/number of subjects" table is
computed directly from the value of partial omega squared,
the smaller this value (regardless of the field of study
you are in), the more subjects you are going to have to run
through an experiment to find significance - assuming you
are confident enough that the effect you are looking for
is really there.
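To illustrate the kind of computation involved (a rough
Python sketch, not CLEAVE's exact algorithm), one common
approach converts partial omega squared to Cohen's f^2 and
evaluates power with the noncentral F distribution; the df
and subject count below are made-up examples:

    from scipy.stats import f, ncf

    def post_hoc_power(p_omega2, df1, df2, n_total, alpha=0.05):
        f2 = p_omega2 / (1.0 - p_omega2)   # Cohen's f^2
        lam = f2 * n_total                 # noncentrality
        f_crit = f.isf(alpha, df1, df2)
        return ncf.sf(f_crit, df1, df2, lam)  # P(F' > F_crit)

    print(post_hoc_power(0.06, 2, 18, 21))

Inverting such a function over the number of subjects gives
the subject counts that the power/subject table reports.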
Note that because of the way in which partial omega squared
is computed, it is possible for the partial omega squared
estimate to be negative, despite the fact that a variance
can never be negative.  However, statisticians recommend
that when partial omega squared values are used
comparatively, negative values be reported, lest differences
in reported values (between different factor combinations)
be biased.  CLEAVE does not, however, list negative
magnitudes in the treatment list table.
Finally, we note that in the case where a random factor
not in the effect term appears in the error term instead
of the usual subject-inclusive error term, partial
omega squared uses the subject-inclusive error term in
its computation instead of that error term, in order
to help make the partial omega squared values more
meaningful.  The column "Error Types" records which error
term is being used: "Subjects" indicates that the
subject-inclusive error term is being used (which is the
right one to use if there are only fixed factors outside
of the effect term), while "Effects" indicates that the
partial omega squared uses the effect's proper error
term's mean square (the one including random factors)
because the subject-inclusive error type is not available
for some reason.
USING POWER/SUBJECT TABLES
Given the intuitive display format of the power/subject
table - the number of subjects that it would take to have
a specified chance of attaining the specified significance
level - the power table is easy to use.  We wished to make
it easy to use so that users will opt to keep this feature
turned on as often as possible.  This is particularly
important to experimenters for at least two reasons.
First, reducing the number of subjects saves money, so
trial runs can be used effectively to gauge how many
subjects will be needed to demonstrate any effect
or to figure out that the experimental setup just
doesn’t have the requisite power.
Second, many funding organizations (e.g. the US National
Institutes of Health and the US Veterans Administration)
have ethics regulations which urge researchers to
estimate the experimental power of their setup in
order to help reduce the number of subjects that are
needed in the experiment.  This helps reduce the
number of subjects exposed to the experimental
treatment's side effects, if any.
Finally, one very interesting use of the power/subjects
table as presented in CLEAVE is as a measure of the
strength of effect of the source term.  If the user fixes
a significance level (e.g. 0.05) and a reliability level
(say 90%) and reports the number of subjects needed
to reach that level with that reliability, this is an
intuitive and quite visceral measure of effect strength.
No interpretation is needed, compared to partial omega
squared or R^2 or other measures (like Cohen's f or d'),
because experimenters in a field generally understand
approximately how many subjects are needed to attain
significance in experiments with strong or weak effects.
Note that when there are non-subject random factors in
a design, whenever such random factors appear in the
error term (the denominator) of the computed F, a
subject table cannot be computed for the source term
because the number of subjects does not come into play
directly in computing the F value or testing it for
significance.  CLEAVE skips the computation in that
case.
USING POST-HOC AND PAIRWISE TESTS
Scheffe Post Hoc Test
Using the Scheffe post-hoc value is not too hard but takes
a little effort on the part of the person using CLEAVE.
The main idea is that the user reruns CLEAVE on an altered
experimental data set in order to test the analogous source
F value against the Scheffe value produced with the original
data set.  This is most useful when the user wants to test
3-or-more-level contrasts of a given factor.
The Scheffe post-hoc value is one of the most conservative
post-hoc tests available when one wants to follow up an
omnibus ANOVA with more specific interaction or main effect
hypothesis tests.  It is conservative because it allows one
to perform any number of linear hypothesis tests on the
source factors while keeping familywise error under the
specified level (i.e. the chance of making even one error
across all of the tests is kept below 0.05).
An efficient way to use the test is as follows.  First, run
the omnibus test on your data and observe the Scheffe
post-hoc value for each source term you wish to apply
post-hoc tests to.  Then, edit the input to the CLEAVE
program in order to convert the data to perform the test
you wish to see, likely one factor at a time.
For example, suppose that factor "A" has 4 levels:
CONTROL, DRUG1, DRUG2, and DRUG3.  If you wish to test the
hypothesis that the average of the drugs combined is
significant against the control, then use a text editor to
change all of the DRUG1, DRUG2, and DRUG3 labels appearing
in the data set into the "DRUG" label (a search and replace
should make this easy), and then rerun CLEAVE and see if
the computed F value for the specified source exceeds the
Scheffe value from the omnibus test.
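The relabeling itself can be scripted rather than done by
hand.  A small sketch in Python (the file names here are
hypothetical):

    import re

    with open("drugdata.txt") as src, \
         open("drugdata_merged.txt", "w") as dst:
        for line in src:
            dst.write(re.sub(r"DRUG[123]", "DRUG", line))

Then rerun "cleave < drugdata_merged.txt > merged.out" and
compare the source's F value against the omnibus Scheffe
value.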
As another example, if you wish to test the hypothesis that
DRUG1's effect is significantly different from DRUG3's effect,
then use the text editor on the original data set and remove
all of the lines of data containing the labels DRUG2 and
CONTROL (this can be done reasonably fast with an editor
macro).  Then rerun CLEAVE and see if the computed F for the
source exceeds the omnibus Scheffe value.  However, in this
case of a pairwise comparison, it is easier and more accurate
to use the pairwise comparison tables as described later in
this subsection.
The main problem with the Scheffe post-hoc critical F value
is that it is a low power test - meaning that there is a good
chance that you may not get significant results (via the Scheffe
test) even though some other post-hoc test would tell you
that the comparison you are looking at is significant.
However, a good thing about the Scheffe test is that its
values are easy to compute, and it is easy to adjust when
there are significant covariance effects appearing in the
source factors.  In that case the user should use the "(GG)"
Geisser-Greenhouse corrected Scheffe critical F value when
rerunning the ANOVA with altered factor levels.  See the next
section for limitations on using Scheffe's F test value.
Pairwise Comparison Tests
Using the pairwise comparison tests is more straightforward.
The pairwise tests use a generalized t test to assess the
hypothesis that two levels' outcome averages are the same.
This is performed for every combination of levels.  The t
test is generalized to take into account differences in
sample size and level variances - we use a version of
Welch's V as recommended by many statisticians.  And the
tests use various fairly conservative algorithms to control
familywise error.
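For reference, here is a sketch of a Welch-style
unequal-variance t statistic of the sort these pairwise
tests are built upon (Python; the exact Welch "V" form
CLEAVE uses may differ in detail):

    import math
    from scipy.stats import t as t_dist

    def welch_t(m1, v1, n1, m2, v2, n2):
        # m*, v*, n* are the two levels' means, variances, counts
        se2 = v1 / n1 + v2 / n2
        t = (m1 - m2) / math.sqrt(se2)
        # Welch-Satterthwaite degrees of freedom
        df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1)
                         + (v2 / n2) ** 2 / (n2 - 1))
        return t, df, 2.0 * t_dist.sf(abs(t), df)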
In all cases, what CLEAVE outputs are tables which list
a "p" value for each pair of source term levels under the
hypothesis that they produce the same outcome.  In the
extensive example analyzed in a previous section we
saw that a triangular table is produced if the user wishes to
inspect all pairwise comparisons (the default case).  If the
user chooses to inspect all comparisons to one control, then
the table that is produced is a square with the diagonal
missing, such as the one below (this is for the factor
"Repeats" in the "repeats" file where the "control
comparison" option is chosen in cleave.cnf):
Control Comparisons; Familywise Error: 0.0500 ; Bonferroni Prob.: 0.0167
     -2.98  1           0.0001*   0e+00*   0e+00*  1  
    -0.850  2  0.0001*           0.0146*   0e+00*  2  
     0.150  3   0e+00*  0.0146*            0e+00*  3  
      3.68  4   0e+00*   0e+00*   0e+00*           4  
                   1        2        3        4    
                   1        2        3        4    
Here, if one wants to use, say, level 2 as the control, then
the second row (or column because of diagonal symmetry) should
be consulted to see which other level’s effects are likely to be
different from level 2’s effect.  Other rows can be ignored.
One should use the "cleave.cnf" file to select a control table
to be produced (likewise in choosing other options listed below)
because using the all-pairwise table for the purposes of making
comparisons reduces the power of the user’s pairwise tests.
Also, if we are using pooled error in the above comparisons,
we can use the Sidak Distance (0.971 above) to attach
significance intervals to each factor level.  E.g. factor
level 3 above has its mean in the interval
[0.150-0.971, 0.150+0.971] while keeping the familywise
error below 0.05.
The remaining pairwise comparison options are discussed below.
The difference between choosing the Bonferroni correction or
the Sidak correction is one of assumptions: the Sidak
correction assumes only that your factor levels are
distributed according to a multinormal distribution, while
the Bonferroni correction makes not even that assumption -
it just uses basic laws of probability to make its
"familywise" correction.  In practice, however, both
corrections give nearly the same results, but the Sidak
correction is slightly higher in value - so it provides
a more powerful test while remaining conservative with
respect to any significant factor level covariation.
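The two corrections are simple to state.  For m pairwise
tests at familywise level alpha (a sketch; with m = 3 this
reproduces the 0.0167 Bonferroni probability in the control
table above):

    m, alpha = 3, 0.05
    bonferroni = alpha / m                # 0.05/3 = 0.0167
    sidak = 1 - (1 - alpha) ** (1.0 / m)  # ~0.0170, slightly higher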
Likewise, whether you use a simultaneous test or a sequential
(step-wise) test depends upon whether you are comfortable
with the quasi-Bayesian assumption that the sequential ("Holm")
test makes - that every time you find a pair which passes the
corrected probability test, you can run the next test
with a corrected probability assuming one fewer pair to test
(you are updating on the information that the previous pair has
significantly different outcomes).  Thus, the sequential tests
are slightly easier tests to pass and are similar to tests
like Newman-Keuls and Duncan (the latter have more assumptions
attached to them, however, which is why we do not include them).
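A sketch of the sequential (Holm) idea in Python: sort the
pairwise p values and test each against a Bonferroni level
computed from the number of comparisons still remaining:

    def holm_reject(p_values, alpha=0.05):
        m = len(p_values)
        rejected = [False] * m
        ordered = sorted((p, i) for i, p in enumerate(p_values))
        for rank, (p, i) in enumerate(ordered):
            if p > alpha / (m - rank):  # level for remaining tests
                break                   # stop at the first failure
            rejected[i] = True
        return rejected

    print(holm_reject([0.001, 0.04, 0.03, 0.20]))
    # [True, False, False, False]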
Another major option that you can change is whether or
not to use a pooled error when computing the pairwise
comparisons.  The default is to not use a pooled error, and
the reason is that if one does use a pooled error variance,
one runs the risk of the computed p values not being a
reliable indicator of significance in one of two senses.  If
there are significant differences in the variances of two
levels being compared, then using a pooled error variance
can cause the specified familywise error to be exceeded - or
it can cause the pairwise tests to lose power.  However, if
a user knows that the variances of the different levels are
quite similar (or discovers this by looking at the first
part of CLEAVE's output), then using a pooled error can
increase the pairwise comparisons' power, and so it would
be recommended to change the default value.
The final major option one can choose in doing post-hoc
pairwise comparisons is whether to compare ANOVA-model
interaction levels, which is what was done in bulk by the
ANOVA F statistic, or to simply compare plain joint factor
levels via paired comparisons.  The former is the default
case, but the latter could be used in the case that joint
factor levels are more interesting than the ANOVA model's
interactions, in which intersecting lower-order averages
are subtracted out for the sake of having disjoint sums of
squares.  Plain joint factor levels can be chosen for
analysis by adding 4 to the value in cleave.cnf which
chooses whether to look at all-pairwise comparisons or
control comparisons.
In the case that ANOVA model interaction levels (the default)
and pooled errors (also the default) are selected, CLEAVE
will attempt to boost significance by using logical relations
amongst the different pairs.  These logical relations occur in
interaction terms where at least one of the term's factors is
binary.  E.g. if we choose to process interaction terms on
the included "proportional" data file using the two defaults
listed above, then the pairwise comparison table for an
all-pairwise comparison of the first two factors' interaction
looks as follows:
Pairwise Comparisons; Familywise Error: 0.0500 ; Bonferroni Prob.: 0.0031
     0.491  1
     0.366  2  0.9364
     -1.22  3  0.2515   0.2283
      1.12  4  0.7010   0.6098   0.0936
    -0.491  5  0.5673   0.5844   0.6250   0.3244
    -0.366  6  0.5844   0.6015   0.5161   0.3141   0.9364
      1.22  7  0.6250   0.5161   0.0466   0.9415   0.2515   0.2283
     -1.12  8  0.3244   0.3141   0.9415   0.1476   0.7010   0.6098   0.0936
                   1        2        3        4        5        6        7
                   1        1        1        1        2        2        2
                   1        2        3        4        1        2        3
Note that even though there are 28 possible pairwise comparisons
in the table, the familywise Bonferroni probability is 0.0031,
which is 0.05/16 and not 0.05/28, which is what it would be had
CLEAVE not used the logical relations inherent in ANOVA model
interaction terms having one binary factor (the first in the
above example).
DETECTION OF OUTLIERS
There are two parts of CLEAVE's output which assist the user
in detecting outlying data values.  Detection of rogue values
is important as their presence is one of the major ways in
which the results of an ANOVA can be misleading.  Given the
fact that ANOVA and pairwise comparisons rely on sums of
squares, extreme values can easily reduce the effectiveness
of the resulting statistics.
The initial Data Histogram allows the user to see whether the
data is roughly generated by a linear model with Gaussian-like
noise added to it.  The two most troubling things to see in
the histogram would be, first, the bulk of the data
compressed into relatively few bins with a few outlying values
scattered at the edges, and second, a very asymmetric
or non-unimodal distribution of the experimental data - one
which is not at all consistent with a linear model.
The Normed Ranges that appear with the basic factor
information can also be useful in tracking down multiple
outlying data values.  Each normed range line diagram
displays the maximum and minimum values appearing in each
line's factor-level combination data by displaying the
range as multiples of the combination's data standard
deviation.  Thus, one can use the line diagrams to detect
single data values which are aberrant relative to the
rest of the values.  This works especially well when one
inspects the Normed Ranges of the cells of the ANOVA
design - when all of the factors in a dataset (sans the
subjects) take on levels and are not averaged over.  One
can then see whether or not the cell data appears to have
extreme values in accord with Gaussian-shaped noise added
to a constant data value.
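The normed range itself is a simple statistic.  A sketch
(Python; made-up cell data) of expressing a cell's extremes
as multiples of that cell's standard deviation, which is
the quantity the line diagrams draw:

    import statistics

    def normed_range(values):
        m = statistics.mean(values)
        s = statistics.stdev(values)
        return (min(values) - m) / s, (max(values) - m) / s

    print(normed_range([4.1, 4.3, 4.2, 4.4, 4.25,
                        4.35, 4.15, 4.3, 4.2, 9.9]))
    # about (-0.40, +2.84): the 9.9 stands out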
Note that when one is viewing the Normed Ranges of main
effects or lower-order interactions, it is often the case
that one will see maximum or minimum values at high multiples
of the standard deviation even though there are relatively
few data values taking on the particular line's factor
levels.  This can be consistent with the data values being
produced by a linear model with Gaussian noise when the
other factors being averaged over have significant effects
via the linear model.  E.g. if the user sees the Normed
Range diagram "****|***" on a line where only 35 data points
are in the range of a main effect's line, the user should
not find the result surprising on the grounds that one
should rarely see a data point 4 standard deviations
below the mean (and 3 above), because the variation is not
due to Gaussian noise alone but also to linear factor effects.
DETECTING AND CORRECTING ANOVA DATA SET DESIGNS
The detection of the particular design of an ANOVA data set
is intended to help the program automatically perform the
appropriate statistical procedures on the data values.  The
most important decision the program makes is in classifying
factors as "within" (repeated measures) or "between" factors,
where the latter factors have one level assigned to each
subject  (and the assignment is ordinarily done via
randomization).  The detection routine used can pick out
"within" and "between" factors even when there are some
errors in the data set, such as there being too few or too
many data values, or minor errors in the factor level
labels.
Second, the detection routine performs a thorough check of
the proportionality of the ANOVA data set and tells the
user whether or not a factor is balanced, and whether
a balanced factor has varying numbers of subjects assigned
to each level or is evenly balanced.
Finally, CLEAVE can "correct" data sets which are nearly
balanced, but can only do so for data imbalanced in certain
limited ways.  One way in which a defective data set can
be corrected is if some subject performs the experiment
repeatedly with most but not all of the possible
combinations of levels of all of the within factors.
CLEAVE can interpolate data values based upon the present
data cells' means and standard deviations in order to
force the design into balance (at least the within
factors will be balanced).
The other main correction that CLEAVE can make is to find
subjects which do an experiment multiple times with
different levels of a between factor.  In this case CLEAVE
can (randomly) delete some of the data points to make sure
that each subject is associated with only one level of
every between factor.
In both cases of data set modification, the intent is to
have CLEAVE modify only very small departures from a
proportional design, which is why the configuration file
"cleave.cnf" allows the user to place a cap on both the
percentage of missing data that needs to be added and the
percentage of excess data that needs to be deleted
(defaults are set at 0.1% for both).  The algorithm used
is justified by numerical considerations - guessing
reasonable values for the absent data will not change the
computed statistics' highest (3 or more) significant digits
(and similarly for deleted data).
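A sketch of the kind of insertion involved (Python; not
CLEAVE's exact algorithm): a missing within-factor cell
value can be filled in near the mean of the values already
present in that cell, which by construction barely moves
the computed statistics:

    import statistics

    def fill_missing(cell_values, n_expected):
        filled = list(cell_values)
        while len(filled) < n_expected:
            filled.append(statistics.mean(cell_values))
        return filled

    print(fill_missing([0.91, 0.85, 0.90], 4))  # inserts ~0.8867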
More importantly, if data set modification is attempted
by CLEAVE, in the basic data output one will find the
following messages:
--------------------------------------------------------
CLEAVE: Excise some data to create a proportional design.
Deleted Cell Data
SUBJECT A       B       C       Deleted Data Value
5       1       3       3                     3
CLEAVE: Inserting missing data to create a proportional design.
Inserted Cell Data
SUBJECT A       B       C       Inserted Data Value
8       1       3       3            0.88776571
CLEAVE: Start over with, perhaps, a modified dataset.
--------------------------------------------------------
which informs the user which data points were deleted and
which data points were interpolated into the data set.
This can be important in the case that the user has
fixable defects in the data set file because it pinpoints
precisely where the problem is.
4. PROGRAM LIMITATIONS
1) CLEAVE is constantly being tested in vivo and in vitro.
It has passed all of the tests thus far thrown at it (see
"cleave.tst" for details), but further testing is being done
on real trial data and ongoing corrections are likely to
be needed.
2) Using Box corrections with higher interaction term degrees
might cause the program to work for a long time to compute the
G-G values.  When using CLEAVE on large, multi-way data sets,
it is advisable to test how long the program will take using
a degree value like 2 in order to estimate how long it might
take to compute higher order interaction Box coefficients.
Computing the Box corrections only when needed will also
speed up execution to a certain extent without costing any
interesting results [the good news is that the more
significant your experimental results are, the longer this
option takes].
Also, a computer with memory limitations might not be able
to hold the covariance matrices which are computed in the
course of computing the G-G epsilons.
E.g. with a 7-way ANOVA run on fragmentary fMRI data common
in our lab (9 x 3 x 2 x 2 x 15 x 2 x 7 x 5, where the 9 is
the random factor and all factors are crossed within factors),
it takes the following amounts of time to run CLEAVE on a 900
MHz Athlon computer (512 MB RAM) running Windows 2000:
Computed
Only When
Needed?         Time            Interaction Degree
-------         -------------   ------------------
No              9     seconds           0
No              10    seconds           1
No              20    seconds           2
No              100   seconds           3
No              1500  seconds           4
No              35000 seconds           5
Yes             10    seconds           1
Yes             15    seconds           2
Yes             50    seconds           3
Yes             800   seconds           4
Yes             10300 seconds           5
Unfortunately, it does appear that the amount
of time it takes to compute interactions of 1 order higher 
increases exponentially (this makes sense if one looks at the
number of computations that the program does to compute the Box
factors).
This problem can be lessened by using the cleave.cnf option
that has CLEAVE compute approximate GG epsilon values for
interaction terms at or above a specified order.  The
approximate epsilon is identical to the exact epsilon under
certain simplifying assumptions, speeds up the computation
greatly, and appears to be a fairly good approximation to
the true GG epsilon in testing thus far.
3) The input file to CLEAVE is written to a temporary file
so that it can be reread multiple times during processing.
Occasionally, if CLEAVE is interrupted, this file can be
left on the disk - it has the name "CLEAxxxx.tmp" - and
can be deleted without harm when you find it in the current
directory or the tmp directory.
4) Our GG adjustment of the Scheffe critical value,
unfortunately, is not the recommended one [see Miller, e.g.]
- we simply adjust the Scheffe F value using the same
moment-matching degree of freedom adjustment that the
omnibus F test relied upon to estimate significance.
However, this is a sensible approximation which allows the
user to run CLEAVE again as recommended while providing a
margin of safety which the lack of equal covariance values
demands.  The standard adjustment of the Scheffe value is
one which requires knowing what linear combinations of levels
are to be tested - something which would be hard for the user
to incorporate into successive runs of CLEAVE (this is a
post-hoc test, after all).  Additionally, the G-G correction
used to adjust the Scheffe F test value is an average value
which may not reflect the variance/covariance inhomogeneities
of the specific post-hoc factor level comparisons the user is
interested in.  However, one can check to see if the latter
are the same as the former by running the post-hoc test
with the Geisser-Greenhouse option turned on and looking to
see whether the post-hoc epsilon is reasonably close to the
original (omnibus) F test’s computed G-G epsilon.
5) We only include the rather conservative Bonferroni and
Sidak pairwise tests because those two tests use few
assumptions about the data when controlling experimentwise
error, and also because the two best other tests - Tukey
for all-pairwise tests and Dunnett for control-pairwise
tests - are expensive to compute if we want to correct
(and we do) for (co)variance heterogeneity.
6) The input file parser has the following quirks to it.
First, one can arrange the data in "quote-delimited"
format without the use of any whitespace,
e.g.  'Sub1'"medial""posterior"'high'"4567.89"
is a perfectly valid input line.  However, one wants to
avoid having input files with an empty set of quotes:
"" or '' - either of these will mess up the parsing
(a level name must contain at least 1 character).  Finally,
one can record the absence of an experimental outcome by
placing the string "NA" in the final column - this will
undoubtedly cause the input file to be an unbalanced design,
but a record will be made of how many such experimental
outcomes are missing from the input file.
7) The Bonferroni-Sidak post-hoc tables are not very easy
to read when there are more than 13 or so factor
levels.  It helps to have a text editor in which one can
turn off "wrapping" and easily scroll left and right.
8) In the computation of partial omega squared in the
case where there are repeated measures factors, we assume
that the ANOVA model is a so-called "additive model".
Essentially, this assumes that there is no substantial
interaction between the subjects and the repeated factors
(each subject was subjected to all of the levels of said
factors).  This is a very important assumption when one
is computing omega squared, but as we are computing only
partial omega squared - so that we do not need to know the
total experimental variance, but only a factor's error
variance - it is less of an issue.  However, it remains
to be determined how partial omega squared is affected in
clearly non-additive models (which can be detected by
the "Tukey test" - see Vaughn and Corballis).
9) The data insertion/deletion feature cannot in
general correct a nearly proportional design when
the design has two or more varying between factors
and the lack of proportionality is due to there
being a lack or excess of interaction instances of
those varying between factors.
5. REFERENCES
Box, G.E.P., "Some Theorems on Quadratic Forms Applied in
the Study of ANOVA Problems, I. Effect of Inequality of
Variance in the One-Way Classification", Ann. of Math.
Stat., v. 25, 1954, p. 290.
Games, P.A., H.J. Kesselman and J.C. Rogan, "Simultaneous
Pairwise Multiple Comparison Procedures for Means when
Sample Sizes are Unequal", Psychological Bulletin, v. 90,
#3, pp. 594-598, 1981.
Geisser, S. and Greenhouse, S., "An Extension of Box's Results
on the Use of F Distributions in Multivariate Analysis",
Annals of Mathematical Statistics, v. 29, pp. 885-891, 1958.
Keppel, G., "Design and Analysis: A Researcher’s Handbook",
3rd Ed., Prentice Hall, 1991.
Kirk, Roger E., "Experimental Design: Procedures for the
Behavioral Sciences", 3rd Ed., Brooks/Cole, 1995.
Kirk, Roger E., "Practical Significance: A Concept
Whose Time Has Come", Educational and Psychological
Measurement, v. 56, #5, pp. 746-759, Oct. 1996.
Miller, Rupert, "Simultaneous Statistical Inference",
2nd Ed., Springer, 1981.
Pendleton, O.J., "Influential Observations in the
Analysis of Variance", Communications in Statistics -
Theory and Methods, v. 14, #3, pp. 551-565, 1985.
Sahai, Hardeo and Mohammed I. Ageel, "The Analysis of
Variance: Fixed, Random and Mixed Models", Birkhauser,
Boston, 2000.
Shaffer, Juliet Popper, "Modified Sequentially Rejective
Multiple Test Procedures", JASA, v. 81, #395, Sept. 1986.
Vaughn, Graham and Michael Corballis, "Beyond Tests of
Significance: Estimating Strength of Effect in Selected
ANOVA Designs", Psychological Bulletin, v. 22, #3, 1969.
For more references used in the design of this program
see the cleave.hst file.
6. ERROR AND INFORMATIONAL MESSAGES
The following is an alphabetized list of the messages you
might see when you run CLEAVE, along with some suggestions
on how to make those messages disappear the next time you
run CLEAVE on the offending data set.
---
"CLEAVE: ___ is not a data value"
CLEAVE has encountered a non-numeric string in the final
column that is always dedicated to holding the outcome values.
Check the data file to see if there are extra columns or
stray characters in the final column.
---
"CLEAVE: Cannot exceed ___ many levels/factor!"
This message can be gotten rid of by going into cleave.cnf
and increasing the maximum number of levels/factor so that 
your data set fits into the allocated memory space.
---
"CLEAVE: Cannot find a temporary file name!"
"CLEAVE: Cannot open a temporary write file!"
"CLEAVE: Cannot reopen the temporary data storage file!"
CLEAVE is having problems locating space for or opening
a disk file which is to hold the data file so that CLEAVE
can reread it again.  This file is either in the current
directory or is in the /tmp/ directory (*NIX systems), and
it is important to be able to reread the file as that is
how CLEAVE is able to perform the ANOVA without having to
load the entire data set into memory.
---
"CLEAVE: Data rows must have from one to ___ factors"
This probably means that you need to go into cleave.cnf
and increase the number of factors that it can hold, but it
may also indicate that your data set is rather trivial - it
has only one cell, and then CLEAVE doesn't have anything to do.
---
"CLEAVE: Duplicate data lines...creating new REPEATS factor."
"CLEAVE: Duplicate data lines...averaging into existing factors."
CLEAVE is informing you that it found rows with identical
factor levels (including the subject name) and is saying that
it is going to either average them into the factors listed in
the data file or create a new (repeated measure = within)
factor which separates the duplicates.
---
"CLEAVE: Excise some data to create a proportional design"
"CLEAVE: Inserting missing data to create a proportional 
design"
These messages can appear when CLEAVE tries to alter the
input data set to make it a balanced (proportional)
ANOVA design.  They appear in the output directly above
the indicators of the data points inserted or deleted.
---
"CLEAVE: Input file has varying columnar structure"
CLEAVE has detected that your data file doesn't have a
uniform number of columns - you have to fix that in the
data set.
---
"CLEAVE: Interrupted by user (Ctrl-C)... temporary file 
removed"
This is what you see when you hit the Ctrl-C key combination
while CLEAVE is running.  The temporary file may or may not
be removed (as indicated) depending upon whether it still
exists or not.
---
"CLEAVE: Invalid a or b or NUMERICS_ITMAX too small in betacf"
"CLEAVE: Invalid betai value"
"CLEAVE: Invalid degree of freedom in cdfF denominator."
"CLEAVE: Invalid degree of freedom in cdfF numerator."
"CLEAVE: Invalid degree of freedom in cdfNCF denominator."
"CLEAVE: Invalid degree of freedom in cdfNCF numerator."
"CLEAVE: Invalid gammaln value"
These errors are internal, mathematical function errors which
you should never see unless there is some kind of bug in
CLEAVE.  CLEAVE keeps on going past these problems, but
the values printed in the section immediately following 
the warnings should be ignored as they are likely bogus.
---
"CLEAVE: Invalidly small DF: ___ in source: ___"
"CLEAVE: Large F value due to zero errorms value"
"CLEAVE: Negative DF: ___ in source denominator: ___"
"CLEAVE: Negative DF: ___ in source numerator: ___"
"CLEAVE: Negative MS: ___ in source denominator: ___"
"CLEAVE: Negative MS: ___ in source numerator: ___"
"CLEAVE: Negative SS: ___ in source: ___"
"CLEAVE: Negative SS: ___ in source pure error term: ___"
CLEAVE is having problems calculating sums of squares,
or mean squares, or degrees of freedom or an appropriate
F value.  Seeing the "Large F value" warning might be a
sign that something is wrong with your data file - there 
is just so little noise in the data set that something is
probably fishy.  The other warning/errors should rarely
be seen unless there is a bug in CLEAVE or your data
set is badly (and obviously) messed up.
---
"CLEAVE: No free memory for A Data Means Insertion Array"
These memory allocations are used when CLEAVE tries to 
interpolate data into a data set when a data set is not
balanced due to a lack of data.
---
"CLEAVE: No free memory for Aux Means"
"CLEAVE: No free memory for Aux Counts"
"CLEAVE: No free memory for Aux (Co)Variance"
"CLEAVE: No free memory for Aux Interaction Covariance 
Matrix Or t Pairs"
"CLEAVE: No free memory for Aux t Pairs"
"CLEAVE: No free memory for Aux Variance"
These memory allocation errors say that you are
running out of memory for performing Geisser-Greenhouse
computations and/or pairwise post-hoc comparisons.
Either you need to turn off those options or use
the other suggestions appearing in the other 
memory warnings in this section.
---
"CLEAVE: No free memory for Box Approximations"
"CLEAVE: No free memory for Box Trace"
These memory allocation errors say that you are
running out of memory for performing Geisser-Greenhouse
computations. Either you need to turn off this option 
or use the other suggestions appearing in the other 
memory warnings in this section.
---
"CLEAVE: No free memory for Bracket Computations"
"CLEAVE: No free memory for Duplication Array"
"CLEAVE: No free memory for File Column Level Name Parsing"
"CLEAVE: No free memory for Level Names"
"CLEAVE: No free memory for Level Tracking Index"
"CLEAVE: No free memory for Multidimensional Array"
"CLEAVE: No free memory for Number of Levels per Factor"
"CLEAVE: No free memory for Proportionality Single Factor Counts"
"CLEAVE: No free memory for Repeats Instance Factor"
"CLEAVE: No free memory for Temporary Duplication Array"
"CLEAVE: No free memory for Text Column Mask"
Seeing any of the above memory allocation errors means
that you will not be able to perform basic ANOVA tasks
without getting or freeing up more memory somehow.
Make sure that cleave.cnf specifies the lowest numbers
of factors and levels/factor that your data set requires.
---
"CLEAVE: No free memory for Treatment Effects"
"CLEAVE: No free memory for Treatment Errors"
"CLEAVE: No free memory for Treatment List Flags"
These memory allocation errors say that you are
running out of memory for building a treatment magnitude
list. Either you need to turn off this option or use the 
other suggestions appearing in the other memory warnings 
in this section.
---
"CLEAVE: No room for new factor - using only last data 
instances!"
CLEAVE is informing you that it found rows with identical
factor levels (including the subject name) and is saying that
it cannot create a new factor to handle them because you are
already at the limit on the number of factors.
Thus, it will only pay attention to the last instance
of each duplicated row.
---
"CLEAVE: Not enough subjects in data (column 1)!"
CLEAVE only detects that there is one subject, which
makes it kind of hard to do good analysis of variance.
Check to see if column 1 contains all of your subject
labels.
---
"CLEAVE: Said lack of separation explains the ANOVA design 
imbalance"
"CLEAVE: Seems unable to balance the repeated measures 
design"
"CLEAVE: Some subjects fail to keep between factors 
separated"
"CLEAVE: Some within factors are missing some subject 
instances"
"CLEAVE: Subjects missing in design: cannot fix data 
imbalance"
These warnings can appear when the data set that CLEAVE
reads in is not a proportional or balanced design.  The
problems that can be detected include a lack of
separation amongst "between" factors (which suggests
that proper randomization may not have been done) as well
as too few instantiations of within factors, indicating
that some subjects did not do the entire experiment.
Also, some of the warnings say whether the design can
be balanced or not.
---
"CLEAVE: Switching to default Satterthwaite approximation 
due to negative MS"
This appears when a negative MS denominator value is
computed when using the (non-default) Satterthwaite method
which involves subtraction.  The default method is then
automatically invoked to compute a quasi-F value.
---
"CLEAVE: Too little input data"
CLEAVE detects that you have zero multi-level,
non-subject factors.  Check your data set.
---
"CLEAVE: Zero or negative MSerror => Will not use pooled 
variance."
CLEAVE gives this warning in two cases, usually.  Either your
data set has (suspiciously) zero noise, or there are multiple
random variables and you are using the version of
Satterthwaite's approximation which only approximates
the denominator.  Either you should switch to using the
(default) Satterthwaite approximation which approximates both
the numerator and denominator of the F computation, or you
should switch to not using the pooled variance when computing
pairwise post-hoc tests.
---
"CLEAVE: Cannot open ’cleave.cnf’!  Using default values..."
This is when the configuration file cleave.cnf cannot be
found in the current (working) directory.  CLEAVE still works
in this case, but it uses default values.
---
"Completely imbalanced design => F values will not be computed"
CLEAVE cannot handle unbalanced designs when computing F, so
all you can expect to get is the pairwise comparisons.
---
"Presence of within factors => Pairwise comparisons might be 
misleading"
"Within factor interation terms will use pooled error for 
pairwise comparisons"
CLEAVE automatically tries to use pooled error in any pairwise
comparisons it does whenever there are within variables in the
ANOVA design.  If it cannot compute pooled errors, then it
will still do the pairwise comparisons, but it informs the
user of the potential problem (which is an excess of degrees
of freedom in the pairwise comparison error terms).
---
"Random, mixed between-within design => F values will not be 
computed"
When there is a design with random elements and both between 
and within (repeated measures) factors, CLEAVE can only compute
the pairwise comparisons.
---
"Random unbalanced proportional design => F values will not be 
computed"
When there is an unbalanced proportional design with some
non-subject random elements in it, CLEAVE cannot compute
the F values, but only the pairwise comparisons.
---
"Some Geisser-Greenhouse sub-epsilons not computed: interaction 
levels too high"
This is presented when there are random variables and, in
attempting to compute the GG adjusted p value, CLEAVE has
to compute a GG value for an interaction term whose order
exceeds that specified in cleave.cnf.  It is advised to
change the maximal order in cleave.cnf to at least the order
of the term appearing at the head of the line just below the
original computed F value for this source term.
---
7. CONTACT INFORMATION
Please send any bugs or suggestions to me at the email
and/or realmail addresses below.
Timothy Herron
Staff Statistical Curmudgeon
Human Cognitive Neurophysiology Laboratory
Department of Neurology
UC Davis and VANCHCS,
150 Muir Road
Martinez, CA USA 94553
tjherron@ebire.org
1-925-372-2000-x4119