Home > matlabmk > mxwt_perm2.m

mxwt_perm2

PURPOSE ^

function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxwt_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state)

SYNOPSIS ^

function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxwt_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state)

DESCRIPTION ^

function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxwt_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state)

 mxwt_perm2-Two sample permutation test based on Welch's t-statistic and null
 hypothesis of a mean difference of zero.  For one-tailed tests, the null 
 distribution is derived from the most extreme Welch's t-score of all 
 permutations.  For two-tailed tests, the null distribution is derived from 
 the maximum absolute Welch's t-score of all permutations.  This function 
 can handle multiple electrodes and time points.  Based on simulated ERP 
 data sets, Welch's t appears to be less sensitive to differences in 
 variance between groups than the standard independent samples t-statistic 
 (esp. when group sizes are unequal).  However, it is probably less powerful
 than the standard t-statistic.  Thus you should probably use this function
 if the numbers of participants in each group are not equal and you suspect 
 that one group is more variable (from trial to trail or participant to 
 participant) than the other. Note, "Welch's t" is also referred to as the 
 "Aspin-Welch t".  See Murphy (1967) or Zar (1999) for the formula.

 Inputs:
  dataA  - 3D matrix of ERPs from Group A (Channel x Time x Participant)
  dataB  - 3D matrix of ERPs from Group B (Channel x Time x Participant)
  n_perm - number of permutations {default=1000}.  Manly (1997) suggests
           using at least 1000 permutations for an alpha level of 0.05 and 
           at least 5000 permutations for an alpha level of 0.01.
  alph   - test alpha level {default=.05}
  tail   - [1 | 0 | -1] If tail=1, the alternative hypothesis is that the
           mean of data is greater than 0 (upper tailed test).  If tail=0,
           the alternative hypothesis is that the mean of data is different
           than 0 (two tailed test).  If tail=-1, the alternative hypothesis
           is that the mean of the data is less than 0 (lower tailed test).
           {default: 0}

 Optional Inputs:
  verblevel   - An integer specifiying the amount of information you want
                this function to provide about what it is doing during runtime.
                 Options are:
                    0 - quiet, only show errors, warnings, and EEGLAB reports
                    1 - stuff anyone should probably know
                    2 - stuff you should know the first time you start working
                        with a data set {default value}
                    3 - stuff that might help you debug (show all
                        reports)
  seed_state  - The initial state of the random number generating stream
                (see MATLAB documentation for "randstream"). If you pass
                a value from a previous run of this function, it should
                reproduce the exact same values.

 Outputs:
  pval       - p-value at each time point and electrode (corrected for
               multiple comparisons via permutation test)
  t_orig     - Welch's t-score at each time point and electrode
  tmx_ptile  - critical t-scores for given alpha level
  seed_state - The initial state of the random number generating stream
               (see MATLAB documentation for "randstream") used to 
               generate the permutations. You can use this to reproduce
               the output of this function.
  est_alpha  - The estimated family-wise alpha level of the test.  With 
               permutation tests, a finite number of p-values are possible.
               This function tries to use an alpha level that is as close 
               as possible to the desired alpha level.  However, if the 
               sample size is small, a very limited number of p-values are 
               possible and the desired family-wise alpha level may be 
               impossible to acheive.

 Notes:
 -The null hypothesis of this test is that the data in the two groups are
 "exchangeable" (i.e., the data from a participant in Group A was just as
 likely to have come from Group B).  If the distributions of the two
 populations you're comparing differ in *any* way (i.e., not just in
 central tendency), the null hypothesis will be violated.  For example, if
 the ERPs of the two groups differ in their variability, but not their
 average value (e.g., one set of ERPs is from a group of children and the
 other adults), you'll be more likely than alpha to reject the null
 hypothesis.  Using Welch's t ameliorates the test's sensitivity to
 between-group differences in variance, but it does not completely remove
 it.

 -Unlike a parametric test (e.g., an ANOVA), a discrete set of p-values
 are possible (at most the number of possible permutations).  Since the
 number of possible permutations grows rapdily with the number of
 participants, this is only an issue for small sample sizes (e.g., 3
 participants in each group).  When you have such a small sample size, the
 limited number of p-values may make the test overly conservative (e.g., 
 you might be forced to use an alpha level of .0286 since it is the biggest
 possible alpha level less than .05).

 Author:
 David Groppe
 July, 2009
 Kutaslab, San Diego

 References:

 Murphy, B. P. (1967). Some two-sample tests when the variances are unequal:
 A simulation study. Biometrika, 54(3), 679-683. 

 Zar, J. H. (1999). Biostatistical analysis (fourth ed.). Upper Saddle 
 River, New Jersey: Prentice Hall.

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SUBFUNCTIONS ^

SOURCE CODE ^

0001 %function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxwt_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state)
0002 %
0003 % mxwt_perm2-Two sample permutation test based on Welch's t-statistic and null
0004 % hypothesis of a mean difference of zero.  For one-tailed tests, the null
0005 % distribution is derived from the most extreme Welch's t-score of all
0006 % permutations.  For two-tailed tests, the null distribution is derived from
0007 % the maximum absolute Welch's t-score of all permutations.  This function
0008 % can handle multiple electrodes and time points.  Based on simulated ERP
0009 % data sets, Welch's t appears to be less sensitive to differences in
0010 % variance between groups than the standard independent samples t-statistic
0011 % (esp. when group sizes are unequal).  However, it is probably less powerful
0012 % than the standard t-statistic.  Thus you should probably use this function
0013 % if the numbers of participants in each group are not equal and you suspect
0014 % that one group is more variable (from trial to trail or participant to
0015 % participant) than the other. Note, "Welch's t" is also referred to as the
0016 % "Aspin-Welch t".  See Murphy (1967) or Zar (1999) for the formula.
0017 %
0018 % Inputs:
0019 %  dataA  - 3D matrix of ERPs from Group A (Channel x Time x Participant)
0020 %  dataB  - 3D matrix of ERPs from Group B (Channel x Time x Participant)
0021 %  n_perm - number of permutations {default=1000}.  Manly (1997) suggests
0022 %           using at least 1000 permutations for an alpha level of 0.05 and
0023 %           at least 5000 permutations for an alpha level of 0.01.
0024 %  alph   - test alpha level {default=.05}
0025 %  tail   - [1 | 0 | -1] If tail=1, the alternative hypothesis is that the
0026 %           mean of data is greater than 0 (upper tailed test).  If tail=0,
0027 %           the alternative hypothesis is that the mean of data is different
0028 %           than 0 (two tailed test).  If tail=-1, the alternative hypothesis
0029 %           is that the mean of the data is less than 0 (lower tailed test).
0030 %           {default: 0}
0031 %
0032 % Optional Inputs:
0033 %  verblevel   - An integer specifiying the amount of information you want
0034 %                this function to provide about what it is doing during runtime.
0035 %                 Options are:
0036 %                    0 - quiet, only show errors, warnings, and EEGLAB reports
0037 %                    1 - stuff anyone should probably know
0038 %                    2 - stuff you should know the first time you start working
0039 %                        with a data set {default value}
0040 %                    3 - stuff that might help you debug (show all
0041 %                        reports)
0042 %  seed_state  - The initial state of the random number generating stream
0043 %                (see MATLAB documentation for "randstream"). If you pass
0044 %                a value from a previous run of this function, it should
0045 %                reproduce the exact same values.
0046 %
0047 % Outputs:
0048 %  pval       - p-value at each time point and electrode (corrected for
0049 %               multiple comparisons via permutation test)
0050 %  t_orig     - Welch's t-score at each time point and electrode
0051 %  tmx_ptile  - critical t-scores for given alpha level
0052 %  seed_state - The initial state of the random number generating stream
0053 %               (see MATLAB documentation for "randstream") used to
0054 %               generate the permutations. You can use this to reproduce
0055 %               the output of this function.
0056 %  est_alpha  - The estimated family-wise alpha level of the test.  With
0057 %               permutation tests, a finite number of p-values are possible.
0058 %               This function tries to use an alpha level that is as close
0059 %               as possible to the desired alpha level.  However, if the
0060 %               sample size is small, a very limited number of p-values are
0061 %               possible and the desired family-wise alpha level may be
0062 %               impossible to acheive.
0063 %
0064 % Notes:
0065 % -The null hypothesis of this test is that the data in the two groups are
0066 % "exchangeable" (i.e., the data from a participant in Group A was just as
0067 % likely to have come from Group B).  If the distributions of the two
0068 % populations you're comparing differ in *any* way (i.e., not just in
0069 % central tendency), the null hypothesis will be violated.  For example, if
0070 % the ERPs of the two groups differ in their variability, but not their
0071 % average value (e.g., one set of ERPs is from a group of children and the
0072 % other adults), you'll be more likely than alpha to reject the null
0073 % hypothesis.  Using Welch's t ameliorates the test's sensitivity to
0074 % between-group differences in variance, but it does not completely remove
0075 % it.
0076 %
0077 % -Unlike a parametric test (e.g., an ANOVA), a discrete set of p-values
0078 % are possible (at most the number of possible permutations).  Since the
0079 % number of possible permutations grows rapdily with the number of
0080 % participants, this is only an issue for small sample sizes (e.g., 3
0081 % participants in each group).  When you have such a small sample size, the
0082 % limited number of p-values may make the test overly conservative (e.g.,
0083 % you might be forced to use an alpha level of .0286 since it is the biggest
0084 % possible alpha level less than .05).
0085 %
0086 % Author:
0087 % David Groppe
0088 % July, 2009
0089 % Kutaslab, San Diego
0090 %
0091 % References:
0092 %
0093 % Murphy, B. P. (1967). Some two-sample tests when the variances are unequal:
0094 % A simulation study. Biometrika, 54(3), 679-683.
0095 %
0096 % Zar, J. H. (1999). Biostatistical analysis (fourth ed.). Upper Saddle
0097 % River, New Jersey: Prentice Hall.
0098 
0099 %%%%%%%%%%%%%%%% REVISION LOG %%%%%%%%%%%%%%%%%
0100 %
0101 
0102 
0103 %%%%%%%%%%%%%%%% FUTURE IMPROVEMENTS %%%%%%%%%%%%%%%%%
0104 %
0105 % Test for equality of trials or variance?
0106 
0107 function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxwt_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state)
0108 
0109 if nargin<2,
0110     error('You need to provide data for two groups of participants.');
0111 end
0112 
0113 if nargin<3,
0114     n_perm=2000;
0115 end
0116 
0117 if nargin<4,
0118     alph=.05;
0119 elseif (alph>=1) || (alph<=0)
0120     error('Argument ''alph'' needs to be between 0 and 1.');
0121 end
0122 
0123 if alph<=.01 && n_perm<5000,
0124     watchit(sprintf('You are probably using too few permutations for an alpha level of %f. Type ">>help mxwt_perm2" for more info.',alph));
0125 elseif alph<=.05 && n_perm<1000,
0126     watchit(sprintf('You are probably using too few permutations for an alpha level of %f. Type ">>help mxwt_perm2" for more info.',alph));
0127 end
0128 
0129 if nargin<5,
0130     tail=0;
0131 elseif (tail~=0) && (tail~=1) && (tail~=-1),
0132     error('Argument ''tail'' needs to be 0,1, or -1.');
0133 end
0134 
0135 if nargin<6,
0136     verblevel=2;
0137 end
0138 
0139 defaultStream=RandStream.getDefaultStream; %random # generator state
0140 if (nargin<7) || isempty(seed_state),
0141     %Store state of random number generator
0142     seed_state=defaultStream.State;
0143 else
0144     defaultStream.State=seed_state; %reset random number generator to saved state
0145 end
0146 
0147 
0148 if length(size(dataA))~=3 || length(size(dataB))~=3 
0149     error('dataA and dataB need to be three dimensional (chan x time x participant)')
0150 end
0151 [n_chan, n_tpt, n_subsA]=size(dataA);
0152 [n_chan2, n_tpt2, n_subsB]=size(dataB);
0153 
0154 if verblevel,
0155     warning('off','all'); %for large # of subjects, nchoosek warns that its result is approximate
0156     n_psbl_prms=nchoosek(n_subsA+n_subsB,n_subsA);
0157     if n_psbl_prms<100,
0158         watchit(sprintf(['Due to the very limited number of participants in each group,' ...
0159             ' the total number of possible permutations is small.\nThus only a limited number of p-values (at most %d) are possible and the test might be overly conservative.'], ...
0160             n_psbl_prms));
0161     end
0162     warning('on','all');
0163 end
0164 
0165 if n_chan~=n_chan2,
0166     error('The number of channels in Group 1 (dataA) and Group 2 (dataB) need to be equal.');
0167 elseif n_tpt~=n_tpt2,
0168     error('The number of time points in Group 1 (dataA) and Group 2 (dataB) need to be equal.');
0169 end
0170 %combine data
0171 total_subs=n_subsA+n_subsB;
0172 data=dataA; 
0173 data(:,:,(n_subsA+1):total_subs)=dataB;
0174 
0175 if verblevel~=0,
0176     fprintf('mxt_perm2: Number of channels: %d\n',n_chan);
0177     fprintf('mxt_perm2: Number of time points: %d\n',n_tpt);
0178     fprintf('mxt_perm2: Total # of comparisons: %d\n',n_chan*n_tpt);
0179     fprintf('mxt_perm2: Number of participants in Group A: %d\n',n_subsA);
0180     fprintf('mxt_perm2: Number of participants in Group B: %d\n',n_subsB);
0181     %fprintf('t-score degrees of freedom: %d\n',total_subs-2); %edit for Welch's t?
0182 end
0183 VerbReport(sprintf('Executing permutation test with %d permutations...',n_perm),2,verblevel);
0184 if (verblevel>=2),
0185     fprintf('Permutations completed: ');
0186 end
0187 
0188 mx_t=zeros(1,n_perm);
0189 for perm=1:n_perm
0190     if ~rem(perm,100)
0191         if (verblevel>=2),
0192             if ~rem(perm-100,1000)
0193                 fprintf('%d',perm);
0194             else
0195                 fprintf(', %d',perm);
0196             end
0197             if ~rem(perm,1000)
0198                 fprintf('\n');
0199             end
0200         end
0201     end
0202     %randomly assign participants to conditions
0203     r=randperm(total_subs);
0204     grp1=r(1:n_subsA);
0205     grp2=r((n_subsA+1):total_subs);
0206     %compute most extreme Welch's t-score
0207     mx_t(perm)=wtmax2(data,grp1,grp2,n_subsA,n_subsB);
0208 end
0209     
0210 %End of permutations, print carriage return if it hasn't already been done
0211 %(i.e., perm is NOT a multiple of 1000)
0212 if (verblevel>=2) && rem(perm,1000)
0213     fprintf('\n');
0214 end
0215 
0216 %Compute critical Welch's t(s)
0217 if tail==0,
0218     %two-tailed, test statistic is biggest absolute value of all t's
0219     mx_t=abs(mx_t);
0220     tmx_ptile(2)=prctile(mx_t,100-100*alph);
0221     tmx_ptile(1)=-tmx_ptile(2);
0222     est_alpha=mean(mx_t>=tmx_ptile(2));
0223 elseif tail==1,
0224     %upper tailed
0225     tmx_ptile=prctile(mx_t,100-100*alph);
0226     est_alpha=mean(mx_t>=tmx_ptile);
0227 else
0228     %tail=-1, lower tailed
0229     tmx_ptile=prctile(mx_t,alph*100);
0230     est_alpha=mean(mx_t<=tmx_ptile);
0231 end
0232 if verblevel~=0,
0233     fprintf('Desired family-wise alpha level: %f\n',alph);
0234     fprintf('Estimated actual family-wise alpha level: %f\n',est_alpha);
0235 end
0236 
0237 %Compute t-scores of actual observations
0238 [dummy t_orig]=wtmax2(data,1:n_subsA,(n_subsA+1):total_subs,n_subsA,n_subsB);
0239 pval=zeros(n_chan,n_tpt);
0240 %compute p-values
0241 for t=1:n_tpt,
0242     for c=1:n_chan,
0243         if tail==0,
0244             pval(c,t)=mean(mx_t>=abs(t_orig(c,t))); %note mx_t are now all positive due to abs command above
0245         elseif tail==1,
0246             pval(c,t)=mean(mx_t>=t_orig(c,t));
0247         elseif tail==-1,
0248             pval(c,t)=mean(mx_t<=t_orig(c,t));
0249         end
0250     end
0251 end
0252 
0253 
0254 function [mx_t, all_t]=wtmax2(dat,grpA,grpB,n_subsA,n_subsB)
0255 % might make this faster by moving it into the code
0256 
0257 xA=dat(:,:,grpA);
0258 xB=dat(:,:,grpB);
0259 
0260 smA=sum(xA,3);
0261 mnA=smA/n_subsA;
0262 varA=(sum(xA.^2,3)-smA.^2/n_subsA)/(n_subsA-1);
0263 
0264 smB=sum(xB,3);
0265 mnB=smB/n_subsB;
0266 varB=(sum(xB.^2,3)-smB.^2/n_subsB)/(n_subsB-1);
0267 
0268 all_t=(mnA-mnB)./sqrt( (varA/n_subsA) + (varB/n_subsB));
0269 [mx1 id_row]=max(abs(all_t));
0270 [mx2 id_col]=max(mx1);
0271 mx_t=all_t(id_row(id_col),id_col);
0272

Generated on Tue 10-May-2016 16:37:45 by m2html © 2005