Home > matlabmk > mxt_perm2old_matlab.m

mxt_perm2old_matlab

PURPOSE ^

function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxt_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state,freq_domain)

SYNOPSIS ^

function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxt_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state,freq_domain)

DESCRIPTION ^

function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxt_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state,freq_domain)

 mxt_perm2-Two sample permutation test based on a t-statistic and null
 hypothesis of a mean difference of zero.  For one-tailed tests, the null distribution
 is derived from the most extreme t-score of all permutations.  For two-tailed
 tests, the null distribution is derived from the maximum absolute t-score of
 all permutations.  Can handle multiple electrodes and time
 points. According to Hemmelammnn et al. (2004), the t_max permutation
 test has relatively good power for multivariate data whose dimensions are
 highly correlated (like EEG/ERP data).  See Maris (2004) for an
 explanation of how permutation tests like this control for multiple
 comparisons.  Note, Maris uses Hotelling's multivariate T^2 statistic in
 his paper (instead of the t_max used here) and he advocated performing
 the tests over multiple time points (e.g., a 10 ms time window).
 However, I found that the T^2 statistic did not have much power when the
 sample size was near the number of electrodes and the results of the test
 could vary greatly depending on the size of the time window.

 Required Inputs:
  dataA  - 3D matrix of ERPs from Group A (Channel x Time x Participant)
  dataB  - 3D matrix of ERPs from Group B (Channel x Time x Participant)

 Optional Inputs:
  n_perm - number of permutations {default=1000}.  Manly (1997) suggests
           using at least 1000 permutations for an alpha level of 0.05 and 
           at least 5000 permutations for an alpha level of 0.01.
  alph   - test alpha level {default=.05}
  tail   - [1 | 0 | -1] If tail=1, the alternative hypothesis is that the
           means of Group A are greater than those of Group B (upper tailed test).  
           If tail=0, the alternative hypothesis is that the means of the 
           two groups are different (two tailed test).  If tail=-1, the 
           alternative hypothesis is that the means of Group A are less 
           than those of Group B (lower tailed test). {default: 0}
  verblevel   - An integer specifiying the amount of information you want
                this function to provide about what it is doing during runtime.
                 Options are:
                    0 - quiet, only show errors, warnings, and EEGLAB reports
                    1 - stuff anyone should probably know
                    2 - stuff you should know the first time you start working
                        with a data set {default value}
                    3 - stuff that might help you debug (show all
                        reports)
  seed_state  - The initial state of the random number generating stream
                (see MATLAB documentation for "randstream"). If you pass
                a value from a previous run of this function, it should
                reproduce the exact same values.
  freq_domain - If 0, command line report will be given in temporal units
                (e.g. time points).  Otherwise, the report will be given 
                in frequency domain units (e.g., frequencies). {default:
                0}

 Outputs:
  pval       - p-value at each time point and electrode (corrected for
                multiple comparisons via permutation test)
  t_orig     - t-score at each time point and electrode
  tmx_ptile  - critical t-scores for given alpha level
  seed_state - The initial state of the random number generating stream
               (see MATLAB documentation for "randstream") used to 
               generate the permutations. You can use this to reproduce
               the output of this function.
  est_alpha  - The estimated family-wise alpha level of the test.  With 
               permutation tests, a finite number of p-values are possible.
               This function tries to use an alpha level that is as close 
               as possible to the desired alpha level.  However, if the 
               sample size is small, a very limited number of p-values are 
               possible and the desired family-wise alpha level may be 
               impossible to acheive.

 Notes:
 -The null hypothesis of this test is that the data in the two groups are
 "exchangeable" (i.e., the data from a participant in Group A was just as
 likely to have come from Group B).  If the distributions of the two
 populations you're comparing differ in *any* way (i.e., not just in
 central tendency), the null hypothesis will be violated.  For example, if
 the ERPs of the two groups differ in their variability, but not their
 average value (e.g., one set of ERPs is from a group of children and the
 other adults), you'll be more likely than alpha to reject the null
 hypothesis.

 -Unlike a parametric test (e.g., an ANOVA), a discrete set of p-values
 are possible (at most the number of possible permutations).  Since the
 number of possible permutations grows rapdily with the number of
 participants, this is only an issue for small sample sizes (e.g., 3
 participants in each group).  When you have such a small sample size, the
 limited number of p-values may make the test overly conservative (e.g., 
 you might be forced to use an alpha level of .0286 since it is the biggest
 possible alpha level less than .05).

 Author:
 David Groppe
 March, 2009
 Kutaslab, San Diego

 References:

 Hemmelmann, et al. (2004) Multivariate tests for the evaluation of
 high-dimensional EEG data. Journal of Neuroscience Methods.

 Maris, E. (2004) Randomization tests for ERP topographies and whole 
 spatiotemporal data matrices. Psychophysiology

 Manly, B.F.J. (1997) Randomization, bootstrap, and Monte Carlo methods in
 biology. 2nd ed. Chapmn and Hall, London.

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SUBFUNCTIONS ^

SOURCE CODE ^

0001 %function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxt_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state,freq_domain)
0002 %
0003 % mxt_perm2-Two sample permutation test based on a t-statistic and null
0004 % hypothesis of a mean difference of zero.  For one-tailed tests, the null distribution
0005 % is derived from the most extreme t-score of all permutations.  For two-tailed
0006 % tests, the null distribution is derived from the maximum absolute t-score of
0007 % all permutations.  Can handle multiple electrodes and time
0008 % points. According to Hemmelammnn et al. (2004), the t_max permutation
0009 % test has relatively good power for multivariate data whose dimensions are
0010 % highly correlated (like EEG/ERP data).  See Maris (2004) for an
0011 % explanation of how permutation tests like this control for multiple
0012 % comparisons.  Note, Maris uses Hotelling's multivariate T^2 statistic in
0013 % his paper (instead of the t_max used here) and he advocated performing
0014 % the tests over multiple time points (e.g., a 10 ms time window).
0015 % However, I found that the T^2 statistic did not have much power when the
0016 % sample size was near the number of electrodes and the results of the test
0017 % could vary greatly depending on the size of the time window.
0018 %
0019 % Required Inputs:
0020 %  dataA  - 3D matrix of ERPs from Group A (Channel x Time x Participant)
0021 %  dataB  - 3D matrix of ERPs from Group B (Channel x Time x Participant)
0022 %
0023 % Optional Inputs:
0024 %  n_perm - number of permutations {default=1000}.  Manly (1997) suggests
0025 %           using at least 1000 permutations for an alpha level of 0.05 and
0026 %           at least 5000 permutations for an alpha level of 0.01.
0027 %  alph   - test alpha level {default=.05}
0028 %  tail   - [1 | 0 | -1] If tail=1, the alternative hypothesis is that the
0029 %           means of Group A are greater than those of Group B (upper tailed test).
0030 %           If tail=0, the alternative hypothesis is that the means of the
0031 %           two groups are different (two tailed test).  If tail=-1, the
0032 %           alternative hypothesis is that the means of Group A are less
0033 %           than those of Group B (lower tailed test). {default: 0}
0034 %  verblevel   - An integer specifiying the amount of information you want
0035 %                this function to provide about what it is doing during runtime.
0036 %                 Options are:
0037 %                    0 - quiet, only show errors, warnings, and EEGLAB reports
0038 %                    1 - stuff anyone should probably know
0039 %                    2 - stuff you should know the first time you start working
0040 %                        with a data set {default value}
0041 %                    3 - stuff that might help you debug (show all
0042 %                        reports)
0043 %  seed_state  - The initial state of the random number generating stream
0044 %                (see MATLAB documentation for "randstream"). If you pass
0045 %                a value from a previous run of this function, it should
0046 %                reproduce the exact same values.
0047 %  freq_domain - If 0, command line report will be given in temporal units
0048 %                (e.g. time points).  Otherwise, the report will be given
0049 %                in frequency domain units (e.g., frequencies). {default:
0050 %                0}
0051 %
0052 % Outputs:
0053 %  pval       - p-value at each time point and electrode (corrected for
0054 %                multiple comparisons via permutation test)
0055 %  t_orig     - t-score at each time point and electrode
0056 %  tmx_ptile  - critical t-scores for given alpha level
0057 %  seed_state - The initial state of the random number generating stream
0058 %               (see MATLAB documentation for "randstream") used to
0059 %               generate the permutations. You can use this to reproduce
0060 %               the output of this function.
0061 %  est_alpha  - The estimated family-wise alpha level of the test.  With
0062 %               permutation tests, a finite number of p-values are possible.
0063 %               This function tries to use an alpha level that is as close
0064 %               as possible to the desired alpha level.  However, if the
0065 %               sample size is small, a very limited number of p-values are
0066 %               possible and the desired family-wise alpha level may be
0067 %               impossible to acheive.
0068 %
0069 % Notes:
0070 % -The null hypothesis of this test is that the data in the two groups are
0071 % "exchangeable" (i.e., the data from a participant in Group A was just as
0072 % likely to have come from Group B).  If the distributions of the two
0073 % populations you're comparing differ in *any* way (i.e., not just in
0074 % central tendency), the null hypothesis will be violated.  For example, if
0075 % the ERPs of the two groups differ in their variability, but not their
0076 % average value (e.g., one set of ERPs is from a group of children and the
0077 % other adults), you'll be more likely than alpha to reject the null
0078 % hypothesis.
0079 %
0080 % -Unlike a parametric test (e.g., an ANOVA), a discrete set of p-values
0081 % are possible (at most the number of possible permutations).  Since the
0082 % number of possible permutations grows rapdily with the number of
0083 % participants, this is only an issue for small sample sizes (e.g., 3
0084 % participants in each group).  When you have such a small sample size, the
0085 % limited number of p-values may make the test overly conservative (e.g.,
0086 % you might be forced to use an alpha level of .0286 since it is the biggest
0087 % possible alpha level less than .05).
0088 %
0089 % Author:
0090 % David Groppe
0091 % March, 2009
0092 % Kutaslab, San Diego
0093 %
0094 % References:
0095 %
0096 % Hemmelmann, et al. (2004) Multivariate tests for the evaluation of
0097 % high-dimensional EEG data. Journal of Neuroscience Methods.
0098 %
0099 % Maris, E. (2004) Randomization tests for ERP topographies and whole
0100 % spatiotemporal data matrices. Psychophysiology
0101 %
0102 % Manly, B.F.J. (1997) Randomization, bootstrap, and Monte Carlo methods in
0103 % biology. 2nd ed. Chapmn and Hall, London.
0104 
0105 %%%%%%%%%%%%%%%% REVISION LOG %%%%%%%%%%%%%%%%%
0106 %
0107 % 3/26/2010-fixed critical t-scores (tmx_ptiles) for two-tailed test.  I was
0108 % allowing them to be asymmetrical but this differed from how I was
0109 % computing p-values.  It turns out the way I was computing p-values (from
0110 % Hemmelmann) was effectively using the absolute value of all the t-scores.
0111 % Thus the critical t scores should be symmetrical.
0112 %
0113 % 4/1/2010-Estimated alpha level added
0114 %
0115 % 12/14/2010-freq_domain optional input added
0116 
0117 %This code has an appropriate false positive rate when run on simulated
0118 %data at one time point:
0119 %
0120 % tested with script: sim_mxt_perm2.m
0121 % n_sim=4000, n_perm=3000, data=2x3, 16 participants per group, Gaussian, alpha=.05
0122 % Two-tailed test: Empirical alpha rate: 0.0495 (SE=.0034)
0123 % Upper tailed test: Empirical alpha rate: 0.0503 (SE=0.0035)
0124 % Lower tailed test: Empirical alpha rate: 0.0500 (SE=.0034)
0125 %
0126 % n_sim=4000, n_perm=3000, data=2x3, 16 participants Group A, 20 Group B, Gaussian, alpha=.05
0127 % Lower tailed test: Empirical alpha rate: 0.0502 (SE=.0035)
0128 % Upper tailed test: Empirical alpha rate: 0.0483 (SE=.0034)
0129 
0130 %%%%%%%%%%%%%%%% FUTURE IMPROVEMENTS %%%%%%%%%%%%%%%%%
0131 %
0132 % -Test for equality of trials or variance?
0133 % -Add alternate test statistics?
0134 
0135 function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxt_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state,freq_domain)
0136 
0137 if nargin<2,
0138     error('You need to provide data for two groups of participants.');
0139 end
0140 
0141 if nargin<3,
0142     n_perm=2000;
0143 end
0144 
0145 if nargin<4,
0146     alph=.05;
0147 elseif (alph>=1) || (alph<=0)
0148     error('Argument ''alph'' needs to be between 0 and 1.');
0149 end
0150 
0151 if alph<=.01 && n_perm<5000,
0152     watchit(sprintf('You are probably using too few permutations for an alpha level of %f. Type ">>help mxt_perm2" for more info.',alph));
0153 elseif alph<=.05 && n_perm<1000,
0154     watchit(sprintf('You are probably using too few permutations for an alpha level of %f. Type ">>help mxt_perm2" for more info.',alph));
0155 end
0156 
0157 if nargin<5,
0158     tail=0;
0159 elseif (tail~=0) && (tail~=1) && (tail~=-1),
0160     error('Argument ''tail'' needs to be 0,1, or -1.');
0161 end
0162 
0163 if nargin<6,
0164     verblevel=2;
0165 end
0166 
0167 if 0,
0168     defaultStream=RandStream.getDefaultStream; %random # generator state
0169     if (nargin<7) || isempty(seed_state),
0170         %Store state of random number generator
0171         seed_state=defaultStream.State;
0172     else
0173         defaultStream.State=seed_state; %reset random number generator to saved state
0174     end
0175 else
0176     fprintf('Not seeding random number generator.\n');
0177     seed_state=NaN;
0178 end
0179 
0180 if (nargin<8),
0181    freq_domain=0; 
0182 end
0183 
0184 if length(size(dataA))~=3 || length(size(dataB))~=3 
0185     error('dataA and dataB need to be three dimensional (chan x time x participant)')
0186 end
0187 [n_chan, n_tpt, n_subsA]=size(dataA);
0188 [n_chan2, n_tpt2, n_subsB]=size(dataB);
0189 
0190 if verblevel,
0191     warning('off','all'); %for large # of subjects, nchoosek warns that its result is approximate
0192     n_psbl_prms=nchoosek(n_subsA+n_subsB,n_subsA);
0193     if n_psbl_prms<100,
0194         watchit(sprintf(['Due to the very limited number of participants in each group,' ...
0195             ' the total number of possible permutations is small.\nThus only a limited number of p-values (at most %d) are possible and the test might be overly conservative.'], ...
0196             n_psbl_prms));
0197     end
0198     warning('on','all');
0199 end
0200 
0201 if n_chan~=n_chan2,
0202     error('The number of channels in Group 1 (dataA) and Group 2 (dataB) need to be equal.');
0203 elseif n_tpt~=n_tpt2,
0204     error('The number of time points in Group 1 (dataA) and Group 2 (dataB) need to be equal.');
0205 end
0206 %combine data
0207 total_subs=n_subsA+n_subsB;
0208 data=dataA; 
0209 data(:,:,(n_subsA+1):total_subs)=dataB;
0210 
0211 if verblevel~=0,
0212     fprintf('mxt_perm2: Number of channels: %d\n',n_chan);
0213     if freq_domain,
0214         fprintf('mxt_perm2: Number of frequencies: %d\n',n_tpt);
0215     else
0216         fprintf('mxt_perm2: Number of time points: %d\n',n_tpt);
0217     end
0218     fprintf('mxt_perm2: Total # of comparisons: %d\n',n_chan*n_tpt);
0219     fprintf('mxt_perm2: Number of participants in Group A: %d\n',n_subsA);
0220     fprintf('mxt_perm2: Number of participants in Group B: %d\n',n_subsB);
0221     fprintf('t-score degrees of freedom: %d\n',total_subs-2); %edit for alt t-test??
0222 end
0223 VerbReport(sprintf('Executing permutation test with %d permutations...',n_perm),2,verblevel);
0224 if (verblevel>=2),
0225     fprintf('Permutations completed: ');
0226 end
0227 
0228 % Factors that are used to compute t-scores.  Saves time to compute them
0229 % now rather than to compute them anew for each permutation.
0230 df=n_subsA+n_subsB-2;
0231 mult_fact=(n_subsA+n_subsB)/(n_subsA*n_subsB);
0232 
0233 mx_t=zeros(1,n_perm);
0234 for perm=1:n_perm
0235     if ~rem(perm,100)
0236         if (verblevel>=2),
0237             if ~rem(perm-100,1000)
0238                 fprintf('%d',perm);
0239             else
0240                 fprintf(', %d',perm);
0241             end
0242             if ~rem(perm,1000)
0243                 fprintf('\n');
0244             end
0245         end
0246     end
0247     %randomly assign participants to conditions
0248     r=randperm(total_subs);
0249     grp1=r(1:n_subsA);
0250     grp2=r((n_subsA+1):total_subs);
0251     %compute most extreme t-score
0252     mx_t(perm)=tmax2(data,grp1,grp2,n_subsA,n_subsB,df,mult_fact);
0253 end
0254     
0255 %End of permutations, print carriage return if it hasn't already been done
0256 %(i.e., perm is NOT a multiple of 1000)
0257 if (verblevel>=2) && rem(perm,1000)
0258     fprintf('\n');
0259 end
0260 
0261 %Compute critical t's
0262 if tail==0,
0263     %two-tailed, test statistic is biggest absolute value of all t's
0264     mx_t=abs(mx_t);
0265     tmx_ptile(2)=prctile(mx_t,100-100*alph);
0266     tmx_ptile(1)=-tmx_ptile(2);
0267     est_alpha=mean(mx_t>=tmx_ptile(2));
0268 elseif tail==1,
0269     %upper tailed
0270     tmx_ptile=prctile(mx_t,100-100*alph);
0271     est_alpha=mean(mx_t>=tmx_ptile);
0272 else
0273     %tail=-1, lower tailed
0274     tmx_ptile=prctile(mx_t,alph*100);
0275     est_alpha=mean(mx_t<=tmx_ptile);
0276 end
0277 if verblevel~=0,
0278     fprintf('Desired family-wise alpha level: %f\n',alph);
0279     fprintf('Estimated actual family-wise alpha level: %f\n',est_alpha);
0280 end
0281 
0282 %Compute t-scores of actual observations
0283 [dummy t_orig]=tmax2(data,1:n_subsA,(n_subsA+1):total_subs,n_subsA,n_subsB,df,mult_fact);
0284 pval=zeros(n_chan,n_tpt);
0285 %compute p-values
0286 for t=1:n_tpt,
0287     for c=1:n_chan,
0288         if tail==0,
0289             pval(c,t)=mean(mx_t>=abs(t_orig(c,t))); %note mx_t are now all positive due to abs command above
0290         elseif tail==1,
0291             pval(c,t)=mean(mx_t>=t_orig(c,t));
0292         elseif tail==-1,
0293             pval(c,t)=mean(mx_t<=t_orig(c,t));
0294         end
0295     end
0296 end
0297 
0298 
0299 function [mx_t, all_t]=tmax2(dat,grp1,grp2,n_subsA,n_subsB,df,mult_fact)
0300 % might make this faster by moving it into the code
0301 
0302 x1=dat(:,:,grp1);
0303 x2=dat(:,:,grp2);
0304 
0305 sm1=sum(x1,3);
0306 mn1=sm1/n_subsA;
0307 ss1=sum(x1.^2,3)-(sm1.^2)/n_subsA;
0308 
0309 sm2=sum(x2,3);
0310 mn2=sm2/n_subsB;
0311 ss2=sum(x2.^2,3)-(sm2.^2)/n_subsB;
0312 
0313 pooled_var=(ss1+ss2)/df;
0314 stder=sqrt(pooled_var*mult_fact);
0315 
0316 all_t=(mn1-mn2)./stder;
0317 [mx1 id_row]=max(abs(all_t));
0318 [mx2 id_col]=max(mx1);
0319 mx_t=all_t(id_row(id_col),id_col);
0320 
0321

Generated on Tue 10-May-2016 16:37:45 by m2html © 2005