function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxt_perm1(data,n_perm,alph,tail,verblevel,seed_state,freq_domain) mxt_perm1-One sample permutation test based on a t-statistic and null hypothesis of a mean of zero. Can handle multiple electrodes and time points/frequencies. According to Hemmelammnn et al. (2004), the t_max permutation test has relatively good power for multivariate data whose dimensions are highly correlated (like EEG/ERP data). See Maris (2004) for an explanation of how permutation tests like this control for multiple comparisons. Note, Maris uses Hotelling's multivariate T^2 statistic in his paper (instead of the t_max used here) and he advocated performing the tests over multiple time points (e.g., a 10 ms time window). However, our lab found that the T^2 statistic did not have much power when the sample size was near the number of electrodes and the results of the test could vary greatly depending on the size of the time window. Inputs: data - 3D matrix of data (Channel x Time x Participant) Optional Inputs: n_perm - number of permutations {default=2000}. Manly (1997) suggests using at least 1000 permutations for an alpha level of 0.05 and at least 5000 permutations for an alpha level of 0.01. alph - test alpha level {default=.05} tail - [1 | 0 | -1] If tail=1, the alternative hypothesis is that the mean of the data is greater than 0 (upper tailed test). If tail=0, the alternative hypothesis is that the mean of the data is different than 0 (two tailed test). If tail=-1, the alternative hypothesis is that the mean of the data is less than 0 (lower tailed test). {default: 0} verblevel - An integer specifying the amount of information you want this function to provide about what it is doing during runtime. Options are: 0 - quiet, only show errors, warnings, and EEGLAB reports 1 - stuff anyone should probably know 2 - stuff you should know the first time you start working with a data set {default value} 3 - stuff that might help you debug (show all reports) seed_state - The initial state of the random number generating stream (see MATLAB documentation for "randstream"). If you pass a value from a previous run of this function, it should reproduce the exact same values. freq_domain - If 0, command line report will be given in temporal units (e.g. time points). Otherwise, the report will be given in frequency domain units (e.g., frequencies). {default: 0} Outputs: pval - p-value at each time point and electrode (corrected for multiple comparisons via permutation test) t_orig - t-score at each time point and electrode tmx_ptile - critical t-scores for given alpha level seed_state - The initial state of the random number generating stream (see MATLAB documentation for "randstream") used to generate the permutations. You can use this to reproduce the output of this function. est_alpha - The estimated family-wise alpha level of the test. With permutation tests, a finite number of p-values are possible. This function tries to use an alpha level that is as close as possible to the desired alpha level. However, if the sample size is small, a very limited number of p-values are possible and the desired family-wise alpha level may be impossible to approximately achieve. Notes: -Unlike a parametric test (e.g., an ANOVA), a discrete set of p-values are possible (at most the number of possible permutations). Since the number of possible permutations grows rapidly with the number of participants, this is only an issue for small sample sizes (e.g., 6 participants). When you have such a small sample size, the limited number of p-values may make the test overly conservative (e.g., you might be forced to use an alpha level of .0286 since it is the biggest possible alpha level less than .05). Author: David Groppe June, 2009 Kutaslab, San Diego References: Hemmelmann, et al. (2004) Multivariate tests for the evaluation of high-dimensional EEG data. Journal of Neuroscience Methods. Maris, E. (2004) Randomization tests for ERP topographies and whole spatiotemporal data matrices. Psychophysiology Manly, B.F.J. (1997) Randomization, bootstrap, and Monte Carlo methods in biology. 2nd ed. Chapmn and Hall, London.
0001 %function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxt_perm1(data,n_perm,alph,tail,verblevel,seed_state,freq_domain) 0002 % 0003 % mxt_perm1-One sample permutation test based on a t-statistic and null 0004 % hypothesis of a mean of zero. Can handle multiple electrodes and time 0005 % points/frequencies. According to Hemmelammnn et al. (2004), the t_max permutation 0006 % test has relatively good power for multivariate data whose dimensions are 0007 % highly correlated (like EEG/ERP data). See Maris (2004) for an 0008 % explanation of how permutation tests like this control for multiple 0009 % comparisons. Note, Maris uses Hotelling's multivariate T^2 statistic in 0010 % his paper (instead of the t_max used here) and he advocated performing 0011 % the tests over multiple time points (e.g., a 10 ms time window). 0012 % However, our lab found that the T^2 statistic did not have much power when the 0013 % sample size was near the number of electrodes and the results of the test 0014 % could vary greatly depending on the size of the time window. 0015 % 0016 % Inputs: 0017 % data - 3D matrix of data (Channel x Time x Participant) 0018 % 0019 % Optional Inputs: 0020 % n_perm - number of permutations {default=2000}. Manly (1997) suggests 0021 % using at least 1000 permutations for an alpha level of 0.05 and 0022 % at least 5000 permutations for an alpha level of 0.01. 0023 % alph - test alpha level {default=.05} 0024 % tail - [1 | 0 | -1] If tail=1, the alternative hypothesis is that the 0025 % mean of the data is greater than 0 (upper tailed test). If tail=0, 0026 % the alternative hypothesis is that the mean of the data is different 0027 % than 0 (two tailed test). If tail=-1, the alternative hypothesis 0028 % is that the mean of the data is less than 0 (lower tailed test). 0029 % {default: 0} 0030 % verblevel - An integer specifying the amount of information you want 0031 % this function to provide about what it is doing during runtime. 0032 % Options are: 0033 % 0 - quiet, only show errors, warnings, and EEGLAB reports 0034 % 1 - stuff anyone should probably know 0035 % 2 - stuff you should know the first time you start working 0036 % with a data set {default value} 0037 % 3 - stuff that might help you debug (show all 0038 % reports) 0039 % seed_state - The initial state of the random number generating stream 0040 % (see MATLAB documentation for "randstream"). If you pass 0041 % a value from a previous run of this function, it should 0042 % reproduce the exact same values. 0043 % freq_domain - If 0, command line report will be given in temporal units 0044 % (e.g. time points). Otherwise, the report will be given 0045 % in frequency domain units (e.g., frequencies). {default: 0046 % 0} 0047 % 0048 % 0049 % Outputs: 0050 % pval - p-value at each time point and electrode (corrected for 0051 % multiple comparisons via permutation test) 0052 % t_orig - t-score at each time point and electrode 0053 % tmx_ptile - critical t-scores for given alpha level 0054 % seed_state - The initial state of the random number generating stream 0055 % (see MATLAB documentation for "randstream") used to 0056 % generate the permutations. You can use this to reproduce 0057 % the output of this function. 0058 % est_alpha - The estimated family-wise alpha level of the test. With 0059 % permutation tests, a finite number of p-values are possible. 0060 % This function tries to use an alpha level that is as close 0061 % as possible to the desired alpha level. However, if the 0062 % sample size is small, a very limited number of p-values are 0063 % possible and the desired family-wise alpha level may be 0064 % impossible to approximately achieve. 0065 % 0066 % 0067 % Notes: 0068 % -Unlike a parametric test (e.g., an ANOVA), a discrete set of p-values 0069 % are possible (at most the number of possible permutations). Since the 0070 % number of possible permutations grows rapidly with the number of 0071 % participants, this is only an issue for small sample sizes (e.g., 6 0072 % participants). When you have such a small sample size, the 0073 % limited number of p-values may make the test overly conservative (e.g., 0074 % you might be forced to use an alpha level of .0286 since it is the biggest 0075 % possible alpha level less than .05). 0076 % 0077 % Author: 0078 % David Groppe 0079 % June, 2009 0080 % Kutaslab, San Diego 0081 % 0082 % References: 0083 % 0084 % Hemmelmann, et al. (2004) Multivariate tests for the evaluation of 0085 % high-dimensional EEG data. Journal of Neuroscience Methods. 0086 % 0087 % Maris, E. (2004) Randomization tests for ERP topographies and whole 0088 % spatiotemporal data matrices. Psychophysiology 0089 % 0090 % Manly, B.F.J. (1997) Randomization, bootstrap, and Monte Carlo methods in 0091 % biology. 2nd ed. Chapmn and Hall, London. 0092 0093 %%%%%%%%%%%%%%%% REVISION LOG %%%%%%%%%%%%%%%%% 0094 % 0095 % 3/8/2010-rand_state optional input and output added 0096 % 0097 % 3/18/2010-modified to be much faster (time for-loop eliminated, some numbers needed to calculate 0098 % t are computed once for all t-tests; "computational formula" used instead of 0099 % conceptual formula). New version gives exact same results as old version 0100 % 0101 % 4/1/2010-Estimated alpha level added 0102 % 0103 % 12/10/2010-Differences in the steps used for computing t-scores between 0104 % permutations and observed data could lead to rounding errors that would 0105 % produce inaccurate p-values for very small sample sizes (e.g., 3 0106 % participants). That's been fixed. 0107 % 0108 % 12/14/2010-freq_domain optional input added 0109 0110 %This code has an appropriate false positive rate when run on simulated 0111 %data at one time point (use sim_test1.m): 0112 % 0113 % n_sim=4000, n_perm=2000, data=2x1x16 Gaussian, alpha=.05 0114 % Two-tailed test: Empirical alpha rate: 0.047500 0115 % Upper tailed test: Empirical alpha rate: 0.048500 0116 % Lower tailed test: Empirical alpha rate: 0.045500 0117 0118 function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxt_perm1(data,n_perm,alph,tail,verblevel,seed_state,freq_domain) 0119 0120 if nargin<1, 0121 error('You need to provide data.'); 0122 end 0123 0124 if nargin<2, 0125 n_perm=2000; 0126 end 0127 0128 if nargin<3, 0129 alph=.05; 0130 elseif (alph>=1) || (alph<=0) 0131 error('Argument ''alph'' needs to be between 0 and 1.'); 0132 end 0133 0134 if alph<=.01 && n_perm<5000, 0135 watchit(sprintf('You are probably using too few permutations for an alpha level of %f. Type ">>help mxt_perm1" for more info.',alph)); 0136 elseif alph<=.05 && n_perm<1000, 0137 watchit(sprintf('You are probably using too few permutations for an alpha level of %f. Type ">>help mxt_perm1" for more info.',alph)); 0138 end 0139 0140 if nargin<4, 0141 tail=0; 0142 elseif (tail~=0) && (tail~=1) && (tail~=-1), 0143 error('Argument ''tail'' needs to be 0,1, or -1.'); 0144 end 0145 0146 if nargin<5, 0147 verblevel=2; 0148 end 0149 0150 if 0 % ?? for old versions of Matlab 0151 defaultStream=RandStream.getDefaultStream; %random # generator state 0152 if (nargin<6) || isempty(seed_state), 0153 %Store state of random number generator 0154 seed_state=defaultStream.State; 0155 else 0156 defaultStream.State=seed_state; %reset random number generator to saved state 0157 end 0158 else 0159 fprintf('Not seeding random number generator.\n'); 0160 seed_state=NaN; 0161 end 0162 0163 if (nargin<7), 0164 freq_domain=0; 0165 end 0166 0167 s=size(data); 0168 n_chan=s(1); 0169 n_pts=s(2); 0170 n_subs=s(3); 0171 if n_subs<2, 0172 error('You need data from at least two observations (e.g., participants) to perform a hypothesis test.') 0173 end 0174 0175 if n_subs<7, 0176 n_psbl_prms=2^n_subs; 0177 watchit(sprintf(['Due to the very limited number of participants,' ... 0178 ' the total number of possible permutations is small.\nThus only a limited number of p-values (at most %d) are possible and the test might be overly conservative.'], ... 0179 n_psbl_prms)); 0180 end 0181 0182 if verblevel~=0, 0183 fprintf('mxt_perm1: Number of channels: %d\n',n_chan); 0184 if freq_domain, 0185 fprintf('mxt_perm1: Number of frequencies: %d\n',n_pts); 0186 else 0187 fprintf('mxt_perm1: Number of time points: %d\n',n_pts); 0188 end 0189 fprintf('mxt_perm1: Total # of comparisons: %d\n',n_pts*n_chan); 0190 fprintf('mxt_perm1: Number of participants: %d\n',n_subs); 0191 fprintf('t-score degrees of freedom: %d\n',n_subs-1); 0192 end 0193 0194 0195 VerbReport(sprintf('Executing permutation test with %d permutations...',n_perm),2,verblevel); 0196 if (verblevel>=2), 0197 fprintf('Permutations completed: '); 0198 end 0199 0200 0201 %Constant factor for computing t, speeds up computing t to precalculate 0202 %now 0203 sqrt_nXnM1=sqrt(n_subs*(n_subs-1)); 0204 0205 mxt=zeros(1,n_perm*2); 0206 sn=zeros(1,1,n_subs); 0207 for perm=1:n_perm 0208 if ~rem(perm,100) 0209 if (verblevel>=2), 0210 if ~rem(perm-100,1000) 0211 fprintf('%d',perm); 0212 else 0213 fprintf(', %d',perm); 0214 end 0215 if ~rem(perm,1000) 0216 fprintf('\n'); 0217 end 0218 end 0219 end 0220 %randomly set sign of each participant's data 0221 sn(1,1,1:n_subs)=(rand(1,n_subs)>.5)*2-1; 0222 sn_mtrx=repmat(sn,[n_chan n_pts 1]); 0223 0224 d_perm=data.*sn_mtrx; 0225 0226 %computes t-score of permuted data across all channels and time 0227 %points or frequencies 0228 sm=sum(d_perm,3); 0229 mn=sm/n_subs; 0230 sm_sqrs=sum(d_perm.^2,3)-(sm.^2)/n_subs; 0231 stder=sqrt(sm_sqrs)/sqrt_nXnM1; 0232 t=mn./stder; 0233 0234 %get most extreme t-score (sign isn't immportant since we asumme 0235 %symmetric distribution of null hypothesis for one sample test) 0236 mxt(perm)=max(max(abs(t))); 0237 end 0238 mxt(n_perm+1:2*n_perm)=-mxt(1:n_perm); %add the negative of all values since we assumme 0239 %null hypothesis distribution is symmetric 0240 0241 %End permutations completed line 0242 if (verblevel>=2) && rem(perm,1000) 0243 fprintf('\n'); 0244 end 0245 0246 if tail==0, 0247 %two-tailed 0248 tmx_ptile(1)=prctile(mxt,100*alph/2); 0249 tmx_ptile(2)=-tmx_ptile(1); 0250 est_alpha=mean(mxt>=tmx_ptile(2))*2; 0251 elseif tail==1, 0252 %upper tailed 0253 tmx_ptile=prctile(mxt,100-100*alph); 0254 est_alpha=mean(mxt>=tmx_ptile); 0255 else 0256 %tail=-1, lower tailed 0257 tmx_ptile=prctile(mxt,alph*100); 0258 est_alpha=mean(mxt<=tmx_ptile); 0259 end 0260 if verblevel~=0, 0261 fprintf('Desired family-wise alpha level: %f\n',alph); 0262 fprintf('Estimated actual family-wise alpha level: %f\n',est_alpha); 0263 end 0264 0265 %computes t-scores of observations at all channels and time 0266 %points/frequencies 0267 sm=sum(data,3); 0268 mn=sm/n_subs; 0269 sm_sqrs=sum(data.^2,3)-(sm.^2)/n_subs; 0270 stder=sqrt(sm_sqrs)/sqrt_nXnM1; 0271 t_orig=mn./stder; 0272 0273 %compute p-values 0274 pval=zeros(n_chan,n_pts); 0275 for t=1:n_pts, 0276 for c=1:n_chan, 0277 if tail==0, 0278 pval(c,t)=mean(mxt>=abs(t_orig(c,t)))*2; 0279 elseif tail==1, 0280 pval(c,t)=mean(mxt>=t_orig(c,t)); 0281 elseif tail==-1, 0282 pval(c,t)=mean(mxt<=t_orig(c,t)); 0283 end 0284 end 0285 end 0286