function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxt_perm1(data,n_perm,alph,tail,verblevel,seed_state,freq_domain) mxt_perm1-One sample permutation test based on a t-statistic and null hypothesis of a mean of zero. Can handle multiple electrodes and time points/frequencies. According to Hemmelammnn et al. (2004), the t_max permutation test has relatively good power for multivariate data whose dimensions are highly correlated (like EEG/ERP data). See Maris (2004) for an explanation of how permutation tests like this control for multiple comparisons. Note, Maris uses Hotelling's multivariate T^2 statistic in his paper (instead of the t_max used here) and he advocated performing the tests over multiple time points (e.g., a 10 ms time window). However, our lab found that the T^2 statistic did not have much power when the sample size was near the number of electrodes and the results of the test could vary greatly depending on the size of the time window. Inputs: data - 3D matrix of data (Channel x Time x Participant) Optional Inputs: n_perm - number of permutations {default=2000}. Manly (1997) suggests using at least 1000 permutations for an alpha level of 0.05 and at least 5000 permutations for an alpha level of 0.01. alph - test alpha level {default=.05} tail - [1 | 0 | -1] If tail=1, the alternative hypothesis is that the mean of the data is greater than 0 (upper tailed test). If tail=0, the alternative hypothesis is that the mean of the data is different than 0 (two tailed test). If tail=-1, the alternative hypothesis is that the mean of the data is less than 0 (lower tailed test). {default: 0} verblevel - An integer specifying the amount of information you want this function to provide about what it is doing during runtime. Options are: 0 - quiet, only show errors, warnings, and EEGLAB reports 1 - stuff anyone should probably know 2 - stuff you should know the first time you start working with a data set {default value} 3 - stuff that might help you debug (show all reports) seed_state - The initial state of the random number generating stream (see MATLAB documentation for "randstream"). If you pass a value from a previous run of this function, it should reproduce the exact same values. freq_domain - If 0, command line report will be given in temporal units (e.g. time points). Otherwise, the report will be given in frequency domain units (e.g., frequencies). {default: 0} Outputs: pval - p-value at each time point and electrode (corrected for multiple comparisons via permutation test) t_orig - t-score at each time point and electrode tmx_ptile - critical t-scores for given alpha level seed_state - The initial state of the random number generating stream (see MATLAB documentation for "randstream") used to generate the permutations. You can use this to reproduce the output of this function. est_alpha - The estimated family-wise alpha level of the test. With permutation tests, a finite number of p-values are possible. This function tries to use an alpha level that is as close as possible to the desired alpha level. However, if the sample size is small, a very limited number of p-values are possible and the desired family-wise alpha level may be impossible to approximately achieve. Notes: -Unlike a parametric test (e.g., an ANOVA), a discrete set of p-values are possible (at most the number of possible permutations). Since the number of possible permutations grows rapidly with the number of participants, this is only an issue for small sample sizes (e.g., 6 participants). When you have such a small sample size, the limited number of p-values may make the test overly conservative (e.g., you might be forced to use an alpha level of .0286 since it is the biggest possible alpha level less than .05). Author: David Groppe June, 2009 Kutaslab, San Diego References: Hemmelmann, et al. (2004) Multivariate tests for the evaluation of high-dimensional EEG data. Journal of Neuroscience Methods. Maris, E. (2004) Randomization tests for ERP topographies and whole spatiotemporal data matrices. Psychophysiology Manly, B.F.J. (1997) Randomization, bootstrap, and Monte Carlo methods in biology. 2nd ed. Chapmn and Hall, London.
0001 %function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxt_perm1(data,n_perm,alph,tail,verblevel,seed_state,freq_domain) 0002 % 0003 % mxt_perm1-One sample permutation test based on a t-statistic and null 0004 % hypothesis of a mean of zero. Can handle multiple electrodes and time 0005 % points/frequencies. According to Hemmelammnn et al. (2004), the t_max permutation 0006 % test has relatively good power for multivariate data whose dimensions are 0007 % highly correlated (like EEG/ERP data). See Maris (2004) for an 0008 % explanation of how permutation tests like this control for multiple 0009 % comparisons. Note, Maris uses Hotelling's multivariate T^2 statistic in 0010 % his paper (instead of the t_max used here) and he advocated performing 0011 % the tests over multiple time points (e.g., a 10 ms time window). 0012 % However, our lab found that the T^2 statistic did not have much power when the 0013 % sample size was near the number of electrodes and the results of the test 0014 % could vary greatly depending on the size of the time window. 0015 % 0016 % Inputs: 0017 % data - 3D matrix of data (Channel x Time x Participant) 0018 % 0019 % Optional Inputs: 0020 % n_perm - number of permutations {default=2000}. Manly (1997) suggests 0021 % using at least 1000 permutations for an alpha level of 0.05 and 0022 % at least 5000 permutations for an alpha level of 0.01. 0023 % alph - test alpha level {default=.05} 0024 % tail - [1 | 0 | -1] If tail=1, the alternative hypothesis is that the 0025 % mean of the data is greater than 0 (upper tailed test). If tail=0, 0026 % the alternative hypothesis is that the mean of the data is different 0027 % than 0 (two tailed test). If tail=-1, the alternative hypothesis 0028 % is that the mean of the data is less than 0 (lower tailed test). 0029 % {default: 0} 0030 % verblevel - An integer specifying the amount of information you want 0031 % this function to provide about what it is doing during runtime. 0032 % Options are: 0033 % 0 - quiet, only show errors, warnings, and EEGLAB reports 0034 % 1 - stuff anyone should probably know 0035 % 2 - stuff you should know the first time you start working 0036 % with a data set {default value} 0037 % 3 - stuff that might help you debug (show all 0038 % reports) 0039 % seed_state - The initial state of the random number generating stream 0040 % (see MATLAB documentation for "randstream"). If you pass 0041 % a value from a previous run of this function, it should 0042 % reproduce the exact same values. 0043 % freq_domain - If 0, command line report will be given in temporal units 0044 % (e.g. time points). Otherwise, the report will be given 0045 % in frequency domain units (e.g., frequencies). {default: 0046 % 0} 0047 % 0048 % 0049 % Outputs: 0050 % pval - p-value at each time point and electrode (corrected for 0051 % multiple comparisons via permutation test) 0052 % t_orig - t-score at each time point and electrode 0053 % tmx_ptile - critical t-scores for given alpha level 0054 % seed_state - The initial state of the random number generating stream 0055 % (see MATLAB documentation for "randstream") used to 0056 % generate the permutations. You can use this to reproduce 0057 % the output of this function. 0058 % est_alpha - The estimated family-wise alpha level of the test. With 0059 % permutation tests, a finite number of p-values are possible. 0060 % This function tries to use an alpha level that is as close 0061 % as possible to the desired alpha level. However, if the 0062 % sample size is small, a very limited number of p-values are 0063 % possible and the desired family-wise alpha level may be 0064 % impossible to approximately achieve. 0065 % 0066 % 0067 % Notes: 0068 % -Unlike a parametric test (e.g., an ANOVA), a discrete set of p-values 0069 % are possible (at most the number of possible permutations). Since the 0070 % number of possible permutations grows rapidly with the number of 0071 % participants, this is only an issue for small sample sizes (e.g., 6 0072 % participants). When you have such a small sample size, the 0073 % limited number of p-values may make the test overly conservative (e.g., 0074 % you might be forced to use an alpha level of .0286 since it is the biggest 0075 % possible alpha level less than .05). 0076 % 0077 % Author: 0078 % David Groppe 0079 % June, 2009 0080 % Kutaslab, San Diego 0081 % 0082 % References: 0083 % 0084 % Hemmelmann, et al. (2004) Multivariate tests for the evaluation of 0085 % high-dimensional EEG data. Journal of Neuroscience Methods. 0086 % 0087 % Maris, E. (2004) Randomization tests for ERP topographies and whole 0088 % spatiotemporal data matrices. Psychophysiology 0089 % 0090 % Manly, B.F.J. (1997) Randomization, bootstrap, and Monte Carlo methods in 0091 % biology. 2nd ed. Chapmn and Hall, London. 0092 0093 %%%%%%%%%%%%%%%% REVISION LOG %%%%%%%%%%%%%%%%% 0094 % 0095 % 3/8/2010-rand_state optional input and output added 0096 % 0097 % 3/18/2010-modified to be much faster (time for-loop eliminated, some numbers needed to calculate 0098 % t are computed once for all t-tests; "computational formula" used instead of 0099 % conceptual formula). New version gives exact same results as old version 0100 % 0101 % 4/1/2010-Estimated alpha level added 0102 % 0103 % 12/10/2010-Differences in the steps used for computing t-scores between 0104 % permutations and observed data could lead to rounding errors that would 0105 % produce inaccurate p-values for very small sample sizes (e.g., 3 0106 % participants). That's been fixed. 0107 % 0108 % 12/14/2010-freq_domain optional input added 0109 0110 %This code has an appropriate false positive rate when run on simulated 0111 %data at one time point (use sim_test1.m): 0112 % 0113 % n_sim=4000, n_perm=2000, data=2x1x16 Gaussian, alpha=.05 0114 % Two-tailed test: Empirical alpha rate: 0.047500 0115 % Upper tailed test: Empirical alpha rate: 0.048500 0116 % Lower tailed test: Empirical alpha rate: 0.045500 0117 0118 function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxt_perm1(data,n_perm,alph,tail,verblevel,seed_state,freq_domain) 0119 0120 if nargin<1, 0121 error('You need to provide data.'); 0122 end 0123 0124 if nargin<2, 0125 n_perm=2000; 0126 end 0127 0128 if nargin<3, 0129 alph=.05; 0130 elseif (alph>=1) || (alph<=0) 0131 error('Argument ''alph'' needs to be between 0 and 1.'); 0132 end 0133 0134 if alph<=.01 && n_perm<5000, 0135 watchit(sprintf('You are probably using too few permutations for an alpha level of %f. Type ">>help mxt_perm1" for more info.',alph)); 0136 elseif alph<=.05 && n_perm<1000, 0137 watchit(sprintf('You are probably using too few permutations for an alpha level of %f. Type ">>help mxt_perm1" for more info.',alph)); 0138 end 0139 0140 if nargin<4, 0141 tail=0; 0142 elseif (tail~=0) && (tail~=1) && (tail~=-1), 0143 error('Argument ''tail'' needs to be 0,1, or -1.'); 0144 end 0145 0146 if nargin<5, 0147 verblevel=2; 0148 end 0149 0150 defaultStream=RandStream.getGlobalStream; %random # generator state 0151 if (nargin<6) || isempty(seed_state), 0152 %Store state of random number generator 0153 seed_state=defaultStream.State; 0154 else 0155 defaultStream.State=seed_state; %reset random number generator to saved state 0156 end 0157 0158 if (nargin<7), 0159 freq_domain=0; 0160 end 0161 0162 s=size(data); 0163 n_chan=s(1); 0164 n_pts=s(2); 0165 n_subs=s(3); 0166 if n_subs<2, 0167 error('You need data from at least two observations (e.g., participants) to perform a hypothesis test.') 0168 end 0169 0170 if n_subs<7, 0171 n_psbl_prms=2^n_subs; 0172 watchit(sprintf(['Due to the very limited number of participants,' ... 0173 ' the total number of possible permutations is small.\nThus only a limited number of p-values (at most %d) are possible and the test might be overly conservative.'], ... 0174 n_psbl_prms)); 0175 end 0176 0177 if verblevel~=0, 0178 fprintf('mxt_perm1: Number of channels: %d\n',n_chan); 0179 if freq_domain, 0180 fprintf('mxt_perm1: Number of frequencies: %d\n',n_pts); 0181 else 0182 fprintf('mxt_perm1: Number of time points: %d\n',n_pts); 0183 end 0184 fprintf('mxt_perm1: Total # of comparisons: %d\n',n_pts*n_chan); 0185 fprintf('mxt_perm1: Number of participants: %d\n',n_subs); 0186 fprintf('t-score degrees of freedom: %d\n',n_subs-1); 0187 end 0188 0189 0190 VerbReport(sprintf('Executing permutation test with %d permutations...',n_perm),2,verblevel); 0191 if (verblevel>=2), 0192 fprintf('Permutations completed: '); 0193 end 0194 0195 0196 %Constant factor for computing t, speeds up computing t to precalculate 0197 %now 0198 sqrt_nXnM1=sqrt(n_subs*(n_subs-1)); 0199 0200 mxt=zeros(1,n_perm*2); 0201 sn=zeros(1,1,n_subs); 0202 for perm=1:n_perm 0203 if ~rem(perm,100) 0204 if (verblevel>=2), 0205 if ~rem(perm-100,1000) 0206 fprintf('%d',perm); 0207 else 0208 fprintf(', %d',perm); 0209 end 0210 if ~rem(perm,1000) 0211 fprintf('\n'); 0212 end 0213 end 0214 end 0215 %randomly set sign of each participant's data 0216 sn(1,1,1:n_subs)=(rand(1,n_subs)>.5)*2-1; 0217 sn_mtrx=repmat(sn,[n_chan n_pts 1]); 0218 0219 d_perm=data.*sn_mtrx; 0220 0221 %computes t-score of permuted data across all channels and time 0222 %points or frequencies 0223 sm=sum(d_perm,3); 0224 mn=sm/n_subs; 0225 sm_sqrs=sum(d_perm.^2,3)-(sm.^2)/n_subs; 0226 stder=sqrt(sm_sqrs)/sqrt_nXnM1; 0227 t=mn./stder; 0228 0229 %get most extreme t-score (sign isn't immportant since we asumme 0230 %symmetric distribution of null hypothesis for one sample test) 0231 mxt(perm)=max(max(abs(t))); 0232 end 0233 mxt(n_perm+1:2*n_perm)=-mxt(1:n_perm); %add the negative of all values since we assumme 0234 %null hypothesis distribution is symmetric 0235 0236 %End permutations completed line 0237 if (verblevel>=2) && rem(perm,1000) 0238 fprintf('\n'); 0239 end 0240 0241 if tail==0, 0242 %two-tailed 0243 tmx_ptile(1)=prctile(mxt,100*alph/2); 0244 tmx_ptile(2)=-tmx_ptile(1); 0245 est_alpha=mean(mxt>=tmx_ptile(2))*2; 0246 elseif tail==1, 0247 %upper tailed 0248 tmx_ptile=prctile(mxt,100-100*alph); 0249 est_alpha=mean(mxt>=tmx_ptile); 0250 else 0251 %tail=-1, lower tailed 0252 tmx_ptile=prctile(mxt,alph*100); 0253 est_alpha=mean(mxt<=tmx_ptile); 0254 end 0255 if verblevel~=0, 0256 fprintf('Desired family-wise alpha level: %f\n',alph); 0257 fprintf('Estimated actual family-wise alpha level: %f\n',est_alpha); 0258 end 0259 0260 %computes t-scores of observations at all channels and time 0261 %points/frequencies 0262 sm=sum(data,3); 0263 mn=sm/n_subs; 0264 sm_sqrs=sum(data.^2,3)-(sm.^2)/n_subs; 0265 stder=sqrt(sm_sqrs)/sqrt_nXnM1; 0266 t_orig=mn./stder; 0267 0268 %compute p-values 0269 pval=zeros(n_chan,n_pts); 0270 for t=1:n_pts, 0271 for c=1:n_chan, 0272 if tail==0, 0273 pval(c,t)=mean(mxt>=abs(t_orig(c,t)))*2; 0274 elseif tail==1, 0275 pval(c,t)=mean(mxt>=t_orig(c,t)); 0276 elseif tail==-1, 0277 pval(c,t)=mean(mxt<=t_orig(c,t)); 0278 end 0279 end 0280 end 0281