function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxt_perm1(data,n_perm,alph,tail,verblevel,seed_state,freq_domain) mxt_perm1-One sample permutation test based on a t-statistic and null hypothesis of a mean of zero. Can handle multiple electrodes and time points/frequencies. According to Hemmelammnn et al. (2004), the t_max permutation test has relatively good power for multivariate data whose dimensions are highly correlated (like EEG/ERP data). See Maris (2004) for an explanation of how permutation tests like this control for multiple comparisons. Note, Maris uses Hotelling's multivariate T^2 statistic in his paper (instead of the t_max used here) and he advocated performing the tests over multiple time points (e.g., a 10 ms time window). However, our lab found that the T^2 statistic did not have much power when the sample size was near the number of electrodes and the results of the test could vary greatly depending on the size of the time window. Inputs: data - 3D matrix of data (Channel x Time x Participant) Optional Inputs: n_perm - number of permutations {default=2000}. Manly (1997) suggests using at least 1000 permutations for an alpha level of 0.05 and at least 5000 permutations for an alpha level of 0.01. alph - test alpha level {default=.05} tail - [1 | 0 | -1] If tail=1, the alternative hypothesis is that the mean of the data is greater than 0 (upper tailed test). If tail=0, the alternative hypothesis is that the mean of the data is different than 0 (two tailed test). If tail=-1, the alternative hypothesis is that the mean of the data is less than 0 (lower tailed test). {default: 0} verblevel - An integer specifying the amount of information you want this function to provide about what it is doing during runtime. Options are: 0 - quiet, only show errors, warnings, and EEGLAB reports 1 - stuff anyone should probably know 2 - stuff you should know the first time you start working with a data set {default value} 3 - stuff that might help you debug (show all reports) seed_state - The initial state of the random number generating stream (see MATLAB documentation for "randstream"). If you pass a value from a previous run of this function, it should reproduce the exact same values. freq_domain - If 0, command line report will be given in temporal units (e.g. time points). Otherwise, the report will be given in frequency domain units (e.g., frequencies). {default: 0} Outputs: pval - p-value at each time point and electrode (corrected for multiple comparisons via permutation test) t_orig - t-score at each time point and electrode tmx_ptile - critical t-scores for given alpha level seed_state - The initial state of the random number generating stream (see MATLAB documentation for "randstream") used to generate the permutations. You can use this to reproduce the output of this function. est_alpha - The estimated family-wise alpha level of the test. With permutation tests, a finite number of p-values are possible. This function tries to use an alpha level that is as close as possible to the desired alpha level. However, if the sample size is small, a very limited number of p-values are possible and the desired family-wise alpha level may be impossible to approximately achieve. Notes: -Unlike a parametric test (e.g., an ANOVA), a discrete set of p-values are possible (at most the number of possible permutations). Since the number of possible permutations grows rapidly with the number of participants, this is only an issue for small sample sizes (e.g., 6 participants). When you have such a small sample size, the limited number of p-values may make the test overly conservative (e.g., you might be forced to use an alpha level of .0286 since it is the biggest possible alpha level less than .05). Author: David Groppe June, 2009 Kutaslab, San Diego References: Hemmelmann, et al. (2004) Multivariate tests for the evaluation of high-dimensional EEG data. Journal of Neuroscience Methods. Maris, E. (2004) Randomization tests for ERP topographies and whole spatiotemporal data matrices. Psychophysiology Manly, B.F.J. (1997) Randomization, bootstrap, and Monte Carlo methods in biology. 2nd ed. Chapmn and Hall, London.
0001 %function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxt_perm1(data,n_perm,alph,tail,verblevel,seed_state,freq_domain) 0002 % 0003 % mxt_perm1-One sample permutation test based on a t-statistic and null 0004 % hypothesis of a mean of zero. Can handle multiple electrodes and time 0005 % points/frequencies. According to Hemmelammnn et al. (2004), the t_max permutation 0006 % test has relatively good power for multivariate data whose dimensions are 0007 % highly correlated (like EEG/ERP data). See Maris (2004) for an 0008 % explanation of how permutation tests like this control for multiple 0009 % comparisons. Note, Maris uses Hotelling's multivariate T^2 statistic in 0010 % his paper (instead of the t_max used here) and he advocated performing 0011 % the tests over multiple time points (e.g., a 10 ms time window). 0012 % However, our lab found that the T^2 statistic did not have much power when the 0013 % sample size was near the number of electrodes and the results of the test 0014 % could vary greatly depending on the size of the time window. 0015 % 0016 % Inputs: 0017 % data - 3D matrix of data (Channel x Time x Participant) 0018 % 0019 % Optional Inputs: 0020 % n_perm - number of permutations {default=2000}. Manly (1997) suggests 0021 % using at least 1000 permutations for an alpha level of 0.05 and 0022 % at least 5000 permutations for an alpha level of 0.01. 0023 % alph - test alpha level {default=.05} 0024 % tail - [1 | 0 | -1] If tail=1, the alternative hypothesis is that the 0025 % mean of the data is greater than 0 (upper tailed test). If tail=0, 0026 % the alternative hypothesis is that the mean of the data is different 0027 % than 0 (two tailed test). If tail=-1, the alternative hypothesis 0028 % is that the mean of the data is less than 0 (lower tailed test). 0029 % {default: 0} 0030 % verblevel - An integer specifying the amount of information you want 0031 % this function to provide about what it is doing during runtime. 0032 % Options are: 0033 % 0 - quiet, only show errors, warnings, and EEGLAB reports 0034 % 1 - stuff anyone should probably know 0035 % 2 - stuff you should know the first time you start working 0036 % with a data set {default value} 0037 % 3 - stuff that might help you debug (show all 0038 % reports) 0039 % seed_state - The initial state of the random number generating stream 0040 % (see MATLAB documentation for "randstream"). If you pass 0041 % a value from a previous run of this function, it should 0042 % reproduce the exact same values. 0043 % freq_domain - If 0, command line report will be given in temporal units 0044 % (e.g. time points). Otherwise, the report will be given 0045 % in frequency domain units (e.g., frequencies). {default: 0046 % 0} 0047 % 0048 % 0049 % Outputs: 0050 % pval - p-value at each time point and electrode (corrected for 0051 % multiple comparisons via permutation test) 0052 % t_orig - t-score at each time point and electrode 0053 % tmx_ptile - critical t-scores for given alpha level 0054 % seed_state - The initial state of the random number generating stream 0055 % (see MATLAB documentation for "randstream") used to 0056 % generate the permutations. You can use this to reproduce 0057 % the output of this function. 0058 % est_alpha - The estimated family-wise alpha level of the test. With 0059 % permutation tests, a finite number of p-values are possible. 0060 % This function tries to use an alpha level that is as close 0061 % as possible to the desired alpha level. However, if the 0062 % sample size is small, a very limited number of p-values are 0063 % possible and the desired family-wise alpha level may be 0064 % impossible to approximately achieve. 0065 % 0066 % 0067 % Notes: 0068 % -Unlike a parametric test (e.g., an ANOVA), a discrete set of p-values 0069 % are possible (at most the number of possible permutations). Since the 0070 % number of possible permutations grows rapidly with the number of 0071 % participants, this is only an issue for small sample sizes (e.g., 6 0072 % participants). When you have such a small sample size, the 0073 % limited number of p-values may make the test overly conservative (e.g., 0074 % you might be forced to use an alpha level of .0286 since it is the biggest 0075 % possible alpha level less than .05). 0076 % 0077 % Author: 0078 % David Groppe 0079 % June, 2009 0080 % Kutaslab, San Diego 0081 % 0082 % References: 0083 % 0084 % Hemmelmann, et al. (2004) Multivariate tests for the evaluation of 0085 % high-dimensional EEG data. Journal of Neuroscience Methods. 0086 % 0087 % Maris, E. (2004) Randomization tests for ERP topographies and whole 0088 % spatiotemporal data matrices. Psychophysiology 0089 % 0090 % Manly, B.F.J. (1997) Randomization, bootstrap, and Monte Carlo methods in 0091 % biology. 2nd ed. Chapmn and Hall, London. 0092 0093 %%%%%%%%%%%%%%%% REVISION LOG %%%%%%%%%%%%%%%%% 0094 % 0095 % 3/8/2010-rand_state optional input and output added 0096 % 0097 % 3/18/2010-modified to be much faster (time for-loop eliminated, some numbers needed to calculate 0098 % t are computed once for all t-tests; "computational formula" used instead of 0099 % conceptual formula). New version gives exact same results as old version 0100 % 0101 % 4/1/2010-Estimated alpha level added 0102 % 0103 % 12/10/2010-Differences in the steps used for computing t-scores between 0104 % permutations and observed data could lead to rounding errors that would 0105 % produce inaccurate p-values for very small sample sizes (e.g., 3 0106 % participants). That's been fixed. 0107 % 0108 % 12/14/2010-freq_domain optional input added 0109 % 0110 % 3/17/2013-Randomization is now compatible with Matlab v13. Thanks to 0111 % Aaron Newman for the fix. 0112 0113 %This code has an appropriate false positive rate when run on simulated 0114 %data at one time point (use sim_test1.m): 0115 % 0116 % n_sim=4000, n_perm=2000, data=2x1x16 Gaussian, alpha=.05 0117 % Two-tailed test: Empirical alpha rate: 0.047500 0118 % Upper tailed test: Empirical alpha rate: 0.048500 0119 % Lower tailed test: Empirical alpha rate: 0.045500 0120 0121 function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxt_perm1(data,n_perm,alph,tail,verblevel,seed_state,freq_domain) 0122 0123 if nargin<1, 0124 error('You need to provide data.'); 0125 end 0126 0127 if nargin<2, 0128 n_perm=2000; 0129 end 0130 0131 if nargin<3, 0132 alph=.05; 0133 elseif (alph>=1) || (alph<=0) 0134 error('Argument ''alph'' needs to be between 0 and 1.'); 0135 end 0136 0137 if alph<=.01 && n_perm<5000, 0138 watchit(sprintf('You are probably using too few permutations for an alpha level of %f. Type ">>help mxt_perm1" for more info.',alph)); 0139 elseif alph<=.05 && n_perm<1000, 0140 watchit(sprintf('You are probably using too few permutations for an alpha level of %f. Type ">>help mxt_perm1" for more info.',alph)); 0141 end 0142 0143 if nargin<4, 0144 tail=0; 0145 elseif (tail~=0) && (tail~=1) && (tail~=-1), 0146 error('Argument ''tail'' needs to be 0,1, or -1.'); 0147 end 0148 0149 if nargin<5, 0150 verblevel=2; 0151 end 0152 0153 %Get random # generator state 0154 if verLessThan('matlab','8.1') 0155 defaultStream=RandStream.getDefaultStream; 0156 else 0157 defaultStream=RandStream.getGlobalStream; 0158 end 0159 if (nargin<6) || isempty(seed_state), 0160 %Store state of random number generator 0161 seed_state=defaultStream.State; 0162 else 0163 defaultStream.State=seed_state; %reset random number generator to saved state 0164 end 0165 0166 if (nargin<7), 0167 freq_domain=0; 0168 end 0169 0170 s=size(data); 0171 n_chan=s(1); 0172 n_pts=s(2); 0173 n_subs=s(3); 0174 if n_subs<2, 0175 error('You need data from at least two observations (e.g., participants) to perform a hypothesis test.') 0176 end 0177 0178 if n_subs<7, 0179 n_psbl_prms=2^n_subs; 0180 watchit(sprintf(['Due to the very limited number of participants,' ... 0181 ' the total number of possible permutations is small.\nThus only a limited number of p-values (at most %d) are possible and the test might be overly conservative.'], ... 0182 n_psbl_prms)); 0183 end 0184 0185 if verblevel~=0, 0186 fprintf('mxt_perm1: Number of channels: %d\n',n_chan); 0187 if freq_domain, 0188 fprintf('mxt_perm1: Number of frequencies: %d\n',n_pts); 0189 else 0190 fprintf('mxt_perm1: Number of time points: %d\n',n_pts); 0191 end 0192 fprintf('mxt_perm1: Total # of comparisons: %d\n',n_pts*n_chan); 0193 fprintf('mxt_perm1: Number of participants: %d\n',n_subs); 0194 fprintf('t-score degrees of freedom: %d\n',n_subs-1); 0195 end 0196 0197 0198 VerbReport(sprintf('Executing permutation test with %d permutations...',n_perm),2,verblevel); 0199 if (verblevel>=2), 0200 fprintf('Permutations completed: '); 0201 end 0202 0203 0204 %Constant factor for computing t, speeds up computing t to precalculate 0205 %now 0206 sqrt_nXnM1=sqrt(n_subs*(n_subs-1)); 0207 0208 mxt=zeros(1,n_perm*2); 0209 sn=zeros(1,1,n_subs); 0210 for perm=1:n_perm 0211 if ~rem(perm,100) 0212 if (verblevel>=2), 0213 if ~rem(perm-100,1000) 0214 fprintf('%d',perm); 0215 else 0216 fprintf(', %d',perm); 0217 end 0218 if ~rem(perm,1000) 0219 fprintf('\n'); 0220 end 0221 end 0222 end 0223 %randomly set sign of each participant's data 0224 sn(1,1,1:n_subs)=(rand(1,n_subs)>.5)*2-1; % You need the first two 1's in sn to make the results 3D 0225 %sn=(rand(1,n_subs)>.5)*2-1; 0226 sn_mtrx=repmat(sn,[n_chan n_pts 1]); 0227 0228 d_perm=data.*sn_mtrx; 0229 0230 %computes t-score of permuted data across all channels and time 0231 %points or frequencies 0232 sm=sum(d_perm,3); 0233 mn=sm/n_subs; 0234 sm_sqrs=sum(d_perm.^2,3)-(sm.^2)/n_subs; 0235 stder=sqrt(sm_sqrs)/sqrt_nXnM1; 0236 t=mn./stder; 0237 0238 %get most extreme t-score (sign isn't immportant since we asumme 0239 %symmetric distribution of null hypothesis for one sample test) 0240 mxt(perm)=max(max(abs(t))); 0241 end 0242 mxt(n_perm+1:2*n_perm)=-mxt(1:n_perm); %add the negative of all values since we assumme 0243 %null hypothesis distribution is symmetric 0244 0245 %End permutations completed line 0246 if (verblevel>=2) && rem(perm,1000) 0247 fprintf('\n'); 0248 end 0249 0250 if tail==0, 0251 %two-tailed 0252 tmx_ptile(1)=prctile(mxt,100*alph/2); 0253 tmx_ptile(2)=-tmx_ptile(1); 0254 est_alpha=mean(mxt>=tmx_ptile(2))*2; 0255 elseif tail==1, 0256 %upper tailed 0257 tmx_ptile=prctile(mxt,100-100*alph); 0258 est_alpha=mean(mxt>=tmx_ptile); 0259 else 0260 %tail=-1, lower tailed 0261 tmx_ptile=prctile(mxt,alph*100); 0262 est_alpha=mean(mxt<=tmx_ptile); 0263 end 0264 if verblevel~=0, 0265 fprintf('Desired family-wise alpha level: %f\n',alph); 0266 fprintf('Estimated actual family-wise alpha level: %f\n',est_alpha); 0267 end 0268 0269 %computes t-scores of observations at all channels and time 0270 %points/frequencies 0271 sm=sum(data,3); 0272 mn=sm/n_subs; 0273 sm_sqrs=sum(data.^2,3)-(sm.^2)/n_subs; 0274 stder=sqrt(sm_sqrs)/sqrt_nXnM1; 0275 t_orig=mn./stder; 0276 0277 %compute p-values 0278 pval=zeros(n_chan,n_pts); 0279 for t=1:n_pts, 0280 for c=1:n_chan, 0281 if tail==0, 0282 pval(c,t)=mean(mxt>=abs(t_orig(c,t)))*2; 0283 elseif tail==1, 0284 pval(c,t)=mean(mxt>=t_orig(c,t)); 0285 elseif tail==-1, 0286 pval(c,t)=mean(mxt<=t_orig(c,t)); 0287 end 0288 end 0289 end 0290