function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxtdif_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state) mxtdif_perm2-Two sample permutation test based on normalizing the mean of each group to a t-score taking the difference between the two. The null hypothesis is a mean difference of zero. For one-tailed tests, the null distribution is derived from the most extreme t-score difference of all permutations. For two-tailed tests, the null distribution is derived from the maximum absolute t-score difference of all permutations. This function can handle multiple electrodes and time points. Based on simulated ERP data sets, the t-score difference appears to be less sensitive to differences in variance between groups than the standard independent samples t-statistic or Welsh's t (esp. when group sizes are unequal). However, it is probably less powerful than these other two statistics. Thus you should probably use this function if the numbers of participants in each group are not equal and you suspect that one group is much more variable (from trial to trail or participant to participant) than the other. Note, normalzing a group's data to a t-score is obtained by simply dividing the mean of the data by its standard deviation. This is often referred to as a "z-score" but technically it is a "t-statistic" since the standard deviation of the group is estimated from the data (i.e., it is not known a priori). Inputs: dataA - 3D matrix of ERPs from Group A (Channel x Time x Participant) dataB - 3D matrix of ERPs from Group B (Channel x Time x Participant) n_perm - number of permutations {default=1000}. Manly (1997) suggests using at least 1000 permutations for an alpha level of 0.05 and at least 5000 permutations for an alpha level of 0.01. alph - test alpha level {default=.05} tail - [1 | 0 | -1] If tail=1, the alternative hypothesis is that the mean of data is greater than 0 (upper tailed test). If tail=0, the alternative hypothesis is that the mean of data is different than 0 (two tailed test). If tail=-1, the alternative hypothesis is that the mean of the data is less than 0 (lower tailed test). {default: 0} Optional Inputs: verblevel - An integer specifiying the amount of information you want this function to provide about what it is doing during runtime. Options are: 0 - quiet, only show errors, warnings, and EEGLAB reports 1 - stuff anyone should probably know 2 - stuff you should know the first time you start working with a data set {default value} 3 - stuff that might help you debug (show all reports) seed_state - The initial state of the random number generating stream (see MATLAB documentation for "randstream"). If you pass a value from a previous run of this function, it should reproduce the exact same values. Outputs: pval - p-value at each time point and electrode (corrected for multiple comparisons via permutation test) t_orig - t-score at each time point and electrode tmx_ptile - critical t-scores for given alpha level seed_state - The initial state of the random number generating stream (see MATLAB documentation for "randstream") used to generate the permutations. You can use this to reproduce the output of this function. est_alpha - The estimated family-wise alpha level of the test. With permutation tests, a finite number of p-values are possible. This function tries to use an alpha level that is as close as possible to the desired alpha level. However, if the sample size is small, a very limited number of p-values are possible and the desired family-wise alpha level may be impossible to acheive. Notes: -The null hypothesis of this test is that the data in the two groups are "exchangeable" (i.e., the data from a participant in Group A was just as likely to have come from Group B). If the distributions of the two populations you're comparing differ in *any* way (i.e., not just in central tendency), the null hypothesis will be violated. For example, if the ERPs of the two groups differ in their variability, but not their average value (e.g., one set of ERPs is from a group of fidgety, thin skulled children and the other highly cooperative adults), you'll be more likely than alpha to reject the null hypothesis. Uses differences in t-scores ameliorates the test's sensitivity to between-group differences in variance, but it does not completely remove it. -Unlike a parametric test (e.g., an ANOVA), a discrete set of p-values are possible (at most the number of possible permutations). Since the number of possible permutations grows rapdily with the number of participants, this is only an issue for small sample sizes (e.g., 3 participants in each group). When you have such a small sample size, the limited number of p-values may make the test overly conservative (e.g., you might be forced to use an alpha level of .0286 since it is the biggest possible alpha level less than .05). Author: David Groppe July, 2009 Kutaslab, San Diego References: Manly, B.F.J. (1997) Randomization, bootstrap, and Monte Carlo methods in biology. 2nd ed. Chapmn and Hall, London.
0001 %function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxtdif_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state) 0002 % 0003 % mxtdif_perm2-Two sample permutation test based on normalizing the mean of 0004 % each group to a t-score taking the difference between the two. The null 0005 % hypothesis is a mean difference of zero. For one-tailed tests, the null 0006 % distribution is derived from the most extreme t-score difference of all 0007 % permutations. For two-tailed tests, the null distribution is derived from 0008 % the maximum absolute t-score difference of all permutations. This function 0009 % can handle multiple electrodes and time points. Based on simulated ERP 0010 % data sets, the t-score difference appears to be less sensitive to differences 0011 % in variance between groups than the standard independent samples t-statistic 0012 % or Welsh's t (esp. when group sizes are unequal). However, it is probably 0013 % less powerful than these other two statistics. Thus you should probably 0014 % use this function if the numbers of participants in each group are not equal 0015 % and you suspect that one group is much more variable (from trial to trail 0016 % or participant to participant) than the other. Note, normalzing a group's 0017 % data to a t-score is obtained by simply dividing the mean of the data by 0018 % its standard deviation. This is often referred to as a "z-score" but 0019 % technically it is a "t-statistic" since the standard deviation of the 0020 % group is estimated from the data (i.e., it is not known a priori). 0021 % 0022 % Inputs: 0023 % dataA - 3D matrix of ERPs from Group A (Channel x Time x Participant) 0024 % dataB - 3D matrix of ERPs from Group B (Channel x Time x Participant) 0025 % n_perm - number of permutations {default=1000}. Manly (1997) suggests 0026 % using at least 1000 permutations for an alpha level of 0.05 and 0027 % at least 5000 permutations for an alpha level of 0.01. 0028 % alph - test alpha level {default=.05} 0029 % tail - [1 | 0 | -1] If tail=1, the alternative hypothesis is that the 0030 % mean of data is greater than 0 (upper tailed test). If tail=0, 0031 % the alternative hypothesis is that the mean of data is different 0032 % than 0 (two tailed test). If tail=-1, the alternative hypothesis 0033 % is that the mean of the data is less than 0 (lower tailed test). 0034 % {default: 0} 0035 % 0036 % Optional Inputs: 0037 % verblevel - An integer specifiying the amount of information you want 0038 % this function to provide about what it is doing during runtime. 0039 % Options are: 0040 % 0 - quiet, only show errors, warnings, and EEGLAB reports 0041 % 1 - stuff anyone should probably know 0042 % 2 - stuff you should know the first time you start working 0043 % with a data set {default value} 0044 % 3 - stuff that might help you debug (show all 0045 % reports) 0046 % seed_state - The initial state of the random number generating stream 0047 % (see MATLAB documentation for "randstream"). If you pass 0048 % a value from a previous run of this function, it should 0049 % reproduce the exact same values. 0050 % 0051 % Outputs: 0052 % pval - p-value at each time point and electrode (corrected for 0053 % multiple comparisons via permutation test) 0054 % t_orig - t-score at each time point and electrode 0055 % tmx_ptile - critical t-scores for given alpha level 0056 % seed_state - The initial state of the random number generating stream 0057 % (see MATLAB documentation for "randstream") used to 0058 % generate the permutations. You can use this to reproduce 0059 % the output of this function. 0060 % est_alpha - The estimated family-wise alpha level of the test. With 0061 % permutation tests, a finite number of p-values are possible. 0062 % This function tries to use an alpha level that is as close 0063 % as possible to the desired alpha level. However, if the 0064 % sample size is small, a very limited number of p-values are 0065 % possible and the desired family-wise alpha level may be 0066 % impossible to acheive. 0067 % 0068 % Notes: 0069 % -The null hypothesis of this test is that the data in the two groups are 0070 % "exchangeable" (i.e., the data from a participant in Group A was just as 0071 % likely to have come from Group B). If the distributions of the two 0072 % populations you're comparing differ in *any* way (i.e., not just in 0073 % central tendency), the null hypothesis will be violated. For example, if 0074 % the ERPs of the two groups differ in their variability, but not their 0075 % average value (e.g., one set of ERPs is from a group of fidgety, thin 0076 % skulled children and the other highly cooperative adults), you'll be more 0077 % likely than alpha to reject the null hypothesis. Uses differences in 0078 % t-scores ameliorates the test's sensitivity to between-group differences 0079 % in variance, but it does not completely remove it. 0080 % 0081 % -Unlike a parametric test (e.g., an ANOVA), a discrete set of p-values 0082 % are possible (at most the number of possible permutations). Since the 0083 % number of possible permutations grows rapdily with the number of 0084 % participants, this is only an issue for small sample sizes (e.g., 3 0085 % participants in each group). When you have such a small sample size, the 0086 % limited number of p-values may make the test overly conservative (e.g., 0087 % you might be forced to use an alpha level of .0286 since it is the 0088 % biggest possible alpha level less than .05). 0089 % 0090 % Author: 0091 % David Groppe 0092 % July, 2009 0093 % Kutaslab, San Diego 0094 % 0095 % References: 0096 % 0097 % Manly, B.F.J. (1997) Randomization, bootstrap, and Monte Carlo methods in 0098 % biology. 2nd ed. Chapmn and Hall, London. 0099 0100 %%%%%%%%%%%%%%%% REVISION LOG %%%%%%%%%%%%%%%%% 0101 % 0102 % ?/?/?- 0103 0104 %%%%%%%%%%%%%%%% FUTURE IMPROVEMENTS %%%%%%%%%%%%%%%%% 0105 % 0106 % Test for equality of trials or variance? 0107 0108 function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxtdif_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state) 0109 0110 if nargin<2, 0111 error('You need to provide data for two groups of participants.'); 0112 end 0113 0114 if nargin<3, 0115 n_perm=2000; 0116 end 0117 0118 if nargin<4, 0119 alph=.05; 0120 elseif (alph>=1) || (alph<=0) 0121 error('Argument ''alph'' needs to be between 0 and 1.'); 0122 end 0123 0124 if alph<=.01 && n_perm<5000, 0125 watchit(sprintf('You are probably using too few permutations for an alpha level of %f. Type ">>help mxtdif_perm2" for more info.',alph)); 0126 elseif alph<=.05 && n_perm<1000, 0127 watchit(sprintf('You are probably using too few permutations for an alpha level of %f. Type ">>help mxtdif_perm2" for more info.',alph)); 0128 end 0129 0130 if nargin<5, 0131 tail=0; 0132 elseif (tail~=0) && (tail~=1) && (tail~=-1), 0133 error('Argument ''tail'' needs to be 0,1, or -1.'); 0134 end 0135 0136 if nargin<6, 0137 verblevel=2; 0138 end 0139 0140 defaultStream=RandStream.getDefaultStream; %random # generator state 0141 if (nargin<7) || isempty(seed_state), 0142 %Store state of random number generator 0143 seed_state=defaultStream.State; 0144 else 0145 defaultStream.State=seed_state; %reset random number generator to saved state 0146 end 0147 0148 0149 if length(size(dataA))~=3 || length(size(dataB))~=3 0150 error('dataA and dataB need to be three dimensional (chan x time x participant)') 0151 end 0152 [n_chan, n_tpt, n_subsA]=size(dataA); 0153 [n_chan2, n_tpt2, n_subsB]=size(dataB); 0154 0155 if verblevel, 0156 warning('off','all'); %for large # of subjects, nchoosek warns that its result is approximate 0157 n_psbl_prms=nchoosek(n_subsA+n_subsB,n_subsA); 0158 if n_psbl_prms<100, 0159 watchit(sprintf(['Due to the very limited number of participants in each group,' ... 0160 ' the total number of possible permutations is small.\nThus only a limited number of p-values (at most %d) are possible and the test might be overly conservative.'], ... 0161 n_psbl_prms)); 0162 end 0163 warning('on','all'); 0164 end 0165 0166 if n_chan~=n_chan2, 0167 error('The number of channels in Group 1 (dataA) and Group 2 (dataB) need to be equal.'); 0168 elseif n_tpt~=n_tpt2, 0169 error('The number of time points in Group 1 (dataA) and Group 2 (dataB) need to be equal.'); 0170 end 0171 0172 %combine data 0173 total_subs=n_subsA+n_subsB; 0174 data=dataA; 0175 data(:,:,(n_subsA+1):total_subs)=dataB; 0176 0177 if verblevel~=0, 0178 fprintf('mxt_perm2: Number of channels: %d\n',n_chan); 0179 fprintf('mxt_perm2: Number of time points: %d\n',n_tpt); 0180 fprintf('mxt_perm2: Total # of comparisons: %d\n',n_chan*n_tpt); 0181 fprintf('mxt_perm2: Number of participants in Group A: %d\n',n_subsA); 0182 fprintf('mxt_perm2: Number of participants in Group B: %d\n',n_subsB); 0183 %fprintf('t-score degrees of freedom: %d\n',total_subs-2); %edit for alt t-test? 0184 end 0185 VerbReport(sprintf('Executing permutation test with %d permutations...',n_perm),2,verblevel); 0186 if (verblevel>=2), 0187 fprintf('Permutations completed: '); 0188 end 0189 0190 0191 mx_t=zeros(1,n_perm); 0192 for perm=1:n_perm 0193 if ~rem(perm,100) 0194 if (verblevel>=2), 0195 if ~rem(perm-100,1000) 0196 fprintf('%d',perm); 0197 else 0198 fprintf(', %d',perm); 0199 end 0200 if ~rem(perm,1000) 0201 fprintf('\n'); 0202 end 0203 end 0204 end 0205 %randomly assign participants to conditions 0206 r=randperm(total_subs); 0207 grp1=r(1:n_subsA); 0208 grp2=r((n_subsA+1):total_subs); 0209 %compute most extreme t-score difference 0210 mx_t(perm)=tMtmax2(data,grp1,grp2,n_subsA,n_subsB); 0211 end 0212 0213 %End of permutations, print carriage return if it hasn't already been done 0214 %(i.e., perm is NOT a multiple of 1000) 0215 if (verblevel>=2) && rem(perm,1000) 0216 fprintf('\n'); 0217 end 0218 0219 %Compute critical t's 0220 if tail==0, 0221 %two-tailed, test statistic is biggest absolute value of all t differences's 0222 mx_t=abs(mx_t); 0223 tmx_ptile(2)=prctile(mx_t,100-100*alph); 0224 tmx_ptile(1)=-tmx_ptile(2); 0225 est_alpha=mean(mx_t>=tmx_ptile(2)); 0226 elseif tail==1, 0227 %upper tailed 0228 tmx_ptile=prctile(mx_t,100-100*alph); 0229 est_alpha=mean(mx_t>=tmx_ptile); 0230 else 0231 %tail=-1, lower tailed 0232 tmx_ptile=prctile(mx_t,alph*100); 0233 est_alpha=mean(mx_t<=tmx_ptile); 0234 end 0235 if verblevel~=0, 0236 fprintf('Desired family-wise alpha level: %f\n',alph); 0237 fprintf('Estimated actual family-wise alpha level: %f\n',est_alpha); 0238 end 0239 0240 %Compute t-score differences of actual observations 0241 [dummy t_orig]=tMtmax2(data,1:n_subsA,(n_subsA+1):total_subs,n_subsA,n_subsB); 0242 pval=zeros(n_chan,n_tpt); 0243 %compute p-values 0244 for t=1:n_tpt, 0245 for c=1:n_chan, 0246 if tail==0, 0247 pval(c,t)=mean(mx_t>=abs(t_orig(c,t))); %note mx_t are now all positive due to abs command above 0248 elseif tail==1, 0249 pval(c,t)=mean(mx_t>=t_orig(c,t)); 0250 elseif tail==-1, 0251 pval(c,t)=mean(mx_t<=t_orig(c,t)); 0252 end 0253 end 0254 end 0255 0256 0257 function [mx_t, all_t]=tMtmax2(dat,grp1,grp2,n_subsA,n_subsB) 0258 % converts mean of both groups into t-scores and simply takes the 0259 % difference 0260 % 0261 % might make this faster by moving it into the code 0262 % 0263 % Could speed this up by pre-computing sqrt(n_subs) and sqrt(n_subsA-1) 0264 0265 x1=dat(:,:,grp1); 0266 x2=dat(:,:,grp2); 0267 0268 sm1=sum(x1,3); 0269 mn1=sm1/n_subsA; 0270 ss1=sum(x1.^2,3)-(sm1.^2)/n_subsA; 0271 stder1=sqrt(ss1/(n_subsA-1))/sqrt(n_subsA); 0272 0273 sm2=sum(x2,3); 0274 mn2=sm2/n_subsB; 0275 ss2=sum(x2.^2,3)-(sm2.^2)/n_subsB; 0276 stder2=sqrt(ss2/(n_subsB-1))/sqrt(n_subsB); 0277 0278 all_t=(mn1./stder1)-(mn2./stder2); 0279 [mx1 id_row]=max(abs(all_t)); 0280 [mx2 id_col]=max(mx1); 0281 mx_t=all_t(id_row(id_col),id_col); 0282 0283 0284