Home > matlabmk > mxtdif_perm2.m

mxtdif_perm2

PURPOSE ^

function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxtdif_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state)

SYNOPSIS ^

function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxtdif_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state)

DESCRIPTION ^

function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxtdif_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state)

 mxtdif_perm2-Two sample permutation test based on normalizing the mean of
 each group to a t-score taking the difference between the two.  The null
 hypothesis is a mean difference of zero.  For one-tailed tests, the null 
 distribution is derived from the most extreme t-score difference of all 
 permutations.  For two-tailed tests, the null distribution is derived from 
 the maximum absolute t-score difference of all permutations.  This function 
 can handle multiple electrodes and time points.  Based on simulated ERP 
 data sets, the t-score difference appears to be less sensitive to differences 
 in variance between groups than the standard independent samples t-statistic 
 or Welsh's t (esp. when group sizes are unequal).  However, it is probably 
 less powerful than these other two statistics.  Thus you should probably 
 use this function if the numbers of participants in each group are not equal 
 and you suspect that one group is much more variable (from trial to trail 
 or participant to participant) than the other. Note, normalzing a group's
 data to a t-score is obtained by simply dividing the mean of the data by
 its standard deviation.  This is often referred to as a "z-score" but
 technically it is a "t-statistic" since the standard deviation of the
 group is estimated from the data (i.e., it is not known a priori).

 Inputs:
  dataA  - 3D matrix of ERPs from Group A (Channel x Time x Participant)
  dataB  - 3D matrix of ERPs from Group B (Channel x Time x Participant)
  n_perm - number of permutations {default=1000}.  Manly (1997) suggests
           using at least 1000 permutations for an alpha level of 0.05 and 
           at least 5000 permutations for an alpha level of 0.01.
  alph   - test alpha level {default=.05}
  tail   - [1 | 0 | -1] If tail=1, the alternative hypothesis is that the
           mean of data is greater than 0 (upper tailed test).  If tail=0,
           the alternative hypothesis is that the mean of data is different
           than 0 (two tailed test).  If tail=-1, the alternative hypothesis
           is that the mean of the data is less than 0 (lower tailed test).
           {default: 0}

 Optional Inputs:
  verblevel   - An integer specifiying the amount of information you want
                this function to provide about what it is doing during runtime.
                 Options are:
                    0 - quiet, only show errors, warnings, and EEGLAB reports
                    1 - stuff anyone should probably know
                    2 - stuff you should know the first time you start working
                        with a data set {default value}
                    3 - stuff that might help you debug (show all
                        reports)
  seed_state  - The initial state of the random number generating stream
                (see MATLAB documentation for "randstream"). If you pass
                a value from a previous run of this function, it should
                reproduce the exact same values.

 Outputs:
  pval       - p-value at each time point and electrode (corrected for
                multiple comparisons via permutation test)
  t_orig     - t-score at each time point and electrode
  tmx_ptile  - critical t-scores for given alpha level
  seed_state - The initial state of the random number generating stream
               (see MATLAB documentation for "randstream") used to 
               generate the permutations. You can use this to reproduce
               the output of this function.
  est_alpha  - The estimated family-wise alpha level of the test.  With 
               permutation tests, a finite number of p-values are possible.
               This function tries to use an alpha level that is as close 
               as possible to the desired alpha level.  However, if the 
               sample size is small, a very limited number of p-values are 
               possible and the desired family-wise alpha level may be 
               impossible to acheive.

 Notes:
 -The null hypothesis of this test is that the data in the two groups are
 "exchangeable" (i.e., the data from a participant in Group A was just as
 likely to have come from Group B).  If the distributions of the two
 populations you're comparing differ in *any* way (i.e., not just in
 central tendency), the null hypothesis will be violated.  For example, if
 the ERPs of the two groups differ in their variability, but not their
 average value (e.g., one set of ERPs is from a group of fidgety, thin 
 skulled children and the other highly cooperative adults), you'll be more 
 likely than alpha to reject the null hypothesis.  Uses differences in 
 t-scores ameliorates the test's sensitivity to between-group differences 
 in variance, but it does not completely remove it.

 -Unlike a parametric test (e.g., an ANOVA), a discrete set of p-values
 are possible (at most the number of possible permutations).  Since the
 number of possible permutations grows rapdily with the number of
 participants, this is only an issue for small sample sizes (e.g., 3
 participants in each group).  When you have such a small sample size, the
 limited number of p-values may make the test overly conservative (e.g., 
 you might be forced to use an alpha level of .0286 since it is the
 biggest possible alpha level less than .05).

 Author:
 David Groppe
 July, 2009
 Kutaslab, San Diego

 References:

 Manly, B.F.J. (1997) Randomization, bootstrap, and Monte Carlo methods in
 biology. 2nd ed. Chapmn and Hall, London.

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SUBFUNCTIONS ^

SOURCE CODE ^

0001 %function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxtdif_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state)
0002 %
0003 % mxtdif_perm2-Two sample permutation test based on normalizing the mean of
0004 % each group to a t-score taking the difference between the two.  The null
0005 % hypothesis is a mean difference of zero.  For one-tailed tests, the null
0006 % distribution is derived from the most extreme t-score difference of all
0007 % permutations.  For two-tailed tests, the null distribution is derived from
0008 % the maximum absolute t-score difference of all permutations.  This function
0009 % can handle multiple electrodes and time points.  Based on simulated ERP
0010 % data sets, the t-score difference appears to be less sensitive to differences
0011 % in variance between groups than the standard independent samples t-statistic
0012 % or Welsh's t (esp. when group sizes are unequal).  However, it is probably
0013 % less powerful than these other two statistics.  Thus you should probably
0014 % use this function if the numbers of participants in each group are not equal
0015 % and you suspect that one group is much more variable (from trial to trail
0016 % or participant to participant) than the other. Note, normalzing a group's
0017 % data to a t-score is obtained by simply dividing the mean of the data by
0018 % its standard deviation.  This is often referred to as a "z-score" but
0019 % technically it is a "t-statistic" since the standard deviation of the
0020 % group is estimated from the data (i.e., it is not known a priori).
0021 %
0022 % Inputs:
0023 %  dataA  - 3D matrix of ERPs from Group A (Channel x Time x Participant)
0024 %  dataB  - 3D matrix of ERPs from Group B (Channel x Time x Participant)
0025 %  n_perm - number of permutations {default=1000}.  Manly (1997) suggests
0026 %           using at least 1000 permutations for an alpha level of 0.05 and
0027 %           at least 5000 permutations for an alpha level of 0.01.
0028 %  alph   - test alpha level {default=.05}
0029 %  tail   - [1 | 0 | -1] If tail=1, the alternative hypothesis is that the
0030 %           mean of data is greater than 0 (upper tailed test).  If tail=0,
0031 %           the alternative hypothesis is that the mean of data is different
0032 %           than 0 (two tailed test).  If tail=-1, the alternative hypothesis
0033 %           is that the mean of the data is less than 0 (lower tailed test).
0034 %           {default: 0}
0035 %
0036 % Optional Inputs:
0037 %  verblevel   - An integer specifiying the amount of information you want
0038 %                this function to provide about what it is doing during runtime.
0039 %                 Options are:
0040 %                    0 - quiet, only show errors, warnings, and EEGLAB reports
0041 %                    1 - stuff anyone should probably know
0042 %                    2 - stuff you should know the first time you start working
0043 %                        with a data set {default value}
0044 %                    3 - stuff that might help you debug (show all
0045 %                        reports)
0046 %  seed_state  - The initial state of the random number generating stream
0047 %                (see MATLAB documentation for "randstream"). If you pass
0048 %                a value from a previous run of this function, it should
0049 %                reproduce the exact same values.
0050 %
0051 % Outputs:
0052 %  pval       - p-value at each time point and electrode (corrected for
0053 %                multiple comparisons via permutation test)
0054 %  t_orig     - t-score at each time point and electrode
0055 %  tmx_ptile  - critical t-scores for given alpha level
0056 %  seed_state - The initial state of the random number generating stream
0057 %               (see MATLAB documentation for "randstream") used to
0058 %               generate the permutations. You can use this to reproduce
0059 %               the output of this function.
0060 %  est_alpha  - The estimated family-wise alpha level of the test.  With
0061 %               permutation tests, a finite number of p-values are possible.
0062 %               This function tries to use an alpha level that is as close
0063 %               as possible to the desired alpha level.  However, if the
0064 %               sample size is small, a very limited number of p-values are
0065 %               possible and the desired family-wise alpha level may be
0066 %               impossible to acheive.
0067 %
0068 % Notes:
0069 % -The null hypothesis of this test is that the data in the two groups are
0070 % "exchangeable" (i.e., the data from a participant in Group A was just as
0071 % likely to have come from Group B).  If the distributions of the two
0072 % populations you're comparing differ in *any* way (i.e., not just in
0073 % central tendency), the null hypothesis will be violated.  For example, if
0074 % the ERPs of the two groups differ in their variability, but not their
0075 % average value (e.g., one set of ERPs is from a group of fidgety, thin
0076 % skulled children and the other highly cooperative adults), you'll be more
0077 % likely than alpha to reject the null hypothesis.  Uses differences in
0078 % t-scores ameliorates the test's sensitivity to between-group differences
0079 % in variance, but it does not completely remove it.
0080 %
0081 % -Unlike a parametric test (e.g., an ANOVA), a discrete set of p-values
0082 % are possible (at most the number of possible permutations).  Since the
0083 % number of possible permutations grows rapdily with the number of
0084 % participants, this is only an issue for small sample sizes (e.g., 3
0085 % participants in each group).  When you have such a small sample size, the
0086 % limited number of p-values may make the test overly conservative (e.g.,
0087 % you might be forced to use an alpha level of .0286 since it is the
0088 % biggest possible alpha level less than .05).
0089 %
0090 % Author:
0091 % David Groppe
0092 % July, 2009
0093 % Kutaslab, San Diego
0094 %
0095 % References:
0096 %
0097 % Manly, B.F.J. (1997) Randomization, bootstrap, and Monte Carlo methods in
0098 % biology. 2nd ed. Chapmn and Hall, London.
0099 
0100 %%%%%%%%%%%%%%%% REVISION LOG %%%%%%%%%%%%%%%%%
0101 %
0102 % ?/?/?-
0103 
0104 %%%%%%%%%%%%%%%% FUTURE IMPROVEMENTS %%%%%%%%%%%%%%%%%
0105 %
0106 % Test for equality of trials or variance?
0107 
0108 function [pval, t_orig, tmx_ptile, seed_state, est_alpha]=mxtdif_perm2(dataA,dataB,n_perm,alph,tail,verblevel,seed_state)
0109 
0110 if nargin<2,
0111     error('You need to provide data for two groups of participants.');
0112 end
0113 
0114 if nargin<3,
0115     n_perm=2000;
0116 end
0117 
0118 if nargin<4,
0119     alph=.05;
0120 elseif (alph>=1) || (alph<=0)
0121     error('Argument ''alph'' needs to be between 0 and 1.');
0122 end
0123 
0124 if alph<=.01 && n_perm<5000,
0125     watchit(sprintf('You are probably using too few permutations for an alpha level of %f. Type ">>help mxtdif_perm2" for more info.',alph));
0126 elseif alph<=.05 && n_perm<1000,
0127     watchit(sprintf('You are probably using too few permutations for an alpha level of %f. Type ">>help mxtdif_perm2" for more info.',alph));
0128 end
0129 
0130 if nargin<5,
0131     tail=0;
0132 elseif (tail~=0) && (tail~=1) && (tail~=-1),
0133     error('Argument ''tail'' needs to be 0,1, or -1.');
0134 end
0135 
0136 if nargin<6,
0137     verblevel=2;
0138 end
0139 
0140 defaultStream=RandStream.getDefaultStream; %random # generator state
0141 if (nargin<7) || isempty(seed_state),
0142     %Store state of random number generator
0143     seed_state=defaultStream.State;
0144 else
0145     defaultStream.State=seed_state; %reset random number generator to saved state
0146 end
0147 
0148 
0149 if length(size(dataA))~=3 || length(size(dataB))~=3 
0150     error('dataA and dataB need to be three dimensional (chan x time x participant)')
0151 end
0152 [n_chan, n_tpt, n_subsA]=size(dataA);
0153 [n_chan2, n_tpt2, n_subsB]=size(dataB);
0154 
0155 if verblevel,
0156     warning('off','all'); %for large # of subjects, nchoosek warns that its result is approximate
0157     n_psbl_prms=nchoosek(n_subsA+n_subsB,n_subsA);
0158     if n_psbl_prms<100,
0159         watchit(sprintf(['Due to the very limited number of participants in each group,' ...
0160             ' the total number of possible permutations is small.\nThus only a limited number of p-values (at most %d) are possible and the test might be overly conservative.'], ...
0161             n_psbl_prms));
0162     end
0163     warning('on','all');
0164 end
0165 
0166 if n_chan~=n_chan2,
0167     error('The number of channels in Group 1 (dataA) and Group 2 (dataB) need to be equal.');
0168 elseif n_tpt~=n_tpt2,
0169     error('The number of time points in Group 1 (dataA) and Group 2 (dataB) need to be equal.');
0170 end
0171 
0172 %combine data
0173 total_subs=n_subsA+n_subsB;
0174 data=dataA; 
0175 data(:,:,(n_subsA+1):total_subs)=dataB;
0176 
0177 if verblevel~=0,
0178     fprintf('mxt_perm2: Number of channels: %d\n',n_chan);
0179     fprintf('mxt_perm2: Number of time points: %d\n',n_tpt);
0180     fprintf('mxt_perm2: Total # of comparisons: %d\n',n_chan*n_tpt);
0181     fprintf('mxt_perm2: Number of participants in Group A: %d\n',n_subsA);
0182     fprintf('mxt_perm2: Number of participants in Group B: %d\n',n_subsB);
0183     %fprintf('t-score degrees of freedom: %d\n',total_subs-2); %edit for alt t-test?
0184 end
0185 VerbReport(sprintf('Executing permutation test with %d permutations...',n_perm),2,verblevel);
0186 if (verblevel>=2),
0187     fprintf('Permutations completed: ');
0188 end
0189 
0190 
0191 mx_t=zeros(1,n_perm);
0192 for perm=1:n_perm
0193     if ~rem(perm,100)
0194         if (verblevel>=2),
0195             if ~rem(perm-100,1000)
0196                 fprintf('%d',perm);
0197             else
0198                 fprintf(', %d',perm);
0199             end
0200             if ~rem(perm,1000)
0201                 fprintf('\n');
0202             end
0203         end
0204     end
0205     %randomly assign participants to conditions
0206     r=randperm(total_subs);
0207     grp1=r(1:n_subsA);
0208     grp2=r((n_subsA+1):total_subs);
0209     %compute most extreme t-score difference
0210     mx_t(perm)=tMtmax2(data,grp1,grp2,n_subsA,n_subsB);
0211 end
0212     
0213 %End of permutations, print carriage return if it hasn't already been done
0214 %(i.e., perm is NOT a multiple of 1000)
0215 if (verblevel>=2) && rem(perm,1000)
0216     fprintf('\n');
0217 end
0218 
0219 %Compute critical t's
0220 if tail==0,
0221     %two-tailed, test statistic is biggest absolute value of all t differences's
0222     mx_t=abs(mx_t);
0223     tmx_ptile(2)=prctile(mx_t,100-100*alph);
0224     tmx_ptile(1)=-tmx_ptile(2);
0225     est_alpha=mean(mx_t>=tmx_ptile(2));
0226 elseif tail==1,
0227     %upper tailed
0228     tmx_ptile=prctile(mx_t,100-100*alph);
0229     est_alpha=mean(mx_t>=tmx_ptile);
0230 else
0231     %tail=-1, lower tailed
0232     tmx_ptile=prctile(mx_t,alph*100);
0233     est_alpha=mean(mx_t<=tmx_ptile);
0234 end
0235 if verblevel~=0,
0236     fprintf('Desired family-wise alpha level: %f\n',alph);
0237     fprintf('Estimated actual family-wise alpha level: %f\n',est_alpha);
0238 end
0239 
0240 %Compute t-score differences of actual observations
0241 [dummy t_orig]=tMtmax2(data,1:n_subsA,(n_subsA+1):total_subs,n_subsA,n_subsB);
0242 pval=zeros(n_chan,n_tpt);
0243 %compute p-values
0244 for t=1:n_tpt,
0245     for c=1:n_chan,
0246         if tail==0,
0247             pval(c,t)=mean(mx_t>=abs(t_orig(c,t))); %note mx_t are now all positive due to abs command above
0248         elseif tail==1,
0249             pval(c,t)=mean(mx_t>=t_orig(c,t));
0250         elseif tail==-1,
0251             pval(c,t)=mean(mx_t<=t_orig(c,t));
0252         end
0253     end
0254 end
0255 
0256 
0257 function [mx_t, all_t]=tMtmax2(dat,grp1,grp2,n_subsA,n_subsB)
0258 % converts mean of both groups into t-scores and simply takes the
0259 % difference
0260 %
0261 % might make this faster by moving it into the code
0262 %
0263 % Could speed this up by pre-computing sqrt(n_subs) and sqrt(n_subsA-1)
0264 
0265 x1=dat(:,:,grp1);
0266 x2=dat(:,:,grp2);
0267 
0268 sm1=sum(x1,3);
0269 mn1=sm1/n_subsA;
0270 ss1=sum(x1.^2,3)-(sm1.^2)/n_subsA;
0271 stder1=sqrt(ss1/(n_subsA-1))/sqrt(n_subsA);
0272 
0273 sm2=sum(x2,3);
0274 mn2=sm2/n_subsB;
0275 ss2=sum(x2.^2,3)-(sm2.^2)/n_subsB;
0276 stder2=sqrt(ss2/(n_subsB-1))/sqrt(n_subsB);
0277 
0278 all_t=(mn1./stder1)-(mn2./stder2);
0279 [mx1 id_row]=max(abs(all_t));
0280 [mx2 id_col]=max(mx1);
0281 mx_t=all_t(id_row(id_col),id_col);
0282 
0283 
0284

Generated on Tue 10-May-2016 16:37:45 by m2html © 2005