Home > matlabmk > fdr_bky.m

fdr_bky

PURPOSE ^

fdr_bky() - Executes the "two-stage" Benjamini, Krieger, & Yekutieli (2006)

SYNOPSIS ^

function [h crit_p]=fdr_bky(p_values,q,report)

DESCRIPTION ^

 fdr_bky() - Executes the "two-stage" Benjamini, Krieger, & Yekutieli (2006)
             procedure for controlling the false discovery rate (FDR) of a
             family of hypothesis tests. FDR is the expected proportion of
             rejected hypotheses that are mistakenly rejected (i.e., the null
             hypothesis is actually true for those tests). FDR is a
             somewhat less conservative/more powerful method for correcting
             for multiple comparisons than procedures like Bonferroni
             correction that provide strong control of the family-wise
             error rate (i.e., the probability that one or more null
             hypotheses are mistakenly rejected).
               The procedure implemented by this function is more powerful
             than the original Benjamini & Hochberg (1995) procedure when 
             a considerable percentage of the hypotheses in the family are
             false. To the best of my knowledge, this procedure is only
             guaranteed to control FDR if the tests are independent.
             However, simulations suggest that it can control FDR even
             when the tests are positively correlated (Benjamini et al., 
             2006).

 Usage:
  >> [h, crit_p]=fdr_bky(pvals,q,report);

 Required Input:
   pvals - A vector or matrix (two dimensions or more) containing the
           p-value of each individual test in a family of tests.

 Optional Inputs:
   q       - The desired false discovery rate. {default: 0.05}
   report  - ['yes' or 'no'] If 'yes', a brief summary of FDR results are
             output to the MATLAB command line {default: 'no'}


 Outputs:
   h       - A binary vector or matrix of the same size as the input "pvals."
             If the ith element of h is 1, then the test that produced the 
             ith p-value in pvals is significant (i.e., the null hypothesis
             of the test is rejected).
   crit_p  - All p-values less than or equal to crit_p are significant
             (i.e., their null hypotheses are rejected).  If no p-values are
             significant, crit_p=0.


 References:
   Benjamini, Y., Krieger, A.M., & Yekutieli, D. (2006) Adaptive linear 
     step-up procedures that control the false discovery rate. Biometrika.
     93(3), 491-507.

   Benjamini, Y. & Hochberg, Y. (1995) Controlling the false discovery
     rate: A practical and powerful approach to multiple testing. Journal
     of the Royal Statistical Society, Series B (Methodological). 57(1),
     289-300.

 Example:
   [dummy p_null]=ttest(randn(12,15)); %15 tests where the null hypothesis
                                       %is true
   [dummy p_effect]=ttest(randn(12,5)+1); %5 tests where the null
                                          %hypothesis is false
   [h crit_p]=fdr_bky([p_null p_effect],.05,'yes');


 For a review on false discovery rate control and other contemporary
 techniques for correcting for multiple comparisons see:

   Groppe, D.M., Urbach, T.P., & Kutas, M. (2011) Mass univariate analysis 
 of event-related brain potentials/fields I: A critical tutorial review. 
 Psychophysiology, 48(12) pp. 1711-1725, DOI: 10.1111/j.1469-8986.2011.01273.x 
 http://www.cogsci.ucsd.edu/~dgroppe/PUBLICATIONS/mass_uni_preprint1.pdf

 Author:
 David M. Groppe
 Kutaslab
 Dept. of Cognitive Science
 University of California, San Diego
 March 25, 2010

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SUBFUNCTIONS ^

SOURCE CODE ^

0001 % fdr_bky() - Executes the "two-stage" Benjamini, Krieger, & Yekutieli (2006)
0002 %             procedure for controlling the false discovery rate (FDR) of a
0003 %             family of hypothesis tests. FDR is the expected proportion of
0004 %             rejected hypotheses that are mistakenly rejected (i.e., the null
0005 %             hypothesis is actually true for those tests). FDR is a
0006 %             somewhat less conservative/more powerful method for correcting
0007 %             for multiple comparisons than procedures like Bonferroni
0008 %             correction that provide strong control of the family-wise
0009 %             error rate (i.e., the probability that one or more null
0010 %             hypotheses are mistakenly rejected).
0011 %               The procedure implemented by this function is more powerful
0012 %             than the original Benjamini & Hochberg (1995) procedure when
0013 %             a considerable percentage of the hypotheses in the family are
0014 %             false. To the best of my knowledge, this procedure is only
0015 %             guaranteed to control FDR if the tests are independent.
0016 %             However, simulations suggest that it can control FDR even
0017 %             when the tests are positively correlated (Benjamini et al.,
0018 %             2006).
0019 %
0020 % Usage:
0021 %  >> [h, crit_p]=fdr_bky(pvals,q,report);
0022 %
0023 % Required Input:
0024 %   pvals - A vector or matrix (two dimensions or more) containing the
0025 %           p-value of each individual test in a family of tests.
0026 %
0027 % Optional Inputs:
0028 %   q       - The desired false discovery rate. {default: 0.05}
0029 %   report  - ['yes' or 'no'] If 'yes', a brief summary of FDR results are
0030 %             output to the MATLAB command line {default: 'no'}
0031 %
0032 %
0033 % Outputs:
0034 %   h       - A binary vector or matrix of the same size as the input "pvals."
0035 %             If the ith element of h is 1, then the test that produced the
0036 %             ith p-value in pvals is significant (i.e., the null hypothesis
0037 %             of the test is rejected).
0038 %   crit_p  - All p-values less than or equal to crit_p are significant
0039 %             (i.e., their null hypotheses are rejected).  If no p-values are
0040 %             significant, crit_p=0.
0041 %
0042 %
0043 % References:
0044 %   Benjamini, Y., Krieger, A.M., & Yekutieli, D. (2006) Adaptive linear
0045 %     step-up procedures that control the false discovery rate. Biometrika.
0046 %     93(3), 491-507.
0047 %
0048 %   Benjamini, Y. & Hochberg, Y. (1995) Controlling the false discovery
0049 %     rate: A practical and powerful approach to multiple testing. Journal
0050 %     of the Royal Statistical Society, Series B (Methodological). 57(1),
0051 %     289-300.
0052 %
0053 % Example:
0054 %   [dummy p_null]=ttest(randn(12,15)); %15 tests where the null hypothesis
0055 %                                       %is true
0056 %   [dummy p_effect]=ttest(randn(12,5)+1); %5 tests where the null
0057 %                                          %hypothesis is false
0058 %   [h crit_p]=fdr_bky([p_null p_effect],.05,'yes');
0059 %
0060 %
0061 % For a review on false discovery rate control and other contemporary
0062 % techniques for correcting for multiple comparisons see:
0063 %
0064 %   Groppe, D.M., Urbach, T.P., & Kutas, M. (2011) Mass univariate analysis
0065 % of event-related brain potentials/fields I: A critical tutorial review.
0066 % Psychophysiology, 48(12) pp. 1711-1725, DOI: 10.1111/j.1469-8986.2011.01273.x
0067 % http://www.cogsci.ucsd.edu/~dgroppe/PUBLICATIONS/mass_uni_preprint1.pdf
0068 %
0069 % Author:
0070 % David M. Groppe
0071 % Kutaslab
0072 % Dept. of Cognitive Science
0073 % University of California, San Diego
0074 % March 25, 2010
0075 
0076 function [h crit_p]=fdr_bky(p_values,q,report)
0077 
0078 if nargin<1,
0079     error('You need to provide a vector or matrix of p-values.');
0080 else
0081     if ~isempty(find(p_values<0,1)),
0082         error('Some p-values are less than 0.');
0083     elseif ~isempty(find(p_values>1,1)),
0084         error('Some p-values are greater than 1.');
0085     end
0086 end
0087 
0088 if nargin<2,
0089     q=.05;
0090 end
0091 
0092 if nargin<3,
0093     report='no';
0094 end
0095 
0096 s=size(p_values);
0097 if (length(s)>2) || s(1)>1,
0098     p_sorted=sort(reshape(p_values,1,prod(s)));
0099 else
0100     %p-values are already a row vector
0101     p_sorted=sort(p_values);
0102 end
0103 m=length(p_sorted); %number of tests
0104 
0105 %STEP 1: Run classic Benjamini-Hochberg linear step up FDR procedure (BH) with
0106 %slightly more conservative q value
0107 q_prime=q/(1+q);
0108 [hh crit_p]=fdr_bh(p_sorted,q_prime,'pdep');
0109 r1=sum(hh);
0110 
0111 if r1==0,
0112     %NO hypotheses rejected, stop here
0113     crit_p=0;
0114     h=p_values*0;
0115 elseif r1==m,
0116     %ALL hypotheses rejected, stop here
0117     crit_p=p_sorted(end); %critical p-value is biggest p-value
0118     h=p_values<=crit_p;
0119 else
0120     %Continue on.....
0121     %STEP 2: r1=Estimated # of false hypotheses
0122     m0hat=m-r1; % m0hat=estimated # of true hypotheses
0123     %repeat BH with new q
0124     q_star=q_prime*m/m0hat;
0125     [hh crit_p]=fdr_bh(p_sorted,q_star,'pdep');
0126     h=p_values<=crit_p;
0127 end
0128 
0129 if strcmpi(report,'yes'),
0130     n_sig=sum(hh);
0131     if n_sig==1,
0132         fprintf('Out of %d tests, %d is significant using a false discovery rate of %f.\n',m,n_sig,q);
0133     else
0134         fprintf('Out of %d tests, %d are significant using a false discovery rate of %f.\n',m,n_sig,q);
0135     end
0136     fprintf('FDR two-stage procedure used is guaranteed valid for independent tests.\n');
0137 end
0138 
0139 
0140 function [h crit_p]=fdr_bh(pvals,q,method)
0141 % fdr_bh() - Executes the Benjamini & Hochberg (1995) procedure for
0142 %            controlling the false discovery rate (FDR) of a family of
0143 %            hypothesis tests. FDR is the expected proportion of rejected
0144 %            hypotheses that are mistakenly rejected (i.e., the null
0145 %            hypothesis is actually true for those tests). FDR is a
0146 %            somewhat less conservative/more powerful method for correcting
0147 %            for multiple comparisons than procedures like Bonferroni
0148 %            correction that provide strong control of the family-wise
0149 %            error rate (i.e., the probability that one or more null
0150 %            hypotheses are mistakenly rejected).
0151 %
0152 % Usage:
0153 %  >> [h, crit_p]=fdr_bh(pvals,q,method);
0154 %
0155 % Required Input:
0156 %   pvals - A vector or matrix (two dimensions or more) containing the
0157 %           p-value of each individual test in a family of tests.
0158 %   q     - The desired false discovery rate.
0159 %
0160 % Optional Inputs:
0161 %   method  - ['pdep' or 'dep'] If 'pdep,' the original Bejnamini & Hochberg
0162 %             FDR procedure is used, which is guaranteed to be accurate if
0163 %             the individual tests are independent or positively dependent
0164 %             (e.g., positively correlated).  If 'dep,' the FDR procedure
0165 %             described in Benjamini & Yekutieli (2001) that is guaranteed
0166 %             to be accurate for any test dependency structure (e.g.,
0167 %             positively and/or negatively correlated tests) is used. 'dep'
0168 %             is always appropriate to use but is less powerful than 'pdep.'
0169 %             {default: 'pdep'}
0170 %
0171 % Outputs:
0172 %   h       - A binary vector or matrix of the same size as the input "pvals."
0173 %             If the ith element of h is 1, then the test that produced the
0174 %             ith p-value in pvals is significant (i.e., the null hypothesis
0175 %             of the test is rejected).
0176 %   crit_p  - All p-values less than or equal to crit_p are significant
0177 %             (i.e., their null hypotheses are rejected).  If no p-values are
0178 %             significant, crit_p=0.
0179 %
0180 %
0181 % References:
0182 %   Benjamini, Y. & Hochberg, Y. (1995) Controlling the false discovery
0183 %     rate: A practical and powerful approach to multiple testing. Journal
0184 %     of the Royal Statistical Society, Series B (Methodological). 57(1),
0185 %     289-300.
0186 %
0187 %   Benjamini, Y. & Yekutieli, D. (2001) The control of the false discovery
0188 %     rate in multiple testing under dependency. The Annals of Statistics.
0189 %     29(4), 1165-1188.
0190 %
0191 % Example:
0192 %   [dummy p_null]=ttest(randn(12,15)); %15 tests where the null hypothesis
0193 %                                       %is true
0194 %   [dummy p_effect]=ttest(randn(12,5)+1); %5 tests where the null
0195 %                                          %hypothesis is false
0196 %   [h crit_p]=fdr_bh([p_null p_effect],.05,'pdep','yes');
0197 %
0198 %
0199 % Author:
0200 % David M. Groppe
0201 % Kutaslab
0202 % Dept. of Cognitive Science
0203 % University of California, San Diego
0204 % March 24, 2010
0205 
0206 
0207 if nargin<3,
0208     method='pdep';
0209 end
0210 
0211 %Note: pvals is already sorted, and a row vector
0212 m=length(pvals); %number of tests
0213 
0214 if strcmpi(method,'pdep'),
0215     %BH procedure for independence or positive dependence
0216     thresh=[1:m]*q/m;
0217 elseif strcmpi(method,'dep')
0218     %BH procedure for any dependency structure
0219     denom=m*sum(1./[1:m]);
0220     thresh=[1:m]*q/denom;
0221 else
0222     error('Argument ''method'' needs to be ''pdep'' or ''dep''.');
0223 end
0224 
0225 rej=pvals<=thresh;
0226 max_id=find(rej,1,'last'); %find greatest significant pvalue
0227 if isempty(max_id),
0228     crit_p=0;
0229     h=pvals*0;
0230 else
0231     crit_p=pvals(max_id);
0232     h=pvals<=crit_p;
0233 end
0234 
0235 
0236 
0237

Generated on Tue 10-May-2016 16:37:45 by m2html © 2005