% fdr_bky() - Executes the "two-stage" Benjamini, Krieger, & Yekutieli (2006)
%             procedure for controlling the false discovery rate (FDR) of a
%             family of hypothesis tests. FDR is the expected proportion of
%             rejected hypotheses that are mistakenly rejected (i.e., the null
%             hypothesis is actually true for those tests). FDR is a
%             somewhat less conservative/more powerful method for correcting
%             for multiple comparisons than procedures like Bonferroni
%             correction that provide strong control of the family-wise
%             error rate (i.e., the probability that one or more null
%             hypotheses are mistakenly rejected).
%             The procedure implemented by this function is more powerful
%             than the original Benjamini & Hochberg (1995) procedure when
%             a considerable percentage of the hypotheses in the family are
%             false. To the best of my knowledge, this procedure is only
%             guaranteed to control FDR if the tests are independent.
%             However, simulations suggest that it can control FDR even
%             when the tests are positively correlated (Benjamini et al.,
%             2006).
%
% Usage:
%  >> [h, crit_p]=fdr_bky(pvals,q,report);
%
% Required Input:
%   pvals  - A vector or matrix (two dimensions or more) containing the
%            p-value of each individual test in a family of tests.
%
% Optional Inputs:
%   q      - The desired false discovery rate. {default: 0.05}
%   report - ['yes' or 'no'] If 'yes', a brief summary of FDR results is
%            output to the MATLAB command line. {default: 'no'}
%
% Outputs:
%   h      - A binary vector or matrix of the same size as the input "pvals."
%            If the ith element of h is 1, then the test that produced the
%            ith p-value in pvals is significant (i.e., the null hypothesis
%            of the test is rejected).
%   crit_p - All p-values less than or equal to crit_p are significant
%            (i.e., their null hypotheses are rejected). If no p-values are
%            significant, crit_p=0.
%
% References:
%   Benjamini, Y., Krieger, A.M., & Yekutieli, D. (2006) Adaptive linear
%     step-up procedures that control the false discovery rate. Biometrika.
%     93(3), 491-507.
%
%   Benjamini, Y. & Hochberg, Y. (1995) Controlling the false discovery
%     rate: A practical and powerful approach to multiple testing. Journal
%     of the Royal Statistical Society, Series B (Methodological). 57(1),
%     289-300.
%
% Example:
%   [dummy, p_null]=ttest(randn(12,15));    %15 tests where the null
%                                           %hypothesis is true
%   [dummy, p_effect]=ttest(randn(12,5)+1); %5 tests where the null
%                                           %hypothesis is false
%   [h, crit_p]=fdr_bky([p_null p_effect],.05,'yes');
%
% For a review of false discovery rate control and other contemporary
% techniques for correcting for multiple comparisons see:
%
%   Groppe, D.M., Urbach, T.P., & Kutas, M. (2011) Mass univariate analysis
%     of event-related brain potentials/fields I: A critical tutorial review.
%     Psychophysiology, 48(12), 1711-1725, DOI: 10.1111/j.1469-8986.2011.01273.x
%     http://www.cogsci.ucsd.edu/~dgroppe/PUBLICATIONS/mass_uni_preprint1.pdf
%
% Author:
%   David M. Groppe
%   Kutaslab
%   Dept. of Cognitive Science
%   University of California, San Diego
%   March 25, 2010

function [h, crit_p]=fdr_bky(p_values,q,report)

if nargin<1
    error('You need to provide a vector or matrix of p-values.');
else
    if ~isempty(find(p_values<0,1))
        error('Some p-values are less than 0.');
    elseif ~isempty(find(p_values>1,1))
        error('Some p-values are greater than 1.');
    end
end

if nargin<2
    q=.05;
end

if nargin<3
    report='no';
end

s=size(p_values);
if (length(s)>2) || s(1)>1
    p_sorted=sort(reshape(p_values,1,prod(s)));
else
    %p-values are already a row vector
    p_sorted=sort(p_values);
end
m=length(p_sorted); %number of tests

%STEP 1: Run the classic Benjamini-Hochberg linear step-up FDR procedure (BH)
%with a slightly more conservative q value
q_prime=q/(1+q);
[hh, crit_p]=fdr_bh(p_sorted,q_prime,'pdep');
r1=sum(hh);

if r1==0
    %NO hypotheses rejected, stop here
    crit_p=0;
    h=p_values*0;
elseif r1==m
    %ALL hypotheses rejected, stop here
    crit_p=p_sorted(end); %critical p-value is the biggest p-value
    h=p_values<=crit_p;
else
    %Continue on...
    %STEP 2: r1 = estimated # of false null hypotheses
    m0hat=m-r1; %m0hat = estimated # of true null hypotheses
    %repeat BH with the new q
    q_star=q_prime*m/m0hat;
    [hh, crit_p]=fdr_bh(p_sorted,q_star,'pdep');
    h=p_values<=crit_p;
end

if strcmpi(report,'yes')
    n_sig=sum(hh);
    if n_sig==1
        fprintf('Out of %d tests, %d is significant using a false discovery rate of %f.\n',m,n_sig,q);
    else
        fprintf('Out of %d tests, %d are significant using a false discovery rate of %f.\n',m,n_sig,q);
    end
    fprintf('FDR two-stage procedure used is guaranteed valid for independent tests.\n');
end


function [h, crit_p]=fdr_bh(pvals,q,method)
% fdr_bh() - Executes the Benjamini & Hochberg (1995) procedure for
%            controlling the false discovery rate (FDR) of a family of
%            hypothesis tests. FDR is the expected proportion of rejected
%            hypotheses that are mistakenly rejected (i.e., the null
%            hypothesis is actually true for those tests). FDR is a
%            somewhat less conservative/more powerful method for correcting
%            for multiple comparisons than procedures like Bonferroni
%            correction that provide strong control of the family-wise
%            error rate (i.e., the probability that one or more null
%            hypotheses are mistakenly rejected).
%
% Usage:
%  >> [h, crit_p]=fdr_bh(pvals,q,method);
%
% Required Input:
%   pvals  - A vector or matrix (two dimensions or more) containing the
%            p-value of each individual test in a family of tests.
%   q      - The desired false discovery rate.
%
% Optional Inputs:
%   method - ['pdep' or 'dep'] If 'pdep', the original Benjamini & Hochberg
%            FDR procedure is used, which is guaranteed to be accurate if
%            the individual tests are independent or positively dependent
%            (e.g., positively correlated). If 'dep', the FDR procedure
%            described in Benjamini & Yekutieli (2001) that is guaranteed
%            to be accurate for any test dependency structure (e.g.,
%            positively and/or negatively correlated tests) is used. 'dep'
%            is always appropriate to use but is less powerful than 'pdep'.
%            {default: 'pdep'}
%
% Outputs:
%   h      - A binary vector or matrix of the same size as the input "pvals."
%            If the ith element of h is 1, then the test that produced the
%            ith p-value in pvals is significant (i.e., the null hypothesis
%            of the test is rejected).
%   crit_p - All p-values less than or equal to crit_p are significant
%            (i.e., their null hypotheses are rejected). If no p-values are
%            significant, crit_p=0.
%
% References:
%   Benjamini, Y. & Hochberg, Y. (1995) Controlling the false discovery
%     rate: A practical and powerful approach to multiple testing. Journal
%     of the Royal Statistical Society, Series B (Methodological). 57(1),
%     289-300.
%
%   Benjamini, Y. & Yekutieli, D. (2001) The control of the false discovery
%     rate in multiple testing under dependency. The Annals of Statistics.
%     29(4), 1165-1188.
%
% Example:
%   [dummy, p_null]=ttest(randn(12,15));    %15 tests where the null
%                                           %hypothesis is true
%   [dummy, p_effect]=ttest(randn(12,5)+1); %5 tests where the null
%                                           %hypothesis is false
%   [h, crit_p]=fdr_bh([p_null p_effect],.05,'pdep');
%
% Author:
%   David M. Groppe
%   Kutaslab
%   Dept. of Cognitive Science
%   University of California, San Diego
%   March 24, 2010

if nargin<3
    method='pdep';
end

%Note: pvals is already sorted and a row vector
m=length(pvals); %number of tests

if strcmpi(method,'pdep')
    %BH procedure for independence or positive dependence
    thresh=(1:m)*q/m;
elseif strcmpi(method,'dep')
    %BH procedure for any dependency structure
    denom=m*sum(1./(1:m));
    thresh=(1:m)*q/denom;
else
    error('Argument ''method'' needs to be ''pdep'' or ''dep''.');
end

rej=pvals<=thresh;
max_id=find(rej,1,'last'); %index of the greatest significant p-value
if isempty(max_id)
    crit_p=0;
    h=pvals*0;
else
    crit_p=pvals(max_id);
    h=pvals<=crit_p;
end
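For readers working outside MATLAB, the two-stage logic above can be sketched in Python/NumPy. This is a minimal port written for illustration, not the author's code; the function names simply mirror the MATLAB ones, and it returns only the rejection mask and critical p-value (no report option):

```python
import numpy as np

def fdr_bh(p_sorted, q):
    """BH linear step-up on an ascending array of p-values.
    Returns the critical p-value (0.0 if nothing is rejected)."""
    m = p_sorted.size
    thresh = np.arange(1, m + 1) * q / m   # k*q/m for k = 1..m
    below = np.nonzero(p_sorted <= thresh)[0]
    return p_sorted[below[-1]] if below.size else 0.0

def fdr_bky(pvals, q=0.05):
    """Two-stage Benjamini, Krieger, & Yekutieli (2006) FDR procedure.
    Returns (reject_mask, crit_p), with reject_mask shaped like pvals."""
    p = np.asarray(pvals, dtype=float)
    p_sorted = np.sort(p.ravel())
    m = p_sorted.size
    # Stage 1: BH at the more conservative level q' = q/(1+q)
    q_prime = q / (1 + q)
    crit_p = fdr_bh(p_sorted, q_prime)
    r1 = int(np.sum(p_sorted <= crit_p)) if crit_p > 0 else 0
    if r1 == 0:           # nothing rejected: stop
        return np.zeros(p.shape, dtype=bool), 0.0
    if r1 == m:           # everything rejected: stop
        return p <= p_sorted[-1], p_sorted[-1]
    # Stage 2: rerun BH with q* = q' * m / m0hat,
    # where m0hat = m - r1 estimates the number of true nulls
    m0hat = m - r1
    crit_p = fdr_bh(p_sorted, q_prime * m / m0hat)
    return p <= crit_p, crit_p
```

For example, `fdr_bky([0.001, 0.002, 0.003, 0.5, 0.6], 0.05)` rejects the three small p-values with `crit_p = 0.003`, matching a hand calculation of the two stages.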