% fdr_bh() - Executes the Benjamini & Hochberg (1995) and the Benjamini &
%            Yekutieli (2001) procedure for controlling the false discovery
%            rate (FDR) of a family of hypothesis tests. FDR is the expected
%            proportion of rejected hypotheses that are mistakenly rejected
%            (i.e., the null hypothesis is actually true for those tests).
%            FDR is a somewhat less conservative/more powerful method for
%            correcting for multiple comparisons than procedures like
%            Bonferroni correction that provide strong control of the
%            family-wise error rate (i.e., the probability that one or more
%            null hypotheses are mistakenly rejected).
%
% Usage:
%  >> [h, crit_p, adj_p]=fdr_bh(pvals,q,method,report);
%
% Required Input:
%   pvals  - A vector or matrix (two dimensions or more) containing the
%            p-value of each individual test in a family of tests.
%
% Optional Inputs:
%   q      - The desired false discovery rate. {default: 0.05}
%   method - ['pdep' or 'dep'] If 'pdep', the original Benjamini & Hochberg
%            FDR procedure is used, which is guaranteed to be accurate if
%            the individual tests are independent or positively dependent
%            (e.g., Gaussian variables that are positively correlated or
%            independent). If 'dep', the FDR procedure described in
%            Benjamini & Yekutieli (2001), which is guaranteed to be
%            accurate for any test dependency structure (e.g., Gaussian
%            variables with any covariance matrix), is used. 'dep' is
%            always appropriate to use but is less powerful than 'pdep'.
%            {default: 'pdep'}
%   report - ['yes' or 'no'] If 'yes', a brief summary of the FDR results
%            is output to the MATLAB command line. {default: 'no'}
%
% Outputs:
%   h      - A binary vector or matrix of the same size as the input
%            "pvals". If the ith element of h is 1, then the test that
%            produced the ith p-value in pvals is significant (i.e., the
%            null hypothesis of the test is rejected).
%   crit_p - All uncorrected p-values less than or equal to crit_p are
%            significant (i.e., their null hypotheses are rejected). If no
%            p-values are significant, crit_p=0.
%   adj_p  - All adjusted p-values less than or equal to q are significant
%            (i.e., their null hypotheses are rejected). Note, adjusted
%            p-values can be greater than 1.
%
% References:
%   Benjamini, Y. & Hochberg, Y. (1995) Controlling the false discovery
%     rate: A practical and powerful approach to multiple testing. Journal
%     of the Royal Statistical Society, Series B (Methodological), 57(1),
%     289-300.
%
%   Benjamini, Y. & Yekutieli, D. (2001) The control of the false discovery
%     rate in multiple testing under dependency. The Annals of Statistics,
%     29(4), 1165-1188.
%
% Example:
%   [dummy, p_null]=ttest(randn(12,15));    %15 tests where the null
%                                           %hypothesis is true
%   [dummy, p_effect]=ttest(randn(12,5)+1); %5 tests where the null
%                                           %hypothesis is false
%   [h, crit_p, adj_p]=fdr_bh([p_null p_effect],.05,'pdep','yes');
%
% For a review of false discovery rate control and other contemporary
% techniques for correcting for multiple comparisons see:
%
%   Groppe, D.M., Urbach, T.P., & Kutas, M. (2011) Mass univariate analysis
%     of event-related brain potentials/fields I: A critical tutorial
%     review. Psychophysiology, 48(12), 1711-1725,
%     DOI: 10.1111/j.1469-8986.2011.01273.x
%     http://www.cogsci.ucsd.edu/~dgroppe/PUBLICATIONS/mass_uni_preprint1.pdf
%
% Author:
%   David M. Groppe
%   Kutaslab
%   Dept. of Cognitive Science
%   University of California, San Diego
%   March 24, 2010

%%%%%%%%%%%%%%%% REVISION LOG %%%%%%%%%%%%%%%%%
%
% 5/7/2010  - Added FDR adjusted p-values
% 5/14/2013 - D.H.J. Poot, Erasmus MC, improved run-time complexity
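% Summary of the decision rule implemented below, written out as formulas
% (an explanatory note describing the code, not part of the original
% header). With the p-values sorted in ascending order,
% p(1) <= p(2) <= ... <= p(m), the step-up procedure finds the largest k
% such that
%
%   p(k) <= k*q/m                     (method 'pdep'; Benjamini & Hochberg, 1995)
%   p(k) <= k*q/(m*sum(1./(1:m)))     (method 'dep';  Benjamini & Yekutieli, 2001)
%
% and rejects the null hypotheses of the tests with the k smallest
% p-values, so crit_p = p(k). The adjusted p-value of the ith smallest
% p-value is the minimum over j >= i of m*p(j)/j (with m replaced by the
% 'dep' denominator when method='dep'), which is what the wtd_p/adj_p code
% below computes.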
function [h, crit_p, adj_p]=fdr_bh(pvals,q,method,report)

if nargin<1
    error('You need to provide a vector or matrix of p-values.');
else
    if ~isempty(find(pvals<0,1))
        error('Some p-values are less than 0.');
    elseif ~isempty(find(pvals>1,1))
        error('Some p-values are greater than 1.');
    end
end

if nargin<2
    q=.05;
end

if nargin<3
    method='pdep';
end

if nargin<4
    report='no';
end

s=size(pvals);
if (length(s)>2) || s(1)>1
    [p_sorted, sort_ids]=sort(reshape(pvals,1,prod(s)));
else
    %p-values are already a row vector
    [p_sorted, sort_ids]=sort(pvals);
end
[dummy, unsort_ids]=sort(sort_ids); %indexes to return p_sorted to pvals order
m=length(p_sorted); %number of tests

if strcmpi(method,'pdep')
    %BH procedure for independence or positive dependence
    thresh=(1:m)*q/m;
    wtd_p=m*p_sorted./(1:m);
elseif strcmpi(method,'dep')
    %BH procedure for any dependency structure
    denom=m*sum(1./(1:m));
    thresh=(1:m)*q/denom;
    wtd_p=denom*p_sorted./(1:m);
    %Note, it can produce adjusted p-values greater than 1!
else
    error('Argument ''method'' needs to be ''pdep'' or ''dep''.');
end

if nargout>2
    %compute adjusted p-values: adj_p(i) is the smallest weighted p-value
    %among the sorted tests i..m (step-up adjustment)
    adj_p=zeros(1,m)*NaN;
    [wtd_p_sorted, wtd_p_sindex] = sort( wtd_p );
    nextfill = 1;
    for k = 1 : m
        if wtd_p_sindex(k)>=nextfill
            adj_p(nextfill:wtd_p_sindex(k)) = wtd_p_sorted(k);
            nextfill = wtd_p_sindex(k)+1;
            if nextfill>m
                break;
            end
        end
    end
    adj_p=reshape(adj_p(unsort_ids),s);
end

rej=p_sorted<=thresh;
max_id=find(rej,1,'last'); %find the largest significant p-value
if isempty(max_id)
    crit_p=0;
    h=pvals*0;
else
    crit_p=p_sorted(max_id);
    h=pvals<=crit_p;
end

if strcmpi(report,'yes')
    n_sig=sum(p_sorted<=crit_p);
    if n_sig==1
        fprintf('Out of %d tests, %d is significant using a false discovery rate of %f.\n',m,n_sig,q);
    else
        fprintf('Out of %d tests, %d are significant using a false discovery rate of %f.\n',m,n_sig,q);
    end
    if strcmpi(method,'pdep')
        fprintf('FDR procedure used is guaranteed valid for independent or positively dependent tests.\n');
    else
        fprintf('FDR procedure used is guaranteed valid for independent or dependent tests.\n');
    end
end
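% Worked example (illustrative p-values chosen for this note, not from the
% original file): calling
%
%   [h, crit_p, adj_p] = fdr_bh([.01 .02 .03 .04], .05, 'pdep');
%
% gives m = 4 and step-up thresholds k*q/m = [.0125 .025 .0375 .05]. Each
% sorted p-value falls at or below its threshold, so h = [1 1 1 1],
% crit_p = 0.04, and the adjusted p-values min over j>=i of m*p(j)/j are
% [.04 .04 .04 .04].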