Home > matlabmk > fdr_bh.m

fdr_bh

PURPOSE ^

fdr_bh() - Executes the Benjamini & Hochberg (1995) and the Benjamini &

SYNOPSIS ^

function [h crit_p adj_p]=fdr_bh(pvals,q,method,report)

DESCRIPTION ^

 fdr_bh() - Executes the Benjamini & Hochberg (1995) and the Benjamini &
            Yekutieli (2001) procedure for controlling the false discovery 
            rate (FDR) of a family of hypothesis tests. FDR is the expected
            proportion of rejected hypotheses that are mistakenly rejected 
            (i.e., the null hypothesis is actually true for those tests). 
            FDR is a somewhat less conservative/more powerful method for 
            correcting for multiple comparisons than procedures like Bonferroni
            correction that provide strong control of the family-wise
            error rate (i.e., the probability that one or more null
            hypotheses are mistakenly rejected).

 Usage:
  >> [h, crit_p, adj_p]=fdr_bh(pvals,q,method,report);

 Required Input:
   pvals - A vector or matrix (two dimensions or more) containing the
           p-value of each individual test in a family of tests.

 Optional Inputs:
   q       - The desired false discovery rate. {default: 0.05}
   method  - ['pdep' or 'dep'] If 'pdep,' the original Bejnamini & Hochberg
             FDR procedure is used, which is guaranteed to be accurate if
             the individual tests are independent or positively dependent
             (e.g., Gaussian variables that are positively correlated or
             independent).  If 'dep,' the FDR procedure
             described in Benjamini & Yekutieli (2001) that is guaranteed
             to be accurate for any test dependency structure (e.g.,
             Gaussian variables with any covariance matrix) is used. 'dep'
             is always appropriate to use but is less powerful than 'pdep.'
             {default: 'pdep'}
   report  - ['yes' or 'no'] If 'yes', a brief summary of FDR results are
             output to the MATLAB command line {default: 'no'}


 Outputs:
   h       - A binary vector or matrix of the same size as the input "pvals."
             If the ith element of h is 1, then the test that produced the 
             ith p-value in pvals is significant (i.e., the null hypothesis
             of the test is rejected).
   crit_p  - All uncorrected p-values less than or equal to crit_p are 
             significant (i.e., their null hypotheses are rejected).  If 
             no p-values are significant, crit_p=0.
   adj_p   - All adjusted p-values less than or equal to q are significant
             (i.e., their null hypotheses are rejected). Note, adjusted 
             p-values can be greater than 1.


 References:
   Benjamini, Y. & Hochberg, Y. (1995) Controlling the false discovery
     rate: A practical and powerful approach to multiple testing. Journal
     of the Royal Statistical Society, Series B (Methodological). 57(1),
     289-300.

   Benjamini, Y. & Yekutieli, D. (2001) The control of the false discovery
     rate in multiple testing under dependency. The Annals of Statistics.
     29(4), 1165-1188.

 Example:
   [dummy p_null]=ttest(randn(12,15)); %15 tests where the null hypothesis
                                       %is true
   [dummy p_effect]=ttest(randn(12,5)+1); %5 tests where the null
                                          %hypothesis is false
   [h crit_p adj_p]=fdr_bh([p_null p_effect],.05,'pdep','yes');


 For a review of false discovery rate control and other contemporary
 techniques for correcting for multiple comparisons see:

   Groppe, D.M., Urbach, T.P., & Kutas, M. (2011) Mass univariate analysis 
 of event-related brain potentials/fields I: A critical tutorial review. 
 Psychophysiology, 48(12) pp. 1711-1725, DOI: 10.1111/j.1469-8986.2011.01273.x 
 http://www.cogsci.ucsd.edu/~dgroppe/PUBLICATIONS/mass_uni_preprint1.pdf


 Author:
 David M. Groppe
 Kutaslab
 Dept. of Cognitive Science
 University of California, San Diego
 March 24, 2010

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

0001 % fdr_bh() - Executes the Benjamini & Hochberg (1995) and the Benjamini &
0002 %            Yekutieli (2001) procedure for controlling the false discovery
0003 %            rate (FDR) of a family of hypothesis tests. FDR is the expected
0004 %            proportion of rejected hypotheses that are mistakenly rejected
0005 %            (i.e., the null hypothesis is actually true for those tests).
0006 %            FDR is a somewhat less conservative/more powerful method for
0007 %            correcting for multiple comparisons than procedures like Bonferroni
0008 %            correction that provide strong control of the family-wise
0009 %            error rate (i.e., the probability that one or more null
0010 %            hypotheses are mistakenly rejected).
0011 %
0012 % Usage:
0013 %  >> [h, crit_p, adj_p]=fdr_bh(pvals,q,method,report);
0014 %
0015 % Required Input:
0016 %   pvals - A vector or matrix (two dimensions or more) containing the
0017 %           p-value of each individual test in a family of tests.
0018 %
0019 % Optional Inputs:
0020 %   q       - The desired false discovery rate. {default: 0.05}
0021 %   method  - ['pdep' or 'dep'] If 'pdep,' the original Bejnamini & Hochberg
0022 %             FDR procedure is used, which is guaranteed to be accurate if
0023 %             the individual tests are independent or positively dependent
0024 %             (e.g., Gaussian variables that are positively correlated or
0025 %             independent).  If 'dep,' the FDR procedure
0026 %             described in Benjamini & Yekutieli (2001) that is guaranteed
0027 %             to be accurate for any test dependency structure (e.g.,
0028 %             Gaussian variables with any covariance matrix) is used. 'dep'
0029 %             is always appropriate to use but is less powerful than 'pdep.'
0030 %             {default: 'pdep'}
0031 %   report  - ['yes' or 'no'] If 'yes', a brief summary of FDR results are
0032 %             output to the MATLAB command line {default: 'no'}
0033 %
0034 %
0035 % Outputs:
0036 %   h       - A binary vector or matrix of the same size as the input "pvals."
0037 %             If the ith element of h is 1, then the test that produced the
0038 %             ith p-value in pvals is significant (i.e., the null hypothesis
0039 %             of the test is rejected).
0040 %   crit_p  - All uncorrected p-values less than or equal to crit_p are
0041 %             significant (i.e., their null hypotheses are rejected).  If
0042 %             no p-values are significant, crit_p=0.
0043 %   adj_p   - All adjusted p-values less than or equal to q are significant
0044 %             (i.e., their null hypotheses are rejected). Note, adjusted
0045 %             p-values can be greater than 1.
0046 %
0047 %
0048 % References:
0049 %   Benjamini, Y. & Hochberg, Y. (1995) Controlling the false discovery
0050 %     rate: A practical and powerful approach to multiple testing. Journal
0051 %     of the Royal Statistical Society, Series B (Methodological). 57(1),
0052 %     289-300.
0053 %
0054 %   Benjamini, Y. & Yekutieli, D. (2001) The control of the false discovery
0055 %     rate in multiple testing under dependency. The Annals of Statistics.
0056 %     29(4), 1165-1188.
0057 %
0058 % Example:
0059 %   [dummy p_null]=ttest(randn(12,15)); %15 tests where the null hypothesis
0060 %                                       %is true
0061 %   [dummy p_effect]=ttest(randn(12,5)+1); %5 tests where the null
0062 %                                          %hypothesis is false
0063 %   [h crit_p adj_p]=fdr_bh([p_null p_effect],.05,'pdep','yes');
0064 %
0065 %
0066 % For a review of false discovery rate control and other contemporary
0067 % techniques for correcting for multiple comparisons see:
0068 %
0069 %   Groppe, D.M., Urbach, T.P., & Kutas, M. (2011) Mass univariate analysis
0070 % of event-related brain potentials/fields I: A critical tutorial review.
0071 % Psychophysiology, 48(12) pp. 1711-1725, DOI: 10.1111/j.1469-8986.2011.01273.x
0072 % http://www.cogsci.ucsd.edu/~dgroppe/PUBLICATIONS/mass_uni_preprint1.pdf
0073 %
0074 %
0075 % Author:
0076 % David M. Groppe
0077 % Kutaslab
0078 % Dept. of Cognitive Science
0079 % University of California, San Diego
0080 % March 24, 2010
0081 
0082 %%%%%%%%%%%%%%%% REVISION LOG %%%%%%%%%%%%%%%%%
0083 %
0084 % 5/7/2010-Added FDR adjusted p-values
0085 % 5/14/2013- D.H.J. Poot, Erasmus MC, improved run-time complexity
0086 
0087 function [h crit_p adj_p]=fdr_bh(pvals,q,method,report)
0088 
0089 if nargin<1,
0090     error('You need to provide a vector or matrix of p-values.');
0091 else
0092     if ~isempty(find(pvals<0,1)),
0093         error('Some p-values are less than 0.');
0094     elseif ~isempty(find(pvals>1,1)),
0095         error('Some p-values are greater than 1.');
0096     end
0097 end
0098 
0099 if nargin<2,
0100     q=.05;
0101 end
0102 
0103 if nargin<3,
0104     method='pdep';
0105 end
0106 
0107 if nargin<4,
0108     report='no';
0109 end
0110 
0111 s=size(pvals);
0112 if (length(s)>2) || s(1)>1,
0113     [p_sorted, sort_ids]=sort(reshape(pvals,1,prod(s)));
0114 else
0115     %p-values are already a row vector
0116     [p_sorted, sort_ids]=sort(pvals);
0117 end
0118 [dummy, unsort_ids]=sort(sort_ids); %indexes to return p_sorted to pvals order
0119 m=length(p_sorted); %number of tests
0120 
0121 if strcmpi(method,'pdep'),
0122     %BH procedure for independence or positive dependence
0123     thresh=(1:m)*q/m;
0124     wtd_p=m*p_sorted./(1:m);
0125     
0126 elseif strcmpi(method,'dep')
0127     %BH procedure for any dependency structure
0128     denom=m*sum(1./(1:m));
0129     thresh=(1:m)*q/denom;
0130     wtd_p=denom*p_sorted./[1:m];
0131     %Note, it can produce adjusted p-values greater than 1!
0132     %compute adjusted p-values
0133 else
0134     error('Argument ''method'' needs to be ''pdep'' or ''dep''.');
0135 end
0136 
0137 if nargout>2,
0138     %compute adjusted p-values
0139     adj_p=zeros(1,m)*NaN;
0140     [wtd_p_sorted, wtd_p_sindex] = sort( wtd_p );
0141     nextfill = 1;
0142     for k = 1 : m
0143         if wtd_p_sindex(k)>=nextfill
0144             adj_p(nextfill:wtd_p_sindex(k)) = wtd_p_sorted(k);
0145             nextfill = wtd_p_sindex(k)+1;
0146             if nextfill>m
0147                 break;
0148             end;
0149         end;
0150     end;
0151     adj_p=reshape(adj_p(unsort_ids),s);
0152 end
0153 
0154 rej=p_sorted<=thresh;
0155 max_id=find(rej,1,'last'); %find greatest significant pvalue
0156 if isempty(max_id),
0157     crit_p=0;
0158     h=pvals*0;
0159 else
0160     crit_p=p_sorted(max_id);
0161     h=pvals<=crit_p;
0162 end
0163 
0164 if strcmpi(report,'yes'),
0165     n_sig=sum(p_sorted<=crit_p);
0166     if n_sig==1,
0167         fprintf('Out of %d tests, %d is significant using a false discovery rate of %f.\n',m,n_sig,q);
0168     else
0169         fprintf('Out of %d tests, %d are significant using a false discovery rate of %f.\n',m,n_sig,q);
0170     end
0171     if strcmpi(method,'pdep'),
0172         fprintf('FDR procedure used is guaranteed valid for independent or positively dependent tests.\n');
0173     else
0174         fprintf('FDR procedure used is guaranteed valid for independent or dependent tests.\n');
0175     end
0176 end
0177 
0178 
0179 
0180

Generated on Tue 10-May-2016 16:37:45 by m2html © 2005