Home > matlabmk > items_regress.m

items_regress

PURPOSE ^

items_regress() - Performs multiple regression analysis on ERPs to each

SYNOPSIS ^

function regress_results=items_regress(gui_infiles_or_tmplt,time_wind,varargin)

DESCRIPTION ^

 items_regress() - Performs multiple regression analysis on ERPs to each 
                experimental item. For each unique experimental item, the 
                function collects the trials involving that item from
                every participant in the study and forms an ERP from that 
                item.  A standard multiple regression analysis is then
                performed with the mean amplitude of the item ERPs in a
                particular time window at one or more electrodes as the
                response variable. When multiple electrodes are used, the 
                function corrects for multiple comparisons by using the 
                Bonferroni-Holm procedure (Holm, 1979).  Each electrode is
                considered a distinct test (i.e., the total number of
                electrodes equals the number of tests in the family of
                tests). Note, this approach to data analysis treats your
                experimental participants as fixed variables and your
                items of random variables.  Thus you are testing for
                effects that are likely to generalize across items on
                these exact same participants (i.e., the results might not
                generalize beyond these participants).
                
 Usage:
  >> regress_results=items_regress(gui_infiles_or_tmplt,time_wind,varargin);
 

 Required Inputs:
  gui_infiles_or_tmplt - ['gui', a cell array of strings, or a string template]
                         If 'gui', a GUI is created that allows you to select
                         which set files to import (this is probably the
                         easiest way to import files).  If a cell array of
                         of strings, each element of the cell array should
                         contain the filename of an EEGLAB set file (e.g.,
                         {'visodbl01.set','visodbl02.set'}.  Otherwise, this
                         input should be a filename template (i.e., a string with
                         # where the subject number should be--'visodbl#.set').
                         If you provide a template, you must use the option
                         'sub_ids' (see below) to specify the ID numbers of
                         each subject. Include the files' path unless the files
                         are in the current working directory.
  time_wind            - [pair of integers] The time window (in milliseconds)
                         across which to average ERP amplitude for the response
                         variable (e.g., [300 500]).

 Optional Inputs:
  predictors    - [cell array of strings] A cell array that specifies the
                  non-intercept predictors you wish to include in the
                  regression analysis (an intercept predictor is automatically
                  added).  The predictors must be fields of the EEG.epoch
                  field of the EEG struct variable stored in each
                  set file. If not specified, the list of possible
                  predictors will be presented on the command line and the
                  user will be asked to select from that list. {default:
                  not specified}
  sub_ids       - [integer vector] A set of integers specifying the
                  subject ID numbers to include in the grand average.
                  Only necessary if a filename template is given as
                  the input to gui_infiles_or_tmplt. Leading 0's are not
                  necessary for ID numbers less than 10 (e.g., If 7 is a
                  sub_id and visodbl#.set the template, the function will
                  look for visodbl07.set AND visodbl7.set).
  alpha         - [value between 0 and 1] Alpha level with adjustment for 
                  multiple comparisons. If you're using the Bonferroni-Holm
                  procedure to correct for multiple comparisons, alpha is
                  the upper bound on the family-wise error rate. If you're
                  using FDR control, then alpha is the upper bound on the
                  false discovery rate. {default: 0.05}
  use_bins      - [integer vector] The bin number or numbers to use in the
                  analysis (i.e., all trials that fall in those bins will
                  be included in the analysis). {default: all bins}
  bsln_wind     - [pair of integers] The time window (in units of
                  milliseconds) that will be used to baseline each trial
                  of EEG.  The mean voltage in the time window will be
                  removed from the entire EEG epoch. If the data have
                  already been baselined and ICA artifact correction was NOT
                  applied, you won't need to re-baseline the data. {default:
                  no additional baselining will be done}
  plot_topos    - [0 or 1] If 1, topographies of regression
                  coefficients and their corresponding t-scores will be
                  produced (if more than one channel was used in the analysis).
                  {default: 1}
  tail          - [1 | 0 | -1] If tail=1, the alternative hypothesis is
                  that regression coeffcients are greater than 0 (upper
                  tailed test).  If tail=0, the alternative hypothesis is
                  that the regression coefficients are different than
                  0 (two tailed test). If tail=-1, the alternative
                  hypothesis is that the regression coefficients
                  are less than 0 (lower tailed test). {default: 0}
  exclude_chans - A cell array of channel labels to exclude from the
                  analysis (e.g., {'A2','lle','rhe'}).  This option
                  sacrifices spatial resolution to increase test power by
                  reducing the number of comparisons. You cannot
                  use both this option and 'include_chans' (below).{default:
                  not used, all channels included in test}
  include_chans - A cell array of channel labels to use in the analysis
                  (e.g., {'A2','lle','rhe'}).  All other channels will
                  be ignored. This option sacrifices spatial resolution to
                  increase test power by reducing the number of comparisons.
                  You cannot use both this option and 'exclude_chans'
                  (above). {default: not used, all channels included in
                  test}
  correction    - The method used to correct for multiple comparisons
                  (i.e., tests at multiple channels).  There are two
                  options:
                   'Bonf-Holm' -Bonferroni-Holm (1979) procedure to strongly
                                control the family-wise error rate.
                                {default}
                   'FDR BH'    -Benjamini & Hochberg (1995) false
                                discovery rate control procedure.
                                Guaranteed to control FDR for independent
                                or positive regression dependent data.
                   'FDR BY'    -Benjamini & Yekutieli (2001) false
                                discovery rate control procedure.
                                Always guaranteed to control FDR.
                  If the regression analysis is being performed on a
                  single channel, no correction for multiple comparisons
                  is applied.
  output_file   - A string indicating the name of a tab delimited text
                  file to produce containing the results of the analysis.
  freq_filt     - [low_boundary high_boundary] a two element vector indicating the frequency
                  cut-offs for a 3rd order Butterworth filter that will be applied to each
                  trial of data.  If low_boundary=0, the filter is a low pass filter.  If
                  high_boundary=(sampling rate)/2, then the filter is a high pass filter.  If both
                  boundaries are between 0 and (sampling rate)/2, then the filter is a band-pass filter.
                  If both boundaries are between 0 and -(sampling rate)/2, then the filter is a band-
                  stop filter (with boundaries equal to the absolute values of the low_boundary and
                  high_boundary).  Note, using this option requires the signal processing toolbox
                  function butter.m and filtfilt.m.  If 'freq_filt' is not
                  specified, no fitering is done. {default: no filtering}
   n_pad        - [integer] number of time points to pad each epoch of
                  data before filtering.  For example, if n_pad is 512 and
                  you have 256 time points of data per epoch, each epoch of
                  data will be grown to 1280 time points (i.e., 512*2+256) by
                  adding 512 padding points before and after the 256 time points
                  of data.  The values of the padding points form a line from the
                  value of the first and last data point (i.e., if you wrapped
                  the padded waveform around a circle the pre-padding and
                  post-padding points would form a line).  This method of
                  padding is the way the Kutaslab function filter.c works.
                  If n_pad is not specified, butterfilt.m will follow the Kutaslab
                  convention of n_pad=2*(the number of time points per
                  epoch). If you wish to avoid zero padding, set n_pad to
                  0.  Note, 'n_pad' has no effect if the optional argument
                  'freq_filt' is not used.
  use_bins_and_evcode - ['yes' or 'no'] If 'yes', item ERPs are formed
                        according to the event code of the time locking 
                        event AND the bin that event fell into.  
                        Specifically, the item code for that class of 
                        event becomes that item's event code plus the bin 
                        # divided by 10^the order of magnitude of the 
                        biggest bin #.  For example, say there are three 
                        bins and some class of event (say targets in an 
                        oddball task) have an event code of 100.  When 
                        that event code occurs in Bin 2 (say all the targets
                        that preceded by another target fall in Bin 2), its 
                        item identifier will be 100.2. When that event code 
                        occurs in Bin 1 (say all the targets preceded by
                        a standard fall in Bin 1), its item identifier will 
                        be 100.1. If 'no', only event codes are used to 
                        make item ERPs. {default: 'no'}

 Output:
   Here's a brief explanation as to what each field of regress_results
   contains:
   -item_coefs: The regression coefficient at each channel for each 
    predictor (the matrix is channel x predictor)
   -item_coefs_stder: The standard error of the regression coefficient 
    across subjects for each channel and predictor
   -R_sqr: The R^2 of the regression model for each channel and predictor. 
    This is a measure of how much of the variance is explained by the 
    predictor
   -R_sqr_adj: The adjusted R^2 of the regression model for each channel 
    and predictor. This is a measure of how much of the variance is 
    explained by the predictor adjusted by the number of predictors.  This 
    is more informative statistic to report if you use more than one 
    predictor.
   -overall_F: The F score of the full regression model (i.e., including 
    all predictors) at each channel
   -overall_F_pvalue: The F score of the full regression model (i.e., 
    including all predictors) at each channel. Note, this is NOT corrected 
    for multiple comparisons. 
   -t_coefs: t-scores of the coefficients (i.e., coefficients divided by 
    the standard error)
   -t_df: degrees of freedom for the t-scores in t_coefs
   -tail: The tail of the hypothesis tests (1=upper tailed, 0=two tailed, 
    -1=lower tailed)
   -hi_ci: Upper bound of 95% confidence interval for the estimate of the 
    regression coefficient for each channel and predictor.  Confidence 
    intervals assume a t-distribution and are NOT corrected for multiple 
    comparisons.
   -low_ci: Lower bound of 95% confidence interval for the estimate of the
    mean regression coefficient across subjects for each channel and 
    predictor.
   -d_coefs: Cohen's d of the mean coefficients (i.e., the mean of the 
    coefficients divided by the standard deviation).  Cohen's d is a 
    standard measure of effect size.
   -F_coefs: F-scores of the coefficients. This is included to facilitate 
    computing minF in the future and might not be accurate. For the time 
    being you can ignore it.
   -p_coefs_uncor: Uncorrected p-values for each coefficient
   -p_coefs_cor: p-values after correction for multiple comparisons
   -crctn_method: The method used to correct for multiple comparisons (if 
    more than one channel analyzed)
   -residuals: Residuals of full regression model for each channel and 
    item (matrix is channel by item)
   -R_collinear: Estimate of the degree of collinearity for each predictor
   -h_coefs_cor: binary matrix (channel x predictor) indicating whether or 
    not the null hypothesis of a mean coefficient of 0 could be rejected. 
    1=the null hypothesis is rejected. 0=you failed to reject the null 
    hypothesis.
   -predictor_names: The names of the predictor variables
   -family_alpha: Specified family-wise alpha level
   -chanlocs: Channel labels and location information for the EEG channels 
    included in the analysis
   -bins: The bins included in the analysis
   -time_wind: The lower and upper bound of time window used for analysis
   -participant_filenames: Names of the EEGLAB set files used in the
    analysis.
   -participants_per_item: The number of participants who contributed 
    trials for each item
   -bins_used_to_define_items: Whether or not bins were used in addition
    to stimpres event codes to define items


 Reference:
 Holm, S. (1979) A simple sequentially rejective multiple test procedure.
  Scandinavian Journal of Statistics. 6, 65-70.

 Benjamini, Y. & Hochberg, Y. (1995) Controlling the false discovery
  rate: A practical and powerful approach to multiple testing. Journal
  of the Royal Statistical Society, Series B (Methodological). 57(1),
  289-300.
  
 Benjamini, Y. & Yekutieli, D. (2001) The control of the false discovery
  rate in multiple testing under dependency. The Annals of Statistics.
  29(4), 1165-1188.


 Notes:
 -To plot topographies of coefficients from the output of this function at
 a later time, use the function sig_topo_coefs.m.

 -If ICA has been applied to the data and some ICs have been labeled as
 artifacts, this function will remove them and re-baseline the data before
 performing the regression analysis. The function should report on the
 command line how many ICs have been removed.

 -If you get the message "Warning: Matrix is singular to working
 precision." something is wrong with your data or predictors.  One or both
 of them is exhibiting perfect collinearity and the regression
 coefficients are undetermined.

 -If you get a warning that some trials could not be used because one of
 predictors have NaN values for that trial, that means that when you
 loaded information about that predictor into the set file a predictor 
 value for that trial was not found.  "NaN" simply stands for "not a
 number" and indicates that that trial can not be used in the regression
 analysis.

 -A trial that has an empty value for a predictor (e.g.,
 EEG.epoch(1).tone_duration=[]) will be assigned a value of NaN for that 
 predictor 

 -This function will probably NOT work yet with REACTION TIME as a predictor.

 Author:
 David Groppe
 Kutaslab, 2/2011

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

0001 % items_regress() - Performs multiple regression analysis on ERPs to each
0002 %                experimental item. For each unique experimental item, the
0003 %                function collects the trials involving that item from
0004 %                every participant in the study and forms an ERP from that
0005 %                item.  A standard multiple regression analysis is then
0006 %                performed with the mean amplitude of the item ERPs in a
0007 %                particular time window at one or more electrodes as the
0008 %                response variable. When multiple electrodes are used, the
0009 %                function corrects for multiple comparisons by using the
0010 %                Bonferroni-Holm procedure (Holm, 1979).  Each electrode is
0011 %                considered a distinct test (i.e., the total number of
0012 %                electrodes equals the number of tests in the family of
0013 %                tests). Note, this approach to data analysis treats your
0014 %                experimental participants as fixed variables and your
0015 %                items of random variables.  Thus you are testing for
0016 %                effects that are likely to generalize across items on
0017 %                these exact same participants (i.e., the results might not
0018 %                generalize beyond these participants).
0019 %
0020 % Usage:
0021 %  >> regress_results=items_regress(gui_infiles_or_tmplt,time_wind,varargin);
0022 %
0023 %
0024 % Required Inputs:
0025 %  gui_infiles_or_tmplt - ['gui', a cell array of strings, or a string template]
0026 %                         If 'gui', a GUI is created that allows you to select
0027 %                         which set files to import (this is probably the
0028 %                         easiest way to import files).  If a cell array of
0029 %                         of strings, each element of the cell array should
0030 %                         contain the filename of an EEGLAB set file (e.g.,
0031 %                         {'visodbl01.set','visodbl02.set'}.  Otherwise, this
0032 %                         input should be a filename template (i.e., a string with
0033 %                         # where the subject number should be--'visodbl#.set').
0034 %                         If you provide a template, you must use the option
0035 %                         'sub_ids' (see below) to specify the ID numbers of
0036 %                         each subject. Include the files' path unless the files
0037 %                         are in the current working directory.
0038 %  time_wind            - [pair of integers] The time window (in milliseconds)
0039 %                         across which to average ERP amplitude for the response
0040 %                         variable (e.g., [300 500]).
0041 %
0042 % Optional Inputs:
0043 %  predictors    - [cell array of strings] A cell array that specifies the
0044 %                  non-intercept predictors you wish to include in the
0045 %                  regression analysis (an intercept predictor is automatically
0046 %                  added).  The predictors must be fields of the EEG.epoch
0047 %                  field of the EEG struct variable stored in each
0048 %                  set file. If not specified, the list of possible
0049 %                  predictors will be presented on the command line and the
0050 %                  user will be asked to select from that list. {default:
0051 %                  not specified}
0052 %  sub_ids       - [integer vector] A set of integers specifying the
0053 %                  subject ID numbers to include in the grand average.
0054 %                  Only necessary if a filename template is given as
0055 %                  the input to gui_infiles_or_tmplt. Leading 0's are not
0056 %                  necessary for ID numbers less than 10 (e.g., If 7 is a
0057 %                  sub_id and visodbl#.set the template, the function will
0058 %                  look for visodbl07.set AND visodbl7.set).
0059 %  alpha         - [value between 0 and 1] Alpha level with adjustment for
0060 %                  multiple comparisons. If you're using the Bonferroni-Holm
0061 %                  procedure to correct for multiple comparisons, alpha is
0062 %                  the upper bound on the family-wise error rate. If you're
0063 %                  using FDR control, then alpha is the upper bound on the
0064 %                  false discovery rate. {default: 0.05}
0065 %  use_bins      - [integer vector] The bin number or numbers to use in the
0066 %                  analysis (i.e., all trials that fall in those bins will
0067 %                  be included in the analysis). {default: all bins}
0068 %  bsln_wind     - [pair of integers] The time window (in units of
0069 %                  milliseconds) that will be used to baseline each trial
0070 %                  of EEG.  The mean voltage in the time window will be
0071 %                  removed from the entire EEG epoch. If the data have
0072 %                  already been baselined and ICA artifact correction was NOT
0073 %                  applied, you won't need to re-baseline the data. {default:
0074 %                  no additional baselining will be done}
0075 %  plot_topos    - [0 or 1] If 1, topographies of regression
0076 %                  coefficients and their corresponding t-scores will be
0077 %                  produced (if more than one channel was used in the analysis).
0078 %                  {default: 1}
0079 %  tail          - [1 | 0 | -1] If tail=1, the alternative hypothesis is
0080 %                  that regression coeffcients are greater than 0 (upper
0081 %                  tailed test).  If tail=0, the alternative hypothesis is
0082 %                  that the regression coefficients are different than
0083 %                  0 (two tailed test). If tail=-1, the alternative
0084 %                  hypothesis is that the regression coefficients
0085 %                  are less than 0 (lower tailed test). {default: 0}
0086 %  exclude_chans - A cell array of channel labels to exclude from the
0087 %                  analysis (e.g., {'A2','lle','rhe'}).  This option
0088 %                  sacrifices spatial resolution to increase test power by
0089 %                  reducing the number of comparisons. You cannot
0090 %                  use both this option and 'include_chans' (below).{default:
0091 %                  not used, all channels included in test}
0092 %  include_chans - A cell array of channel labels to use in the analysis
0093 %                  (e.g., {'A2','lle','rhe'}).  All other channels will
0094 %                  be ignored. This option sacrifices spatial resolution to
0095 %                  increase test power by reducing the number of comparisons.
0096 %                  You cannot use both this option and 'exclude_chans'
0097 %                  (above). {default: not used, all channels included in
0098 %                  test}
0099 %  correction    - The method used to correct for multiple comparisons
0100 %                  (i.e., tests at multiple channels).  There are two
0101 %                  options:
0102 %                   'Bonf-Holm' -Bonferroni-Holm (1979) procedure to strongly
0103 %                                control the family-wise error rate.
0104 %                                {default}
0105 %                   'FDR BH'    -Benjamini & Hochberg (1995) false
0106 %                                discovery rate control procedure.
0107 %                                Guaranteed to control FDR for independent
0108 %                                or positive regression dependent data.
0109 %                   'FDR BY'    -Benjamini & Yekutieli (2001) false
0110 %                                discovery rate control procedure.
0111 %                                Always guaranteed to control FDR.
0112 %                  If the regression analysis is being performed on a
0113 %                  single channel, no correction for multiple comparisons
0114 %                  is applied.
0115 %  output_file   - A string indicating the name of a tab delimited text
0116 %                  file to produce containing the results of the analysis.
0117 %  freq_filt     - [low_boundary high_boundary] a two element vector indicating the frequency
0118 %                  cut-offs for a 3rd order Butterworth filter that will be applied to each
0119 %                  trial of data.  If low_boundary=0, the filter is a low pass filter.  If
0120 %                  high_boundary=(sampling rate)/2, then the filter is a high pass filter.  If both
0121 %                  boundaries are between 0 and (sampling rate)/2, then the filter is a band-pass filter.
0122 %                  If both boundaries are between 0 and -(sampling rate)/2, then the filter is a band-
0123 %                  stop filter (with boundaries equal to the absolute values of the low_boundary and
0124 %                  high_boundary).  Note, using this option requires the signal processing toolbox
0125 %                  function butter.m and filtfilt.m.  If 'freq_filt' is not
0126 %                  specified, no fitering is done. {default: no filtering}
0127 %   n_pad        - [integer] number of time points to pad each epoch of
0128 %                  data before filtering.  For example, if n_pad is 512 and
0129 %                  you have 256 time points of data per epoch, each epoch of
0130 %                  data will be grown to 1280 time points (i.e., 512*2+256) by
0131 %                  adding 512 padding points before and after the 256 time points
0132 %                  of data.  The values of the padding points form a line from the
0133 %                  value of the first and last data point (i.e., if you wrapped
0134 %                  the padded waveform around a circle the pre-padding and
0135 %                  post-padding points would form a line).  This method of
0136 %                  padding is the way the Kutaslab function filter.c works.
0137 %                  If n_pad is not specified, butterfilt.m will follow the Kutaslab
0138 %                  convention of n_pad=2*(the number of time points per
0139 %                  epoch). If you wish to avoid zero padding, set n_pad to
0140 %                  0.  Note, 'n_pad' has no effect if the optional argument
0141 %                  'freq_filt' is not used.
0142 %  use_bins_and_evcode - ['yes' or 'no'] If 'yes', item ERPs are formed
0143 %                        according to the event code of the time locking
0144 %                        event AND the bin that event fell into.
0145 %                        Specifically, the item code for that class of
0146 %                        event becomes that item's event code plus the bin
0147 %                        # divided by 10^the order of magnitude of the
0148 %                        biggest bin #.  For example, say there are three
0149 %                        bins and some class of event (say targets in an
0150 %                        oddball task) have an event code of 100.  When
0151 %                        that event code occurs in Bin 2 (say all the targets
0152 %                        that preceded by another target fall in Bin 2), its
0153 %                        item identifier will be 100.2. When that event code
0154 %                        occurs in Bin 1 (say all the targets preceded by
0155 %                        a standard fall in Bin 1), its item identifier will
0156 %                        be 100.1. If 'no', only event codes are used to
0157 %                        make item ERPs. {default: 'no'}
0158 %
0159 % Output:
0160 %   Here's a brief explanation as to what each field of regress_results
0161 %   contains:
0162 %   -item_coefs: The regression coefficient at each channel for each
0163 %    predictor (the matrix is channel x predictor)
0164 %   -item_coefs_stder: The standard error of the regression coefficient
0165 %    across subjects for each channel and predictor
0166 %   -R_sqr: The R^2 of the regression model for each channel and predictor.
0167 %    This is a measure of how much of the variance is explained by the
0168 %    predictor
0169 %   -R_sqr_adj: The adjusted R^2 of the regression model for each channel
0170 %    and predictor. This is a measure of how much of the variance is
0171 %    explained by the predictor adjusted by the number of predictors.  This
0172 %    is more informative statistic to report if you use more than one
0173 %    predictor.
0174 %   -overall_F: The F score of the full regression model (i.e., including
0175 %    all predictors) at each channel
0176 %   -overall_F_pvalue: The F score of the full regression model (i.e.,
0177 %    including all predictors) at each channel. Note, this is NOT corrected
0178 %    for multiple comparisons.
0179 %   -t_coefs: t-scores of the coefficients (i.e., coefficients divided by
0180 %    the standard error)
0181 %   -t_df: degrees of freedom for the t-scores in t_coefs
0182 %   -tail: The tail of the hypothesis tests (1=upper tailed, 0=two tailed,
0183 %    -1=lower tailed)
0184 %   -hi_ci: Upper bound of 95% confidence interval for the estimate of the
0185 %    regression coefficient for each channel and predictor.  Confidence
0186 %    intervals assume a t-distribution and are NOT corrected for multiple
0187 %    comparisons.
0188 %   -low_ci: Lower bound of 95% confidence interval for the estimate of the
0189 %    mean regression coefficient across subjects for each channel and
0190 %    predictor.
0191 %   -d_coefs: Cohen's d of the mean coefficients (i.e., the mean of the
0192 %    coefficients divided by the standard deviation).  Cohen's d is a
0193 %    standard measure of effect size.
0194 %   -F_coefs: F-scores of the coefficients. This is included to facilitate
0195 %    computing minF in the future and might not be accurate. For the time
0196 %    being you can ignore it.
0197 %   -p_coefs_uncor: Uncorrected p-values for each coefficient
0198 %   -p_coefs_cor: p-values after correction for multiple comparisons
0199 %   -crctn_method: The method used to correct for multiple comparisons (if
0200 %    more than one channel analyzed)
0201 %   -residuals: Residuals of full regression model for each channel and
0202 %    item (matrix is channel by item)
0203 %   -R_collinear: Estimate of the degree of collinearity for each predictor
0204 %   -h_coefs_cor: binary matrix (channel x predictor) indicating whether or
0205 %    not the null hypothesis of a mean coefficient of 0 could be rejected.
0206 %    1=the null hypothesis is rejected. 0=you failed to reject the null
0207 %    hypothesis.
0208 %   -predictor_names: The names of the predictor variables
0209 %   -family_alpha: Specified family-wise alpha level
0210 %   -chanlocs: Channel labels and location information for the EEG channels
0211 %    included in the analysis
0212 %   -bins: The bins included in the analysis
0213 %   -time_wind: The lower and upper bound of time window used for analysis
0214 %   -participant_filenames: Names of the EEGLAB set files used in the
0215 %    analysis.
0216 %   -participants_per_item: The number of participants who contributed
0217 %    trials for each item
0218 %   -bins_used_to_define_items: Whether or not bins were used in addition
0219 %    to stimpres event codes to define items
0220 %
0221 %
0222 % Reference:
0223 % Holm, S. (1979) A simple sequentially rejective multiple test procedure.
0224 %  Scandinavian Journal of Statistics. 6, 65-70.
0225 %
0226 % Benjamini, Y. & Hochberg, Y. (1995) Controlling the false discovery
0227 %  rate: A practical and powerful approach to multiple testing. Journal
0228 %  of the Royal Statistical Society, Series B (Methodological). 57(1),
0229 %  289-300.
0230 %
0231 % Benjamini, Y. & Yekutieli, D. (2001) The control of the false discovery
0232 %  rate in multiple testing under dependency. The Annals of Statistics.
0233 %  29(4), 1165-1188.
0234 %
0235 %
0236 % Notes:
0237 % -To plot topographies of coefficients from the output of this function at
0238 % a later time, use the function sig_topo_coefs.m.
0239 %
0240 % -If ICA has been applied to the data and some ICs have been labeled as
0241 % artifacts, this function will remove them and re-baseline the data before
0242 % performing the regression analysis. The function should report on the
0243 % command line how many ICs have been removed.
0244 %
0245 % -If you get the message "Warning: Matrix is singular to working
0246 % precision." something is wrong with your data or predictors.  One or both
0247 % of them is exhibiting perfect collinearity and the regression
0248 % coefficients are undetermined.
0249 %
0250 % -If you get a warning that some trials could not be used because one of
0251 % predictors have NaN values for that trial, that means that when you
0252 % loaded information about that predictor into the set file a predictor
0253 % value for that trial was not found.  "NaN" simply stands for "not a
0254 % number" and indicates that that trial can not be used in the regression
0255 % analysis.
0256 %
0257 % -A trial that has an empty value for a predictor (e.g.,
0258 % EEG.epoch(1).tone_duration=[]) will be assigned a value of NaN for that
0259 % predictor
0260 %
0261 % -This function will probably NOT work yet with REACTION TIME as a predictor.
0262 %
0263 % Author:
0264 % David Groppe
0265 % Kutaslab, 2/2011
0266 %
0267 
0268 %%%%%%%%%%%%%%%% Future Work %%%%%%%%%%%%%%%%%
0269 % -diagnostics plots for each sub?
0270 % -add verblevel as input and make more use of it?
0271 % -could speed up function by baselining just used trials
0272 % -option: exclude garv flagged trials?
0273 % -reaction time as predictor should be checked
0274 % -Implement FDR control as an alternative to Bonf-Holm?
0275 % -Use this function as part of another function to compute minF? NOTE, if
0276 % you do this, you need to check the accuracy of regress_results.F_coefs
0277 
0278 %%%%%%%%%%%%%%%% REVISION LOG %%%%%%%%%%%%%%%%%
0279 % ?? update comments!!
0280 
0281 function regress_results=items_regress(gui_infiles_or_tmplt,time_wind,varargin)
0282 % Some optional input arguments have no effect.
0283 % They are just included so that items_regress.m and rm_regress.m accept the same optional input arguments
0284 % This allows minF_regress.m to call them with the same varagin.
0285 p=inputParser;
0286 p.addRequired('gui_infiles_or_tmplt',@(x) ischar(x) || iscell(x));
0287 p.addRequired('time_wind',@(x) isnumeric(x) && (length(x)==2) && isvector(x));
0288 p.addParamValue('predictors',[],@(x) ischar(x) || iscell(x));
0289 p.addParamValue('alpha',.05,@(x) isnumeric(x) && (length(x)==1) && isvector(x) && (x>0) && (x<1));
0290 p.addParamValue('sub_ids',[],@isnumeric);
0291 p.addParamValue('use_bins',[],@(x) isnumeric(x) && isvector(x));
0292 p.addParamValue('bsln_wind',[],@(x) isnumeric(x) && length(x)==2);
0293 p.addParamValue('freq_filt',[],@(x) isnumeric(x) && length(x)==2);
0294 p.addParamValue('n_pad',[],@(x) isnumeric(x) && length(x)==1);
0295 p.addParamValue('plot_topos',1,@isnumeric);
0296 p.addParamValue('exclude_chans',[],@(x) ischar(x) || iscell(x));
0297 p.addParamValue('include_chans',[],@(x) ischar(x) || iscell(x));
0298 p.addParamValue('output_file',[],@ischar);
0299 p.addParamValue('tail',0,@(x) isscalar(x) && ((x==0) || (x==-1) || (x==1)));
0300 p.addParamValue('use_bins_and_evcode','no',@(x) ischar(x) && (strcmpi(x,'yes')) || (strcmpi(x,'no'))); %  trial_item_code=trial_evcodes(item_ct)+trial_bin(item_ct)/bin_order_of_mag;
0301 p.addParamValue('minF_call',0,@isnumeric); %if non-zero, this function has been called by minF_regress.m.  This modifies some error reports.
0302 p.addParamValue('n_perm',5000,@(x) isscalar(x) && (x>0)); %has no effect
0303 p.addParamValue('correction','Bonf-Holm',@(x) ischar(x) && (strcmpi(x,'Bonf-Holm') || strcmpi(x,'FDR BH') || strcmpi(x,'FDR BY'))); % 'perm' option has no effect
0304 p.addParamValue('seed_state',[],@isnumeric); %has no effect
0305 
0306 p.parse(gui_infiles_or_tmplt,time_wind,varargin{:});
0307 
0308 if strcmpi(p.Results.use_bins_and_evcode,'yes'),
0309     use_bins_and_evcode=1;
0310 else
0311     use_bins_and_evcode=0;
0312 end
0313 
0314 if ~p.Results.minF_call,
0315    if ~isempty(p.Results.seed_state)
0316       watchit('Input argument ''seed_state'' has no effect on items_regress.m.  It is just here to make it compatible with minF_regress.m.');
0317    end
0318 end
0319 
0320 if strcmpi(p.Results.correction,'perm'),
0321     correction='Bonf-Holm';
0322     if ~isempty(p.Results.seed_state)
0323         watchit('''perm'' is an invalid multiple comparison correction for items_regress.m.  Using ''Bonf-Holm'' instead.');
0324     end
0325 else
0326     correction=p.Results.correction;
0327 end
0328 
0329 %Make sure exclude & include options were not both used.
0330 if ~isempty(p.Results.include_chans) && ~isempty(p.Results.exclude_chans)
0331     error('You cannot use BOTH ''include_chans'' and ''exclude_chans'' options.');
0332 end
0333 if ischar(p.Results.exclude_chans),
0334     exclude_chans{1}=p.Results.exclude_chans;
0335 elseif isempty(p.Results.exclude_chans)
0336     exclude_chans=[];
0337 else
0338     exclude_chans=p.Results.exclude_chans;
0339 end
0340 if ischar(p.Results.include_chans),
0341     include_chans{1}=p.Results.include_chans;
0342 elseif isempty(p.Results.include_chans)
0343     include_chans=[];
0344 else
0345     include_chans=p.Results.include_chans;
0346 end
0347 flt=p.Results.freq_filt;
0348 
0349 VERBLEVEL=2; % ?? This is currently barely used.
0350 global EEG; %necessary for remove_artifact_ics.m
0351 
0352 %% Get *.set fileames
0353 infiles=get_set_infiles(gui_infiles_or_tmplt,p.Results.sub_ids);
0354 
0355 %% Import Data
0356 n_sub=length(infiles);
0357 
0358 n_regress_trial=zeros(1,n_sub);
0359 for sub=1:n_sub,
0360     fprintf('Loading data for Participant %s\n',infiles{sub});
0361     EEG=pop_loadset(infiles{sub});
0362     %load(infiles{sub},'-MAT');
0363     
0364     % Get time points
0365     if time_wind(1)<EEG.times(1),
0366         error('Analysis time window (%d to %d ms) begins before earliest time point in file %s (%d ms).', ...
0367             time_wind(1),time_wind(2),infiles{sub},EEG.times(1));
0368     else
0369         start_tpt=find_tpt(time_wind(1),EEG.times);
0370     end
0371     if time_wind(2)>EEG.times(end),
0372         error('Analysis time window (%d to %d ms) extends beyond latest time point in file %s (%d ms).', ...
0373             time_wind(1),time_wind(2),infiles{sub},EEG.times(end));
0374     else
0375         end_tpt=find_tpt(time_wind(2),EEG.times);
0376     end
0377     time_ids=start_tpt:end_tpt;
0378     
0379     %Check for artifact ICs and remove them if there
0380     ICs_removed=remove_artifact_ics(p.Results.bsln_wind,VERBLEVEL);
0381     
0382     %Filter the data if requested
0383     if ~isempty(flt)
0384         %first error check
0385         if ~isempty(flt),
0386             if max(flt)>(EEG.srate/2),
0387                 error('''filt'' parameters need to be less than or equal to (sampling rate)/2 (i.e., %f).',EEG.srate/2);
0388             elseif (flt(2)==(EEG.srate/2)) && (flt(1)==0),
0389                 error('If second element of ''filt'' parameter is EEG.srate/2, then the first element must be greater than 0.');
0390             elseif abs(flt(2))<=abs(flt(1)),
0391                 error('Second element of ''filt'' parameters must be greater than first in absolute value.');
0392             elseif (flt(1)<0) || (flt(2)<0),
0393                 if (flt(1)>=0) || (flt(2)>=0),
0394                     error('BOTH parameters of ''filt'' need to be greater than or equal to zero OR need to be negative.');
0395                 end
0396                 if min(flt)<=(-EEG.srate/2),
0397                     error('''filt'' parameters need to be greater than sampling rate/2 (i.e., -%f) when creating a stop band.',EEG.srate/2);
0398                 end
0399             end
0400             
0401         end
0402         EEG=butterfilt(EEG,flt,[],p.Results.n_pad);
0403     end
0404     
0405     %Baseline EEG (unless already baselined by remove_art_ics
0406     if ~isempty(p.Results.bsln_wind) && isempty(ICs_removed),
0407         EEG=baseline_EEG(EEG,p.Results.bsln_wind,VERBLEVEL);   
0408     end
0409     
0410     n_pos_chan=EEG.nbchan; % # of possible channels
0411     chan_ids=zeros(1,n_pos_chan);
0412     
0413     if sub==1,
0414         %% Error check bins
0415         n_bins=length(EEG.bindesc);
0416         if ~isempty(p.Results.use_bins),
0417             mx_bin=max(p.Results.use_bins);
0418             if mx_bin>n_bins,
0419                 error('You requested using bins as high as %d, however file %s only has %d bins.', ...
0420                     mx_bin,infiles{sub},n_bins);
0421             end
0422         else
0423             mx_bin=n_bins;
0424         end        
0425         if use_bins_and_evcode,
0426             bin_order_of_mag=orderofmag(mx_bin)*10;
0427         end
0428         
0429         %% Identify predictors
0430         if isempty(p.Results.predictors),
0431             %Print list of possible predictors on command line
0432             epoch_flds=fieldnames(EEG.epoch);
0433             n_epoch_flds=length(epoch_flds);
0434             poss_pred=cell(1,n_epoch_flds);
0435             poss_pred_fullname=cell(1,n_epoch_flds);
0436             
0437             fprintf('\n**** Choose Predictor(s) for Regression Analysis ****\n');
0438             ct=0;
0439             for fld_loop=1:n_epoch_flds,
0440                 if ~strcmpi('event',epoch_flds{fld_loop}) && ~strcmpi('eventtype',epoch_flds{fld_loop}),
0441                     ct=ct+1;
0442                     poss_pred_fullname{ct}=epoch_flds{fld_loop};
0443                     if strcmpi('event',epoch_flds{fld_loop}(1:5)),
0444                         poss_pred{ct}=epoch_flds{fld_loop}(6:end);
0445                     else
0446                         poss_pred{ct}=epoch_flds{fld_loop};
0447                     end
0448                     fprintf('%d) %s\n',ct,poss_pred{ct});
0449                 end
0450             end
0451             %Is reaction time a field?
0452             EEG_flds=fieldnames(EEG);
0453             if ismember('rtmsec',EEG_flds) && ismember('rtbins',EEG_flds),
0454                 ct=ct+1;
0455                 poss_pred{ct}='Reaction Time';
0456                 poss_pred_fullname{ct}='Reaction Time';
0457                 fprintf('%d) %s\n',ct,poss_pred{ct});
0458             end
0459             
0460             pred_ids=[];
0461             while isempty(pred_ids),
0462                 pred_ids=input('Enter the number(s) of desired predictor(s): ','s');
0463                 fprintf('\n');
0464                 pred_ids=str2num(pred_ids);
0465                 if max(pred_ids)>ct,
0466                     fprintf('ERROR: There are no predictor indices greater than %d.\n',ct);
0467                     pred_ids=[];
0468                 elseif min(pred_ids)<1,
0469                     fprintf('ERROR: All predictor indices must be greater than 0.\n');
0470                     pred_ids=[];
0471                 end
0472             end
0473             predictors=poss_pred_fullname(unique(pred_ids));
0474             predictor_names_neat=poss_pred(unique(pred_ids));
0475             if ismember('Reaction Time',predictors),
0476                 rt_selected=1;
0477             else
0478                 rt_selected=0;
0479             end
0480         else
0481             %get possible predictors
0482             epoch_flds=fieldnames(EEG.epoch);
0483             n_epoch_flds=length(epoch_flds);
0484             poss_pred=cell(1,n_epoch_flds);
0485             poss_pred_fullname=cell(1,n_epoch_flds);
0486             ct=0;
0487             for fld_loop=1:n_epoch_flds,
0488                 if ~strcmpi('event',epoch_flds{fld_loop}) && ~strcmpi('eventtype',epoch_flds{fld_loop}),
0489                     ct=ct+1;
0490                     poss_pred_fullname{ct}=epoch_flds{fld_loop};
0491                     if strcmpi('event',epoch_flds{fld_loop}(1:5)),
0492                         poss_pred{ct}=epoch_flds{fld_loop}(6:end);
0493                     else
0494                         poss_pred{ct}=epoch_flds{fld_loop};
0495                     end
0496                 end
0497             end
0498             %Is reaction time a field?
0499             EEG_flds=fieldnames(EEG);
0500             if ismember('rtmsec',EEG_flds) && ismember('rtbins',EEG_flds),
0501                 ct=ct+1;
0502                 poss_pred{ct}='Reaction Time';
0503                 poss_pred_fullname{ct}='Reaction Time';
0504             end
0505             
0506             %figure out which of the possible predictors have been
0507             %requested
0508             if ischar(p.Results.predictors)
0509                 requested_pred{1}=p.Results.predictors;
0510             else
0511                 requested_pred=p.Results.predictors;
0512             end
0513             pred_ids=zeros(1,length(requested_pred));
0514             ct=0;
0515             n_poss=length(poss_pred);
0516             for dloop=1:length(requested_pred),
0517                 found=0;
0518                 for eloop=1:n_poss,
0519                     if strcmpi(requested_pred{dloop},poss_pred{eloop}) || ...
0520                             strcmpi(requested_pred{dloop},poss_pred_fullname{eloop})
0521                         found=1;
0522                         ct=ct+1;
0523                         pred_ids(ct)=eloop;
0524                         break;
0525                     end
0526                 end
0527                 if ~found,
0528                     fprintf('*** ERROR!! ***\n');
0529                     fprintf('Requested predictor %s was not found in this EEG variable.\n', ...
0530                         requested_pred{dloop});
0531                     fprintf('Possible predictors are:\n');
0532                     for floop=1:n_poss,
0533                         if ~isempty(poss_pred{floop}),
0534                             fprintf('%d) %s\n',floop,poss_pred{floop});
0535                         end
0536                     end
0537                     error('Requested predictor not found.');
0538                 end
0539             end
0540             predictors=poss_pred_fullname(unique(pred_ids));
0541             predictor_names_neat=poss_pred(unique(pred_ids));
0542             if ismember('Reaction Time',predictors),
0543                 rt_selected=1;
0544             else
0545                 rt_selected=0;
0546             end
0547         end
0548         n_pred=length(predictors);
0549         
0550         %% Figure out which channels to ignore if any
0551         % exclude and include chan options
0552         use_chan_labels=cell(1,n_pos_chan); %preallocate mem
0553         ct=0;
0554         if ~isempty(exclude_chans),
0555             for x=1:n_pos_chan,
0556                 if ~ismember(EEG.chanlocs(x).labels,exclude_chans),
0557                     ct=ct+1;
0558                     chan_ids(ct)=x;
0559                     use_chan_labels{ct}=EEG.chanlocs(x).labels;
0560                 end
0561             end
0562         elseif ~isempty(include_chans),
0563             for x=1:n_pos_chan,
0564                 if ismember(EEG.chanlocs(x).labels,include_chans),
0565                     ct=ct+1;
0566                     chan_ids(ct)=x;
0567                     use_chan_labels{ct}=EEG.chanlocs(x).labels;
0568                 end
0569             end
0570         else
0571             for x=1:n_pos_chan,
0572                 use_chan_labels{x}=EEG.chanlocs(x).labels;
0573             end
0574             chan_ids=1:n_pos_chan;
0575             ct=n_pos_chan;
0576         end
0577         chan_ids=chan_ids(1:ct);
0578         use_chan_labels(ct+1:end)=[];
0579         n_chan=length(chan_ids);
0580         chanlocs=EEG.chanlocs(chan_ids);
0581         
0582         %We could try to preallocate memory for these variables to speed
0583         %things up a bit, even though we don't know for sure how many
0584         %unique items there are. ??
0585         item_erps=[];
0586         item_erps_evcodes=[];
0587         item_erps_ct=[];
0588         item_pred_vals=[];
0589     else
0590         %Not the first subject, grab the same channels in the same order and throw error if missing used channel(s)
0591         ct=0;
0592         for a=1:n_chan,
0593             found=0;
0594             for b=1:n_pos_chan,
0595                 %a=channel index for analysis
0596                 %b=channel index for this particular subject
0597                 if strcmpi(EEG.chanlocs(b).labels,use_chan_labels{a}),
0598                     found=1;
0599                     ct=ct+1;
0600                     chan_ids(ct)=b;
0601                     break;
0602                 end
0603             end
0604             if ~found,
0605                 error('Missing channel %s for participant %s.',use_chan_labels{a}, ...
0606                     infiles{sub});
0607             end
0608         end
0609         chan_ids=chan_ids(1:ct);
0610     end
0611     
0612     poss_prdct=fieldnames(EEG.epoch);
0613     if rt_selected,
0614         EEG_flds=fieldnames(EEG);
0615         if ismember('rtmsec',EEG_flds) && ismember('rtbins',EEG_flds),
0616             poss_prdct{1+length(poss_prdct)}='Reaction Time';
0617         end
0618     end
0619     missing_prdct=setdiff(predictors,poss_prdct);
0620     if ~isempty(missing_prdct),
0621         fprintf('ERROR: File %s is missing the following predictor(s): \n', ...
0622             infiles{sub});
0623         for a=1:length(missing_prdct),
0624             fprintf('%s \n',missing_prdct{a});
0625         end
0626         error('Regression aborted.');
0627     end
0628     
0629     %% Get predictor and response values
0630     n_trial=length(EEG.epoch);
0631     use_trial=zeros(1,n_trial);
0632     trial_evcodes=zeros(1,n_trial);
0633     trial_bin=zeros(1,n_trial);
0634     pred_val=zeros(n_pred,n_trial);
0635     EEG_val=zeros(n_chan,n_trial);
0636     for trial=1:n_trial,
0637         n_type=length(EEG.epoch(trial).eventtype);
0638         
0639         %Figure out which bins the trial belongs to
0640         all_trial_bins=[];
0641         for typ=1:n_type,
0642             if length(EEG.epoch(trial).eventtype{typ})>3,
0643                 if strcmpi(EEG.epoch(trial).eventtype{typ}(1:3),'bin')
0644                     all_trial_bins=[all_trial_bins str2num(EEG.epoch(trial).eventtype{typ}(4:end))];
0645                 end
0646             end
0647         end
0648         all_trial_bins=unique(all_trial_bins);
0649         
0650         % Check to see if current trial falls into desired bin(s)
0651         if isempty(p.Results.use_bins)
0652             use_trial(trial)=1;
0653             bin_intersect=all_trial_bins;
0654         else
0655             bin_intersect=intersect(p.Results.use_bins,all_trial_bins);
0656             if ~isempty(bin_intersect)
0657                 use_trial(trial)=1;
0658             end
0659         end
0660 
0661         %Record predictor and response var values if trial is being used
0662         if use_trial(trial),
0663             %make sure trial doesn't fall into multiple bins if bins are being
0664             %used to define items
0665             if use_bins_and_evcode,
0666                 if length(bin_intersect)>1,
0667                     msg=sprintf(['Epoch %d belongs to more than one bin and you are using bins to help define items.\n', ...
0668                         'Thus, I do not know which item this epoch belongs to.  Set ''bins_and_evcodes_define_items'' to ''no'' or use fewer bins via ''use_bins'' option.\n'], ...
0669                         trial);
0670                     error(msg);
0671                 else
0672                     trial_bin(trial)=bin_intersect;
0673                 end
0674             end
0675                         
0676             trial_evcodes(trial)=EEG.epoch(trial).eventevcode;
0677             for pred=1:n_pred,
0678                 if strcmpi(predictors{pred},'Reaction Time'),
0679                     n_rt_bin=length(EEG.rtbins{trial});
0680                     bin_rts=zeros(1,n_rt_bin);
0681                     bin_ct=0;
0682                     for b=1:n_rt_bin,
0683                         if ismember(EEG.rtbins{trial}(b),p.Results.use_bins),
0684                             bin_ct=bin_ct+1;
0685                             bin_rts(bin_ct)=EEG.rtmsec{trial}(b);
0686                         end
0687                     end
0688                     bin_rts=bin_rts(1:bin_ct);
0689                     uni_rt=unique(bin_rts);
0690                     if length(uni_rt)>1
0691                         error('Trial %d has more than one reaction time for the bins you''ve selected.',trial);
0692                     elseif isempty(uni_rt),
0693                         watchit(sprintf('Epoch %d has an empty value for reaction time.  Please use NaN for missing predictor values.\n', ...
0694                             trial));
0695                         pred_val(pred,trial)=NaN;
0696                     else
0697                         pred_val(pred,trial)=uni_rt;
0698                     end
0699                 else
0700                     cmnd=['emptyval=isempty(EEG.epoch(trial).' predictors{pred} ');'];
0701                     eval(cmnd);
0702                     if emptyval,
0703                         watchit(sprintf('Epoch %d has an empty value for predictor %s.  Please use NaN for missing predictor values.\n', ...
0704                             trial,predictor_names_neat{pred}));
0705                         cmnd=['pred_val(pred,trial)=NaN;']; %?? Do RT too
0706                         eval(cmnd);
0707                     else
0708                         cmnd=['pred_val(pred,trial)=EEG.epoch(trial).' predictors{pred} ';'];
0709                         eval(cmnd);
0710                     end
0711                 end
0712             end
0713             for c=1:n_chan,
0714                 EEG_val(c,trial)=squeeze(mean(EEG.data(chan_ids(c),time_ids,trial),2));
0715             end
0716         end
0717     end
0718  
0719     
0720     %get rid of unused trials
0721     trial_ids=find(use_trial==1);
0722     if n_chan>1,
0723         EEG_val=EEG_val(:,trial_ids);
0724     else
0725         EEG_val=EEG_val(trial_ids);
0726     end
0727     if n_pred>1,
0728         pred_val=pred_val(:,trial_ids);
0729     else
0730         pred_val=pred_val(trial_ids);
0731     end
0732     trial_evcodes=trial_evcodes(trial_ids);
0733     trial_bin=trial_bin(trial_ids);
0734     
0735     %get rid of NaN trials (if any)
0736     if n_pred>1,
0737         has_NaN=sum(isnan(pred_val));
0738         non_NaN_ids=find(has_NaN==0);
0739         NaN_ids=find(has_NaN);
0740         pred_val=pred_val(:,non_NaN_ids);
0741     else
0742         non_NaN_ids=find(~isnan(pred_val));
0743         NaN_ids=find(isnan(pred_val));
0744         pred_val=pred_val(non_NaN_ids);
0745     end
0746     if n_chan>1,
0747         EEG_val=EEG_val(:,non_NaN_ids);
0748     else
0749         EEG_val=EEG_val(non_NaN_ids);
0750     end
0751     trial_evcodes=trial_evcodes(non_NaN_ids);
0752     trial_bin=trial_bin(non_NaN_ids);
0753     
0754     n_nonNaN_trials=length(non_NaN_ids);
0755     n_selected_trials=length(trial_ids);
0756     if n_selected_trials~=n_nonNaN_trials,
0757         watchit(sprintf('%d of the selected trials have NaN predictor values for at least one predictor variable.', ...
0758             n_selected_trials-n_nonNaN_trials));
0759         NaN_trial_ids=trial_ids(NaN_ids);
0760         if VERBLEVEL>2,
0761             disp(['The EEG.epoch indices of those trials are:' num2str(NaN_trial_ids)]);
0762         end
0763     end
0764     n_regress_trial(sub)=n_nonNaN_trials;
0765     fprintf('Number of usable trials: %d\n',n_nonNaN_trials);
0766     %Check if sufficient # of usable trials
0767     if ~n_nonNaN_trials,
0768         if isempty(p.Results.use_bins),
0769             error('Participant %s has no usable trials.  Regression aborted.', ...
0770                 infiles{sub});
0771         else
0772             error('Participant %s has no trials for Bin(s) %s.  Regression aborted.', ...
0773                 infiles{sub},num2str(p.Results.use_bins));
0774         end
0775     end
0776     
0777     %Report min and max values for each predictor
0778     for pred_loop=1:n_pred,
0779         fprintf('%s (min/max): %f/%f\n',predictor_names_neat{pred_loop}, ...
0780             min(pred_val(pred_loop,:)),max(pred_val(pred_loop,:)));
0781     end
0782     
0783     %% Add trials to item running sums
0784     for item_ct=1:n_nonNaN_trials,
0785         if use_bins_and_evcode,
0786             trial_item_code=trial_evcodes(item_ct)+trial_bin(item_ct)/bin_order_of_mag;
0787         else
0788             trial_item_code=trial_evcodes(item_ct);
0789         end
0790         
0791         item_erps_id=find(item_erps_evcodes==trial_item_code);
0792         if isempty(item_erps_id),
0793             %item hasn't been encountered in previous set files
0794             item_erps_id=length(item_erps_ct)+1;
0795             item_erps_ct(item_erps_id)=1;
0796             item_erps_evcodes(item_erps_id)=trial_item_code;
0797             if n_chan>1,
0798                 item_erps(:,item_erps_id)=EEG_val(:,item_ct);
0799             else
0800                 item_erps(item_erps_id)=EEG_val(item_ct);
0801             end
0802             if n_pred>1,
0803                 item_pred_vals(:,item_erps_id)=pred_val(:,item_ct);
0804             else
0805                 item_pred_vals(item_erps_id)=pred_val(item_ct);
0806             end
0807         else
0808             item_erps_ct(item_erps_id)=item_erps_ct(item_erps_id)+1;
0809             if n_chan>1,
0810                 item_erps(:,item_erps_id)=item_erps(:,item_erps_id)+EEG_val(:,item_ct);
0811             else
0812                 item_erps(item_erps_id)=item_erps(item_erps_id)+EEG_val(item_ct);
0813             end
0814             %check to make sure current set file's predictor variable
0815             %values are the same as previous set files
0816             if n_pred>1,
0817                 if ~isequal(pred_val(:,item_ct),item_pred_vals(:,item_erps_id))
0818                     error('Predictor variable values for item %d in set file %s do not equal that of previous set files.', ...
0819                         trial_evcodes(item_ct),infiles{sub});
0820                 end
0821             else
0822                 if ~isequal(pred_val(item_ct),item_pred_vals(item_erps_id))
0823                     error('Predictor variable values for item %d in set file %s do not equal that of previous set files.', ...
0824                         trial_evcodes(item_ct),infiles{sub});
0825                 end
0826             end
0827         end
0828     end
0829 end
0830 
0831 
0832 %% Turn running sums into means
0833 n_items=length(item_erps_ct);
0834 item_erps=item_erps./repmat(item_erps_ct,n_chan,1);
0835 fprintf('\n\n*** Finished Constructing Item Averages ***\n\n');
0836 
0837 %check to make sure there are a sufficient number of item averages to do the regression
0838 if n_items<=n_pred,
0839     error('There are an insufficient number of unique items (i.e., %d) for the number of predictors in this analysis (i.e., %d plus an intercept term).', ...
0840         n_items,n_pred);
0841 end
0842 
0843 
0844 %% Perform multiple regression at each channel
0845 regress_results.item_coefs=zeros(n_chan,n_pred+1);
0846 regress_results.item_coefs_stder=zeros(n_chan,n_pred+1);
0847 regress_results.R_sqr=zeros(n_chan,n_pred+1);
0848 regress_results.R_sqr_adj=zeros(n_chan,n_pred+1);
0849 regress_results.overall_F=zeros(n_chan,1);
0850 regress_results.overall_F_pvalue=zeros(n_chan,1);
0851 regress_results.t_coefs=zeros(n_chan,n_pred+1);
0852 regress_results.t_df=n_items-(n_pred+1); 
0853 regress_results.tail=p.Results.tail; 
0854 regress_results.hi_ci=zeros(n_chan,n_pred+1);
0855 regress_results.low_ci=zeros(n_chan,n_pred+1);
0856 regress_results.d_coefs=zeros(n_chan,n_pred+1);
0857 regress_results.F_coefs=zeros(n_chan,n_pred+1); %useful for combining by-item and by-subjects analyses
0858 regress_results.p_coefs_uncor=zeros(n_chan,n_pred+1);
0859 regress_results.p_coefs_cor=zeros(n_chan,n_pred+1);
0860 regress_results.crctn_method=[]; %filled in later
0861 regress_results.residuals=zeros(n_chan,n_items);
0862 for c=1:n_chan,
0863     [coeff, S_err, XTXI, R_sq, R_sq_adj, F_val, Coef_stats, y_hat, residuals]= ...
0864         mregress(item_erps(c,:)',item_pred_vals',1); %final argument "1" means that an intercept term is included in the model
0865     %Note that the first element of coeff and associated variables
0866     %correpsponds to the intercept term
0867     regress_results.item_coefs(c,1:n_pred+1)=coeff;
0868     regress_results.item_coefs_stder(c,1:n_pred+1)=Coef_stats(:,2);
0869     regress_results.R_sqr(c,1:n_pred+1)=R_sq;
0870     regress_results.R_sqr_adj(c,1:n_pred+1)=R_sq_adj;
0871     regress_results.t_coefs(c,1:n_pred+1)=Coef_stats(:,3);
0872     regress_results.overall_F(c)=F_val(1);
0873     regress_results.overall_F_pvalue(c)=F_val(2);
0874     
0875     % Compute coefficient p-values (no correction for multiple comparisons, yet)
0876     if ~p.Results.tail
0877         %two-tailed test
0878         regress_results.p_coefs_uncor(c,1:n_pred+1)=Coef_stats(:,4);
0879     elseif p.Results.tail==1,
0880         %upper-tailed test
0881         for pred=1:n_pred+1,
0882             regress_results.p_coefs_uncor(c,pred)=(1-tcdf(regress_results.t_coefs(c,pred),regress_results.t_df));
0883         end
0884     else
0885         %lower-tailed test
0886         for pred=1:n_pred+1,
0887             regress_results.p_coefs_uncor(c,pred)=tcdf(regress_results.t_coefs(c,pred),regress_results.t_df);
0888         end
0889     end
0890     
0891     regress_results.F_coefs(c,1:n_pred+1)=finv(1-Coef_stats(:,4),1,regress_results.t_df); %check this against stat book ??, note you will have to fix this if tail is made an option
0892     regress_results.residuals(c,:)=residuals;
0893 end
0894 if sum(sum(isnan(regress_results.item_coefs))),
0895     error('The data exhibit perfect collinearity and the regression coefficients cannot be determined.');
0896 end
0897 regress_results.d_coefs=regress_results.item_coefs./regress_results.item_coefs_stder; %compute Cohen's d
0898 crit_t=tinv(.975,regress_results.t_df); %for 95% confidence intervals
0899 regress_results.hi_ci=regress_results.item_coefs+crit_t*regress_results.item_coefs_stder;
0900 regress_results.low_ci=regress_results.item_coefs-crit_t*regress_results.item_coefs_stder;
0901 
0902 %% Quantify degree of collinearity with other predictors (if more than one
0903 %  predictor)
0904 if n_pred>1,
0905     regress_results.R_collinear=zeros(1,n_pred+1);
0906     regress_results.R_collinear(1)=NaN; %intercept term
0907     for a=1:n_pred,
0908         [coeffCOL, S_errCOL, XTXIcol, R_sqCOL]=mregress(item_pred_vals(a,:)', ...
0909             item_pred_vals(setdiff(1:n_pred,a),:)',1);
0910         regress_results.R_collinear(a+1)=R_sqCOL;
0911     end
0912 else
0913     regress_results.R_collinear(1:2)=NaN; %intercept term and one predictor
0914 end
0915 
0916 
0917 %% Correct for multiple comparisons (if more than one channel)
0918 if n_chan==1,
0919     regress_results.crctn_method='Not applicable';
0920 else
0921     regress_results.crctn_method=correction;
0922 end
0923 for a=1:(n_pred+1),
0924     if strcmpi(correction,'Bonf-Holm'),
0925         regress_results.p_coefs_cor(:,a)=bonf_holm(regress_results.p_coefs_uncor(:,a));
0926     elseif strcmpi(correction,'FDR BH'),
0927         [dummy1 dummy2 regress_results.p_coefs_cor(:,a)]=fdr_bh(regress_results.p_coefs_uncor(:,a),p.Results.alpha,'pdep','yes');
0928     elseif strcmpi(correction,'FDR BY'),
0929         [dummy1 dummy2 regress_results.p_coefs_cor(:,a)]=fdr_bh(regress_results.p_coefs_uncor(:,a),p.Results.alpha,'dep','yes');
0930     else
0931         %no correction
0932         regress_results.p_coefs_cor(:,a)=regress_results.p_coefs_uncor(:,a);
0933     end
0934 end
0935 regress_results.h_coefs_cor=regress_results.p_coefs_cor<=p.Results.alpha;
0936 
0937 regress_results.predictor_names=cell(1,n_pred+1);
0938 regress_results.predictor_names{1}='Intercept';
0939 %regress_results.predictor_names(2:n_pred+1)=predictors; <=These are
0940 %predictor names with event in front of it
0941 regress_results.predictor_names(2:n_pred+1)=predictor_names_neat;
0942 regress_results.family_alpha=p.Results.alpha;
0943 regress_results.chanlocs=chanlocs;
0944 if isempty(p.Results.use_bins),
0945     regress_results.bins='All';
0946 else
0947     regress_results.bins=p.Results.use_bins;
0948 end
0949 regress_results.time_wind=time_wind;
0950 regress_results.participant_filenames=infiles;
0951 regress_results.participants_per_item=item_erps_ct;
0952 regress_results.bins_used_to_define_items=p.Results.use_bins_and_evcode;
0953 
0954 
0955 %% Report Results to Command Line
0956 fprintf('**ANALYSIS PARAMETERS**\n');
0957 fprintf('Total # of unique items: %d\n',n_items);
0958 fprintf('Min/Max # of participants contributing to each item: %d/%d\n', ...
0959     min(item_erps_ct),max(item_erps_ct));
0960 fprintf('Mean (SD) # of participants contributing to each item: %.1f (%.1f)\n', ...
0961     mean(item_erps_ct),std(item_erps_ct));
0962 fprintf('Total number of participants: %d\n',n_sub);
0963 fprintf('Regression analysis t-score degrees of freedom: %d\n', ...
0964     regress_results.t_df);
0965 if p.Results.tail>0,
0966     fprintf('Tail of tests: Upper\n');
0967 elseif p.Results.tail<0,
0968     fprintf('Tail of tests: Lower\n');
0969 else
0970     fprintf('Tail of tests: Two-Tailed\n');
0971 end
0972 if n_chan==1,
0973     fprintf('Multiple comparison correction method: Not Applicable\n');
0974 else
0975     fprintf('Multiple comparison correction method: ');
0976     if strcmpi(correction,'Bonf-Holm'),
0977         fprintf('Bonferroni-Holm procedure\n');
0978     elseif strcmpi(correction,'FDR BH'),
0979         fprintf('Benjamini & Hochberg FDR control procedure\n');
0980     elseif strcmpi(correction,'FDR BY'),
0981         fprintf('Benjamini & Yekutieli FDR control procedure\n');
0982     else
0983         error('Multiple comparison correction method: UNKOWN!!!\n');
0984     end
0985 end
0986 fprintf('Specified Family-Wise Alpha Level: %.3f\n',p.Results.alpha); 
0987 fprintf('Number of channels analyzed: %d\n',n_chan);
0988 fprintf('Mean EEG amplitude measured from %d to %d ms.\n',time_wind(1), ...
0989     time_wind(2));
0990 
0991 
0992 %% predictor results
0993 fprintf('**RESULTS (Summarized across all channels)**\n');
0994 if n_pred>1,
0995     fprintf('Predictor\t#_of_Sig_Channels\tCollinearity\tp-values\n');
0996     for a=1:n_pred+1,
0997         mn_p=min(regress_results.p_coefs_cor(:,a));
0998         if mn_p>=p.Results.alpha,
0999             %no significant effects for this predictor
1000             if a==1,
1001                 %intercept term
1002                 fprintf('%s\t\t%d\t\t\tNA\t\t\tp>=%f\n',regress_results.predictor_names{a}, ...
1003                     sum(regress_results.h_coefs_cor(:,a)),mn_p); %intercept term has no collinearity measure
1004             else
1005                 fprintf('%s\t\t%d\t\t\t%.2f\t\tp>=%f\n',regress_results.predictor_names{a}, ...
1006                     sum(regress_results.h_coefs_cor(:,a)),regress_results.R_collinear(a), ...
1007                     mn_p);
1008             end
1009         else
1010             %at least one significant effect for this predictor
1011             sig_p_id=find(regress_results.p_coefs_cor(:,a)<p.Results.alpha);
1012             mx_p=max(regress_results.p_coefs_cor(sig_p_id,a));
1013             if a==1,
1014                 %intercept term
1015                 fprintf('%s\t\t%d\t\t\tNA\t\t%f>=sig p>=%f\n',regress_results.predictor_names{a}, ...
1016                     sum(regress_results.h_coefs_cor(:,a)),mx_p,mn_p); %intercept term has no collinearity measure
1017             else
1018                 fprintf('%s\t\t%d\t\t\t%.2f\t%f>=sig p>=%f\n',regress_results.predictor_names{a}, ...
1019                     sum(regress_results.h_coefs_cor(:,a)),regress_results.R_collinear(a), ...
1020                     mx_p,mn_p);
1021             end
1022         end
1023     end
1024 else
1025     fprintf('Predictor\t#_of_Sig_Channels\t\tp-values\n');
1026     for a=1:2,
1027         mn_p=min(regress_results.p_coefs_cor(:,a));
1028         if mn_p>=p.Results.alpha,
1029             %no significant effects for this predictor
1030             fprintf('%s\t\t%d\t\t\tp>=%f\n',regress_results.predictor_names{a}, ...
1031                 sum(regress_results.h_coefs_cor(:,a)),mn_p);
1032         else
1033             %at least one significant effect for this predictor
1034             sig_p_id=find(regress_results.p_coefs_cor(:,a)<p.Results.alpha);
1035             mx_p=max(regress_results.p_coefs_cor(sig_p_id,a));
1036             fprintf('%s\t\t%d\t\t%f>=sig p>=%f\n',regress_results.predictor_names{a}, ...
1037                 sum(regress_results.h_coefs_cor(:,a)),mx_p,mn_p);
1038         end
1039     end
1040 end
1041 
1042 %% Print Results to Output File
1043 if ~isempty(p.Results.output_file),
1044     [fid msg]=fopen(p.Results.output_file,'w');
1045     if fid==-1,
1046         error('Cannot create file %s for writing.  According to fopen.m: %s.', ...
1047             p.Results.file,msg);
1048     else
1049         % Header Information
1050         fprintf(fid,'**ANALYSIS_PARAMETERS**\n');
1051         fprintf(fid,'Total # of unique items: %d\n',n_items);
1052         fprintf(fid,'Min/Max # of participants contributing to each item: %d/%d\n', ...
1053             min(item_erps_ct),max(item_erps_ct));
1054         fprintf(fid,'Total number of participants: %d\n',n_sub);
1055         fprintf(fid,'Regression analysis coefficient t-score degrees of freedom: %d\n',regress_results.t_df);
1056         if p.Results.tail>0,
1057             fprintf(fid,'Tail of tests: Upper\n');
1058         elseif p.Results.tail<0,
1059             fprintf(fid,'Tail of tests: Lower\n');
1060         else
1061             fprintf(fid,'Tail of tests: Two-Tailed\n');
1062         end
1063         if n_chan==1,
1064             fprintf(fid,'Multiple comparison correction method: Not Applicable\n');
1065         else
1066             fprintf(fid,'Multiple comparison correction method: ');
1067             if strcmpi(correction,'Bonf-Holm'),
1068                 fprintf(fid,'Bonferroni-Holm procedure\n');
1069             elseif strcmpi(correction,'FDR BH'),
1070                 fprintf(fid,'Benjamini & Hochberg FDR control procedure\n');
1071             elseif strcmpi(correction,'FDR BY'),
1072                 fprintf(fid,'Benjamini & Yekutieli FDR control procedure\n');
1073             else
1074                 error(fid,'Multiple comparison correction method: UNKOWN!!!\n');
1075             end
1076         end
1077         fprintf(fid,'Family-Wise Alpha Level: %.3f\n',p.Results.alpha);
1078         fprintf(fid,'Number of channels analyzed: %d\n',n_chan);
1079         fprintf(fid,'Mean EEG amplitude measured from %d to %d ms.\n',time_wind(1), ...
1080             time_wind(2));
1081         fprintf(fid,'*.set_files_used_in_analysis #_of_trials_from_setfile_used\n');
1082         for a=1:n_sub,
1083             fprintf(fid,'%s %d\n',infiles{a},n_regress_trial(a));
1084         end
1085         fprintf(fid,'**REGRESSION_RESULTS_SUMMARY**\n');
1086         if n_pred>1,
1087             fprintf(fid,'Predictor\t#_of_Sig_Channels\tCollinearity\n');
1088             fprintf(fid,'%s\t\t%d\t\t\tNA\n',regress_results.predictor_names{1}, ...
1089                 sum(regress_results.h_coefs_cor(:,1))); %intercept term has no collinearity measure
1090             for a=2:n_pred+1,
1091                 fprintf(fid,'%s\t\t%d\t\t\t%.2f\n',regress_results.predictor_names{a}, ...
1092                     sum(regress_results.h_coefs_cor(:,a)),regress_results.R_collinear(a));
1093             end
1094         else
1095             fprintf(fid,'Predictor\t#_of_Sig_Channels\n');
1096             fprintf(fid,'%s\t\t%d\n',regress_results.predictor_names{1}, ...
1097                 sum(regress_results.h_coefs_cor(:,1))); %intercept term has no collinearity measure
1098             fprintf(fid,'%s\t\t%d\n',regress_results.predictor_names{2}, ...
1099                 sum(regress_results.h_coefs_cor(:,2)));
1100         end
1101         
1102         fprintf(fid,'**REGRESSION_RESULTS_PER_CHANNEL**\n');
1103         for c=1:n_chan,
1104             fprintf(fid,'Results_for_Channel: %s \n',regress_results.chanlocs(c).labels);
1105             fprintf(fid,'Predictor\tMean_Coefficient\t95%c_CI_lower_bound\t95%c_CI_upper_bound\tt-score\tp-value(Uncorrected)\tSignificant(Corrected)\tCohen''s_d\n', ...
1106                 37,37);
1107             for pred=1:n_pred+1,
1108                 %mean coefficient
1109                 %95% CI
1110                 %t-score
1111                 %p-value
1112                 %Cohen's d
1113                 if pred==1,
1114                     pred_name='Intercept';
1115                 else
1116                     pred_name=predictors{pred-1};
1117                 end
1118                 fprintf(fid,'%s \t%f \t%f \t%f \t%.3f \t%.3f \t%d \t%.2f\n', ...
1119                     pred_name,regress_results.item_coefs(c,pred), ...
1120                     regress_results.low_ci(c,pred),regress_results.hi_ci(c,pred), ...
1121                     regress_results.t_coefs(c,pred),regress_results.p_coefs_uncor(c,pred), ...
1122                     regress_results.h_coefs_cor(c,pred),regress_results.d_coefs(c,pred));
1123             end
1124         end
1125         fclose(fid);
1126     end
1127 end
1128 
1129 %% Plot Results
1130 if p.Results.plot_topos && n_chan>1,
1131     sig_topo_coefs(regress_results);
1132     sig_topo_coefs(regress_results,'statistic','t');
1133 end
1134 
1135 
1136

Generated on Tue 10-May-2016 16:37:45 by m2html © 2005