avg - Averaging of Continuous Raw EEG Data
The program avg performs signal averaging on data in the continuous data system format. It supersedes and combines the old fsravg and vsravg programs. Because an averaging epoch must be a multiple of 256 points in length (usually exactly 256 points), avg allows the sampling rate to be decreased so that long epochs can be averaged. Three files are required: a binlist file (most often obtained using the cdbl program), a log file, and the corresponding raw data file (usually on magtape). Additional files are required for artifact rejection, digital filtering, or decimation, if these operations are desired.
avg is invoked thus:
avg presampling avgfile [options]
where avgfile is the name of the output averaged data file and presampling is the total desired prestimulus time in msec. [options] is any combination of the following:
-a arfile  invoke Artifact Rejection (A.R.) using the A.R. functions and parameters in the file arfile
-f rawdev  use rawdev rather than mt0 (i.e. magtape) for raw data
-p  print A.R. reject status
-m  force only one swap area
-n  don’t rewind magtape on completion or exit
-c chan_prec  instead of employing 256 points in each averaged epoch, use chan_prec*256, thus increasing the epoch length (as well as the amount of disk space needed to store the average file)
-r dec_fact  decrease the sampling rate of the data (decimate) by an integral factor of dec_fact as they are extracted from the raw file
-d id_filter  digitally filter the data using the integer digital filter id_filter as they are extracted from the raw file (the -d option is NOT yet available on DOS systems)
-e  show an estimate of the percentage of total items that have been processed, updated each time a new log item is processed. This option is only available on DOS systems.
-o outfile  print stats such as mismatches to outfile as well as to the screen. This option is only available on DOS systems. On UNIX systems the same result can be obtained by using the tee program.
The avgfile is where the averaged data will go when avg terminates. In order to attempt to protect the user from inadvertently destroying a pre-existing averaged data file, avg complains if the requested avgfile already exists. If one really wants to write over an existing file, it is necessary to remove it. This will also be necessary if one has attempted to run avg using the same avgfile name but the program died (due to errors further down the line) or was killed, etc.
Note that the presampling request is the total desired pre-event period. The onset delay in the raw file is subtracted from this value to calculate the additional sampling needed prior to the occurrence of the trigger. The onset delay can be negative, indicating that the event occurred before the trigger (see the mdh User’s Manual). This allows for calibrating and adjusting for time delays induced by equipment (especially filters), as well as the actual event displacements.
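The subtraction described above can be written out as a small sketch. The function name and the example values are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the avg sources.

```python
def additional_presampling_ms(presampling_ms, onset_delay_ms):
    """Sampling needed before the trigger: requested total minus the onset delay."""
    return presampling_ms - onset_delay_ms


# Event occurs 20 msec after the trigger: 80 msec are needed before the trigger.
print(additional_presampling_ms(100, 20))
# Negative onset delay: the event occurred 10 msec before the trigger,
# so more sampling is needed before the trigger than was requested in total.
print(additional_presampling_ms(100, -10))
```

With a negative onset delay the result exceeds the requested presampling, because the epoch has to start even earlier relative to the trigger.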
When artifact rejection is to be employed, both the -a and the name of the arf file must be included. The calibration and creation of arf files is discussed in the garv User’s Manual.
If a -p is also appended to the -a arfile, avg then prints the trial number and test which caused rejection for each rejected single trial.
The -f flag, used in conjunction with the following rawdev argument, specifies that the raw data are to be taken from rawdev, rather than mt0, which is the default (mt0 is the name of the tape drive device driver).
The -m flag is used for conserving memory space on machines that have some multitasking environment such as Desqview running. Note that the meaning of the -m flag is reversed on UNIX versions of avg. On the DOS version, the default is to use as much memory as possible for swap space, while on UNIX, the default is to use only one memory resident swap area. So, -m on DOS means to use only one resident swap area, while -m on UNIX means to use multiple resident swap areas.
The -n flag is used to prevent rewinding the magtape when avg terminates. It is used to simplify averaging subsequent raw files on the magtape, since the tape will be positioned properly for the next raw file. Note - when this flag is used, abnormal termination of avg will cause a search for the end of the current data file. This can take a long time, and a signal (interrupt, hangup, etc.) during this period will leave the magtape in an unspecified state.
The -c flag is used to increase the epoch length by an integral multiple of 256 points. By default, there are 256 points in an epoch. Using -c 2 would cause avg to use an epoch length of 512 points, thus doubling the length of the epoch.
The -r flag, used in conjunction with the dec_fact argument, is used to decimate the data by dec_fact, thus effectively decreasing the sampling rate.
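The combined effect of -c and -r on the epoch can be illustrated with a little arithmetic. This is a sketch only: the function names and the 256 Hz raw sampling rate are assumptions made for the example, and how avg itself performs the decimation is not described here.

```python
POINTS_PER_UNIT = 256  # epochs are a multiple of 256 points


def epoch_points(chan_prec=1):
    """Number of points in one averaged epoch for a given -c value."""
    return POINTS_PER_UNIT * chan_prec


def epoch_ms(raw_rate_hz, chan_prec=1, dec_fact=1):
    """Epoch duration in msec after decimating by dec_fact (-r)."""
    effective_rate_hz = raw_rate_hz / dec_fact
    return epoch_points(chan_prec) * 1000.0 / effective_rate_hz


print(epoch_ms(256))               # default: 256 points
print(epoch_ms(256, chan_prec=2))  # -c 2: 512 points at the raw rate
print(epoch_ms(256, dec_fact=2))   # -r 2: 256 points at half the rate
```

Under these assumptions both -c 2 and -r 2 double the epoch duration; -c does so by storing more points (and using more disk space), while -r keeps 256 points and lowers the effective sampling rate.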
The -d flag is not available on DOS systems (yet) but is described here for completeness and in anticipation of its eventual implementation in the DOS version of the averaging program. The argument to the -d flag is the name of an integer digital filter. The specified filter is applied to the data as they are extracted from the raw file.
The -e flag is only available in the DOS version of the averaging program. When invoked, the averaging program prints an estimate of what percentage of the average is completed so far. It does this by first reading the last line in the bin list descriptor file to determine how many total events will be in the average. Then as it processes each event, it calculates the percentage of total events that have been processed and displays that percentage on the screen.
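The percentage calculation described above amounts to the following sketch. The function name is hypothetical, and reading the event total from the last line of the bin list descriptor file is not shown, since the layout of that line is not given here.

```python
def percent_done(events_processed, total_events):
    """Percentage of the total events that have been processed so far."""
    if total_events <= 0:
        return 0.0  # nothing to average; avoid dividing by zero
    return 100.0 * events_processed / total_events


# 50 of 200 events processed.
print(percent_done(50, 200))
```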
The -o flag is only available in the DOS version of the averaging program. One often wishes to save the information that is displayed on the monitor by avg, especially the final output of the total sums and rejects in each bin. This is easily accomplished when averaging on a UNIX system by using the UNIX program "tee". Since "tee" is not available on DOS, the -o option was introduced to provide similar functionality. The -o flag is followed by "outfile" on the command line, which specifies the name of a file where output displayed on the monitor is saved.
After successful invocation, avg prints: "Mount next rawfile and enter corresponding log and binlist files. ’dos’ to exit, ’none’ or cntrl z ends input:". At this point, the user should mount the magtape on the magtape drive. Make sure the density is set correctly and the drive is placed ’online’. Then enter the name of the logfile corresponding to the magtape on the drive, followed by the binlist file. When everything is ready, hit return. If the user types ’dos’, a DOS shell is invoked, giving the user an opportunity to peruse a directory listing if needed to remember the names of the log and binlist files. Typing exit returns the user to avg (on UNIX systems running the C shell, ’dos’ is not needed, as the averager can be suspended by typing control-z).
If the raw tape is actually digitized data and is readable, and the logfile and binlist files exist and pass cursory checks of validity, avg prints a line containing the number of channels, the number of points per epoch, the corresponding length of the entire epoch in msec, and the total presampling time. Next, the experiment description and subject description in the data header are printed on separate lines. If artifact rejection was requested, the total number of A.R. tests is printed. Finally, the device being used to store intermediate sums is printed, as well as the number of these "swap areas" that will be resident in the main machine memory. The more swap areas that fit in main memory, the faster avg will be able to average the data. If all is still well, a delay ensues while avg writes blank records to the averaged data file (to avoid running out of space after a possibly lengthy averaging session; better to run out before). Finally, the tape moves and averaging begins.
When the end of any of the raw, log, or binlist files is reached, avg prints a message to that effect and a summary of the total events, total deleted events, and data errors (deleted events arise during digitization; data errors can occur due to a variety of causes).
Sometimes one will want to average together more than one raw tape. In this case, when the first tape has finished and avg asks "Mount next rawfile and enter corresponding log and binlist files. ’dos’ to exit, ’none’ or cntrl d ends input:", mount the next tape and enter the correct log and binlist file names. It is important to be careful here, more so than when the first tape is mounted. Although avg will allow the user multiple attempts to enter the appropriate log and binlist file names if either cannot be opened or there are incompatibilities among the files, avg has no way of knowing which set of files should be averaged together. Hence, if compatible files are entered, avg will average them into the current intermediate sums, even if they belong to another subject, etc.
Eventually, all the files will be averaged and one should respond "none" (or type control z; hold the control key down and hit z) to the request for the next raw tape, log, and binlist files. At this point avg writes the averaged data to the averaged data file and prints a summary of the proceedings. This includes the condition code and bin number, the total number of epochs in the sum, the total number of raw events (the total number of events which should have gone into the bin according to the binlist file), and the total number of rejects (as well as the percent rejected) for each bin. A summary of the bases for rejection is also given. This includes "dterrs", or data errors which caused loss of data, and the user named artifact rejection count bin totals (if artifact rejection was requested). This information is printed for each bin, but is also in the header preceding each data bin, so is not lost if the output of avg is not retained. avg then exits with a final "Averaging Complete".
It is possible to place more than one raw file on a single reel by judicious use of the -n flag while digitizing with cddg. The files on the tape are identified by their ordinal position on the tape not the name supplied as the 3rd argument to cddg. When avg is being used to average these files, it is necessary to ensure that the proper file is positioned as the next raw data file on the tape when avg asks one to "Mount next raw tape...".
The easiest way to accomplish this is to average the data on the tape in the same order they were digitized, using the -n flag on all invocations of avg except the last. avg will correctly position the tape in front of the next file when the -n flag is employed, even if all the raw records in the current data file are not read before termination. If this is attempted, it will not be possible to lump together different raw files, unless they happen to be contiguous on the tape.
The errors produced by avg are meant to be interpretable in and of themselves. The best of intentions, however, do not always produce the desired understanding; furthermore, the solution to the problem may not be clear even if the cause is. Hence, here is a list of the possible error messages generated by avg. Since avg employs a number of library routines, the list is not exhaustive. These types of errors should be referred to the system administrator.
avg can generate errors at a number of different stages of processing. To facilitate matters, the messages are broken down by the stage of processing or routine which actually prints the message. In many cases the message is prefixed with the name of the routine which encountered the error, and a subsequent message is generated by the calling routine as an aid in tracing the sequence of events. For example, during initialization of the artifact rejection structures using the arf file, certain errors in the syntax or parameter values engender an error prefixed with "A.R. -", and avg post-fixes the line "A.R. initialization" to the specific message.
First, the errors related to the initial invocational command line (from the shell):
"Can’t create avgfile xxxx"
The file "xxxx" (averaged data file) could not be created in the specified place.
"xxxx already exists"
The output averaged data file with name "xxxx" already exists. Probably avg was recently and unsuccessfully invoked using the same name "xxxx"; alternatively a file with this name already exists. If the former, it can simply be removed (rm).
"arfile required after -a"
No filename (containing the artifact rejection functions and parameters) followed the -a flag.
"rawdevice required after -f"
No device (special file) followed the -f flag.
"dec_factor required after -r"
The integer decimation factor was not present following the -r flag.
"chan_prec required after -c"
The integer channel precision was not present following the -c flag.
"-d requires filtername"
The integer digital filter name was not present following the -d flag.
"file name required after -o"
The output file name was not present following the -o flag.
"x flag not recognized"
The flag "x" was present, but it is not a currently implemented option.
"arg not known"
An unidentified argument was present in the command line.
If the invocation is successful, avg requests that the user mount the next tape and enter the name of the corresponding log and binlist files. At this point, the following messages can appear:
"Not enough arguments. Try again."
Two arguments, the logfile and binlist file are needed here. Maybe a space between the two was forgotten, or an extra one inserted?
"Can’t open xxxx (logfile)" "Can’t get xxxx" (binlist file)
Either the logfile (first case) or binlist file (second case) could not be opened. Maybe they don’t exist, or they are in another directory, etc.
"Binlist file - Bad format"
This message occurs when the binlist file is opened, but the first line does not contain only one positive decimal integer representing the number of bins. Is the filename entered for the binlist file really a binlist file?
If the raw, log, and binlist files are the first being added in (i.e. no other data have yet been added during this invocation of avg), a number of data areas are initialized and parameters checked. If these are not O.K., a fatal error of the type Raw Data Errors (see below) is generated, followed by:
On the other hand, if this is not the first volume being averaged, avg gives the user a chance to try another tape or binlist file, so as not to lose the data already averaged. If the raw file is incompatible, the following appears:
"Bad Rawfile (Out of order?) - Try again"
Whereas if the binlist file does not specify the same number of bins as the previous binlist files, this appears:
"Binlist file - Incompatible # of bins"
One must have the wrong binlist file, or be trying to average together apples and oranges, so to speak.
Once the raw, log, and binlist files have been opened, certain parameters are checked in regards to the raw data file. The following error messages can appear:
"openraw - xxxx"
This occurs when the -f flag was used (i.e. take raw data from the specified device), but the specified device cannot be opened. Perhaps it is already in use or offline, etc.
"openraw - Bad raw file"
The size of the first record of the raw file was not the proper size for a raw data header. This probably isn’t a bona fide raw file.
"openraw - Bad channel prec."
The requested channel precision was out of the valid range for the current implementation of the program.
"openraw - Presampling too Big"
The requested amount of presampling is longer than the epoch length. Either reduce it, change the sampling rate (use -r), or increase the channel precision.
"openraw - presampling too small"
The current implementation of the program does not allow the total pre-stimulus presampling to be less than 0 (i.e. start averaging sometime after the stimulus occurred), although this is a reasonable desire. Hence, the requested presampling must be at least as large as the onset delay specified at digitization time so that the total presampling interval is at least 0. If the error is due to a typo, great; if averaging after the start of the stimulus is desired, another approach will be necessary.
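The two presampling checks described above can be sketched as follows. This is an illustration under stated assumptions, not the openraw routine itself: the function name, the argument names, and the order in which the checks are made are hypothetical, and all values are taken to be in msec.

```python
def check_presampling(presampling_ms, onset_delay_ms, epoch_length_ms):
    """Return the matching error message, or None if the request is acceptable."""
    # Requested presampling may not be longer than the whole epoch.
    if presampling_ms > epoch_length_ms:
        return "openraw - Presampling too Big"
    # Presampling minus onset delay may not be negative.
    if presampling_ms - onset_delay_ms < 0:
        return "openraw - presampling too small"
    return None


print(check_presampling(100, 20, 1000))   # acceptable request
print(check_presampling(1200, 0, 1000))   # longer than the epoch
print(check_presampling(10, 50, 1000))    # smaller than the onset delay
```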
"openraw - No mem."
The system did not allocate memory for the raw data buffers. This should never happen (ha ha). See the system administrator.
"openraw - incompatible volume"
When more than one set of files are being averaged together, this can occur if the raw file was not digitized at the same rate as the previous data, or if the current volume does not have the proper "raw" format.
Once the raw file has been checked for validity, a swap device, or raw disk device is acquired to allow fast swapping of the intermediate sum for each bin. A list of available devices is kept in c:\tcapdevs, along with the number of blocks available on that device. The following two messages can occur:
"Can’t open c:\tcapdevs"
The file probably does not exist. Creating this file is part of a complete and proper installation of the continuous data system. This file is used by a number of programs that require intermediate swap space. The format of entries in c:\tcapdevs can be found in the "swpsubr.c" sources for the libu.lib library.
The above swap error will be followed by :
"Swap open failed"
and an exit to the shell. If a swap device is successfully acquired, as much memory as can be gotten is used to keep as many of the intermediate swap bins in main memory (to minimize swapping to and from the disk). If there is not enough main memory for even one swap area, or the automatic allocation routine fails, the following will be printed:
"gresmem: No swap space" or
"No intermediate swap memory"
In some cases this can be fixed by re-running avg and employing the -m (force single swap area) flag.
Once the swap device has been successfully opened and locked, the binlist file is read to transfer the descriptors for the conditions and bins into the intermediate swap space. The following errors can occur during this process:
"sbdinit - Bad format"
The binlist file did not have a single, positive number as the only entry in the first line of the file. Probably the wrong binlist file was entered.
"sbdinit - cc out of order"
Binlist files have a number of idiotic constraints, one of which is that the condition codes must be placed in ascending, contiguous (no integers skipped) order. This is a complaint by avg that the binlist file did not fulfill this format constraint.
"sbdinit - bin out of order"
Likewise, the bins must also be in ascending order without skipping any numbers. If this is printed, the binlist file didn’t satisfy this format requirement.
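The format constraints described above (a single positive integer on the first line, and condition codes and bins in ascending order with no numbers skipped) can be checked with a sketch like the one below. The function names are hypothetical, and nothing is assumed about whether numbering starts at 0 or 1 or about the rest of the binlist file layout.

```python
def parse_bin_count(first_line):
    """Return the bin count if the line holds exactly one positive decimal integer."""
    fields = first_line.split()
    if len(fields) != 1 or not fields[0].isdigit() or int(fields[0]) <= 0:
        raise ValueError("Bad format")
    return int(fields[0])


def is_ascending_contiguous(numbers):
    """True if each number is exactly one more than the number before it."""
    return all(b == a + 1 for a, b in zip(numbers, numbers[1:]))


print(parse_bin_count("12\n"))
print(is_ascending_contiguous([0, 1, 2, 3]))  # in order, nothing skipped
print(is_ascending_contiguous([0, 2, 3]))     # 1 was skipped
```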
"sbdinit - swap error"
Something went wrong trying to swap in the intermediate bin so as to transfer the descriptors. This could be due to a disk error, or it could be a system error, or it could be ???
"sbdinit - avg write"
As the binlist file is checked, the output averaged data file is written with blank data so as to ensure that there will be sufficient space after all the work of averaging is complete. Better to mess up now than then. Could be a disk error - most likely no space on the device.
If the current binlist is not the first, the descriptors are not transferred to the intermediate data headers. Instead, only compatibility is checked. The following errors are possible in this multiple volume case:
"sbdinit - line not SD or CD"
Probably due to a binlist file with the wrong format, as is:
"sbdinit - unexpected EOF"
meaning End Of File.
Any errors encountered in this binlist check and initialization phase are fatal, and are followed by:
"Can’t initialize sbd"
A number of checks are performed on the arf file that is specified when artifact rejection is invoked. The messages generated and their causes are:
"A.R. - Can’t open xxxx"
The arf file "xxxx" could not be found. Perhaps it is not in this directory.
"A.R. - No mem."
There was not enough memory for the arf file parameters. This shouldn’t happen, and if it does, let the system administrator know.
"A.R. - Bad argument count line n"
The artifact rejection function specified on line n of the arf file did not have the proper number of arguments. Perhaps this is not really an arf file, or a space was left out, etc. Different functions require different numbers of arguments. Refer to the garv User’s Manual for a detailed list of the routines available and their requisite arguments.
"A.R. - Too many tests"
The arf file contained more tests than the current implementation of the program allows. Cannot some of them be eliminated?
"A.R. - Bad count bin line n"
The reject count bin specified on line n of the arf file was out of the range of 1 to 7, inclusive. Unfortunately there are only 7 artifact reject count bins for accounting, and the prospects of there being more in the future are not too good. Hopefully, this is just a typo......
"A.R. - xxxx not known line n"
The artifact rejection function "xxxx" on line n is not a valid function name. Probably a typo, so edit the arf file, why doncha? For a list of the currently implemented functions, see the "garv User’s Manual".
During actual execution of the artifact rejection routines, the following runtime errors can occur:
"A.R. - Bad chan test n"
The nth test in the arf file in use specifies a test on a channel that is not present in the current data being averaged. Remember that channels start at 0, and that artifact rejection test n is on line n+1 (tests are numbered 0 to N-1, lines from 1 to N). Most likely the current arf file was designed for use with data of another sort.
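The numbering conventions above (tests and channels counted from 0, arf file lines counted from 1) can be captured in a small sketch. The function names are hypothetical and only illustrate the off-by-one relationship.

```python
def arf_line_for_test(test_number):
    """Tests are numbered 0 to N-1; the corresponding arf file lines run 1 to N."""
    return test_number + 1


def channel_is_valid(channel, n_channels):
    """Channels start at 0, so the valid range is 0 to n_channels-1."""
    return 0 <= channel < n_channels


# "Bad chan test 3" refers to the test on line 4 of the arf file.
print(arf_line_for_test(3))
# With 16 channels of data, channel 16 does not exist.
print(channel_is_valid(16, 16))
```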
"A.R. - Bad latencies test n"
This is similar to the previous "Bad chan" error, but can also occur when one is trying to test close to the beginning or end of the epoch. Since the data are quantized in both time and amplitude, there may not be a point corresponding to -200 msec, for example, even if 200 msec of presampling were requested. For example, if one has digitized data that were sampled at a rate of 200 Hz (5 msec per point) and a presampling request of 100 msec was made during the invocation of avg, the "grain size" is 5 msec, and one may have to specify something like -95 to 0 or -94 to 0 in order to satisfy the artifact rejection routines (which are dumb and do a lot of truncating and other bad things).
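The "grain size" arithmetic in the example above can be sketched as follows. The function names are hypothetical, and truncating toward zero is an assumption suggested by the remark about truncation; exactly which offsets near the edges of the epoch the artifact rejection routines accept is not documented here.

```python
def ms_per_point(rate_hz):
    """Time between successive samples in msec (the "grain size")."""
    return 1000.0 / rate_hz


def latency_to_point_offset(latency_ms, rate_hz):
    """Convert a latency to a whole number of points, truncating toward zero."""
    return int(latency_ms / ms_per_point(rate_hz))


print(ms_per_point(200))                  # 5.0 msec per point at 200 Hz
print(latency_to_point_offset(-95, 200))  # -19 points
print(latency_to_point_offset(-94, 200))  # -18 points after truncation
```

Under this assumption a latency that does not fall on a sample, such as -94 msec, is silently moved to the nearest sample toward zero, which is one way a requested window can end up shorter than expected.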
Mismatches are a complex subject. They occur when the log item number, raw item number, and binlist item number don’t match, or when the raw event time and log event time don’t match, or when the raw event number and log event number don’t match. This has, in the past, been due almost exclusively to errors on the magtape. There is a sophisticated rematching algorithm which attempts (indefinitely) to rematch the three files. Mismatches, however, can also occur if the wrong combination of log, raw, and binlist files are used in averaging. This can result in mismatches all the way down the raw file.
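The three mismatch conditions listed above can be expressed as a single test. This is a sketch only: the record classes and field names are hypothetical, exact equality is assumed as the meaning of "match", and the rematching algorithm itself is not shown.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    item: int
    event: int
    time: int


@dataclass
class RawRecord:
    item: int
    event: int
    time: int


@dataclass
class BinlistRecord:
    item: int


def is_mismatch(log, raw, binlist):
    """True if any of the three mismatch conditions holds."""
    items_differ = not (log.item == raw.item == binlist.item)
    times_differ = raw.time != log.time
    events_differ = raw.event != log.event
    return items_differ or times_differ or events_differ


log = LogRecord(item=7, event=3, time=1200)
print(is_mismatch(log, RawRecord(7, 3, 1200), BinlistRecord(7)))  # all agree
print(is_mismatch(log, RawRecord(8, 3, 1200), BinlistRecord(7)))  # raw item differs
```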
Certain information about the item numbers, event numbers, and times for the log, raw, and binlist files are printed out while rematching is attempted. This information can be used to try and diagnose the cause of the mismatch. A single mismatch which is rematched every couple of subjects is not unheard of, but repeated mismatches are a sign that something is amiss. The following messages are printed:
This is an indication that a mismatch has occurred. It is followed by certain information in the files at the record where the mismatch occurred. A complete treatment of "how to diagnose" mismatches is not possible here. If diagnosis is imperative, see the system administrator.
This cryptic message is emitted after a mismatch where a rematch was attained by resetting the raw item number to the log and binlist item number after everything else had been matched.
"Log -Binlist fatal mismatch blf item n"
The log and binlist file do not match at item n in the binlist file. This is fatal because both of these files are stored on the disk, and the disk is assumed to be error free. If this is not the result of improperly doctoring the binlist file, it is due to radical problems with the disk system. In either case, continuing averaging is not recommended.
A number of errors can occur during the running of avg. These are outlined here:
"Bad bin # in binlist file"
A bin was encountered in the binlist file which was out of the range of valid bins for these data and this binlist file. This could be due to an error in the file, but is most likely due to a binlist file that has been improperly created or doctored.
"Bad error count bin"
Somehow, a data error or artifact reject is being tallied for a bin which is out of the range for these data and the associated binlist file. See above.
"Can’t swap bin n"
This is most likely a hardware error; it was not possible to swap bin n, although it must have been possible to do so during initialization.
"Not enough mem." and "M.B.A. - No mem."
Again, these should not occur. See the system administrator or local Guru.
"avg - abnormal termination due to signal"
If avg is killed by an interrupt, hangup, or other signal, it dies after freeing the swap device and printing this message. All data are lost.
"Final swap error bin n"
It was not possible to swap in bin n to average the data and then write it out to the averaged data file. Probably a hardware error.
"writavg error avg bin n"
An error occurred while attempting to write out the averaged data to the averaged data file. Bin n is the bin with the problem - probably a hardware error since space has already been reserved and in fact has been successfully written into once.
This follows the above messages, and is self evident.
Not really an error, just a status message indicating successful completion.