
Name

avgfile - EPL System Avg (was: Data) File Format (ERP files)

Description

EPL System Avg (was: Data) File Format

J. C. Hansen

The Origins of Average Files

ERP average files in the EPL analysis system have a specific format. This allows "general purpose" programs to deal with all types of experimental data. These files are most frequently generated by averaging programs, that is, programs that produce averaged ERPs from raw digitized data, log files, and sorting information, and are thus often called averaged ERP data files. Other programs, however, produce data files in the same format, including data manipulation programs, smoothing programs, and grand averaging programs. This document describes the general format of these data files. For further information on specific aspects of the headers, refer to the document describing the various header entries in detail.

Structure of an Average File

General

A data file, or averaged data file, is divided into conceptual units termed bins or super-records. Each one corresponds to a unit of data in the file, usually a single averaging "bin" or ERP. Each super-record or bin consists of a header record followed by one or more sets of data associated with the experimental condition which generated the bin. Note that this differs from raw data files, in which only a single header precedes an entire file of data.

Structure of a Bin

Each set of data in a bin is the result of applying a particular type of "processing function" to the data. Currently only a simple average can be calculated, and hence there is always exactly one set of data following each header. In the future, however, it may be possible to include noise estimates, standard errors around the ERPs, or Fourier Transforms of the data as processing functions. The number of sets of data following each header is specified in the header element tpfuncs, and their types are described in the header element pftypes.

Structure of a Data Set

Each data set, such as the averaged ERPs associated with a bin, consists of a full complement of channels stored contiguously, starting with channel 0 (zero). Each channel occupies a whole number of 256-word (512-byte) blocks; that number, the channel precision, is given by the cprecis element in the header. Currently only a channel precision of 1 (one) is supported by analysis and plotting programs. The data for each channel are not multiplexed, but separated and stored in order. Thus, for example, if the data header specifies that cprecis is 1, nchans is 12, and tpfuncs is 1, the length of the super-record is 512 (header) + 12*512 bytes. In general, the length of a super-record is 512*(nchans*cprecis*tpfuncs+1) bytes.

There are no constraints on the homogeneity of a data file. That is, each bin in a data file could, conceivably, have a different number of channels, a different channel precision, and a different number of processing functions. In the general case, then, one would have to scan a data file sequentially, calculating the offset of the next header from the data in the current header.

Practically speaking, there is a tacit convention that all bins in a data file have the same channel precision, number of channels, and number of processing functions. No current programs generate anything else, and many programs rely on this uniformity. If a file has this restricted structure, it is a simple matter to calculate the offset of any data bin given only the data in the first header of the file. For example, data bin N (counting the first bin as bin 0) will be found at offset 512*(N*(cprecis*nchans*tpfuncs+1)) in the file.
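The length and offset arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the EPL system: the function names are made up for this example, bins are numbered from 0 (as the offset formula implies), data sets within a bin are assumed to follow the header one after another, and parse_header is a hypothetical caller-supplied function, since the layout of the header itself is described in a separate document.

```python
BLOCK_BYTES = 512  # 256 two-byte words; the header also occupies one block


def superrecord_length(nchans, cprecis=1, tpfuncs=1):
    """Length in bytes of one bin: one header block plus all channel blocks."""
    return BLOCK_BYTES * (nchans * cprecis * tpfuncs + 1)


def bin_offset(n, nchans, cprecis=1, tpfuncs=1):
    """Byte offset of the header of bin n (0-based) in a uniform file."""
    return n * superrecord_length(nchans, cprecis, tpfuncs)


def channel_offset(n, chan, nchans, cprecis=1, tpfuncs=1, pfunc=0):
    """Byte offset of channel `chan` in data set `pfunc` of bin n.

    Assumption: data sets follow the header in order, and each one holds
    channels 0..nchans-1 stored contiguously.
    """
    data_start = bin_offset(n, nchans, cprecis, tpfuncs) + BLOCK_BYTES
    return data_start + (pfunc * nchans + chan) * cprecis * BLOCK_BYTES


def scan_bin_offsets(stream, parse_header):
    """Yield the header offset of every bin without assuming uniform bins.

    `parse_header` is a hypothetical function that takes the 512 header
    bytes and returns the tuple (nchans, cprecis, tpfuncs).
    """
    offset = 0
    while True:
        stream.seek(offset)
        header = stream.read(BLOCK_BYTES)
        if len(header) < BLOCK_BYTES:
            return  # end of file, or a truncated trailing header
        yield offset
        nchans, cprecis, tpfuncs = parse_header(header)
        offset += superrecord_length(nchans, cprecis, tpfuncs)
```

With cprecis of 1, nchans of 12, and tpfuncs of 1, superrecord_length(12) gives 6656 bytes (512 + 12*512), and bin_offset(3, 12) gives 19968. The sequential scan is only needed for the mixed-length case; for files that follow the uniformity convention, bin_offset with the values from the first header is sufficient.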

Expected Changes

It is possible that someone may want to form data files with mixed-length bins, as discussed above. If this desire arises, an analysis weighing the cost of all the program alterations against the advantages of allowing mixed-length bins will have to be performed. Currently the approach is to create many different data files. This works too.

