headerfile - a description of the EPL system data header structure
(ERP files)
A Description of the EPL System Data Header Structure
J. C. Hansen
In the EPL Continuous Data Digitization and Analysis
System, all digitized data and averaged data are preceded by headers having
a given structure. The header includes binary information such as the
sampling rate and the number of channels, as well as ASCII descriptions
of the subject, the experiment, the experimental condition, etc.
Here is an abridged version of the header structure:
#define MAXCPRE 2
#define MAXCHNS 16
#define RECSIZP 256
#define DGMAGIC 013645
#define MAXDESC 40
#define MAXCDES 8
#define MAXRFCB 8
struct header {
int evtno; /* event # or magic number */
int epleng; /* epoch length in msec. */
int nchans; /* # of channels */
int sums; /* sums 0,1 => single trial */
int tpfuncs; /* total # of proc. funcs */
int pp10uv; /* points per 10 uvolts */
int verpos; /* 1 => pos ; -1 => opp. */
int odelay; /* msec from trig to stim */
int totevnt; /* total log events */
int ctickt; /* 10s of usecs per clock tick */
int evtimhi; /* hi clock time in ticks */
int evtimlo; /* lo clock time in ticks */
int ccoder; /* condition code */
int presam; /* pre-event time in msec. */
int trfuncs; /* total # of rej. functions */
int totrr; /* total raw rec. inc. rejects */
int totrej; /* total raw rejects */
int sbcode; /* subcond. # ( bin number ) */
int cprecis; /* channel prec * 256 pts. */
unsigned seqitem; /* sequential item # for raw */
int dummy1[4]; /* open */
int rfcnts[MAXRFCB]; /* individual rej. counts */
char rftypes[64]; /* 8 char. descs for 8 poss. rfs */
char chndes[128]; /* 16 channel descriptions */
char subdes[MAXDESC]; /* 40 char subject description */
char sbcdes[MAXDESC]; /* 40 char bin desc. */
char condes[MAXDESC]; /* 40 char condition desc.*/
char expdes[MAXDESC]; /* 40 char experiment desc. */
char pftypes[64]; /* 8 char descs. for 8 poss. pfs */
int dummy2[8]; /* open */
char rawname[16]; /* for raw file magtape ver. */
};
The header is exactly 512 bytes long (one standard DEC block). Many of the
entries in the header structure are holdovers from previous analysis systems
and are no longer in use, while others are redundant and are not always
used. In addition, the header preceding a raw data file has different slots
filled than a header preceding an averaged data record. The
formats of averaged data files and digitized data files are described in
separate documents.
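The 512-byte figure only holds when each int occupies two bytes, as the "2 byte integers" mentioned for data points suggest. The following is a minimal sketch, not the system's own header file, of how the abridged layout might be declared on a modern compiler using fixed-width types. The name epl_header is an assumption made for illustration, and byte order is not addressed here.

```c
#include <assert.h>
#include <stdint.h>

#define MAXRFCB 8
#define MAXDESC 40

/* Same field order as the abridged header, with explicit 16-bit integers.
 * The struct name epl_header is hypothetical. */
struct epl_header {
    int16_t  evtno, epleng, nchans, sums, tpfuncs, pp10uv, verpos;
    int16_t  odelay, totevnt, ctickt, evtimhi, evtimlo, ccoder;
    int16_t  presam, trfuncs, totrr, totrej, sbcode, cprecis;
    uint16_t seqitem;
    int16_t  dummy1[4];
    int16_t  rfcnts[MAXRFCB];
    char     rftypes[64];
    char     chndes[128];
    char     subdes[MAXDESC];
    char     sbcdes[MAXDESC];
    char     condes[MAXDESC];
    char     expdes[MAXDESC];
    char     pftypes[64];
    int16_t  dummy2[8];
    char     rawname[16];
};

/* 40 two-byte words (80 bytes) plus 432 bytes of text = 512 bytes. */
static_assert(sizeof(struct epl_header) == 512,
              "header must occupy exactly one 512-byte block");
```

With a plain 4-byte int the same declaration would come to 592 bytes, which is why the sketch uses int16_t rather than int.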
The #defines preceding the header
proper place maximum values on certain header variables, define the length
of certain arrays, or else further define the data format. The RECSIZP
definition defines the basic number of points (2 byte integers) per channel
for both raw and averaged data. Note that the actual number of points per
channel in averaged data files can be an integral multiple of RECSIZP;
this is determined by the cprecis header element. Constraining the number
of points per channel to a multiple of 256 simplifies data input/output
and storage; keeping it a power of two facilitates performing Fourier analyses
on the data. The DGMAGIC constant is an arbitrary number which is placed
in the first word of a header which is valid for use in digitization. This
number is checked by programs which wish to verify that a file is indeed
a raw data file, or that a header has been verified as having parameters
that are within a reasonable range for use in digitization. The header element
for the magic number is evtno; this slot at one time held the event number
for discrete-epoch, digitized, raw data files, and may do so again in future
systems which store discrete epochs.
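As an illustration of the check described above, here is a minimal sketch of a magic-number test. The function name epl_has_magic is hypothetical; the sketch assumes the caller has already read the first header word (evtno) as a 16-bit integer in the correct byte order.

```c
#include <assert.h>
#include <stdint.h>

#define DGMAGIC 013645   /* octal constant; 6053 in decimal */

/* Returns 1 when the first header word (evtno) holds the magic number,
 * 0 otherwise. Hypothetical helper, not part of the EPL programs. */
static int epl_has_magic(int16_t evtno)
{
    return evtno == DGMAGIC;
}
```

Note that the leading zero makes DGMAGIC an octal literal in C, so comparing evtno against the decimal number 13645 would be wrong.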
The MAXCPRE and MAXCHNS definitions
set the maximum channel precision (see the cprecis element) and the maximum
number of channels (nchans element) respectively. These constants are used
by a number of programs, and can be changed if the specific hardware and
software can handle the memory requirements and record sizes.
Four elements in the header structure
contain ASCII descriptions of the data. These are expdes, condes, sbcdes,
and subdes, containing up to MAXDESC (40) characters describing the experiment,
condition, sub-condition (bin), and subject, respectively. These ASCII descriptions
can be up to 40 characters in length, but it is preferred that
they be 39 characters or fewer, terminated by a zero byte. The experiment
and subject descriptions are generally filled in when the raw data are
digitized, while the condition and sub-condition (bin) descriptions
are filled in when data are averaged, and appear only in averaged data
file headers.
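Because a description may use all 40 characters and so lack a terminating zero byte, it is not safe to hand these fields directly to string functions. The sketch below shows one way to copy such a field into a properly terminated buffer; epl_copy_desc is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAXDESC 40

/* Copy a fixed-width description that may lack a terminating zero byte.
 * dst must have room for width + 1 bytes. Returns the copied length.
 * Hypothetical helper, not part of the EPL programs. */
static size_t epl_copy_desc(char *dst, const char *field, size_t width)
{
    size_t n = 0;
    while (n < width && field[n] != '\0')
        n++;
    memcpy(dst, field, n);
    dst[n] = '\0';
    return n;
}
```

A caller would declare a buffer of MAXDESC + 1 bytes and pass, for example, the expdes field with a width of MAXDESC.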
The chndes element is a character array
with space for up to 128 characters.
This space is divided into 16 eight-character descriptions when nchans is
16 or less, or 32 four-character descriptions when nchans is greater
than 16. As with all other ASCII descriptors, these need not be terminated
by a zero byte, although it would be nice. The channels are arranged in ascending
order, and the first character of the descriptor for channel n is at index
8*n or 4*n in the chndes array. Currently only 16 channels are supported,
and the MAXCDES definition gives the maximum length (8)
of a channel description.
The channel descriptions are filled in during digitization.
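The indexing rule for chndes can be sketched as follows. Both function names are hypothetical, and the sketch assumes channels are numbered from zero, which is what the 8*n and 4*n offsets imply.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Slot width per the text: 8 characters for up to 16 channels,
 * 4 characters when nchans is greater than 16. */
static size_t epl_chndes_width(int nchans)
{
    return nchans > 16 ? 4 : 8;
}

/* Copy the description of channel n (0-based) into dst, which must hold
 * at least 9 bytes. Returns 0 on success, -1 if n is out of range.
 * Hypothetical helper, not part of the EPL programs. */
static int epl_channel_desc(const char chndes[128], int nchans, int n,
                            char *dst)
{
    size_t width = epl_chndes_width(nchans);
    size_t len = 0;

    if (n < 0 || n >= nchans || (size_t)(n + 1) * width > 128)
        return -1;

    const char *slot = chndes + (size_t)n * width;
    while (len < width && slot[len] != '\0')   /* slot may be unterminated */
        len++;
    memcpy(dst, slot, len);
    dst[len] = '\0';
    return 0;
}
```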
This little jewel (the rawname element) is used
to hold the optional name of the raw file. It is filled by the digitization
program and is used to verify that the data are the correct ones by some
programs which manipulate raw data; the ordinal position on the tape, however,
is definitive.
The nchans, odelay, and ctickt integer elements
are essential for digitization and are thus filled prior to use by the
digitization program. The nchans element should be between one and MAXCHNS
inclusive, while odelay and ctickt are constrained by the hardware and
digitization software. ctickt is in units of tens of microseconds; thus a
200 Hz sampling rate corresponds to a ctickt entry of 500. odelay is in
msec and constrains the presampling request to be at least as large
as it is. odelay is set to zero during averaging and its value absorbed
into presam; see the manual on the averaging program for more details.
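The relation between ctickt and the sampling rate is simple arithmetic: one second is 100000 tens of microseconds, so the rate in Hz is 100000 divided by ctickt. A minimal sketch, with hypothetical function names:

```c
#include <assert.h>

/* ctickt counts tens of microseconds per clock tick, so dividing
 * 100000 (tens of microseconds per second) by ctickt gives Hz.
 * Returns 0.0 for a non-positive ctickt. Hypothetical helper. */
static double epl_sample_rate_hz(int ctickt)
{
    return ctickt > 0 ? 100000.0 / ctickt : 0.0;
}

/* Length of one clock tick in microseconds. */
static long epl_tick_usec(int ctickt)
{
    return 10L * ctickt;
}
```

For the example in the text, a ctickt of 500 gives a 5000 microsecond tick and a 200 Hz rate.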
The sums, tpfuncs, pftypes, presam, and cprecis header elements
are all set during the averaging stage of data analysis. They constitute
the minimal group of elements which must be filled to allow further processing.
All except pftypes are integer elements; pftypes is similar to rftypes
(below) and is a character array containing ASCII descriptors.
1) sums is set to the total number of epochs actually averaged together to form
these data.
2) tpfuncs (total # of processing functions) is always set
to one by current averaging programs; older data may have a zero in this
slot. This element determines the number of sets of nchans channels of processed
data that are contained in the data record of an averaged data file.
3) pftypes contains the ASCII description corresponding to the processing
functions used to form these data. Since only simple averages can currently
be calculated, only the description "average" appears in the first 8 character
slot.
4) presam, or presampling, contains the total time (in msec) from
the beginning of the epoch to the event occurrence. Thus, if there was a
40 msec odelay (trigger to event time) and one requests a 100 msec presampling
interval from the averager, 60 additional msec prior to the trigger are
prefixed to the separated data epochs, odelay is set to 0, and presam is
set to 100.
5) cprecis, or channel precision, is usually 1 (one), although
it can be between 0 (in old data, this implies a 1) and MAXCPRE. This element
specifies the number of RECSIZP units of points which comprise a single
data channel. Hence, if cprecis is 2, there will be 512 points of data for
each channel (1024 bytes).
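The size arithmetic implied by cprecis, tpfuncs, and nchans, and the presampling example, can be sketched as below. The function names are hypothetical. Treating a tpfuncs of 0 in older data as 1 is an assumption made here by analogy with cprecis; the text does not state it.

```c
#include <assert.h>

#define RECSIZP 256

/* Points per channel: cprecis units of RECSIZP; 0 in old data means 1. */
static long epl_points_per_channel(int cprecis)
{
    return (long)(cprecis == 0 ? 1 : cprecis) * RECSIZP;
}

/* Points in one averaged data record: tpfuncs sets of nchans channels.
 * ASSUMPTION: tpfuncs == 0 (older data) is treated as one set. */
static long epl_record_points(int tpfuncs, int nchans, int cprecis)
{
    return (long)(tpfuncs == 0 ? 1 : tpfuncs) * nchans
           * epl_points_per_channel(cprecis);
}

/* Extra msec prefixed before the trigger when the averager is asked for
 * requested_presam msec and the trigger-to-event delay is odelay msec. */
static int epl_extra_presample_msec(int requested_presam, int odelay)
{
    return requested_presam - odelay;
}
```

Multiplying a point count by two gives bytes, since each point is a 2-byte integer; a cprecis of 2 thus yields 512 points or 1024 bytes per channel.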
The rftypes, rfcnts, and trfuncs header
elements are associated with artifact rejection and data errors during
averaging. The trfuncs element is an integer between 1 and MAXRFCB, indicating
the number of classes of artifact rejection functions that were employed.
trfuncs always starts at 1 because rfcnts[0] is reserved for tallying data
errors which cause loss of single trials. The rfcnts array is an array
of integers which is closely allied with the rftypes; in fact each rfcnts
member contains the counts of the number of trials rejected on the basis
described by the corresponding rftypes description. rftypes descriptor 0
always contains the string "dterrs", and refers to the use of rfcnts[0]
as a count of the trials lost due to raw data errors. If trfuncs is 4, for
example, then rfcnts[1] through rfcnts[3] will contain artifact rejection
counts, and there will also be 3 rftypes descriptors filled with the associated
description. There can be up to eight (MAXRFCB) descriptions of up to eight
characters apiece in the rftypes character array. Since the first rfcnts
member and rftypes description is reserved for data errors, there are 7
members left for artifact rejection data. They follow the same rules as
the channel descriptions; the rftypes are filled in (along with the rfcnts
array) during averaging. Refer to the manual on the averaging program for
further details on what these counts really mean and how artifact rejection
works!
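To make the pairing of rfcnts and rftypes concrete, here is a minimal sketch that formats one rejection class as "description=count". The function name epl_reject_entry is hypothetical. It relies on the standard precision form of the printf %s conversion (%.8s), which reads at most eight characters and therefore copes with descriptors that lack a zero byte.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAXRFCB 8

/* Format rejection class i as "description=count" into out.
 * Class 0 is the reserved data-error slot; classes 1..trfuncs-1 are
 * artifact rejection functions. Returns 0 on success, -1 if i is not a
 * filled class. Hypothetical helper, not part of the EPL programs. */
static int epl_reject_entry(const char rftypes[64],
                            const int16_t rfcnts[MAXRFCB],
                            int trfuncs, int i, char *out, size_t outsz)
{
    if (trfuncs < 1 || trfuncs > MAXRFCB || i < 0 || i >= trfuncs)
        return -1;
    /* Each descriptor occupies an 8-character slot starting at 8*i. */
    snprintf(out, outsz, "%.8s=%d", rftypes + 8 * i, (int)rfcnts[i]);
    return 0;
}
```

Looping i from 0 to trfuncs - 1 would list the data-error count first, followed by each artifact rejection class.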
The verpos and pp10uv header elements are set after averaging
has been performed, at the stage of normalization of the data. They are crucial
to further processing of the data and should be treated with care. verpos
stands for "vertex positive". It can have three values. If verpos is 1, the
data should be interpreted to mean that a positive integer value in the
data represents a positive voltage difference between an active electrode
and the reference electrode. A -1 in verpos means the opposite, while a 0
(zero) indicates the data are not normalized and the polarity is unknown.
A zero (0) in verpos also indicates that the value in pp10uv is not correct,
but that 1000 was placed there to allow plotting of the unnormalized data
for diagnostic purposes. The averaging program zeroes verpos and places 1000
in pp10uv. pp10uv stands for "points per 10 microvolts", and represents
the point value difference which represents a 10 microvolt difference in
the data. This is often set to 1000, giving a data resolution of one one-hundredth
(.01) of a microvolt per point.
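One reading of how verpos and pp10uv combine to turn a stored point into a voltage is sketched below. The function name is hypothetical, and multiplying by verpos to restore the sign is an interpretation of the polarity rule above rather than code taken from the EPL programs.

```c
#include <assert.h>

/* Convert one data point to microvolts, where a positive result means the
 * active electrode is positive relative to the reference.
 * Returns 0 on success, -1 when the data are not normalized (verpos is
 * neither 1 nor -1) or pp10uv is not positive. Hypothetical helper. */
static int epl_point_to_uv(int point, int pp10uv, int verpos, double *uv)
{
    if ((verpos != 1 && verpos != -1) || pp10uv <= 0)
        return -1;
    /* pp10uv points span 10 microvolts; verpos restores the polarity. */
    *uv = verpos * (10.0 * point / pp10uv);
    return 0;
}
```

With pp10uv at 1000, a point value of 1 corresponds to 0.01 microvolt, matching the resolution quoted in the text.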
The totrr, totrej, and sbcode integer
elements are set by the averaging program for informational purposes only;
they are not currently read by any programs. The totrr, or "total raw records"
element, contains the total number of trials assigned to this averaging
bin. The totrej, or "total raw rejects", on the other hand, specifies how
many of these raw records were rejected because of data errors or artifact
rejection failures. Note that totrr is the sum of the totrej and sums elements.
Finally, the sbcode is used to number the data records in an averaged data
file; it is not really necessary.
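Although no program currently reads these elements, the stated relation between them gives a cheap sanity check on an averaged header. A minimal sketch, with a hypothetical function name:

```c
#include <assert.h>

/* Returns 1 when the trial counts agree with the relation in the text
 * (totrr == totrej + sums), 0 otherwise. Hypothetical helper. */
static int epl_counts_consistent(int totrr, int totrej, int sums)
{
    return totrr == totrej + sums;
}
```

For instance, a bin with 120 raw records of which 20 were rejected should report 100 in sums.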
The evtno, epleng, evtimhi, evtimlo, and ccoder integer elements are holdovers from the discrete epoch system
and remain for that use in the future. They are diddled internally by the
averaging program while the individual trials are being extracted from
the continuous data, but should not be relied on in an averaged data file
to contain anything but garbage.
The header structure
has been updated since this document was written; the updates pertain mostly
to supporting up to 64 channels of data. See the actual header source code
file /usr/local/erp/include/header.h for the latest.
Table of Contents
- Name
- Description
- General
- The Header Structure
- Fixed Parameters
- Tunable Parameters
- Experiment, Subject, Condition, and Bin Descriptions
- Channel Descriptions
- rawname
- nchans, odelay, and ctickt
- sums, tpfuncs, pftypes, presam, and cprecis
- rftypes, rfcnts, and trfuncs
- verpos and pp10uv
- totrr, totrej, and sbcode
- evtno, epleng, evtimhi, evtimlo, and ccoder
- Expected Modifications