EEG Data

Input Data

Two datasets were used in this project: a control set consisting of 10 healthy patients with no known neurological diseases, and a test set consisting of 10 patients in a coma state who had suffered traumatic brain injury. The EEG data used in this project were provided by Dr. Shinung Ching and were retrospectively collected from 20 patients who underwent EEG for routine monitoring purposes related to a diagnosis of coma or less-severe DLOC. DLOC was assessed using the Glasgow Coma Scale (GCS) ≤ 0 at the time of EEG in the Neurological and Neurosurgical Intensive Care Unit at Barnes-Jewish Hospital and Washington University School of Medicine. For each case, EEG was carried out to detect non-convulsive seizures in patients with otherwise inadequately explained DLOC, and only the cases in which seizures were not detected at any point in the hospitalization were analyzed. It is important to note that diagnostic assessments classifying each coma patient’s injury as focal or diffuse were not provided in the data sets used for analysis. Since the severity and localization of each test patient’s brain injury were not provided, we sought to draw conclusions about the overall effects of traumatic brain injury on brain network dynamics. By completing Human Subject Training through the Collaborative Institutional Training Initiative (CITI), we satisfied the educational requirements necessary to conduct human subject research at Washington University.

EEG recordings were collected at a sampling frequency of either 250 or 500 Hz using the standard 10-20 system of electrode placement (Jasper, 1958). The 500 Hz data were downsampled to 250 Hz prior to analysis. The duration of each recording was approximately 20 minutes. Sampling frequencies for functional connectivity EEG studies typically range between 250 and 512 Hz, so we took into account that a lower sampling frequency may result in greater variability in the measured coupling between two signals (van Diessen et al., 2015). A bipolar montage was used in our analysis with 18 bipolar channels (Fp1-F7, F7-T7, T7-P7, P7-O1, Fp1-F3, F3-C3, C3-P3, P3-O1, Fz-Cz, Cz-Pz, Fp2-F4, F4-C4, C4-P4, P4-O2, Fp2-F8, F8-T8, T8-P8, and P8-O2). A visual of the electrode locations and the corresponding channels is included in Figure 2. In a bipolar montage, each channel recording represents the difference between two electrodes on the scalp, using one location as a reference point. The recordings from channels 5 and 15 were discarded from our analysis because they share the same nominal electrode as channels 1 and 11, respectively (Fp1 for channels 1 and 5, Fp2 for channels 11 and 15), and were therefore treated as measuring the same signals.
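As an illustration only, the bipolar derivation and the removal of channels 5 and 15 can be sketched in Python. This is not the code used in the project, which was carried out in Matlab. The function names, the dictionary of per-electrode recordings, and the convention of subtracting the second electrode from the first are all assumptions made for the example; the data may well have been supplied already in bipolar form.

```python
# The 18 bipolar channels in the order listed in the text (channel 1 first).
BIPOLAR_PAIRS = [
    ("Fp1", "F7"), ("F7", "T7"), ("T7", "P7"), ("P7", "O1"),
    ("Fp1", "F3"), ("F3", "C3"), ("C3", "P3"), ("P3", "O1"),
    ("Fz", "Cz"), ("Cz", "Pz"),
    ("Fp2", "F4"), ("F4", "C4"), ("C4", "P4"), ("P4", "O2"),
    ("Fp2", "F8"), ("F8", "T8"), ("T8", "P8"), ("P8", "O2"),
]

# 1-indexed channel numbers that the text discards.
DISCARDED_CHANNELS = {5, 15}


def bipolar_montage(electrodes):
    """Return one difference signal per bipolar pair.

    `electrodes` is a hypothetical dict mapping an electrode name to a
    list of samples. Each channel is computed as first minus second,
    which is an assumed sign convention.
    """
    channels = []
    for first, second in BIPOLAR_PAIRS:
        a, b = electrodes[first], electrodes[second]
        channels.append([x - y for x, y in zip(a, b)])
    return channels


def drop_discarded(channels):
    """Remove channels 5 and 15, leaving the 16 channels used for analysis."""
    return [
        signal
        for number, signal in enumerate(channels, start=1)
        if number not in DISCARDED_CHANNELS
    ]
```

Calling `drop_discarded(bipolar_montage(electrodes))` on a full set of electrode recordings yields 16 channel signals, matching the channel count used in the analysis.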



A key characteristic of the data is that they were collected at ‘resting state,’ defined as a state in which the subject is awake and not performing an explicit mental or physical task, which makes them well suited to studying patterns in brain activity (van Diessen et al., 2015). One drawback of using resting-state data is the large degree of variability in data quality, since the behavior of subjects is not as easily controlled as it is during task-related activity. Furthermore, we were not able to witness the exact procedures used to collect the data to ensure that subjects in both the control and test groups had a similar pre-experimental procedure. Considering these constraints, we were wary of overfitting to spontaneous brain activity or artifacts that may give a false representation of activation patterns. As such, our methods aimed to minimize bias by sampling the data randomly and by comparing graphs generated across multiple thresholds when analyzing network dynamics.

Preliminary EEG Processing for Network Analysis

After receiving the raw data, we filtered the data for noise and eliminated the direct current (DC) component. Using a second-order Butterworth bandpass filter implemented in Matlab, we filtered the data to select frequencies between 0.1 and 50 Hz. Additionally, because channels 5 and 15 share an electrode with channels 1 and 11, respectively, we eliminated these two channels, so that a total of 16 channels were used in our analysis. Due to the high variability in levels of vigilance at the beginning and end of an EEG recording, we selected the center 80% of each channel recording. Lacking software tools to identify artifacts reliably, we proceeded without targeted artifact removal, on the assumption that the data were collected at relatively high quality and that artifacts would therefore have a minimal effect on our analysis. A comparison of the raw and filtered data of Control Patient 1’s channel 1 recording is featured in Figure 3.
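The selection of the center 80% of a recording can be sketched as follows. This is a minimal Python illustration, not the Matlab code used in the project. The function name is hypothetical, and trimming both ends equally (10% from each) is an assumption about how the center portion was chosen.

```python
def center_fraction(samples, fraction=0.8):
    """Keep the middle `fraction` of a recording.

    Roughly equal numbers of samples are trimmed from the start and the
    end; when the trimmed count is odd, the extra sample comes off the end.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n = len(samples)
    keep = int(round(n * fraction))   # number of samples retained
    start = (n - keep) // 2           # samples trimmed from the start
    return samples[start:start + keep]
```

For a recording of about 20 minutes at 250 Hz (300,000 samples), this keeps 240,000 samples, or about 16 minutes, and discards roughly the first and last 2 minutes.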


EEG Data Used in Time-series Analysis

For time-series analysis we used subsets of the same data used in our cross-correlation analysis, namely 300 to 500 data points of the EEG recordings provided to us by Professor Ching’s lab. Using smaller subsets of the data allowed us to better analyze finer trends in the data. Since our data were downsampled to 250 Hz, every 250 points represent one second of data, so a subset of 300 to 500 points spans 1.2 to 2 seconds. Consequently, we could use the index of the data as a marker for time.
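The relationship between sample index and time can be written out explicitly. The short Python sketch below is illustrative only; the function names are hypothetical, and it assumes the first sample of a subset is taken as time zero.

```python
FS_HZ = 250  # sampling frequency after downsampling, as stated in the text


def index_to_seconds(index, fs_hz=FS_HZ):
    """Convert a zero-based sample index to elapsed time in seconds."""
    return index / fs_hz


def time_axis(n_points, fs_hz=FS_HZ):
    """Return the time stamp, in seconds, of each sample in a subset."""
    return [index_to_seconds(i, fs_hz) for i in range(n_points)]
```

With these definitions, `index_to_seconds(250)` gives 1.0 second, and consecutive samples are 0.004 seconds (4 ms) apart.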