2008-06-19

Time stamp issues in EDF data converted from Stellate SIG format

by Forrest Sheng Bao http://fsbao.net

I think there is obviously a bug in the Stellate Harmonie system, the software that samples data from EEG electrodes, manages them, and converts them into EDF format. The EDF data converted from SIG has many problems: time stamps in the wrong order, time stamps with the wrong seconds, all-zero amplitudes, an incorrect starting time, etc. I spent a lot of time trying to understand what was going on. At first, I was even frustrated that I couldn't use the data. Let's have a look at the data in their own SIG format and the data in the EDF file converted from it.



This is a snapshot of one channel of data viewed in their software, Harmonie. The resolution of the Y axis is 10 uV/mm, and the time interval between two vertical green bars is 1 second.



This is the plot from the EDF file converted by Harmonie, showing the same data as the picture above. The Y ticks are voltages in uV and the X ticks are sample numbers (200 samples = 1 second).

First, the signs of the two images above are opposite. This is because, by default, the polarity in the Stellate Harmonie system is negative up, so their Y axis is upside down. You can't change this setting in version 6.1, the one I got from the Chinese hospital. In version 6.2 of Stellate Browser you can: click the menu "Channels" -> "Edit", and set "Polarity" by clicking "Pos. Up".

The time stamps in the EDF data have more problems. I used the Amplitude-Time cursor tool (in the Stellate Browser menu: Tools -> Amplitude-Time cursor) to check the exact value of each point in the SIG data and compare it with the value in the EDF data.

1. The time stamp of the first sampling point.

First, according to the EDF+ protocol, the time stamp of the first sampling point should be 0, which marks the start of the series. If your sampling rate is 200 Hz, then the time stamp of the second point is 0.005, i.e. 0.005 second after the start. The EDF file I got does not start at 0, which is why it can't pass the EDF compatibility test in several programs. Let's have a look at the first sampling point of the EDF file:

23.303000,7.782218,-15.717028,5.035553,-9.765921,3.814813,-2.136295,4.272590,7.171848,9.765921,13.580733,19.837026,-30.060724,10.681476,-1.831110,-4.882960,11.749623,7.477033,-8.392588,13.580733,1.678518,-2.746665,11.597031
The 23.303 means this point was sampled 23.303 seconds after the record starts. The comma-delimited entries that follow correspond to the signals defined in the EDF header. For example, 7.782218 corresponds to the Fp1 electrode.
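To make the row layout concrete, here is a minimal Python sketch that splits one row into its time stamp and channel amplitudes, and computes the time stamp a sample would have if the series started at 0. The names `parse_row` and `expected_stamp` are made up for illustration, and the sketch assumes the EDF data has been dumped to comma-separated text like the row above.

```python
FS = 200  # sampling rate in Hz, as stated in this post


def parse_row(line):
    """Split one comma-separated row into (time_stamp, amplitudes)."""
    fields = [float(x) for x in line.strip().split(",")]
    return fields[0], fields[1:]


def expected_stamp(sample_number, fs=FS):
    """Time stamp of a 1-based sample number if the series starts at 0."""
    return (sample_number - 1) / fs


# First three fields of the row shown above (the remaining channels are omitted).
stamp, amplitudes = parse_row("23.303000,7.782218,-15.717028")
print(stamp, amplitudes[0])  # time stamp and the first channel (Fp1 in this post)
print(expected_stamp(2))     # second sample at 200 Hz
```

With a first time stamp of 23.303 instead of `expected_stamp(1)`, which is 0, the file is offset from the very first row.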

I was very confused about the relationship between the time stamps in the EDF file and the time instants in the SIG file, so I did a small experiment. I read the amplitudes of the first several samples of the SIG data in Stellate Browser and compared them with the data read from the EDF file.

| Sample | Time stamp on SIG data | Time stamp on EDF data | Amplitude on SIG data (uV) | Amplitude on EDF data (uV) |
|---|---|---|---|---|
| #1 | 17:22:51:305 | 23.303 | 8 | 7.782128 |
| #2 | 17:22:51:310 | 23.308 | 19 | 18.768878 |
| #3 | 17:22:51:315 | 23.313 | 8 | 8.239995 |
| #4 | 17:22:51:320 | 23.318 | 12 | 12.20741 |
| #5 | 17:22:51:325 | 23.323 | 6 | 4.425183 |

OK, so I think I don't need to care about the time stamps themselves. The lines in the EDF file follow the same order as the samples in the SIG data, one every 0.005 second.

2. Time stamps for any arbitrary interval

The doctors in China will tell me the times of ictal activities as absolute clock times, like 22:23:34 Jun. 03, 2008. I need to be able to locate the sampling point corresponding to a given time instant.
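A minimal sketch of that lookup, assuming the relationship is linear and taking the SIG time stamp of sample #1 (17:22:51.305) as the reference. The function name `sample_number` is hypothetical, and the sketch only handles clock times on the same day as the first sample.

```python
from datetime import datetime

FS = 200                      # sampling rate in Hz
FIRST_CLOCK = "17:22:51.305"  # SIG time stamp of sample #1 in this post
CLOCK_FORMAT = "%H:%M:%S.%f"


def sample_number(clock, first_clock=FIRST_CLOCK, fs=FS):
    """Map a same-day clock time to a 1-based sample number."""
    start = datetime.strptime(first_clock, CLOCK_FORMAT)
    delta = datetime.strptime(clock, CLOCK_FORMAT) - start
    # round() absorbs floating-point noise in seconds * fs
    return round(delta.total_seconds() * fs) + 1


print(sample_number("17:22:51.325"))  # sample #5 in the table above
```

A recording that runs past midnight would also need the date, which this sketch deliberately leaves out.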

I did another verification to make sure the relationship is linear. As you can see from the two snapshots above, there is a big peak, whose amplitude exceeds 100 uV, at around 5.5 seconds, i.e. around the 1100th sample. Here are the data for that area:


| Sample | Time stamp on SIG data | Time stamp on EDF data | Amplitude on SIG data (uV) | Amplitude on EDF data (uV) |
|---|---|---|---|---|
| #1109 | 17:22:56:850 | 28.842 | 109 | 108.951051 |
| #1110 | 17:22:56:855 | 28.847 | 109 | 109.408828 |
| #1111 | 17:22:56:860 | 28.852 | 92 | 92.103282 |
| #1112 | 17:22:57:865 | 28.857 | 78 | 78.432549 |
| #1113 | 17:22:57:870 | 28.852 | 76 | 76.143661 |
| #1114 | 17:22:57:875 | 28.862 | 84 | 83.925879 |
| #1115 | 17:22:57:880 | 28.867 | 112 | 112.308086 |
| #1116 | 17:22:57:885 | 28.872 | 98 | 97.506612 |
| #1117 | 17:22:57:890 | 28.877 | 80 | 79.500697 |
| #1118 | 17:22:57:895 | 28.882 | 85 | 85.146619 |
| #1119 | 17:22:57:900 | 28.887 | 104 | 104.373275 |

I think my guess is correct. The time between the 1st and the 1110th sample is (1110 - 1) x 0.005 = 5.545 seconds. The time stamp of sample 1110 in the SIG data should be 17:22:51.305 + 5.545 s = 17:22:56.850. The time stamp of sample 1110 in the EDF data should be 23.303 + 5.545 = 28.848. But the file says 28.847, not 28.848! I checked the data and found the problem: the time stamp of the 200th sample is 24.298, while that of the 201st sample is 24.302, a step of 0.004 instead of 0.005. I am not quite sure about the reason. Anyway, it's not a big deal.
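Hunting for such a short step by eye is tedious. The sketch below scans a list of EDF time stamps and reports every pair of neighbours whose spacing differs from 1/200 second. The function name `irregular_steps` and the tolerance are assumptions for illustration, not part of any Stellate or EDF tool.

```python
def irregular_steps(stamps, fs=200, tol=1e-6, first_number=1):
    """List (sample_number, stamp, next_stamp) where the step is not 1/fs.

    sample_number is the 1-based number of the earlier sample of the pair.
    """
    step = 1.0 / fs
    return [
        (first_number + i, stamps[i], stamps[i + 1])
        for i in range(len(stamps) - 1)
        if abs((stamps[i + 1] - stamps[i]) - step) > tol
    ]


# EDF time stamps of samples #1 to #5 from the first table: all regular.
print(irregular_steps([23.303, 23.308, 23.313, 23.318, 23.323]))

# The two stamps quoted above for samples #200 and #201: one short step.
print(irregular_steps([24.298, 24.302], first_number=200))
```

The tolerance is there because differences of decimal fractions are not exact in floating point.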

3. Time stamps of last 1 second

The time stamps of the last second are total nonsense. The time stamp can jump from 36.297 to 0. In the data above, from the 2601st sample to the 2687th sample, the time stamps run from 0 to 0.430. The last time stamp of the SIG data is 17:23:04.735, which should correspond to the 2687th sample ((17:23:04.735 - 17:22:51.305) x 200 + 1 = 2687). But the EDF file has a 2688th sample. What's wrong? Let's have a look at samples 2685 to 2688:

| Sample | Time stamp on SIG data | Time stamp on EDF data | Amplitude on SIG data (uV) | Amplitude on EDF data (uV) |
|---|---|---|---|---|
| #2685 | 17:23:04:720 | 0.420 | -21 | -21.363 |
| #2686 | 17:23:04:725 | 0.425 | -12 | -12.0548 |
| #2687 | 17:23:04:730 | 0.430 | -15 | -14.8015 |
| #2688 | There is no such sample in SIG data | 0.435 | There is no such sample in SIG data | -14.3437 |

So I still don't know where the 2688th sample comes from. I just ignore it.
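A jump back to 0 like this one can be located automatically by looking for any time stamp that is lower than the one before it. This is only a sketch: `find_resets` is a made-up name, and the example uses just the two values quoted above for the jump (36.297 followed by 0).

```python
def find_resets(stamps, first_number=1):
    """Return 1-based numbers of samples whose stamp is lower than the previous one."""
    return [
        first_number + i
        for i in range(1, len(stamps))
        if stamps[i] < stamps[i - 1]
    ]


# Sample #2600 carries 36.297 and sample #2601 carries 0.
print(find_resets([36.297, 0.0], first_number=2600))
```

Since the rows themselves stay in sampling order, the sample number can still be trusted after a reset even though the time stamp cannot.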

4. All 0's record

From 0.44 second to 0.995 second in the EDF data, the amplitudes of all channels are zero. I think maybe the Stellate Harmonie system wants to pad the record out to one complete second. Anyway, I just discard those data.
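Discarding the padding can be sketched as trimming trailing rows whose amplitudes are all exactly zero. The sketch assumes each row is a `(time_stamp, amplitudes)` pair, and `strip_zero_padding` is a hypothetical helper. The example keeps a single channel per row to stay short.

```python
def strip_zero_padding(rows):
    """Drop trailing rows in which every channel amplitude is exactly zero."""
    end = len(rows)
    while end > 0 and all(a == 0.0 for a in rows[end - 1][1]):
        end -= 1
    return rows[:end]


rows = [
    (0.430, [-14.8015]),  # sample #2687
    (0.435, [-14.3437]),  # sample #2688
    (0.440, [0.0]),       # padding starts here
    (0.445, [0.0]),
]
print(len(strip_zero_padding(rows)))
```

Only rows at the very end are removed, so an all-zero row in the middle of a recording is left alone.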

5. The time the record starts

As I said before, I need the starting time of a record to work out the number of samples up to an arbitrary absolute time. This is also funny. In the SIG data, the record starts at 17:22:51.305. But the EDF header says 17:22:28. I checked the header information of the SIG data, and it is also 17:22:28. So what happened between 17:22:28 and 17:22:51?

Anyway, this is not a problem since I have figured out the relationship between the time stamps in the SIG data and the EDF data. I just need to take the time stamp of the first sample in the SIG data as the starting time.
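As a side check on the numbers in this post, the sketch below adds the first EDF time stamp (23.303) to the header start time (17:22:28) and compares the result with the SIG time stamp of the first sample. It is plain arithmetic on values quoted above, not a confirmed explanation of how the converter works.

```python
from datetime import datetime, timedelta

header_start = datetime.strptime("17:22:28", "%H:%M:%S")               # from the header
first_edf_stamp = 23.303                                               # seconds, first EDF row
first_sig_sample = datetime.strptime("17:22:51.305", "%H:%M:%S.%f")    # SIG, sample #1

# Clock time implied by the header start plus the first EDF time stamp.
candidate = header_start + timedelta(seconds=first_edf_stamp)
gap_ms = (first_sig_sample - candidate).total_seconds() * 1000

print(candidate.strftime("%H:%M:%S.%f"), round(gap_ms))
```

If the EDF time stamps are counted from the header start time, that would account for most of the gap, but this remains a guess.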

Done!

OK, so it is really hard to understand a format that you haven't seen before, especially with such strange software. I still have one problem: the actual data from long-term EEG monitoring are very big, like 2 GB or 3 GB, but my old program loads the entire data set into memory. Now that is impossible, so I need to play some programming tricks.
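One common trick for this is to read the data lazily and process it in fixed-size chunks, so that only one chunk is in memory at a time. The sketch below assumes the data is available as comma-separated text, one sample per line. `iter_rows` and `iter_chunks` are hypothetical helpers, and the demo values are rounded from the first table.

```python
import io
from itertools import islice


def iter_rows(handle):
    """Yield (time_stamp, amplitudes) one line at a time, skipping blank lines."""
    for line in handle:
        line = line.strip()
        if line:
            fields = [float(x) for x in line.split(",")]
            yield fields[0], fields[1:]


def iter_chunks(rows, size):
    """Group any iterable of rows into lists of at most `size` rows."""
    rows = iter(rows)
    while True:
        chunk = list(islice(rows, size))
        if not chunk:
            return
        yield chunk


# In-memory stand-in for a large file; a real run would pass open(path) instead.
demo = io.StringIO("23.303,7.78\n23.308,18.77\n23.313,8.24\n")
chunk_sizes = [len(chunk) for chunk in iter_chunks(iter_rows(demo), 2)]
print(chunk_sizes)
```

At 200 Hz a chunk size of 200 corresponds to one second of signal, which may be a convenient unit to work with.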
