Think, Forrest! Think!: 3/1/08

2008-03-24

What is the truth? I don't wanna know

by Forrest Sheng Bao http://fsbao.net

Many people are very politically sensitive and politically responsible these days. Well, most of them are students in a lab I sometimes visit, for like 5 minutes per day. But even in those 5 minutes, they can start an argument with me. They seemed very angry about some words in my instant messenger or Facebook status.

They asked me for my perspective on a recent issue. Well, it's not an easy question to answer. So I just answered that I can't make a judgment until I know the truth.

This made them angry: "You only believe the bullshit in western media". Well, fine. Actually I didn't know about this issue until I saw a friend's Gtalk signature. I don't care about most things farther than 5 miles from my university. I asked that friend about it and forgot it in a minute. A few days later, when I went to the local Chinese church, I found the hot topic was not daily life or faith in God. Then I thought I should know what's going on.

So I never believed that "western media bullshit". I don't even know what "the bullshit" is.

Ok, then "Can you tell me something that is not bullshit?" They said blah blah blah.

I asked them, "Why do you think that is the truth?" They said they just believe so.

So I asked them, "Why do you believe something?" They couldn't answer.

Anyway, I won't believe anything until some evidence forces me to believe it. So please allow me to know something before I believe you. I won't believe you just because of what you said. Repeating it won't work. So please allow other people to tell me what's going on.

But the problem is, I don't care about this. I don't wanna know what the truth is about "that thing". This spring break, I finished two major parts of my two research projects, one of which is my master's thesis. I have a tight schedule. I don't even care how high the gas price is, coz I have no car.

I think it is a disaster for a society when most people focus on the same things beyond their daily life. The world is diverse. On politics, I only care whether Hillary can become president, coz her positions affect my interests. I don't know who the Prime Minister of Canada is, or the lengthy history of Kosovo. If I were a reporter, some Canadians or Serbians would call me an idiot. How many things do you know about a country you don't live in?

I care about some things and you care about something else. It has nothing to do with social responsibility. How do you know my job has no benefit to the world? My job is to change the world. Is that OK?

2008-03-22

Western media incorrectly reported Tibet issues?

by Forrest Sheng Bao http://fsbao.net

I just saw some evidence offered against the "incorrect reports of western media". So I tried to check it and found that some of it is made up. The website is: http://www.anti-cnn.com/ I will call them Anti-CNN for short.

Let's have a look at their first piece of evidence, regarding CNN. They (not a CNN reporter) said CNN cropped a picture and left out the fact that someone was throwing stones at army vehicles. Let's have a look at the official CNN website: http://edition.cnn.com/2008/WORLD/asiapcf/03/15/tibet.unrest/index.html?eref=edition
There is a line under the picture: Tibetans throw stones at army vehicles as a car burns on a street in the capital of Lhasa. So, did CNN "incorrectly" report "the truth"? (The headline of that page has been updated, so it differs from the headline in the snapshot.)

Let's have a look at the BBC. Anti-CNN posted a snapshot of the BBC website. The caption of a picture in that snapshot is: There is a military presence in Lhasa. Well, it differs from the caption on the official BBC website http://news.bbc.co.uk/2/hi/asia-pacific/7300312.stm which says: There have been many reports of injuries and deaths in Lhasa.

Come on, don't try to convince me with made-up evidence.

PS: This incorrect "disproof" now appears on the webpage of CCTV (China's official TV): http://news.cctv.com/china/20080323/100134.shtml

I strongly recommend that CCTV remove it, coz it harms the reputation of China and the Chinese government.

You can say NO to certain "western media". Isn't it just freedom?

by Forrest Sheng Bao http://fsbao.net

Recently, there has been a movement generally referred to as the "Anti-Western-Media movement". Basically, it is a group of Chinese students who think western media reported something about Tibet incorrectly. So they published some evidence online to say no to "western media".

Their general ideas are:

The western "freedom of speech", along with freedom and democracy, is a lie.
The western media like to blacken China's image.
We don't like the western world.

I just wanna briefly list my points of view:

Are all western media reports incorrect? I just think the banner is too broad. Some western media also took the opposite view, right?
I think the incorrect reports have nothing to do with democracy or freedom. Right now no western reporters are allowed to enter Tibet. But they have to produce reports to attract audiences. So they may use unreliable information sources. It's just business behavior. It has no relationship with democracy or freedom.
Where do you put your videos, photos and PDF docs to say NO to "western media"? I think they are on "western" sites, right? For example, we can watch videos from both sides on YouTube. Well, a place where you can hear two opposing voices, and both can express themselves freely: isn't that just freedom of speech?
Some of the evidence they listed is not solid. For example, they referred to many website snapshots. Well, why don't you just put the links to those webpages? Anyway, I just can't find the Fox News webpage. Other evidence (in English) has nothing to do with "incorrect reporting". The Chinese notes on those snapshots are inconsistent with the original English texts.
The western world is not good, and it harmed you coz it harmed your country. Then why are you still living in the western world, and why do you even wanna become a citizen of it?

2008-03-20

Package repositories of openSuSE

by Forrest Sheng Bao http://fsbao.net

I really can't understand why openSuSE didn't add the official online repositories for me. I have to add them manually. Installing software over the network rather than from local files is a real advantage of Linux. Users prefer an out-of-the-box configuration, right?

Well, here is the link that tells you the URLs of the official and semi-official repositories: http://en.opensuse.org/Package_Repositories#Official_Repositories

Enjoying openSuSE~

2008-03-18

Why Z-transform is related to digital filter design

by Forrest Sheng Bao http://fsbao.net

A friend of mine asked me an interesting question: Why do we learn the Z-transform before studying digital filter design? Well, this is interesting. But the answer is simple.

By convention, lowercase function names are in the time domain and uppercase function names are in the Z-domain, i.e. the complex frequency domain for discrete-time signals. Y or y means the output whereas X or x means the input. H or h means the filter.

Abbreviations and notations:
ZT: z-Transform
FIR: Finite Impulse Response
IIR: Infinite Impulse Response
Z(x): performing z-transform on signal x

The Digital Filter and Transfer Function

A digital filter works on a sampled, discrete-time signal, rather than on a continuous signal as analog filters do. There are only a few operations we can perform on a time series: summation, scalar multiplication and delay. The flow of data can also go backward, which is the feedback in an IIR filter. For example, a smoothing filter that averages 3 consecutive points can be built by summing the current input, the previous input and the one before that, then scaling the sum by 1/3.

A digital filter has an input and an output, which are the original series and the filtered series respectively. If you consider a digital filter as a black box, the relationship between its input and output can be represented by a transfer function H(z) in the Z-domain:

H(z)= \frac{N(z)}{D(z)} = \frac{ b_0 + b_1 z^{-1} + b_2 z^{-2} + \cdots + b_L z^{-L} }{ 1+ a_1 z^{-1} + a_2 z^{-2} + \cdots + a_M z^{-M} } (1)

such that
Y(z) = X(z) H(z).

The inverse z-transform (iZT) of the transfer function is the impulse response, which characterizes the system behavior in the time domain. Depending on whether the transfer function H(z) has a non-trivial denominator (some a_i \neq 0), the impulse response of the system can be finite or infinite, which leads to two types of filters: the FIR filter and the IIR filter. Before explaining those two types, I need to introduce a very important property of the ZT.

The Time-shifting Property of z-Transform and the building of Digital Filters

The ZT can be considered a discrete-time version of the Laplace transform, obtained by discretizing time. The ZT has a property known as the time-shifting property:
Z[x(n-k)]=z^{-k}Z[x(n)].
A negative power of z in the Z-domain means a delay in the time domain. A causal system cannot have positive powers of z in its transfer function, since that would mean using a sample before it arrives. You cannot have a signal before it happens. Time-domain summation, scalar multiplication and delay are thus represented by the sums, coefficients and powers of z in the transfer function. That is why people always relate digital filter design to the ZT. For example, in the time domain, the I/O equation of a smoothing filter is

y(n)=\frac{x(n)+x(n-1)+x(n-2)}{3} (2)

Applying the ZT to both sides of the equation and using the time-shifting property, we have

Y(z) = X(z) H(z) = X(z) \frac{1+ z^{-1} + z^{-2}}{3} (3)

So, once you see the transfer function, you know how to build such a filter in time domain.
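As a minimal sketch of that idea, the smoothing filter of Eq. 2 can be written with nothing but sums, a scale factor and delays. The function name `smooth3` and the choice to treat samples before the start of the signal as zero are my assumptions for illustration.

```python
def smooth3(x):
    """3-point moving average: y(n) = (x(n) + x(n-1) + x(n-2)) / 3.

    Samples before the start of the signal are assumed to be zero.
    """
    y = []
    for n in range(len(x)):
        x1 = x[n - 1] if n >= 1 else 0.0   # x(n-1): one delay, z^-1
        x2 = x[n - 2] if n >= 2 else 0.0   # x(n-2): two delays, z^-2
        y.append((x[n] + x1 + x2) / 3.0)
    return y


# Feeding a unit impulse gives the impulse response:
# three taps of 1/3 followed by zeros, so it is finite.
impulse_response = smooth3([1.0, 0.0, 0.0, 0.0, 0.0])
```

Each power of z in the numerator shows up as one delayed input term, and the response to the impulse stops after three samples.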

Did we miss something? Oh, the feedback. As you can see, Eq. 2 has no feedback. Why? And what if I want a filter that has feedback in the time domain?

FIR and IIR: The Difference on the Denominator

Take a look at Eq. 1. If you multiply both sides of Y(z) = X(z) H(z) by the denominator D(z), you will notice that the left-hand side, the output side, contains negative powers of z, that is, delayed outputs. Hence, if any denominator coefficient a_i is nonzero, the output depends not only on the input and its delays but also on previous outputs: the feedback.

A clearer picture can be seen by transforming Eq. 1 back to the time domain:

y(n) = -a_1 y(n-1) - \cdots - a_M y(n-M) + b_0 x(n) + \cdots + b_L x(n-L)
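The difference equation above can be run directly. Here is a rough sketch in plain Python; the function name `difference_equation`, the argument layout (b holds b_0..b_L, a holds a_1..a_M without the leading 1) and the zero initial conditions are assumptions made for illustration.

```python
def difference_equation(b, a, x):
    """y(n) = -a_1 y(n-1) - ... - a_M y(n-M) + b_0 x(n) + ... + b_L x(n-L).

    b = [b_0, ..., b_L]; a = [a_1, ..., a_M] (the leading 1 of D(z) is implied).
    Samples before n = 0 are taken as zero.
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, bk in enumerate(b):            # feed-forward: delayed inputs
            if n - k >= 0:
                acc += bk * x[n - k]
        for k, ak in enumerate(a, start=1):   # feedback: delayed outputs
            if n - k >= 0:
                acc -= ak * y[n - k]
        y.append(acc)
    return y


# No feedback coefficients: the 3-point smoothing filter.
fir = difference_equation([1 / 3, 1 / 3, 1 / 3], [], [3.0, 3.0, 3.0, 3.0])

# One feedback coefficient, a_1 = -2: the response to an impulse keeps going.
iir = difference_equation([1.0], [-2.0], [1.0, 0.0, 0.0, 0.0])
```

With an empty `a` the output only ever sees the input; with a nonzero a_1 every output sample is fed back into the next one.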

If any a_i is nonzero, the response to a finite input can last forever, since previous outputs keep contributing to the current output. For example,

y(n) = 2 y(n-1) + x(n) = 2 (2 y(n-2) + x(n-1)) + x(n) = 4 y(n-2) + 2 x(n-1) + x(n) = 4 (2 y(n-3) + x(n-2)) + 2 x(n-1) + x(n) = \cdots

For any finite, nonzero input, the output of such a system never dies out. The impulse response is also infinitely long. So a digital filter whose transfer function has at least one a_i \neq 0 is called an IIR filter.

In this example the output keeps growing, so the system diverges. If you study the relationship between the transfer function and system stability, you will find that a causal system is stable if and only if all poles of H(z) lie inside the unit circle of the z-plane. Here H(z) = 1/(1 - 2 z^{-1}) has its pole at z = 2, outside the unit circle. An IIR filter may be unstable if you choose bad denominator coefficients for its transfer function.
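To see the stability condition at work, here is a small sketch for the simplest feedback filter, y(n) = -a_1 y(n-1) + x(n), whose single pole sits at z = -a_1. The function names and the two coefficient values are chosen only for illustration.

```python
def first_order_impulse_response(a1, length):
    """Impulse response of y(n) = -a1 * y(n-1) + x(n), i.e. H(z) = 1 / (1 + a1 z^-1)."""
    h = []
    prev = 0.0
    for n in range(length):
        x = 1.0 if n == 0 else 0.0   # unit impulse
        prev = -a1 * prev + x
        h.append(prev)
    return h


def pole(a1):
    """H(z) = 1 / (1 + a1 z^-1) = z / (z + a1), so the only pole is z = -a1."""
    return -a1


diverging = first_order_impulse_response(-2.0, 6)   # pole at z = 2, outside the unit circle
decaying = first_order_impulse_response(-0.5, 6)    # pole at z = 0.5, inside the unit circle
```

Both responses are infinitely long in principle, but only the one whose pole lies inside the unit circle shrinks toward zero.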

So, let's talk about the FIR filter. If all denominator coefficients a_i of the transfer function are 0, so that the denominator is just 1, the output depends purely on the input, without any feedback from previous outputs. The response of an FIR filter to a finite input is finite, as in the example in Eqs. 2 and 3. You can consider the FIR filter as a special case (denominator = 1) of the IIR filter.

2008-03-08

Truncating genes sequenced with 5' and 3' adapters

by Forrest Sheng Bao http://fsbao.net

Some genes in DNA sequences are very short, for example 15 bases. They are too short to sequence. So we can add adapters at the 5' and 3' ends to extend them to 34 bases. Besides, we have a huge pool of sequences from different sources, and we mixed them up for sequencing. So we can also use the adapter as an indicator to mark the source of the gene.

Here I wanna briefly explain my algorithm. I will use this sequence as an example: CATTCATGGACGTTGATAAGATCTTTCGTATGC, from a big pool of 5 million gene segments that cost us $3,000 to sequence.

The 5' adapter can be one of the four combinations: AC, CA, GT or TG. The 3' adapter is TCGTATGCCGTCTTCTGCTTG.

Step 1:

Classifying sequences by the first 2 bases at the 5' end.
Let CA be class 1. The example sequence occurs 901 times in the sequence pool.

Sequences that don't start with any 5' adapter are put into Trash 1.

Step 2:

Truncating first 2 bases at the 5'

So now the sequence becomes: TTCATGGACGTTGATAAGATCTTTCGTATGC

Step 3:

Searching for and truncating the 3' adapter at the 3' end of the sequence
We set a sliding window whose length, denoted i, ranges from 3 to 15. The initial value of i is 3.
We compare the last i bases of the sequence with the first i bases of the 3' adapter. If they are not identical, we increase i by 1 and repeat the comparison.

3.1 If they become identical while i is no larger than 15, truncate the last i bases of the sequence.
3.2 If they only become identical after i has grown beyond 15, dump this sequence to Trash 2.
3.3 If they are still not identical once the window length has reached the length of the 3' adapter, dump this sequence to Trash 3.

Now this sequence becomes: TTCATGGACGTTGATAAGATCTT

Step 4:

Making sure there is no segment of the 3' adapter in any other part of the remaining sequence
Search for the first 9 bases of the 3' adapter in the sequence. If they can't be found, leave the sequence alone. Otherwise, dump the sequence to Trash 4.

Step 5: Wrapping up

If a sequence hasn't been put into any trash pool, we call it a "useful sequence".
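The five steps above can be sketched in Python as follows. This is only an illustration of the procedure as described, not my production code: the function names, the pool labels, and the reading that a match at exactly i = 15 still counts as in-range are assumptions.

```python
from collections import Counter

FIVE_PRIME_ADAPTERS = ("AC", "CA", "GT", "TG")
THREE_PRIME_ADAPTER = "TCGTATGCCGTCTTCTGCTTG"
MIN_WINDOW, MAX_WINDOW = 3, 15   # assumed: i = 15 is still an accepted window
PROBE_LENGTH = 9                 # Step 4 looks for the first 9 adapter bases


def classify(read):
    """Return (pool, five_prime_class, remaining_sequence) for one read."""
    # Step 1: the first 2 bases must be one of the 5' adapters.
    five = read[:2]
    if five not in FIVE_PRIME_ADAPTERS:
        return "trash1", None, read
    # Step 2: drop the 5' adapter.
    seq = read[2:]
    # Step 3: grow the window until the tail equals the head of the 3' adapter.
    matched = None
    for i in range(MIN_WINDOW, min(len(seq), len(THREE_PRIME_ADAPTER)) + 1):
        if seq[-i:] == THREE_PRIME_ADAPTER[:i]:
            matched = i
            break
    if matched is None:
        return "trash3", five, seq
    if matched > MAX_WINDOW:
        return "trash2", five, seq
    seq = seq[:-matched]
    # Step 4: no piece of the 3' adapter may remain inside the sequence.
    if THREE_PRIME_ADAPTER[:PROBE_LENGTH] in seq:
        return "trash4", five, seq
    # Step 5: everything that survived is useful.
    return "useful", five, seq


def tally(occurrences):
    """Sum occurrence counts per pool; occurrences maps read -> count."""
    totals = Counter()
    for read, count in occurrences.items():
        totals[classify(read)[0]] += count
    return totals


example = classify("CATTCATGGACGTTGATAAGATCTTTCGTATGC")
```

For the example read, the window matches at i = 8, so `example` comes out as the useful sequence TTCATGGACGTTGATAAGATCTT in class CA.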

Sum up the occurrences of the sequences in each pool (Trash 1, Trash 2, Trash 3, Trash 4 and useful sequences) respectively. Different sequences occur different numbers of times, so each sequence is counted by its number of occurrences.

There are 256,209 useful sequences, accounting for 76%.
30,576 sequences don't match any 5' adapter, accounting for 9%.
34,900 sequences are in Trash 2 and 12,032 sequences are in Trash 4, accounting for 10% and 3% respectively.
There are no sequences in Trash 3.

The total occurrence of sequences is 336,717.

2008-03-02

Microsoft and (Amazon and Adobe's new support for Linux)

by Forrest Sheng Bao http://fsbao.net

Great news, Linux zealots! Two industry giants, Amazon and Adobe, just made nice with Linux users.

Linux users may have always wondered how to purchase music on Linux, since there is no iTunes for Linux. Well, where there is a new market, there are new competitors. Amazon is now trying to attract those Linux users to purchase music from Amazon. Amazon has released binary packages for Ubuntu 7.10, Debian 4, Fedora 8 and OpenSuSE 10.3. Wow, from now on I will buy music from Amazon rather than through iTunes on my Mac, coz I love Linux more than Mac. I am now waiting for Amazon's movie player for Linux. Then I won't need to rent movies from Apple.

Adobe said on Feb. 25 that it will release a Linux version of AIR, its powerful web development environment, later this year. Additionally, Adobe is committed to contributing to the open-source community on multiple fronts, including the open-source Flex framework and open-source BlazeDS for high-speed data connectivity.

While more and more industry leaders extend their greetings to Linux, "a great player that makes a team greater", 158 pages of internal Microsoft emails reveal scandalous truths about the squabbles that took place in the lead-up to Vista's launch. Download the full PDF version of those emails. Here are some extremely funny things:

Mike Nash, a Vice President at Microsoft, said he purchased a "Vista capable" laptop that can't run the Aero interface. He concluded "I now have a $2100 email machine".

Jon Shirley, a member of Microsoft's board, found that MSN Messenger was not compatible with Vista! Another computer he had just bought couldn't install Windows Vista. He complained "I cannot understand with a product this long in creation why there is such a shortage of drivers, I suppose the vendors did not trust us . . . enough to use the beta for driver testing?"

How did Microsoft CEO Steve Ballmer reply? He said "You are right that people did not trust us".

Yes, Mr. Ballmer, I don't trust Microsoft.

Plot to U. Bonn's data

by Forrest Sheng Bao http://fsbao.net

I tried to plot the data from a group at U. Bonn. http://www.epileptologie-bonn.de/front_content.php?idcat=193&lang=3&changelang=3

Here are five tar files, corresponding to the five data sets in that group's paper. As described in their PRE paper: Volunteers were relaxed in an awake state with eyes open (A) and eyes closed (B), respectively. Sets C, D, and E originated from our EEG archive of presurgical diagnosis. For the present study EEGs from five patients were selected, all of whom had achieved complete seizure control after resection of one of the hippocampal formations, which was therefore correctly diagnosed to be the epileptogenic zone (cf. Fig. 2). Segments in set D were recorded from within the epileptogenic zone, and those in set C from the hippocampal formation of the opposite hemisphere of the brain. While sets C and D contained only activity measured during seizure free intervals, set E only contained seizure activity.

http://narnia.cs.ttu.edu/drupal/files/source/2008/A.png.tar
http://narnia.cs.ttu.edu/drupal/files/source/2008/B.png.tar
http://narnia.cs.ttu.edu/drupal/files/source/2008/C.png.tar
http://narnia.cs.ttu.edu/drupal/files/source/2008/D.png.tar
http://narnia.cs.ttu.edu/drupal/files/source/2008/E.png.tar

The differences are really obvious. The following are plots of interictal EEG vs. ictal EEG.

Another comparison is healthy people's EEG vs. patients' interictal EEG. Although neither of them covers seizure activity, I can still see that they are different.

When I was trying to plot these data in Python, I also came across a tutorial which contains some interesting stuff.
http://matplotlib.sourceforge.net/screenshots.html

PyWavelets and pywfdb library for biomedical time series analysis

by Forrest Sheng Bao http://fsbao.net

I am so excited today: I have found a very cool Python library for wavelet decomposition, PyWavelets. It is developed by Filip Wasilewski.

To do 1D wavelet decomposition, just use the function:

wavedec(data, wavelet, mode='sym', level=None)

where "data" is an array storing the voltages at all sampling instants, "wavelet" is the type of wavelet (e.g. Haar), "mode" is the signal extension mode (symmetric by default), and "level" is the level of decomposition (None by default, in which case the decomposition goes as deep as the data and wavelet lengths allow).

The return value is a list of coefficient arrays, [cA_n, cD_n, ..., cD_1]: the approximation coefficients (cA) of the coarsest level followed by the detail coefficients (cD) of each level. These coefficients are just what I need to train the classifier.

Here is an example:

>>> import pywt
>>> coeffs = pywt.wavedec([1,2,3,4,5,6,7,8], 'db2', level=2)
>>> print(pywt.waverec(coeffs, 'db2'))
[ 1.  2.  3.  4.  5.  6.  7.  8.]
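To make the approximation and detail coefficients more concrete, here is a hand-rolled Haar decomposition in plain Python. It is not PyWavelets itself, only a sketch that mirrors the [cA_n, cD_n, ..., cD_1] layout; the function names are made up, the signal length is assumed to be divisible by 2^level, and the sign convention of the detail coefficients may differ between libraries.

```python
import math


def haar_step(signal):
    """One Haar level: pairwise scaled sums (cA) and differences (cD)."""
    s = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    cA = [(p + q) / s for p, q in pairs]
    cD = [(p - q) / s for p, q in pairs]
    return cA, cD


def haar_wavedec(signal, level):
    """Multi-level Haar decomposition, returned as [cA_n, cD_n, ..., cD_1]."""
    details = []
    approx = list(signal)
    for _ in range(level):
        approx, cD = haar_step(approx)
        details.insert(0, cD)   # coarser levels go in front
    return [approx] + details


coeffs = haar_wavedec([1, 2, 3, 4, 5, 6, 7, 8], level=2)

# Flatten all bands into one feature vector, e.g. as classifier input.
features = [c for band in coeffs for c in band]
```

Each level halves the length of the approximation, so a level-2 decomposition of 8 samples yields bands of length 2, 2 and 4, and the flattened feature vector has the same length as the input.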

You can also do 2D wavelet decomposition with the function wavedec2. Its inputs are the same as those of the 1D wavelet decomposition function wavedec. This function returns a coefficient list [cAn, (cHn, cVn, cDn), ..., (cH1, cV1, cD1)], where n denotes the level of decomposition and cA, cH, cV and cD are the approximation, horizontal detail, vertical detail and diagonal detail coefficient arrays. Here is an example:

>>> import pywt, numpy
>>> coeffs = pywt.wavedec2(numpy.ones((8,8)), 'db1', level=2)
>>> cA2, (cH2, cV2, cD2), (cH1, cV1, cD1) = coeffs

I also found another useful tool, pywfdb. It is used to read data from the famous medical database PhysioBank, an archive of physiologic signals for biomedical research. There are lots of open-access data for physiology as well as neurology research. You can think of it as a Python implementation of WFDB, an open source package for viewing, analyzing, and creating recordings of physiologic signals. Filip gives an example of using his functions to read and plot a series of ECG data.

Classification and parameters of epileptic EEG

by Forrest Sheng Bao http://fsbao.net

Data for healthy people

You can classify them by brain activity: rest, sleep or cognitive activity.

You can also classify them by the state of the eyes: eyes open or eyes closed.

Classify EEG data by the sampling place:

EEG data are recorded either extracranially or intracranially. The former is non-invasive and the latter is invasive. Intracranial EEG is only recorded during presurgical evaluation of focal epilepsies. The electrodes are implanted to precisely localize the seizure focus, which is termed the epileptogenic zone.

The EEG signal reflecting abnormal neural discharges may originate from the epileptogenic zone or from the opposite hemisphere of the brain.

We can also classify EEG data by the time interval in which they are sampled:

EEG data can be recorded during interictal (epileptiform) activity or ictal activity. The ictal data start with a period (for example 50 min) of pre-ictal data. The interictal period is much longer than the pre-ictal period and can be up to 24 hours. It can be sampled by the widely used video-EEG recording systems.

Here are some frequently mentioned parameters of EEG data:

channel number: 24/32/128
sampling rate: generally on the order of hundreds of Hz (such as 256 Hz)
ADC resolution: the number of bits of the analog-to-digital converter
filters: notch or bandpass filters, and their cutoff frequencies
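As a quick back-of-the-envelope sketch of what these parameters imply: the sampling rate bounds the highest usable frequency, the ADC resolution fixes the number of amplitude levels, and together with the channel count they set the raw data rate. The helper names are made up, and the 12-bit resolution is only an assumed example value.

```python
def nyquist_hz(sampling_rate_hz):
    """Highest frequency that can be represented without aliasing."""
    return sampling_rate_hz / 2.0


def adc_levels(bits):
    """Number of distinct values an ADC of the given resolution can output."""
    return 2 ** bits


def raw_bytes_per_second(channels, sampling_rate_hz, bits):
    """Uncompressed data rate, assuming each sample is stored in whole bytes."""
    bytes_per_sample = (bits + 7) // 8
    return channels * sampling_rate_hz * bytes_per_sample


# 32 channels and 256 Hz come from the list above; 12 bits is assumed.
print(nyquist_hz(256))                    # 128.0
print(adc_levels(12))                     # 4096
print(raw_bytes_per_second(32, 256, 12))  # 16384
```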

License:

Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License