Think, Forrest! Think!: 12/1/10

2010-12-27

Should human knowledge be represented in logic?

by Forrest Sheng Bao http://fsbao.net

Yes, knowledge representation is a big problem in AI. "In order for a program to be capable of learning something it must first be capable of being told it."

But first, should knowledge be represented in languages?

Second, does the language have to be based on logic?

An example. Do a bunch of data and their labels represent some kind of knowledge? Let's say the data are chest X-ray images of cancer patients and healthy people, and a label is 1 if the doctors diagnose the subject as a patient and 0 otherwise. Is this a piece of human knowledge? Sure. Do we create a special language to represent it? Well, we can simply use matrices - this is the convention in the pattern recognition community.
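To make the matrix convention concrete, here is a minimal Python sketch. The pixel values, the tiny image size, and the names images and labels are all made up for illustration; real X-ray data would be far larger.

```python
# Hypothetical toy data: each row is one subject's image flattened into
# a vector of pixel intensities (a 2x2 "image" here, purely illustrative).
images = [
    [0.1, 0.8, 0.7, 0.2],  # subject 0
    [0.0, 0.1, 0.1, 0.0],  # subject 1
    [0.9, 0.9, 0.6, 0.3],  # subject 2
]

# One label per row: 1 if the doctors diagnosed the subject as a patient,
# 0 otherwise.
labels = [1, 0, 1]

# The "knowledge" is just the pairing of rows with labels; no logic
# formulas are involved.
dataset = list(zip(images, labels))

num_subjects = len(images)
num_features = len(images[0])
print(num_subjects, num_features)
```

Nothing here looks like a logical sentence, yet a classifier trained on such a matrix can still capture what the doctors know.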

Do we use logic? Probably not at all. Not all human knowledge can be easily represented in logic.

Third, what if only a small portion of human knowledge can be represented in logic?

Well, of course, it is also possible that a better way to represent human knowledge hasn't been invented yet.

So, I need to jump out of the box sometimes.

2010-12-19

Mathematics: a science for use

Die ganzen Zahlen hat der liebe Gott gemacht, alles andere ist Menschenwerk (God created the integers, all the rest is the work of man.) -- Leopold Kronecker, a mathematician

"There is only one nature - the division into science and engineering is a human imposition, not a natural one. Indeed, the division is a human failure; it reflects our limited capacity to comprehend the whole." -- Bill Wulf, a computer scientist (some claim this is a quote from Sir William Cecil Dampier)

I had a talk with some guys this weekend. They believed that science and engineering are different. Their argument was that when we do science, we do not think about its use whereas for engineering, we have a ``clear'' useful purpose to serve.

Well, then we should create a new word: mathematical engineer.

A large portion of math grew out of our desire to solve real problems.

Check the concepts below to see whether they were invented for use:

decimal numbers?
real numbers?
negative numbers?
imaginary numbers?
geometry? The word geometry means ``earth measurement.''
the matrix?
calculus? Calculus is a Latin word meaning pebble or stone used for counting.
differential equations?

We ``engineered'' those concepts out. Engineering (and the money it can bring) is a big driving force behind the development of science, especially computer science.

Will thinking humanly be easier?

In AI, planning is a pain in the ass, because automated planners are not practical so far. We want the planner to find a rational solution to our problem, but finding the rational plan takes a lot of time. For example, an automated planner may take much longer than a human navigator to find a plan for landing several incoming jets at even a small airport. Aviation fuel is very expensive, and if a jet runs out of fuel, it will crash.

What if we allow the planner to find an incorrect plan sometimes? I mean, humanly. Human beings make mistakes. Rationality is hard for us. Otherwise, we wouldn't have created the word ``stupid,'' at least in Chinese, English and German. But we have been living with it for at least thousands of years according to documented history.

If we could find a proper balance between computing time and correctness of the solution, then thinking humanly could be a better choice for automated planning.

One step further: do we really think (e.g., logical reasoning, satisfiability checking) when making a plan? If not, then we need to teach computers to find a plan our way rather than teach them a new fancy way called ``think/act rationally.''

I think, therefore, I am? No, I am a human, therefore, I am.

When declarative programming does not make sense

Pretty much nothing we have created can make sense everywhere. It's very hard to make a universally useful tool.

The term declarative programming has been around for many years. Its idea is to tell a computer only what you want, not how to do it. The computer is supposed to just give you what you want, like a black box. Some declarative languages have become successful, like HTML, SQL, LaTeX and Prolog.

This idea sounds really cool, eh? Hold on one second. If we just need to declare what to do, then why do computer scientists spend so much time on developing algorithms? Why do we spend time on studying Quicksort or FFT?

Oh, yes, HTML, LaTeX or SQL are not designed for these jobs. All right. Let's narrow down.

One subset of declarative programming is called logic programming. Prolog is an example of a logic programming language. I am quite aware that logic programming is not popular at all when we have imperative ways to solve the same problem.

Why haven't a lot of programmers embraced the declarative programming idea?

Reason 1: It's slow.

Logic programs are solved using search algorithms (I guess this is the only way you can do it), and the search in most cases takes time exponential in the problem size. Just think about exploring every branch of a binary tree of depth n: there can be up to 2^n leaves. Then why would you do that if a problem has a faster algorithm?

For example, sorting. Of course you can use search to find the result. Just enumerate all permutations of the given numbers and check them one by one until you find the sorted one. The time complexity is O(n!) or worse. Isn't that much slower than O(n log n)?
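A minimal Python sketch of sorting by search, to show how little you have to say and how much the computer has to do. The function names are mine, and this is only an illustration of the brute-force idea, not how any real logic programming system sorts.

```python
from itertools import permutations


def is_sorted(seq):
    """Declare WHAT we want: every element is <= its successor."""
    return all(a <= b for a, b in zip(seq, seq[1:]))


def search_sort(numbers):
    """Find a sorted ordering by trying permutations one by one.

    In the worst case this inspects all n! permutations.
    """
    for candidate in permutations(numbers):
        if is_sorted(candidate):
            return list(candidate)


print(search_sort([3, 1, 2]))
```

With 3 numbers there are only 6 permutations to check; with 15 numbers there are already more than a trillion, while an O(n log n) sort finishes instantly.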

In such a case, using declarative programming does not make sense.

Reason 2: It's hard to use.

A world without loops? Sure, because you don't have to tell the computer how to do it, and a loop is one kind of how-to-do. Instead, you are forced to use recursion because that is declarative. How do you do matrix multiplication recursively? I would prefer the for-loop version.
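For comparison, here are both styles in a short Python sketch. Matrices are plain lists of rows, and all function names are made up for this illustration; the recursive version is just one of several ways to write it without an explicit loop.

```python
def matmul_loops(a, b):
    """The step-by-step version: three nested for-loops."""
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                c[i][j] += a[i][k] * b[k][j]
    return c


def dot_recursive(row, col):
    """Dot product: first pair multiplied, plus the dot product of the rest."""
    if not row:
        return 0
    return row[0] * col[0] + dot_recursive(row[1:], col[1:])


def row_times_cols(row, cols):
    """One output row: dot the row with each column, recursively."""
    if not cols:
        return []
    return [dot_recursive(row, cols[0])] + row_times_cols(row, cols[1:])


def rows_times_cols(rows, cols):
    """All output rows: handle the first row, then recurse on the rest."""
    if not rows:
        return []
    return [row_times_cols(rows[0], cols)] + rows_times_cols(rows[1:], cols)


def matmul_recursive(a, b):
    """Loop-free version; zip(*b) turns the rows of b into its columns."""
    return rows_times_cols(a, list(zip(*b)))


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul_loops(a, b))
print(matmul_recursive(a, b))
```

Both give the same product, but the recursive one needs three helper functions to say what three nested loops say directly.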

I don't mean a life without loops is sad. But human beings do not think in declarative ways when solving many problems. We prefer step 1, step 2, step 3, .... If programmers don't like thinking declaratively, there is no need to develop a new programming language or paradigm to make them change the way they think. Microsoft Windows sucks, but a lot of people are used to it, and therefore Microsoft is still keeping Mac OS and Linux from becoming the mainstream desktop OS. They will give up Microsoft Windows only if [(1 and 3) or (2 and 3)]:

(1) Windows sucks too much and they can't tolerate it any more
(2) Mac OS or Linux is good and they can't resist
(3) the cost of transition is affordable or worth it
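The condition [(1 and 3) or (2 and 3)] can be checked mechanically. The Python sketch below, with function names invented for illustration, confirms over all eight truth assignments that it is the same as [(1 or 2) and 3]: the transition cost has to be acceptable no matter which of the other two reasons applies.

```python
from itertools import product


def will_switch(c1, c2, c3):
    """The condition exactly as written: (1 and 3) or (2 and 3)."""
    return (c1 and c3) or (c2 and c3)


def will_switch_factored(c1, c2, c3):
    """The factored form: (1 or 2) and 3."""
    return (c1 or c2) and c3


# Compare both forms on every combination of True/False.
all_equal = all(
    will_switch(*values) == will_switch_factored(*values)
    for values in product([False, True], repeat=3)
)
print(all_equal)
```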

Then the question is: in what cases should we use declarative programming?
I don't know. I am still thinking. Come back some time later.

2010-12-17

Artificial intelligence is not all about thinking.

This article is rated as PG-17 by me because reading it may cause the collapse of your world view and/or religious faith, and consequently, a threat to your life.

I always thought that the introduction and the ending chapters of a textbook were not worth reading. Now I realize that I made a big mistake with Russell and Norvig's famous AI textbook. I have been thinking since yesterday about why symbolic AI is not developing fast. And now I realize that if I had read those parts of the book, I wouldn't have had to wade through Wikipedia pages about AI. I hadn't read the 1st chapter or the 5th part of the book before I wrote this post.

OK, let's begin.

Intelligence includes learning, right? But many people define AI as "a machine that can think." Like this one [1]. Thinking is not everything in learning.

About 2,500 years ago, a Chinese guy called Confucius said: "To study and not think is a waste. To think and not study is dangerous. (學而不思則罔，思而不學則殆)"

I would change the definition of AI to "a machine that can perceive and think." Here, ``perceive'' includes learning.

But this is not the end. In the famous Turing test (and in related thought experiments like the Chinese Room), a machine is supposed to communicate with a human tester. Hence, the machine needs to take some actions according to the knowledge in its ``brain.'' Would you think Aristotle was intelligent if he had never spoken a word?

Therefore, I would like to revise my definition of AI to "a machine that can perceive, think and act."

Then I saw Herbert Simon's words: "machines that think, that learn and that create."[2]

And then I saw the title of the 5th part of Russell and Norvig's AI textbook: "Communicating, Perceiving and Acting." But I still would like to say that Russell and Norvig shouldn't have put these ideas at the end of their book (just before Conclusions). As I stated before, perceiving and acting are necessary parts of intelligence.

I think that people working on symbolic intelligence (mostly logic and reasoning, in contrast to computational intelligence, e.g., machine learning, evolutionary computing, part of robotics) focus too much on formulating human logic into symbols and semantics. Knowledge acquisition and expression have been underestimated.

References:
[1] The website of Artificial General Intelligence Conference: "The original goal of the AI field was the construction of `thinking machines' " http://www.agi-conf.org/

[2] Wikiquote page of Herbert A. Simon, http://en.wikiquote.org/wiki/Herbert_Simon

2010-12-12

Reporting errors is a must-have feature of good software

I happen to be using two MATLAB toolboxes in one project now. Of course, you can make mistakes when using any software, e.g., passing improper parameters.

The very annoying thing is that one of the toolboxes (let me call it "the unprofessional one") does not tell me what error I made. It's much like the blue screen of Windows - "all we can tell you is to restart your Windows box." Hold on, what is the problem? What can I do to avoid this?

To solve the problem, I have to debug on my own, which is very time-consuming. Simply telling people that an error exists is not error reporting.

So what does good error reporting look like? Let's take a look at the example below.

>> save_nii_series('dspm8_SUGER-HEALTH01_NBH-EPI_20100917_02',1000,4189)

    * - SPM8: rest_spm_vol  --------------------------------------------

        Unable to find file:
                ..R-HEALTH01_NBH-EPI_20100917_02_1000.nii
         
        Please check that it exists.

        -----------------------------------------  20:48:19 - 12/12/2010

??? Error using ==> rest_ReadNiftiImage at 104
Meet error while reading data. Please restart MATLAB, this problem may be
solved.

Error in ==> save_nii_series at 18
        data = rest_ReadNiftiImage(nii_name);

The last lines of the output, starting at ???, are produced by the unprofessional one, which only tells me that there is an error, not what kind of error. It also gives me an incorrect instruction: restart MATLAB. Not to mention the English grammar errors in its error message.

In contrast, the first few lines of the output are produced by SPM, a famous neuroimaging toolbox for MATLAB. It tells me the exact error: an incorrect file path.
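The same principle in a few lines of Python, as a sketch only. The function read_image is hypothetical and has nothing to do with either toolbox; the point is that the error names the offending file and says what to check, instead of suggesting a restart.

```python
import os


def read_image(path):
    """Hypothetical loader that reports WHAT went wrong and WHERE."""
    if not os.path.isfile(path):
        # Name the exact file and tell the user what to check.
        raise FileNotFoundError(
            f"Unable to find file: {path!r}. Please check that it exists."
        )
    with open(path, "rb") as f:
        return f.read()


try:
    read_image("no_such_scan_1000.nii")
except FileNotFoundError as err:
    print(err)
```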

2010-12-09

A Git guide for myself

If you are following this doc, you need to replace all capitalized placeholders with your "real" values.

First, initialize Git and clone the existing code from the Git server to your local machine. When prompted, enter your password.

$ git init
Initialized empty Git repository in /home/forrest/forrest/work/BME/CTF_SAM_OUT/Code/.git/
$ git clone ssh://USERNAME@PROJECTNAME.git.sourceforge.net/gitroot/PROJECTNAME/REPONAME
Initialized empty Git repository in /home/forrest/forrest/work/BME/CTF_SAM_OUT/Code/REPONAME/.git/
$ cd REPONAME/
$ git config user.name "YOUR NAME"
$ git config user.email "USERNAME@users.sf.net"

Copy the code you plan to push into the current directory, stage it, and commit it with a message (like the Summary in MediaWiki).

$ cp ../*.m .
$ git add *
$ git commit

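Finally, push them to the Git server. (If you cloned the repository as above, the origin remote already exists, so the git remote add step can be skipped.)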

$ git remote add origin ssh://USERNAME@PROJECTNAME.git.sourceforge.net/gitroot/PROJECTNAME/REPONAME
$ git config branch.master.remote origin
$ git config branch.master.merge refs/heads/master
$ git push origin master

Done.

Reference: http://sourceforge.net/apps/trac/sourceforge/wiki/Git

2010-12-08

Ubuntu Strong: Dual folder window and cloud synchronization everywhere

I have used many file managers: Finder on Mac OS X, Explorer on Windows, Nautilus on the GNOME desktop environment of Linux. But the modified Nautilus in the latest Ubuntu (Linux) 10.10 is the best I have ever used.

It has two great features.

First, you can show the contents of two different folders side by side. This is very useful. How many of you have to copy files between two folders? Are you tired of switching between windows?

Second, I can synchronize any of my folders, no matter where it is, to the Ubuntu One cloud. This is also very useful. Sometimes we only need to back up files that have been updated frequently of late, and you don't wanna copy all files into a designated synchronization path, such as the Dropbox folder. If you decide not to sync a folder anymore, simply uncheck the "Synchronize this folder" option.

2010-12-06

FreeSurfer surface-based work flow (Forrest version)

The FreeSurfer wiki is not well organized. Its links are like Go-To statements. So instead of crawling over their links again next time, I am writing down my own notes.

First, I have NIfTI-format data in my current folder, and it needs to be converted into MGZ format.

$ ls
s108921-0002-00000-000001-01.hdr
s108921-0002-00000-000001-01.img
$ mri_convert s108921-0002-00000-000001-01.img 290.mgz

After that, create a folder 290 under $SUBJECTS_DIR.

$ mkdir ~/bin/freesurfer/subjects/290
$ mkdir -p ~/bin/freesurfer/subjects/290/mri/orig
$ cp 290.mgz ~/bin/freesurfer/subjects/290/mri/orig/

Then do the reconstruction

recon-all -all -s 290

Finally, check the surface

~$ tkmedit 290 brainmask.mgz -aux T1.mgz -surfs

(if you want to see the segmentation result, add this option, -segmentation aseg.mgz $FREESURFER_HOME/FreeSurferColorLUT.txt )

and check the geometric features (left hemisphere)

$ tksurfer 290 lh inflated

I played with structural (anatomical) MRI data from some epileptic patients and came up with a new topic I can work on later.

References:
http://surfer.nmr.mgh.harvard.edu/fswiki/FsTutorial/OutputData

2010-12-03

Running MATLAB without a graphical interface

Case 1: Interactive over SSH

Start MATLAB using the command:

matlab -nodisplay -nojvm

Case 2: Just run a MATLAB script over SSH

In this case, you don't even need a MATLAB Prompt to interact with.

matlab -nodisplay -nojvm -r a_code > matlab.out

Your MATLAB program should have the file name a_code.m (you can change it to whatever name you like, but when you tell MATLAB to run it, omit the .m suffix). The > matlab.out part redirects the output to a file rather than showing it on your Linux/Mac OS X terminal. This way is preferred because you have a record of what happened. Note that unless the script ends with exit (or quit), MATLAB stays open after the script finishes.

Case 3: Run MATLAB script with input variables over SSH

First, go to the directory containing the MATLAB script that defines the function (or, fancier, add that directory to the MATLAB path by editing your MATLABPATH environment variable or pathdef.m file). Then run it like this:

matlab -nodisplay -nojvm -r "my_function(10)"

Again, you can redirect the output to a file.

You can also write a Shell, Perl or Python script to do so. MATLAB gives an example using Perl at http://www.mathworks.com/help/techdoc/matlab_env/f8-4994.html
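As a rough idea of what such a Python wrapper could look like, here is a minimal sketch. It only uses the matlab flags shown above; the helper names build_matlab_command and run_matlab and the default log file name are my own choices, and it assumes matlab is on your PATH.

```python
import subprocess


def build_matlab_command(statement):
    """Build the same command line as in Case 2 and Case 3."""
    return ["matlab", "-nodisplay", "-nojvm", "-r", statement]


def run_matlab(statement, log_path="matlab.out"):
    """Run MATLAB and redirect its standard output to a log file.

    Assumes the matlab executable is on PATH. Returns MATLAB's exit code.
    """
    with open(log_path, "w") as log:
        result = subprocess.run(build_matlab_command(statement), stdout=log)
    return result.returncode


if __name__ == "__main__":
    # Only print the command here; call run_matlab(...) to actually launch it.
    print(build_matlab_command("my_function(10)"))
```

Passing the command as a list avoids shell quoting problems with the parentheses in "my_function(10)".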

Case 4: On Sun Grid Engine (SGE)-powered cluster

Write a job script like the one below and submit it using qsub.

#!/bin/sh
#$ -V
#$ -cwd
#$ -S /bin/bash
#$ -N matlab
#$ -o $JOB_NAME.o$JOB_ID
#$ -e $JOB_NAME.e$JOB_ID
#$ -q normal 
#$ -pe fill 12
#$ -P hrothgar
matlab -nodisplay -nojvm -r a_code > matlab.out

Your MATLAB program should have the file name a_code.m.

For more options on starting MATLAB, please refer to its official doc at
http://www.mathworks.com/help/techdoc/ref/matlabunix.html

2010-12-02

The sleeping experiment

Many years ago, when I first started working on brain imaging, I heard this hypothesis: if a person distributes sleeping hours evenly over several slots (e.g., sleeping 1.5 hours after every 4.5 hours of working), then they get enough sleep while sleeping less in total.

This is really a good hypothesis and I want to do an experiment to see whether it is true.

So here is my plan. I will sleep from 7 PM to 10 PM and from 6 AM to 9 AM every day. The rest of the time, I will work. In total, I only need 6 hours of sleep every day, and I can increase my working hours to 18 per day. That way, I can have 126 hours of work per week (18 x 7) - assuming that I am not gonna meet a pretty girl before I get my PhD.

The benefit is obvious: I can be awake during normal business hours and still have the whole night to work quietly. People in other labs make loud noise when their carts pass by my office door.