2010-05-24

Python and module paths after compilation and installation from source code

by Forrest Sheng Bao http://fsbao.net

Like many supercomputers in the world, Grendel, the world's 175th-fastest computer, forces me to install everything by compiling from source code and does not allow me to mess up the beautiful /usr/local directory. I can only play around in my home directory and cannot enjoy apt-get to deploy all the Python modules I need.

So, I gotta install Python 2.6, numpy, scipy, mlpy and pyml all from source code, because the default version of Python on CentOS/Red Hat Enterprise Linux 5.4 is 2.4. During this process, I figured out how Python and its modules are placed, or more precisely, the directory structure. The feeling is like when I first figured out what the things under /usr/local are for.

1. Python (at least for Python 2.4 and Python 2.6)
The source code package contains a configure script, by convention. When you run the configure script, you do something like this:
./configure --prefix=/home/bao/Python-2.6
And then you run
make
and
make install 

After make install, where is my Python 2.6? Denote the directory specified by --prefix as $PYTHON. The default value of $PYTHON is /usr/local/. After make install, $PYTHON contains four subdirectories, bin, include, lib and share, like the structure under /usr/local/. The Python 2.6 interpreter itself is under $PYTHON/bin/, called python2.6 and/or python. include, lib and share serve their conventional purposes on UNIX systems: header files, compiled shared/static libraries, and documentation, respectively. lib also holds the installed Python modules; I will detail this point in section 2.
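
To check this layout from inside the interpreter, a minimal sketch like the following may help. It assumes a UNIX-style install; the helper name layout is made up for illustration, and sys.prefix is the standard attribute that holds what this post calls $PYTHON.

```python
import os
import sys

def layout(prefix):
    """Return the four conventional subdirectories expected under a prefix."""
    return dict((name, os.path.join(prefix, name))
                for name in ("bin", "include", "lib", "share"))

# sys.prefix is the prefix the running interpreter was installed under.
for name, path in sorted(layout(sys.prefix).items()):
    print("%-8s %s (exists: %s)" % (name, path, os.path.isdir(path)))
```

Run it with the interpreter you just built (python2.6 in my case) to see whether each directory is really there.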

All you need to do is add $PYTHON/bin to the $PATH environment variable so that you can start python2.6 or python directly from your shell. I would delete $PYTHON/bin/python and type python2.6 to make sure I am calling the Python 2.6 interpreter, because some programs of mine run on Python 2.4. You can use the which command to determine exactly which Python interpreter is used, like this:
$ which python
/usr/bin/python
$ which python2.6
~/apps/Python-2.6/bin/python2.6
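
The same lookup can be sketched in Python itself. This is only an illustration of how a $PATH search works, not the real which command; the function below is a hypothetical helper that walks a PATH-style string and returns the first executable match.

```python
import os
import sys

def which(program, path=None):
    """Return the first executable named `program` on a PATH-style string."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, program)
        # A match must be a regular file that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None

print("running interpreter: %s" % sys.executable)
print("python on $PATH:     %s" % which("python"))
```

If the two lines disagree, the shell's python is not the interpreter you are running the script with.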

2. Locations of Python modules if installed to $PYTHON
Most Python module source code packages contain a setup.py script by convention. You install (including compiling) a module by executing
python2.6 setup.py install --prefix=/home/bao/Python-2.6/
Notice that here I used python2.6 rather than the default python (which is the Python 2.4 interpreter on my system).

If you do not specify --prefix, setup.py installs everything under the prefix of the interpreter that runs it, which is /usr/local/ for a Python built with the default --prefix.

So where is the module installed? Well, it goes to my $PYTHON as specified by --prefix. More precisely, it goes to $PYTHON/lib. Under $PYTHON/lib there should be one or more directories named pythonX.Y, depending on the Python version (X.Y) you use. On mine, it is $PYTHON/lib/python2.6. Under the pythonX.Y directory, there is a folder called site-packages, which contains the installed modules. On my system:
$ ls /home/bao/apps/Python-2.6/lib/python2.6/site-packages/
numpy  numpy-1.3.0-py2.6.egg-info  README

By default, your Python interpreter will look for modules under this site-packages directory.
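
The path rule above can be written down as a small sketch. The function name site_packages_dir is made up for illustration, and the code assumes the UNIX-style lib/pythonX.Y/site-packages layout described in this post (some distributions patch this layout, so treat the last line as a check, not a guarantee).

```python
import os
import sys

def site_packages_dir(prefix, version=None):
    """Build <prefix>/lib/pythonX.Y/site-packages for a UNIX-style layout."""
    if version is None:
        # Default to the (major, minor) version of the running interpreter.
        version = sys.version_info[:2]
    return os.path.join(prefix, "lib",
                        "python%d.%d" % tuple(version), "site-packages")

# The directory from the ls example above.
print(site_packages_dir("/home/bao/apps/Python-2.6", (2, 6)))

# Is the running interpreter's own site-packages on its search path?
print(site_packages_dir(sys.prefix) in sys.path)
```

The same function works for any other directory you pass to --prefix, since setup.py builds the same hierarchy under it.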

3. What if I set a directory other than $PYTHON after --prefix?

setup.py will create the same lib/pythonX.Y/site-packages hierarchy under the directory you specified.

4. The $PYTHONPATH environment variable and sys.path (you need to read this if you did section 3).

Let's do a small experiment first. Start your Python interpreter, import the sys module and evaluate sys.path:
$ python2.6
Python 2.6 (r26:66714, May 24 2010, 10:45:11) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-44)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/home/bao/apps/Python-2.6', '/home/bao/apps/Python-2.6/lib/python26.zip', '/home/bao/apps/Python-2.6/lib/python2.6', '/home/bao/apps/Python-2.6/lib/python2.6/plat-linux2', '/home/bao/apps/Python-2.6/lib/python2.6/lib-tk', '/home/bao/apps/Python-2.6/lib/python2.6/lib-old', '/home/bao/apps/Python-2.6/lib/python2.6/lib-dynload', '/home/bao/apps/Python-2.6/lib/python2.6/site-packages']

The value of sys.path is the list of default paths where the Python interpreter searches for modules. For import XYZ, it looks in each of those directories, in order, for a folder called XYZ (a package) or a file such as XYZ.py (a single-file module).
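
To make the search order concrete, here is a simplified sketch of that lookup. It is not the interpreter's real import machinery (which also handles compiled extension modules, zip files and, on Python 2, requires an __init__.py inside a package folder); the helper name locate is hypothetical.

```python
import os
import sys

def locate(name, search_path=None):
    """Return the first folder or .py file that could satisfy `import name`."""
    if search_path is None:
        search_path = sys.path
    for directory in search_path:
        # An empty entry in sys.path stands for the current directory.
        base = os.path.join(directory or os.curdir, name)
        if os.path.isdir(base):
            return base          # a package folder such as .../XYZ
        if os.path.isfile(base + ".py"):
            return base + ".py"  # a single-file module such as .../XYZ.py
    return None

print(locate("os"))
print(locate("numpy"))
```

A result of None means nothing matching was found in any directory of the list, which is the situation the rest of this section deals with.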

If your modules are located under any of these directories (e.g., the paths specified in --prefix were the same when you installed the Python interpreter and the modules), you don't have to do anything. As you can see, the default search path makes sure installed modules are found.

Otherwise, you need to set the $PYTHONPATH environment variable on your UNIX system to tell the Python interpreter where to search for XYZ. Suppose you installed XYZ in the /MyMoD/XYZ folder. Then your $PYTHONPATH should contain the string /MyMoD. Another case is when your Python module is just one Python file, like XYZ.py, located at /MyMoD/XYZ.py; then your $PYTHONPATH should also contain the string /MyMoD. For a module installed as in section 3, the directory to add is the lib/pythonX.Y/site-packages directory under the prefix you chose.
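
The effect of $PYTHONPATH can be observed with a small sketch that starts a child interpreter with the variable set and reads back its sys.path. The helper name child_sys_path is made up for illustration, and /MyMoD is just the example directory from the paragraph above; it does not have to exist for the entry to show up.

```python
import os
import subprocess
import sys

def child_sys_path(extra_dir):
    """Start a child interpreter with PYTHONPATH=extra_dir; return its sys.path."""
    env = dict(os.environ)
    env["PYTHONPATH"] = extra_dir
    child = subprocess.Popen(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env, stdout=subprocess.PIPE)
    output = child.communicate()[0]
    return output.decode().splitlines()

paths = child_sys_path("/MyMoD")
print("/MyMoD" in paths)
for entry in paths:
    print(repr(entry))
```

Entries from $PYTHONPATH are placed ahead of the default site-packages directory, so a module found there wins over one installed under $PYTHON.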

For more details about $PYTHONPATH, please look into http://docs.python.org/using/cmdline.html#envvar-PYTHONPATH
