Installing Galaxy on a ROCKS cluster
====================================

Here Galaxy is installed using Apache as a proxy, PostgreSQL as the db backend, and DRMAA libraries to submit jobs to the SGE queue. Security is handled by Galaxy: by default, users must be registered by the administrator to gain access.

Gleaned from:

- Get Galaxy
- Running Galaxy in a production environment
- Running Galaxy Tools on a Cluster

Python
------

As of Dec 2011, Galaxy only supports Python 2.6. Download, build and install a new Python in /share/apps.

Create a Python Galaxy environment using Virtualenv
---------------------------------------------------

Galaxy will be installed in /share/apps/galaxy, which is NFS mounted::

    $ mkdir -p /share/apps/galaxy
    $ cd !$

Install virtualenv.py::

    $ easy_install virtualenv
    $ locate virtualenv.py
    /share/apps/lib/python2.7/site-packages/virtualenv-1.7-py2.7.egg/virtualenv.py
    $ python2.6 /share/apps/lib/python2.7/site-packages/virtualenv-1.7-py2.7.egg/virtualenv.py --no-site-packages galaxy_env
    The --no-site-packages flag is deprecated; it is now the default behavior.
    New python executable in galaxy_env/bin/python2.6
    Also creating executable in galaxy_env/bin/python
    Installing setuptools............................done.
    Installing pip...............done.
    $ ls -l
    total 8
    drwxrwxr-x 17 cymon biouser 4096 Dec 22 11:00 galaxy-dist
    drwxr-xr-x 5 cymon biouser 4096 Dec 22 13:10 galaxy_env

Download Galaxy
---------------

Galaxy uses Mercurial. Download the Mercurial source and build it with the *galaxy_env* python (this allows you to run unittests later)::

    $ ../../galaxy_env/bin/python2.6 setup.py install
    $ hg clone https://bitbucket.org/galaxy/galaxy-dist/

Install PostgreSQL 9.1
----------------------

Although ROCKS has a default installation of the postgresql libraries (I think they are needed by ganglia), the postgresql-server package in the ROCKS repo won't install because of dependency errors.
Here I follow the instructions at the PostgreSQL Yum installation pages. These instructions work because the postgresql installation is **postgresql91** and doesn't interfere with the installed libraries, BUT this only works with postgresql >= 9.0; with older versions you are going to remove the current libraries and bork ganglia and maybe other things. Make sure yum doesn't remove the original libraries!

::

    $ cat /etc/redhat-release
    CentOS release 5.4 (Final)
    [etc]$ cat /etc/rocks-release
    Rocks release 5.3 (Rolled Tacos)
    [~]$ file /sbin/init
    /sbin/init: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.6.9...

Exclude postgresql from the CentOS base repo in /etc/yum.repos.d/* AND in the ROCKS stanza of /etc/yum.conf, and disable all other repos::

    $ cat yum.conf
    [main]
    cachedir=/var/cache/yum
    debuglevel=2
    logfile=/var/log/yum.log
    pkgpolicy=newest
    distroverpkg=redhat-release
    tolerant=1
    exactarch=1
    assumeyes=1

    [Rocks-5.3]
    name=Rocks 5.3
    baseurl=http://10.1.1.1/install/rocks-dist/x86_64
    exclude=postgresql*

The most recent PostgreSQL version with a package group available for CentOS 5 is 9.1. Install the repo package and list what is available::

    $ wget http://yum.postgresql.org/9.1/redhat/rhel-5-x86_64/pgdg-centos91-9.1-4.noarch.rpm
    $ sudo rpm -ivh pgdg-centos91-9.1-4.noarch.rpm
    $ yum list postgres*
    $ yum list postgres* | cut -d " " -f 1
    Excluding
    Finished
    Excluding
    Finished
    Excluding
    Finished
    Installed
    postgresql-libs.i386 <--------- INSTALLED
    postgresql-libs.x86_64 <--------- INSTALLED
    Available
    postgresql91.x86_64
    postgresql91-contrib.x86_64
    postgresql91-debuginfo.x86_64
    postgresql91-devel.x86_64
    postgresql91-docs.x86_64
    postgresql91-jdbc.x86_64
    postgresql91-jdbc-debuginfo.x86_64
    postgresql91-libs.x86_64
    postgresql91-odbc.x86_64
    postgresql91-odbc-debuginfo.x86_64
    postgresql91-plperl.x86_64
    postgresql91-plpython.x86_64
    postgresql91-pltcl.x86_64
    postgresql91-python.x86_64
    postgresql91-python-debuginfo.x86_64
    postgresql91-server.x86_64
    postgresql91-tcl.x86_64
    postgresql91-tcl-debuginfo.x86_64
    postgresql91-test.x86_64
    postgresql_autodoc.noarch
    $ sudo yum install postgresql91-server
    ...
    Running Transaction
    Installing : postgresql91-libs 1/3
    Installing : postgresql91 2/3
    Installing : postgresql91-server 3/3

    Installed:
    postgresql91-server.x86_64 0:9.1.2-1PGDG.rhel5

    Dependency Installed:
    postgresql91.x86_64 0:9.1.2-1PGDG.rhel5
    postgresql91-libs.x86_64 0:9.1.2-1PGDG.rhel5

    Complete!

Initialise the database (only once)::

    $ sudo service postgresql-9.1 initdb
    Initializing database: [ OK ]

Add to startup::

    $ sudo chkconfig postgresql-9.1 on

Start the server::

    $ sudo service postgresql-9.1 start
    Starting postgresql-9.1 service: [ OK ]

Note: PGDATA=/var/lib/pgsql/9.1

Init a user as superuser::

    $ sudo su postgres
    Password:
    bash-3.2$ createuser cymon
    could not change directory to "/home/cymon"
    Shall the new role be a superuser? (y/n) y
    bash-3.2$ exit
    exit
    $ createdb
    $ psql
    psql (9.1.2)
    Type "help" for help.

    cymon=#

Create galaxy user
------------------

We need a user *galaxy*::

    $ sudo useradd -d /share/apps/galaxy -G biouser -s /bin/bash galaxy

This won't copy in /etc/skel because the home dir already exists, so do it by hand::

    $ sudo su - galaxy
    -bash-3.2$ cp /etc/skel/.bash_profile .
    -bash-3.2$ cp /etc/skel/.bashrc .
    -bash-3.2$ . .bash_profile

Edit .bash_profile and set $TEMP [for whatever reason the tmp directory that is stated in the instructions doesn't exist, so I made it]::

    $ mkdir galaxy-dist/database/tmp
    $ tail -2 .bash_profile
    TEMP=/share/apps/galaxy/galaxy-dist/database/tmp
    export TEMP

Add galaxy to PostgreSQL::

    $ createuser galaxy
    Shall the new role be a superuser? (y/n) n
    Shall the new role be allowed to create databases? (y/n) y
    Shall the new role be allowed to create more new roles? (y/n) n
    $ psql
    psql (9.1.2)
    Type "help" for help.

    cymon=# alter user galaxy with encrypted password '';
    ALTER ROLE
    $ sudo su - galaxy
    Password:
    $ createdb
    $ psql
    psql (9.1.2)
    Type "help" for help.

    galaxy=>

Configure Galaxy
----------------

Copy the sample configuration file::

    [cymon@gyra galaxy-dist]$ cp universe_wsgi.ini.sample universe_wsgi.ini

Edit `universe_wsgi.ini`::

    debug = False
    use_interactive = False
    new_file_path = /share/apps/galaxy/galaxy-dist/database/tmp
    start_job_runners = drmaa
    default_cluster_job_runner = drmaa://

Make the PostgreSQL database connection::

    database_connection = postgres:///galaxy?user=galaxy&password=''
    cookie_path = /galaxy
    filter-with = proxy-prefix

Configure security::

    admin_users =
    require_login = True
    allow_user_creation = False
    allow_user_deletion = True
    allow_user_impersonation = True
    new_user_dataset_access_role_default_private = True

Other stuff::

    set_metadata_externally = True
    database_engine_option_server_side_cursors = True
    database_engine_option_strategy = threadlocal

Dealing with the queue submission and Python
--------------------------------------------

OK, here's the problem: Galaxy submits jobs to the queue in a clean default shell environment (bash on the cluster), so the jobs use the default python (/usr/bin/python). Consequently, they cannot find the .eggs which are installed in the Galaxy virtualenv we've set up. When you try to download data, say from "UCSC Main table browser", the data downloads but then you get the following error::

    WARNING:galaxy.eggs:Warning: MarkupSafe (a dependent egg of Mako) cannot be fetched

i.e. it cannot find the .egg in the python installation and it failed to download it...

Now, it should be possible to tell DRMAA to source the shell in the SGE submission script. I tried this BUT it didn't work::

    default_cluster_job_runner = drmaa://-shell yes

It runs, but not in the galaxy user's environment (bug?). In one of the Galaxy-dev threads, someone suggests going to each node and installing the eggs in the default python. This should work, but isn't a very nice solution.

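To see which interpreter a queued job actually picks up, a small script along these lines could be submitted as a test job. This is only a sketch, not part of Galaxy: the function name ``interpreter_report`` is made up for illustration, the default prefix is the virtualenv path used in this guide, and the code is written against a current Python 3.

```python
import os
import sys


def interpreter_report(expected_prefix="/share/apps/galaxy/galaxy_env"):
    """Describe the interpreter running this script.

    expected_prefix is the virtualenv directory used in this guide;
    change it if the environment lives elsewhere.
    """
    prefix = os.path.normpath(sys.prefix)
    return {
        "executable": sys.executable,
        "prefix": prefix,
        # True only when the job runs inside the expected virtualenv.
        "uses_expected_env": prefix == os.path.normpath(expected_prefix),
    }


if __name__ == "__main__":
    for key, value in sorted(interpreter_report().items()):
        print("%s: %s" % (key, value))
```

If ``uses_expected_env`` comes back False in the job's output, the job is running under some other python (most likely the system one) rather than galaxy_env.
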
In the end I just decided to change the default submission script at the code level::

    $ vi /state/partition1/apps/galaxy/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py

and add::

    PATH="/share/apps/galaxy/galaxy_env/bin":$PATH

to the template::

    drm_template = """#!/bin/sh
    source ~/.bash_profile
    PATH="/share/apps/galaxy/galaxy_env/bin":$PATH
    GALAXY_LIB="%s"
    if [ "$GALAXY_LIB" != "None" ]; then
        if [ -n "$PYTHONPATH" ]; then
            PYTHONPATH="$GALAXY_LIB:$PYTHONPATH"
        else
            PYTHONPATH="$GALAXY_LIB"
        fi
        export PYTHONPATH
    fi
    cd %s
    %s
    """

Configure Apache as Proxy
-------------------------

Add the following to the bottom of /etc/httpd/conf/httpd.conf::

    RewriteEngine on
    RewriteRule ^/galaxy$ /galaxy/ [R]
    RewriteRule ^/galaxy/static/style/(.*) /share/apps/galaxy/galaxy-dist/static/june_2007_style/blue/$1 [L]
    RewriteRule ^/galaxy/static/scripts/(.*) /share/apps/galaxy/galaxy-dist/static/scripts/packed/$1 [L]
    RewriteRule ^/galaxy/static/(.*) /share/apps/galaxy/galaxy-dist/static/$1 [L]
    RewriteRule ^/galaxy/favicon.ico /share/apps/galaxy/galaxy-dist/static/favicon.ico [L]
    RewriteRule ^/galaxy/robots.txt /share/apps/galaxy/galaxy-dist/static/robots.txt [L]
    RewriteRule ^/galaxy(.*) http://localhost:8080$1 [P]

    # Compress all uncompressed content.
    SetOutputFilter DEFLATE
    SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
    SetEnvIfNoCase Request_URI \.(?:t?gz|zip|bz2)$ no-gzip dont-vary

    # Allow browsers to cache everything from /static for 6 hours
    ExpiresActive On
    ExpiresDefault "access plus 6 hours"

Restart Apache.

Start Galaxy
------------

::

    $ source galaxy_env/bin/activate
    (galaxy_env)$
    (galaxy_env)$ sh galaxy-dist/run.sh

Point your browser at http://gyra.ualg.pt/galaxy
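
If the page doesn't come up, it helps to know whether Galaxy itself or the Apache proxy is at fault. One way is to request both the direct address (the ``http://localhost:8080`` target of the proxy rule) and the proxied one. A minimal sketch, assuming a Python 3 is available on the frontend; ``check_url`` is a hypothetical helper, not something shipped with Galaxy:

```python
import sys
import urllib.error
import urllib.request


def check_url(url, timeout=5.0):
    """Return the HTTP status code for url, or None if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a success code.
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout and similar.
        return None


if __name__ == "__main__":
    # URLs to test are passed on the command line.
    for url in sys.argv[1:]:
        print("%s -> %s" % (url, check_url(url)))
```

Saved under any file name and run with ``http://localhost:8080/`` and ``http://gyra.ualg.pt/galaxy/`` as arguments, it prints one status per URL. If the first answers and the second doesn't, the problem is probably on the Apache side.
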