Install Globus (especially the Globus gatekeeper) on the head node of the cluster. We use Globus GRAM to communicate programmatically with Condor/SGE. To enable this, make sure that you have also installed the Condor/SGE job-manager. Follow the documentation for this on the Globus web site. You may install any version of Globus as long as you can submit jobs to Condor/SGE via GRAM (and the Java CoG Kit). We use Globus version 3.2.0 on our cluster.
Ensure that you can submit jobs to Condor/SGE via Globus, especially using the certificate/key-pair of the Tomcat server (described in Setting up GSI-based Security for Opal). You can do this by following these steps:
Copy app_service.cert.pem (the certificate file) to app_service.all.pem.
Edit app_service.all.pem and delete everything outside the region between the lines -----BEGIN CERTIFICATE----- and -----END CERTIFICATE-----. Keep those two lines.
Append the contents of app_service.privkey (the unencrypted private key) to app_service.all.pem.
Set the X509_USER_PROXY environment variable to the location of app_service.all.pem.
Submit a test job using the above proxy to the Condor job-manager as follows: globus-job-run "hostname:2119/jobmanager-condor" "/bin/ls". If you are using SGE, use "hostname:2119/jobmanager-sge".
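The first four steps above (copying the certificate, trimming it, appending the key, and setting X509_USER_PROXY) can be sketched as a small shell helper. This is only a sketch, not part of Opal or Globus: the function name make_proxy_file is hypothetical, and the chmod 600 step is an added precaution, on the assumption that a file holding an unencrypted private key should be readable only by its owner.

```shell
# make_proxy_file CERT KEY OUT   (hypothetical helper, not an Opal/Globus tool)
# Writes the certificate block from CERT, followed by the unencrypted
# private key from KEY, into OUT.
make_proxy_file() {
    cert=$1
    key=$2
    out=$3

    # Keep only the BEGIN/END CERTIFICATE block, marker lines included.
    sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p' "$cert" > "$out" || return 1

    # Assumption: the combined file should be owner-only before the key is added.
    chmod 600 "$out" || return 1

    # Append the unencrypted private key.
    cat "$key" >> "$out"
}
```

With the file names used above, this would be called as `make_proxy_file app_service.cert.pem app_service.privkey app_service.all.pem`, followed by `export X509_USER_PROXY="$PWD/app_service.all.pem"` before running the test job.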
If the above job succeeds, then Globus/Condor(SGE) can be used for scheduling purposes. To authorize the service to launch jobs, add an entry to the grid-mapfile of the Globus installation (usually inside the /etc/grid-security directory) as follows: "/C=US/O=grid-devel/OU=sdsc/CN=app_service" app_user. Replace the value within quotes with the DN of app_service.cert.pem. You can get the DN by running: grid-cert-info -subject -file app_service.cert.pem. The app_user is the Unix user running the Tomcat server hosting Opal.
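As an illustration of the grid-mapfile entry format described above, the following sketch prints an entry for a given DN and Unix user, and checks whether a file already contains one. Both function names are hypothetical helpers written for this example, not Globus commands.

```shell
# gridmap_entry DN USER   (hypothetical helper)
# Prints one grid-mapfile line mapping a certificate DN to a local Unix
# account. The DN is wrapped in double quotes, as in the example entry above.
gridmap_entry() {
    printf '"%s" %s\n' "$1" "$2"
}

# has_gridmap_entry FILE DN   (hypothetical helper)
# Succeeds if FILE contains the quoted DN followed by a space.
has_gridmap_entry() {
    grep -F -q -- "\"$2\" " "$1"
}
```

For example, after storing the output of the grid-cert-info command above in a variable DN, `gridmap_entry "$DN" app_user` prints the line to append to the grid-mapfile, and `has_gridmap_entry /etc/grid-security/grid-mapfile "$DN"` can guard against adding it twice.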
You may also have to add the CA cert and signing policy of the above certificate into the list of trusted certificates for the Globus installation (usually inside the /etc/grid-security/certificates directory).
You may also want to check whether you can submit parallel jobs via Globus. You can do so by running something like this: globusrun -o -r hostname:2119/jobmanager-sge "&(executable=<my_mpi_exec>)(count=n)(jobtype=mpi)". Replace <my_mpi_exec> with a valid MPI executable, and replace n with a valid number of processes for your executable.
Set the following properties inside the opal.properties file: globus.gatekeeper to the URL for the Globus gatekeeper, globus.service_cert to the location of the server certificate, and globus.service_privkey to the location of the server's unencrypted private key.
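A quick way to confirm that the three properties above are present is sketched below. The function name check_globus_props is hypothetical, and the check assumes the simple key=value form of a properties file (optionally with spaces around the equals sign); it does not validate the values themselves.

```shell
# check_globus_props FILE   (hypothetical helper)
# Reports which of the three Globus properties are missing from, or empty
# in, a properties file. Returns non-zero if any is missing.
check_globus_props() {
    file=$1
    rc=0
    for key in globus.gatekeeper globus.service_cert globus.service_privkey; do
        # Escape the dots so they match literally in the regular expression.
        pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
        # Require "key = value" with a non-empty value; lines starting
        # with # do not match and are therefore treated as comments.
        if ! grep -E -q "^[[:space:]]*${pattern}[[:space:]]*=[[:space:]]*[^[:space:]]" "$file"; then
            echo "missing: $key"
            rc=1
        fi
    done
    return $rc
}
```

Running `check_globus_props opal.properties` before reinstalling prints one "missing:" line per absent property and exits with a non-zero status if any is absent.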
To submit Globus jobs to a local cluster, set the property opal.jobmanager to edu.sdsc.nbcr.opal.manager.GlobusJobManager; to submit to a remote cluster, set it to edu.sdsc.nbcr.opal.manager.RemoteGlobusJobManager. If you are using the remote Globus job manager, set globus.gridftp_base to the base URL of the location to stage files to. Make sure that this is a valid directory, and that the app_user can write files and create directories in this location. Note that if you specify a path such as gsiftp://myhost.mydomain.com:2812/data/foo.dat, the location 'data/foo.dat' is interpreted relative to the remote home directory. To point to an absolute path, use a double slash (//), e.g. gsiftp://myhost.mydomain.com:2812//data/foo.dat.
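The single-slash versus double-slash rule above is easy to get wrong, so here is a small sketch that shows how a given gsiftp URL would be interpreted under that rule. The function name gsiftp_path is hypothetical; it only inspects the URL text and does not contact any server.

```shell
# gsiftp_path URL   (hypothetical helper)
# Prints how the path part of a gsiftp:// URL is interpreted: one slash
# after host:port means relative to the remote home directory, two
# slashes mean an absolute path.
gsiftp_path() {
    rest=${1#gsiftp://}          # drop the scheme
    case $rest in
        */*) path=${rest#*/} ;;  # drop host:port and the first slash
        *)   path= ;;            # no path component at all
    esac
    case $path in
        /*) echo "absolute: $path" ;;
        *)  echo "home-relative: $path" ;;
    esac
}
```

With the two URLs from the text, `gsiftp_path gsiftp://myhost.mydomain.com:2812/data/foo.dat` prints `home-relative: data/foo.dat`, while the double-slash form prints `absolute: /data/foo.dat`.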
Reinstall Opal by running the following command:
ant install
Restart Tomcat for the changes to take effect.