Use of SGE
Sun Grid Engine (SGE) is a job scheduler that automatically and dynamically allocates the resources a job needs, transparently for the user.
SGE lets you submit, monitor, modify and delete jobs (also called tasks). Submitted jobs are managed by a queue.
Every job must either be wrapped in an SGE script (qsub) or launched interactively through SGE (qlogin). No job should be launched directly on the login node.
Jobs must be run from the /SCRATCH-BIRD directory, never from /home.
Several parameters can be adjusted at submission time: number of processors, memory, specific variables...
There are several ways to use SGE:
- with a script:
qsub
Instructions are embedded in a script submitted with the qsub command. SGE assigns a job number to the submission. This identifier can be used to track the job's progress (qstat) or to delete it (qdel).
- with an interactive session:
qlogin
This command opens a bash session directly on a compute node.
Use it to run tests (script, libraries, environment...) before switching to the qsub submission mode.
To log in to a specific node (use qstat -f to see the node names): qlogin -l h=nodename
To end the session, run:
exit
- with a makefile:
qmake
The qmake command runs, in parallel, builds driven by make, the widely used Unix tool.
Basic use: qmake -cwd -v PATH -- [make options]
Queues
Several queues are available, depending on the duration of your jobs:
- max-24h.q: default queue, max 24 hours
- max-7d.q: max 7 days
- max-1m.q: max 1 month
Jobs that exceed the time limit of their queue are stopped automatically.
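As a rough illustration of choosing a queue from an expected run time, the sketch below maps a duration in hours to one of the three queue names listed above. The function name pick_queue is hypothetical, and treating "1 month" as 30 days (720 hours) is an assumption; the exact limit should be checked with the cluster administrators.

```shell
#!/bin/bash
# pick_queue HOURS: print the shortest queue whose limit covers HOURS.
# Hypothetical helper; "1 month" is assumed to mean 30 days (720 h).
pick_queue() {
    local hours=$1
    if [ "$hours" -le 24 ]; then
        echo "max-24h.q"
    elif [ "$hours" -le 168 ]; then      # 7 days = 168 h
        echo "max-7d.q"
    elif [ "$hours" -le 720 ]; then      # assumed: 30 days = 720 h
        echo "max-1m.q"
    else
        echo "no queue allows ${hours}h" >&2
        return 1
    fi
}

pick_queue 36    # prints max-7d.q
```

The result can then be passed to qsub with the -q option.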
Submit a job
Write a qsub script
#!/bin/bash
# Script options # Optional script directives
#$ -cwd
shell commands # Optional shell commands
application # Application itself
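A minimal concrete version of the template above, written out from the shell so that it can be checked before submission. The file name hello_job.sh and the job name hello are placeholders, and the echo lines stand in for a real application. The bash -n call only checks the syntax; nothing is executed or submitted.

```shell
#!/bin/bash
# Write a small job script (hypothetical name: hello_job.sh).
cat > hello_job.sh <<'EOF'
#!/bin/bash
# Script options
#$ -cwd
#$ -N hello
# Shell commands
echo "Job started on $(hostname)"
# Application itself (placeholder)
echo "Hello world"
EOF

# Syntax check only; nothing is submitted or executed here.
bash -n hello_job.sh && echo "hello_job.sh: syntax OK"
```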
Launch a qsub script
qsub script_file [-- script_arguments]
It is possible to specify the queue on which the job should run:
qsub -q max-7d.q script_file
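The submission step can be sketched as follows. The name script_file comes from the examples above. The -terse option, which makes qsub print only the job number, is assumed to be available in the installed SGE version. The guard lets the snippet run harmlessly on a machine where qsub or the script is missing.

```shell
#!/bin/bash
# Submit script_file to the 7-day queue and keep the job number.
queue="max-7d.q"
script="script_file"

if command -v qsub >/dev/null 2>&1 && [ -f "$script" ]; then
    # -terse (assumed available) makes qsub print only the job number
    job_id=$(qsub -terse -q "$queue" "$script")
    echo "Submitted job $job_id to $queue"
    # Show the details of the job that was just submitted
    qstat -j "$job_id"
else
    echo "qsub or $script not found: nothing submitted"
fi
```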
Specify environment variables
To export your path, use the -v option (named variables) or the -V option (all environment variables).
With these options, you can export the paths set by a module. For example:
module load plink
qsub -v PATH script_file [-- script_arguments]
In this case, you can use the plink command directly in your script file.
When your script uses modules or conda environments, there are two solutions:
- load modules/environments before submission and use the -V option
- write load commands in the script file
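For the second solution, the load command goes inside the job script itself, so the job does not depend on the environment of the submitting shell. A minimal sketch, assuming the plink module from the example above; the file name plink_job.sh and the plink --version call are only illustrative.

```shell
#!/bin/bash
# Generate a job script that loads its own module (hypothetical file name).
cat > plink_job.sh <<'EOF'
#!/bin/bash
#$ -cwd
# Load the module inside the job instead of exporting PATH with -v or -V
module load plink
plink --version
EOF

# Check the syntax locally; submit later with: qsub plink_job.sh
bash -n plink_job.sh && echo "plink_job.sh: syntax OK"
```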
Options
qsub -help
Main options
Option | Description |
-V | Export all environment variables |
-cwd | Execute the job from the current working directory |
-S | Specify the interpreting shell for the job |
-M | Email address for notifications |
-m b|e|a|s|n | When to send email: b = begin, e = end, a = aborted, s = suspended, n = never |
Options can be given either in the script, on lines starting with #$, or on the command line.
Example
# Script options # Optional script directives
#$ -S /bin/bash
#$ -cwd
#$ -M mymail@univ-nantes.fr
#$ -m be
echo "Hello world"
echo $JOB_ID
MonProg -options
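Because the same options are accepted in the script and on the command line, the directives of a script can be turned into the equivalent qsub call. The sketch below does this for a copy of the example above; the file name example_job.sh is a placeholder, and the command is only printed, not run.

```shell
#!/bin/bash
# Print the qsub command line equivalent to the "#$" directives of a script.
cat > example_job.sh <<'EOF'
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -M mymail@univ-nantes.fr
#$ -m be
echo "Hello world"
EOF

# Collect every line starting with "#$ " and strip that prefix.
opts=$(sed -n 's/^#\$ //p' example_job.sh | tr '\n' ' ')
echo "qsub ${opts}example_job.sh"
# prints: qsub -S /bin/bash -cwd -M mymail@univ-nantes.fr -m be example_job.sh
```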
Environment variables
$HOME | Home directory on execution machine |
$USER | User ID of job owner |
$JOB_ID | Current job ID |
$JOB_NAME | Current job name (set with the -N option of qsub, qsh, qrsh, qlogin and qalter) |
$HOSTNAME | Name of the execution host |
$TASK_ID | Array job task index number (inside the job script, use $SGE_TASK_ID) |
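A typical use of these variables is to label log lines or output files. The sketch below prints a one-line summary of the current job; the function name job_banner is hypothetical, and the fallback values are there only so that the snippet also runs outside SGE, where JOB_NAME and JOB_ID are not set.

```shell
#!/bin/bash
# job_banner: print a one-line summary of the current job.
# Outside SGE, JOB_NAME and JOB_ID are unset, so fallback values are used.
job_banner() {
    echo "job=${JOB_NAME:-none} id=${JOB_ID:-none} user=${USER:-unknown}"
}

job_banner
```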
Follow a job
Use the qstat command to follow your job execution.
qstat -f # Show all queues
qstat -u "*" # Show all jobs (all users)
qstat -f -u "*" # Show all jobs (all users), sorted by queue
A job can have one of the following statuses:
- qw : waiting in the queue
- r : running
- s,S : suspended
- R : restarted
- E : Error
To get details about an error:
qstat -j jobnumber -explain E
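When post-processing qstat output, the state codes above can be translated into readable text. A minimal sketch: the function name describe_state is hypothetical, it covers only the codes listed in this section, and matching combined codes such as Eqw or Rr is an assumption about how qstat displays them.

```shell
#!/bin/bash
# describe_state CODE: translate a qstat state code into a short description.
# Hypothetical helper covering only the codes listed in this section.
describe_state() {
    case "$1" in
        *E*) echo "error" ;;                 # checked first: assumed to appear combined, e.g. Eqw
        qw)  echo "waiting in the queue" ;;
        r)   echo "running" ;;
        s|S) echo "suspended" ;;
        R*)  echo "restarted" ;;             # assumed to appear combined, e.g. Rr
        *)   echo "unknown state: $1" ;;
    esac
}

describe_state qw     # prints: waiting in the queue
describe_state Eqw    # prints: error
```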
Delete a job
Use the qdel command to delete a job:
# One job
qdel numjob
# A list of jobs
qdel numjob1 numjob2 numjob3
# All jobs for a user
qdel -u <username>
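Since a deleted job cannot be recovered, it can help to check the job numbers before calling qdel. The sketch below is a dry run: the function name build_qdel is hypothetical, it only validates its arguments and prints the command, and it never calls qdel itself.

```shell
#!/bin/bash
# build_qdel ID...: validate job numbers and print the qdel command (dry run).
# Hypothetical helper; it never calls qdel itself.
build_qdel() {
    local id
    [ "$#" -gt 0 ] || { echo "usage: build_qdel jobnumber..." >&2; return 1; }
    for id in "$@"; do
        case "$id" in
            # Reject empty arguments and anything containing a non-digit
            ''|*[!0-9]*) echo "not a job number: $id" >&2; return 1 ;;
        esac
    done
    echo "qdel $*"
}

build_qdel 101 102 103    # prints: qdel 101 102 103
```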