The most common way to use SGE is to run batch jobs via ''qsub''. This is the non-interactive method.
  
''qsub'' allows you to submit a job defined by a script (a file that holds a series of commands - basically a program), and the job scheduler will place your job in the **job queue**, to be run either immediately or when resources open up.
  
You run a batch job like so, where ''myjobscript'' is the name of a script that holds some commands. For example, it could be a Bash script or a Perl script.
  
  [mgstauff@chead ~]$ qsub myjobscript
  Your job 27657 ("myjobscript") has been submitted
      
Here's an example Bash script that could be in the file named ''myjobscript'' (you can cut-n-paste it into a text editor on the cluster to try it yourself):
  
  #!/bin/bash
  echo I am a job running now on $HOSTNAME
  ZZZ=5
  echo Sleeping for $ZZZ...
  sleep $ZZZ
  echo NSLOTS: $NSLOTS
  echo All Done.
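
After submitting, you can check whether your job is queued or running with the standard SGE ''qstat'' command, which lists your jobs and their states:

  [mgstauff@chead ~]$ qstat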
  
===== Output from your job =====
  
Your script should be set up to save your image or data output files as you normally would, i.e. typically in your /data/<xyz>/<username> directory somewhere in your project tree.
  
But what happens to the terminal output of your script? That is, the text or error messages your script normally generates and shows on the screen when you run it from the command line? This output is saved to special files for each job in the job's working directory. They look like this:
  
  [mgstauff@chead ~]$ ls myjobscript.*
  myjobscript.e27657
  myjobscript.o27657
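
The ''.o'' file holds the job's standard output and the ''.e'' file its error output (the number is the job-ID). They're plain text, so you can view them like any other file, e.g.:

  [mgstauff@chead ~]$ cat myjobscript.o27657
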
===== Temp directory =====
**IMPORTANT**
If your script/job makes use of temporary files and you need to provide a path for them, use the ''TMPDIR'' environment variable set up for your qsub job (and for qlogin). This is a temporary directory local to the compute node and unique to your job. It will be __much__ faster than writing temporary files to /data or other remote-mounted devices. It's also deleted automatically when your job ends, which keeps the /tmp dir on the compute node clean for other users. If your script/application is hard-coded to use /tmp or ''TMP'', that's alright: it will still use the local compute node drive, but the files will take longer to be cleared out of /tmp.
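
For example, a job script might point its scratch files at ''$TMPDIR'' like this (a minimal sketch; ''my_analysis'', its ''--scratch'' option, and the file names are hypothetical stand-ins for your actual program):

  #!/bin/bash
  # $TMPDIR is set by SGE to a job-private directory on the node's local disk
  my_analysis --scratch $TMPDIR/work.dat input.nii
  # copy final results to mounted storage before the job ends,
  # since $TMPDIR is deleted automatically
  cp output.nii /data/<xyz>/<username>/project/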
  
In short, **do not write temporary files to your data directory** or other mounted devices unless you have to for some special reason. This will slow down the whole system a bit for you, and for everyone.
==== Checking How Busy the Cluster Is ====
  
=== cfn-resources ===

The best way to check resources is to run this command:

  cfn-resources

to get a list of resources available for ''all.q'', the default SGE queue. To view resources available for other queues, pass that queue name as an argument, e.g.:

  cfn-resources himem.q

=== qstat -g c ===
At a lower level, you can run ''qstat -g c''. This command shows how many slots are used for each queue. **HOWEVER**, it does not take memory into account: often a compute node has CPU cores (slots) available but no free memory. Use the ''cfn-resources'' command to check both resources.

See this example:
  
{{:qstat-gc-example.png?600|}}
A more detailed view of each node, including slot and memory usage, can be seen this way:
  
  qstat -F h_vmem,s_vmem,slots -q all.q
      
This will show the info for the qsub queue, all.q. Replace all.q with another queue name to see its status.

**NOTE** that the ''slots'' info listed here is in fact CPU cores. So it'll show how many cores are available on each machine, along with how much memory. This tells you whether any machine has enough free cores and memory to run your jobs.

You can add an alias to make this easier:

  alias qsF='qstat -F h_vmem,s_vmem,slots -q all.q | less'
  
----
__**However, you may not see any message that your job was killed because of a memory limitation.**__ There's a glitch in SGE: when you hit a memory limit, the SGE system doesn't always catch the fact before the operating system does. If the operating system notices first, your job will be killed in a way that prevents SGE from getting a message back to you about what happened, and any exception/error handling in the app will most likely not be able to get its message to your output files before the process is terminated. I hope to be able to find a workaround to this in the future.
    
=== Java Memory Issues ===

Java likes to allocate lots of RAM. You usually have to limit its memory. [[java|Click here for details.]]
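
For example, the JVM's standard ''-Xmx'' flag caps the heap size (a sketch; ''myapp.jar'' is a placeholder, and the cap should fit within the memory you've requested from SGE):

  # cap the Java heap at 3GB, leaving headroom within a 4GB job request
  java -Xmx3g -jar myapp.jar
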
==== Jobs on chead ====
If you're running something directly on chead, there are different limits. [[clusterbasics#don_t_generally_run_programs_on_the_front_end_itself|See here for details.]]
  
==== Per-job memory limit ====
There is a limit of 30GB per job at this point for jobs running on the default queue, 'all.q'. See notes on the himem.q queue on this page if your job uses more memory.
  
NOTE that if you request this much memory, you might have to wait for a node to become free since this means using most of a node's memory resources, and your job might be slowed along with other jobs on the node because memory swap space will most likely be used.
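
Memory is typically requested at submission time with SGE's ''-l'' resource option, using the ''h_vmem'' resource shown in the ''qstat -F'' examples above (a sketch; check your site's policy for the exact resource name and limits):

  qsub -l h_vmem=8G myjobscript
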
If you have a lot of jobs to run, it's usually better to run them single-threaded. You'll run more of them at once, and in the end all of them will complete sooner. And when the cluster is busy, you'll spend less time waiting for a compute node with the requested number of cores available. So if you've submitted more jobs than you have slots in your quota, you're better off running them single-threaded.
  
__To modify queued jobs__ to be single-threaded, you can run this command:

   qalter -binding linear:1 -pe unihost 1 <jobid>

__The exception__ is if you ask for more memory for your jobs. The memory quota is 6GB / slot. So for example, if each of your jobs asks for 12GB, you'll hit your memory quota before (twice as quickly as) your slot quota. So in that case you'd ask for 2 slots per job to speed things up.
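
For example, such a 12GB job could be submitted with two slots like this (a sketch; in SGE, ''-l'' memory requests are counted per slot, so 2 slots at 6GB each cover 12GB total):

  qsub -pe unihost 2 -binding linear:2 -l h_vmem=6G myjobscript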
  
And with a single or small number of jobs, you should have a decent idea of whether it will run faster with multiple cores before you ask for them, especially if you run this kind of job periodically. You can run once with 1 core, then once with 4 cores and compare the time it takes (you can add the 'date' command at the beginning and end of your job script). If you ask for a bunch of cores and aren't utilizing them, you're wasting resources for everyone else.
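
A minimal timing sketch (''my_analysis'' is a hypothetical stand-in for your actual command):

  #!/bin/bash
  date           # print start time to the job's .o output file
  my_analysis
  date           # print end time; compare across 1-core and 4-core runs
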
In your ''qlogin'' session or ''qsub'' job, the ''NSLOTS'' environment variable will be set by SGE to the number of slots you've requested (some scripting in your .bash_profile actually handles ''qlogin'' instances).
  
Use this variable in your scripts/commands if you need to know how many slots are available for threading. The [[matlab_usage|Matlab]], [[mrtrix_usage|MRTrix]] and ITK apps are set up automatically on the cluster by special handling; see below.
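
For example, a job script can pass the slot count straight to a multi-threaded program (a sketch; ''my_tool'' and its ''--threads'' option are hypothetical):

  my_tool --threads $NSLOTS input.dat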
  
  
  
__NOTE__ Because the -V option will pass your environment variables to your qsub sessions, be careful what value you set for ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS. If it does not match the number of slots you're requesting for qsub, threading will not work properly and performance will decrease.
=== Limiting threads in OMP-based apps like FSL ===
The default environment is set up to include

  export OMP_NUM_THREADS=${NSLOTS}
  export OMP_THREAD_LIMIT=${NSLOTS}

which limits OMP-based apps (like FSL) to only as many threads as you have slots.
  
=== Limiting threads in Matlab ===