Cheat Sheet: Using HPC
TASK | METHOD | COMMANDS |
---|---|---|
Create a Job | From a UNIX command | srun hostname |
From a Script | sbatch script.sh | |
See jobs running | See the status of partitions and their nodes | sinfo |
See the IDs of all jobs | squeue | |
See current jobs from a user | squeue -u <username> | |
See current running jobs by user | squeue -u <username> -t RUNNING | |
See current pending jobs by user | squeue -u <username> -t PENDING | |
See accounting data for a user's jobs (replace <username> with a username) | sacct -u <username> --format=JobID,JobName,MaxRSS,Elapsed | |
See status for a currently running job by its ID | sstat --format=AveCPU,AvePages,AveRSS,AveVMSize,JobID -j <jobid> --allsteps | |
See the nodes being used | Show node counts and states for each partition | sinfo |
Show a detailed, node-by-node listing | sinfo -N -l | |
Cancel a job | First use squeue to find the ID of the job you want to cancel, then use scancel. | scancel job_ID (example: scancel 12345) |
Cancel all the jobs from a user | scancel -u <username> | |
Cancel all the pending jobs from a user | scancel -t PENDING -u <username> | |
Cancel one or more jobs by name | scancel --name myJobName | |
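Pause a job | Hold a pending job by ID so that it does not start; use squeue to see the IDs | scontrol hold job_ID |
Resume a job | Release a held job by ID; use squeue to see the IDs. (scontrol resume applies only to a running job that was paused with scontrol suspend.) | scontrol release job_ID |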
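Change permissions of my folders | Change the permissions of your personal folders. | chmod ### myfolder, where ### is a three-digit octal code such as 777. The digits apply to the owner, the group, and everyone else, in that order, and each digit is the sum of 4 (read), 2 (write), and 1 (execute). |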
Examples: | Give me full permissions on my folder while everyone else can only read it. | chmod 744 myfolder |
Let only me read, write, and execute my folder. | chmod 700 myfolder | |
Give me full permissions and let only members of my group read my folder. | chmod 740 myfolder | |
Full permission to everyone for my folder | chmod 777 myfolder | |
Glossary: | ||
Job: | Task that is given to the computer to be executed. | |
Nodes: | Compute or GPU servers on which jobs are run. | |
Script: | A list of commands that get executed in order. | |
Partitions: | A partition is a collection of nodes. | |
Unix command: | An instruction that can be given to a computer to be executed. |
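
The "From a Script" entry above submits script.sh with sbatch, and the glossary describes a script as a list of commands executed in order. As a minimal sketch of what such a script.sh might contain: the job name, output file pattern, task count, and time limit below are illustrative assumptions, not site defaults, and your cluster may also require a partition or account option.

```shell
#!/bin/bash
# Lines starting with #SBATCH are read by sbatch as job options;
# to plain bash they are ordinary comments.

# Assumed job name (reuses the name from the scancel --name example).
#SBATCH --job-name=myJobName
# Output file; %j is replaced by the job ID.
#SBATCH --output=myJobName-%j.out
# Assumed resources: one task, five-minute wall-clock limit.
#SBATCH --ntasks=1
#SBATCH --time=00:05:00

# The commands below run in order on the allocated node.
# Falls back to uname -n if the hostname command is unavailable.
node_name="$(hostname 2>/dev/null || uname -n)"
echo "Running on ${node_name}"
```

Submit it with sbatch script.sh, then check on it with squeue -u <username>.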