Cheat Sheet: Using HPC
I want to...Example
| TASK | METHOD | COMMANDS |
|---|---|---|
| Create a Job | From a UNIX command | srun hostname |
| | From a script | sbatch script.sh |
| See jobs running | See the state of the partitions | sinfo |
| | See all the job IDs | squeue |
| | See current jobs from a user | squeue -u <username> |
| | See current running jobs by user | squeue -u <username> -t RUNNING |
| | See current pending jobs by user | squeue -u <username> -t PENDING |
| | See accounting data for a user's jobs (replace <username> with a username) | sacct -u <username> --format=JobID,JobName,MaxRSS,Elapsed |
| | See the status of a currently running job by its ID | sstat --format=AveCPU,AvePages,AveRSS,AveVMSize,JobID -j <jobid> --allsteps |
| See the nodes being used | Show how many nodes are in each state, per partition (per-job node counts appear in the NODES column of squeue) | sinfo |
| | Show a node-by-node listing in long format | sinfo -N -l |
| Cancel a job | First use squeue to find the ID of the job you want to cancel, then use scancel. | scancel job_ID (example: scancel 12345) |
| | Cancel all the jobs from a user | scancel -u <username> |
| | Cancel all the pending jobs from a user | scancel -t PENDING -u <username> |
| | Cancel one or more jobs by name | scancel --name myJobName |
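| Pause a job | Hold a pending job by ID so that it does not start; use squeue to see the IDs | scontrol hold job_ID |
| Resume a job | Release a held job by ID; use squeue to see the IDs. (scontrol suspend and scontrol resume act on running jobs instead and typically require administrator rights.) | scontrol release job_ID |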
| Change permissions of my folders | Change the permissions of your personal folders. | chmod ### myfolder, where ### is a three-digit octal code such as 777 (one digit each for owner, group, and others). |
| Examples: | Give me full permission on my folder while everyone else can only read it. | chmod 744 myfolder |
| | Let only me read, write, and execute my folder. | chmod 700 myfolder |
| | Give me full permission and let only my group read my folder. | chmod 740 myfolder |
| | Give full permission to everyone for my folder. | chmod 777 myfolder |
| Glossary: | ||
| Job: | Task that is given to the computer to be executed. | |
| Nodes: | Compute or GPU servers on which jobs are run. | |
| Script: | A list of commands that get executed in order. | |
| Partitions: | A partition is a collection of nodes. | |
| Unix command: | An instruction typed at the command line for the computer to execute. | |
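
The "From a script" row assumes that a file named script.sh already exists. A minimal sketch of such a script is shown here. The job name, output file pattern, task count, and time limit are illustrative assumptions rather than site defaults, and any partition or account options that a particular cluster requires are left out.

```shell
#!/bin/bash
# Write a minimal batch script. Lines starting with #SBATCH are read by
# sbatch as options; to a plain shell they are ordinary comments.
cat > script.sh <<'EOF'
#!/bin/bash
# Hypothetical job name, shown by squeue.
#SBATCH --job-name=hello
# Output file; %j expands to the job ID.
#SBATCH --output=hello_%j.out
# Run a single task.
#SBATCH --ntasks=1
# Five-minute wall-clock limit (assumed value).
#SBATCH --time=00:05:00

hostname
EOF

# Submit the script if Slurm is available; otherwise run it locally as a dry run.
if command -v sbatch >/dev/null 2>&1; then
    sbatch script.sh
else
    bash script.sh
fi
```

When the job is accepted, sbatch prints the job ID, which can then be passed to squeue, sstat, or scancel as in the table.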
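
The three digits of a chmod code apply to the owner, the group, and everyone else, in that order, and each digit is the sum of read (4), write (2), and execute (1). The sketch below applies the codes from the table to a scratch directory and prints the resulting permission string. The directory name demo_folder and the helper show_mode are hypothetical, introduced only for this demonstration.

```shell
#!/bin/bash
# Show the permission string that each chmod code produces.
# demo_folder is a throwaway directory created only for this demonstration.
mkdir -p demo_folder

show_mode() {
    chmod "$1" demo_folder
    # The first 10 characters of `ls -ld` are the file type and permissions.
    printf '%s -> %s\n' "$1" "$(ls -ld demo_folder | cut -c1-10)"
}

show_mode 744   # drwxr--r--
show_mode 700   # drwx------
show_mode 740   # drwxr-----
show_mode 777   # drwxrwxrwx

rmdir demo_folder
```

On a directory, the read bit allows listing file names, while the execute bit is what allows entering it, so 744 lets other users see the names in the folder but not change into it.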