Commands
Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters. 1
sbatch: Submit a job script.
The sbatch command submits a job script to the Slurm job scheduler. The script specifies the job’s required resources, the commands it will execute, and any dependencies. Once submitted, the job waits in the queue and runs when the requested compute resources become available. sbatch docs
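As a rough illustration, here is a minimal sketch of a job script and a way to submit it. The file name hello_job.sh, the job name, and all resource values are placeholders rather than site recommendations, and submit_job is a hypothetical helper, not part of Slurm.

```shell
# Write a minimal job script. Lines starting with #SBATCH are read by
# sbatch as options; plain comment lines between them are ignored.
cat > hello_job.sh <<'EOF'
#!/bin/bash
# Name shown in the queue
#SBATCH --job-name=hello
# One node, one task
#SBATCH --nodes=1
#SBATCH --ntasks=1
# Wall-clock limit (HH:MM:SS) and memory per node
#SBATCH --time=00:05:00
#SBATCH --mem=1G
# %j expands to the job ID
#SBATCH --output=hello_%j.out

echo "Running on $(hostname)"
EOF

# Hypothetical helper: submit a script and print only the job ID.
# With --parsable, sbatch prints "jobid" or "jobid;cluster" instead of
# the usual "Submitted batch job <jobid>" message.
submit_job() {
    sbatch --parsable "$@" | cut -d ';' -f 1
}

# Usage on a cluster:
#   jobid=$(submit_job hello_job.sh)
#   submit_job --dependency=afterok:"$jobid" hello_job.sh   # run after the first job succeeds
```

Capturing the job ID this way makes it easy to pass it on to other commands or to chain jobs with --dependency.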
srun: Run a command on allocated compute node(s).
To execute a specific command on the allocated compute nodes, use the srun command. It launches the command as one or more parallel tasks using the resources allocated to your job, and you can call it several times in a job script to run a series of commands. srun docs
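A small sketch of how srun might be wrapped, assuming you are inside an allocation (for example, in a job script). The helper name run_tasks is hypothetical.

```shell
# Hypothetical helper: run a command as N parallel tasks.
# First argument is the task count; the rest is the command to launch.
run_tasks() {
    ntasks=$1
    shift
    srun --ntasks="$ntasks" "$@"
}

# Usage inside a job script or allocation:
#   run_tasks 4 hostname    # prints one hostname per task
#   srun --pty bash         # start an interactive shell on a compute node
```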
scancel: Delete a job.
Use the scancel command to cancel a job that is running or waiting in the queue. Given a job ID, it terminates that job and frees its resources for other pending jobs. scancel docs
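Besides a single job ID, scancel accepts filters. The sketch below cancels only pending jobs for one user; cancel_pending is a hypothetical helper name and 123456 is a placeholder job ID.

```shell
# Hypothetical helper: cancel a user's pending jobs and leave
# running jobs untouched.
cancel_pending() {
    scancel --user="$1" --state=PENDING
}

# Usage:
#   scancel 123456            # cancel one job by ID
#   cancel_pending "$USER"    # cancel all of your pending jobs
```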
squeue: Show the state of jobs.
The squeue command provides a comprehensive overview of the current state of all jobs in the Slurm job queue. It displays information such as job ID, user, state, running time, etc. Use this command to monitor the progress of your jobs and get an overview of the entire system workload. squeue docs
squeue -u: See your jobs running or waiting to run.
Use the squeue -u command, followed by a netID, to view the jobs running or waiting to run for that user. It filters the output of squeue to display only the jobs associated with the specified netID, which gives you a focused view of your own jobs and makes their progress easier to track.
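The same filter combines with other squeue options. As a sketch, the hypothetical helper count_jobs below counts a user's queue entries in a given state (a pending job array may appear as a single entry).

```shell
# Hypothetical helper: count a user's queue entries in one state.
# -h drops the header, -t filters by job state, -o '%i' prints only job IDs.
count_jobs() {
    squeue -h -u "$1" -t "$2" -o '%i' | wc -l | tr -d ' '
}

# Usage:
#   squeue -u "$USER"             # list your jobs
#   count_jobs "$USER" PENDING    # how many are still waiting
```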
squeue --start: Report the expected start time for pending jobs.
If you have pending jobs in the queue and want to know when they are expected to start, use the squeue --start command. It reports an estimated start time for each pending job. The estimate can change as other jobs finish early or new jobs are submitted.
squeue -j: Show the nodes being used for your running job.
The squeue -j command shows the compute nodes your running job is using. Given a job ID, it prints that job’s queue entry, including the list of nodes allocated to it.
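squeue reports nodes in a compact form such as node[01-03]. If you need one hostname per line, something like the following sketch may help; job_nodes is a hypothetical helper name.

```shell
# Hypothetical helper: print one hostname per line for a running job.
# %N is squeue's compact node list; "scontrol show hostnames" expands it.
job_nodes() {
    nodelist=$(squeue -h -j "$1" -o '%N')
    if [ -n "$nodelist" ]; then
        scontrol show hostnames "$nodelist"
    fi
}

# Usage:
#   job_nodes 123456    # 123456 is a placeholder job ID
```

A pending job has no nodes yet, so the helper prints nothing in that case.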
sinfo: Show state of nodes and partitions.
To get detailed information about the nodes and partitions (queues) in the Slurm cluster, you can use the sinfo command. This command provides insights into the compute nodes’ state, availability, and the partitions they belong to. By understanding the current state of the nodes and partitions, you can make informed decisions about job submission and resource allocation. sinfo docs
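For example, before submitting you might check which nodes in a partition are idle. This is only a sketch: idle_nodes is a hypothetical helper, and the partition name you pass depends on your cluster.

```shell
# Hypothetical helper: list idle nodes in a partition, one per line.
# -N gives node-oriented output, -h drops the header, -p selects the
# partition, -t filters by node state, -o '%N' prints only node names.
idle_nodes() {
    sinfo -h -N -p "$1" -t idle -o '%N'
}

# Usage:
#   sinfo               # partitions and node states
#   sinfo -s            # one summary line per partition
#   idle_nodes debug    # "debug" is a placeholder partition name
```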
scontrol show jobid: Displays detailed info about a job.
For a detailed overview of a specific job, use the scontrol show jobid command followed by the job ID. It provides comprehensive information about the specified job, including its state, execution time, resource utilization, and more. It can help you troubleshoot a particular job or understand its characteristics. scontrol docs
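The output is a block of Key=Value pairs, so a single field can be pulled out with standard text tools. The sketch below uses the documented form scontrol show job; job_field is a hypothetical helper, and it assumes the value you want contains no spaces.

```shell
# Hypothetical helper: extract one Key=Value field from the job record.
# Splits the output on spaces, then prints the value for the given key.
job_field() {
    scontrol show job "$1" | tr ' ' '\n' | sed -n "s/^$2=//p" | head -n 1
}

# Usage (123456 is a placeholder job ID):
#   job_field 123456 JobState
#   job_field 123456 NodeList
```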
sacct: Display accounting data for all jobs and job steps.
The sacct command helps you track and report on job activity and resource usage, including jobs that have already finished. With sacct, you can access detailed information about job status, start and end times, CPU and memory usage, and more. This data lets you analyze job performance, identify bottlenecks, and choose better resource requests for future jobs. sacct docs
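As a sketch of typical usage, the hypothetical helper job_state below prints the final state of a job, and the usage lines show a per-step summary with a few commonly used fields.

```shell
# Hypothetical helper: print the state of a job's allocation.
# -X limits output to the allocation (no job steps), -n drops the header,
# -P prints pipe-delimited fields without padding.
job_state() {
    sacct -j "$1" -X -n -P --format=State | head -n 1
}

# Usage (123456 is a placeholder job ID):
#   job_state 123456
#   sacct -j 123456 --format=JobID,JobName,State,Elapsed,MaxRSS,ExitCode
```

Comparing Elapsed and MaxRSS with what you requested is a simple way to spot over-sized time or memory requests.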
sreport: Generate reports from the Slurm accounting data.
The sreport command generates aggregated reports on resource utilization, job accounting, and billing from the Slurm accounting data. With sreport, you can see figures such as cluster utilization over a time period and usage per user or account. This data helps optimize resource allocation, identify cost-saving opportunities, and streamline billing processes. sreport docs
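A short sketch of a utilization report for a date range. The helper name cluster_utilization and the dates are placeholders, and access to sreport data may be restricted on some clusters.

```shell
# Hypothetical helper: cluster utilization between two dates,
# with times reported in hours instead of the default minutes.
cluster_utilization() {
    sreport -t Hours cluster utilization start="$1" end="$2"
}

# Usage:
#   cluster_utilization 2024-01-01 2024-02-01
#   sreport user TopUsage start=2024-01-01 end=2024-02-01   # heaviest users in the period
```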