Intel MPI Jobs

This section describes how to submit MPI jobs under the Slurm batch system. Here we assume that MPI jobs are compiled with Intel MPI from one of the available versions of Intel Parallel Studio on KNL. Generally, there are two ways to launch an MPI job under Slurm: 1) mpiexec.hydra, from the Intel Parallel Studio command-line tool suite, or 2) srun, from the Slurm command utilities. Although both methods work well under Slurm, srun allows Slurm to control and clean up all the MPI processes easily, and to account for all MPI processes more accurately.
       
  • Using mpiexec.hydra and ssh or rsh to launch an MPI job.
       
        This is very similar to the way MPI jobs were launched under the old PBS batch system. Inside a job submission script, one has to determine the nodes allocated to the job using the scontrol command. The following is a simple sbatch script.
                 #!/bin/bash -l
                 #SBATCH -A youraccount
                 #SBATCH -p phi
                 #SBATCH -N 4
                 #SBATCH -t 00:30:00
                  #SBATCH -J yourjobname
                 #SBATCH --mail-type=END
                 #SBATCH -C 18p
                 # create a temp file
                 tmpfile=`mktemp`

                  # convert slurm compact form to regular form
                  /usr/bin/scontrol show hostnames $SLURM_JOB_NODELIST > $tmpfile
                  # count the allocated nodes (one hostname per line)
                  numnodes=$(wc -l < $tmpfile)
                  source /dist/intel/parallel_studio_xe_2017/parallel_studio_xe_2017.0.035/bin/psxevars.sh intel64
                  mpiexec.hydra -bootstrap rsh -PSM2 -f $tmpfile -np $numnodes -perhost 1 mpiprog
                 rm -f $tmpfile

           Note: one can replace -bootstrap rsh with -bootstrap ssh if one's ssh keys for all hosts are set up correctly.
                     The -PSM2 option tells Intel MPI to use the OPA (Omni-Path) fabric.
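    To illustrate what the scontrol conversion step produces, here is a minimal sketch that expands a simple compact nodelist such as knl[01-04] into one hostname per line. The function name expand_nodelist and the knl prefix are hypothetical; real Slurm nodelists may also contain comma-separated entries and multiple ranges, which scontrol show hostnames handles for you.

```shell
#!/bin/bash
# Hypothetical sketch: expand a single-range compact nodelist such as
# "knl[01-04]" into one hostname per line, mimicking the output of
# "scontrol show hostnames $SLURM_JOB_NODELIST" for this simple case.
expand_nodelist() {
    local list=$1
    if [[ $list =~ ^([^[]+)\[([0-9]+)-([0-9]+)\]$ ]]; then
        local prefix=${BASH_REMATCH[1]}
        local start=${BASH_REMATCH[2]}
        local end=${BASH_REMATCH[3]}
        local width=${#start}   # preserve zero padding, e.g. "01" -> width 2
        local i
        for (( i=10#$start; i<=10#$end; i++ )); do
            printf '%s%0*d\n' "$prefix" "$width" "$i"
        done
    else
        # no bracket range: the list is already a single hostname
        printf '%s\n' "$list"
    fi
}

expand_nodelist "knl[01-04]"
```

    On a real system, always use scontrol show hostnames, which handles every nodelist form; this sketch only covers the single-range case.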
  • Using mpiexec.hydra with the -bootstrap slurm option to let Slurm manage an MPI job
           This method is very similar to the one above. The only difference is the bootstrap option, -bootstrap slurm, which allows the MPI processes to be launched and managed by Slurm.

                mpiexec.hydra -bootstrap slurm -PSM2 -f $tmpfile -np $numnodes -perhost 1 anothermpiprog   

  • Using srun command to launch an MPI job
            This is the preferred method according to the official Slurm documentation. With srun, one does not need to determine the allocated nodes as in the previous two methods. The following is a simple script.

                #!/bin/bash -l
                #SBATCH -A youraccount
                #SBATCH -p phi
                #SBATCH -N 4
                #SBATCH -t 00:30:00
                #SBATCH -J mightyrun
                #SBATCH --mail-type=END
                #SBATCH -C 18p
                source /dist/intel/parallel_studio_xe_2017/parallel_studio_xe_2017.0.035/bin/psxevars.sh intel64
                # The following 3 variables make sure the OPA fabric is used
                export I_MPI_FABRICS_LIST=tmi
                export I_MPI_TMI_PROVIDER=psm2
                export I_MPI_FALLBACK=0
                # Using PMI slurm process management
                export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
                srun -n 4 fullpath_to_mpi_prog
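
    To keep the srun task count consistent with #SBATCH -N without hard-coding "-n 4", the count can be derived from Slurm's environment. A minimal sketch follows; SLURM_JOB_NUM_NODES is set by Slurm inside a batch job, while the fallback default and the tasks_per_node variable are illustrative assumptions, not Slurm settings.

```shell
#!/bin/bash
# SLURM_JOB_NUM_NODES is exported by Slurm inside a batch job;
# the default here is only so the sketch runs outside Slurm.
: "${SLURM_JOB_NUM_NODES:=4}"
# one MPI rank per node, matching "-N 4" and "-n 4" in the script above
tasks_per_node=1
ntasks=$(( SLURM_JOB_NUM_NODES * tasks_per_node ))
echo "srun -n $ntasks fullpath_to_mpi_prog"
```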

           For more information, please see the official Slurm documentation on the Slurm commands.