Processes are mapped based on one of the following directives as applied at the job level:
- SLOT assigns procs to each node up to the number of available slots on that node before moving to the next node in the allocation
- HWTHREAD assigns a proc to each hardware thread on a node in a round-robin manner up to the number of available slots on that node before moving to the next node in the allocation
- CORE (default) assigns a proc to each core on a node in a round-robin manner up to the number of available slots on that node before moving to the next node in the allocation
- L1CACHE assigns a proc to each L1 cache on a node in a round-robin manner up to the number of available slots on that node before moving to the next node in the allocation
- L2CACHE assigns a proc to each L2 cache on a node in a round-robin manner up to the number of available slots on that node before moving to the next node in the allocation
- L3CACHE assigns a proc to each L3 cache on a node in a round-robin manner up to the number of available slots on that node before moving to the next node in the allocation
- NUMA assigns a proc to each NUMA region on a node in a round-robin manner up to the number of available slots on that node before moving to the next node in the allocation
- PACKAGE assigns a proc to each package on a node in a round-robin manner up to the number of available slots on that node before moving to the next node in the allocation
- NODE assigns processes in a round-robin fashion to all nodes in the allocation, with the number assigned to each node capped by the number of available slots on that node
- SEQ (often accompanied by the file=<path> qualifier) assigns one process to each node specified in the file. The sequential file is to contain an entry for each desired process, one per line of the file.
- PPR:N:resource maps N procs to each instance of the specified resource type in the allocation
- RANKFILE (often accompanied by the file=<path> qualifier) assigns one process to the node/resource specified in each entry of the file, one per line of the file.
- PE-LIST=a,b assigns procs to each node in the allocation based on the ORDERED qualifier. The list is comprised of comma-delimited ranges of CPUs to use for this job. If the ORDERED qualifier is not provided, then each node will be assigned procs up to the number of available slots, capped by the availability of the specified CPUs. If ORDERED is given, then one proc will be assigned to each of the specified CPUs, if available, capped by the number of slots on each node and the total number of specified processes. Providing the OVERLOAD qualifier to the “bind-to” option removes the check on availability of the CPU in both cases.
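
The difference between filling a node first (SLOT) and spreading across nodes (NODE) can be illustrated with a small simulation. This is only a sketch of the two rules as described above, not PRRTE's implementation; the function names and the representation of an allocation as a list of per-node slot counts are assumptions made for the example. Each returned list gives, in mapping order, the index of the node a proc lands on; the final rank of each proc is decided separately by the ranking step.

```python
def map_by_slot(slots, nprocs):
    """Fill each node up to its slot count before moving to the next node.

    `slots` is a hypothetical allocation: slots[i] is the number of
    available slots on node i.
    """
    if nprocs > sum(slots):
        raise ValueError("not enough slots in the allocation")
    placement = []
    for node, nslots in enumerate(slots):
        for _ in range(nslots):
            if len(placement) == nprocs:
                return placement
            placement.append(node)
    return placement


def map_by_node(slots, nprocs):
    """Round-robin over all nodes, capping each node at its slot count."""
    if nprocs > sum(slots):
        raise ValueError("not enough slots in the allocation")
    placement = []
    used = [0] * len(slots)
    while len(placement) < nprocs:
        for node, nslots in enumerate(slots):
            if len(placement) == nprocs:
                break
            if used[node] < nslots:
                used[node] += 1
                placement.append(node)
    return placement


if __name__ == "__main__":
    allocation = [4, 2]  # node 0 has 4 slots, node 1 has 2 slots
    print(map_by_slot(allocation, 5))
    print(map_by_node(allocation, 5))
```

With a two-node allocation of 4 and 2 slots and five procs, the SLOT rule places four procs on the first node before touching the second, while the NODE rule alternates between the nodes until the second node's slots are used up.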
Any directive can include qualifiers by adding a colon (:) and any
combination of one or more of the following (delimited by colons) to
the --map-by option (except where noted):
- PE=n bind n CPUs to each process (cannot be used in combination with rankfile or pe-list directives)
- SPAN load balance the processes across the allocation by treating the allocation as a single “super-node” (cannot be used in combination with slot, node, seq, ppr, rankfile, or pe-list directives)
- OVERSUBSCRIBE allow more processes on a node than processing elements
- NOOVERSUBSCRIBE means !OVERSUBSCRIBE
- NOLOCAL do not launch processes on the same node as prun
- HWTCPUS use hardware threads as CPU slots
- CORECPUS use cores as CPU slots (default)
- INHERIT indicates that a child job (i.e., one spawned from within an application) shall inherit the placement policies of the parent job that spawned it.
- NOINHERIT means !INHERIT
- FILE=<path> (path to file containing sequential or rankfile entries).
- ORDERED only applies to the PE-LIST option to indicate that procs are to be bound to each of the specified CPUs in the order in which they are assigned (i.e., the first proc on a node shall be bound to the first CPU in the list, the second proc shall be bound to the second CPU, etc.)
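
To make the colon-delimited syntax concrete, the following sketch splits a --map-by value into its directive and qualifiers. It is an illustration only and does not reproduce PRRTE's parser: the function name `parse_map_by` and the dictionary layout are hypothetical, no validation of forbidden combinations is attempted, and a FILE=<path> value that itself contains a colon is not handled.

```python
def parse_map_by(value):
    """Split a --map-by value into directive, argument, and qualifiers.

    Hypothetical helper for illustration; not part of PRRTE.
    """
    tokens = value.split(":")
    # The directive may carry its own argument, as in PE-LIST=a,b.
    head, _, head_arg = tokens[0].partition("=")
    directive = head.upper()
    rest = tokens[1:]
    result = {
        "directive": directive,
        "argument": head_arg or None,
        "qualifiers": {},
    }
    # PPR:N:resource consumes two further colon-separated fields.
    if directive == "PPR":
        if len(rest) < 2:
            raise ValueError("PPR requires a count and a resource type")
        result["argument"] = (int(rest[0]), rest[1].upper())
        rest = rest[2:]
    for token in rest:
        name, _, arg = token.partition("=")
        # Qualifier names are case-insensitive; values (e.g. paths) are kept as given.
        result["qualifiers"][name.upper()] = arg if arg else True
    return result


if __name__ == "__main__":
    print(parse_map_by("core:PE=2:oversubscribe"))
    print(parse_map_by("ppr:2:package:nolocal"))
    print(parse_map_by("pe-list=0-3,8:ordered"))
```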
Note
Directives and qualifiers are case-insensitive and can be
shortened to the minimum number of characters to uniquely
identify them. Thus, L1CACHE can be given as l1cache or
simply as L1.
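
The abbreviation rule in the note can be sketched as a case-insensitive unique-prefix lookup. This is an assumption about how the stated rule could be modeled, not a copy of PRRTE's matching code; the `resolve_directive` name is hypothetical, and only the directive names listed earlier are considered.

```python
DIRECTIVES = (
    "SLOT", "HWTHREAD", "CORE", "L1CACHE", "L2CACHE", "L3CACHE",
    "NUMA", "PACKAGE", "NODE", "SEQ", "PPR", "RANKFILE", "PE-LIST",
)


def resolve_directive(text):
    """Return the full directive name for a case-insensitive unique prefix."""
    key = text.upper()
    if key in DIRECTIVES:
        return key
    matches = [name for name in DIRECTIVES if name.startswith(key)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown directive: {text!r}")
    raise ValueError(f"ambiguous directive {text!r}: matches {matches}")


if __name__ == "__main__":
    print(resolve_directive("l1cache"))
    print(resolve_directive("L1"))
```

Under this model, a prefix such as `s` is rejected because it matches both SLOT and SEQ.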
The type of CPU (core vs hwthread) used in the mapping algorithm is determined as follows:
- by user directive on the command line via the HWTCPUS qualifier to the --map-by directive
- by setting the rmaps_default_mapping_policy MCA parameter to include the HWTCPUS qualifier. This parameter sets the default value for a PRRTE DVM; qualifiers are carried across to DVM jobs started via prun unless overridden by the user’s command line
- defaults to CORE in topologies where core CPUs are defined, and to hwthreads otherwise.
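
One way to read the rules above is as a precedence chain: command line first, then the MCA default, then the topology fallback. The sketch below encodes that reading. The precedence order is inferred from the phrase "unless overridden by the user's command line" and is an assumption, as are the function name and its arguments; it is not PRRTE code.

```python
def select_cpu_type(cmdline_qualifiers, default_policy_qualifiers, topology_has_cores):
    """Pick 'core' or 'hwthread' following the assumed precedence chain.

    cmdline_qualifiers: qualifiers given to --map-by on the command line.
    default_policy_qualifiers: qualifiers from rmaps_default_mapping_policy.
    topology_has_cores: whether the topology defines core CPUs.
    """
    cmdline = {q.upper() for q in cmdline_qualifiers}
    defaults = {q.upper() for q in default_policy_qualifiers}
    # The command line is consulted before the MCA default.
    for source in (cmdline, defaults):
        if "HWTCPUS" in source:
            return "hwthread"
        if "CORECPUS" in source:
            return "core"
    # No explicit request: fall back on what the topology provides.
    return "core" if topology_has_cores else "hwthread"


if __name__ == "__main__":
    print(select_cpu_type(["hwtcpus"], [], True))
    print(select_cpu_type(["corecpus"], ["hwtcpus"], True))
    print(select_cpu_type([], [], False))
```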
If your application uses threads, then you probably want to ensure that
you are either not bound at all (by specifying --bind-to none), or
bound to multiple cores using an appropriate binding level or specific
number of processing elements per application process via the PE=#
qualifier to the --map-by command line directive.
A more detailed description of the mapping, ranking, and binding
procedure can be obtained via the --help placement option.