.\" Copyright (c) 2008 Silicon Graphics, Inc.
.\"
.\" Author: Paul Jackson (http://oss.sgi.com/projects/cpusets)
.\"
.\" %%%LICENSE_START(GPLv2_MISC)
.\" This is free documentation; you can redistribute it and/or
.\" modify it under the terms of the GNU General Public License
.\" version 2 as published by the Free Software Foundation.
.\"
.\" The GNU General Public License's references to "object code"
.\" and "executables" are to be interpreted as the output of any
.\" document formatting or typesetting system, including
.\" intermediate and printed output.
.\"
.\" This manual is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public
.\" License along with this manual; if not, see
.\" <http://www.gnu.org/licenses/>.
.\" %%%LICENSE_END
.\"
.TH CPUSET 7 2017-09-15 "Linux" "Linux Programmer's Manual"
.SH NAME
cpuset \- confine processes to processor and memory node subsets
.SH DESCRIPTION
The cpuset filesystem is a pseudo-filesystem interface
to the kernel cpuset mechanism,
which is used to control the processor placement
and memory placement of processes.
It is commonly mounted at
.IR /dev/cpuset .
.PP
On systems with kernels compiled with built-in support for cpusets,
all processes are attached to a cpuset, and cpusets are always present.
If a system supports cpusets, then it will have the entry
.B nodev cpuset
in the file
.IR /proc/filesystems .
By mounting the cpuset filesystem (see the
.B EXAMPLE
section below),
the administrator can configure the cpusets on a system
to control the processor and memory placement of processes
on that system.
By default, if the cpuset configuration
on a system is not modified or if the cpuset filesystem
is not even mounted, then the cpuset mechanism,
though present, has no effect on the system's behavior.
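.PP
As a minimal sketch of the check described above,
a script might look for the
.B nodev cpuset
entry before trying to mount the filesystem.
The function name
.I has_cpuset_fs
and its optional file argument are illustrative
and not part of any existing tool.
.PP
.in +4n
.EX
```shell
# Report whether a filesystems table lists the cpuset filesystem.
# $1: file to inspect (hypothetical parameter; defaults to
#     /proc/filesystems).
has_cpuset_fs() {
    fs_table=${1:-/proc/filesystems}
    # Succeed only if some line has exactly the fields "nodev cpuset".
    awk '$1 == "nodev" && $2 == "cpuset" { found = 1 }
         END { exit !found }' "$fs_table"
}
```
.EE
.in
.PP
The function returns success (0) when the entry is present
and failure otherwise, so it can be used directly in an
.I if
statement.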
.PP
A cpuset defines a list of CPUs and memory nodes.
.PP
The CPUs of a system include all the logical processing
units on which a process can execute, including, if present,
multiple processor cores within a package and Hyper-Threads
within a processor core.
Memory nodes include all distinct
banks of main memory; small and SMP systems typically have
just one memory node that contains all the system's main memory,
while NUMA (non-uniform memory access) systems have multiple memory nodes.
.PP
Cpusets are represented as directories in a hierarchical
pseudo-filesystem, where the top directory in the hierarchy
.RI ( /dev/cpuset )
represents the entire system (all online CPUs and memory nodes)
and any cpuset that is the child (descendant) of
another parent cpuset contains a subset of that parent's
CPUs and memory nodes.
The directories and files representing cpusets have normal
filesystem permissions.
.PP
Every process in the system belongs to exactly one cpuset.
A process is confined to run only on the CPUs in
the cpuset it belongs to, and to allocate memory only
on the memory nodes in that cpuset.
When a process
.BR fork (2)s,
the child process is placed in the same cpuset as its parent.
With sufficient privilege, a process may be moved from one
cpuset to another and the allowed CPUs and memory nodes
of an existing cpuset may be changed.
.PP
When the system begins booting, a single cpuset is
defined that includes all CPUs and memory nodes on the
system, and all processes are in that cpuset.
During the boot process, or later during normal system operation,
other cpusets may be created, as subdirectories of this top cpuset,
under the control of the system administrator,
and processes may be placed in these other cpusets.
.PP
Cpusets are integrated with the
.BR sched_setaffinity (2)
scheduling affinity mechanism and the
.BR mbind (2)
and
.BR set_mempolicy (2)
memory-placement mechanisms in the kernel.
Neither of these mechanisms lets a process make use
of a CPU or memory node that is not allowed by that process's cpuset.
If changes to a process's cpuset placement conflict with these
other mechanisms, then cpuset placement is enforced
even if it means overriding these other mechanisms.
The kernel accomplishes this overriding by silently
restricting the CPUs and memory nodes requested by
these other mechanisms to those allowed by the
invoking process's cpuset.
This can result in these
other calls returning an error if, for example, such
a call ends up requesting an empty set of CPUs or
memory nodes, after that request is restricted to
the invoking process's cpuset.
.PP
Typically, a cpuset is used to manage
the CPU and memory-node confinement for a set of
cooperating processes such as a batch scheduler job, and these
other mechanisms are used to manage the placement of
individual processes or memory regions within that set or job.
.SH FILES
Each directory below
.I /dev/cpuset
represents a cpuset and contains a fixed set of pseudo-files
describing the state of that cpuset.
.PP
New cpusets are created using the
.BR mkdir (2)
system call or the
.BR mkdir (1)
command.
The properties of a cpuset, such as its flags, allowed
CPUs and memory nodes, and attached processes, are queried and modified
by reading or writing to the appropriate file in that cpuset's directory,
as listed below.
.PP
The pseudo-files in each cpuset directory are automatically created when
the cpuset is created, as a result of the
.BR mkdir (2)
invocation.
It is not possible to directly add or remove these pseudo-files.
.PP
A cpuset directory that contains no child cpuset directories,
and has no attached processes, can be removed using
.BR rmdir (2)
or
.BR rmdir (1).
It is not necessary, or possible,
to remove the pseudo-files inside the directory before removing it.
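.PP
The creation and removal steps above can be sketched as two small
shell functions.
The function names and their parameters are hypothetical,
and the sketch assumes the
.IR cpuset. -prefixed
file names used in the list below.
.PP
.in +4n
.EX
```shell
# Create a child cpuset and give it CPUs and memory nodes.
# All names are illustrative: $1 is the parent cpuset directory,
# $2 the new cpuset name, $3 a CPU list, $4 a memory-node list.
make_cpuset() {
    parent=$1 name=$2 cpus=$3 mems=$4
    # mkdir creates the cpuset; the kernel populates its pseudo-files.
    mkdir "$parent/$name" || return 1
    # Each property is set by writing text to the matching file.
    echo "$cpus" > "$parent/$name/cpuset.cpus" || return 1
    echo "$mems" > "$parent/$name/cpuset.mems" || return 1
}

# Remove a cpuset that has no children and no attached processes.
# rmdir alone is enough; the pseudo-files need not be deleted first.
remove_cpuset() {
    rmdir "$1"
}
```
.EE
.in
.PP
For instance,
.I make_cpuset /dev/cpuset batch1 0\-3 0
would create
.I /dev/cpuset/batch1
with CPUs 0 through 3 and memory node 0.
Depending on how the hierarchy is mounted,
the files may appear without the
.I cpuset.
prefix, in which case the names in the sketch need adjusting.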
.PP
The pseudo-files in each cpuset directory are
small text files that may be read and
written using traditional shell utilities such as
.BR cat (1)
and
.BR echo (1),
or from a program by using file I/O library functions or system calls,
such as
.BR open (2),
.BR read (2),
.BR write (2),
and
.BR close (2).
.PP
The pseudo-files in a cpuset directory represent internal kernel
state and do not have any persistent image on disk.
Each of these per-cpuset files is listed and described below.
.\" ====================== tasks ======================
.TP
.I tasks
List of the process IDs (PIDs) of the processes in that cpuset.
The list is formatted as a series of ASCII
decimal numbers, each followed by a newline.
A process may be added to a cpuset (automatically removing
it from the cpuset that previously contained it) by writing its
PID to that cpuset's
.I tasks
file (with or without a trailing newline).
.IP
.B Warning:
only one PID may be written to the
.I tasks
file at a time.
If a string is written that contains more
than one PID, only the first one will be used.
.\" =================== notify_on_release ===================
.TP
.I notify_on_release
Flag (0 or 1).
If set (1), that cpuset will receive special handling
after it is released, that is, after all processes cease using
it (i.e., terminate or are moved to a different cpuset)
and all child cpuset directories have been removed.
See the \fBNotify On Release\fR section, below.
.\" ====================== cpus ======================
.TP
.I cpuset.cpus
List of the physical numbers of the CPUs on which processes
in that cpuset are allowed to execute.
See \fBList Format\fR below for a description of the
format of
.IR cpus .
.IP
The CPUs allowed to a cpuset may be changed by
writing a new list to its
.I cpus
file.
.\" ==================== cpu_exclusive ====================
.TP
.I cpuset.cpu_exclusive
Flag (0 or 1).
If set (1), the cpuset has exclusive use of
its CPUs (no sibling or cousin cpuset may overlap CPUs).
By default, this is off (0).
Newly created cpusets also initially default this to off (0).
.IP
Two cpusets are
.I sibling
cpusets if they share the same parent cpuset in the
.I /dev/cpuset
hierarchy.
Two cpusets are
.I cousin
cpusets if neither is the ancestor of the other.
Regardless of the
.I cpu_exclusive
setting, if one cpuset is the ancestor of another,
and if both of these cpusets have nonempty
.IR cpus ,
then their
.I cpus
must overlap, because the
.I cpus
of any cpuset are always a subset of the
.I cpus
of its parent cpuset.
.\" ====================== mems ======================
.TP
.I cpuset.mems
List of memory nodes on which processes in this cpuset are
allowed to allocate memory.
See \fBList Format\fR below for a description of the
format of
.IR mems .
.\" ==================== mem_exclusive ====================
.TP
.I cpuset.mem_exclusive
Flag (0 or 1).
If set (1), the cpuset has exclusive use of
its memory nodes (no sibling or cousin may overlap).
Also if set (1), the cpuset is a \fBHardwall\fR cpuset (see below).
By default, this is off (0).
Newly created cpusets also initially default this to off (0).
.IP
Regardless of the
.I mem_exclusive
setting, if one cpuset is the ancestor of another,
then their memory nodes must overlap, because the memory
nodes of any cpuset are always a subset of the memory nodes
of that cpuset's parent cpuset.
.\" ==================== mem_hardwall ====================
.TP
.IR cpuset.mem_hardwall " (since Linux 2.6.26)"
Flag (0 or 1).
If set (1), the cpuset is a \fBHardwall\fR cpuset (see below).
Unlike \fBmem_exclusive\fR,
there is no constraint on whether cpusets
marked \fBmem_hardwall\fR may have overlapping
memory nodes with sibling or cousin cpusets.
By default, this is off (0).
Newly created cpusets also initially default this to off (0).
.\" ==================== memory_migrate ====================
.TP
.IR cpuset.memory_migrate " (since Linux 2.6.16)"
Flag (0 or 1).
If set (1), then memory migration is enabled.
By default, this is off (0).
See the \fBMemory Migration\fR section, below.
.\" ==================== memory_pressure ====================
.TP
.IR cpuset.memory_pressure " (since Linux 2.6.16)"
A measure of how much memory pressure the processes in this
cpuset are causing.
See the \fBMemory Pressure\fR section, below.
Unless
.I memory_pressure_enabled
is enabled, this file always has the value zero (0).
This file is read-only.
See the
.B WARNINGS
section, below.
.\" ================= memory_pressure_enabled =================
.TP
.IR cpuset.memory_pressure_enabled " (since Linux 2.6.16)"
Flag (0 or 1).
This file is present only in the root cpuset, normally
.IR /dev/cpuset .
If set (1), the
.I memory_pressure
calculations are enabled for all cpusets in the system.
By default, this is off (0).
See the
\fBMemory Pressure\fR section, below.
.\" ================== memory_spread_page ==================
.TP
.IR cpuset.memory_spread_page " (since Linux 2.6.17)"
Flag (0 or 1).
If set (1), pages in the kernel page cache
(filesystem buffers) are uniformly spread across the cpuset.
By default, this is off (0) in the top cpuset,
and inherited from the parent cpuset in
newly created cpusets.
See the \fBMemory Spread\fR section, below.
.\" ================== memory_spread_slab ==================
.TP
.IR cpuset.memory_spread_slab " (since Linux 2.6.17)"
Flag (0 or 1).
If set (1), the kernel slab caches
for file I/O (directory and inode structures) are
uniformly spread across the cpuset.
By default, this is off (0) in the top cpuset,
and inherited from the parent cpuset in
newly created cpusets.
See the \fBMemory Spread\fR section, below.
.\" ================== sched_load_balance ==================
.TP
.IR cpuset.sched_load_balance " (since Linux 2.6.24)"
Flag (0 or 1).
If set (1, the default), the kernel will
automatically load balance processes in that cpuset over
the allowed CPUs in that cpuset.
If cleared (0), the
kernel will avoid load balancing processes in this cpuset,
.I unless
some other cpuset with overlapping CPUs has its
.I sched_load_balance
flag set.
See \fBScheduler Load Balancing\fR, below, for further details.
.\" ================== sched_relax_domain_level ==================
.TP
.IR cpuset.sched_relax_domain_level " (since Linux 2.6.26)"
Integer, between \-1 and a small positive value.
The
.I sched_relax_domain_level
controls the width of the range of CPUs over which the kernel scheduler
performs immediate rebalancing of runnable tasks across CPUs.
If
.I sched_load_balance
is disabled, then the setting of
.I sched_relax_domain_level
does not matter, as no such load balancing is done.
If
.I sched_load_balance
is enabled, then the higher the value of the
.IR sched_relax_domain_level ,
the wider
the range of CPUs over which immediate load balancing is attempted.
See \fBScheduler Relax Domain Level\fR, below, for further details.
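.PP
Because the
.I tasks
file accepts only one PID per write,
a script that moves several processes has to write them one at a time.
The following is a minimal sketch of that loop;
the function name
.I attach_pids
and its parameters are illustrative.
.PP
.in +4n
.EX
```shell
# Attach each given PID to a cpuset, one write per PID.
# $1: cpuset directory (illustrative parameter); remaining args: PIDs.
attach_pids() {
    dir=$1
    shift
    for pid in "$@"; do
        # A separate write for each PID: if several PIDs were sent
        # in one string, only the first would be used.
        echo "$pid" > "$dir/tasks" || return 1
    done
}
```
.EE
.in
.PP
The loop stops at the first write that fails,
for example because the caller lacks permission
or the process has already exited.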
.\" ================== proc cpuset ==================
.PP
In addition to the above pseudo-files in each directory below
.IR /dev/cpuset ,
each process has a pseudo-file,
.IR /proc/<pid>/cpuset ,
that displays the path of the process's cpuset directory
relative to the root of the cpuset filesystem.
.\" ================== proc status ==================
.PP
Also the
.I /proc/<pid>/status
file for each process has four added lines,
displaying the process's
.I Cpus_allowed
(on which CPUs it may be scheduled) and
.I Mems_allowed
(on which memory nodes it may obtain memory),
in the two formats \fBMask Format\fR and \fBList Format\fR (see below)
as shown in the following example:
.PP
.in +4n
.EX
Cpus_allowed:   ffffffff,ffffffff,ffffffff,ffffffff
Cpus_allowed_list:     0\-127
Mems_allowed:   ffffffff,ffffffff
Mems_allowed_list:     0\-63
.EE
.in
.PP
The "allowed" fields were added in Linux 2.6.24;
the "allowed_list" fields were added in Linux 2.6.26.
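.PP
A script can pick one of these fields out of the status file with
.BR awk (1).
The sketch below assumes the field layout shown in the example above;
the function name
.I allowed_list
and its optional file argument are hypothetical.
.PP
.in +4n
.EX
```shell
# Print the value of one field from a status file.
# $1: field name, e.g. Cpus_allowed_list or Mems_allowed_list
# $2: status file (hypothetical parameter; defaults to
#     /proc/self/status)
allowed_list() {
    field=$1
    status_file=${2:-/proc/self/status}
    # Field names in the file end with a colon; match the name exactly
    # so that Cpus_allowed does not also match Cpus_allowed_list.
    awk -v key="$field:" '$1 == key { print $2 }' "$status_file"
}
```
.EE
.in
.PP
Called as
.IR "allowed_list Cpus_allowed_list" ,
the function prints the list-format CPU set of the calling shell.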
.\" ================== EXTENDED CAPABILITIES ==================
.SH EXTENDED CAPABILITIES
In addition to controlling which
.I cpus
and
.I mems
a process is allowed to use, cpusets provide the following
extended capabilities.
.\" ================== Exclusive Cpusets ==================
.SS Exclusive cpusets
If a cpuset is marked
.I cpu_exclusive
or
.IR mem_exclusive ,
no other cpuset, other than a direct ancestor or descendant,
may share any of the same CPUs or memory nodes.
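.PP
A management script might test two candidate lists for overlap
before marking a cpuset exclusive.
The sketch below is one way to do that in the shell.
It assumes the comma-separated form of numbers and ranges
(for example, 0\-4,9) described under \fBList Format\fR,
and the function names
.I expand_list
and
.I lists_overlap
are illustrative.
.PP
.in +4n
.EX
```shell
# Expand a list such as "0-2,7" into "0 1 2 7".
expand_list() {
    out=""
    old_ifs=$IFS
    IFS=,
    set -- $1            # split the list at commas
    IFS=$old_ifs
    for part in "$@"; do
        lo=${part%-*}    # text before the dash (or the whole number)
        hi=${part#*-}    # text after the dash (or the whole number)
        i=$lo
        while [ "$i" -le "$hi" ]; do
            out="$out $i"
            i=$((i + 1))
        done
    done
    echo "${out# }"
}

# Succeed if two lists share at least one number.
lists_overlap() {
    for a in $(expand_list "$1"); do
        for b in $(expand_list "$2"); do
            [ "$a" -eq "$b" ] && return 0
        done
    done
    return 1
}
```
.EE
.in
.PP
The comparison is quadratic in the number of CPUs or nodes,
which is acceptable for a sketch but may be slow on very large systems.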
.PP
A cpuset that is
.I mem_exclusive
restricts kernel allocations for
buffer cache pages and other internal kernel data pages
commonly shared by the kernel across
multiple users.
All cpusets, whether
.I mem_exclusive
or not, restrict allocations of memory for user space.
This enables configuring a
system so that several independent jobs can share common kernel data,
while isolating each job's user allocation in
its own cpuset.
To do this, construct a large
.I mem_exclusive
cpuset to hold all the jobs, and construct child,
.RI non- mem_exclusive
cpusets for each individual job.
Only a small amount of kernel memory,
such as requests from interrupt handlers, is allowed to be
placed on memory nodes
outside even a
.I mem_exclusive
cpuset.
.\" ================== Hardwall ==================
.SS Hardwall
A cpuset that has
.I mem_exclusive
or
.I mem_hardwall
set is a
.I hardwall
cpuset.
A
.I hardwall
cpuset restricts kernel allocations for page, buffer,
and other data commonly shared by the kernel across multiple users.
All cpusets, whether
.I hardwall
or not, restrict allocations of memory for user space.
.PP
This enables configuring a system so that several independent
jobs can share common kernel data, such as filesystem pages,
while isolating each job's user allocation in its own cpuset.
To do this, construct a large
.I hardwall
cpuset to hold
all the jobs, and construct child cpusets for each individual
job which are not
.I hardwall
cpusets.
.PP
Only a small amount of kernel memory, such as requests from
interrupt handlers, is allowed to be taken outside even a
.I hardwall
cpuset.
.\" ================== Notify On Release ==================
.SS Notify on release
If the
.I notify_on_release
flag is enabled (1) in a cpuset,
then whenever the last process in the cpuset leaves
(exits or attaches to some other cpuset)
and the last child cpuset of that cpuset is removed,
the kernel will run the command
.IR /sbin/cpuset_release_agent ,
supplying the pathname (relative to the mount point of the
cpuset filesystem) of the abandoned cpuset.
This enables automatic removal of abandoned cpusets.
.PP
The default value of
.I notify_on_release
in the root cpuset at system boot is disabled (0).
The default value of other cpusets at creation
is the current value of their parent's
.I notify_on_release
setting.
.PP
The command
.I /sbin/cpuset_release_agent
is invoked with the name
.RI ( /dev/cpuset
relative path)
of the to-be-released cpuset in
.IR argv[1] .
.PP
The usual contents of the command
.I /sbin/cpuset_release_agent
are simply the shell script:
.PP
.in +4n
.EX
#!/bin/sh
rmdir "/dev/cpuset/$1"
.EE
.in
.PP
As with other flag values below, this flag can
be changed by writing an ASCII
number 0 or 1 (with optional trailing newline)
into the file, to clear or set the flag, respectively.
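.PP
Writing a flag can be wrapped in a small helper that rejects
anything other than 0 or 1 before touching the file.
This is only a sketch;
.I set_cpuset_flag
and its three parameters are hypothetical names.
.PP
.in +4n
.EX
```shell
# Set or clear a flag file in a cpuset directory.
# $1: cpuset directory, $2: flag file name, $3: 0 or 1
# (all three parameters are illustrative).
set_cpuset_flag() {
    case $3 in
        0|1)
            echo "$3" > "$1/$2"
            ;;
        *)
            echo "flag value must be 0 or 1" >&2
            return 1
            ;;
    esac
}
```
.EE
.in
.PP
For example,
.I set_cpuset_flag /dev/cpuset/batch1 notify_on_release 1
would request release notification for a cpuset named
.IR batch1 .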
.\" ================== Memory Pressure ==================
.SS Memory pressure
The
.I memory_pressure
of a cpuset provides a simple per-cpuset running average of
the rate at which the processes in a cpuset are attempting to free up in-use
memory on the nodes of the cpuset to satisfy additional memory requests.
.PP
This enables batch managers that are monitoring jobs running in dedicated
cpusets to efficiently detect the level of memory pressure each job
is causing.
.PP
This is useful both on tightly managed systems running a wide mix of
submitted jobs, which may choose to terminate or reprioritize jobs that
are trying to use more memory than allowed on the nodes assigned to them,
and with tightly coupled, long-running, massively parallel scientific
computing jobs that will dramatically fail to meet required performance
goals if they start to use more memory than allowed to them.
.PP
This mechanism provides a very economical way for the batch manager
to monitor a cpuset for signs of memory pressure.
It's up to the batch manager or other user code to decide
what action to take if it detects signs of memory pressure.
.PP
Unless memory pressure calculation is enabled by setting the pseudo-file
.IR /dev/cpuset/cpuset.memory_pressure_enabled ,
it is not computed for any cpuset, and reads from any
.I memory_pressure
file always return zero, as represented by the ASCII string "0\en".
See the \fBWARNINGS\fR section, below.
.PP
A per-cpuset running average is employed for the following reasons:
.IP * 3
Because this meter is per-cpuset rather than per-process or per virtual
memory region, the system load imposed by a batch scheduler monitoring
this metric is sharply reduced on large systems, because a scan of
the tasklist can be avoided on each set of queries.
.IP *
Because this meter is a running average rather than an accumulating
counter, a batch scheduler can detect memory pressure with a
single read, instead of having to read and accumulate results
for a period of time.
.IP *
Because this meter is per-cpuset rather than per-process,
the batch scheduler can obtain the key information\(emmemory
pressure in a cpuset\(emwith a single read, rather than having to
query and accumulate results over the entire (dynamically changing)
set of processes in the cpuset.
.PP
The
.I memory_pressure
of a cpuset is calculated using a per-cpuset simple digital filter
that is kept within the kernel.
For each cpuset, this filter tracks
the recent rate at which processes attached to that cpuset enter the
kernel direct reclaim code.
.PP
The kernel direct reclaim code is entered whenever a process has to
satisfy a memory page request by first finding some other page to
repurpose, due to lack of any readily available already free pages.
Dirty filesystem pages are repurposed by first writing them
to disk.
Unmodified filesystem buffer pages are repurposed
by simply dropping them, though if that page is needed again, it
will have to be reread from disk.
.PP
The
.I cpuset.memory_pressure
file provides an integer number representing the recent (half-life of
10 seconds) rate of entries to the direct reclaim code caused by any
process in the cpuset, in units of reclaims attempted per second,
times 1000.
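.PP
Since the value read is the rate multiplied by 1000,
a reading of 2500 corresponds to 2.5 attempted reclaims per second.
The conversion can be sketched with integer shell arithmetic;
the function name
.I pressure_to_rate
is illustrative.
.PP
.in +4n
.EX
```shell
# Convert a raw memory_pressure reading to reclaims per second.
# $1: non-negative integer read from the file (rate times 1000).
pressure_to_rate() {
    whole=$(( $1 / 1000 ))    # whole reclaims per second
    frac=$(( $1 % 1000 ))     # remaining thousandths
    printf '%d.%03d' "$whole" "$frac"
    echo
}
```
.EE
.in
.PP
A monitoring script would typically pass the contents of a cpuset's
.I cpuset.memory_pressure
file as the argument.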
.\" ================== Memory Spread ==================
.SS Memory spread
There are two Boolean flag files per cpuset that control where the
kernel allocates pages for the filesystem buffers and related
in-kernel data structures.
They are called
.I cpuset.memory_spread_page
and
.IR cpuset.memory_spread_slab .
.PP
If the per-cpuset Boolean flag file
.I cpuset.memory_spread_page
is set, then
the kernel will spread the filesystem buffers (page cache) evenly
over all the nodes that the faulting process is allowed to use, instead
of preferring to put those pages on the node where the process is running.
.PP
If the per-cpuset Boolean flag file
.I cpuset.memory_spread_slab
is set,
then the kernel will spread some filesystem-related slab caches,
such as those for inodes and directory entries, evenly over all the nodes
that the faulting process is allowed to use, instead of preferring to
put those pages on the node where the process is running.
.PP
The setting of these flags does not affect the data segment
(see
.BR brk (2))
or stack segment pages of a process.
.PP
By default, both kinds of memory spreading are off and the kernel
prefers to allocate memory pages on the node local to where the
requesting process is running.
If that node is not allowed by the
process's NUMA memory policy or cpuset configuration or if there are
insufficient free memory pages on that node, then the kernel looks
for the nearest node that is allowed and has sufficient free memory.
.PP
When new cpusets are created, they inherit the memory spread settings
of their parent.
.PP
Setting memory spreading causes allocations for the affected page or
slab caches to ignore the process's NUMA memory policy and be spread
instead.
However, the effect of these changes in memory placement
caused by cpuset-specified memory spreading is hidden from the
.BR mbind (2)
or
.BR set_mempolicy (2)
calls.
These two NUMA memory policy calls always appear to behave as if
no cpuset-specified memory spreading is in effect, even if it is.
If cpuset memory spreading is subsequently turned off, the NUMA
memory policy most recently specified by these calls is automatically
reapplied.
Packit 7cfc04
.PP
Packit 7cfc04
Both
Packit 7cfc04
.I cpuset.memory_spread_page
Packit 7cfc04
and
Packit 7cfc04
.I cpuset.memory_spread_slab
Packit 7cfc04
are Boolean flag files.
Packit 7cfc04
By default, they contain "0", meaning that the feature is off
Packit 7cfc04
for that cpuset.
Packit 7cfc04
If a "1" is written to that file, that turns the named feature on.
Packit 7cfc04
.PP
Packit 7cfc04
Cpuset-specified memory spreading behaves similarly to what is known
Packit 7cfc04
(in other contexts) as round-robin or interleave memory placement.
Packit 7cfc04
.PP
Packit 7cfc04
Cpuset-specified memory spreading can provide substantial performance
Packit 7cfc04
improvements for jobs that:
Packit 7cfc04
.IP a) 3
Packit 7cfc04
need to place thread-local data on
Packit 7cfc04
memory nodes close to the CPUs which are running the threads that most
Packit 7cfc04
frequently access that data; but also
Packit 7cfc04
.IP b)
Packit 7cfc04
need to access large filesystem data sets that must be spread
Packit 7cfc04
across the several nodes in the job's cpuset in order to fit.
Packit 7cfc04
.PP
Packit 7cfc04
Without this policy,
Packit 7cfc04
the memory allocation across the nodes in the job's cpuset
Packit 7cfc04
can become very uneven,
Packit 7cfc04
especially for jobs that might have just a single
Packit 7cfc04
thread initializing or reading in the data set.
Packit 7cfc04
.\" ================== Memory Migration ==================
Packit 7cfc04
.SS Memory migration
Packit 7cfc04
Normally, under the default setting (disabled) of
Packit 7cfc04
.IR cpuset.memory_migrate ,
Packit 7cfc04
once a page is allocated (given a physical page
of main memory), that page stays on the node on which it
was allocated for as long as it remains allocated, even if the
Packit 7cfc04
cpuset's memory-placement policy
Packit 7cfc04
.I mems
Packit 7cfc04
subsequently changes.
Packit 7cfc04
.PP
Packit 7cfc04
When memory migration is enabled in a cpuset, if the
Packit 7cfc04
.I mems
Packit 7cfc04
setting of the cpuset is changed, then any memory page in use by any
Packit 7cfc04
process in the cpuset that is on a memory node that is no longer
Packit 7cfc04
allowed will be migrated to a memory node that is allowed.
Packit 7cfc04
.PP
Packit 7cfc04
Furthermore, if a process is moved into a cpuset with
Packit 7cfc04
.I cpuset.memory_migrate
Packit 7cfc04
enabled, any memory pages it uses that were on memory nodes allowed
Packit 7cfc04
in its previous cpuset, but which are not allowed in its new cpuset,
Packit 7cfc04
will be migrated to a memory node allowed in the new cpuset.
Packit 7cfc04
.PP
Packit 7cfc04
The relative placement of a migrated page within
Packit 7cfc04
the cpuset is preserved during these migration operations if possible.
Packit 7cfc04
For example,
Packit 7cfc04
if the page was on the second valid node of the prior cpuset,
Packit 7cfc04
then the page will be placed on the second valid node of the new cpuset,
Packit 7cfc04
if possible.
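.PP
The ordinal mapping in this example can be sketched in shell as follows.
This is an illustration of the documented case only, not the kernel's
algorithm: the function name remap_node is hypothetical, and the case
where the new cpuset has no node at the same position is not modeled
(the sketch simply fails).
```shell
# Illustration only: map a node to the same ordinal position in a new
# node list.  $1 = node, $2 = old nodes, $3 = new nodes (both lists are
# space-separated and in ascending order).
remap_node() {
    node=$1 pos=0 found=0
    for n in $2; do
        pos=$((pos + 1))
        if [ "$n" -eq "$node" ]; then found=$pos; break; fi
    done
    [ "$found" -gt 0 ] || return 1      # node was not in the old list
    i=0
    for n in $3; do
        i=$((i + 1))
        if [ "$i" -eq "$found" ]; then echo "$n"; return 0; fi
    done
    return 1                            # no such position in the new list
}
```
.PP
With old nodes "2 3" and new nodes "8 9", node 3 (the second valid node)
maps to node 9.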
Packit 7cfc04
.\" ================== Scheduler Load Balancing ==================
Packit 7cfc04
.SS Scheduler load balancing
Packit 7cfc04
The kernel scheduler automatically load balances processes.
Packit 7cfc04
If one CPU is underutilized,
Packit 7cfc04
the kernel will look for processes on other more
Packit 7cfc04
overloaded CPUs and move those processes to the underutilized CPU,
Packit 7cfc04
within the constraints of such placement mechanisms as cpusets and
Packit 7cfc04
.BR sched_setaffinity (2).
Packit 7cfc04
.PP
Packit 7cfc04
The algorithmic cost of load balancing and its impact on key shared
kernel data structures such as the process list increase more than
linearly with the number of CPUs being balanced.
Packit 7cfc04
For example, it
Packit 7cfc04
costs more to load balance across one large set of CPUs than it does
Packit 7cfc04
to balance across two smaller sets of CPUs, each of half the size
Packit 7cfc04
of the larger set.
Packit 7cfc04
(The precise relationship between the number of CPUs being balanced
Packit 7cfc04
and the cost of load balancing depends
Packit 7cfc04
on implementation details of the kernel process scheduler, which is
Packit 7cfc04
subject to change over time, as improved kernel scheduler algorithms
Packit 7cfc04
are implemented.)
Packit 7cfc04
.PP
Packit 7cfc04
The per-cpuset flag
Packit 7cfc04
.I sched_load_balance
Packit 7cfc04
provides a mechanism to suppress this automatic scheduler load
Packit 7cfc04
balancing in cases where it is not needed and suppressing it would have
Packit 7cfc04
worthwhile performance benefits.
Packit 7cfc04
.PP
Packit 7cfc04
By default, load balancing is done across all CPUs, except those
Packit 7cfc04
marked isolated using the kernel boot time "isolcpus=" argument.
Packit 7cfc04
(See \fBScheduler Relax Domain Level\fR, below, to change this default.)
Packit 7cfc04
.PP
Packit 7cfc04
This default load balancing across all CPUs is not well suited to
Packit 7cfc04
the following two situations:
Packit 7cfc04
.IP * 3
Packit 7cfc04
On large systems, load balancing across many CPUs is expensive.
Packit 7cfc04
If the system is managed using cpusets to place independent jobs
Packit 7cfc04
on separate sets of CPUs, full load balancing is unnecessary.
Packit 7cfc04
.IP *
Packit 7cfc04
Systems supporting real-time on some CPUs need to minimize
Packit 7cfc04
system overhead on those CPUs, including avoiding process load
Packit 7cfc04
balancing if that is not needed.
Packit 7cfc04
.PP
Packit 7cfc04
When the per-cpuset flag
Packit 7cfc04
.I sched_load_balance
Packit 7cfc04
is enabled (the default setting),
Packit 7cfc04
it requests load balancing across
Packit 7cfc04
all the CPUs in that cpuset's allowed CPUs,
Packit 7cfc04
ensuring that load balancing can move a process (not otherwise pinned,
Packit 7cfc04
as by
Packit 7cfc04
.BR sched_setaffinity (2))
Packit 7cfc04
from any CPU in that cpuset to any other.
Packit 7cfc04
.PP
Packit 7cfc04
When the per-cpuset flag
Packit 7cfc04
.I sched_load_balance
Packit 7cfc04
is disabled, then the
Packit 7cfc04
scheduler will avoid load balancing across the CPUs in that cpuset,
Packit 7cfc04
\fIexcept\fR in so far as is necessary because some overlapping cpuset
Packit 7cfc04
has
Packit 7cfc04
.I sched_load_balance
Packit 7cfc04
enabled.
Packit 7cfc04
.PP
Packit 7cfc04
So, for example, if the top cpuset has the flag
Packit 7cfc04
.I sched_load_balance
Packit 7cfc04
enabled, then the scheduler will load balance across all
Packit 7cfc04
CPUs, and the setting of the
Packit 7cfc04
.I sched_load_balance
Packit 7cfc04
flag in other cpusets has no effect,
Packit 7cfc04
as we're already fully load balancing.
Packit 7cfc04
.PP
Packit 7cfc04
Therefore in the above two situations, the flag
Packit 7cfc04
.I sched_load_balance
Packit 7cfc04
should be disabled in the top cpuset, and only some of the smaller,
Packit 7cfc04
child cpusets would have this flag enabled.
Packit 7cfc04
.PP
Packit 7cfc04
When doing this, you don't usually want to leave any unpinned processes in
Packit 7cfc04
the top cpuset that might use nontrivial amounts of CPU, as such processes
Packit 7cfc04
may be artificially constrained to some subset of CPUs, depending on
Packit 7cfc04
the particulars of this flag setting in descendant cpusets.
Packit 7cfc04
Even if such a process could use spare CPU cycles in some other CPUs,
Packit 7cfc04
the kernel scheduler might not consider the possibility of
Packit 7cfc04
load balancing that process to the underused CPU.
Packit 7cfc04
.PP
Packit 7cfc04
Of course, processes pinned to a particular CPU can be left in a cpuset
Packit 7cfc04
that disables
Packit 7cfc04
.I sched_load_balance
Packit 7cfc04
as those processes aren't going anywhere else anyway.
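.PP
The arrangement described above (flag off in the top cpuset, on in
selected children) might be scripted roughly as follows.
This sketch assumes that the flag file is named
cpuset.sched_load_balance, in keeping with the prefix used for other
files on this page; the helper name partition_load_balance is
hypothetical.
```shell
# Sketch: enable load balancing in the named child cpusets, then disable
# it in the top cpuset.
# $1 = cpuset mount point; remaining arguments = child cpuset names.
partition_load_balance() {
    root=$1; shift
    for child in "$@"; do
        /bin/echo 1 > "$root/$child/cpuset.sched_load_balance" || return 1
    done
    /bin/echo 0 > "$root/cpuset.sched_load_balance"
}
```
.PP
The children are handled first so that a failure leaves the top cpuset
unchanged.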
Packit 7cfc04
.\" ================== Scheduler Relax Domain Level ==================
Packit 7cfc04
.SS Scheduler relax domain level
Packit 7cfc04
The kernel scheduler performs immediate load balancing whenever
Packit 7cfc04
a CPU becomes free or another task becomes runnable.
Packit 7cfc04
This load
Packit 7cfc04
balancing works to ensure that as many CPUs as possible are usefully
Packit 7cfc04
employed running tasks.
Packit 7cfc04
The kernel also performs periodic load
Packit 7cfc04
balancing off the software clock described in
Packit 7cfc04
.BR time (7).
Packit 7cfc04
The setting of
Packit 7cfc04
.I sched_relax_domain_level
Packit 7cfc04
applies only to immediate load balancing.
Packit 7cfc04
Regardless of the
Packit 7cfc04
.I sched_relax_domain_level
Packit 7cfc04
setting, periodic load balancing is attempted over all CPUs
Packit 7cfc04
(unless disabled by turning off
Packit 7cfc04
.IR sched_load_balance ).
Packit 7cfc04
In any case, of course, tasks will be scheduled to run only on
Packit 7cfc04
CPUs allowed by their cpuset, as modified by
Packit 7cfc04
.BR sched_setaffinity (2)
Packit 7cfc04
system calls.
Packit 7cfc04
.PP
Packit 7cfc04
On small systems, such as those with just a few CPUs, immediate load
Packit 7cfc04
balancing is useful to improve system interactivity and to minimize
Packit 7cfc04
wasteful idle CPU cycles.
Packit 7cfc04
But on large systems, attempting immediate
Packit 7cfc04
load balancing across a large number of CPUs can be more costly than
Packit 7cfc04
it is worth, depending on the particular performance characteristics
Packit 7cfc04
of the job mix and the hardware.
Packit 7cfc04
.PP
Packit 7cfc04
The exact meaning of the small integer values of
Packit 7cfc04
.I sched_relax_domain_level
Packit 7cfc04
will depend on internal
Packit 7cfc04
implementation details of the kernel scheduler code and on the
Packit 7cfc04
non-uniform architecture of the hardware.
Packit 7cfc04
Both of these will evolve
Packit 7cfc04
over time and vary by system architecture and kernel version.
Packit 7cfc04
.PP
Packit 7cfc04
As of this writing, when this capability was introduced in Linux
Packit 7cfc04
2.6.26, on certain popular architectures, the positive values of
Packit 7cfc04
.I sched_relax_domain_level
Packit 7cfc04
have the following meanings.
Packit 7cfc04
.PP
Packit 7cfc04
.PD 0
Packit 7cfc04
.IP \fB(1)\fR 4
Packit 7cfc04
Perform immediate load balancing across Hyper-Thread
Packit 7cfc04
siblings on the same core.
Packit 7cfc04
.IP \fB(2)\fR
Packit 7cfc04
Perform immediate load balancing across other cores in the same package.
Packit 7cfc04
.IP \fB(3)\fR
Packit 7cfc04
Perform immediate load balancing across other CPUs
Packit 7cfc04
on the same node or blade.
Packit 7cfc04
.IP \fB(4)\fR
Packit 7cfc04
Perform immediate load balancing across several
Packit 7cfc04
(implementation detail) nodes [On NUMA systems].
Packit 7cfc04
.IP \fB(5)\fR
Packit 7cfc04
Perform immediate load balancing across all CPUs
in the system [On NUMA systems].
Packit 7cfc04
.PD
Packit 7cfc04
.PP
Packit 7cfc04
The
Packit 7cfc04
.I sched_relax_domain_level
Packit 7cfc04
value of zero (0) always means
that immediate load balancing is not performed;
load balancing is then done only periodically,
not immediately when a CPU becomes available or another task becomes
runnable.
Packit 7cfc04
.PP
Packit 7cfc04
The
Packit 7cfc04
.I sched_relax_domain_level
Packit 7cfc04
value of minus one (\-1)
Packit 7cfc04
always means use the system default value.
Packit 7cfc04
The system default value can vary by architecture and kernel version.
Packit 7cfc04
This system default value can be changed by the kernel
boot-time "relax_domain_level=" argument.
Packit 7cfc04
.PP
Packit 7cfc04
In the case of multiple overlapping cpusets that have conflicting
.I sched_relax_domain_level
values, the highest such value
Packit 7cfc04
applies to all CPUs in any of the overlapping cpusets.
Packit 7cfc04
In such cases,
Packit 7cfc04
the value \fBminus one (\-1)\fR is the lowest value, overridden by any
Packit 7cfc04
other value, and the value \fBzero (0)\fR is the next lowest value.
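.PP
Because minus one (\-1) ranks below zero (0), which ranks below every
positive level, the rule above amounts to taking a numeric maximum.
The following sketch illustrates only that ordering; the function name
effective_relax_level is hypothetical, and the sketch does not query
the kernel.
```shell
# Illustration: print the level that wins among the conflicting
# sched_relax_domain_level values given as arguments.
effective_relax_level() {
    max=-1                      # lowest-ranked value: system default
    for v in "$@"; do
        if [ "$v" -gt "$max" ]; then max=$v; fi
    done
    echo "$max"
}
```
.PP
For overlapping cpusets with the values \-1, 0, and 3, the sketch
prints 3.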
Packit 7cfc04
.SH FORMATS
Packit 7cfc04
The following formats are used to represent sets of
Packit 7cfc04
CPUs and memory nodes.
Packit 7cfc04
.\" ================== Mask Format ==================
Packit 7cfc04
.SS Mask format
Packit 7cfc04
The \fBMask Format\fR is used to represent CPU and memory-node bit masks
Packit 7cfc04
in the
Packit 7cfc04
.I /proc/<pid>/status
Packit 7cfc04
file.
Packit 7cfc04
.PP
Packit 7cfc04
This format displays each 32-bit
Packit 7cfc04
word in hexadecimal (using ASCII characters "0" - "9" and "a" - "f");
Packit 7cfc04
words are filled with leading zeros, if required.
Packit 7cfc04
For masks longer than one word, a comma separator is used between words.
Packit 7cfc04
Words are displayed in big-endian
Packit 7cfc04
order, which has the most significant bit first.
Packit 7cfc04
The hex digits within a word are also in big-endian order.
Packit 7cfc04
.PP
Packit 7cfc04
The number of 32-bit words displayed is the minimum number needed to
Packit 7cfc04
display all bits of the bit mask, based on the size of the bit mask.
Packit 7cfc04
.PP
Packit 7cfc04
Examples of the \fBMask Format\fR:
Packit 7cfc04
.PP
Packit 7cfc04
.in +4n
Packit 7cfc04
.EX
Packit 7cfc04
00000001                        # just bit 0 set
Packit 7cfc04
40000000,00000000,00000000      # just bit 94 set
Packit 7cfc04
00000001,00000000,00000000      # just bit 64 set
Packit 7cfc04
000000ff,00000000               # bits 32\-39 set
Packit 7cfc04
00000000,000e3862               # 1,5,6,11\-13,17\-19 set
Packit 7cfc04
.EE
Packit 7cfc04
.in
Packit 7cfc04
.PP
Packit 7cfc04
A mask with bits 0, 1, 2, 4, 8, 16, 32, and 64 set displays as:
Packit 7cfc04
.PP
Packit 7cfc04
.in +4n
Packit 7cfc04
.EX
Packit 7cfc04
00000001,00000001,00010117
Packit 7cfc04
.EE
Packit 7cfc04
.in
Packit 7cfc04
.PP
Packit 7cfc04
The first "1" is for bit 64, the
Packit 7cfc04
second for bit 32, the third for bit 16, the fourth for bit 8, the
Packit 7cfc04
fifth for bit 4, and the "7" is for bits 2, 1, and 0.
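.PP
As a rough sketch, a Mask Format string can be decoded in a POSIX
shell as follows.
The function name mask_to_bits is hypothetical, and the sketch assumes
that the shell's arithmetic accepts hexadecimal constants and is wider
than 32 bits, as is typical on 64-bit systems.
```shell
# Print the numbers of the bits set in a Mask Format string, lowest first.
mask_to_bits() {
    mask=$1 base=0 out=
    while [ -n "$mask" ]; do
        word=${mask##*,}            # rightmost word holds the lowest bits
        case $mask in
            *,*) mask=${mask%,*} ;; # drop the word just taken
            *)   mask= ;;           # that was the last word
        esac
        val=$((0x$word)) bit=0
        while [ "$bit" -lt 32 ]; do
            if [ $(( (val >> bit) & 1 )) -eq 1 ]; then
                out="$out${out:+ }$((base + bit))"
            fi
            bit=$((bit + 1))
        done
        base=$((base + 32))
    done
    echo "$out"
}
```
.PP
Given the mask 00000001,00000001,00010117 from the example above,
the sketch prints "0 1 2 4 8 16 32 64".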
Packit 7cfc04
.\" ================== List Format ==================
Packit 7cfc04
.SS List format
Packit 7cfc04
The \fBList Format\fR for
Packit 7cfc04
.I cpus
Packit 7cfc04
and
Packit 7cfc04
.I mems
Packit 7cfc04
is a comma-separated list of CPU or memory-node
Packit 7cfc04
numbers and ranges of numbers, in ASCII decimal.
Packit 7cfc04
.PP
Packit 7cfc04
Examples of the \fBList Format\fR:
Packit 7cfc04
.PP
Packit 7cfc04
.in +4n
Packit 7cfc04
.EX
Packit 7cfc04
0\-4,9           # bits 0, 1, 2, 3, 4, and 9 set
Packit 7cfc04
0\-2,7,12\-14     # bits 0, 1, 2, 7, 12, 13, and 14 set
Packit 7cfc04
.EE
Packit 7cfc04
.in
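.PP
A List Format string can be expanded into individual numbers with a
short shell function such as the one below.
This is a sketch only: the name list_to_numbers is hypothetical, and
no validation is done (a reversed range simply produces no output,
whereas the kernel rejects it).
```shell
# Expand a List Format string into space-separated numbers.
# The body runs in a subshell so that the IFS change stays local.
list_to_numbers() (
    set -f                      # no pathname expansion on the split words
    IFS=,
    out=
    for item in $1; do
        case $item in
            *-*) lo=${item%-*} hi=${item#*-} ;;   # a range such as 12-14
            *)   lo=$item hi=$item ;;             # a single number
        esac
        n=$lo
        while [ "$n" -le "$hi" ]; do
            out="$out${out:+ }$n"
            n=$((n + 1))
        done
    done
    echo "$out"
)
```
.PP
For the second example above, the sketch prints "0 1 2 7 12 13 14".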
Packit 7cfc04
.\" ================== RULES ==================
Packit 7cfc04
.SH RULES
Packit 7cfc04
The following rules apply to each cpuset:
Packit 7cfc04
.IP * 3
Packit 7cfc04
Its CPUs and memory nodes must be a (possibly equal)
Packit 7cfc04
subset of its parent's.
Packit 7cfc04
.IP *
Packit 7cfc04
It can be marked
Packit 7cfc04
.IR cpu_exclusive
Packit 7cfc04
only if its parent is.
Packit 7cfc04
.IP *
Packit 7cfc04
It can be marked
Packit 7cfc04
.IR mem_exclusive
Packit 7cfc04
only if its parent is.
Packit 7cfc04
.IP *
Packit 7cfc04
If it is
Packit 7cfc04
.IR cpu_exclusive ,
Packit 7cfc04
its CPUs may not overlap any sibling.
Packit 7cfc04
.IP *
Packit 7cfc04
If it is
Packit 7cfc04
.IR mem_exclusive ,
Packit 7cfc04
its memory nodes may not overlap any sibling.
Packit 7cfc04
.\" ================== PERMISSIONS ==================
Packit 7cfc04
.SH PERMISSIONS
Packit 7cfc04
The permissions of a cpuset are determined by the permissions
Packit 7cfc04
of the directories and pseudo-files in the cpuset filesystem,
Packit 7cfc04
normally mounted at
Packit 7cfc04
.IR /dev/cpuset .
Packit 7cfc04
.PP
Packit 7cfc04
For instance, a process can put itself in some other cpuset (than
Packit 7cfc04
its current one) if it can write the
Packit 7cfc04
.I tasks
Packit 7cfc04
file for that cpuset.
Packit 7cfc04
This requires execute permission on the encompassing directories
Packit 7cfc04
and write permission on the
Packit 7cfc04
.I tasks
Packit 7cfc04
file.
Packit 7cfc04
.PP
Packit 7cfc04
An additional constraint is applied to requests to place some
Packit 7cfc04
other process in a cpuset.
Packit 7cfc04
One process may not attach another to
Packit 7cfc04
a cpuset unless it would have permission to send that process
Packit 7cfc04
a signal (see
Packit 7cfc04
.BR kill (2)).
Packit 7cfc04
.PP
Packit 7cfc04
A process may create a child cpuset if it can access and write the
Packit 7cfc04
parent cpuset directory.
Packit 7cfc04
It can modify the CPUs or memory nodes
Packit 7cfc04
in a cpuset if it can access that cpuset's directory (execute
Packit 7cfc04
permissions on each of the parent directories) and write the
Packit 7cfc04
corresponding
Packit 7cfc04
.I cpus
Packit 7cfc04
or
Packit 7cfc04
.I mems
Packit 7cfc04
file.
Packit 7cfc04
.PP
Packit 7cfc04
There is one minor difference between the manner in which these
Packit 7cfc04
permissions are evaluated and the manner in which normal filesystem
Packit 7cfc04
operation permissions are evaluated.
Packit 7cfc04
The kernel interprets
Packit 7cfc04
relative pathnames starting at a process's current working directory.
Packit 7cfc04
Even if one is operating on a cpuset file, relative pathnames
Packit 7cfc04
are interpreted relative to the process's current working directory,
Packit 7cfc04
not relative to the process's current cpuset.
Packit 7cfc04
The only ways that
Packit 7cfc04
cpuset paths relative to a process's current cpuset can be used are
Packit 7cfc04
if either the process's current working directory is its cpuset
Packit 7cfc04
(it first did a
Packit 7cfc04
.B cd
Packit 7cfc04
or
Packit 7cfc04
.BR chdir (2)
Packit 7cfc04
to its cpuset directory beneath
Packit 7cfc04
.IR /dev/cpuset ,
Packit 7cfc04
which is a bit unusual)
Packit 7cfc04
or if some user code converts the relative cpuset path to a
Packit 7cfc04
full filesystem path.
Packit 7cfc04
.PP
Packit 7cfc04
In theory, this means that user code should specify cpusets
Packit 7cfc04
using absolute pathnames, which requires knowing the mount point of
Packit 7cfc04
the cpuset filesystem (usually, but not necessarily,
Packit 7cfc04
.IR /dev/cpuset ).
Packit 7cfc04
In practice, all user-level code that this author is aware of
Packit 7cfc04
simply assumes that if the cpuset filesystem is mounted, then
Packit 7cfc04
it is mounted at
Packit 7cfc04
.IR /dev/cpuset .
Packit 7cfc04
Furthermore, it is common practice for carefully written
Packit 7cfc04
user code to verify the presence of the pseudo-file
Packit 7cfc04
.I /dev/cpuset/tasks
Packit 7cfc04
in order to verify that the cpuset pseudo-filesystem
Packit 7cfc04
is currently mounted.
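.PP
The check described above could look like the following in a shell
script.
The function name cpuset_is_mounted is hypothetical; the optional
argument allows for a mount point other than the conventional one.
```shell
# Succeed if a cpuset filesystem appears to be mounted at $1
# (default: /dev/cpuset), judged by the presence of its tasks file.
cpuset_is_mounted() {
    root=${1:-/dev/cpuset}
    [ -e "$root/tasks" ]
}
```
.PP
A script might call "cpuset_is_mounted || exit 1" before touching any
cpuset files.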
Packit 7cfc04
.\" ================== WARNINGS ==================
Packit 7cfc04
.SH WARNINGS
Packit 7cfc04
.SS Enabling memory_pressure
Packit 7cfc04
By default, the per-cpuset file
Packit 7cfc04
.I cpuset.memory_pressure
Packit 7cfc04
always contains zero (0).
Packit 7cfc04
Unless this feature is enabled by writing "1" to the pseudo-file
Packit 7cfc04
.IR /dev/cpuset/cpuset.memory_pressure_enabled ,
Packit 7cfc04
the kernel does
Packit 7cfc04
not compute per-cpuset
Packit 7cfc04
.IR memory_pressure .
Packit 7cfc04
.SS Using the echo command
Packit 7cfc04
When using the
Packit 7cfc04
.B echo
Packit 7cfc04
command at the shell prompt to change the values of cpuset files,
Packit 7cfc04
beware that the built-in
Packit 7cfc04
.B echo
Packit 7cfc04
command in some shells does not display an error message if the
Packit 7cfc04
.BR write (2)
Packit 7cfc04
system call fails.
Packit 7cfc04
.\" Gack!  csh(1)'s echo does this
Packit 7cfc04
For example, if the command:
Packit 7cfc04
.PP
Packit 7cfc04
.in +4n
Packit 7cfc04
.EX
Packit 7cfc04
echo 19 > cpuset.mems
Packit 7cfc04
.EE
Packit 7cfc04
.in
Packit 7cfc04
.PP
Packit 7cfc04
failed because memory node 19 was not allowed (perhaps
Packit 7cfc04
the current system does not have a memory node 19), then the
Packit 7cfc04
.B echo
Packit 7cfc04
command might not display any error.
Packit 7cfc04
It is better to use the
Packit 7cfc04
.B /bin/echo
Packit 7cfc04
external command to change cpuset file settings, as this
Packit 7cfc04
command will display
Packit 7cfc04
.BR write (2)
Packit 7cfc04
errors, as in the example:
Packit 7cfc04
.PP
Packit 7cfc04
.in +4n
Packit 7cfc04
.EX
Packit 7cfc04
/bin/echo 19 > cpuset.mems
Packit 7cfc04
/bin/echo: write error: Invalid argument
Packit 7cfc04
.EE
Packit 7cfc04
.in
Packit 7cfc04
.\" ================== EXCEPTIONS ==================
Packit 7cfc04
.SH EXCEPTIONS
Packit 7cfc04
.SS Memory placement
Packit 7cfc04
Not all allocations of system memory are constrained by cpusets,
Packit 7cfc04
for the following reasons.
Packit 7cfc04
.PP
Packit 7cfc04
If hot-plug functionality is used to remove all the CPUs that are
Packit 7cfc04
currently assigned to a cpuset, then the kernel will automatically
Packit 7cfc04
update the
Packit 7cfc04
.I cpus_allowed
Packit 7cfc04
of all processes attached to CPUs in that cpuset
Packit 7cfc04
to allow all CPUs.
Packit 7cfc04
When memory hot-plug functionality for removing
Packit 7cfc04
memory nodes is available, a similar exception is expected to apply
Packit 7cfc04
there as well.
Packit 7cfc04
In general, the kernel prefers to violate cpuset placement,
Packit 7cfc04
rather than starving a process that has had all its allowed CPUs or
Packit 7cfc04
memory nodes taken offline.
Packit 7cfc04
User code should reconfigure cpusets to refer only to online CPUs
Packit 7cfc04
and memory nodes when using hot-plug to add or remove such resources.
Packit 7cfc04
.PP
Packit 7cfc04
A few kernel-critical, internal memory-allocation requests, marked
Packit 7cfc04
GFP_ATOMIC, must be satisfied immediately.
Packit 7cfc04
The kernel may drop some
request or malfunction if one of these allocations fails.
Packit 7cfc04
If such a request cannot be satisfied within the current process's cpuset,
Packit 7cfc04
then we relax the cpuset, and look for memory anywhere we can find it.
Packit 7cfc04
It's better to violate the cpuset than stress the kernel.
Packit 7cfc04
.PP
Packit 7cfc04
Allocations of memory requested by kernel drivers while processing
Packit 7cfc04
an interrupt lack any relevant process context, and are not confined
Packit 7cfc04
by cpusets.
Packit 7cfc04
.SS Renaming cpusets
Packit 7cfc04
You can use the
Packit 7cfc04
.BR rename (2)
Packit 7cfc04
system call to rename cpusets.
Packit 7cfc04
Only simple renaming is supported; that is, changing the name of a cpuset
Packit 7cfc04
directory is permitted, but moving a directory into
Packit 7cfc04
a different directory is not permitted.
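.PP
From the shell, a simple rename can be done with
.BR mv (1),
which uses
.BR rename (2)
when source and destination are on the same filesystem.
The wrapper below is a sketch with a hypothetical name; it refuses
names containing a slash, and it refuses an existing target because
.BR mv (1)
would otherwise try to move the cpuset into that directory.
```shell
# Sketch: rename cpuset $2 to $3 inside the parent cpuset directory $1.
rename_cpuset() {
    parent=$1 old=$2 new=$3
    case "$old$new" in
        */*) echo "rename_cpuset: names must not contain '/'" >&2; return 2 ;;
    esac
    if [ -e "$parent/$new" ]; then
        echo "rename_cpuset: $new already exists" >&2
        return 1
    fi
    mv "$parent/$old" "$parent/$new"
}
```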
Packit 7cfc04
.\" ================== ERRORS ==================
Packit 7cfc04
.SH ERRORS
Packit 7cfc04
The Linux kernel implementation of cpusets sets
Packit 7cfc04
.I errno
Packit 7cfc04
to specify the reason for a failed system call affecting cpusets.
Packit 7cfc04
.PP
Packit 7cfc04
The possible
Packit 7cfc04
.I errno
Packit 7cfc04
settings and their meaning when set on
Packit 7cfc04
a failed cpuset call are as listed below.
Packit 7cfc04
.TP
Packit 7cfc04
.B E2BIG
Packit 7cfc04
Attempted a
Packit 7cfc04
.BR write (2)
Packit 7cfc04
on a special cpuset file
Packit 7cfc04
with a length larger than some kernel-determined upper
Packit 7cfc04
limit on the length of such writes.
Packit 7cfc04
.TP
Packit 7cfc04
.B EACCES
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR write (2)
Packit 7cfc04
the process ID (PID) of a process to a cpuset
Packit 7cfc04
.I tasks
Packit 7cfc04
file when one lacks permission to move that process.
Packit 7cfc04
.TP
Packit 7cfc04
.B EACCES
Packit 7cfc04
Attempted to add, using
Packit 7cfc04
.BR write (2),
Packit 7cfc04
a CPU or memory node to a cpuset, when that CPU or memory node was
Packit 7cfc04
not already in its parent.
Packit 7cfc04
.TP
Packit 7cfc04
.B EACCES
Packit 7cfc04
Attempted to set, using
Packit 7cfc04
.BR write (2),
Packit 7cfc04
.I cpuset.cpu_exclusive
Packit 7cfc04
or
Packit 7cfc04
.I cpuset.mem_exclusive
Packit 7cfc04
on a cpuset whose parent lacks the same setting.
Packit 7cfc04
.TP
Packit 7cfc04
.B EACCES
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR write (2)
Packit 7cfc04
a
Packit 7cfc04
.I cpuset.memory_pressure
Packit 7cfc04
file.
Packit 7cfc04
.TP
Packit 7cfc04
.B EACCES
Packit 7cfc04
Attempted to create a file in a cpuset directory.
Packit 7cfc04
.TP
Packit 7cfc04
.B EBUSY
Packit 7cfc04
Attempted to remove, using
Packit 7cfc04
.BR rmdir (2),
Packit 7cfc04
a cpuset with attached processes.
Packit 7cfc04
.TP
Packit 7cfc04
.B EBUSY
Packit 7cfc04
Attempted to remove, using
Packit 7cfc04
.BR rmdir (2),
Packit 7cfc04
a cpuset with child cpusets.
Packit 7cfc04
.TP
Packit 7cfc04
.B EBUSY
Packit 7cfc04
Attempted to remove
Packit 7cfc04
a CPU or memory node from a cpuset
Packit 7cfc04
that is also in a child of that cpuset.
Packit 7cfc04
.TP
Packit 7cfc04
.B EEXIST
Packit 7cfc04
Attempted to create, using
Packit 7cfc04
.BR mkdir (2),
Packit 7cfc04
a cpuset that already exists.
Packit 7cfc04
.TP
Packit 7cfc04
.B EEXIST
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR rename (2)
Packit 7cfc04
a cpuset to a name that already exists.
Packit 7cfc04
.TP
Packit 7cfc04
.B EFAULT
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR read (2)
Packit 7cfc04
or
Packit 7cfc04
.BR write (2)
Packit 7cfc04
a cpuset file using
Packit 7cfc04
a buffer that is outside the calling process's accessible address space.
Packit 7cfc04
.TP
Packit 7cfc04
.B EINVAL
Packit 7cfc04
Attempted to change a cpuset, using
Packit 7cfc04
.BR write (2),
Packit 7cfc04
in a way that would violate a
Packit 7cfc04
.I cpu_exclusive
Packit 7cfc04
or
Packit 7cfc04
.I mem_exclusive
Packit 7cfc04
attribute of that cpuset or any of its siblings.
Packit 7cfc04
.TP
Packit 7cfc04
.B EINVAL
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR write (2)
Packit 7cfc04
an empty
Packit 7cfc04
.I cpuset.cpus
Packit 7cfc04
or
Packit 7cfc04
.I cpuset.mems
Packit 7cfc04
list to a cpuset which has attached processes or child cpusets.
Packit 7cfc04
.TP
Packit 7cfc04
.B EINVAL
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR write (2)
Packit 7cfc04
a
Packit 7cfc04
.I cpuset.cpus
Packit 7cfc04
or
Packit 7cfc04
.I cpuset.mems
Packit 7cfc04
list which included a range with the second number smaller than
Packit 7cfc04
the first number.
Packit 7cfc04
.TP
Packit 7cfc04
.B EINVAL
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR write (2)
Packit 7cfc04
a
Packit 7cfc04
.I cpuset.cpus
Packit 7cfc04
or
Packit 7cfc04
.I cpuset.mems
Packit 7cfc04
list which included an invalid character in the string.
Packit 7cfc04
.TP
Packit 7cfc04
.B EINVAL
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR write (2)
Packit 7cfc04
a list to a
Packit 7cfc04
.I cpuset.cpus
Packit 7cfc04
file that did not include any online CPUs.
Packit 7cfc04
.TP
Packit 7cfc04
.B EINVAL
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR write (2)
Packit 7cfc04
a list to a
Packit 7cfc04
.I cpuset.mems
Packit 7cfc04
file that did not include any online memory nodes.
Packit 7cfc04
.TP
Packit 7cfc04
.B EINVAL
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR write (2)
Packit 7cfc04
a list to a
Packit 7cfc04
.I cpuset.mems
Packit 7cfc04
file that included a node that held no memory.
Packit 7cfc04
.TP
Packit 7cfc04
.B EIO
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR write (2)
Packit 7cfc04
a string to a cpuset
Packit 7cfc04
.I tasks
Packit 7cfc04
file that
Packit 7cfc04
does not begin with an ASCII decimal integer.
Packit 7cfc04
.TP
Packit 7cfc04
.B EIO
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR rename (2)
Packit 7cfc04
a cpuset into a different directory.
Packit 7cfc04
.TP
Packit 7cfc04
.B ENAMETOOLONG
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR read (2)
Packit 7cfc04
a
Packit 7cfc04
.I /proc/<pid>/cpuset
Packit 7cfc04
file for a cpuset path that is longer than the kernel page size.
Packit 7cfc04
.TP
Packit 7cfc04
.B ENAMETOOLONG
Packit 7cfc04
Attempted to create, using
Packit 7cfc04
.BR mkdir (2),
Packit 7cfc04
a cpuset whose base directory name is longer than 255 characters.
Packit 7cfc04
.TP
Packit 7cfc04
.B ENAMETOOLONG
Packit 7cfc04
Attempted to create, using
Packit 7cfc04
.BR mkdir (2),
Packit 7cfc04
a cpuset whose full pathname,
Packit 7cfc04
including the mount point (typically "/dev/cpuset/") prefix,
Packit 7cfc04
is longer than 4095 characters.
Packit 7cfc04
.TP
Packit 7cfc04
.B ENODEV
Packit 7cfc04
The cpuset was removed by another process at the same time as a
Packit 7cfc04
.BR write (2)
Packit 7cfc04
was attempted on one of the pseudo-files in the cpuset directory.
Packit 7cfc04
.TP
Packit 7cfc04
.B ENOENT
Packit 7cfc04
Attempted to create, using
Packit 7cfc04
.BR mkdir (2),
Packit 7cfc04
a cpuset in a parent cpuset that doesn't exist.
Packit 7cfc04
.TP
Packit 7cfc04
.B ENOENT
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR access (2)
Packit 7cfc04
or
Packit 7cfc04
.BR open (2)
Packit 7cfc04
a nonexistent file in a cpuset directory.
Packit 7cfc04
.TP
Packit 7cfc04
.B ENOMEM
Packit 7cfc04
Insufficient memory is available within the kernel; can occur
Packit 7cfc04
on a variety of system calls affecting cpusets, but only if the
Packit 7cfc04
system is extremely short of memory.
Packit 7cfc04
.TP
Packit 7cfc04
.B ENOSPC
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR write (2)
Packit 7cfc04
the process ID (PID)
Packit 7cfc04
of a process to a cpuset
Packit 7cfc04
.I tasks
Packit 7cfc04
file when the cpuset had an empty
Packit 7cfc04
.I cpuset.cpus
Packit 7cfc04
or empty
Packit 7cfc04
.I cpuset.mems
Packit 7cfc04
setting.
Packit 7cfc04
.TP
Packit 7cfc04
.B ENOSPC
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR write (2)
Packit 7cfc04
an empty
Packit 7cfc04
.I cpuset.cpus
Packit 7cfc04
or
Packit 7cfc04
.I cpuset.mems
Packit 7cfc04
setting to a cpuset that
Packit 7cfc04
has tasks attached.
Packit 7cfc04
.TP
Packit 7cfc04
.B ENOTDIR
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR rename (2)
Packit 7cfc04
a nonexistent cpuset.
Packit 7cfc04
.TP
Packit 7cfc04
.B EPERM
Packit 7cfc04
Attempted to remove a file from a cpuset directory.
Packit 7cfc04
.TP
Packit 7cfc04
.B ERANGE
Packit 7cfc04
Specified a
Packit 7cfc04
.I cpuset.cpus
Packit 7cfc04
or
Packit 7cfc04
.I cpuset.mems
Packit 7cfc04
list to the kernel which included a number too large for the kernel
Packit 7cfc04
to set in its bit masks.
Packit 7cfc04
.TP
Packit 7cfc04
.B ESRCH
Packit 7cfc04
Attempted to
Packit 7cfc04
.BR write (2)
Packit 7cfc04
the process ID (PID) of a nonexistent process to a cpuset
Packit 7cfc04
.I tasks
Packit 7cfc04
file.
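.PP
Shell scripts cannot inspect
.I errno
directly, but
.B /bin/echo
prints the corresponding error text when its
.BR write (2)
fails.
The following sketch captures that text when attaching a task; the
helper name attach_task is hypothetical.
On glibc-based systems, for instance, EINVAL is reported as
"Invalid argument", ESRCH as "No such process", and ENOSPC as
"No space left on device".
```shell
# Sketch: attach task ID $1 to the cpuset directory $2 and report the
# error text from /bin/echo on failure.
attach_task() {
    pid=$1 dir=$2
    # 2>&1 comes first, so only the error text is captured while the
    # ID itself still goes to the tasks file.
    if err=$(/bin/echo "$pid" 2>&1 > "$dir/tasks"); then
        return 0
    fi
    echo "attach_task: cannot attach $pid to $dir: ${err:-cannot open tasks file}" >&2
    return 1
}
```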
Packit 7cfc04
.\" ================== VERSIONS ==================
Packit 7cfc04
.SH VERSIONS
Packit 7cfc04
Cpusets appeared in version 2.6.12 of the Linux kernel.
Packit 7cfc04
.\" ================== NOTES ==================
Packit 7cfc04
.SH NOTES
Packit 7cfc04
Despite its name, the
Packit 7cfc04
.I pid
Packit 7cfc04
parameter is actually a thread ID,
Packit 7cfc04
and each thread in a threaded group can be attached to a different
Packit 7cfc04
cpuset.
Packit 7cfc04
The value returned from a call to
Packit 7cfc04
.BR gettid (2)
Packit 7cfc04
can be passed in the argument
Packit 7cfc04
.IR pid .
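.PP
Assuming that the IDs in question are those written to a
.I tasks
file, a whole multithreaded process can be attached by writing each of
its thread IDs in turn.
The sketch below takes the thread IDs from the
.I /proc/<pid>/task
directory (see
.BR proc (5));
the function name and the optional third argument, which overrides the
proc mount point, are hypothetical.
```shell
# Sketch: attach every thread of process $1 to the cpuset directory $2.
# $3 optionally names the proc mount point (default: /proc).
attach_all_threads() {
    pid=$1 dir=$2 proc=${3:-/proc}
    for t in "$proc/$pid/task"/*; do
        [ -e "$t" ] || return 1     # no such process, or no task list
        # One write per thread ID, as each thread is attached separately.
        /bin/echo "${t##*/}" > "$dir/tasks" || return 1
    done
}
```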
Packit 7cfc04
.\" ================== BUGS ==================
Packit 7cfc04
.SH BUGS
Packit 7cfc04
.I cpuset.memory_pressure
Packit 7cfc04
cpuset files can be opened
Packit 7cfc04
for writing, creation, or truncation, but then the
Packit 7cfc04
.BR write (2)
Packit 7cfc04
fails with
Packit 7cfc04
.I errno
Packit 7cfc04
set to
Packit 7cfc04
.BR EACCES ,
Packit 7cfc04
and the creation and truncation options on
Packit 7cfc04
.BR open (2)
Packit 7cfc04
have no effect.
Packit 7cfc04
.\" ================== EXAMPLE ==================
Packit 7cfc04
.SH EXAMPLE
Packit 7cfc04
The following examples demonstrate querying and setting cpuset
Packit 7cfc04
options using shell commands.
Packit 7cfc04
.SS Creating and attaching to a cpuset
Packit 7cfc04
To create a new cpuset and attach the current command shell to it,
Packit 7cfc04
the steps are:
Packit 7cfc04
.PP
Packit 7cfc04
.PD 0
Packit 7cfc04
.IP 1) 4
Packit 7cfc04
mkdir /dev/cpuset (if not already done)
Packit 7cfc04
.IP 2)
Packit 7cfc04
mount \-t cpuset none /dev/cpuset (if not already done)
Packit 7cfc04
.IP 3)
Packit 7cfc04
Create the new cpuset using
Packit 7cfc04
.BR mkdir (1).
Packit 7cfc04
.IP 4)
Packit 7cfc04
Assign CPUs and memory nodes to the new cpuset.
Packit 7cfc04
.IP 5)
Packit 7cfc04
Attach the shell to the new cpuset.
Packit 7cfc04
.PD
Packit 7cfc04
.PP
For example, the following sequence of commands will set up a cpuset
named "Charlie", containing just CPUs 2 and 3, and memory node 1,
and then attach the current shell to that cpuset.
.PP
.in +4n
.EX
.RB "$" " mkdir /dev/cpuset"
.RB "$" " mount \-t cpuset cpuset /dev/cpuset"
.RB "$" " cd /dev/cpuset"
.RB "$" " mkdir Charlie"
.RB "$" " cd Charlie"
.RB "$" " /bin/echo 2\-3 > cpuset.cpus"
.RB "$" " /bin/echo 1 > cpuset.mems"
.RB "$" " /bin/echo $$ > tasks"
# The current shell is now running in cpuset Charlie
# The next line should display '/Charlie'
.RB "$" " cat /proc/self/cpuset"
.EE
.in
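.PP
The same steps can be wrapped in a small shell function.
The following is a minimal sketch, not part of any cpuset tooling:
the name
.I cpuset_create
and its argument order are assumptions made for illustration,
and the cpuset mount point is passed as a parameter
instead of being fixed at
.IR /dev/cpuset .

```shell
#!/bin/sh
# Sketch only: cpuset_create is a hypothetical helper name.
# Arguments: ROOT (cpuset mount point), NAME (new cpuset),
#            CPUS (e.g. 2-3), MEMS (e.g. 1), PID (process to attach)
cpuset_create() {
    root=$1 name=$2 cpus=$3 mems=$4 pid=$5
    dir="$root/$name"

    # Create the cpuset directory (no error if it already exists).
    mkdir -p "$dir" || return 1

    # Assign CPUs and memory nodes before attaching any process.
    /bin/echo "$cpus" > "$dir/cpuset.cpus" || return 1
    /bin/echo "$mems" > "$dir/cpuset.mems" || return 1

    # Attach the process by writing its PID to the tasks file.
    /bin/echo "$pid" > "$dir/tasks" || return 1
}
```

.PP
For instance,
.I "cpuset_create /dev/cpuset Charlie 2\-3 1 $$"
would repeat the command sequence shown above for the current shell,
assuming the cpuset filesystem is already mounted.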
.\"
.SS Migrating a job to different memory nodes.
To migrate a job (the set of processes attached to a cpuset)
to different CPUs and memory nodes in the system, including moving
the memory pages currently allocated to that job,
perform the following steps.
.PP
.PD 0
.IP 1) 4
Let's say we want to move the job in cpuset
.I alpha
(CPUs 4\(en7 and memory nodes 2\(en3) to a new cpuset
.I beta
(CPUs 16\(en19 and memory nodes 8\(en9).
.IP 2)
First create the new cpuset
.IR beta .
.IP 3)
Then allow CPUs 16\(en19 and memory nodes 8\(en9 in
.IR beta .
.IP 4)
Then enable
.I cpuset.memory_migrate
in
.IR beta .
.IP 5)
Then move each process from
.I alpha
to
.IR beta .
.PD
.PP
The following sequence of commands accomplishes this.
.PP
.in +4n
.EX
.RB "$" " cd /dev/cpuset"
.RB "$" " mkdir beta"
.RB "$" " cd beta"
.RB "$" " /bin/echo 16\-19 > cpuset.cpus"
.RB "$" " /bin/echo 8\-9 > cpuset.mems"
.RB "$" " /bin/echo 1 > cpuset.memory_migrate"
.RB "$" " while read i; do /bin/echo $i; done < ../alpha/tasks > tasks"
.EE
.in
.PP
The above should move any processes in
.I alpha
to
.IR beta ,
and any memory held by these processes on memory nodes 2\(en3 to memory
nodes 8\(en9, respectively.
.PP
Notice that the last step of the above sequence did not do:
.PP
.in +4n
.EX
.RB "$" " cp ../alpha/tasks tasks"
.EE
.in
.PP
The
.I while
loop, rather than the seemingly easier use of the
.BR cp (1)
command, was necessary because
only one process PID at a time may be written to the
.I tasks
file.
.PP
The same effect (writing one PID at a time) as the
.I while
loop can be accomplished more efficiently, in fewer keystrokes and in
syntax that works on any shell, but alas more obscurely, by using the
.B \-u
(unbuffered) option of
.BR sed (1):
.PP
.in +4n
.EX
.RB "$" " sed \-un p < ../alpha/tasks > tasks"
.EE
.in
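.PP
When a job is moved while it is still running,
some of its processes may exit before their PID is written,
and the corresponding write is then expected to fail.
One possible variation, sketched here under that assumption,
opens the
.I tasks
file once per PID so that each failed write can be reported separately.
The function name
.I move_tasks
is hypothetical.

```shell
#!/bin/sh
# Sketch only: move_tasks is a hypothetical helper name.
# Arguments: SRC and DST, the directories of the source and
# destination cpusets.
move_tasks() {
    src=$1 dst=$2
    failed=0

    while read -r pid; do
        # Each iteration opens DST/tasks and writes exactly one PID,
        # using /bin/echo as in the commands above.
        if ! /bin/echo "$pid" > "$dst/tasks"; then
            echo "could not move $pid" >&2
            failed=$((failed + 1))
        fi
    done < "$src/tasks"

    # Return nonzero if any PID could not be moved.
    [ "$failed" -eq 0 ]
}
```

.PP
For the migration above, this would be invoked as
.IR "move_tasks /dev/cpuset/alpha /dev/cpuset/beta" .
The trade-off is one
.BR open (2)
per PID instead of a single open for the whole loop.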
.\" ================== SEE ALSO ==================
.SH SEE ALSO
.BR taskset (1),
.BR get_mempolicy (2),
.BR getcpu (2),
.BR mbind (2),
.BR sched_getaffinity (2),
.BR sched_setaffinity (2),
.BR sched_setscheduler (2),
.BR set_mempolicy (2),
.BR CPU_SET (3),
.BR proc (5),
.BR cgroups (7),
.BR numa (7),
.BR sched (7),
.BR migratepages (8),
.BR numactl (8)
.PP
.IR Documentation/cgroup\-v1/cpusets.txt
in the Linux kernel source tree
.\" commit 45ce80fb6b6f9594d1396d44dd7e7c02d596fef8
(or
.IR Documentation/cpusets.txt
before Linux 2.6.29)
.SH COLOPHON
This page is part of release 4.15 of the Linux
.I man-pages
project.
A description of the project,
information about reporting bugs,
and the latest version of this page,
can be found at
\%https://www.kernel.org/doc/man\-pages/.