23.3.6 Limiting execution to certain CPUs

On a multi-processor system the operating system usually distributes the runnable processes across all available CPUs in a way that allows the system to work most efficiently. Which processes and threads run can to some extent be controlled with the scheduling functionality described in the previous sections. But which CPU finally executes which process or thread is not covered.

There are a number of reasons why a program might want to have control over this aspect of the system as well.

The POSIX standard is, to date, not much help in solving this problem. The Linux kernel provides a set of interfaces to allow specifying affinity sets for a process. The scheduler will schedule the thread or process on CPUs specified by the affinity masks. The interfaces which the GNU C Library defines follow to some extent the Linux kernel interface.

Data Type: cpu_set_t

This data type is a bitset where each bit represents a CPU. How the system’s CPUs are mapped to bits in the bitset is system dependent. The data type has a fixed size; it is strongly recommended to allocate a dynamically sized set based on the actual number of CPUs detected, such as via get_nprocs_conf(), and use the CPU_*_S variants instead of the fixed-size ones.

This type is a GNU extension and is defined in sched.h.

To manipulate the bitset, to set and reset bits, and thus add and remove CPUs from the sets, a number of macros are defined. Some of the macros take a CPU number as a parameter. Here it is important to never exceed the size of the bitset, either CPU_SETSIZE for fixed sets or the allocated size for dynamic sets. For each macro there is a fixed-size version (documented below) and a dynamic-sized version (with a _S suffix).

Macro: int CPU_SETSIZE

The value of this macro is the maximum number of CPUs which can be handled with a fixed cpu_set_t object.

For applications that require CPU sets larger than the built-in size, a set of macros that support dynamically-sized sets are defined.

Macro: size_t CPU_ALLOC_SIZE (size_t count)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

Given a count of CPUs to hold, returns the size of the set to allocate. This return value is appropriate to be used in the *_S macros.

This macro is a GNU extension and is defined in sched.h.

Macro: cpu_set_t * CPU_ALLOC (size_t count)

Preliminary: | MT-Safe | AS-Unsafe lock | AC-Unsafe lock fd mem | See POSIX Safety Concepts.

Given the count of CPUs to hold, returns a set large enough to hold them; that is, the resulting set will be valid for CPUs numbered 0 through count-1, inclusive. This set must be freed via CPU_FREE to avoid memory leaks. Warning: the argument is the CPU count and not the size returned by CPU_ALLOC_SIZE.

This macro is a GNU extension and is defined in sched.h.

Macro: void CPU_FREE (cpu_set_t *set)

Preliminary: | MT-Safe | AS-Unsafe lock | AC-Unsafe lock fd mem | See POSIX Safety Concepts.

Frees a CPU set previously allocated by CPU_ALLOC.

This macro is a GNU extension and is defined in sched.h.

The type cpu_set_t should be considered opaque; all manipulation should happen via the CPU_* macros described below.

Macro: void CPU_ZERO (cpu_set_t *set)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

This macro initializes the CPU set set to be the empty set.

This macro is a GNU extension and is defined in sched.h.

Macro: void CPU_SET (int cpu, cpu_set_t *set)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

This macro adds cpu to the CPU set set.

The cpu parameter must not have side effects since it is evaluated more than once.

This macro is a GNU extension and is defined in sched.h.

Macro: void CPU_CLR (int cpu, cpu_set_t *set)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

This macro removes cpu from the CPU set set.

The cpu parameter must not have side effects since it is evaluated more than once.

This macro is a GNU extension and is defined in sched.h.

Macro: cpu_set_t * CPU_AND (cpu_set_t *dest, cpu_set_t *src1, cpu_set_t *src2)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

This macro populates dest with only those CPUs included in both src1 and src2. Its value is dest.

This macro is a GNU extension and is defined in sched.h.

Macro: cpu_set_t * CPU_OR (cpu_set_t *dest, cpu_set_t *src1, cpu_set_t *src2)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

This macro populates dest with those CPUs included in either src1 or src2. Its value is dest.

This macro is a GNU extension and is defined in sched.h.

Macro: cpu_set_t * CPU_XOR (cpu_set_t *dest, cpu_set_t *src1, cpu_set_t *src2)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

This macro populates dest with those CPUs included in either src1 or src2, but not both. Its value is dest.

This macro is a GNU extension and is defined in sched.h.

Macro: int CPU_ISSET (int cpu, const cpu_set_t *set)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

This macro returns a nonzero value (true) if cpu is a member of the CPU set set, and zero (false) otherwise.

The cpu parameter must not have side effects since it is evaluated more than once.

This macro is a GNU extension and is defined in sched.h.
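As an illustration of the basic macros on a fixed-size set, the following sketch builds a set, removes one CPU again, and tests membership. The function names pick_even_cpus and cpu_is_selected are hypothetical and exist only for this example; the bounds checks reflect the rule that a CPU number must never exceed CPU_SETSIZE.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Fill SET with the even-numbered CPUs below LIMIT, then drop CPU 0.
   Returns 0 on success, or -1 if LIMIT does not fit a fixed-size set.
   (pick_even_cpus is a hypothetical helper name.)  */
int
pick_even_cpus (cpu_set_t *set, int limit)
{
  int cpu;

  if (limit < 0 || limit > CPU_SETSIZE)
    return -1;

  CPU_ZERO (set);
  for (cpu = 0; cpu < limit; cpu += 2)
    CPU_SET (cpu, set);         /* The argument has no side effects.  */
  CPU_CLR (0, set);
  return 0;
}

/* Return 1 if CPU is a member of SET, and 0 otherwise, including when
   CPU is outside the range a fixed-size set can represent.
   (cpu_is_selected is a hypothetical helper name.)  */
int
cpu_is_selected (const cpu_set_t *set, int cpu)
{
  if (cpu < 0 || cpu >= CPU_SETSIZE)
    return 0;
  return CPU_ISSET (cpu, set) != 0;
}
```

Keeping the loop increment outside the macro argument avoids any problem with the argument being evaluated more than once.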

Macro: int CPU_COUNT (const cpu_set_t *set)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

This macro returns the count of CPUs (bits) set in set.

This macro is a GNU extension and is defined in sched.h.

Macro: int CPU_EQUAL (cpu_set_t *src1, cpu_set_t *src2)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

This macro returns nonzero if the two sets src1 and src2 have the same contents; that is, the set of CPUs represented by both sets is identical.

This macro is a GNU extension and is defined in sched.h.
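The set operations can be combined with CPU_COUNT and CPU_EQUAL, as in the following sketch. The helper names count_shared_cpus and sets_are_disjoint are hypothetical. The second helper relies on the observation that two sets share no CPU exactly when their union and their symmetric difference are the same set.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Store in COMMON the CPUs present in both A and B and return how
   many there are.  (count_shared_cpus is a hypothetical name.)  */
int
count_shared_cpus (cpu_set_t *common, cpu_set_t *a, cpu_set_t *b)
{
  CPU_AND (common, a, b);
  return CPU_COUNT (common);
}

/* Return nonzero if A and B have no CPU in common.
   (sets_are_disjoint is a hypothetical name.)  */
int
sets_are_disjoint (cpu_set_t *a, cpu_set_t *b)
{
  cpu_set_t either, exactly_one;

  CPU_OR (&either, a, b);          /* CPUs in A or B.  */
  CPU_XOR (&exactly_one, a, b);    /* CPUs in A or B, but not both.  */
  return CPU_EQUAL (&either, &exactly_one);
}
```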

Macro: void CPU_ZERO_S (size_t size, cpu_set_t *set)
Macro: void CPU_SET_S (int cpu, size_t size, cpu_set_t *set)
Macro: void CPU_CLR_S (int cpu, size_t size, cpu_set_t *set)
Macro: cpu_set_t * CPU_AND_S (size_t size, cpu_set_t *dest, cpu_set_t *src1, cpu_set_t *src2)
Macro: cpu_set_t * CPU_OR_S (size_t size, cpu_set_t *dest, cpu_set_t *src1, cpu_set_t *src2)
Macro: cpu_set_t * CPU_XOR_S (size_t size, cpu_set_t *dest, cpu_set_t *src1, cpu_set_t *src2)
Macro: int CPU_ISSET_S (int cpu, size_t size, const cpu_set_t *set)
Macro: int CPU_COUNT_S (size_t size, const cpu_set_t *set)
Macro: int CPU_EQUAL_S (size_t size, cpu_set_t *src1, cpu_set_t *src2)

Each of these macros performs the same action as its non-_S variant, but takes a size argument to specify the set size. This size argument is as returned by the CPU_ALLOC_SIZE macro, defined above.
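The following sketch shows how the allocation macros and the _S variants fit together. The helper name alloc_full_set is hypothetical. Note the two different quantities involved: CPU_ALLOC takes a count of CPUs, while the _S macros take the size in bytes computed by CPU_ALLOC_SIZE.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>

/* Allocate a set able to hold CPUs 0 through COUNT-1 and add all of
   them.  The byte size needed by the _S macros is stored in *SIZE.
   Returns NULL if the allocation fails; otherwise the caller must
   release the set with CPU_FREE.
   (alloc_full_set is a hypothetical helper name.)  */
cpu_set_t *
alloc_full_set (size_t count, size_t *size)
{
  size_t cpu;
  cpu_set_t *set = CPU_ALLOC (count);   /* Argument is a CPU count.  */

  if (set == NULL)
    return NULL;

  *size = CPU_ALLOC_SIZE (count);       /* Result is a size in bytes.  */
  CPU_ZERO_S (*size, set);
  for (cpu = 0; cpu < count; cpu++)
    CPU_SET_S (cpu, *size, set);
  return set;
}
```

A caller would typically pass the value of get_nprocs_conf () as the count, and later hand both the size and the set to the _S macros or to the affinity functions.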

CPU bitsets can be constructed from scratch or the currently installed affinity mask can be retrieved from the system.

Function: int sched_getaffinity (pid_t pid, size_t cpusetsize, cpu_set_t *cpuset)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

This function stores the CPU affinity mask for the process or thread with the ID pid in the cpusetsize bytes long bitmap pointed to by cpuset. If successful, the function always initializes all bits in the cpu_set_t object and returns zero.

If pid does not correspond to a process or thread on the system, or the function fails for some other reason, it returns -1 and errno is set to represent the error condition.

ESRCH

No process or thread with the given ID found.

EFAULT

The pointer cpuset does not point to a valid object.

This function is a GNU extension and is declared in sched.h.

Note that it is not portably possible to use this function to retrieve the affinity information for individual POSIX threads. A separate interface must be provided for that.
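A minimal sketch of reading the current mask follows. The helper name count_allowed_cpus is hypothetical. It assumes the Linux convention that a pid of zero refers to the caller, and it uses a fixed-size set for brevity, so the call may fail on a system with more CPUs than CPU_SETSIZE; a dynamically sized set avoids that limit.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Return the number of CPUs the caller is allowed to run on, or -1 if
   the affinity mask could not be retrieved.
   (count_allowed_cpus is a hypothetical helper name.)  */
int
count_allowed_cpus (void)
{
  cpu_set_t set;

  /* A pid of 0 selects the caller; the size is given in bytes.  */
  if (sched_getaffinity (0, sizeof (set), &set) != 0)
    return -1;
  return CPU_COUNT (&set);
}
```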

Function: int sched_setaffinity (pid_t pid, size_t cpusetsize, const cpu_set_t *cpuset)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

This function installs the cpusetsize bytes long affinity mask pointed to by cpuset for the process or thread with the ID pid. If successful the function returns zero and the scheduler will in the future take the affinity information into account.

If the function fails it will return -1 and errno is set to the error code:

ESRCH

No process or thread with the given ID found.

EFAULT

The pointer cpuset does not point to a valid object.

EINVAL

The bitset is not valid. This might mean that the affinity set does not leave a processor for the process or thread to run on.

This function is a GNU extension and is declared in sched.h.
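As a sketch of installing a new mask, the following function restricts the caller to a single CPU taken from its current mask, so the request cannot leave it without a processor to run on. The helper name pin_to_first_allowed_cpu is hypothetical, a pid of zero is assumed to select the caller, and a fixed-size set is used for brevity.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Restrict the caller to the lowest-numbered CPU in its current
   affinity mask.  Returns that CPU number, or -1 on failure.
   (pin_to_first_allowed_cpu is a hypothetical helper name.)  */
int
pin_to_first_allowed_cpu (void)
{
  cpu_set_t allowed, target;
  int cpu;

  if (sched_getaffinity (0, sizeof (allowed), &allowed) != 0)
    return -1;

  /* Find the first CPU that is currently permitted.  */
  for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
    if (CPU_ISSET (cpu, &allowed))
      break;
  if (cpu == CPU_SETSIZE)
    return -1;

  /* Build a one-element set and install it.  */
  CPU_ZERO (&target);
  CPU_SET (cpu, &target);
  if (sched_setaffinity (0, sizeof (target), &target) != 0)
    return -1;
  return cpu;
}
```

Choosing from the existing mask rather than from all configured CPUs is a deliberate design choice here: a CPU outside the permitted set could cause the call to fail with EINVAL.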

Function: int getcpu (unsigned int *cpu, unsigned int *node)

Preliminary: | MT-Safe | AS-Safe | AC-Safe | See POSIX Safety Concepts.

The getcpu function identifies the processor and node on which the calling thread or process is currently running and writes them into the integers pointed to by the cpu and node arguments. The processor is a unique nonnegative integer identifying a CPU. The node is a unique nonnegative integer identifying a NUMA node. When either cpu or node is NULL, nothing is written to the respective pointer.

The return value is 0 on success and -1 on failure. The following errno error condition is defined for this function:

ENOSYS

The operating system does not support this function.

This function is Linux-specific and is declared in sched.h.

Function: int sched_getcpu (void)

Similar to getcpu but with a simpler interface. On success, returns a nonnegative number identifying the CPU on which the current thread is running. Returns -1 on failure. The following errno error condition is defined for this function:

ENOSYS

The operating system does not support this function.

This function is Linux-specific and is declared in sched.h.
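The CPU number returned by sched_getcpu can be used with the set macros, as in the sketch below. The helper name current_cpu_is_allowed is hypothetical. The result is only a snapshot: the thread may be moved to another CPU, or its mask may be changed, immediately after either call returns.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Return 1 if the CPU reported by sched_getcpu is in the caller's
   affinity mask, 0 if it is not, and -1 if either call fails or the
   CPU number does not fit a fixed-size set.
   (current_cpu_is_allowed is a hypothetical helper name.)  */
int
current_cpu_is_allowed (void)
{
  cpu_set_t allowed;
  int cpu = sched_getcpu ();

  if (cpu < 0 || cpu >= CPU_SETSIZE)
    return -1;
  if (sched_getaffinity (0, sizeof (allowed), &allowed) != 0)
    return -1;
  return CPU_ISSET (cpu, &allowed) != 0;
}
```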

Here’s an example of how to use most of the above to limit the number of CPUs a process runs on, not including error handling or good logic on CPU choices:

#define _GNU_SOURCE
#include <sched.h>
#include <sys/sysinfo.h>
#include <unistd.h>
void
limit_cpus (void)
{
  unsigned int mycpu;
  size_t nproc, cssz, cpu;
  cpu_set_t *cs;
  /* Remember the CPU we are running on so that it stays in the set.  */
  getcpu (&mycpu, NULL);
  /* Size a dynamic set for every configured CPU.  */
  nproc = get_nprocs_conf ();
  cssz = CPU_ALLOC_SIZE (nproc);
  cs = CPU_ALLOC (nproc);
  /* Start from the current affinity mask.  */
  sched_getaffinity (0, cssz, cs);
  /* If more than half of the CPUs are allowed, drop the upper half.  */
  if (CPU_COUNT_S (cssz, cs) > nproc / 2)
    {
      for (cpu = nproc / 2; cpu < nproc; cpu ++)
        if (cpu != mycpu)
          CPU_CLR_S (cpu, cssz, cs);
      sched_setaffinity (0, cssz, cs);
    }
  CPU_FREE (cs);
}