The description below applies to:
- The allocate clause, except when the allocator modifier is a constant expression with value omp_default_mem_alloc and no align modifier has been specified. (In that case, the normal malloc allocation is used.)
- The allocate directive for variables in static memory; while the alignment is honored, the normal static memory is used.
- The allocate directive for automatic/stack variables, except when the allocator clause is a constant expression with value omp_default_mem_alloc and no align clause has been specified. (In that case, the normal allocation is used: stack allocation and, sometimes for Fortran, also malloc [depending on flags such as -fstack-arrays].)
- The allocators directive and the executable allocate directive for Fortran pointers and allocatables are supported, but files containing those directives must be compiled with -fopenmp-allocators. Additionally, all files that might explicitly or implicitly deallocate memory allocated that way must also be compiled with that option.

The alignment used is the maximum of the value of the align clause
and the alignment of the type after honoring, if present, the
aligned (GNU::aligned) attribute, C’s _Alignas,
and C++’s alignas. However, the align clause of the
allocate directive has no effect on the value of C’s
_Alignof and C++’s alignof.
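The alignment rule can be illustrated with a small sketch in plain C++. The type Packet and the helper effective_alignment are hypothetical names introduced only for this illustration; they are not part of GCC or OpenMP, and the sketch merely models the maximum rule described above rather than showing how the compiler implements it.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical type whose alignment has been raised with C++'s alignas.
struct alignas(32) Packet {
    char bytes[48];
};

// Hypothetical helper modelling the rule: the alignment used is the
// maximum of the align clause value and the alignment of the type.
constexpr std::size_t effective_alignment(std::size_t align_clause,
                                          std::size_t type_alignment) {
    return std::max(align_clause, type_alignment);
}

// An align clause smaller than the type's alignment does not lower it.
static_assert(effective_alignment(16, alignof(Packet)) == 32, "type wins");
// A larger align clause raises the alignment used for the allocation,
static_assert(effective_alignment(64, alignof(Packet)) == 64, "clause wins");
// but alignof itself still reports the type's own alignment.
static_assert(alignof(Packet) == 32, "alignof is unaffected");
```

The static assertions are checked at compile time, so the snippet needs no OpenMP support to build.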
GCC supports the following predefined allocators and predefined memory spaces:
| Predefined allocators | Associated predefined memory spaces |
|---|---|
| omp_default_mem_alloc | omp_default_mem_space |
| omp_large_cap_mem_alloc | omp_large_cap_mem_space |
| omp_const_mem_alloc | omp_const_mem_space |
| omp_high_bw_mem_alloc | omp_high_bw_mem_space |
| omp_low_lat_mem_alloc | omp_low_lat_mem_space |
| omp_cgroup_mem_alloc | omp_low_lat_mem_space (implementation defined) |
| omp_pteam_mem_alloc | omp_low_lat_mem_space (implementation defined) |
| omp_thread_mem_alloc | omp_low_lat_mem_space (implementation defined) |
| ompx_gnu_pinned_mem_alloc | omp_default_mem_space (GNU extension) |
Each predefined allocator, including omp_null_allocator, has a corresponding
allocator class template that meets the C++ allocator completeness requirements.
These are located in the omp::allocator namespace, and in the
ompx::allocator namespace for GNU extensions. This allows the
allocator-aware C++ standard library containers to use OpenMP allocation routines;
for instance:
std::vector<int, omp::allocator::cgroup_mem<int>> vec;
The following allocator templates are supported:
| Predefined allocators | Associated allocator template |
|---|---|
| omp_null_allocator | omp::allocator::null_allocator |
| omp_default_mem_alloc | omp::allocator::default_mem |
| omp_large_cap_mem_alloc | omp::allocator::large_cap_mem |
| omp_const_mem_alloc | omp::allocator::const_mem |
| omp_high_bw_mem_alloc | omp::allocator::high_bw_mem |
| omp_low_lat_mem_alloc | omp::allocator::low_lat_mem |
| omp_cgroup_mem_alloc | omp::allocator::cgroup_mem |
| omp_pteam_mem_alloc | omp::allocator::pteam_mem |
| omp_thread_mem_alloc | omp::allocator::thread_mem |
| ompx_gnu_pinned_mem_alloc | ompx::allocator::gnu_pinned_mem |
The following traits are available when constructing a new allocator;
if a trait is not specified or is specified with the value default, the
default value listed below is used for that trait. The predefined
allocators use the default values of each trait, except that the
omp_cgroup_mem_alloc, omp_pteam_mem_alloc, and
omp_thread_mem_alloc allocators have the access trait
set to cgroup, pteam, and thread, respectively.
For each trait, a named constant prefixed by omp_atk_ exists;
for each non-numeric value, a named constant prefixed by omp_atv_
exists.
| Trait | Allowed values | Default value |
|---|---|---|
| sync_hint | contended, uncontended, serialized, private | contended |
| alignment | Positive integer being a power of two | 1 byte |
| access | all, cgroup, pteam, thread | all |
| pool_size | Positive integer (bytes) | See below |
| fallback | default_mem_fb, null_fb, abort_fb, allocator_fb | See below |
| fb_data | allocator handle | (none) |
| pinned | true, false | See below |
| partition | environment, nearest, blocked, interleaved | environment |
For the fallback trait, the default value is null_fb for the
omp_default_mem_alloc allocator and any allocator that is associated
with device memory; for all other allocators, it is default_mem_fb.
For the pinned trait, the default value is true for the
predefined allocator ompx_gnu_pinned_mem_alloc (a GNU extension), and
false for all others.
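As a rough illustration of the two allocator-dependent defaults, the following C++ sketch models the rules above. The enum Fallback and the functions default_fallback and default_pinned are hypothetical names made up for this example; they mirror the documented behavior only and are neither libgomp's implementation nor part of the OpenMP API.

```cpp
// Hypothetical model of the values allowed for the fallback trait.
enum class Fallback { default_mem_fb, null_fb, abort_fb, allocator_fb };

// Default of the fallback trait: null_fb for omp_default_mem_alloc and for
// allocators associated with device memory, default_mem_fb otherwise.
constexpr Fallback default_fallback(bool is_default_mem_alloc,
                                    bool uses_device_memory) {
    return (is_default_mem_alloc || uses_device_memory)
               ? Fallback::null_fb
               : Fallback::default_mem_fb;
}

// Default of the pinned trait: true only for ompx_gnu_pinned_mem_alloc.
constexpr bool default_pinned(bool is_gnu_pinned_mem_alloc) {
    return is_gnu_pinned_mem_alloc;
}
```

For instance, under this model an allocator such as omp_high_bw_mem_alloc on the host would start with default_mem_fb and an unpinned allocation unless the traits are set explicitly.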
The following description applies to the initial device (the host) and largely also to non-host devices; for the latter, also see Offload-Target Specifics.
For the memory spaces, the following applies:
- omp_default_mem_space is supported.
- omp_const_mem_space maps to omp_default_mem_space.
- omp_low_lat_mem_space is only available on supported devices, and maps to omp_default_mem_space otherwise.
- omp_large_cap_mem_space maps to omp_default_mem_space, unless the memkind library is available.
- omp_high_bw_mem_space maps to omp_default_mem_space, unless the memkind library is available.
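The mapping of memory spaces can be summarized as a small decision function. This is only a sketch: the enum MemSpace, the struct Host, and the function effective_space are hypothetical names for this illustration, and the two boolean flags simplify the real conditions (for memkind, the respective kind must also be supported).

```cpp
// Hypothetical stand-ins for the predefined memory spaces.
enum class MemSpace { default_mem, large_cap, const_mem, high_bw, low_lat };

// What this sketch assumes to know about the environment.
struct Host {
    bool memkind_available;  // memkind library usable at runtime
    bool low_lat_supported;  // device provides a low-latency memory space
};

// Returns the memory space that is effectively used for a request.
constexpr MemSpace effective_space(MemSpace requested, Host h) {
    switch (requested) {
    case MemSpace::low_lat:
        return h.low_lat_supported ? requested : MemSpace::default_mem;
    case MemSpace::large_cap:
    case MemSpace::high_bw:
        return h.memkind_available ? requested : MemSpace::default_mem;
    case MemSpace::const_mem:
    case MemSpace::default_mem:
        return MemSpace::default_mem;
    }
    return MemSpace::default_mem;  // not reached for valid enum values
}
```

With neither memkind nor a supporting device, every request in this model ends up in the default memory space.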
On Linux systems, where the memkind
library (libmemkind.so.0) is available at runtime and the respective
memkind kind is supported, it is used when creating memory allocators requesting
- the partition trait interleaved, except when the memory space is omp_large_cap_mem_space (uses MEMKIND_HBW_INTERLEAVE)
- the memory space omp_high_bw_mem_space (uses MEMKIND_HBW_PREFERRED)
- the memory space omp_large_cap_mem_space (uses MEMKIND_DAX_KMEM_ALL or, if not available, MEMKIND_DAX_KMEM)
On Linux systems, where the numa
library (libnuma.so.1) is available at runtime, it is used when creating
memory allocators requesting
- the partition trait nearest, except when both the libmemkind library is available and the memory space is either omp_large_cap_mem_space or omp_high_bw_mem_space
Note that the numa library will round up the allocation size to a multiple of
the system page size; therefore, consider using it only with large data or
by sharing allocations via the pool_size trait. Furthermore, the Linux
kernel does not guarantee that an allocation will always be on the nearest NUMA
node nor that after reallocation the same node will be used. Note additionally
that, on Linux, the default setting of the memory placement policy is to use the
current node; therefore, unless the memory placement policy has been overridden,
the partition trait environment (the default) will be effectively
a nearest allocation.
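To see why page rounding matters for small allocations, consider the following arithmetic sketch. The function round_up_to_page and the constant kAssumedPageSize are hypothetical; 4096 bytes is only an assumed page size, and the actual value depends on the system.

```cpp
#include <cstddef>

// Round size up to the next multiple of page_size (page_size must be > 0).
constexpr std::size_t round_up_to_page(std::size_t size,
                                       std::size_t page_size) {
    return (size + page_size - 1) / page_size * page_size;
}

// Assumed page size for this illustration; real systems may differ.
constexpr std::size_t kAssumedPageSize = 4096;

// A 100-byte request occupies a whole page.
static_assert(round_up_to_page(100, kAssumedPageSize) == 4096, "one page");
// An exact multiple is left unchanged.
static_assert(round_up_to_page(4096, kAssumedPageSize) == 4096, "one page");
// One byte more than a page costs two pages.
static_assert(round_up_to_page(4097, kAssumedPageSize) == 8192, "two pages");
```

Under this assumption, many small allocations each consume a full page, which is why the text suggests large data or shared allocations.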
Additional notes regarding the traits:
- The pinned trait is supported on Linux hosts, but is subject to the OS ulimit/rlimit locked memory settings. It currently uses mmap and is therefore optimized for few allocations, including large data. If the conditions for numa or memkind allocations are fulfilled, those allocators are used instead.
- The default for the pool_size trait is no pool, and for every (re)allocation the associated library routine is called, which might internally use a memory pool. Currently, the same applies when a pool_size has been specified, except that once allocations exceed the pool size, the action of the fallback trait applies.
- For the partition trait, the partition part size will be the same as the requested size (i.e. interleaved or blocked has no effect), except for interleaved when the memkind library is available. Furthermore, for nearest and unless the numa library is available, the memory might not be on the same NUMA node as the thread that allocated the memory; on Linux, this is in particular the case when the memory placement policy is set to preferred.
- The access trait has no effect such that memory is always accessible by all threads. (Except on supported non-host devices.)
- The sync_hint trait has no effect.
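The pool_size behavior described above amounts to a byte budget rather than a real memory pool. The following sketch models that budget; PoolBudget and its members are hypothetical names, and the release step is an assumption of this example, not something the text above specifies.

```cpp
#include <cstddef>

// Hypothetical model of the byte budget implied by the pool_size trait.
struct PoolBudget {
    std::size_t pool_size;  // value of the pool_size trait, in bytes
    std::size_t used = 0;   // bytes currently accounted to the pool

    // Returns true if the request fits into the remaining budget.
    // A false result means the action of the fallback trait would apply.
    bool try_reserve(std::size_t bytes) {
        if (bytes > pool_size - used)
            return false;
        used += bytes;
        return true;
    }

    // Assumption of this sketch: freed bytes are returned to the budget.
    // The caller must only release bytes it previously reserved.
    void release(std::size_t bytes) { used -= bytes; }
};
```

In this model, a budget of 1024 bytes accepts a 1000-byte request, rejects a further 100-byte request, and accepts it again once the first request has been released.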
See also: Offload-Target Specifics