The description below applies to:
- The allocate clause, except when the allocator modifier is a constant expression with value omp_default_mem_alloc and no align modifier has been specified. (In that case, the normal malloc allocation is used.)
- The allocate directive for variables in static memory; while the alignment is honored, the normal static memory is used.
- The allocate directive for automatic/stack variables, except when the allocator clause is a constant expression with value omp_default_mem_alloc and no align clause has been specified. (In that case, the normal allocation is used: stack allocation and, sometimes for Fortran, also malloc [depending on flags such as -fstack-arrays].)
- Use of the allocators directive and the executable allocate directive for Fortran pointers and allocatables is supported, but files containing those directives must be compiled with -fopenmp-allocators. Additionally, all files that might explicitly or implicitly deallocate memory allocated that way must also be compiled with that option.

The alignment used is the maximum of the value of the align clause and the alignment of the type after honoring, if present, the aligned (GNU::aligned) attribute, C’s _Alignas, and C++’s alignas. However, the align clause of the allocate directive has no effect on the value of C’s _Alignof and C++’s alignof.
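As a minimal sketch of the allocate clause case described above, the following C++ function requests each thread's private copy of a variable from a non-default allocator with an explicit alignment. The function name sum_private_copies is hypothetical, and the sketch assumes a compiler that accepts the OpenMP 5.1 allocator and align modifiers; when compiled without OpenMP support, the pragma is ignored and the function simply returns its argument.

```cpp
#ifdef _OPENMP
#include <omp.h>  // declares omp_high_bw_mem_alloc and the other predefined allocators
#endif

// Hypothetical helper: sums one private copy of x per thread.
// Each private copy is requested from omp_high_bw_mem_alloc with 64-byte
// alignment via the allocate clause. Because the allocator is not
// omp_default_mem_alloc, the description in this section applies.
int sum_private_copies(int x) {
  int total = 0;
#pragma omp parallel firstprivate(x) reduction(+ : total) \
    allocate(allocator(omp_high_bw_mem_alloc), align(64) : x)
  {
    total += x;  // every thread adds its own private copy
  }
  return total;  // number of threads times x (just x without OpenMP)
}
```

The result depends on how many threads the runtime starts, so callers should not rely on a particular multiple of x.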
GCC supports the following predefined allocators and predefined memory spaces:
Predefined allocators | Associated predefined memory spaces |
---|---|
omp_default_mem_alloc | omp_default_mem_space |
omp_large_cap_mem_alloc | omp_large_cap_mem_space |
omp_const_mem_alloc | omp_const_mem_space |
omp_high_bw_mem_alloc | omp_high_bw_mem_space |
omp_low_lat_mem_alloc | omp_low_lat_mem_space |
omp_cgroup_mem_alloc | omp_low_lat_mem_space (implementation defined) |
omp_pteam_mem_alloc | omp_low_lat_mem_space (implementation defined) |
omp_thread_mem_alloc | omp_low_lat_mem_space (implementation defined) |
ompx_gnu_pinned_mem_alloc | omp_default_mem_space (GNU extension) |
Each predefined allocator, including omp_null_allocator, has a corresponding allocator class template that meets the C++ allocator completeness requirements. These templates are located in the omp::allocator namespace, and in the ompx::allocator namespace for GNU extensions. This allows the allocator-aware C++ standard library containers to use OpenMP allocation routines; for instance:
std::vector<int, omp::allocator::cgroup_mem<int>> vec;
The following allocator templates are supported:
Predefined allocators | Associated allocator template |
---|---|
omp_null_allocator | omp::allocator::null_allocator |
omp_default_mem_alloc | omp::allocator::default_mem |
omp_large_cap_mem_alloc | omp::allocator::large_cap_mem |
omp_const_mem_alloc | omp::allocator::const_mem |
omp_high_bw_mem_alloc | omp::allocator::high_bw_mem |
omp_low_lat_mem_alloc | omp::allocator::low_lat_mem |
omp_cgroup_mem_alloc | omp::allocator::cgroup_mem |
omp_pteam_mem_alloc | omp::allocator::pteam_mem |
omp_thread_mem_alloc | omp::allocator::thread_mem |
ompx_gnu_pinned_mem_alloc | ompx::allocator::gnu_pinned_mem |
The following traits are available when constructing a new allocator; if a trait is not specified or is specified with the value default, the specified default value is used for that trait. The predefined allocators use the default values of each trait, except that the omp_cgroup_mem_alloc, omp_pteam_mem_alloc, and omp_thread_mem_alloc allocators have the access trait set to cgroup, pteam, and thread, respectively. For each trait, a named constant prefixed by omp_atk_ exists; for each non-numeric value, a named constant prefixed by omp_atv_ exists.
Trait | Allowed values | Default value |
---|---|---|
sync_hint | contended, uncontended, serialized, private | contended |
alignment | Positive integer that is a power of two | 1 byte |
access | all, cgroup, pteam, thread | all |
pool_size | Positive integer (bytes) | See below |
fallback | default_mem_fb, null_fb, abort_fb, allocator_fb | See below |
fb_data | allocator handle | (none) |
pinned | true, false | See below |
partition | environment, nearest, blocked, interleaved | environment |
For the fallback trait, the default value is null_fb for the omp_default_mem_alloc allocator and any allocator that is associated with device memory; for all other allocators, it is default_mem_fb by default.

For the pinned trait, the default value is true for the predefined allocator ompx_gnu_pinned_mem_alloc (a GNU extension), and false for all others.
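A minimal sketch of constructing an allocator with non-default traits, using the standard OpenMP routines omp_init_allocator, omp_alloc, omp_free, and omp_destroy_allocator. The function name aligned_allocation_demo and the chosen trait values are illustrative assumptions; traits that are not listed keep the defaults given in the table above.

```cpp
#include <cstdint>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper. Returns 1 if a 64-byte-aligned block was obtained,
// 0 if not, and -1 if the file was compiled without OpenMP support.
int aligned_allocation_demo() {
#ifdef _OPENMP
  // Keys use the omp_atk_ prefix; non-numeric values use the omp_atv_ prefix.
  const omp_alloctrait_t traits[] = {
      {omp_atk_alignment, 64},              // numeric value: bytes, power of two
      {omp_atk_fallback, omp_atv_null_fb},  // return a null pointer on failure
  };
  omp_allocator_handle_t allocator =
      omp_init_allocator(omp_default_mem_space, 2, traits);
  if (allocator == omp_null_allocator) return 0;

  void *p = omp_alloc(1000, allocator);
  const bool ok =
      p != nullptr && reinterpret_cast<std::uintptr_t>(p) % 64 == 0;

  omp_free(p, allocator);  // freeing a null pointer is a no-op
  omp_destroy_allocator(allocator);
  return ok ? 1 : 0;
#else
  return -1;  // nothing to demonstrate without OpenMP
#endif
}
```

The -1 result only exists so that the sketch also builds without -fopenmp; real code would normally require OpenMP outright.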
The following description applies to the initial device (the host) and largely also to non-host devices; for the latter, also see Offload-Target Specifics.
For the memory spaces, the following applies:
- omp_default_mem_space is supported
- omp_const_mem_space maps to omp_default_mem_space
- omp_low_lat_mem_space is only available on supported devices, and maps to omp_default_mem_space otherwise
- omp_large_cap_mem_space maps to omp_default_mem_space, unless the memkind library is available
- omp_high_bw_mem_space maps to omp_default_mem_space, unless the memkind library is available
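Because unsupported memory spaces map to omp_default_mem_space, an allocator can usually be created for every predefined memory space on the host. The following sketch probes this at run time; count_usable_memory_spaces is a hypothetical helper, and the expectation that all five spaces work is an assumption about the host configuration, not a guarantee.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper. Returns the number of predefined memory spaces for
// which an allocator could be created and a small allocation succeeded,
// or -1 if the file was compiled without OpenMP support.
int count_usable_memory_spaces() {
#ifdef _OPENMP
  const omp_memspace_handle_t spaces[] = {
      omp_default_mem_space, omp_large_cap_mem_space, omp_const_mem_space,
      omp_high_bw_mem_space, omp_low_lat_mem_space};
  int usable = 0;
  for (omp_memspace_handle_t space : spaces) {
    // No traits given: every trait takes its default value.
    omp_allocator_handle_t allocator = omp_init_allocator(space, 0, nullptr);
    if (allocator == omp_null_allocator) continue;
    void *p = omp_alloc(64, allocator);
    if (p != nullptr) ++usable;
    omp_free(p, allocator);
    omp_destroy_allocator(allocator);
  }
  return usable;
#else
  return -1;
#endif
}
```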
On Linux systems, where the memkind library (libmemkind.so.0) is available at runtime and the respective memkind kind is supported, it is used when creating memory allocators requesting:
- the partition trait interleaved, except when the memory space is omp_large_cap_mem_space (uses MEMKIND_HBW_INTERLEAVE)
- the memory space omp_high_bw_mem_space (uses MEMKIND_HBW_PREFERRED)
- the memory space omp_large_cap_mem_space (uses MEMKIND_DAX_KMEM_ALL or, if not available, MEMKIND_DAX_KMEM)
On Linux systems, where the numa library (libnuma.so.1) is available at runtime, it is used when creating memory allocators requesting the partition trait nearest, except when both the libmemkind library is available and the memory space is either omp_large_cap_mem_space or omp_high_bw_mem_space.

Note that the numa library rounds up the allocation size to a multiple of the system page size; therefore, consider using it only with large data or by sharing allocations via the pool_size trait. Furthermore, the Linux kernel does not guarantee that an allocation will always be on the nearest NUMA node, nor that the same node will be used after reallocation. Note additionally that, on Linux, the default setting of the memory placement policy is to use the current node; therefore, unless the memory placement policy has been overridden, the partition trait environment (the default) will effectively be a nearest allocation.
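To illustrate why the page-size rounding matters for small requests, the following sketch computes the rounded size and the unused fraction. The helper names are hypothetical, and the 4096-byte page size in the usage note is an assumption; query the actual value on the target system.

```cpp
#include <cstddef>

// Rounds size up to the next multiple of page_size (page_size must be nonzero).
constexpr std::size_t round_up_to_page(std::size_t size, std::size_t page_size) {
  return (size + page_size - 1) / page_size * page_size;
}

// Fraction of the rounded-up allocation that the request itself does not use.
// size must be nonzero, otherwise the rounded size is zero as well.
constexpr double wasted_fraction(std::size_t size, std::size_t page_size) {
  return 1.0 - static_cast<double>(size) /
                   static_cast<double>(round_up_to_page(size, page_size));
}
```

With an assumed 4096-byte page, round_up_to_page(100, 4096) is 4096 and round_up_to_page(4097, 4096) is 8192, so a 100-byte request leaves most of its page unused.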
Additional notes regarding the traits:
- The pinned trait is supported on Linux hosts, but is subject to the OS ulimit/rlimit locked memory settings. It currently uses mmap and is therefore optimized for few allocations, including large data. If the conditions for numa or memkind allocations are fulfilled, those allocators are used instead.
- The default for the pool_size trait is no pool; for every (re)allocation the associated library routine is called, which might internally use a memory pool. Currently, the same applies when a pool_size has been specified, except that once allocations exceed the pool size, the action of the fallback trait applies.
- For the partition trait, the partition part size will be the same as the requested size (i.e. interleaved or blocked has no effect), except for interleaved when the memkind library is available. Furthermore, for nearest and unless the numa library is available, the memory might not be on the same NUMA node as the thread that allocated the memory; on Linux, this is in particular the case when the memory placement policy is set to preferred.
- The access trait has no effect, such that memory is always accessible by all threads (except on supported non-host devices).
- The sync_hint trait has no effect.
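The interaction of pool_size and fallback described above can be sketched as follows. The function name pool_limit_demo and the sizes are illustrative assumptions; the sketch only checks that a request larger than the pool size triggers the null_fb fallback and returns a null pointer.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper. Returns 1 if the small request succeeded and the
// oversized one returned a null pointer, 0 otherwise, and -1 when compiled
// without OpenMP support.
int pool_limit_demo() {
#ifdef _OPENMP
  const omp_alloctrait_t traits[] = {
      {omp_atk_pool_size, 1024},            // limit in bytes
      {omp_atk_fallback, omp_atv_null_fb},  // null pointer once the limit is hit
  };
  omp_allocator_handle_t allocator =
      omp_init_allocator(omp_default_mem_space, 2, traits);
  if (allocator == omp_null_allocator) return 0;

  void *small = omp_alloc(256, allocator);   // fits within the pool size
  void *large = omp_alloc(4096, allocator);  // exceeds it, so null_fb applies

  const bool as_described = small != nullptr && large == nullptr;
  omp_free(large, allocator);  // no-op when large is a null pointer
  omp_free(small, allocator);
  omp_destroy_allocator(allocator);
  return as_described ? 1 : 0;
#else
  return -1;
#endif
}
```

With abort_fb in place of null_fb, the oversized request would terminate the program instead of returning a null pointer, so null_fb is the safer choice when probing limits.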
See also: Offload-Target Specifics