17.14 Vectorization

Target Hook: tree TARGET_VECTORIZE_BUILTIN_MASK_FOR_LOAD (void)

This hook should return the DECL of a function f that, given an address addr as an argument, returns a mask m. The mask can be used to extract from two vectors the relevant data that resides at addr when addr is not properly aligned.

The autovectorizer, when vectorizing a load operation from an address addr that may be unaligned, will generate two vector loads from the two aligned addresses around addr. It then generates a REALIGN_LOAD operation to extract the relevant data from the two loaded vectors. The first two arguments to REALIGN_LOAD, v1 and v2, are the two vectors, each of size VS, and the third argument, OFF, defines how the data will be extracted from these two vectors: if OFF is 0, then the returned vector is v2; otherwise, the returned vector is composed from the last VS-OFF elements of v1 concatenated to the first OFF elements of v2.

If this hook is defined, the autovectorizer will generate a call to f (using the DECL tree that this hook returns) and will use the return value of f as the argument OFF to REALIGN_LOAD. Therefore, the mask m returned by f should comply with the semantics expected by REALIGN_LOAD described above. If this hook is not defined, then addr will be used as the argument OFF to REALIGN_LOAD, in which case the low log2(VS) − 1 bits of addr will be considered.
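The REALIGN_LOAD semantics described above can be sketched as a small standalone model. This is only an illustration: the function name realign_load, the element type and the choice VS = 4 are assumptions made for the example and are not part of GCC.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Illustrative model only: VS and the element type are assumptions,
// and realign_load is not a GCC function.
constexpr std::size_t VS = 4;
using Vec = std::array<int, VS>;

// Model of REALIGN_LOAD (v1, v2, OFF) following the prose description.
Vec realign_load(const Vec &v1, const Vec &v2, std::size_t off)
{
  assert(off < VS);
  if (off == 0)
    return v2;                          // OFF == 0: the result is v2.

  Vec result{};
  std::size_t from_v1 = VS - off;       // last VS-OFF elements of v1
  for (std::size_t i = 0; i < from_v1; ++i)
    result[i] = v1[off + i];
  for (std::size_t i = 0; i < off; ++i) // first OFF elements of v2
    result[from_v1 + i] = v2[i];
  return result;
}
```

With v1 = {0, 1, 2, 3} and v2 = {4, 5, 6, 7}, an OFF of 1 selects the last three elements of v1 followed by the first element of v2.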

Target Hook: int TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST (enum vect_cost_for_stmt type_of_cost, tree vectype, int misalign)

Returns the cost of different scalar or vector statements for the vectorization cost model. For vector memory operations, the cost may depend on the type (vectype) and the misalignment value (misalign).

Target Hook: poly_uint64 TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT (const_tree type)

This hook returns the preferred alignment in bits for accesses to vectors of type type in vectorized code. This might be less than or greater than the ABI-defined value returned by TARGET_VECTOR_ALIGNMENT. It can be equal to the alignment of a single element, in which case the vectorizer will not try to optimize for alignment.

The default hook returns TYPE_ALIGN (type), which is correct for most targets.

Target Hook: bool TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE (const_tree type, bool is_packed)

Return true if vector alignment is reachable (by peeling N iterations) for the given scalar type type. is_packed is false if the scalar access using type is known to be naturally aligned.
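As a rough illustration of what "reachable by peeling N iterations" means, the following sketch computes N for a simple case. The helper peel_iterations and its byte-based parameters are hypothetical and are not how GCC represents this internally; the sketch assumes that each scalar iteration advances the address by one element.

```cpp
#include <cassert>

// Hypothetical helper, not a GCC function. Returns the number of scalar
// iterations to peel so that the access becomes aligned to align_bytes,
// or -1 if no number of iterations can achieve that.
int peel_iterations(unsigned misalign_bytes, unsigned elem_bytes,
                    unsigned align_bytes)
{
  assert(elem_bytes != 0 && align_bytes != 0
         && align_bytes % elem_bytes == 0);

  // Each iteration advances the address by elem_bytes, so a start that
  // is not a multiple of elem_bytes can never reach the alignment.
  if (misalign_bytes % elem_bytes != 0)
    return -1;

  unsigned past = misalign_bytes % align_bytes;
  unsigned remaining = (align_bytes - past) % align_bytes;
  return static_cast<int>(remaining / elem_bytes);
}
```

For 4-byte elements and a 16-byte vector alignment, an access that starts 4 bytes past a boundary needs three peeled iterations, whereas one that starts 2 bytes past a boundary (not naturally aligned) can never be aligned this way.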

Target Hook: bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode mode, machine_mode op_mode, rtx output, rtx in0, rtx in1, const vec_perm_indices &sel)

This hook is used to test whether the target can permute up to two vectors of mode op_mode using the permutation vector sel, producing a vector of mode mode. The hook is also used to emit such a permutation.

When the hook is being used to test whether the target supports a permutation, in0, in1, and output are all null. When the hook is being used to emit a permutation, in0 and in1 are the source vectors of mode op_mode and output is the destination vector of mode mode. in1 is the same as in0 if sel describes a permutation on one vector instead of two.

Return true if the operation is possible, emitting instructions for it if rtxes are provided.

If the hook returns false for a mode with multibyte elements, GCC will try the equivalent byte operation. If that also fails, it will try forcing the selector into a register and using the vec_permmode instruction pattern. There is no need for the hook to handle these two implementation approaches itself.
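The dual query/emit behavior of this hook can be modeled outside GCC as follows. The sketch is a toy: toy_vec_perm_const is not the GCC signature, plain integer vectors stand in for rtxes, and the rule that selector indices below n pick from in0 while indices from n to 2n-1 pick from in1 is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Lanes = std::vector<int>;

// Toy stand-in for the hook (not the GCC signature). It "supports" only
// selectors whose indices all lie in [0, 2 * n). When output is null it
// only answers the query; otherwise it also performs the permutation.
bool toy_vec_perm_const(Lanes *output, const Lanes *in0, const Lanes *in1,
                        const std::vector<std::size_t> &sel)
{
  std::size_t n = sel.size();
  for (std::size_t idx : sel)
    if (idx >= 2 * n)
      return false;                 // unsupported selector

  if (!output)
    return true;                    // query mode: nothing to emit

  assert(in0 && in1 && in0->size() == n && in1->size() == n);
  Lanes result(n);                  // separate buffer in case output aliases an input
  for (std::size_t i = 0; i < n; ++i)
    result[i] = sel[i] < n ? (*in0)[sel[i]] : (*in1)[sel[i] - n];
  *output = result;
  return true;
}
```

Calling the function with all pointers null tests support for a selector; calling it with real operands performs the permutation as well, mirroring the "return true, emitting instructions if rtxes are provided" contract.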

Target Hook: bool TARGET_VECTORIZE_PREFERRED_DIV_AS_SHIFTS_OVER_MULT (const_tree type)

Sometimes it is possible to implement a vector division using a sequence of two addition-shift pairs, giving four instructions in total. Return true if taking this approach for type is likely to be better than using a sequence involving highpart multiplication. The default is false if can_mult_highpart_p, otherwise true.
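One instance of a division built from two addition-shift pairs is division of a 16-bit unsigned value by 255. The sketch below is a scalar illustration and assumes this is the kind of sequence the hook refers to; the function name is hypothetical, and the intermediate arithmetic is assumed to be done in a wider type so that the additions cannot wrap.

```cpp
#include <cassert>
#include <cstdint>

// Assumed example: x / 255 for 16-bit unsigned x, with the intermediate
// arithmetic done in 32 bits so that the additions cannot wrap.
std::uint16_t div255_add_shift(std::uint16_t x)
{
  std::uint32_t wide = x;
  std::uint32_t t = (wide + 257u) >> 8;                 // first addition-shift pair
  return static_cast<std::uint16_t>((wide + t) >> 8);   // second addition-shift pair
}
```

A vector version would apply the same four operations lane by lane; whether that beats a highpart-multiply sequence is the target-specific question this hook answers.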

Target Hook: tree TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION (unsigned code, tree vec_type_out, tree vec_type_in)

This hook should return the decl of a function that implements the vectorized variant of the function with the combined_fn code code or NULL_TREE if such a function is not available. The return type of the vectorized function shall be of vector type vec_type_out and the argument types should be vec_type_in.

Target Hook: tree TARGET_VECTORIZE_BUILTIN_MD_VECTORIZED_FUNCTION (tree fndecl, tree vec_type_out, tree vec_type_in)

This hook should return the decl of a function that implements the vectorized variant of target built-in function fndecl. The return type of the vectorized function shall be of vector type vec_type_out and the argument types should be vec_type_in.

Target Hook: bool TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT (machine_mode mode, const_tree type, int misalignment, bool is_packed, bool is_gather_scatter)

This hook should return true if the target supports misaligned vector store/load of a specific factor denoted in the misalignment parameter. The vector store/load should be of machine mode mode and the elements in the vectors should be of type type. The is_packed parameter is true if the misalignment is unknown and the memory access is defined in a packed struct. is_gather_scatter is true if the load/store is a gather or scatter.

Target Hook: machine_mode TARGET_VECTORIZE_PREFERRED_SIMD_MODE (scalar_mode mode)

This hook should return the preferred mode for vectorizing scalar mode mode. The default is equal to word_mode, because the vectorizer can do some transformations even in the absence of specialized SIMD hardware.

Target Hook: machine_mode TARGET_VECTORIZE_SPLIT_REDUCTION (machine_mode)

This hook should return the preferred mode into which the final reduction step on the hook's machine_mode argument (referred to here as mode) should be split. The reduction is then carried out by reducing the upper halves of vectors against the lower halves, recursively, until the specified mode is reached. The default is mode, which means no splitting.
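The recursive halving can be pictured with a small model. This is a sketch under stated assumptions: the reduction operation is taken to be addition, vectors are modeled as std::vector<int>, and target_lanes stands in for the lane count of the mode returned by the hook. None of these names come from GCC.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: fold the upper half of the vector onto the lower half until
// only target_lanes lanes remain. Addition is assumed as the reduction.
std::vector<int> split_reduce(std::vector<int> v, std::size_t target_lanes)
{
  assert(target_lanes != 0);
  while (v.size() > target_lanes)
    {
      assert(v.size() % 2 == 0);
      std::size_t half = v.size() / 2;
      for (std::size_t i = 0; i < half; ++i)
        v[i] += v[i + half];        // upper half reduced against lower half
      v.resize(half);
    }
  return v;
}
```

Starting from eight lanes and a four-lane target, a single halving step is performed; with a two-lane target, two steps are performed. A target equal to the input size corresponds to the default of no splitting.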

Target Hook: unsigned int TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES (vector_modes *modes, bool all)

If using the mode returned by TARGET_VECTORIZE_PREFERRED_SIMD_MODE is not the only approach worth considering, this hook should add one mode to modes for each useful alternative approach. These modes are then passed to TARGET_VECTORIZE_RELATED_MODE to obtain the vector mode for a given element mode.

The modes returned in modes should use the smallest element mode possible for the vectorization approach that they represent, preferring integer modes over floating-point modes in the event of a tie. The first mode should be the TARGET_VECTORIZE_PREFERRED_SIMD_MODE for its element mode.

If all is true, add suitable vector modes even when they are generally not expected to be worthwhile.

The hook returns a bitmask of flags that control how the modes in modes are used. The flags are:

VECT_COMPARE_COSTS

Tells the loop vectorizer to try all the provided modes and pick the one with the lowest cost. By default the vectorizer will choose the first mode that works.

The hook does not need to do anything if the vector mode returned by TARGET_VECTORIZE_PREFERRED_SIMD_MODE is the only one relevant for autovectorization. The default implementation adds no modes and returns 0.

Target Hook: opt_machine_mode TARGET_VECTORIZE_RELATED_MODE (machine_mode vector_mode, scalar_mode element_mode, poly_uint64 nunits)

If a piece of code is using vector mode vector_mode and also wants to operate on elements of mode element_mode, return the vector mode it should use for those elements. If nunits is nonzero, ensure that the mode has exactly nunits elements, otherwise pick whichever vector size pairs the most naturally with vector_mode. Return an empty opt_machine_mode if there is no supported vector mode with the required properties.

There is no prescribed way of handling the case in which nunits is zero. One common choice is to pick a vector mode with the same size as vector_mode; this is the natural choice if the target has a fixed vector size. Another option is to choose a vector mode with the same number of elements as vector_mode; this is the natural choice if the target has a fixed number of elements. Alternatively, the hook might choose a middle ground, such as trying to keep the number of elements as similar as possible while applying maximum and minimum vector sizes.

The default implementation uses mode_for_vector to find the requested mode, returning a mode with the same size as vector_mode when nunits is zero. This is the correct behavior for most targets.
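The "same size as vector_mode" policy for a zero nunits can be sketched with a toy mode description. Everything in this example is hypothetical: ToyMode, OptToyMode and toy_related_mode are not GCC types or functions, and the 128-bit limit is an arbitrary stand-in for "supported by the target".

```cpp
#include <cassert>

// Toy description of a vector mode: element width and lane count.
struct ToyMode { unsigned elem_bits; unsigned nunits; };

// Toy counterpart of opt_machine_mode: exists == false means "empty".
struct OptToyMode { bool exists; ToyMode mode; };

// Assumption for this sketch: vectors of at most 128 bits are supported.
constexpr unsigned kMaxVectorBits = 128;

OptToyMode toy_related_mode(ToyMode vector_mode, unsigned element_bits,
                            unsigned nunits)
{
  assert(element_bits != 0);
  if (nunits == 0)
    {
      // No lane count requested: keep the overall size of vector_mode.
      unsigned total = vector_mode.elem_bits * vector_mode.nunits;
      if (total % element_bits != 0)
        return {false, {0, 0}};
      nunits = total / element_bits;
    }
  if (nunits == 0 || element_bits * nunits > kMaxVectorBits)
    return {false, {0, 0}};         // no supported mode with these properties
  return {true, {element_bits, nunits}};
}
```

Starting from four 32-bit lanes, asking for 16-bit elements with nunits equal to zero yields eight lanes (same total size), while asking for exactly four 64-bit lanes yields an empty result under the assumed 128-bit limit.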

Target Hook: opt_machine_mode TARGET_VECTORIZE_GET_MASK_MODE (machine_mode mode)

Return the mode to use for a vector mask that holds one boolean result for each element of vector mode mode. The returned mask mode can be a vector of integers (class MODE_VECTOR_INT), a vector of booleans (class MODE_VECTOR_BOOL) or a scalar integer (class MODE_INT). Return an empty opt_machine_mode if no such mask mode exists.

The default implementation returns a MODE_VECTOR_INT with the same size and number of elements as mode, if such a mode exists.

Target Hook: bool TARGET_VECTORIZE_CONDITIONAL_OPERATION_IS_EXPENSIVE (unsigned ifn)

This hook returns true if masked operation ifn (really of type internal_fn) should be considered more expensive to use than implementing the same operation without masking. GCC can then try to use unconditional operations instead with extra selects.
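The alternative that GCC may fall back on, an unconditional operation followed by a select, can be illustrated with a scalar model. The function names are hypothetical, addition is used as an arbitrary example operation, and unsigned lanes are used so that computing inactive lanes is harmless. The equivalence shown relies on the operation being safe to execute on inactive lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using UVec = std::vector<unsigned>;
using Mask = std::vector<bool>;

// Masked form: inactive lanes take their value from fallback.
UVec masked_add(const UVec &a, const UVec &b, const Mask &mask,
                const UVec &fallback)
{
  assert(a.size() == b.size() && a.size() == mask.size()
         && a.size() == fallback.size());
  UVec r(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    r[i] = mask[i] ? a[i] + b[i] : fallback[i];
  return r;
}

// Unconditional form: add every lane, then select per lane.
UVec add_then_select(const UVec &a, const UVec &b, const Mask &mask,
                     const UVec &fallback)
{
  assert(a.size() == b.size() && a.size() == mask.size()
         && a.size() == fallback.size());
  UVec sum(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    sum[i] = a[i] + b[i];                       // unconditional operation
  UVec r(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    r[i] = mask[i] ? sum[i] : fallback[i];      // extra select
  return r;
}
```

Both functions produce the same lanes; which form is cheaper is what the hook lets the target express.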

Target Hook: bool TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE (unsigned ifn)

This hook returns true if masked internal function ifn (really of type internal_fn) should be considered expensive when the mask is all zeros. GCC can then try to branch around the instruction instead.
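Branching around a masked instruction when the mask is all zeros can be sketched as follows. The function is a hypothetical scalar model of a masked store, not GCC code; its return value only records whether the masked operation was executed or skipped.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if the masked store was executed, false if it was skipped
// because no lane of the mask was active.
bool masked_store_skip_empty(std::vector<int> &dest,
                             const std::vector<int> &src,
                             const std::vector<bool> &mask)
{
  assert(dest.size() == src.size() && src.size() == mask.size());
  bool any_active = std::any_of(mask.begin(), mask.end(),
                                [](bool m) { return m; });
  if (!any_active)
    return false;                   // branch around the masked operation

  for (std::size_t i = 0; i < src.size(); ++i)
    if (mask[i])
      dest[i] = src[i];
  return true;
}
```

The extra test costs something on every execution, so it is only worthwhile when an all-zeros mask is expensive for the masked instruction, which is the property the hook reports.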

Target Hook: class vector_costs * TARGET_VECTORIZE_CREATE_COSTS (vec_info *vinfo, bool costing_for_scalar)

This hook should initialize target-specific data structures in preparation for modeling the costs of vectorizing a loop or basic block. The default allocates three unsigned integers for accumulating costs for the prologue, body, and epilogue of the loop or basic block. vinfo describes the loop or basic block being vectorized. If costing_for_scalar is true, the cost model is for the scalar version of the loop or block; otherwise it is for the vector version.

Target Hook: tree TARGET_VECTORIZE_BUILTIN_GATHER (const_tree mem_vectype, const_tree index_type, int scale)

Target builtin that implements a vector gather operation. mem_vectype is the vector type of the load and index_type is the scalar type of the index, which is scaled by scale. The default is NULL_TREE, which means that gather loads are not vectorized.
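The effect of a gather load can be modeled in scalar code. The sketch assumes that lane i is loaded from the byte address base + index[i] * scale; the function toy_gather and its use of int elements are illustrative choices, not a GCC interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy gather: lane i is read from base + index[i] * scale (in bytes).
// The addressing rule is an assumption made for this sketch.
std::vector<int> toy_gather(const unsigned char *base,
                            const std::vector<std::ptrdiff_t> &index,
                            int scale)
{
  assert(base != nullptr);
  std::vector<int> result(index.size());
  for (std::size_t i = 0; i < index.size(); ++i)
    std::memcpy(&result[i], base + index[i] * scale, sizeof(int));
  return result;
}
```

With a scale equal to sizeof (int), the indices act as ordinary array subscripts, so indices {4, 0, 2} pick the fifth, first and third elements of an int array.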

Target Hook: tree TARGET_VECTORIZE_BUILTIN_SCATTER (const_tree vectype, const_tree index_type, int scale)

Target builtin that implements a vector scatter operation. vectype is the vector type of the store and index_type is the scalar type of the index, which is scaled by scale. The default is NULL_TREE, which means that scatter stores are not vectorized.
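A scatter store is the mirror image of a gather load. The following scalar model assumes that lane i is written to the byte address base + index[i] * scale; toy_scatter is a hypothetical name and int elements are an arbitrary choice for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Toy scatter: lane i is written to base + index[i] * scale (in bytes).
// The addressing rule is an assumption made for this sketch.
void toy_scatter(unsigned char *base,
                 const std::vector<std::ptrdiff_t> &index,
                 int scale,
                 const std::vector<int> &values)
{
  assert(base != nullptr && index.size() == values.size());
  for (std::size_t i = 0; i < index.size(); ++i)
    std::memcpy(base + index[i] * scale, &values[i], sizeof(int));
}
```

With a scale equal to sizeof (int), indices {3, 1} and values {7, 9} store 7 into element 3 and 9 into element 1 of an int array, leaving the other elements untouched.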

Target Hook: bool TARGET_VECTORIZE_PREFER_GATHER_SCATTER (machine_mode mode, int scale, unsigned int group_size)

This hook returns true if gather loads or scatter stores are cheaper on this target than a sequence of elementwise loads or stores. The mode and scale correspond to the gather_load and scatter_store instruction patterns. The group_size is the number of scalar elements in each scalar loop iteration that are to be combined into the vector.