LLVM 8.0.1
The hsa_ext_control_directives_t specifies the values for the HSAIL control directives.
#include "Target/AMDGPU/AMDKernelCodeT.h"
Public Attributes
hsa_ext_control_directive_present64_t | enabled_control_directives
This is a bit set indicating which control directives have been specified.
hsa_ext_exception_kind16_t | enable_break_exceptions
If enableBreakExceptions is not enabled then must be 0, otherwise must be non-0 and specifies the set of HSAIL exceptions that must have the BREAK policy enabled.
hsa_ext_exception_kind16_t | enable_detect_exceptions
If enableDetectExceptions is not enabled then must be 0, otherwise must be non-0 and specifies the set of HSAIL exceptions that must have the DETECT policy enabled.
uint32_t | max_dynamic_group_size
If maxDynamicGroupSize is not enabled then must be 0, and any amount of dynamic group segment can be allocated for a dispatch, otherwise the value specifies the maximum number of bytes of dynamic group segment that can be allocated for a dispatch.
uint32_t | max_flat_grid_size
If maxFlatGridSize is not enabled then must be 0, otherwise must be greater than 0.
uint32_t | max_flat_workgroup_size
If maxFlatWorkgroupSize is not enabled then must be 0, otherwise must be greater than 0.
uint32_t | requested_workgroups_per_cu
If requestedWorkgroupsPerCu is not enabled then must be 0, and the finalizer is free to generate ISA that may result in any number of work-groups executing on a single compute unit.
hsa_dim3_t | required_grid_size
If not enabled then all elements for Dim3 must be 0, otherwise every element must be greater than 0.
hsa_dim3_t | required_workgroup_size
If requiredWorkgroupSize is not enabled then all elements for Dim3 must be 0, and the produced code can be dispatched with any legal work-group range consistent with the dispatch dimensions.
uint8_t | required_dim
If requiredDim is not enabled then must be 0 and the produced kernel code can be dispatched with 1, 2 or 3 dimensions.
uint8_t | reserved[75]
Reserved. Must be 0.
The hsa_ext_control_directives_t specifies the values for the HSAIL control directives.
These control how the finalizer generates code. This struct is used both as an argument to hsaFinalizeKernel, to specify values for the control directives, and in HsaKernelCode, to record the values of the control directives that the finalizer used when generating the code. Those recorded values come either from the finalizer argument or from explicit HSAIL control directives. See the definition of the control directives in the HSA Programmer's Reference Manual, which also defines how the values specified as finalizer arguments must agree with the control directives in the HSAIL code.
Definition at line 210 of file AMDKernelCodeT.h.
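The field list above can be summarized as a layout sketch. The mirror types below are hypothetical stand-ins written for illustration, not the definitions from AMDKernelCodeT.h: the typedef widths are inferred from the type names, and hsa_dim3_t is assumed to be three 32-bit elements. Under those assumptions, and with the usual alignment rules on common ABIs, the fields add up to 128 bytes, with reserved[75] filling the tail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical local mirror of the documented layout. The real definitions
// live in AMDKernelCodeT.h; widths here are inferred from the type names.
using present64_mirror = std::uint64_t;
using exception_kind16_mirror = std::uint16_t;
struct dim3_mirror { std::uint32_t x, y, z; };  // assumed shape of hsa_dim3_t

struct control_directives_mirror {
  present64_mirror enabled_control_directives;
  exception_kind16_mirror enable_break_exceptions;
  exception_kind16_mirror enable_detect_exceptions;
  std::uint32_t max_dynamic_group_size;
  std::uint32_t max_flat_grid_size;
  std::uint32_t max_flat_workgroup_size;
  std::uint32_t requested_workgroups_per_cu;
  dim3_mirror required_grid_size;
  dim3_mirror required_workgroup_size;
  std::uint8_t required_dim;
  std::uint8_t reserved[75];
};

// Holds for the assumed widths on common ABIs (8-byte alignment for uint64_t).
static_assert(sizeof(control_directives_mirror) == 128,
              "mirror layout is expected to be 128 bytes");

// Value-initialization zeroes every field, including reserved, which is the
// documented "no control directives specified" state.
inline control_directives_mirror make_empty_directives() {
  return control_directives_mirror{};
}
```

Starting from an all-zero value and then setting only the enabled fields keeps the "not enabled means 0" rule satisfied by construction.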
hsa_ext_exception_kind16_t hsa_ext_control_directives_s::enable_break_exceptions |
If enableBreakExceptions is not enabled then must be 0, otherwise must be non-0 and specifies the set of HSAIL exceptions that must have the BREAK policy enabled.
If this set is not empty then the generated code may have lower performance than if the set is empty. If the kernel being finalized has any enablebreakexceptions control directives, then the values specified by this argument are unioned with the values in these control directives. If any of the functions the kernel calls has an enablebreakexceptions control directive, then its values must be equal to, or a subset of, this union.
Definition at line 227 of file AMDKernelCodeT.h.
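The union and subset rules described above are plain bit-set operations. The following is a minimal sketch, assuming the exception set is a 16-bit mask (as the type name suggests); the function names and the bit values used in the usage note are hypothetical and do not correspond to real HSAIL exception encodings. The same logic applies to enable_detect_exceptions.

```cpp
#include <cassert>
#include <cstdint>

using exception_mask16 = std::uint16_t;  // stand-in for hsa_ext_exception_kind16_t

// Union of the finalizer argument with the kernel's own directive values.
constexpr exception_mask16 effective_exceptions(exception_mask16 argument,
                                                exception_mask16 kernel_directive) {
  return static_cast<exception_mask16>(argument | kernel_directive);
}

// True when every bit set in the callee's directive is also set in the union,
// i.e. the callee's set is equal to, or a subset of, the union.
constexpr bool callee_is_compatible(exception_mask16 callee_directive,
                                    exception_mask16 unioned) {
  return (callee_directive & ~unioned) == 0;
}
```

For example, with an argument of 0x1 and a kernel directive of 0x4 the union is 0x5; a called function requesting 0x4 is compatible, while one requesting 0x2 is not.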
hsa_ext_exception_kind16_t hsa_ext_control_directives_s::enable_detect_exceptions |
If enableDetectExceptions is not enabled then must be 0, otherwise must be non-0 and specifies the set of HSAIL exceptions that must have the DETECT policy enabled.
If this set is not empty then the generated code may have lower performance than if the set is empty. However, an implementation should endeavour to make the performance impact small. If the kernel being finalized has any enabledetectexceptions control directives, then the values specified by this argument are unioned with the values in these control directives. If any of the functions the kernel calls has an enabledetectexceptions control directive, then its values must be equal to, or a subset of, this union.
Definition at line 239 of file AMDKernelCodeT.h.
hsa_ext_control_directive_present64_t hsa_ext_control_directives_s::enabled_control_directives |
This is a bit set indicating which control directives have been specified.
If the value is 0 then there are no control directives specified and the rest of the fields can be ignored. The bits are accessed using the hsa_ext_control_directives_present_mask_t. Any control directive that is not enabled in this bit set must have the value of all 0s.
Definition at line 216 of file AMDKernelCodeT.h.
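The rule that a field must be 0 unless its bit is set can be checked mechanically. The sketch below uses two of the scalar fields and hypothetical bit assignments chosen only for illustration; the real bit values come from the mask enumeration in AMDKernelCodeT.h and may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit assignments, for illustration only.
constexpr std::uint64_t kMaxFlatGridSizeBit = std::uint64_t{1} << 0;
constexpr std::uint64_t kMaxFlatWorkgroupSizeBit = std::uint64_t{1} << 1;

// Reduced stand-in for the struct, carrying only the fields checked here.
struct directives_subset {
  std::uint64_t enabled_control_directives;
  std::uint32_t max_flat_grid_size;
  std::uint32_t max_flat_workgroup_size;
};

// Enabled: value must be greater than 0. Not enabled: value must be 0.
constexpr bool field_ok(std::uint64_t enabled, std::uint64_t bit, std::uint32_t value) {
  return (enabled & bit) != 0 ? value > 0 : value == 0;
}

constexpr bool subset_is_valid(const directives_subset &d) {
  return field_ok(d.enabled_control_directives, kMaxFlatGridSizeBit,
                  d.max_flat_grid_size) &&
         field_ok(d.enabled_control_directives, kMaxFlatWorkgroupSizeBit,
                  d.max_flat_workgroup_size);
}
```

An all-zero value passes, and so does one that sets a bit together with a non-zero field; a non-zero field without its bit, or a bit with a zero field, is rejected.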
uint32_t hsa_ext_control_directives_s::max_dynamic_group_size |
If maxDynamicGroupSize is not enabled then must be 0, and any amount of dynamic group segment can be allocated for a dispatch, otherwise the value specifies the maximum number of bytes of dynamic group segment that can be allocated for a dispatch.
If the kernel being finalized has any maxdynamicgroupsize control directives, then their values must all be the same, and must match this argument if it is enabled. The finalizer can use this value to determine the maximum number of bytes of group memory used by each work-group: it adds this value to the group memory required for all group segment variables used by the kernel and all functions it calls, plus the group memory used to implement other HSAIL features such as fbarriers and the detect exception operations. This lets the finalizer determine the expected number of work-groups that a compute unit can execute, and allocate more resources to the work-items if it is known that fewer work-groups can be executed due to group memory limitations.
Definition at line 256 of file AMDKernelCodeT.h.
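The arithmetic described above can be sketched as follows. The function and parameter names are hypothetical, and the compute unit's group memory capacity is an assumed input; this illustrates the calculation and is not how any particular finalizer implements it.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on group memory per work-group: group segment variables, memory
// for features such as fbarriers, plus the declared dynamic maximum.
constexpr std::uint64_t max_group_bytes_per_workgroup(
    std::uint32_t group_variable_bytes,
    std::uint32_t feature_group_bytes,
    std::uint32_t max_dynamic_group_size) {
  return std::uint64_t{group_variable_bytes} + feature_group_bytes +
         max_dynamic_group_size;
}

// Number of work-groups that fit in a compute unit's group memory.
// Returns UINT64_MAX when no group memory is used, meaning "no limit here".
constexpr std::uint64_t workgroups_limited_by_group_memory(
    std::uint64_t cu_group_bytes, std::uint64_t bytes_per_workgroup) {
  return bytes_per_workgroup == 0 ? UINT64_MAX
                                  : cu_group_bytes / bytes_per_workgroup;
}
```

With 1024 bytes of group variables, 256 bytes for other features, and a dynamic maximum of 2048 bytes, each work-group needs at most 3328 bytes, so an assumed 65536-byte compute unit could hold 19 such work-groups.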
uint32_t hsa_ext_control_directives_s::max_flat_grid_size |
If maxFlatGridSize is not enabled then must be 0, otherwise must be greater than 0.
See the HSA Programmer's Reference Manual description of the maxflatgridsize control directive.
Definition at line 261 of file AMDKernelCodeT.h.
uint32_t hsa_ext_control_directives_s::max_flat_workgroup_size |
If maxFlatWorkgroupSize is not enabled then must be 0, otherwise must be greater than 0.
See the HSA Programmer's Reference Manual description of the maxflatworkgroupsize control directive.
Definition at line 266 of file AMDKernelCodeT.h.
uint32_t hsa_ext_control_directives_s::requested_workgroups_per_cu |
If requestedWorkgroupsPerCu is not enabled then must be 0, and the finalizer is free to generate ISA that may result in any number of work-groups executing on a single compute unit.
Otherwise, the finalizer should attempt to generate ISA that will allow the specified number of work-groups to execute on a single compute unit. This is only a hint and can be ignored by the finalizer. If the kernel being finalized, or any of the functions it calls, has a requestedworkgroupspercu control directive, then the values must be the same. This value can be used to determine the amount of resources that should be allocated to a single work-group and work-item. For example, a low value may allow more resources to be allocated, resulting in higher per work-item performance, as it is known there will never be more than the specified number of work-groups actually executing on the compute unit. Conversely, a high value may allocate fewer resources, resulting in lower per work-item performance, which is offset by the fact that it allows more work-groups to actually execute on the compute unit.
Definition at line 283 of file AMDKernelCodeT.h.
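The trade-off described above can be illustrated with a deliberately simple model. Everything here is an assumption made for the example: the function name, the idea of a single per-compute-unit register pool, and the even split across resident work-items. A real finalizer is free to ignore the hint or to budget resources differently.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model only: splits a hypothetical per-CU register pool evenly
// across the work-items that may be resident at the same time.
inline std::uint32_t register_budget_per_work_item(
    std::uint32_t cu_registers,
    std::uint32_t requested_workgroups_per_cu,
    std::uint32_t work_items_per_group) {
  assert(requested_workgroups_per_cu > 0 && work_items_per_group > 0);
  const std::uint64_t resident_work_items =
      std::uint64_t{requested_workgroups_per_cu} * work_items_per_group;
  // The quotient never exceeds cu_registers, so the narrowing cast is safe.
  return static_cast<std::uint32_t>(cu_registers / resident_work_items);
}
```

In this model, requesting 2 work-groups of 256 work-items from a pool of 65536 registers leaves 128 registers per work-item, while requesting 8 work-groups leaves 32.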
uint8_t hsa_ext_control_directives_s::required_dim |
If requiredDim is not enabled then must be 0 and the produced kernel code can be dispatched with 1, 2 or 3 dimensions.
If enabled then the value must be 1, 2 or 3, and the code produced must only be dispatched with a matching number of dimensions. Other values are illegal. If the kernel being finalized, or any of the functions it calls, has a requireddim control directive, then the values must be the same. This can be used to optimize the code generated to compute the absolute and flat work-group and work-item id, and the dim HSAIL operations.
Definition at line 311 of file AMDKernelCodeT.h.
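The legality and dispatch rules for required_dim can be sketched as two small predicates. The function names are hypothetical, and the enabled flag is passed as a plain bool rather than being read from enabled_control_directives.

```cpp
#include <cassert>
#include <cstdint>

// Enabled: value must be 1, 2 or 3. Not enabled: value must be 0.
constexpr bool required_dim_is_legal(bool enabled, std::uint8_t required_dim) {
  return enabled ? (required_dim >= 1 && required_dim <= 3) : required_dim == 0;
}

// A required_dim of 0 (not enabled) accepts any 1-, 2- or 3-dimensional
// dispatch; otherwise the dispatch must use exactly the required dimension.
constexpr bool dispatch_dims_allowed(std::uint8_t required_dim,
                                     std::uint8_t dispatch_dims) {
  return required_dim == 0 ? (dispatch_dims >= 1 && dispatch_dims <= 3)
                           : dispatch_dims == required_dim;
}
```

A kernel finalized with required_dim set to 2 would therefore reject 1- and 3-dimensional dispatches in this check, while a kernel with required_dim of 0 accepts all three.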
hsa_dim3_t hsa_ext_control_directives_s::required_grid_size |
If requiredGridSize is not enabled then all elements for Dim3 must be 0, otherwise every element must be greater than 0.
See the HSA Programmer's Reference Manual description of the requiredgridsize control directive.
Definition at line 288 of file AMDKernelCodeT.h.
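The all-zero or all-positive rule for a Dim3 field can be expressed as a single predicate. The dim3_mirror type below is a hypothetical stand-in that assumes hsa_dim3_t holds three 32-bit elements.

```cpp
#include <cassert>
#include <cstdint>

struct dim3_mirror { std::uint32_t x, y, z; };  // assumed shape of hsa_dim3_t

// Enabled: every element must be greater than 0.
// Not enabled: every element must be 0.
constexpr bool dim3_field_ok(bool enabled, const dim3_mirror &d) {
  return enabled ? (d.x > 0 && d.y > 0 && d.z > 0)
                 : (d.x == 0 && d.y == 0 && d.z == 0);
}
```

Note that a partially filled value such as {1024, 0, 0} fails in both states: it is not all-zero, and it has elements that are not greater than 0.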
hsa_dim3_t hsa_ext_control_directives_s::required_workgroup_size |
If requiredWorkgroupSize is not enabled then all elements for Dim3 must be 0, and the produced code can be dispatched with any legal work-group range consistent with the dispatch dimensions.
Otherwise, the code produced must always be dispatched with the specified work-group range. No element of the specified range may be 0. The range must be consistent with required_dim and max_flat_workgroup_size. If the kernel being finalized, or any of the functions it calls, has a requiredworkgroupsize control directive, then the values must be the same. Specifying a value can allow the finalizer to optimize work-group id operations, and if the number of work-items in the work-group is less than the WAVESIZE then barrier operations can be optimized to just a memory fence.
Definition at line 301 of file AMDKernelCodeT.h.
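Two of the conditions above lend themselves to a short sketch: consistency with max_flat_workgroup_size, and the work-items-versus-WAVESIZE comparison that enables the barrier optimization. The names are hypothetical, dim3_mirror assumes three 32-bit elements, and the wavefront size is passed in as a parameter rather than taken from any real target query. Consistency with required_dim is not modelled here.

```cpp
#include <cassert>
#include <cstdint>

struct dim3_mirror { std::uint32_t x, y, z; };  // assumed shape of hsa_dim3_t

// Total number of work-items in the work-group, widened to avoid overflow.
constexpr std::uint64_t flat_workgroup_size(const dim3_mirror &wg) {
  return std::uint64_t{wg.x} * wg.y * wg.z;
}

// A max_flat_workgroup_size of 0 means that directive is not enabled, so it
// imposes no limit on the required work-group size.
constexpr bool fits_max_flat_workgroup_size(const dim3_mirror &wg,
                                            std::uint32_t max_flat_workgroup_size) {
  return max_flat_workgroup_size == 0 ||
         flat_workgroup_size(wg) <= max_flat_workgroup_size;
}

// Mirrors the stated condition: fewer work-items than the wavefront size.
constexpr bool barrier_can_be_fence(const dim3_mirror &wg, std::uint32_t wavesize) {
  return flat_workgroup_size(wg) < wavesize;
}
```

For instance, a required size of {64, 2, 1} has 128 work-items, which fits a maximum of 256 but not 64; a required size of {32, 1, 1} with an assumed wavefront size of 64 satisfies the barrier condition.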
uint8_t hsa_ext_control_directives_s::reserved[75] |
Reserved. Must be 0.
Definition at line 314 of file AMDKernelCodeT.h.