Class CLCapabilities

java.lang.Object
org.lwjgl.opencl.CLCapabilities

public class CLCapabilities extends Object
Defines the capabilities of an OpenCL platform or device.

The instance returned by CL.createPlatformCapabilities(long) exposes the functionality present on either the platform or any of its devices. This differs from the PLATFORM_EXTENSIONS string, which lists only platform functionality that is supported across all of the platform's devices.

The instance returned by CL.createDeviceCapabilities(long, org.lwjgl.opencl.CLCapabilities) exposes only the functionality available on that particular device.
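
The difference between the two scopes described above can be modeled with plain sets. This is only a sketch: the class and method names are hypothetical, the extension names in the usage note are arbitrary examples, and LWJGL itself is not called.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model, not LWJGL code: each inner set holds the names supported by one device.
final class CapabilityScopeSketch {
    /** Functionality present on ANY device: models what the platform capabilities instance exposes. */
    static Set<String> anyDevice(List<Set<String>> perDevice) {
        Set<String> union = new TreeSet<>();
        for (Set<String> device : perDevice) {
            union.addAll(device);
        }
        return union;
    }

    /** Functionality present on ALL devices: models the PLATFORM_EXTENSIONS string. */
    static Set<String> allDevices(List<Set<String>> perDevice) {
        if (perDevice.isEmpty()) {
            return new TreeSet<>();
        }
        Set<String> common = new TreeSet<>(perDevice.get(0));
        for (Set<String> device : perDevice) {
            common.retainAll(device);
        }
        return common;
    }
}
```

With one device supporting cl_khr_fp64 and cl_khr_icd and another supporting only cl_khr_icd, anyDevice reports both names while allDevices reports only cl_khr_icd.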

  • Field Details

    • clGetPlatformIDs

      public final long clGetPlatformIDs
    • clGetPlatformInfo

      public final long clGetPlatformInfo
    • clGetDeviceIDs

      public final long clGetDeviceIDs
    • clGetDeviceInfo

      public final long clGetDeviceInfo
    • clCreateContext

      public final long clCreateContext
    • clCreateContextFromType

      public final long clCreateContextFromType
    • clRetainContext

      public final long clRetainContext
    • clReleaseContext

      public final long clReleaseContext
    • clGetContextInfo

      public final long clGetContextInfo
    • clCreateCommandQueue

      public final long clCreateCommandQueue
    • clRetainCommandQueue

      public final long clRetainCommandQueue
    • clReleaseCommandQueue

      public final long clReleaseCommandQueue
    • clGetCommandQueueInfo

      public final long clGetCommandQueueInfo
    • clCreateBuffer

      public final long clCreateBuffer
    • clEnqueueReadBuffer

      public final long clEnqueueReadBuffer
    • clEnqueueWriteBuffer

      public final long clEnqueueWriteBuffer
    • clEnqueueCopyBuffer

      public final long clEnqueueCopyBuffer
    • clEnqueueMapBuffer

      public final long clEnqueueMapBuffer
    • clCreateImage2D

      public final long clCreateImage2D
    • clCreateImage3D

      public final long clCreateImage3D
    • clGetSupportedImageFormats

      public final long clGetSupportedImageFormats
    • clEnqueueReadImage

      public final long clEnqueueReadImage
    • clEnqueueWriteImage

      public final long clEnqueueWriteImage
    • clEnqueueCopyImage

      public final long clEnqueueCopyImage
    • clEnqueueCopyImageToBuffer

      public final long clEnqueueCopyImageToBuffer
    • clEnqueueCopyBufferToImage

      public final long clEnqueueCopyBufferToImage
    • clEnqueueMapImage

      public final long clEnqueueMapImage
    • clGetImageInfo

      public final long clGetImageInfo
    • clRetainMemObject

      public final long clRetainMemObject
    • clReleaseMemObject

      public final long clReleaseMemObject
    • clEnqueueUnmapMemObject

      public final long clEnqueueUnmapMemObject
    • clGetMemObjectInfo

      public final long clGetMemObjectInfo
    • clCreateSampler

      public final long clCreateSampler
    • clRetainSampler

      public final long clRetainSampler
    • clReleaseSampler

      public final long clReleaseSampler
    • clGetSamplerInfo

      public final long clGetSamplerInfo
    • clCreateProgramWithSource

      public final long clCreateProgramWithSource
    • clCreateProgramWithBinary

      public final long clCreateProgramWithBinary
    • clRetainProgram

      public final long clRetainProgram
    • clReleaseProgram

      public final long clReleaseProgram
    • clBuildProgram

      public final long clBuildProgram
    • clUnloadCompiler

      public final long clUnloadCompiler
    • clGetProgramInfo

      public final long clGetProgramInfo
    • clGetProgramBuildInfo

      public final long clGetProgramBuildInfo
    • clCreateKernel

      public final long clCreateKernel
    • clCreateKernelsInProgram

      public final long clCreateKernelsInProgram
    • clRetainKernel

      public final long clRetainKernel
    • clReleaseKernel

      public final long clReleaseKernel
    • clSetKernelArg

      public final long clSetKernelArg
    • clGetKernelInfo

      public final long clGetKernelInfo
    • clGetKernelWorkGroupInfo

      public final long clGetKernelWorkGroupInfo
    • clEnqueueNDRangeKernel

      public final long clEnqueueNDRangeKernel
    • clEnqueueTask

      public final long clEnqueueTask
    • clEnqueueNativeKernel

      public final long clEnqueueNativeKernel
    • clWaitForEvents

      public final long clWaitForEvents
    • clGetEventInfo

      public final long clGetEventInfo
    • clRetainEvent

      public final long clRetainEvent
    • clReleaseEvent

      public final long clReleaseEvent
    • clEnqueueMarker

      public final long clEnqueueMarker
    • clEnqueueBarrier

      public final long clEnqueueBarrier
    • clEnqueueWaitForEvents

      public final long clEnqueueWaitForEvents
    • clGetEventProfilingInfo

      public final long clGetEventProfilingInfo
    • clFlush

      public final long clFlush
    • clFinish

      public final long clFinish
    • clGetExtensionFunctionAddress

      public final long clGetExtensionFunctionAddress
    • clCreateFromGLBuffer

      public final long clCreateFromGLBuffer
    • clCreateFromGLTexture2D

      public final long clCreateFromGLTexture2D
    • clCreateFromGLTexture3D

      public final long clCreateFromGLTexture3D
    • clCreateFromGLRenderbuffer

      public final long clCreateFromGLRenderbuffer
    • clGetGLObjectInfo

      public final long clGetGLObjectInfo
    • clGetGLTextureInfo

      public final long clGetGLTextureInfo
    • clEnqueueAcquireGLObjects

      public final long clEnqueueAcquireGLObjects
    • clEnqueueReleaseGLObjects

      public final long clEnqueueReleaseGLObjects
    • clCreateSubBuffer

      public final long clCreateSubBuffer
    • clSetMemObjectDestructorCallback

      public final long clSetMemObjectDestructorCallback
    • clEnqueueReadBufferRect

      public final long clEnqueueReadBufferRect
    • clEnqueueWriteBufferRect

      public final long clEnqueueWriteBufferRect
    • clEnqueueCopyBufferRect

      public final long clEnqueueCopyBufferRect
    • clCreateUserEvent

      public final long clCreateUserEvent
    • clSetUserEventStatus

      public final long clSetUserEventStatus
    • clSetEventCallback

      public final long clSetEventCallback
    • clGetExtensionFunctionAddressForPlatform

      public final long clGetExtensionFunctionAddressForPlatform
    • clRetainDevice

      public final long clRetainDevice
    • clReleaseDevice

      public final long clReleaseDevice
    • clCreateSubDevices

      public final long clCreateSubDevices
    • clCreateImage

      public final long clCreateImage
    • clCreateProgramWithBuiltInKernels

      public final long clCreateProgramWithBuiltInKernels
    • clCompileProgram

      public final long clCompileProgram
    • clLinkProgram

      public final long clLinkProgram
    • clUnloadPlatformCompiler

      public final long clUnloadPlatformCompiler
    • clGetKernelArgInfo

      public final long clGetKernelArgInfo
    • clEnqueueFillBuffer

      public final long clEnqueueFillBuffer
    • clEnqueueFillImage

      public final long clEnqueueFillImage
    • clEnqueueMigrateMemObjects

      public final long clEnqueueMigrateMemObjects
    • clEnqueueMarkerWithWaitList

      public final long clEnqueueMarkerWithWaitList
    • clEnqueueBarrierWithWaitList

      public final long clEnqueueBarrierWithWaitList
    • clCreateFromGLTexture

      public final long clCreateFromGLTexture
    • clCreateCommandQueueWithProperties

      public final long clCreateCommandQueueWithProperties
    • clCreatePipe

      public final long clCreatePipe
    • clGetPipeInfo

      public final long clGetPipeInfo
    • clSVMAlloc

      public final long clSVMAlloc
    • clSVMFree

      public final long clSVMFree
    • clEnqueueSVMFree

      public final long clEnqueueSVMFree
    • clEnqueueSVMMemcpy

      public final long clEnqueueSVMMemcpy
    • clEnqueueSVMMemFill

      public final long clEnqueueSVMMemFill
    • clEnqueueSVMMap

      public final long clEnqueueSVMMap
    • clEnqueueSVMUnmap

      public final long clEnqueueSVMUnmap
    • clSetKernelArgSVMPointer

      public final long clSetKernelArgSVMPointer
    • clSetKernelExecInfo

      public final long clSetKernelExecInfo
    • clCreateSamplerWithProperties

      public final long clCreateSamplerWithProperties
    • clSetDefaultDeviceCommandQueue

      public final long clSetDefaultDeviceCommandQueue
    • clGetDeviceAndHostTimer

      public final long clGetDeviceAndHostTimer
    • clGetHostTimer

      public final long clGetHostTimer
    • clCreateProgramWithIL

      public final long clCreateProgramWithIL
    • clCloneKernel

      public final long clCloneKernel
    • clGetKernelSubGroupInfo

      public final long clGetKernelSubGroupInfo
    • clEnqueueSVMMigrateMem

      public final long clEnqueueSVMMigrateMem
    • clSetProgramReleaseCallback

      public final long clSetProgramReleaseCallback
    • clSetProgramSpecializationConstant

      public final long clSetProgramSpecializationConstant
    • clSetContextDestructorCallback

      public final long clSetContextDestructorCallback
    • clCreateBufferWithProperties

      public final long clCreateBufferWithProperties
    • clCreateImageWithProperties

      public final long clCreateImageWithProperties
    • clTrackLiveObjectsAltera

      public final long clTrackLiveObjectsAltera
    • clReportLiveObjectsAltera

      public final long clReportLiveObjectsAltera
    • clEnqueueWaitSignalAMD

      public final long clEnqueueWaitSignalAMD
    • clEnqueueWriteSignalAMD

      public final long clEnqueueWriteSignalAMD
    • clEnqueueMakeBuffersResidentAMD

      public final long clEnqueueMakeBuffersResidentAMD
    • clCreateCommandQueueWithPropertiesAPPLE

      public final long clCreateCommandQueueWithPropertiesAPPLE
    • clLogMessagesToSystemLogAPPLE

      public final long clLogMessagesToSystemLogAPPLE
    • clLogMessagesToStdoutAPPLE

      public final long clLogMessagesToStdoutAPPLE
    • clLogMessagesToStderrAPPLE

      public final long clLogMessagesToStderrAPPLE
    • clGetGLContextInfoAPPLE

      public final long clGetGLContextInfoAPPLE
    • clImportMemoryARM

      public final long clImportMemoryARM
    • clReleaseDeviceEXT

      public final long clReleaseDeviceEXT
    • clRetainDeviceEXT

      public final long clRetainDeviceEXT
    • clCreateSubDevicesEXT

      public final long clCreateSubDevicesEXT
    • clGetImageRequirementsInfoEXT

      public final long clGetImageRequirementsInfoEXT
    • clEnqueueMigrateMemObjectEXT

      public final long clEnqueueMigrateMemObjectEXT
    • clEnqueueGenerateMipmapIMG

      public final long clEnqueueGenerateMipmapIMG
    • clCreateAcceleratorINTEL

      public final long clCreateAcceleratorINTEL
    • clRetainAcceleratorINTEL

      public final long clRetainAcceleratorINTEL
    • clReleaseAcceleratorINTEL

      public final long clReleaseAcceleratorINTEL
    • clGetAcceleratorInfoINTEL

      public final long clGetAcceleratorInfoINTEL
    • clCreateBufferWithPropertiesINTEL

      public final long clCreateBufferWithPropertiesINTEL
    • clGetSupportedGLTextureFormatsINTEL

      public final long clGetSupportedGLTextureFormatsINTEL
    • clGetSupportedVA_APIMediaSurfaceFormatsINTEL

      public final long clGetSupportedVA_APIMediaSurfaceFormatsINTEL
    • clHostMemAllocINTEL

      public final long clHostMemAllocINTEL
    • clDeviceMemAllocINTEL

      public final long clDeviceMemAllocINTEL
    • clSharedMemAllocINTEL

      public final long clSharedMemAllocINTEL
    • clMemFreeINTEL

      public final long clMemFreeINTEL
    • clMemBlockingFreeINTEL

      public final long clMemBlockingFreeINTEL
    • clGetMemAllocInfoINTEL

      public final long clGetMemAllocInfoINTEL
    • clSetKernelArgMemPointerINTEL

      public final long clSetKernelArgMemPointerINTEL
    • clEnqueueMemFillINTEL

      public final long clEnqueueMemFillINTEL
    • clEnqueueMemcpyINTEL

      public final long clEnqueueMemcpyINTEL
    • clEnqueueMigrateMemINTEL

      public final long clEnqueueMigrateMemINTEL
    • clEnqueueMemAdviseINTEL

      public final long clEnqueueMemAdviseINTEL
    • clGetDeviceIDsFromVA_APIMediaAdapterINTEL

      public final long clGetDeviceIDsFromVA_APIMediaAdapterINTEL
    • clCreateFromVA_APIMediaSurfaceINTEL

      public final long clCreateFromVA_APIMediaSurfaceINTEL
    • clEnqueueAcquireVA_APIMediaSurfacesINTEL

      public final long clEnqueueAcquireVA_APIMediaSurfacesINTEL
    • clEnqueueReleaseVA_APIMediaSurfacesINTEL

      public final long clEnqueueReleaseVA_APIMediaSurfacesINTEL
    • clCreateCommandBufferKHR

      public final long clCreateCommandBufferKHR
    • clRetainCommandBufferKHR

      public final long clRetainCommandBufferKHR
    • clReleaseCommandBufferKHR

      public final long clReleaseCommandBufferKHR
    • clFinalizeCommandBufferKHR

      public final long clFinalizeCommandBufferKHR
    • clEnqueueCommandBufferKHR

      public final long clEnqueueCommandBufferKHR
    • clCommandBarrierWithWaitListKHR

      public final long clCommandBarrierWithWaitListKHR
    • clCommandCopyBufferKHR

      public final long clCommandCopyBufferKHR
    • clCommandCopyBufferRectKHR

      public final long clCommandCopyBufferRectKHR
    • clCommandCopyBufferToImageKHR

      public final long clCommandCopyBufferToImageKHR
    • clCommandCopyImageKHR

      public final long clCommandCopyImageKHR
    • clCommandCopyImageToBufferKHR

      public final long clCommandCopyImageToBufferKHR
    • clCommandFillBufferKHR

      public final long clCommandFillBufferKHR
    • clCommandFillImageKHR

      public final long clCommandFillImageKHR
    • clCommandNDRangeKernelKHR

      public final long clCommandNDRangeKernelKHR
    • clGetCommandBufferInfoKHR

      public final long clGetCommandBufferInfoKHR
    • clCreateCommandQueueWithPropertiesKHR

      public final long clCreateCommandQueueWithPropertiesKHR
    • clCreateEventFromEGLSyncKHR

      public final long clCreateEventFromEGLSyncKHR
    • clCreateFromEGLImageKHR

      public final long clCreateFromEGLImageKHR
    • clEnqueueAcquireEGLObjectsKHR

      public final long clEnqueueAcquireEGLObjectsKHR
    • clEnqueueReleaseEGLObjectsKHR

      public final long clEnqueueReleaseEGLObjectsKHR
    • clEnqueueAcquireExternalMemObjectsKHR

      public final long clEnqueueAcquireExternalMemObjectsKHR
    • clEnqueueReleaseExternalMemObjectsKHR

      public final long clEnqueueReleaseExternalMemObjectsKHR
    • clCreateEventFromGLsyncKHR

      public final long clCreateEventFromGLsyncKHR
    • clGetGLContextInfoKHR

      public final long clGetGLContextInfoKHR
    • clCreateProgramWithILKHR

      public final long clCreateProgramWithILKHR
    • clCreateSemaphoreWithPropertiesKHR

      public final long clCreateSemaphoreWithPropertiesKHR
    • clEnqueueWaitSemaphoresKHR

      public final long clEnqueueWaitSemaphoresKHR
    • clEnqueueSignalSemaphoresKHR

      public final long clEnqueueSignalSemaphoresKHR
    • clGetSemaphoreInfoKHR

      public final long clGetSemaphoreInfoKHR
    • clReleaseSemaphoreKHR

      public final long clReleaseSemaphoreKHR
    • clRetainSemaphoreKHR

      public final long clRetainSemaphoreKHR
    • clGetKernelSubGroupInfoKHR

      public final long clGetKernelSubGroupInfoKHR
    • clGetKernelSuggestedLocalWorkSizeKHR

      public final long clGetKernelSuggestedLocalWorkSizeKHR
    • clTerminateContextKHR

      public final long clTerminateContextKHR
    • clCreateBufferNV

      public final long clCreateBufferNV
    • clSetContentSizeBufferPoCL

      public final long clSetContentSizeBufferPoCL
    • clGetDeviceImageInfoQCOM

      public final long clGetDeviceImageInfoQCOM
    • OpenCL10

      public final boolean OpenCL10
      When true, CL10 is supported.
    • OpenCL10GL

      public final boolean OpenCL10GL
      When true, CL10GL is supported.
    • OpenCL11

      public final boolean OpenCL11
      When true, CL11 is supported.
    • OpenCL12

      public final boolean OpenCL12
      When true, CL12 is supported.
    • OpenCL12GL

      public final boolean OpenCL12GL
      When true, CL12GL is supported.
    • OpenCL20

      public final boolean OpenCL20
      When true, CL20 is supported.
    • OpenCL21

      public final boolean OpenCL21
      When true, CL21 is supported.
    • OpenCL22

      public final boolean OpenCL22
      When true, CL22 is supported.
    • OpenCL30

      public final boolean OpenCL30
      When true, CL30 is supported.
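
The version flags and the function address fields are typically consulted together before calling into OpenCL. The sketch below assumes that an address of 0 means the function is unavailable. The Caps class is a hypothetical stand-in that mirrors three field names of this class so the example runs without LWJGL.

```java
final class QueueCreationChoice {
    /** Hypothetical stand-in for three CLCapabilities fields; not the real class. */
    static final class Caps {
        final boolean OpenCL20;
        final long clCreateCommandQueue;
        final long clCreateCommandQueueWithProperties;

        Caps(boolean openCL20, long createQueue, long createQueueWithProperties) {
            this.OpenCL20 = openCL20;
            this.clCreateCommandQueue = createQueue;
            this.clCreateCommandQueueWithProperties = createQueueWithProperties;
        }
    }

    /** Returns the name of the queue creation function to use, preferring the OpenCL 2.0 entry point. */
    static String choose(Caps caps) {
        // Assumption: a zero address means the function is not available.
        if (caps.OpenCL20 && caps.clCreateCommandQueueWithProperties != 0L) {
            return "clCreateCommandQueueWithProperties";
        }
        if (caps.clCreateCommandQueue != 0L) {
            return "clCreateCommandQueue";
        }
        throw new IllegalStateException("no command queue creation function available");
    }
}
```

In real code the same checks would be made against the fields of a CLCapabilities instance rather than the stand-in.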
    • cl_altera_compiler_mode

      public final boolean cl_altera_compiler_mode
      When true, ALTERACompilerMode is supported.
    • cl_altera_device_temperature

      public final boolean cl_altera_device_temperature
      When true, ALTERADeviceTemperature is supported.
    • cl_altera_live_object_tracking

      public final boolean cl_altera_live_object_tracking
      When true, ALTERALiveObjectTracking is supported.
    • cl_amd_bus_addressable_memory

      public final boolean cl_amd_bus_addressable_memory
      When true, AMDBusAddressableMemory is supported.
    • cl_amd_compile_options

      public final boolean cl_amd_compile_options
      When true, the amd_compile_options extension is supported.

      This extension adds the following options, which are not part of the OpenCL specification:

      • -g – This is an experimental feature that lets you use the GNU project debugger, GDB, to debug kernels on x86 CPUs running Linux or Cygwin/MinGW under Windows. This option does not affect the default optimization of the OpenCL code.
      • -O0 – Tells the compiler not to optimize. This is equivalent to the OpenCL standard option -cl-opt-disable.
      • -f[no-]bin-source – Does [not] generate OpenCL source in the .source section. By default, the source is NOT generated.
      • -f[no-]bin-llvmir – Does [not] generate LLVM IR in the .llvmir section. By default, LLVM IR IS generated.
      • -f[no-]bin-amdil – Does [not] generate AMD IL in the .amdil section. By default, AMD IL is NOT generated.
      • -f[no-]bin-exe – Does [not] generate the executable (ISA) in the .text section. By default, the executable IS generated.
      • -f[no-]bin-hsail – Does [not] generate HSAIL/BRIG in the binary. By default, HSAIL/BRIG is NOT generated.

      To avoid source changes, two environment variables can be used to change the CL options at runtime:

      • AMD_OCL_BUILD_OPTIONS – Overrides the CL options specified in clBuildProgram.
      • AMD_OCL_BUILD_OPTIONS_APPEND – Appends options to the options specified in clBuildProgram.
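
The option syntax and the two environment variables can be illustrated with a small model. This is a sketch, not the driver's logic: the class and method names are hypothetical, and the behavior when both variables are set (override first, then append) is an assumption that the text above does not confirm.

```java
final class AmdBuildOptionsSketch {
    /** Renders one -f[no-]bin-<section> toggle, for example "-fbin-source" or "-fno-bin-source". */
    static String binSection(String section, boolean generate) {
        return (generate ? "-fbin-" : "-fno-bin-") + section;
    }

    /**
     * Models the environment variables: a non-null override replaces the program's options,
     * and a non-null append is added at the end. Applying both in this order is an assumption.
     */
    static String effectiveOptions(String programOptions, String override, String append) {
        String base = (override != null) ? override : programOptions;
        return (append != null) ? (base + " " + append).trim() : base;
    }
}
```

For example, effectiveOptions("-O0", null, "-g") yields "-O0 -g", which is the string the compiler would see in this model.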
    • cl_amd_device_attribute_query

      public final boolean cl_amd_device_attribute_query
      When true, AMDDeviceAttributeQuery is supported.
    • cl_amd_device_board_name

      public final boolean cl_amd_device_board_name
      When true, AMDDeviceBoardName is supported.
    • cl_amd_device_persistent_memory

      public final boolean cl_amd_device_persistent_memory
      When true, AMDDevicePersistentMemory is supported.
    • cl_amd_device_profiling_timer_offset

      public final boolean cl_amd_device_profiling_timer_offset
      When true, AMDDeviceProfilingTimerOffset is supported.
    • cl_amd_device_topology

      public final boolean cl_amd_device_topology
      When true, AMDDeviceTopology is supported.
    • cl_amd_event_callback

      public final boolean cl_amd_event_callback
      When true, the amd_event_callback extension is supported.

      This extension provides the ability to register event callbacks for states other than COMPLETE. The full set of event states is allowed: QUEUED, SUBMITTED, and RUNNING.

    • cl_amd_fp64

      public final boolean cl_amd_fp64
      When true, the amd_fp64 extension is supported.

      This extension provides a subset of the functionality provided by the cl_khr_fp64 extension. When enabled, the compiler recognizes the double scalar and vector types, compiles expressions involving those types, and accepts calls to all built-in functions enabled by the cl_khr_fp64 extension. However, this extension does not guarantee that all cl_khr_fp64 built-in functions are implemented, nor that the built-in functions that have been implemented are conformant to the cl_khr_fp64 extension.

    • cl_amd_media_ops

      public final boolean cl_amd_media_ops
      When true, the amd_media_ops extension is supported.

      When enabled, the directive adds the following built-in functions to the OpenCL language.

      
       Note: typen denotes an OpenCL scalar type {n = 1} or vector type {n = 4, 8, 16}.
       
       Built-in Function
         uint  amd_pack(float4 src)
       Description
         dst =   ((((uint)src.s0) & 0xff)      )
               + ((((uint)src.s1) & 0xff) <<  8)
               + ((((uint)src.s2) & 0xff) << 16)
               + ((((uint)src.s3) & 0xff) << 24)
       
       Built-in Function
         floatn  amd_unpack3(uintn src)
       Description
         dst.s0 = (float)((src.s0 >> 24) & 0xff)
         similar operation applied to other components of the vectors
       
       Built-in Function
         floatn   amd_unpack2 (uintn src)
       Description
         dst.s0 = (float)((src.s0 >> 16) & 0xff)
         similar operation applied to other components of the vectors
       
       Built-in Function
         floatn   amd_unpack1 (uintn src)
       Description
         dst.s0 = (float)((src.s0 >> 8) & 0xff)
         similar operation applied to other components of the vectors
       
       Built-in Function
         floatn   amd_unpack0 (uintn src)
       Description
         dst.s0 = (float)(src.s0 & 0xff)
         similar operation applied to other components of the vectors
       
       Built-in Function
         uintn  amd_bitalign (uintn src0, uintn src1, uintn src2)
       Description
         dst.s0 =  (uint) (((((long)src0.s0) << 32) | (long)src1.s0) >> (src2.s0 & 31))
         similar operation applied to other components of the vectors.
       
       
       Built-in Function
         uintn  amd_bytealign (uintn src0, uintn src1, uintn src2)
       Description
         dst.s0 =  (uint) (((((long)src0.s0) << 32) | (long)src1.s0) >> ((src2.s0 & 3)*8))
         similar operation applied to other components of the vectors
       
       Built-in Function
         uintn  amd_lerp (uintn src0, uintn src1, uintn src2)
       Description
         dst.s0 = (((((src0.s0 >>  0) & 0xff) + ((src1.s0 >>  0) & 0xff) + ((src2.s0 >>  0) & 1)) >> 1) <<  0) +
                  (((((src0.s0 >>  8) & 0xff) + ((src1.s0 >>  8) & 0xff) + ((src2.s0 >>  8) & 1)) >> 1) <<  8) +
                  (((((src0.s0 >> 16) & 0xff) + ((src1.s0 >> 16) & 0xff) + ((src2.s0 >> 16) & 1)) >> 1) << 16) +
                  (((((src0.s0 >> 24) & 0xff) + ((src1.s0 >> 24) & 0xff) + ((src2.s0 >> 24) & 1)) >> 1) << 24);
         similar operation applied to other components of the vectors
       
       Built-in Function
         uintn  amd_sad (uintn src0, uintn src1, uintn src2)
       Description
         dst.s0 = src2.s0 +
                  abs(((src0.s0 >>  0) & 0xff) - ((src1.s0 >>  0) & 0xff)) +
                  abs(((src0.s0 >>  8) & 0xff) - ((src1.s0 >>  8) & 0xff)) +
                  abs(((src0.s0 >> 16) & 0xff) - ((src1.s0 >> 16) & 0xff)) +
                  abs(((src0.s0 >> 24) & 0xff) - ((src1.s0 >> 24) & 0xff));
         similar operation applied to other components of the vectors
       
       Built-in Function
         uintn  amd_sadhi (uintn src0, uintn src1, uintn src2)
       Description
         dst.s0 = src2.s0 +
                  (abs(((src0.s0 >>  0) & 0xff) - ((src1.s0 >>  0) & 0xff)) << 16) +
                  (abs(((src0.s0 >>  8) & 0xff) - ((src1.s0 >>  8) & 0xff)) << 16) +
                  (abs(((src0.s0 >> 16) & 0xff) - ((src1.s0 >> 16) & 0xff)) << 16) +
                  (abs(((src0.s0 >> 24) & 0xff) - ((src1.s0 >> 24) & 0xff)) << 16);
         similar operation applied to other components of the vectors
       
       Built-in Function
         uint  amd_sad4(uint4 src0, uint4 src1, uint src2)
       Description
         dst   = src2   +
                  abs(((src0.s0 >>  0) & 0xff) - ((src1.s0 >>  0) & 0xff)) +
                  abs(((src0.s0 >>  8) & 0xff) - ((src1.s0 >>  8) & 0xff)) +
                  abs(((src0.s0 >> 16) & 0xff) - ((src1.s0 >> 16) & 0xff)) +
                  abs(((src0.s0 >> 24) & 0xff) - ((src1.s0 >> 24) & 0xff)) +
                  abs(((src0.s1 >>  0) & 0xff) - ((src1.s1 >>  0) & 0xff)) +
                  abs(((src0.s1 >>  8) & 0xff) - ((src1.s1 >>  8) & 0xff)) +
                  abs(((src0.s1 >> 16) & 0xff) - ((src1.s1 >> 16) & 0xff)) +
                  abs(((src0.s1 >> 24) & 0xff) - ((src1.s1 >> 24) & 0xff)) +
                  abs(((src0.s2 >>  0) & 0xff) - ((src1.s2 >>  0) & 0xff)) +
                  abs(((src0.s2 >>  8) & 0xff) - ((src1.s2 >>  8) & 0xff)) +
                  abs(((src0.s2 >> 16) & 0xff) - ((src1.s2 >> 16) & 0xff)) +
                  abs(((src0.s2 >> 24) & 0xff) - ((src1.s2 >> 24) & 0xff)) +
                  abs(((src0.s3 >>  0) & 0xff) - ((src1.s3 >>  0) & 0xff)) +
                  abs(((src0.s3 >>  8) & 0xff) - ((src1.s3 >>  8) & 0xff)) +
                  abs(((src0.s3 >> 16) & 0xff) - ((src1.s3 >> 16) & 0xff)) +
                  abs(((src0.s3 >> 24) & 0xff) - ((src1.s3 >> 24) & 0xff));
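
To make the descriptions above concrete, here is a scalar (n = 1) emulation of several of these built-ins in Java. It is a sketch for checking expected values on the host, not AMD's implementation. The class and method names are hypothetical, and unsigned 32-bit values are carried in Java int.

```java
// Hypothetical host-side emulation of a few cl_amd_media_ops built-ins for n = 1.
final class AmdMediaOpsSketch {
    private static final long UINT_MASK = 0xFFFFFFFFL;

    /** amd_pack: low byte of each component, s0 in bits 0..7 up to s3 in bits 24..31. */
    static int pack(float s0, float s1, float s2, float s3) {
        // The bytes never overlap, so OR gives the same result as the "+" in the description.
        return (((int) s0) & 0xff)
             | ((((int) s1) & 0xff) << 8)
             | ((((int) s2) & 0xff) << 16)
             | ((((int) s3) & 0xff) << 24);
    }

    /** amd_unpack0..3: byteIndex (0 to 3) selects the byte of src that is converted to float. */
    static float unpack(int src, int byteIndex) {
        return (float) ((src >>> (8 * byteIndex)) & 0xff);
    }

    /** amd_bitalign: low 32 bits of the 64-bit value src0:src1 shifted right by (src2 & 31) bits. */
    static int bitalign(int src0, int src1, int src2) {
        long wide = ((src0 & UINT_MASK) << 32) | (src1 & UINT_MASK);
        return (int) (wide >>> (src2 & 31));
    }

    /** amd_bytealign: like bitalign, but the shift is (src2 & 3) whole bytes. */
    static int bytealign(int src0, int src1, int src2) {
        long wide = ((src0 & UINT_MASK) << 32) | (src1 & UINT_MASK);
        return (int) (wide >>> ((src2 & 3) * 8));
    }

    /** amd_lerp: per-byte average of src0 and src1, rounded up where the low bit of the src2 byte is set. */
    static int lerp(int src0, int src1, int src2) {
        int dst = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            int a = (src0 >>> shift) & 0xff;
            int b = (src1 >>> shift) & 0xff;
            int round = (src2 >>> shift) & 1;
            dst += ((a + b + round) >> 1) << shift;
        }
        return dst;
    }

    /** Sum of absolute differences over the four bytes of a and b. */
    private static int byteDifferences(int a, int b) {
        int sum = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            sum += Math.abs(((a >>> shift) & 0xff) - ((b >>> shift) & 0xff));
        }
        return sum;
    }

    /** amd_sad: src2 plus the byte-wise sum of absolute differences. */
    static int sad(int src0, int src1, int src2) {
        return src2 + byteDifferences(src0, src1);
    }

    /** amd_sadhi: like amd_sad, but the sum is shifted into the upper 16 bits before adding. */
    static int sadhi(int src0, int src1, int src2) {
        return src2 + (byteDifferences(src0, src1) << 16);
    }
}
```

For instance, pack(1, 2, 3, 4) gives 0x04030201, and unpack(0x04030201, 2) gives 3.0f. The vector forms apply the same operation to each component.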
    • cl_amd_media_ops2

      public final boolean cl_amd_media_ops2
      When true, the amd_media_ops2 extension is supported.

      When enabled, the directive adds the following built-in functions to the OpenCL language.

      
       Note: typen denotes an OpenCL scalar type { n = 1 } or vector type { n = 2, 4, 8, 16 }.
       
       Built-in Function
         uintn  amd_msad (uintn src0, uintn src1, uintn src2)
       Description
         uchar4 src0u8 = as_uchar4(src0.s0);
         uchar4 src1u8 = as_uchar4(src1.s0);
         dst.s0 = src2.s0 +
                  ((src1u8.s0 == 0) ? 0 : abs(src0u8.s0 - src1u8.s0)) +
                  ((src1u8.s1 == 0) ? 0 : abs(src0u8.s1 - src1u8.s1)) +
                  ((src1u8.s2 == 0) ? 0 : abs(src0u8.s2 - src1u8.s2)) +
                  ((src1u8.s3 == 0) ? 0 : abs(src0u8.s3 - src1u8.s3));
         similar operation applied to other components of the vectors
       
       Built-in Function
         ulongn amd_qsad (ulongn src0, uintn src1, ulongn src2)
       Description
         uchar8 src0u8 = as_uchar8(src0.s0);
         ushort4 src2u16 = as_ushort4(src2.s0);
         ushort4 dstu16;
         dstu16.s0 = amd_sad(as_uint(src0u8.s0123), src1.s0, src2u16.s0);
         dstu16.s1 = amd_sad(as_uint(src0u8.s1234), src1.s0, src2u16.s1);
         dstu16.s2 = amd_sad(as_uint(src0u8.s2345), src1.s0, src2u16.s2);
         dstu16.s3 = amd_sad(as_uint(src0u8.s3456), src1.s0, src2u16.s3);
         dst.s0 = as_uint2(dstu16);
         similar operation applied to other components of the vectors
       
       Built-in Function
         ulongn amd_mqsad (ulongn src0, uintn src1, ulongn src2)
       Description
         uchar8 src0u8 = as_uchar8(src0.s0);
         ushort4 src2u16 = as_ushort4(src2.s0);
         ushort4 dstu16;
         dstu16.s0 = amd_msad(as_uint(src0u8.s0123), src1.s0, src2u16.s0);
         dstu16.s1 = amd_msad(as_uint(src0u8.s1234), src1.s0, src2u16.s1);
         dstu16.s2 = amd_msad(as_uint(src0u8.s2345), src1.s0, src2u16.s2);
         dstu16.s3 = amd_msad(as_uint(src0u8.s3456), src1.s0, src2u16.s3);
         dst.s0 = as_uint2(dstu16);
         similar operation applied to other components of the vectors
       
       Built-in Function
         uintn  amd_sadw (uintn src0, uintn src1, uintn src2)
       Description
         ushort2 src0u16 = as_ushort2(src0.s0);
         ushort2 src1u16 = as_ushort2(src1.s0);
         dst.s0 = src2.s0 +
                  abs(src0u16.s0 - src1u16.s0) +
                  abs(src0u16.s1 - src1u16.s1);
         similar operation applied to other components of the vectors
       
       Built-in Function
         uintn  amd_sadd (uintn src0, uintn src1, uintn src2)
       Description
         dst.s0 = src2.s0 +  abs(src0.s0 - src1.s0);
         similar operation applied to other components of the vectors
       
       Built-in Function
         uintn amd_bfm (uintn src0, uintn src1)
       Description
         dst.s0 = ((1 << (src0.s0 & 0x1f)) - 1) << (src1.s0 & 0x1f);
         similar operation applied to other components of the vectors
       
       Built-in Function
         uintn amd_bfe (uintn src0, uintn src1, uintn src2)
       Description
         NOTE: operator >> below represents a logical right shift
         offset = src1.s0 & 31;
         width = src2.s0 & 31;
         if (width == 0)
             dst.s0 = 0;
         else if ((offset + width) < 32)
             dst.s0 = (src0.s0 << (32 - offset - width)) >> (32 - width);
         else
             dst.s0 = src0.s0 >> offset;
         similar operation applied to other components of the vectors
       
       Built-in Function
          intn amd_bfe (intn src0, uintn src1, uintn src2)
       Description
         NOTE: operator >> below represents an arithmetic right shift
         offset = src1.s0 & 31;
         width = src2.s0 & 31;
         if (width == 0)
             dst.s0 = 0;
         else if ((offset + width) < 32)
             dst.s0 = (src0.s0 << (32 - offset - width)) >> (32 - width);
         else
             dst.s0 = src0.s0 >> offset;
         similar operation applied to other components of the vectors
       
       Built-in Function
          intn amd_median3 (intn src0, intn src1, intn src2)
          uintn amd_median3 (uintn src0, uintn src1, uintn src2)
          floatn amd_median3 (floatn src0, floatn src1, floatn src2)
       Description
          returns median of src0, src1, and src2
       
       Built-in Function
          intn amd_min3 (intn src0, intn src1, intn src2)
          uintn amd_min3 (uintn src0, uintn src1, uintn src2)
          floatn amd_min3 (floatn src0, floatn src1, floatn src2)
       Description
          returns min of src0, src1, and src2
       
       Built-in Function
          intn amd_max3 (intn src0, intn src1, intn src2)
          uintn amd_max3 (uintn src0, uintn src1, uintn src2)
          floatn amd_max3 (floatn src0, floatn src1, floatn src2)
       Description
          returns max of src0, src1, and src2
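
The bit-field and masked-SAD descriptions above can be checked on the host with a scalar (n = 1) Java emulation. This is a sketch rather than AMD's implementation. The class and method names are hypothetical, unsigned values are carried in Java int, and median3 is shown for the signed int overload only.

```java
// Hypothetical host-side emulation of a few cl_amd_media_ops2 built-ins for n = 1.
final class AmdMediaOps2Sketch {
    /** amd_msad: like amd_sad, but bytes where src1 is zero contribute nothing. */
    static int msad(int src0, int src1, int src2) {
        int sum = src2;
        for (int shift = 0; shift < 32; shift += 8) {
            int a = (src0 >>> shift) & 0xff;
            int b = (src1 >>> shift) & 0xff;
            if (b != 0) {
                sum += Math.abs(a - b);
            }
        }
        return sum;
    }

    /** amd_sadw: sum of absolute differences over the two 16-bit halves, plus src2. */
    static int sadw(int src0, int src1, int src2) {
        int low = Math.abs((src0 & 0xffff) - (src1 & 0xffff));
        int high = Math.abs((src0 >>> 16) - (src1 >>> 16));
        return src2 + low + high;
    }

    /** amd_bfm: a mask of (src0 & 31) one-bits, shifted left by (src1 & 31). */
    static int bfm(int src0, int src1) {
        return ((1 << (src0 & 0x1f)) - 1) << (src1 & 0x1f);
    }

    /** amd_bfe on uintn: extracts width bits starting at offset, zero-extended. */
    static int bfeUnsigned(int src0, int src1, int src2) {
        int offset = src1 & 31;
        int width = src2 & 31;
        if (width == 0) {
            return 0;
        }
        if (offset + width < 32) {
            return (src0 << (32 - offset - width)) >>> (32 - width); // logical shift
        }
        return src0 >>> offset;
    }

    /** amd_bfe on intn: extracts width bits starting at offset, sign-extended. */
    static int bfeSigned(int src0, int src1, int src2) {
        int offset = src1 & 31;
        int width = src2 & 31;
        if (width == 0) {
            return 0;
        }
        if (offset + width < 32) {
            return (src0 << (32 - offset - width)) >> (32 - width); // arithmetic shift
        }
        return src0 >> offset;
    }

    /** amd_median3 for signed int: the middle value of the three arguments. */
    static int median3(int a, int b, int c) {
        return Math.max(Math.min(a, b), Math.min(Math.max(a, b), c));
    }
}
```

Extracting the 4-bit field at offset 12 of 0x0000F000 gives 0xF with bfeUnsigned and -1 with bfeSigned, which shows the effect of the two kinds of right shift.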
    • cl_amd_offline_devices

      public final boolean cl_amd_offline_devices
      When true, AMDOfflineDevices is supported.
    • cl_amd_popcnt

      public final boolean cl_amd_popcnt
      When true, the amd_popcnt extension is supported.

      This extension introduces a “population count” function called popcnt. The extension was adopted into core OpenCL 1.2, where the function was renamed popcount. The core 1.2 popcount function is identical to the AMD extension popcnt function.
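
A population count is the number of bits set to 1 in a value. As a host-side illustration only, the sketch below computes it for a 32-bit value with a plain loop. The class and method names are hypothetical, and Java's Integer.bitCount returns the same result.

```java
final class PopcountSketch {
    /** Counts the bits set to 1 in a 32-bit value, one bit at a time. */
    static int popcnt(int value) {
        int count = 0;
        for (int bits = value; bits != 0; bits >>>= 1) {
            count += bits & 1;
        }
        return count;
    }
}
```

For example, popcnt(0xF0F0) is 8, the same value that Integer.bitCount(0xF0F0) returns.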

    • cl_amd_predefined_macros

      public final boolean cl_amd_predefined_macros
      When true, the amd_predefined_macros extension is supported.

      The following macros are predefined when compiling OpenCL™ C kernels. These macros are defined automatically based on the device for which the code is being compiled.

      GPU devices
      • __Barts__
      • __BeaverCreek__
      • __Bheem__
      • __Bonaire__
      • __Caicos__
      • __Capeverde__
      • __Carrizo__
      • __Cayman__
      • __Cedar__
      • __Cypress__
      • __Devastator__
      • __Hainan__
      • __Iceland__
      • __Juniper__
      • __Kalindi__
      • __Kauai__
      • __Lombok__
      • __Loveland__
      • __Mullins__
      • __Oland__
      • __Pitcairn__
      • __RV710__
      • __RV730__
      • __RV740__
      • __RV770__
      • __RV790__
      • __Redwood__
      • __Scrapper__
      • __Spectre__
      • __Spooky__
      • __Tahiti__
      • __Tonga__
      • __Turks__
      • __WinterPark__
      • __GPU__
      CPU devices
      • __CPU__
      • __X86__
      • __X86_64__

      Note that __GPU__ or __CPU__ are predefined whenever a GPU or CPU device is the compilation target.

    • cl_amd_printf

      public final boolean cl_amd_printf
      When true, the amd_printf extension is supported.

      This extension adds the built-in function printf(__constant char * restrict format, …);

      This function writes output to the stdout stream associated with the host application. The format string is a null-terminated character sequence composed of zero or more directives:

      • ordinary characters (i.e. not %), which are copied directly to the output stream unchanged, and
      • conversion specifications, each of which can result in fetching zero or more arguments, converting them, and then writing the final result to the output stream.

      The format string must be resolvable at compile time; thus, it cannot be dynamically created by the executing program. (Note that the use of variadic arguments in the built-in printf does not imply its use in other builtins; more importantly, it is not valid to use printf in user-defined functions or kernels.)

      The OpenCL C printf closely matches the definition found as part of the C99 standard. Note that conversions introduced in the format string with % are supported with the following guidelines:

      • A 32-bit floating point argument is not converted to a 64-bit double, unless the extension cl_khr_fp64 is supported and enabled. This includes the double variants if cl_khr_fp64 is supported and defined in the corresponding compilation unit.
      • 64-bit integer types can be printed using %ld / %lx / %lu.
      • %lld / %llx / %llu are not supported and reserved for 128-bit integer types (long long).
      • All OpenCL vector types can be explicitly passed and printed using the modifier vn, where n can be 2, 3, 4, 8, or 16. This modifier appears before the original conversion specifier for the vector’s component type (for example, to print a float4 %v4f). Since vn is a conversion specifier, it is valid to apply optional flags, such as field width and precision, just as it is when printing the component types. Since a vector is an aggregate type, the comma separator is used between the components: 0:1, … , n-2:n-1.
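
      To make the vn modifier concrete, here is a rough Java sketch that emulates how a vector could be rendered: each component is formatted with the component conversion and the results are joined with commas. It is only an illustration of the output shape described above. The helper name formatVector is an assumption, and real device output may differ in details.

```java
import java.util.Locale;
import java.util.StringJoiner;

final class VectorPrintfSketch {
    // Formats every component with the same conversion and joins them with
    // commas, mimicking the shape of a "%v4f"-style conversion.
    // Hypothetical helper; not part of OpenCL or LWJGL.
    static String formatVector(String componentSpec, float... components) {
        StringJoiner joined = new StringJoiner(",");
        for (float c : components) {
            joined.add(String.format(Locale.ROOT, componentSpec, c));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // Emulates printing a float4 with one digit of precision.
        System.out.println(formatVector("%.1f", 1f, 2.5f, -3f, 4f));
    }
}
```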
    • cl_amd_vec3

      public final boolean cl_amd_vec3
      When true, the amd_vec3 extension is supported.

      This extension adds support for vectors with three elements: float3, short3, char3, etc. This data type was added to OpenCL 1.1 as a core feature.

    • cl_APPLE_biased_fixed_point_image_formats

      public final boolean cl_APPLE_biased_fixed_point_image_formats
      When true, APPLEBiasedFixedPointImageFormats is supported.
    • cl_APPLE_command_queue_priority

      public final boolean cl_APPLE_command_queue_priority
      When true, APPLECommandQueuePriority is supported.
    • cl_APPLE_command_queue_select_compute_units

      public final boolean cl_APPLE_command_queue_select_compute_units
      When true, APPLECommandQueueSelectComputeUnits is supported.
    • cl_APPLE_ContextLoggingFunctions

      public final boolean cl_APPLE_ContextLoggingFunctions
      When true, APPLEContextLoggingFunctions is supported.
    • cl_APPLE_fixed_alpha_channel_orders

      public final boolean cl_APPLE_fixed_alpha_channel_orders
      When true, APPLEFixedAlphaChannelOrders is supported.
    • cl_APPLE_fp64_basic_ops

      public final boolean cl_APPLE_fp64_basic_ops
      When true, APPLE_fp64_basic_ops is supported.
    • cl_APPLE_gl_sharing

      public final boolean cl_APPLE_gl_sharing
      When true, APPLEGLSharing is supported.
    • cl_APPLE_query_kernel_names

      public final boolean cl_APPLE_query_kernel_names
      When true, APPLEQueryKernelNames is supported.
    • cl_arm_controlled_kernel_termination

      public final boolean cl_arm_controlled_kernel_termination
      When true, ARMControlledKernelTermination is supported.
    • cl_arm_core_id

      public final boolean cl_arm_core_id
      When true, ARMCoreID is supported.
    • cl_arm_import_memory

      public final boolean cl_arm_import_memory
      When true, ARMImportMemory is supported.
    • cl_arm_integer_dot_product_accumulate_int16

      public final boolean cl_arm_integer_dot_product_accumulate_int16
      When true, the cl_arm_integer_dot_product_accumulate_int16 extension is supported.
    • cl_arm_integer_dot_product_accumulate_int8

      public final boolean cl_arm_integer_dot_product_accumulate_int8
      When true, the cl_arm_integer_dot_product_accumulate_int8 extension is supported.
    • cl_arm_integer_dot_product_accumulate_saturate_int8

      public final boolean cl_arm_integer_dot_product_accumulate_saturate_int8
      When true, the cl_arm_integer_dot_product_accumulate_saturate_int8 extension is supported.
    • cl_arm_integer_dot_product_int8

      public final boolean cl_arm_integer_dot_product_int8
      When true, the cl_arm_integer_dot_product_int8 extension is supported.
    • cl_arm_job_slot_selection

      public final boolean cl_arm_job_slot_selection
      When true, ARMJobSlotSelection is supported.
    • cl_arm_non_uniform_work_group_size

      public final boolean cl_arm_non_uniform_work_group_size
      When true, the arm_non_uniform_work_group_size extension is supported.

      This extension provides a way to enqueue kernels with local work-group sizes that are not integer factors of the global work-group size in OpenCL C 1.x languages.

      Such work-groups are referred to in the OpenCL 2.0 specification as non-uniform work-groups.

      To enable this extension, the option -cl-arm-non-uniform-work-group-size must be provided in the options string when building a program from source using BuildProgram. Kernels created from such a program can be enqueued via EnqueueNDRangeKernel with a non-uniform local work-group size.

      This feature is enabled by default in OpenCL C 2.0. See section 5.10 of the OpenCL 2.0 API specification. This section also details how kernels that are enqueued with non-uniform work-group sizes are divided into work groups.

      The built-in function get_local_size() for kernels that have been built with this extension will take on the OpenCL 2.0 behaviour. See section 6.13.1 of the OpenCL 2.0 C specification for details.
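
      The division into work-groups can be sketched for a single dimension as follows. This is an illustrative Java model, not LWJGL code: it assumes the OpenCL 2.0 rule that full-size work-groups are followed by one smaller remainder work-group when the local size does not divide the global size. The class and method names are hypothetical.

```java
import java.util.Arrays;

final class NonUniformWorkGroups {
    // Returns the size of each work-group along one dimension.
    // Hypothetical model of the OpenCL 2.0 partitioning for illustration only.
    static int[] groupSizes(int globalSize, int localSize) {
        if (globalSize <= 0 || localSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        int fullGroups = globalSize / localSize;
        int remainder = globalSize % localSize;
        int[] sizes = new int[fullGroups + (remainder != 0 ? 1 : 0)];
        Arrays.fill(sizes, localSize);
        if (remainder != 0) {
            sizes[sizes.length - 1] = remainder; // the non-uniform group
        }
        return sizes;
    }

    public static void main(String[] args) {
        // A global size of 10 with a local size of 4 yields groups of 4, 4 and 2.
        System.out.println(Arrays.toString(groupSizes(10, 4)));
    }
}
```

      In the last group of this example, a kernel's get_local_size() would report the remainder size rather than the enqueued local size.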

    • cl_arm_printf

      public final boolean cl_arm_printf
      When true, ARMPrintf is supported.
    • cl_arm_protected_memory_allocation

      public final boolean cl_arm_protected_memory_allocation
      When true, ARMProtectedMemoryAllocation is supported.
    • cl_arm_scheduling_controls

      public final boolean cl_arm_scheduling_controls
      When true, ARMSchedulingControls is supported.
    • cl_arm_thread_limit_hint

      public final boolean cl_arm_thread_limit_hint
      When true, the arm_thread_limit_hint extension is supported.

      This extension enables an application to provide a hint for the maximum number of threads allowed to run concurrently on a compute unit. This results in a limit in the threads used by a kernel instance on devices that support it, lowering pressure on caches.

    • cl_cl_arm_import_memory_android_hardware_buffer

      public final boolean cl_cl_arm_import_memory_android_hardware_buffer
      When true, the cl_arm_import_memory_android_hardware_buffer extension is supported.
    • cl_cl_arm_import_memory_dma_buf

      public final boolean cl_cl_arm_import_memory_dma_buf
      When true, the cl_arm_import_memory_dma_buf extension is supported.
    • cl_cl_arm_import_memory_host

      public final boolean cl_cl_arm_import_memory_host
      When true, the cl_arm_import_memory_host extension is supported.
    • cl_cl_arm_import_memory_protected

      public final boolean cl_cl_arm_import_memory_protected
      When true, the cl_arm_import_memory_protected extension is supported.
    • cl_ext_atomic_counters_32

      public final boolean cl_ext_atomic_counters_32
      When true, EXTAtomicCounters32 is supported.
    • cl_ext_atomic_counters_64

      public final boolean cl_ext_atomic_counters_64
      When true, EXTAtomicCounters64 is supported.
    • cl_ext_cxx_for_opencl

      public final boolean cl_ext_cxx_for_opencl
      When true, EXTCXXForOpencl is supported.
    • cl_ext_device_fission

      public final boolean cl_ext_device_fission
      When true, EXTDeviceFission is supported.
    • cl_ext_float_atomics

      public final boolean cl_ext_float_atomics
      When true, EXTFloatAtomics is supported.
    • cl_ext_image_from_buffer

      public final boolean cl_ext_image_from_buffer
      When true, EXTImageFromBuffer is supported.
    • cl_ext_image_requirements_info

      public final boolean cl_ext_image_requirements_info
      When true, EXTImageRequirementsInfo is supported.
    • cl_ext_migrate_memobject

      public final boolean cl_ext_migrate_memobject
      When true, EXTMigrateMemobject is supported.
    • cl_img_cached_allocations

      public final boolean cl_img_cached_allocations
      When true, IMGCachedAllocations is supported.
    • cl_img_generate_mipmap

      public final boolean cl_img_generate_mipmap
      When true, IMGGenerateMipmap is supported.
    • cl_img_mem_properties

      public final boolean cl_img_mem_properties
      When true, IMGMemProperties is supported.
    • cl_img_yuv_image

      public final boolean cl_img_yuv_image
      When true, IMGYUVImage is supported.
    • cl_intel_accelerator

      public final boolean cl_intel_accelerator
      When true, INTELAccelerator is supported.
    • cl_intel_advanced_motion_estimation

      public final boolean cl_intel_advanced_motion_estimation
      When true, INTELAdvancedMotionEstimation is supported.
    • cl_intel_bfloat16_conversions

      public final boolean cl_intel_bfloat16_conversions
      This extension adds built-in functions to convert between single-precision 32-bit floating-point values and 16-bit bfloat16 values. The 16-bit bfloat16 format has a dynamic range similar to that of the 32-bit float format, albeit with lower precision than the 16-bit half format.

      Please note that this extension currently does not introduce a bfloat16 type to OpenCL C and instead the built-in functions convert to or from a ushort 16-bit unsigned integer type with a bit pattern that represents a bfloat16 value.
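
      The bit-pattern relationship can be illustrated on the host with a small Java sketch. A bfloat16 value is the upper 16 bits of a 32-bit float, stored here in a char (Java's unsigned 16-bit type). The sketch assumes round-to-nearest-even when narrowing. The class and method names are hypothetical, and the device built-ins may treat NaN payloads differently.

```java
final class Bfloat16Sketch {
    // Narrows a float to a bfloat16 bit pattern, rounding to nearest even.
    // Hypothetical helper for illustration; not an OpenCL or LWJGL API.
    static char floatToBfloat16(float f) {
        int bits = Float.floatToRawIntBits(f);
        if (Float.isNaN(f)) {
            // Keep the result a quiet NaN instead of rounding the payload.
            return (char) ((bits >>> 16) | 0x0040);
        }
        int lsb = (bits >>> 16) & 1;   // lowest bit that survives truncation
        bits += 0x7FFF + lsb;          // round half to even
        return (char) (bits >>> 16);
    }

    // Widens a bfloat16 bit pattern back to a float; this step is exact.
    static float bfloat16ToFloat(char b) {
        return Float.intBitsToFloat(b << 16);
    }

    public static void main(String[] args) {
        char packed = floatToBfloat16(3.0f);
        System.out.println(Integer.toHexString(packed));
        System.out.println(bfloat16ToFloat(packed));
    }
}
```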

    • cl_intel_command_queue_families

      public final boolean cl_intel_command_queue_families
      When true, INTELCommandQueueFamilies is supported.
    • cl_intel_create_buffer_with_properties

      public final boolean cl_intel_create_buffer_with_properties
      When true, INTELCreateBufferWithProperties is supported.
    • cl_intel_device_attribute_query

      public final boolean cl_intel_device_attribute_query
      When true, INTELDeviceAttributeQuery is supported.
    • cl_intel_device_partition_by_names

      public final boolean cl_intel_device_partition_by_names
      When true, INTELDevicePartitionByNames is supported.
    • cl_intel_device_side_avc_motion_estimation

      public final boolean cl_intel_device_side_avc_motion_estimation
      When true, INTELDeviceSideAVCMotionEstimation is supported.
    • cl_intel_driver_diagnostics

      public final boolean cl_intel_driver_diagnostics
      When true, INTELDriverDiagnostics is supported.
    • cl_intel_egl_image_yuv

      public final boolean cl_intel_egl_image_yuv
      When true, INTELEGLImageYUV is supported.
    • cl_intel_exec_by_local_thread

      public final boolean cl_intel_exec_by_local_thread
      When true, INTELExecByLocalThread is supported.
    • cl_intel_media_block_io

      public final boolean cl_intel_media_block_io
      This extension augments the block read/write functionality available in the Intel vendor extensions intel_subgroups and intel_subgroups_short by the specification of additional built-in functions to facilitate the reading and writing of flexible 2D regions from images. This API allows for the explicit specification of the width and height of the image regions.

      While not required, this extension is most useful when the subgroup size is known at compile-time. The primary use case for this extension is to support the reading of the edge texels (or image elements) of neighboring macro-blocks as described in the Intel vendor extension intel_device_side_avc_motion_estimation. When using the built-in functions from cl_intel_device_side_avc_motion_estimation the subgroup size is implicitly fixed to 16. In other use cases the subgroup size may be fixed using the intel_required_subgroup_size extension, if needed.

    • cl_intel_mem_alloc_buffer_location

      public final boolean cl_intel_mem_alloc_buffer_location
      When true, INTELMemAllocBufferLocation is supported.
    • cl_intel_mem_channel_property

      public final boolean cl_intel_mem_channel_property
      When true, INTELMemChannelProperty is supported.
    • cl_intel_mem_force_host_memory

      public final boolean cl_intel_mem_force_host_memory
      When true, INTELMemForceHostMemory is supported.
    • cl_intel_motion_estimation

      public final boolean cl_intel_motion_estimation
      When true, INTELMotionEstimation is supported.
    • cl_intel_packed_yuv

      public final boolean cl_intel_packed_yuv
      When true, INTELPackedYUV is supported.
    • cl_intel_planar_yuv

      public final boolean cl_intel_planar_yuv
      When true, INTELPlanarYUV is supported.
    • cl_intel_printf

      public final boolean cl_intel_printf
      When true, intel_printf is supported.
    • cl_intel_required_subgroup_size

      public final boolean cl_intel_required_subgroup_size
      When true, INTELRequiredSubgroupSize is supported.
    • cl_intel_sharing_format_query

      public final boolean cl_intel_sharing_format_query
      When true, INTELSharingFormatQuery is supported.
    • cl_intel_simultaneous_sharing

      public final boolean cl_intel_simultaneous_sharing
      When true, INTELSimultaneousSharing is supported.
    • cl_intel_spirv_device_side_avc_motion_estimation

      public final boolean cl_intel_spirv_device_side_avc_motion_estimation
      This extension defines how modules using the SPIR-V extension SPV_INTEL_device_side_avc_motion_estimation may behave in an OpenCL environment.

      Requires OpenCL 2.1 and intel_device_side_avc_motion_estimation.

    • cl_intel_spirv_media_block_io

      public final boolean cl_intel_spirv_media_block_io
      This extension defines how modules using the SPIR-V extension SPV_INTEL_media_block_io may behave in an OpenCL environment.

      Requires OpenCL 2.1 and intel_media_block_io.

    • cl_intel_spirv_subgroups

      public final boolean cl_intel_spirv_subgroups
      This extension defines how modules using the SPIR-V extension SPV_INTEL_subgroups may behave in an OpenCL environment.

      Requires OpenCL 2.1 and intel_subgroups.

    • cl_intel_split_work_group_barrier

      public final boolean cl_intel_split_work_group_barrier
      This extension adds built-in functions to split a barrier or work_group_barrier function in OpenCL C into two separate operations: the first indicates that a work-item has "arrived" at a barrier but should continue executing, and the second indicates that a work-item should "wait" for all of the work-items to arrive at the barrier before executing further.

      Splitting a barrier operation may improve performance and may provide a closer match to "latch" or "barrier" operations in other parallel languages such as C++20.

    • cl_intel_subgroup_matrix_multiply_accumulate

      public final boolean cl_intel_subgroup_matrix_multiply_accumulate
      The goal of this extension is to allow programmers to access specialized hardware to compute the product of an M x K matrix with a K x N matrix and then add an M x N matrix accumulation value. This is a commonly used building block to compute the product of two large matrices. When used in an OpenCL kernel, all work items in the subgroup cooperate to perform this operation.

      This is a low-level extension for expert programmers seeking to access this functionality directly in custom kernels. Most users will access this functionality via high-level libraries or frameworks.

      Requires support for subgroups.
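
      For reference, the arithmetic being accelerated is D = A x B + C. The scalar Java sketch below only spells out that arithmetic on the host. It says nothing about how the work is distributed across a subgroup or about the element types the hardware accepts, and the class and method names are hypothetical.

```java
final class MatrixMultiplyAccumulate {
    // Computes D = A x B + C for an M x K matrix A, a K x N matrix B and an
    // M x N accumulator C. Hypothetical reference routine for illustration.
    static float[][] mma(float[][] a, float[][] b, float[][] c) {
        int m = a.length;
        int k = b.length;
        int n = b[0].length;
        float[][] d = new float[m][n];
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                float sum = c[i][j]; // start from the accumulation value
                for (int p = 0; p < k; p++) {
                    sum += a[i][p] * b[p][j];
                }
                d[i][j] = sum;
            }
        }
        return d;
    }

    public static void main(String[] args) {
        float[][] a = {{1, 2}, {3, 4}};
        float[][] b = {{5, 6}, {7, 8}};
        float[][] c = {{1, 1}, {1, 1}};
        System.out.println(java.util.Arrays.deepToString(mma(a, b, c)));
    }
}
```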

    • cl_intel_subgroup_split_matrix_multiply_accumulate

      public final boolean cl_intel_subgroup_split_matrix_multiply_accumulate
      The goal of this extension is to allow programmers to access specialized hardware to compute the product of an M x K matrix with a K x N matrix and then add an M x N matrix accumulation value. This is a commonly used building block to compute the product of two large matrices.

      The functionality described in this extension is very similar to the functionality described in the cl_intel_subgroup_matrix_multiply_accumulate extension, with one key difference: in this extension, work items across two subgroups cooperate to perform the operation. This is done by splitting the M x K matrix source across two participating subgroups: The first M-divided-by-2 rows of the matrix source are provided by the first subgroup, and the remaining M-divided-by-2 rows of the matrix source are provided by the second subgroup.

      Splitting the matrix source improves performance by halving the amount of data each subgroup must load for the first matrix source.

      Requires support for subgroups.

    • cl_intel_subgroups

      public final boolean cl_intel_subgroups
      When true, INTELSubgroups is supported.
    • cl_intel_subgroups_char

      public final boolean cl_intel_subgroups_char
      The goal of this extension is to allow programmers to improve the performance of applications operating on 8-bit data types by extending the subgroup functions described in the intel_subgroups extension to support 8-bit integer data types (chars and uchars). Specifically, the extension:
      • Extends the subgroup broadcast function to allow 8-bit integer values to be broadcast from one work item to all other work items in the subgroup.
      • Extends the subgroup scan and reduction functions to operate on 8-bit integer data types.
      • Extends the Intel subgroup shuffle functions to allow arbitrarily exchanging 8-bit integer values among work items in the subgroup.
      • Extends the Intel subgroup block read and write functions to allow reading and writing 8-bit integer data from images and buffers.

      Requires OpenCL 1.2 and intel_subgroups.

    • cl_intel_subgroups_long

      public final boolean cl_intel_subgroups_long
      The goal of this extension is to allow programmers to improve the performance of applications operating on 64-bit data types by extending the subgroup functions described in the intel_subgroups extension to support 64-bit integer data types (longs and ulongs). Specifically, the extension:
      • Extends the Intel subgroup block read and write functions to allow reading and writing 64-bit integer data from images and buffers.

      Note that cl_intel_subgroups and cl_khr_subgroups already support broadcasts, scans, and reductions for 64-bit integer types, and that cl_intel_subgroups already supports shuffles for 64-bit integer types.

      Requires OpenCL 1.2 and intel_subgroups.

    • cl_intel_subgroups_short

      public final boolean cl_intel_subgroups_short
      The goal of this extension is to allow programmers to improve the performance of applications operating on 16-bit data types by extending the subgroup functions described in the intel_subgroups extension to support 16-bit integer data types (shorts and ushorts). Specifically, the extension:
      • Extends the subgroup broadcast function to allow 16-bit integer values to be broadcast from one work item to all other work items in the subgroup.
      • Extends the subgroup scan and reduction functions to operate on 16-bit integer data types.
      • Extends the Intel subgroup shuffle functions to allow arbitrarily exchanging 16-bit integer values among work items in the subgroup.
      • Extends the Intel subgroup block read and write functions to allow reading and writing 16-bit integer data from images and buffers.

      Requires OpenCL 1.2 and intel_subgroups.

    • cl_intel_unified_shared_memory

      public final boolean cl_intel_unified_shared_memory
      When true, INTELUnifiedSharedMemory is supported.
    • cl_intel_va_api_media_sharing

      public final boolean cl_intel_va_api_media_sharing
      When true, INTELVAAPIMediaSharing is supported.
    • cl_khr_3d_image_writes

      public final boolean cl_khr_3d_image_writes
      When true, the khr_3d_image_writes extension is supported.

      This extension adds support for kernel writes to 3D images.

    • cl_khr_async_work_group_copy_fence

      public final boolean cl_khr_async_work_group_copy_fence
      When true, the khr_async_work_group_copy_fence extension is supported.

      The extension adds a new built-in function to OpenCL C to establish a memory synchronization ordering of asynchronous copies.

    • cl_khr_byte_addressable_store

      public final boolean cl_khr_byte_addressable_store
      When true, the khr_byte_addressable_store extension is supported.

      This extension removes the restriction that disallows writes through a pointer (or to array elements) of types narrower than 32 bits in a kernel program.

    • cl_khr_command_buffer

      public final boolean cl_khr_command_buffer
      When true, KHRCommandBuffer is supported.
    • cl_khr_create_command_queue

      public final boolean cl_khr_create_command_queue
      When true, KHRCreateCommandQueue is supported.
    • cl_khr_depth_images

      public final boolean cl_khr_depth_images
      When true, KHRDepthImages is supported.
    • cl_khr_device_enqueue_local_arg_types

      public final boolean cl_khr_device_enqueue_local_arg_types
      When true, the khr_device_enqueue_local_arg_types extension is supported.

      This extension allows arguments to blocks passed to enqueue_kernel functions to be declared as a pointer to any type (built-in or user-defined) in local memory instead of just local void *.

    • cl_khr_device_uuid

      public final boolean cl_khr_device_uuid
      When true, KHRDeviceUUID is supported.
    • cl_khr_egl_event

      public final boolean cl_khr_egl_event
      When true, KHREGLEvent is supported.
    • cl_khr_egl_image

      public final boolean cl_khr_egl_image
      When true, KHREGLImage is supported.
    • cl_khr_expect_assume

      public final boolean cl_khr_expect_assume
      When true, the khr_expect_assume extension is supported.

      This extension adds mechanisms to provide information to the compiler that may improve the performance of some kernels. Specifically, this extension adds the ability to:

      • Tell the compiler the expected value of a variable.
      • Allow the compiler to assume a condition is true.

      These functions are not required for functional correctness.

      The initial version of this extension extends the OpenCL SPIR-V environment to support new instructions for offline compilation tool chains. Similar functionality may be provided by some OpenCL C online compilation tool chains, but formal support in OpenCL C is not required by the initial version of the extension.

    • cl_khr_extended_async_copies

      public final boolean cl_khr_extended_async_copies
      When true, the khr_extended_async_copies extension is supported.

      This extension augments the built-in asynchronous copy functions of OpenCL C to support more patterns:

      1. for async copy between 2D source and 2D destination.
      2. for async copy between 3D source and 3D destination.
    • cl_khr_extended_bit_ops

      public final boolean cl_khr_extended_bit_ops
      When true, the khr_extended_bit_ops extension is supported.

      This extension adds OpenCL C functions for performing extended bit operations. Specifically, the following functions are added:

      • bitfield insert: insert bits from one source operand into another source operand.
      • bitfield extract: extract bits from a source operand, with sign- or zero-extension.
      • bit reverse: reverse the bits of a source operand.
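
      The three operations can be modelled on 32-bit integers with the Java sketch below. It is a host-side illustration only. The method names are hypothetical, and the sketch assumes that offset + count does not exceed 32; how the OpenCL C built-ins behave outside that range is not modelled here.

```java
final class ExtendedBitOpsSketch {
    // Mask with the low `count` bits set; count may be 0..32.
    private static int mask(int count) {
        return count == 32 ? -1 : (1 << count) - 1;
    }

    // Replaces bits [offset, offset + count) of base with the low bits of insert.
    static int bitfieldInsert(int base, int insert, int offset, int count) {
        int m = mask(count) << offset;
        return (base & ~m) | ((insert << offset) & m);
    }

    // Extracts bits [offset, offset + count) and zero-extends them.
    static int bitfieldExtractUnsigned(int base, int offset, int count) {
        return (base >>> offset) & mask(count);
    }

    // Extracts bits [offset, offset + count) and sign-extends them.
    static int bitfieldExtractSigned(int base, int offset, int count) {
        if (count == 0) {
            return 0;
        }
        int shift = 32 - count;
        // Move the field to the top, then shift back down arithmetically.
        return (base << (shift - offset)) >> shift;
    }

    // Reverses the bit order of a 32-bit value.
    static int bitReverse(int value) {
        return Integer.reverse(value);
    }

    public static void main(String[] args) {
        int packed = bitfieldInsert(0xFFFF0000, 0xAB, 4, 8);
        System.out.println(Integer.toHexString(packed));
        System.out.println(Integer.toHexString(bitfieldExtractUnsigned(packed, 4, 8)));
    }
}
```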
    • cl_khr_extended_versioning

      public final boolean cl_khr_extended_versioning
      When true, KHRExtendedVersioning is supported.
    • cl_khr_external_memory

      public final boolean cl_khr_external_memory
      When true, KHRExternalMemory is supported.
    • cl_khr_external_memory_dma_buf

      public final boolean cl_khr_external_memory_dma_buf
      When true, the khr_external_memory_dma_buf extension is supported.
    • cl_khr_external_memory_opaque_fd

      public final boolean cl_khr_external_memory_opaque_fd
      When true, the khr_external_memory_opaque_fd extension is supported.
    • cl_khr_external_memory_win32

      public final boolean cl_khr_external_memory_win32
      When true, the khr_external_memory_win32 extension is supported.
    • cl_khr_external_semaphore

      public final boolean cl_khr_external_semaphore
      When true, KHRExternalSemaphore is supported.
    • cl_khr_fp16

      public final boolean cl_khr_fp16
      When true, KHRFP16 is supported.
    • cl_khr_fp64

      public final boolean cl_khr_fp64
      When true, KHRFP64 is supported.
    • cl_khr_gl_depth_images

      public final boolean cl_khr_gl_depth_images
      When true, KHRGLDepthImages is supported.
    • cl_khr_gl_event

      public final boolean cl_khr_gl_event
      When true, KHRGLEvent is supported.
    • cl_khr_gl_msaa_sharing

      public final boolean cl_khr_gl_msaa_sharing
      When true, KHRGLMSAASharing is supported.
    • cl_khr_gl_sharing

      public final boolean cl_khr_gl_sharing
      When true, KHRGLSharing is supported.
    • cl_khr_global_int32_base_atomics

      public final boolean cl_khr_global_int32_base_atomics
      When true, the khr_global_int32_base_atomics extension is supported.

      This extension adds basic atomic operations on 32-bit integers in global memory.

    • cl_khr_global_int32_extended_atomics

      public final boolean cl_khr_global_int32_extended_atomics
      When true, the khr_global_int32_extended_atomics extension is supported.

      This extension adds extended atomic operations on 32-bit integers in global memory.

    • cl_khr_icd

      public final boolean cl_khr_icd
      When true, KHRICD is supported.
    • cl_khr_il_program

      public final boolean cl_khr_il_program
      When true, KHRILProgram is supported.
    • cl_khr_image2d_from_buffer

      public final boolean cl_khr_image2d_from_buffer
      When true, KHRImage2DFromBuffer is supported.
    • cl_khr_initialize_memory

      public final boolean cl_khr_initialize_memory
      When true, KHRInitializeMemory is supported.
    • cl_khr_int64_base_atomics

      public final boolean cl_khr_int64_base_atomics
      When true, the khr_int64_base_atomics extension is supported.

      This extension adds basic atomic operations on 64-bit integers in both global and local memory.

    • cl_khr_int64_extended_atomics

      public final boolean cl_khr_int64_extended_atomics
      When true, the khr_int64_extended_atomics extension is supported.

      This extension adds extended atomic operations on 64-bit integers in both global and local memory.

    • cl_khr_integer_dot_product

      public final boolean cl_khr_integer_dot_product
      When true, KHRIntegerDotProduct is supported.
    • cl_khr_local_int32_base_atomics

      public final boolean cl_khr_local_int32_base_atomics
      When true, the khr_local_int32_base_atomics extension is supported.

      This extension adds basic atomic operations on 32-bit integers in local memory.

    • cl_khr_local_int32_extended_atomics

      public final boolean cl_khr_local_int32_extended_atomics
      When true, the khr_local_int32_extended_atomics extension is supported.

      This extension adds extended atomic operations on 32-bit integers in local memory.

    • cl_khr_mipmap_image

      public final boolean cl_khr_mipmap_image
      When true, KHRMipmapImage is supported.
    • cl_khr_mipmap_image_writes

      public final boolean cl_khr_mipmap_image_writes
      When true, the khr_mipmap_image_writes extension is supported.

      This extension adds built-in functions that can be used to write a mip-mapped image in an OpenCL C program.

    • cl_khr_pci_bus_info

      public final boolean cl_khr_pci_bus_info
      When true, KHRPCIBusInfo is supported.
    • cl_khr_priority_hints

      public final boolean cl_khr_priority_hints
      When true, KHRPriorityHints is supported.
    • cl_khr_select_fprounding_mode

      public final boolean cl_khr_select_fprounding_mode
      When true, the khr_select_fprounding_mode extension is supported.

      This extension adds support for specifying the rounding mode for an instruction or group of instructions in the program source.

      The appropriate rounding mode can be specified using #pragma OPENCL SELECT_ROUNDING_MODE rounding-mode in the program source.

      The #pragma OPENCL SELECT_ROUNDING_MODE sets the rounding mode for all instructions that operate on floating-point types (scalar or vector types) or produce floating-point values that follow this pragma in the program source until the next #pragma OPENCL SELECT_ROUNDING_MODE is encountered. Note that the rounding mode specified for a block of code is known at compile time. Except where otherwise documented, the callee functions do not inherit the rounding mode of the caller function.

      If this extension is enabled, the __ROUNDING_MODE__ preprocessor symbol shall be defined to be one of the following according to the current rounding mode:

      
       #define __ROUNDING_MODE__ rte
       #define __ROUNDING_MODE__ rtz
       #define __ROUNDING_MODE__ rtp
       #define __ROUNDING_MODE__ rtn

      The default rounding mode is round to nearest even. The built-in math functions, the common functions, and the geometric functions are implemented with the round to nearest even rounding mode.

      Various built-in conversions and the vstore_half and vstorea_halfn built-in functions that do not specify a rounding mode inherit the current rounding mode. Conversions from floating-point to integer type always use rtz mode, except where the user specifically asks for another rounding mode.

      Notes: The above four rounding modes are defined by IEEE 754. Floating-point calculations may be carried out internally with extra precision and then rounded to fit into the destination type. Round to nearest even is currently the only rounding mode required by the OpenCL specification and is therefore the default rounding mode. In addition, only static selection of rounding mode is supported. Dynamically reconfiguring the rounding modes as specified by the IEEE 754 spec is not a requirement.
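
      The direction of each of the four IEEE 754 rounding modes can be illustrated with java.math.RoundingMode. The Java sketch below rounds decimal values to integers purely to show how the modes differ; it is not how an OpenCL device rounds floating-point results. The class name, the method name and the mapping from the rte, rtz, rtp and rtn names to Java constants are assumptions made for this illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class RoundingModeSketch {
    // Rounds a decimal string to an integer using the named mode.
    // Hypothetical helper for illustration; not an OpenCL or LWJGL API.
    static long round(String value, String mode) {
        RoundingMode rm;
        switch (mode) {
            case "rte": rm = RoundingMode.HALF_EVEN; break; // to nearest even
            case "rtz": rm = RoundingMode.DOWN; break;      // toward zero
            case "rtp": rm = RoundingMode.CEILING; break;   // toward +infinity
            case "rtn": rm = RoundingMode.FLOOR; break;     // toward -infinity
            default: throw new IllegalArgumentException("unknown mode: " + mode);
        }
        return new BigDecimal(value).setScale(0, rm).longValueExact();
    }

    public static void main(String[] args) {
        for (String mode : new String[] {"rte", "rtz", "rtp", "rtn"}) {
            System.out.println(mode + ": " + round("2.5", mode) + " " + round("-2.5", mode));
        }
    }
}
```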

    • cl_khr_semaphore

      public final boolean cl_khr_semaphore
      When true, KHRSemaphore is supported.
    • cl_khr_spir

      public final boolean cl_khr_spir
      When true, KHRSPIR is supported.
    • cl_khr_srgb_image_writes

      public final boolean cl_khr_srgb_image_writes
      When true, the khr_srgb_image_writes extension is supported.

      This extension enables kernels to write to sRGB images using the write_imagef built-in function. The sRGB image formats that may be written to will be returned by GetSupportedImageFormats.

      When the image is an sRGB image, the write_imagef built-in function will perform the linear to sRGB conversion. Only the R, G, and B components are converted from linear to sRGB; the A component is written as-is.
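
      The conversion can be sketched on the host as follows. This Java example uses the standard linear-to-sRGB transfer function and leaves alpha untouched. The class and method names are hypothetical, and a device implementation may differ within the precision the specification allows.

```java
final class SrgbWriteSketch {
    // Converts one linear colour component in [0, 1] to its sRGB encoding
    // using the standard sRGB transfer function.
    static float linearToSrgb(float c) {
        if (Float.isNaN(c)) {
            c = 0f;
        }
        c = Math.max(0f, Math.min(1f, c)); // clamp to the representable range
        if (c < 0.0031308f) {
            return 12.92f * c;
        }
        return (float) (1.055 * Math.pow(c, 1.0 / 2.4) - 0.055);
    }

    // Converts R, G and B and passes A through unchanged.
    static float[] encodeRgba(float r, float g, float b, float a) {
        return new float[] {linearToSrgb(r), linearToSrgb(g), linearToSrgb(b), a};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(encodeRgba(0.5f, 0f, 1f, 0.25f)));
    }
}
```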

    • cl_khr_subgroup_ballot

      public final boolean cl_khr_subgroup_ballot
      When true, the khr_subgroup_ballot extension is supported.

      This extension adds the ability to collect and operate on ballots from work items in the subgroup.
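
      Conceptually, a ballot packs one boolean vote per work item into a bit mask that every work item can then inspect. The Java sketch below simulates this on the host for a subgroup of at most 64 work items; it is an illustration only, with hypothetical names, and does not call the device-side built-ins (such as sub_group_ballot) that the extension defines.

```java
// Host-side simulation of a subgroup ballot. Names are hypothetical.
final class SubgroupBallotSketch {
    /** Packs one predicate per simulated work item into a mask (bit i = work item i). */
    static long ballot(boolean[] predicates) {
        if (predicates.length > 64) {
            throw new IllegalArgumentException("this sketch supports at most 64 work items");
        }
        long mask = 0L;
        for (int i = 0; i < predicates.length; i++) {
            if (predicates[i]) {
                mask |= 1L << i;
            }
        }
        return mask;
    }

    /** Number of work items that voted true. */
    static int bitCount(long mask) {
        return Long.bitCount(mask);
    }

    /** Whether the work item with the given index voted true. */
    static boolean bitExtract(long mask, int index) {
        return ((mask >>> index) & 1L) != 0;
    }

    public static void main(String[] args) {
        long mask = ballot(new boolean[] {true, false, true, true});
        System.out.println(Long.toBinaryString(mask) + " votes=" + bitCount(mask));
    }
}
```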

    • cl_khr_subgroup_clustered_reduce

      public final boolean cl_khr_subgroup_clustered_reduce
      When true, the khr_subgroup_clustered_reduce extension is supported.

      This extension adds support for clustered reductions that operate on a subset of work items in the subgroup.
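
      A clustered reduction can be pictured as splitting the subgroup into contiguous clusters and reducing each cluster independently, with every work item receiving its own cluster's result. The Java sketch below simulates an additive clustered reduction on the host. The names are hypothetical, and the power-of-two check reflects an assumption about the cluster size rather than a statement taken from this page.

```java
// Host-side simulation of a clustered add-reduction. Names are hypothetical.
final class ClusteredReduceSketch {
    /**
     * For each contiguous cluster of clusterSize simulated work items,
     * computes the sum and gives it to every work item in that cluster.
     */
    static int[] clusteredReduceAdd(int[] values, int clusterSize) {
        // Assumption: the cluster size is a positive power of two.
        if (clusterSize <= 0 || Integer.bitCount(clusterSize) != 1) {
            throw new IllegalArgumentException("clusterSize must be a power of two");
        }
        int[] result = new int[values.length];
        for (int start = 0; start < values.length; start += clusterSize) {
            int end = Math.min(start + clusterSize, values.length);
            int sum = 0;
            for (int i = start; i < end; i++) {
                sum += values[i];
            }
            for (int i = start; i < end; i++) {
                result[i] = sum;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] out = clusteredReduceAdd(new int[] {1, 2, 3, 4, 5, 6, 7, 8}, 4);
        System.out.println(java.util.Arrays.toString(out));
    }
}
```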

    • cl_khr_subgroup_extended_types

      public final boolean cl_khr_subgroup_extended_types
      When true, the khr_subgroup_extended_types extension is supported.

      This extension adds additional supported data types to the existing subgroup broadcast, scan, and reduction functions.

    • cl_khr_subgroup_named_barrier

      public final boolean cl_khr_subgroup_named_barrier
      When true, KHRSubgroupNamedBarrier is supported.
    • cl_khr_subgroup_non_uniform_arithmetic

      public final boolean cl_khr_subgroup_non_uniform_arithmetic
      When true, the khr_subgroup_non_uniform_arithmetic extension is supported.

      This extension adds the ability to use some subgroup functions within non-uniform flow control, including additional scan and reduction operators.

    • cl_khr_subgroup_non_uniform_vote

      public final boolean cl_khr_subgroup_non_uniform_vote
      When true, the khr_subgroup_non_uniform_vote extension is supported.

      This extension adds the ability to elect a single work item from a subgroup to perform a task and to hold votes among work items in a subgroup.

    • cl_khr_subgroup_shuffle

      public final boolean cl_khr_subgroup_shuffle
      When true, the khr_subgroup_shuffle extension is supported.

      This extension adds additional ways to exchange data among work items in a subgroup.
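
      The kind of data exchange involved can be sketched on the host as follows: each simulated work item reads the value held by another work item, chosen either by an explicit index or by XOR-ing its own index with a mask. This is a Java illustration with hypothetical names; it does not call the device-side built-ins, and out-of-range sources are rejected here only to keep the sketch well defined.

```java
// Host-side simulation of subgroup shuffles. Names are hypothetical.
final class SubgroupShuffleSketch {
    /** Work item i receives the value held by work item sourceIndex[i]. */
    static int[] shuffle(int[] values, int[] sourceIndex) {
        int[] result = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            int src = sourceIndex[i];
            if (src < 0 || src >= values.length) {
                throw new IllegalArgumentException("source index out of range: " + src);
            }
            result[i] = values[src];
        }
        return result;
    }

    /** Work item i receives the value held by work item (i XOR mask). */
    static int[] shuffleXor(int[] values, int mask) {
        int[] sourceIndex = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            sourceIndex[i] = i ^ mask;
        }
        return shuffle(values, sourceIndex);
    }

    public static void main(String[] args) {
        int[] values = {10, 11, 12, 13};
        System.out.println(java.util.Arrays.toString(shuffleXor(values, 1)));
        System.out.println(java.util.Arrays.toString(shuffle(values, new int[] {3, 3, 0, 0})));
    }
}
```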

    • cl_khr_subgroup_shuffle_relative

      public final boolean cl_khr_subgroup_shuffle_relative
      When true, the khr_subgroup_shuffle_relative extension is supported.

      This extension adds specialized ways to exchange data among work items in a subgroup that may perform better on some implementations.

    • cl_khr_subgroups

      public final boolean cl_khr_subgroups
      When true, KHRSubgroups is supported.
    • cl_khr_suggested_local_work_size

      public final boolean cl_khr_suggested_local_work_size
      When true, KHRSuggestedLocalWorkSize is supported.
    • cl_khr_terminate_context

      public final boolean cl_khr_terminate_context
      When true, KHRTerminateContext is supported.
    • cl_khr_throttle_hints

      public final boolean cl_khr_throttle_hints
      When true, KHRThrottleHints is supported.
    • cl_nv_compiler_options

      public final boolean cl_nv_compiler_options
      When true, the nv_compiler_options extension is supported.

      This extension allows the programmer to pass options to the PTX assembler allowing greater control over code generation.

      
       -cl-nv-maxrregcount <N>
           Passed on to ptxas as --maxrregcount <N>
               N is a positive integer.
           Specify the maximum number of registers that GPU functions can use.
           Up to a function-specific limit, a higher value will generally increase
           the performance of individual GPU threads that execute this function.
           However, because thread registers are allocated from a global register
           pool on each GPU, a higher value of this option will also reduce the
           maximum thread block size, thereby reducing the amount of thread
           parallelism. Hence, a good maxrregcount value is the result of a
           trade-off.
           If this option is not specified, then no maximum is assumed. Otherwise
           the specified value will be rounded to the next multiple of 4 registers,
           up to the GPU-specific maximum of 128 registers.
       
       -cl-nv-opt-level <N>
           Passed on to ptxas as --opt-level <N>
               N is a positive integer, or 0 (no optimization).
           Specify optimization level.
           Default value:  3.
       
       -cl-nv-verbose
           Passed on to ptxas as --verbose
           Enable verbose mode.
           Output will be reported in the build log (accessible through the
           callback parameter to clBuildProgram).
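
      A minimal sketch of how a host program might assemble these options before calling clBuildProgram, assuming the caller has already read the cl_nv_compiler_options flag from its CLCapabilities instance. The helper class and method are hypothetical, and the option spelling follows the "<option> <N>" form shown above.

```java
// Hypothetical helper that builds the options string passed to clBuildProgram.
final class NvBuildOptionsSketch {
    /**
     * Returns NVIDIA-specific build options, or an empty string when the
     * extension is not supported so that other platforms are unaffected.
     */
    static String buildOptions(boolean nvCompilerOptionsSupported,
                               int maxRegisters, int optLevel, boolean verbose) {
        if (!nvCompilerOptionsSupported) {
            return "";
        }
        if (maxRegisters <= 0) {
            throw new IllegalArgumentException("maxRegisters must be a positive integer");
        }
        if (optLevel < 0) {
            throw new IllegalArgumentException("optLevel must be 0 or a positive integer");
        }
        StringBuilder options = new StringBuilder();
        options.append("-cl-nv-maxrregcount ").append(maxRegisters);
        options.append(" -cl-nv-opt-level ").append(optLevel);
        if (verbose) {
            options.append(" -cl-nv-verbose");
        }
        return options.toString();
    }

    public static void main(String[] args) {
        // In real code the first argument would come from caps.cl_nv_compiler_options.
        System.out.println(buildOptions(true, 32, 3, true));
    }
}
```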
    • cl_nv_copy_opts

      public final boolean cl_nv_copy_opts
      When true, the nv_copy_opts extension is supported.
    • cl_nv_create_buffer

      public final boolean cl_nv_create_buffer
      When true, NVCreateBuffer is supported.
    • cl_nv_device_attribute_query

      public final boolean cl_nv_device_attribute_query
      When true, NVDeviceAttributeQuery is supported.
    • cl_nv_pragma_unroll

      public final boolean cl_nv_pragma_unroll
      When true, the nv_pragma_unroll extension is supported.

      Overview

      This extension extends the OpenCL C language with a hint that allows loops to be unrolled. The pragma must be applied to a loop and can request either full unrolling or partial unrolling by a given factor. It is only a hint: the compiler may ignore it for any reason.

      Goals

      The principal goal of the unroll pragma is to improve the performance of loops via unrolling. Typically this enables other optimizations or improves the instruction-level parallelism of a thread.

      Details

      A user may specify that a loop in the source program be unrolled. This is done via a pragma. The syntax of this pragma is as follows:

      #pragma unroll [unroll-factor]

      The pragma unroll may optionally specify an unroll factor. The pragma must be placed immediately before the loop and only applies to that loop.

      If no unroll factor is specified, the compiler will try to unroll the loop completely. If an unroll factor is specified, the compiler will perform partial loop unrolling. The unroll factor, if specified, must be a compile-time non-negative integer constant.

      A loop unroll factor of 1 means that the compiler should not unroll the loop.

      A complete unroll specification has no effect if the trip count of the loop is not compile-time computable.
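
      To make the placement and effect of the pragma concrete, the Java sketch below holds a hypothetical kernel source string with the pragma immediately before its loop, and shows by hand what partial unrolling by a factor of 4 amounts to. The kernel, class and method names are illustrative assumptions; what the compiler actually generates is up to the implementation.

```java
// Illustration of "#pragma unroll 4". All names are hypothetical.
final class PragmaUnrollSketch {
    /** Hypothetical kernel: the pragma sits immediately before the loop it applies to. */
    static final String KERNEL_SOURCE =
        "__kernel void sum16(__global const int* in, __global int* out) {\n" +
        "    int acc = 0;\n" +
        "    #pragma unroll 4\n" +
        "    for (int i = 0; i < 16; i++) {\n" +
        "        acc += in[i];\n" +
        "    }\n" +
        "    out[get_global_id(0)] = acc;\n" +
        "}\n";

    /** The loop as written: one element per iteration. */
    static int sumRolled(int[] data) {
        int acc = 0;
        for (int i = 0; i < data.length; i++) {
            acc += data[i];
        }
        return acc;
    }

    /** The same loop partially unrolled by 4, with a remainder loop for leftovers. */
    static int sumUnrolledBy4(int[] data) {
        int acc = 0;
        int i = 0;
        int limit = data.length - (data.length % 4);
        for (; i < limit; i += 4) {
            acc += data[i];
            acc += data[i + 1];
            acc += data[i + 2];
            acc += data[i + 3];
        }
        for (; i < data.length; i++) {
            acc += data[i];
        }
        return acc;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println(sumRolled(data) + " == " + sumUnrolledBy4(data));
    }
}
```

      Both methods return the same result; unrolling changes only how much work each loop iteration performs.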

    • cl_pocl_content_size

      public final boolean cl_pocl_content_size
      When true, the pocl_content_size extension is supported.
    • cl_qcom_ext_host_ptr

      public final boolean cl_qcom_ext_host_ptr
      When true, QCOMEXTHostPtr is supported.
    • cl_qcom_ext_host_ptr_iocoherent

      public final boolean cl_qcom_ext_host_ptr_iocoherent
      When true, QCOMEXTHostPtrIOCoherent is supported.