Class CU101

    • Method Detail

      • cuStreamBeginCapture_v2

        public static int cuStreamBeginCapture_v2​(long hStream,
                                                  int mode)
      • ncuThreadExchangeStreamCaptureMode

        public static int ncuThreadExchangeStreamCaptureMode​(long mode)
      • cuThreadExchangeStreamCaptureMode

        public static int cuThreadExchangeStreamCaptureMode​(java.nio.IntBuffer mode)
        Swaps the stream capture interaction mode for a thread.

        Sets the calling thread's stream capture interaction mode to the value contained in *mode, and overwrites *mode with the previous mode for the thread. To facilitate deterministic behavior across function or module boundaries, callers are encouraged to use this API in a push-pop fashion:

         CUstreamCaptureMode mode = desiredMode;
         cuThreadExchangeStreamCaptureMode(&mode);
         ...
         cuThreadExchangeStreamCaptureMode(&mode); // restore previous mode
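
        The push-pop pattern can be sketched in Java as follows. This is a minimal model under stated assumptions, not the binding itself: CaptureModeSketch, exchange, current and withMode are hypothetical names, and the thread-local field stands in for the driver's per-thread state so that the example runs without a GPU. With the real binding, a one-element IntBuffer passed to cuThreadExchangeStreamCaptureMode would play the role of the int[] used here.

```java
import java.util.function.Supplier;

/** Hypothetical in-memory model of the per-thread capture mode; not part of LWJGL or CUDA. */
final class CaptureModeSketch {
    // Numeric values of CUstreamCaptureMode as declared in cuda.h.
    static final int GLOBAL = 0, THREAD_LOCAL = 1, RELAXED = 2;

    // Stand-in for driver state: every thread starts in the default GLOBAL mode.
    private static final ThreadLocal<Integer> CURRENT = ThreadLocal.withInitial(() -> GLOBAL);

    /** Stand-in for the exchange call: installs mode[0] and writes the previous mode back into mode[0]. */
    static void exchange(int[] mode) {
        int previous = CURRENT.get();
        CURRENT.set(mode[0]);
        mode[0] = previous;
    }

    /** Returns the calling thread's current mode in this model. */
    static int current() {
        return CURRENT.get();
    }

    /** Push-pop: runs body under desiredMode, then restores whatever mode was active before. */
    static <T> T withMode(int desiredMode, Supplier<T> body) {
        int[] mode = { desiredMode };
        exchange(mode);       // push: mode[0] now holds the previous mode
        try {
            return body.get();
        } finally {
            exchange(mode);   // pop: restore the previous mode, even if body throws
        }
    }

    public static void main(String[] args) {
        int inside = withMode(RELAXED, CaptureModeSketch::current);
        System.out.println("inside=" + inside + " after=" + current()); // prints "inside=2 after=0"
    }
}
```

        Wrapping the second exchange in a finally block is a design choice of this sketch: it keeps the restore paired with the first call even when the guarded code throws.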

        During stream capture (see cuStreamBeginCapture), some actions, such as a call to cudaMalloc, may be unsafe. In the case of cudaMalloc, the operation is not enqueued asynchronously to a stream and is not observed by stream capture. Therefore, if the sequence of operations captured via cuStreamBeginCapture depended on the allocation being replayed whenever the graph is launched, the captured graph would be invalid.

        For this reason, stream capture restricts the API calls that can be made within, or concurrently with, a cuStreamBeginCapture-cuStreamEndCapture sequence. These restrictions can be controlled via this API and the flags passed to cuStreamBeginCapture.

        A thread's mode is one of the following:

        • CU_STREAM_CAPTURE_MODE_GLOBAL: This is the default mode.

          If the local thread has an ongoing capture sequence that was not initiated with CU_STREAM_CAPTURE_MODE_RELAXED at cuStreamBeginCapture, or if any other thread has a concurrent capture sequence initiated with CU_STREAM_CAPTURE_MODE_GLOBAL, this thread is prohibited from potentially unsafe API calls.

        • CU_STREAM_CAPTURE_MODE_THREAD_LOCAL: If the local thread has an ongoing capture sequence not initiated with CU_STREAM_CAPTURE_MODE_RELAXED, it is prohibited from potentially unsafe API calls. Concurrent capture sequences in other threads are ignored.
        • CU_STREAM_CAPTURE_MODE_RELAXED: The local thread is not prohibited from potentially unsafe API calls. Note that the thread is still prohibited from API calls that necessarily conflict with stream capture, for example, calling cuEventQuery on an event that was last recorded inside a capture sequence.
        mode - pointer to mode value to swap with the current mode
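
        The three mode rules above can be read as a single decision function. The following is a hedged sketch of that reading, not driver code: CaptureRulesSketch, unsafeCallsProhibited and the NO_CAPTURE sentinel are hypothetical names introduced for illustration, and the model only covers "potentially unsafe" calls, not calls that necessarily conflict with capture.

```java
/** Hypothetical model of the capture-mode rules; names are illustrative and not part of LWJGL or CUDA. */
final class CaptureRulesSketch {
    // Numeric values of CUstreamCaptureMode as declared in cuda.h.
    static final int GLOBAL = 0, THREAD_LOCAL = 1, RELAXED = 2;

    // Illustrative sentinel (an assumption of this sketch): the local thread has no ongoing capture.
    static final int NO_CAPTURE = -1;

    /**
     * @param threadMode               the calling thread's capture interaction mode
     * @param localCaptureMode         mode the local thread's ongoing capture was begun with, or NO_CAPTURE
     * @param otherThreadGlobalCapture whether another thread has a concurrent capture begun with GLOBAL
     * @return true if the calling thread is prohibited from potentially unsafe API calls
     */
    static boolean unsafeCallsProhibited(int threadMode, int localCaptureMode, boolean otherThreadGlobalCapture) {
        // A local capture counts only if it exists and was not begun in RELAXED mode.
        boolean localStrictCapture = localCaptureMode != NO_CAPTURE && localCaptureMode != RELAXED;
        switch (threadMode) {
            case GLOBAL:
                return localStrictCapture || otherThreadGlobalCapture;
            case THREAD_LOCAL:
                return localStrictCapture; // captures in other threads are ignored
            case RELAXED:
                return false;
            default:
                throw new IllegalArgumentException("unknown capture mode: " + threadMode);
        }
    }

    public static void main(String[] args) {
        // A GLOBAL-mode thread is affected by another thread's GLOBAL capture; a THREAD_LOCAL one is not.
        System.out.println(unsafeCallsProhibited(GLOBAL, NO_CAPTURE, true));
        System.out.println(unsafeCallsProhibited(THREAD_LOCAL, NO_CAPTURE, true));
    }
}
```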
      • ncuStreamGetCaptureInfo

        public static int ncuStreamGetCaptureInfo​(long hStream,
                                                  long captureStatus,
                                                  long id)
        Unsafe version of: cuStreamGetCaptureInfo
      • cuStreamGetCaptureInfo

        public static int cuStreamGetCaptureInfo​(long hStream,
                                                 java.nio.IntBuffer captureStatus,
                                                 java.nio.LongBuffer id)
        Query capture status of a stream.

        Query the capture status of a stream and get an id for the capture sequence, which is unique over the lifetime of the process.

        If called on CU_STREAM_LEGACY (the "null stream") while a stream not created with CU_STREAM_NON_BLOCKING is capturing, returns CU.CUDA_ERROR_STREAM_CAPTURE_IMPLICIT.

        A valid id is returned only if both of the following are true:

        • the call returns CUDA_SUCCESS
        • captureStatus is set to CU_STREAM_CAPTURE_STATUS_ACTIVE

      • ncuGraphExecKernelNodeSetParams

        public static int ncuGraphExecKernelNodeSetParams​(long hGraphExec,
                                                          long hNode,
                                                          long nodeParams)
        Unsafe version of: cuGraphExecKernelNodeSetParams
      • cuGraphExecKernelNodeSetParams

        public static int cuGraphExecKernelNodeSetParams​(long hGraphExec,
                                                         long hNode,
                                                         CUDA_KERNEL_NODE_PARAMS nodeParams)
        Sets the parameters for a kernel node in the given hGraphExec.

        Sets the parameters of a kernel node in an executable graph hGraphExec. The node is identified by the corresponding node hNode in the non-executable graph, from which the executable graph was instantiated.

        hNode must not have been removed from the original graph. The func field of nodeParams cannot be modified and must match the original value. All other values can be modified.

        The modifications take effect at the next launch of hGraphExec. Already enqueued or running launches of hGraphExec are not affected by this call. hNode is also not modified by this call.

        hGraphExec - the executable graph in which to set the specified node
        hNode - kernel node from the graph from which hGraphExec was instantiated
        nodeParams - updated parameters to set