
Showing content from https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__MALLOC__ASYNC.html below:

CUDA Driver API :: CUDA Toolkit Documentation

CUresult cuMemAllocAsync ( CUdeviceptr* dptr, size_t bytesize, CUstream hStream )

Allocates memory with stream ordered semantics.

dptr
- Returned device pointer
bytesize
- Number of bytes to allocate
hStream
- The stream establishing the stream ordering contract and the memory pool to allocate from

Inserts an allocation operation into hStream. A pointer to the allocated memory is returned immediately in *dptr. The allocation must not be accessed until the allocation operation completes. The allocation comes from the memory pool current to the stream's device.
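The stream-ordered pattern can be sketched as follows. This is a minimal illustration, not a production example: it assumes device 0 and collapses error handling into a macro; a real application would inspect every CUresult, and running it requires a CUDA-capable GPU and driver.

```cuda
#include <cuda.h>
#include <stdio.h>

#define CHECK(call) do { CUresult r = (call); if (r != CUDA_SUCCESS) { \
    fprintf(stderr, "%s failed: %d\n", #call, (int)r); return 1; } } while (0)

int main(void) {
    CUdevice dev; CUcontext ctx; CUstream stream;
    CHECK(cuInit(0));
    CHECK(cuDeviceGet(&dev, 0));
    CHECK(cuCtxCreate(&ctx, 0, dev));
    CHECK(cuStreamCreate(&stream, CU_STREAM_NON_BLOCKING));

    CUdeviceptr dptr;
    CHECK(cuMemAllocAsync(&dptr, 1 << 20, stream));   // 1 MiB, stream-ordered
    CHECK(cuMemsetD8Async(dptr, 0, 1 << 20, stream)); // use it only in stream order
    CHECK(cuMemFreeAsync(dptr, stream));              // free in the same stream
    CHECK(cuStreamSynchronize(stream));               // everything above has completed

    CHECK(cuStreamDestroy(stream));
    CHECK(cuCtxDestroy(ctx));
    return 0;
}
```

Note that the allocation, the work that uses it, and the free are all ordered by the stream; no host synchronization is needed between them.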

See also:

cuMemAllocFromPoolAsync, cuMemFreeAsync, cuDeviceSetMemPool, cuDeviceGetDefaultMemPool, cuDeviceGetMemPool, cuMemPoolCreate, cuMemPoolSetAccess, cuMemPoolSetAttribute

CUresult cuMemAllocFromPoolAsync ( CUdeviceptr* dptr, size_t bytesize, CUmemoryPool pool, CUstream hStream )

Allocates memory from a specified pool with stream ordered semantics.

dptr
- Returned device pointer
bytesize
- Number of bytes to allocate
pool
- The pool to allocate from
hStream
- The stream establishing the stream ordering semantic

Inserts an allocation operation into hStream. A pointer to the allocated memory is returned immediately in *dptr. The allocation must not be accessed until the allocation operation completes. The allocation comes from the specified memory pool.

Note:

During stream capture, this function results in the creation of an allocation node. In this case, the allocation is owned by the graph instead of the memory pool. The memory pool's properties are used to set the node's creation parameters.
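A hedged sketch of allocating from an explicit pool rather than the stream device's current pool. It assumes an initialized context and a valid `stream`, and omits error checking for brevity:

```cuda
CUmemoryPool pool;
CUmemPoolProps props = {0};
props.allocType     = CU_MEM_ALLOCATION_TYPE_PINNED;
props.handleTypes   = CU_MEM_HANDLE_TYPE_NONE;    // no IPC needed for this pool
props.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
props.location.id   = 0;                          // device ordinal

cuMemPoolCreate(&pool, &props);

CUdeviceptr p;
cuMemAllocFromPoolAsync(&p, 4096, pool, stream);  // pool chosen explicitly
cuMemFreeAsync(p, stream);                        // memory returns to `pool`
cuStreamSynchronize(stream);
cuMemPoolDestroy(pool);
```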

See also:

cuMemAllocAsync, cuMemFreeAsync, cuDeviceGetDefaultMemPool, cuDeviceGetMemPool, cuMemPoolCreate, cuMemPoolSetAccess, cuMemPoolSetAttribute

CUresult cuMemFreeAsync ( CUdeviceptr dptr, CUstream hStream )

Frees memory with stream ordered semantics.

dptr
- memory to free
hStream
- The stream establishing the stream ordering contract.

Inserts a free operation into hStream. The allocation must not be accessed after stream execution reaches the free. After this API returns, accessing the memory from any subsequent work launched on the GPU or querying its pointer attributes results in undefined behavior.

Note:

During stream capture, this function results in the creation of a free node and must therefore be passed the address of a graph allocation.

CUresult cuMemGetDefaultMemPool ( CUmemoryPool* pool_out, CUmemLocation* location, CUmemAllocationType type )

Returns the default memory pool for a given location and allocation type.

The memory location can be of one of CU_MEM_LOCATION_TYPE_DEVICE, CU_MEM_LOCATION_TYPE_HOST or CU_MEM_LOCATION_TYPE_HOST_NUMA. The allocation type can be one of CU_MEM_ALLOCATION_TYPE_PINNED or CU_MEM_ALLOCATION_TYPE_MANAGED. When the allocation type is CU_MEM_ALLOCATION_TYPE_MANAGED, the location type can also be CU_MEM_LOCATION_TYPE_NONE to indicate no preferred location for the managed memory pool. In all other cases, the call returns CUDA_ERROR_INVALID_VALUE.

Note:

Note that this function may also return error codes from previous, asynchronous launches.

See also:

cuMemAllocAsync, cuMemPoolTrimTo, cuMemPoolGetAttribute, cuMemPoolSetAttribute, cuMemPoolSetAccess, cuMemGetMemPool, cuMemPoolCreate

CUresult cuMemGetMemPool ( CUmemoryPool* pool, CUmemLocation* location, CUmemAllocationType type )

Gets the current memory pool for a memory location and of a particular allocation type.

CUresult cuMemPoolCreate ( CUmemoryPool* pool, const CUmemPoolProps* poolProps )

Creates a memory pool.

Creates a CUDA memory pool and returns the handle in pool. The poolProps determines the properties of the pool such as the backing device and IPC capabilities.

To create a memory pool for HOST memory not targeting a specific NUMA node, applications must set CUmemPoolProps::CUmemLocation::type to CU_MEM_LOCATION_TYPE_HOST; CUmemPoolProps::CUmemLocation::id is ignored for such pools. Pools created with the type CU_MEM_LOCATION_TYPE_HOST are not IPC capable, and CUmemPoolProps::handleTypes must be 0; any other value results in CUDA_ERROR_INVALID_VALUE. To create a memory pool targeting a specific host NUMA node, applications must set CUmemPoolProps::CUmemLocation::type to CU_MEM_LOCATION_TYPE_HOST_NUMA, and CUmemPoolProps::CUmemLocation::id must specify the NUMA ID of the host memory node. Specifying CU_MEM_LOCATION_TYPE_HOST_NUMA_CURRENT as the CUmemPoolProps::CUmemLocation::type results in CUDA_ERROR_INVALID_VALUE.

By default, the pool's memory will be accessible from the device it is allocated on. Pools created with CU_MEM_LOCATION_TYPE_HOST_NUMA or CU_MEM_LOCATION_TYPE_HOST are by default accessible from the host CPU. Applications can control the maximum size of the pool by specifying a non-zero value for CUmemPoolProps::maxSize. If set to 0, the maximum size of the pool defaults to a system-dependent value.
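As an illustration of the rules above, a pool backed by host memory on a specific NUMA node might be configured like this (a sketch, assuming NUMA node 0 exists on the system; error checking omitted):

```cuda
CUmemoryPool hostPool;
CUmemPoolProps props = {0};
props.allocType     = CU_MEM_ALLOCATION_TYPE_PINNED;
props.handleTypes   = CU_MEM_HANDLE_TYPE_NONE;       // HOST-type pools are not IPC capable
props.location.type = CU_MEM_LOCATION_TYPE_HOST_NUMA;
props.location.id   = 0;                             // host NUMA node id
props.maxSize       = 0;                             // 0 => system-dependent default limit

cuMemPoolCreate(&hostPool, &props);
```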

Applications that intend to use CU_MEM_HANDLE_TYPE_FABRIC based memory sharing must ensure that: (1) the `nvidia-caps-imex-channels` character device is created by the driver and is listed under /proc/devices, and (2) the user launching the application has access to at least one IMEX channel file.

When exporter and importer CUDA processes have been granted access to the same IMEX channel, they can securely share memory.

The IMEX channel security model works on a per-user basis: all processes under a user can share memory if that user has access to a valid IMEX channel. When multi-user isolation is desired, a separate IMEX channel is required for each user.

These channel files exist in /dev/nvidia-caps-imex-channels/channel* and can be created using standard OS native calls such as mknod on Linux. For example, to create channel0 with the major number from /proc/devices, users can execute: `mknod /dev/nvidia-caps-imex-channels/channel0 c <major number> 0`

Note:

Specifying CU_MEM_HANDLE_TYPE_NONE creates a memory pool that will not support IPC.

See also:

cuDeviceSetMemPool, cuDeviceGetMemPool, cuDeviceGetDefaultMemPool, cuMemAllocFromPoolAsync, cuMemPoolExportToShareableHandle

CUresult cuMemPoolDestroy ( CUmemoryPool pool )

Destroys the specified memory pool.

If any pointers obtained from this pool haven't been freed or the pool has free operations that haven't completed when cuMemPoolDestroy is invoked, the function will return immediately and the resources associated with the pool will be released automatically once there are no more outstanding allocations.

Destroying the current mempool of a device sets the default mempool of that device as the current mempool for that device.

Note:

A device's default memory pool cannot be destroyed.

See also:

cuMemFreeAsync, cuDeviceSetMemPool, cuDeviceGetMemPool, cuDeviceGetDefaultMemPool, cuMemPoolCreate

CUresult cuMemPoolExportPointer ( CUmemPoolPtrExportData* shareData_out, CUdeviceptr ptr )

Export data to share a memory pool allocation between processes.

shareData_out
- Returned export data
ptr
- pointer to memory being exported
CUresult cuMemPoolExportToShareableHandle ( void* handle_out, CUmemoryPool pool, CUmemAllocationHandleType handleType, unsigned long long flags )

Exports a memory pool to the requested handle type.

handle_out
- Returned OS handle
pool
- pool to export
handleType
- the type of handle to create
flags
- must be 0

Given an IPC capable mempool, create an OS handle to share the pool with another process. A recipient process can convert the shareable handle into a mempool with cuMemPoolImportFromShareableHandle. Individual pointers can then be shared with the cuMemPoolExportPointer and cuMemPoolImportPointer APIs. The implementation of what the shareable handle is and how it can be transferred is defined by the requested handle type.
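The exporting side of this flow can be sketched as follows, assuming the pool was created with CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR (so it is IPC capable) and that `pool` and an allocation `ptr` from it already exist; the transport for the handle and export data (e.g. a Unix domain socket) is up to the application:

```cuda
// Export the pool itself: for this handle type the OS handle is a file descriptor.
int fd;
cuMemPoolExportToShareableHandle(&fd, pool,
                                 CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR, 0);
// ... send fd to the peer process ...

// Export one allocation from the pool.
CUmemPoolPtrExportData shareData;
cuMemPoolExportPointer(&shareData, ptr);
// ... send shareData to the peer, which calls cuMemPoolImportFromShareableHandle
// on the received fd and then cuMemPoolImportPointer on the received shareData ...
```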

Note:

To create an IPC capable mempool, create a mempool with a CUmemAllocationHandleType other than CU_MEM_HANDLE_TYPE_NONE.

See also:

cuMemPoolImportFromShareableHandle, cuMemPoolExportPointer, cuMemPoolImportPointer, cuMemAllocAsync, cuMemFreeAsync, cuDeviceGetDefaultMemPool, cuDeviceGetMemPool, cuMemPoolCreate, cuMemPoolSetAccess, cuMemPoolSetAttribute

CUresult cuMemPoolGetAccess ( CUmemAccess_flags* flags, CUmemoryPool memPool, CUmemLocation* location )

Returns the accessibility of a pool from a device.

flags
- the accessibility of the pool from the specified location
memPool
- the pool being queried
location
- the location accessing the pool
CUresult cuMemPoolGetAttribute ( CUmemoryPool pool, CUmemPool_attribute attr, void* value )

Gets attributes of a memory pool.

pool
- The memory pool to get attributes of
attr
- The attribute to get
value
- Retrieved value
CUresult cuMemPoolImportFromShareableHandle ( CUmemoryPool* pool_out, void* handle, CUmemAllocationHandleType handleType, unsigned long long flags )

Imports a memory pool from a shared handle.

pool_out
- Returned memory pool
handle
- OS handle of the pool to open
handleType
- The type of handle being imported
flags
- must be 0
CUresult cuMemPoolImportPointer ( CUdeviceptr* ptr_out, CUmemoryPool pool, CUmemPoolPtrExportData* shareData )

Import a memory pool allocation from another process.

ptr_out
- pointer to imported memory
pool
- pool from which to import
shareData
- data specifying the memory to import

Returns in ptr_out a pointer to the imported memory. The imported memory must not be accessed before the allocation operation completes in the exporting process. The imported memory must be freed from all importing processes before being freed in the exporting process. The pointer may be freed with cuMemFree or cuMemFreeAsync. If cuMemFreeAsync is used, the free must be completed on the importing process before the free operation on the exporting process.

Note:

The cuMemFreeAsync API may be used in the exporting process before the cuMemFreeAsync operation completes in its stream, as long as the cuMemFreeAsync in the exporting process specifies a stream with a stream dependency on the importing process's cuMemFreeAsync.

See also:

cuMemPoolExportToShareableHandle, cuMemPoolImportFromShareableHandle, cuMemPoolExportPointer

CUresult cuMemPoolSetAccess ( CUmemoryPool pool, const CUmemAccessDesc* map, size_t count )

Controls visibility of pools between devices.

pool
- The pool being modified
map
- Array of access descriptors. Each descriptor specifies the access to enable for a single GPU.
count
- Number of descriptors in the map array.
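For example, making a pool resident on one device accessible from another can be sketched as follows (assuming `pool` was created on device 0 and that devices 0 and 1 are peer-access capable; error checking omitted):

```cuda
CUmemAccessDesc desc = {0};
desc.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
desc.location.id   = 1;                             // the accessing device
desc.flags         = CU_MEM_ACCESS_FLAGS_PROT_READWRITE;

cuMemPoolSetAccess(pool, &desc, 1);                 // one descriptor in the array
// Allocations from `pool` are now readable and writable from device 1.
```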
CUresult cuMemPoolSetAttribute ( CUmemoryPool pool, CUmemPool_attribute attr, void* value )

Sets attributes of a memory pool.

pool
- The memory pool to modify
attr
- The attribute to modify
value
- Pointer to the value to assign
CUresult cuMemPoolTrimTo ( CUmemoryPool pool, size_t minBytesToKeep )

Tries to release memory back to the OS.

pool
- The memory pool to trim
minBytesToKeep
- If the pool has less than minBytesToKeep reserved, the TrimTo operation is a no-op. Otherwise the pool will be guaranteed to have at least minBytesToKeep bytes reserved after the operation.

Releases memory back to the OS until the pool contains fewer than minBytesToKeep reserved bytes, or there is no more memory that the allocator can safely release. The allocator cannot release OS allocations that back outstanding asynchronous allocations. The OS allocations may happen at different granularity from the user allocations.
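Trimming is commonly paired with the pool's release threshold attribute, which controls how much freed memory the pool caches across synchronization points. A sketch, assuming `pool` already exists:

```cuda
// Keep up to 64 MiB of freed memory cached in the pool instead of
// returning it to the OS at each synchronization point.
cuuint64_t threshold = 64ull << 20;
cuMemPoolSetAttribute(pool, CU_MEMPOOL_ATTR_RELEASE_THRESHOLD, &threshold);

// ... stream-ordered allocations and frees ...

// Later, explicitly release everything not backing outstanding
// allocations back to the OS.
cuMemPoolTrimTo(pool, 0);
```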

See also:

cuMemAllocAsync, cuMemFreeAsync, cuDeviceGetDefaultMemPool, cuDeviceGetMemPool, cuMemPoolCreate

CUresult cuMemSetMemPool ( CUmemLocation* location, CUmemAllocationType type, CUmemoryPool pool )

Sets the current memory pool for a memory location and allocation type.

The memory location can be of one of CU_MEM_LOCATION_TYPE_DEVICE, CU_MEM_LOCATION_TYPE_HOST or CU_MEM_LOCATION_TYPE_HOST_NUMA. The allocation type can be one of CU_MEM_ALLOCATION_TYPE_PINNED or CU_MEM_ALLOCATION_TYPE_MANAGED. When the allocation type is CU_MEM_ALLOCATION_TYPE_MANAGED, the location type can also be CU_MEM_LOCATION_TYPE_NONE to indicate no preferred location for the managed memory pool. In all other cases, the call returns CUDA_ERROR_INVALID_VALUE.

When a memory pool is set as the current memory pool, the location parameter should be the same as the location of the pool. The location and allocation type specified must match those of the pool otherwise CUDA_ERROR_INVALID_VALUE is returned. By default, a memory location's current memory pool is its default memory pool that can be obtained via cuMemGetDefaultMemPool. If the location type is CU_MEM_LOCATION_TYPE_DEVICE and the allocation type is CU_MEM_ALLOCATION_TYPE_PINNED, then this API is the equivalent of calling cuDeviceSetMemPool with the location id as the device. For further details on the implications, please refer to the documentation for cuDeviceSetMemPool.
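A sketch of swapping in a custom pool for a device's pinned allocations and later restoring the default, assuming `pool` was created with location type CU_MEM_LOCATION_TYPE_DEVICE and id 0 (error checking omitted):

```cuda
CUmemLocation loc = {0};
loc.type = CU_MEM_LOCATION_TYPE_DEVICE;
loc.id   = 0;                        // must match the pool's location

cuMemSetMemPool(&loc, CU_MEM_ALLOCATION_TYPE_PINNED, pool);
// ... cuMemAllocAsync on device 0's streams now draws from `pool` ...

// Restore the location's default pool.
CUmemoryPool defaultPool;
cuMemGetDefaultMemPool(&defaultPool, &loc, CU_MEM_ALLOCATION_TYPE_PINNED);
cuMemSetMemPool(&loc, CU_MEM_ALLOCATION_TYPE_PINNED, defaultPool);
```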

Note:

Use cuMemAllocFromPoolAsync to specify asynchronous allocations from a device different than the one the stream runs on.

See also:

cuDeviceGetDefaultMemPool, cuDeviceGetMemPool, cuMemGetMemPool, cuMemPoolCreate, cuMemPoolDestroy, cuMemAllocFromPoolAsync

