CUDA Runtime API Summary

1.      cudaChooseDevice: select compute-device which best matches criteria;

2.      cudaDeviceGetAttribute: returns information about the device;

3.      cudaDeviceGetByPCIBusId: returns a handle to a compute device;

4.      cudaDeviceGetCacheConfig: returns the preferred cache configuration for the current device;

5.      cudaDeviceGetLimit: returns resource limits;

6.      cudaDeviceGetPCIBusId: returns a PCI Bus ID string for the device;

7.      cudaDeviceGetSharedMemConfig: returns the shared memory configuration for the current device;

8.      cudaDeviceGetStreamPriorityRange: returns numerical values that correspond to the least and greatest stream priorities;

9.      cudaDeviceReset: destroy all allocations and reset all state on the current device in the current process;

10.  cudaDeviceSetCacheConfig: sets the preferred cache configuration for the current device;

11.  cudaDeviceSetLimit: set resource limits;

12.  cudaDeviceSetSharedMemConfig: sets the shared memory configuration for the current device;

13.  cudaDeviceSynchronize: wait for compute device to finish;

14.  cudaGetDevice: returns which device is currently being used;

15.  cudaGetDeviceCount: returns the number of compute-capable devices;

16.  cudaGetDeviceFlags: gets the flags for the current device;

17.  cudaGetDeviceProperties: returns information about the compute-device;

18.  cudaIpcCloseMemHandle: close memory mapped with cudaIpcOpenMemHandle;

19.  cudaIpcGetEventHandle: get an interprocess handle for a previously allocated event;

20.  cudaIpcGetMemHandle:  get an interprocess memory handle for an existing device memory allocation;

21.  cudaIpcOpenEventHandle: opens an interprocess event handle for use in the current process;

22.  cudaIpcOpenMemHandle: opens an interprocess memory handle exported from another process and returns a device pointer usable in the local process;

23.  cudaSetDevice: set device to be used for GPU executions;

24.  cudaSetDeviceFlags: sets flags to be used for device executions;

25.  cudaSetValidDevices: set a list of devices that can be used for CUDA;

26.  cudaThreadExit: exit and cleanup from CUDA launches;

27.  cudaThreadGetCacheConfig: returns the preferred cache configuration for the current device;

28.  cudaThreadGetLimit: returns resource limits;

29.  cudaThreadSetCacheConfig: sets the preferred cache configuration for the current device;

30.  cudaThreadSetLimit: set resource limits;

31.  cudaThreadSynchronize: wait for compute device to finish;

32.  cudaGetErrorName: returns the string representation of an error code enum name;

33.  cudaGetErrorString: returns the description string for an error code;

34.  cudaGetLastError: returns the last error from a runtime call and resets it to cudaSuccess;

35.  cudaPeekAtLastError: returns the last error from a runtime call without resetting it;

36.  cudaStreamCallback_t: type of stream callback functions;

37.  cudaStreamAddCallback: add a callback to a compute stream;

38.  cudaStreamAttachMemAsync: attach memory to a stream asynchronously;

39.  cudaStreamCreate: create an asynchronous stream;

40.  cudaStreamCreateWithFlags: create an asynchronous stream;

41.  cudaStreamCreateWithPriority: create an asynchronous stream with the specified priority;

42.  cudaStreamDestroy: destroys and cleans up an asynchronous stream;

43.  cudaStreamGetFlags: query the flags of a stream;

44.  cudaStreamGetPriority: query the priority of a stream;

45.  cudaStreamQuery: queries an asynchronous stream for completion status;

46.  cudaStreamSynchronize: waits for stream tasks to complete;

47.  cudaStreamWaitEvent: make a compute stream wait on an event;

48.  cudaEventCreate: create an event object;

49.  cudaEventCreateWithFlags: creates an event object with the specified flags;

50.  cudaEventDestroy: destroys an event object;

51.  cudaEventElapsedTime: computes the elapsed time between events;

52.  cudaEventQuery: queries an event’s status;

53.  cudaEventRecord: records an event;

54.  cudaEventSynchronize: waits for an event to complete;

55.  cudaFuncGetAttributes: find out attributes for a given function;

56.  cudaFuncSetCacheConfig: sets the preferred cache configuration for a device function;

57.  cudaFuncSetSharedMemConfig: sets the shared memory configuration for a device function;

58.  cudaGetParameterBuffer: obtains a parameter buffer;

59.  cudaGetParameterBufferV2: launches a specified buffer;

60.  cudaLaunchKernel: launches a device function;

61.  cudaSetDoubleForDevice: converts a double argument to be executed on a device;

62.  cudaSetDoubleForHost: converts a double argument after execution on a device;

63.  cudaOccupancyMaxActiveBlocksPerMultiprocessor: returns occupancy for a device function;

64.  cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags: returns occupancy for a device function with the specified flags;

65.  cudaConfigureCall: configure a device-launch;

66.  cudaLaunch: launches a device function;

67.  cudaSetupArgument: configure a device launch;

68.  cudaArrayGetInfo: gets info about the specified cudaArray;

69.  cudaFree: frees memory on the device;

70.  cudaFreeArray: frees an array on the device;

71.  cudaFreeHost: frees page-locked memory;

72.  cudaFreeMipmappedArray: frees a mipmapped array on the device;

73.  cudaGetMipmappedArrayLevel: gets a mipmap level of a CUDA mipmapped array;

74.  cudaGetSymbolAddress: finds the address associated with a CUDA symbol;

75.  cudaGetSymbolSize: finds the size of the object associated with a CUDA symbol;

76.  cudaHostAlloc: allocates page-locked memory on the host;

77.  cudaHostGetDevicePointer: passes back device pointer of mapped host memory allocated by cudaHostAlloc or registered by cudaHostRegister;

78.  cudaHostGetFlags: passes back flags used to allocate pinned host memory allocated by cudaHostAlloc;

79.  cudaHostRegister: registers an existing host memory range for use by CUDA;

80.  cudaHostUnregister: unregisters a memory range that was registered with cudaHostRegister;

81.  cudaMalloc: allocate memory on the device;

82.  cudaMalloc3D: allocates logical 1D, 2D, or 3D memory objects on the device;

83.  cudaMalloc3DArray: allocate an array on the device;

84.  cudaMallocArray: allocate an array on the device;

85.  cudaMallocHost: allocates page-locked memory on the host;

86.  cudaMallocManaged: allocates memory that will be automatically managed by the Unified Memory system;

87.  cudaMallocMipmappedArray: allocate a mipmapped array on the device;

88.  cudaMallocPitch: allocates pitched memory on the device;

89.  cudaMemcpy: copies data between host and device;

90.  cudaMemcpy2D: copies data between host and device;

91.  cudaMemcpy2DArrayToArray: copies data between host and device;

92.  cudaMemcpy2DAsync: copies data between host and device;

93.  cudaMemcpy2DFromArray: copies data between host and device;

94.  cudaMemcpy2DFromArrayAsync: copies data between host and device;

95.  cudaMemcpy2DToArray: copies data between host and device;

96.  cudaMemcpy2DToArrayAsync: copies data between host and device;

97.  cudaMemcpy3D: copies data between 3D objects;

98.  cudaMemcpy3DAsync: copies data between 3D objects;

99.  cudaMemcpy3DPeer: copies memory between devices;

100.  cudaMemcpy3DPeerAsync: copies memory between devices asynchronously;

101.  cudaMemcpyArrayToArray: copies data between host and device;

102.  cudaMemcpyAsync: copies data between host and device;

103.  cudaMemcpyFromArray: copies data between host and device;

104.  cudaMemcpyFromArrayAsync: copies data between host and device;

105.  cudaMemcpyFromSymbol: copies data from the given symbol on the device;

106.  cudaMemcpyFromSymbolAsync: copies data from the given symbol on the device;

107.  cudaMemcpyPeer: copies memory between two devices;

108.  cudaMemcpyPeerAsync: copies memory between two devices asynchronously;

109.  cudaMemcpyToArray: copies data between host and device;

110.  cudaMemcpyToArrayAsync: copies data between host and device;

111.  cudaMemcpyToSymbol: copies data to the given symbol on the device;

112.  cudaMemcpyToSymbolAsync: copies data to the given symbol on the device;

113.  cudaMemGetInfo: gets free and total device memory;

114.  cudaMemset: initializes or sets device memory to a value;

115.  cudaMemset2D: initializes or sets device memory to a value;

116.  cudaMemset2DAsync: initializes or sets device memory to a value;

117.  cudaMemset3D: initializes or sets device memory to a value;

118.  cudaMemset3DAsync: initializes or sets device memory to a value;

119.  cudaMemsetAsync: initializes or sets device memory to a value;

120.  make_cudaExtent: returns a cudaExtent based on input parameters;

121.  make_cudaPitchedPtr: returns a cudaPitchedPtr based on input parameters;

122.  make_cudaPos: returns a cudaPos based on input parameters;

123.  cudaPointerGetAttributes: returns attributes about a specified pointer;

124.  cudaDeviceCanAccessPeer: queries if a device may directly access a peer device’s memory;

125.  cudaDeviceDisablePeerAccess: disables direct access to memory allocations on a peer device;

126.  cudaDeviceEnablePeerAccess: enables direct access to memory allocations on a peer device;

127.  cudaGLDeviceList: CUDA devices corresponding to the current OpenGL context;

128.  cudaGLGetDevices: gets the CUDA devices associated with the current OpenGL context;

129.  cudaGraphicsGLRegisterBuffer: registers an OpenGL buffer object;

130.  cudaGraphicsGLRegisterImage: register an OpenGL texture or renderbuffer object;

131.  cudaWGLGetDevice: gets the CUDA device associated with hGpu;

132.  cudaGLMapFlags: CUDA GL Map Flags;

133.  cudaGLMapBufferObject: maps a buffer object for access by CUDA;

134.  cudaGLMapBufferObjectAsync: maps a buffer object for access by CUDA;

135.  cudaGLRegisterBufferObject: registers a buffer object for access by CUDA;

136.  cudaGLSetBufferObjectMapFlags: set usage flags for mapping an OpenGL buffer;

137.  cudaGLSetGLDevice: sets a CUDA device to use OpenGL interoperability;

138.  cudaGLUnmapBufferObject: unmaps a buffer object for access by CUDA;

139.  cudaGLUnmapBufferObjectAsync: unmaps a buffer object for access by CUDA;

140.  cudaGLUnregisterBufferObject: unregisters a buffer object for access by CUDA;

141.  cudaD3D9DeviceList: CUDA devices corresponding to a D3D9 device;

142.  cudaD3D9GetDevice: gets the device number for an adapter;

143.  cudaD3D9GetDevices: get the CUDA devices corresponding to a Direct3D9 device;

144.  cudaD3D9GetDirect3DDevice: gets the Direct3D device against which the current CUDA context was created;

145.  cudaD3D9SetDirect3DDevice: sets the Direct3D 9 device to use for interoperability with a CUDA device;

146.  cudaGraphicsD3D9RegisterResource: registers a Direct3D 9 resource for access by CUDA;

147.  cudaD3D9MapFlags: CUDA D3D9 Map Flags;

148.  cudaD3D9RegisterFlags: CUDA D3D9 Register Flags;

149.  cudaD3D9MapResources: map Direct3D resources for access by CUDA;

150.  cudaD3D9RegisterResource: registers a Direct3D resource for access by CUDA;

151.  cudaD3D9ResourceGetMappedArray: get an array through which to access a subresource of a Direct3D resource which has been mapped for access by CUDA;

152.  cudaD3D9ResourceGetMappedPitch: get the pitch of a subresource of a Direct3D resource which has been mapped for access by CUDA;

153.  cudaD3D9ResourceGetMappedPointer: get a pointer through which to access a subresource of a Direct3D resource which has been mapped for access by CUDA;

154.  cudaD3D9ResourceGetMappedSize: get the size of a subresource of a Direct3D resource which has been mapped for access by CUDA;

155.  cudaD3D9ResourceGetSurfaceDimensions: get the dimensions of a registered Direct3D surface;

156.  cudaD3D9ResourceSetMapFlags: set usage flags for mapping a Direct3D resource;

157.  cudaD3D9UnmapResources: unmap Direct3D resources for access by CUDA;

158.  cudaD3D9UnregisterResource: unregisters a Direct3D resource for access by CUDA;

159.  cudaD3D10DeviceList: CUDA devices corresponding to a D3D10 device;

160.  cudaD3D10GetDevice: gets the device number for an adapter;

161.  cudaD3D10GetDevices: gets the CUDA devices corresponding to a Direct3D 10 device;

162.  cudaGraphicsD3D10RegisterResource: registers a Direct3D 10 resource for access by CUDA;

163.  cudaD3D10MapFlags: CUDA D3D10 Map Flags;

164.  cudaD3D10RegisterFlags: CUDA D3D10 Register Flags;

165.  cudaD3D10GetDirect3DDevice: gets the Direct3D device against which the current CUDA context was created;

166.  cudaD3D10MapResources: maps Direct3D resources for access by CUDA;

167.  cudaD3D10RegisterResource: registers a Direct3D 10 resource for access by CUDA;

168.  cudaD3D10ResourceGetMappedArray: gets an array through which to access a subresource of a Direct3D resource which has been mapped for access by CUDA;

169.  cudaD3D10ResourceGetMappedPitch: gets the pitch of a subresource of a Direct3D resource which has been mapped for access by CUDA;

170.  cudaD3D10ResourceGetMappedPointer: gets a pointer through which to access a subresource of a Direct3D resource which has been mapped for access by CUDA;

171.  cudaD3D10ResourceGetMappedSize: get the size of a subresource of a Direct3D resource which has been mapped for access by CUDA;

172.  cudaD3D10ResourceGetSurfaceDimensions: gets the dimensions of a registered Direct3D surface;

173.  cudaD3D10ResourceSetMapFlags: set usage flags for mapping a Direct3D resource;

174.  cudaD3D10SetDirect3DDevice: sets the Direct3D 10 device to use for interoperability with a CUDA device;

175.  cudaD3D10UnmapResources: unmaps Direct3D resources;

176.  cudaD3D10UnregisterResource: unregisters a Direct3D resource;

177.  cudaD3D11DeviceList: CUDA devices corresponding to a D3D11 device;

178.  cudaD3D11GetDevice: gets the device number for an adapter;

179.  cudaD3D11GetDevices: gets the CUDA devices corresponding to a Direct3D 11 device;

180.  cudaGraphicsD3D11RegisterResource: register a Direct3D 11 resource for access by CUDA;

181.  cudaD3D11GetDirect3DDevice: gets the Direct3D device against which the current CUDA context was created;

182.  cudaD3D11SetDirect3DDevice: sets the Direct3D 11 device to use for interoperability with a CUDA device;

183.  cudaGraphicsVDPAURegisterOutputSurface: register a VdpOutputSurface object;

184.  cudaGraphicsVDPAURegisterVideoSurface: register a VdpVideoSurface object;

185.  cudaVDPAUGetDevice: gets the CUDA device associated with a VdpDevice;

186.  cudaVDPAUSetVDPAUDevice: sets a CUDA device to use VDPAU interoperability;

187.  cudaGraphicsMapResources: map graphics resources for access by CUDA;

188.  cudaGraphicsResourceGetMappedMipmappedArray: get a mipmapped array through which to access a mapped graphics resource;

189.  cudaGraphicsResourceGetMappedPointer: get a device pointer through which to access a mapped graphics resource;

190.  cudaGraphicsResourceSetMapFlags: set usage flags for mapping a graphics resource;

191.  cudaGraphicsSubResourceGetMappedArray: get an array through which to access a subresource of a mapped graphics resource;

192.  cudaGraphicsUnmapResources: unmap graphics resources;

193.  cudaGraphicsUnregisterResource: unregisters a graphics resource for access by CUDA;

194.  cudaBindTexture: binds a memory area to a texture;

195.  cudaBindTexture2D: binds a 2D memory area to a texture;

196.  cudaBindTextureToArray: binds an array to a texture;

197.  cudaBindTextureToMipmappedArray: binds a mipmapped array to a texture;

198.  cudaCreateChannelDesc: returns a channel descriptor using the specified format;

199.  cudaGetChannelDesc: get the channel descriptor of an array;

200.  cudaGetTextureAlignmentOffset: get the alignment offset of a texture;

201.  cudaGetTextureReference: get the texture reference associated with a symbol;

202.  cudaUnbindTexture: unbinds a texture;

203.  cudaBindSurfaceToArray: binds an array to a surface;

204.  cudaGetSurfaceReference: get the surface reference associated with a symbol;

205.  cudaCreateTextureObject: create a texture object;

206.  cudaDestroyTextureObject: destroys a texture object;

207.  cudaGetTextureObjectResourceDesc: returns a texture object’s resource descriptor;

208.  cudaGetTextureObjectResourceViewDesc: returns a texture object’s resource view descriptor;

209.  cudaGetTextureObjectTextureDesc: returns a texture object’s texture descriptor;

210.  cudaCreateSurfaceObject: creates a surface object;

211.  cudaDestroySurfaceObject: destroys a surface object;

212.  cudaGetSurfaceObjectResourceDesc: returns the resource descriptor for the surface object specified by surfObject;

213.  cudaDriverGetVersion: returns the CUDA driver version;

214.  cudaRuntimeGetVersion: returns the CUDA runtime version;

215.  __cudaOccupancyB2DHelper: helper functor used by cudaOccupancyMaxPotentialBlockSize to supply a fixed dynamic shared-memory size;

216.  cudaBindSurfaceToArray: binds an array to a surface;

217.  cudaBindTexture: binds a memory area to a texture;

218.  cudaBindTexture2D: binds a 2D memory area to a texture;

219.  cudaBindTextureToArray: binds an array to a texture;

220.  cudaBindTextureToMipmappedArray: binds a mipmapped array to a texture;

221.  cudaEventCreate: creates an event object with the specified flags;

222.  cudaFuncGetAttributes: find out attributes for a given function;

223.  cudaFuncSetCacheConfig: sets the preferred cache configuration for a device function;

224.  cudaGetSymbolAddress: finds the address associated with a CUDA symbol;

225.  cudaGetSymbolSize: finds the size of the object associated with a CUDA symbol;

226.  cudaGetTextureAlignmentOffset: get the alignment offset of a texture;

227.  cudaLaunch: launches a device function;

228.  cudaLaunchKernel: launches a device function;

229.  cudaMallocHost: allocates page-locked memory on the host;

230.  cudaMallocManaged: allocates memory that will be automatically managed by the Unified Memory system;

231.  cudaMemcpyFromSymbol: copies data from the given symbol on the device;

232.  cudaMemcpyFromSymbolAsync: copies data from the given symbol on the device;

233.  cudaMemcpyToSymbol: copies data to the given symbol on the device;

234.  cudaMemcpyToSymbolAsync: copies data to the given symbol on the device;

235.  cudaOccupancyMaxActiveBlocksPerMultiprocessor: returns occupancy for a device function;

236.  cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags: returns occupancy for a device function with the specified flags;

237.  cudaOccupancyMaxPotentialBlockSize: returns grid and block size that achieves maximum potential occupancy for a device function;

238.  cudaOccupancyMaxPotentialBlockSizeVariableSMem: returns grid and block size that achieves maximum potential occupancy for a device function;

239.  cudaOccupancyMaxPotentialBlockSizeVariableSMemWithFlags: returns grid and block size that achieves maximum potential occupancy for a device function;

240.  cudaOccupancyMaxPotentialBlockSizeWithFlags: returns grid and block size that achieves maximum potential occupancy for a device function with the specified flags;

241.  cudaSetupArgument: configure a device launch;

242.  cudaStreamAttachMemAsync: attach memory to a stream asynchronously;

243.  cudaUnbindTexture: unbinds a texture;

244.  cudaProfilerInitialize: initialize the CUDA profiler;

245.  cudaProfilerStart: enable profiling;

246.  cudaProfilerStop: disable profiling;
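
To show how a few of the calls listed above fit together, here is a minimal sketch, assuming a machine with at least one CUDA-capable GPU and a toolkit that provides nvcc. The `CHECK` macro and the `saxpy` kernel are hypothetical names introduced only for this illustration and are not part of the Runtime API; the program strings together device management (items 14-17, 23), memory management (items 69, 81, 89), event timing (items 48-54) and error handling (items 32-34).

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper macro (not part of the CUDA API): abort on any runtime error.
#define CHECK(call)                                                         \
    do {                                                                    \
        cudaError_t err_ = (call);                                          \
        if (err_ != cudaSuccess) {                                          \
            std::fprintf(stderr, "%s:%d: %s (%s)\n", __FILE__, __LINE__,    \
                         cudaGetErrorString(err_), cudaGetErrorName(err_)); \
            std::exit(EXIT_FAILURE);                                        \
        }                                                                   \
    } while (0)

// Illustrative kernel: y[i] = a * x[i] + y[i].
__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    // Device management: count devices, select device 0, read its properties.
    int count = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count == 0) {
        std::fprintf(stderr, "no CUDA device found\n");
        return EXIT_FAILURE;
    }
    CHECK(cudaSetDevice(0));
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    std::printf("device 0: %s\n", prop.name);

    // Memory management: allocate device buffers and copy the inputs over.
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    std::vector<float> hx(n, 1.0f), hy(n, 2.0f);
    float *dx = nullptr, *dy = nullptr;
    CHECK(cudaMalloc(&dx, bytes));
    CHECK(cudaMalloc(&dy, bytes));
    CHECK(cudaMemcpy(dx, hx.data(), bytes, cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(dy, hy.data(), bytes, cudaMemcpyHostToDevice));

    // Event management: bracket the kernel launch with two events.
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    const int block = 256;
    const int grid = (n + block - 1) / block;
    CHECK(cudaEventRecord(start));
    saxpy<<<grid, block>>>(n, 3.0f, dx, dy);
    CHECK(cudaGetLastError());          // reports an invalid launch configuration, if any
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));  // block the host until the stop event completes

    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    CHECK(cudaMemcpy(hy.data(), dy, bytes, cudaMemcpyDeviceToHost));
    std::printf("kernel time: %.3f ms, y[0] = %.1f\n", ms, hy[0]);

    // Cleanup: destroy events and free device memory.
    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(dx));
    CHECK(cudaFree(dy));
    return EXIT_SUCCESS;
}
```

Assuming the file is saved as `saxpy.cu` (a name chosen here for illustration), it should build with `nvcc saxpy.cu -o saxpy`. Wrapping every call in a checking macro is only one possible convention; the point is that each runtime function returns a `cudaError_t` that can be turned into readable text with cudaGetErrorName and cudaGetErrorString.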

Time: 2024-08-29 11:52:12
