External calls are not supported - CUDA
Problem description
My objective is to call a device function defined in another file. When I compile the global kernel, I get the following error: *External calls are not supported (found non-inlined call to _Z6GoldenSectionCUDA)*.
Problematic code (not the full code, only the part where the problem arises):
cat norm.h
# ifndef NORM_H_
# define NORM_H_
# include<stdio.h>
__device__ double invcdf(double prob, double mean, double stddev);
#endif
cat norm.cu
# include <norm.h>
__device__ double invcdf(double prob, double mean, double stddev) {
    return (mean + stddev*normcdfinv(prob));
}
cat test.cu
# include <norm.h>
# include <curand.h>
# include <curand_kernel.h>
__global__ void phase2Kernel(double* out_profit, struct strategyHolder* strategy) {
    curandState seedValue;
    curand_init(threadIdx.x, 0, 0, &seedValue);
    double randomD = invcdf(curand_uniform_double( &seedValue ), 300, 80);
}
nvcc -c norm.cu -o norm.o -I"."
nvcc -c test.cu -o test.o -I"."
Recommended answer
You're trying to do separate compilation, which needs some special command line options. See the NVCC manual for details, but here's how to get your example to compile. I've targeted sm_20, but you can target sm_20 or later depending on what GPU you have. Separate compilation is not possible on older devices (sm_1x).
- You don't need to declare the __device__ function as extern in your header file, but if you have any static device variables they will need to be declared as extern.
- Generate relocatable code for the device by compiling as shown below (-dc is the device equivalent of -c; see the manual for more information):
nvcc -arch=sm_20 -dc norm.cu -o norm.o -I.
nvcc -arch=sm_20 -dc test.cu -o test.o -I.
- Link the device parts of the code by calling nvlink before the final host link:
nvlink -arch=sm_20 norm.o test.o -o final.o
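
If separate compilation is not an option (for example on an sm_1x device), one alternative I'm fairly sure works is to keep the definition of the device function visible in the same translation unit as the kernel, for instance by defining it inline in the header. The sketch below is not the asker's code: the file name single_tu.cu, the kernel name sampleKernel, the seed value and the array size are all made up for illustration, and everything lives in one file so that a plain "nvcc single_tu.cu -o single_tu" should build it without -dc or a device link step.

```cuda
// single_tu.cu (hypothetical file name): single-translation-unit sketch.
// Assumed build command: nvcc single_tu.cu -o single_tu
#include <cstdio>
#include <curand_kernel.h>

// The definition is visible to the kernel below, so the compiler can inline
// the call and no external (cross-object) device call is generated.
__device__ inline double invcdf(double prob, double mean, double stddev) {
    return mean + stddev * normcdfinv(prob);
}

// Illustrative kernel: each thread draws one uniform sample and maps it
// through the inverse normal CDF.
__global__ void sampleKernel(double* out) {
    curandState state;
    // Arguments: seed, subsequence, offset, state. The seed is arbitrary.
    curand_init(1234ULL, threadIdx.x, 0, &state);
    out[threadIdx.x] = invcdf(curand_uniform_double(&state), 300.0, 80.0);
}

int main() {
    const int n = 4;  // arbitrary number of samples for the demo
    double h_out[n];
    double* d_out = nullptr;

    if (cudaMalloc(&d_out, n * sizeof(double)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    sampleKernel<<<1, n>>>(d_out);

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(d_out);
        return 1;
    }

    for (int i = 0; i < n; ++i) {
        printf("sample %d: %f\n", i, h_out[i]);
    }

    cudaFree(d_out);
    return 0;
}
```

The trade-off is that every file calling invcdf has to include the definition itself, so this suits small helper functions better than a large device library, where the -dc route above is the cleaner choice.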