NVIDIA Jetson TK1 – cuDNN install with Caffe example

NVIDIA’s cuDNN is a GPU-accelerated library of primitives for deep neural networks. It is designed to be integrated into higher-level machine learning frameworks, such as UC Berkeley’s Caffe deep learning framework.

In an earlier blog post, we installed Caffe on a Jetson TK1. Here’s a short video on how to install cuDNN and compile Caffe with cuDNN support. Looky here:

Note

As of this writing, cuDNN requires CUDA 6.5. On the Jetson this means L4T 21.x must be installed. The current release of cuDNN is RC2. However, the current implementation of Caffe requires R1. The video shows installation of R1. The Jetson is running L4T 21.2 with CUDA 6.5.

To install cuDNN, go to the NVIDIA cuDNN page and download the cuDNN libraries using your NVIDIA Developer account. There are two versions currently listed on the website, R1 and R2. R2 is the most recent. As noted above, R1 is the version currently being used by Caffe. Once you download cuDNN, untar the archive and place the header and libraries in your include and library paths. Here are some gists on GitHub to install cuDNN R1, or another gist to install cuDNN R2. The gists are of the form:

$ tar -zxvf cudnn-6.5version.tgz
$ cd cudnn-6.5version
# copy the include file
$ sudo cp cudnn.h /usr/local/cuda-6.5/include
$ sudo cp libcudnn* /usr/local/cuda-6.5/lib

The cuDNN libraries are placed into the cuda-6.5 library directory, a convenient place since CUDA 6.5 needs to be in your LD_LIBRARY_PATH. Note: You should only install one of the releases.
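
As a quick sanity check after the copy step, the following minimal Python sketch looks for the header and libraries in the locations used above. The function name `missing_cudnn_files` is made up for illustration, and the default path simply mirrors the `/usr/local/cuda-6.5` directory from the commands; adjust it if your CUDA install lives elsewhere.

```python
import glob
import os


def missing_cudnn_files(cuda_root="/usr/local/cuda-6.5"):
    """Return the expected cuDNN paths that are absent under cuda_root.

    Hypothetical helper: it only checks that files exist, not that they
    are the right cuDNN release or built for the right architecture.
    """
    missing = []
    header = os.path.join(cuda_root, "include", "cudnn.h")
    if not os.path.isfile(header):
        missing.append(header)
    # The copy step uses the pattern libcudnn*, so any match counts here.
    lib_pattern = os.path.join(cuda_root, "lib", "libcudnn*")
    if not glob.glob(lib_pattern):
        missing.append(lib_pattern)
    return missing


if __name__ == "__main__":
    problems = missing_cudnn_files()
    if problems:
        print("Missing cuDNN files: %s" % problems)
    else:
        print("cuDNN header and libraries found")
```

An empty list means both the header and at least one `libcudnn*` file are in place; it says nothing about which release was installed.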

Installing Caffe is straightforward; here is a GitHub gist for installation. Basically, install the tools and get Caffe:

Note: Make sure that you are in the ~/ directory, i.e. $ cd ~/ before the next steps.

$ sudo add-apt-repository universe
$ sudo apt-get update
$ sudo apt-get install libprotobuf-dev protobuf-compiler gfortran \
libboost-dev cmake libleveldb-dev libsnappy-dev \
libboost-thread-dev libboost-system-dev \
libatlas-base-dev libhdf5-serial-dev libgflags-dev \
libgoogle-glog-dev liblmdb-dev -y

$ sudo usermod -a -G video $USER

# Git clone Caffe
$ sudo apt-get install -y git
$ git clone https://github.com/BVLC/caffe.git
$ cd caffe && git checkout dev
$ cp Makefile.config.example Makefile.config

At this point, edit Makefile.config:

$ gedit Makefile.config

and uncomment the line:

# USE_CUDNN := 1

Changing the line to:

USE_CUDNN := 1

For the tests, I also uncommented the CUDA_ARCH *_50 lines which are lower down in the file.
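
Because make accepts a misspelled variable name without complaint, it can be worth confirming that the flag is spelled correctly and really is uncommented. Here is a minimal Python sketch of that check; the helper names `cudnn_enabled` and `enable_cudnn` and the sample text are hypothetical, and the sketch works on a string rather than editing your Makefile.config in place.

```python
import re

# Matches an active (uncommented) "USE_CUDNN := 1" line.
ACTIVE = re.compile(r"^[ \t]*USE_CUDNN[ \t]*:=[ \t]*1[ \t]*$", re.MULTILINE)
# Matches the commented-out form, e.g. "# USE_CUDNN := 1".
COMMENTED = re.compile(r"^[ \t]*#[ \t]*(USE_CUDNN[ \t]*:=[ \t]*1)[ \t]*$", re.MULTILINE)


def cudnn_enabled(makefile_text):
    """Return True if an uncommented USE_CUDNN := 1 line is present."""
    return ACTIVE.search(makefile_text) is not None


def enable_cudnn(makefile_text):
    """Return the text with the commented USE_CUDNN line uncommented."""
    return COMMENTED.sub(r"\1", makefile_text)


# Illustrative stand-in for part of Makefile.config, not the real file.
sample = "# some other option\n# USE_CUDNN := 1\n\nOTHER_FLAG := 0\n"

print(cudnn_enabled(sample))                # False: still commented out
print(cudnn_enabled(enable_cudnn(sample)))  # True
print(cudnn_enabled("USE_CDNN := 1\n"))     # False: misspelled name
```

To check a real file, read it with `open("Makefile.config").read()` and pass the result to `cudnn_enabled`.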

NOTE (6-15-2015): Aaron Schumacher found an issue with some of the later versions of Caffe. From his article: The NVIDIA Jetson TK1 with Caffe on MNIST

Unfortunately master has a really large value for LMDB_MAP_SIZE in src/caffe/util/db.cpp, which confuses our little 32-bit ARM processor on the Jetson, eventually leading to Caffe tests failing with errors like MDB_MAP_FULL: Environment mapsize limit reached. Caffe GitHub issue #1861 has some discussion about this and maybe it will be fixed eventually, but for the moment if you manually adjust the value from 1099511627776 to 536870912, you’ll be able to run all the Caffe tests successfully.
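
To see why the original constant is a problem on a 32-bit device, here is a small Python sketch of the arithmetic. It only mimics how an unsigned value wraps when stored in a size_t of a given width (the helper `as_size_t` is made up for illustration); what LMDB then does with the wrapped value is not modeled.

```python
LMDB_MAP_SIZE_DEFAULT = 1099511627776  # value quoted above (1 TB)
LMDB_MAP_SIZE_REDUCED = 536870912      # suggested replacement


def as_size_t(value, bits):
    """Mimic storing an unsigned integer in a size_t that is `bits` wide."""
    return value % (1 << bits)


print(LMDB_MAP_SIZE_DEFAULT == 2 ** 40)      # True
print(LMDB_MAP_SIZE_REDUCED == 2 ** 29)      # True
print(as_size_t(LMDB_MAP_SIZE_DEFAULT, 32))  # 0
print(as_size_t(LMDB_MAP_SIZE_REDUCED, 32))  # 536870912
print(as_size_t(LMDB_MAP_SIZE_DEFAULT, 64))  # 1099511627776
```

The 1 TB value is exactly 2^40, so it wraps to zero in a 32-bit size_t, while 2^29 (512 MB) fits comfortably.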

Then compile Caffe and run the tests:

make -j 4 all
make -j 4 runtest

Caffe with cuDNN Results

In the video the results are shown for running the Caffe time example:

build/tools/caffe time --model=models/bvlc_alexnet/deploy.prototxt --gpu=0

Note: The deploy.prototxt used here has a batch size of 10, so each pass processes 10 images. The per-image time for the Average Forward pass is therefore the listed result divided by 10, i.e. 227.156 ms is ~23 ms per image.
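
The division in the note above can be written out as a tiny Python sketch. The function names are made up for illustration, and the divisor of 10 is taken from the note.

```python
def per_image_ms(listed_ms, images=10):
    """Convert a listed pass time into a per-image time in milliseconds."""
    return listed_ms / images


def images_per_second(listed_ms, images=10):
    """Rough throughput implied by a listed pass time."""
    return 1000.0 / per_image_ms(listed_ms, images)


print(round(per_image_ms(227.156), 1))       # 22.7
print(round(images_per_second(227.156), 1))  # 44.0
```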

I also timed the examples without cuDNN installed:

With cuDNN:

Default:

I0119 22:22:13.032065 2223 caffe.cpp:273] Average Forward pass: 252.767 ms.
I0119 22:22:13.032985 2223 caffe.cpp:275] Average Backward pass: 261.981 ms.
I0119 22:22:13.033064 2223 caffe.cpp:277] Average Forward-Backward: 517.052 ms.

CPU Maximum Performance Setting

I0119 22:23:42.967684 2246 caffe.cpp:273] Average Forward pass: 233.343 ms.
I0119 22:23:42.967722 2246 caffe.cpp:275] Average Backward pass: 247.55 ms.
I0119 22:23:42.967759 2246 caffe.cpp:277] Average Forward-Backward: 481.215 ms.
I0119 22:23:42.967803 2246 caffe.cpp:279] Total Time: 24060.7 ms.

GPU and CPU Maximum Performance

I0119 22:24:59.754598 2261 caffe.cpp:273] Average Forward pass: 233.941 ms.
I0119 22:24:59.754642 2261 caffe.cpp:275] Average Backward pass: 246.8 ms.
I0119 22:24:59.754683 2261 caffe.cpp:277] Average Forward-Backward: 481.099 ms.
I0119 22:24:59.754729 2261 caffe.cpp:279] Total Time: 24055 ms.

Without cuDNN:

Default Settings:

I0119 21:21:15.920301 2080 caffe.cpp:273] Average Forward pass: 248.729 ms.
I0119 21:21:15.920436 2080 caffe.cpp:275] Average Backward pass: 243.773 ms.
I0119 21:21:15.920559 2080 caffe.cpp:277] Average Forward-Backward: 494.648 ms.
I0119 21:21:15.920708 2080 caffe.cpp:279] Total Time: 24732.4 ms.

CPU Maximum Performance Setting

I0119 21:25:51.013579 2228 caffe.cpp:273] Average Forward pass: 225.83 ms.
I0119 21:25:51.013624 2228 caffe.cpp:275] Average Backward pass: 227.208 ms.
I0119 21:25:51.013659 2228 caffe.cpp:277] Average Forward-Backward: 453.36 ms.
I0119 21:25:51.013701 2228 caffe.cpp:279] Total Time: 22668 ms.

GPU and CPU Maximum Performance

I0119 21:27:20.919353 2254 caffe.cpp:273] Average Forward pass: 225.722 ms.
I0119 21:27:20.919394 2254 caffe.cpp:275] Average Backward pass: 227.156 ms.
I0119 21:27:20.919428 2254 caffe.cpp:277] Average Forward-Backward: 453.203 ms.
I0119 21:27:20.919467 2254 caffe.cpp:279] Total Time: 22660.2 ms.

In comparing the numbers, you’ll note that cuDNN produces about the same results as the hand-written code. A couple of things to remember: First, this is for cuDNN R1, so the newer version is probably faster. Second, by using a library such as cuDNN, developers of learning frameworks such as Caffe can concentrate on their particular problem set and avoid hand writing CUDA code for common tasks.
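
To put a number on "about the same", here is a minimal Python sketch that parses the "GPU and CPU Maximum Performance" log lines listed above and computes the relative difference between the two builds. The regular expression and function names are my own for illustration, and a single timing run per configuration is a thin basis for any firm conclusion.

```python
import re

WITH_CUDNN = """\
I0119 22:24:59.754598 2261 caffe.cpp:273] Average Forward pass: 233.941 ms.
I0119 22:24:59.754642 2261 caffe.cpp:275] Average Backward pass: 246.8 ms.
I0119 22:24:59.754683 2261 caffe.cpp:277] Average Forward-Backward: 481.099 ms.
"""

WITHOUT_CUDNN = """\
I0119 21:27:20.919353 2254 caffe.cpp:273] Average Forward pass: 225.722 ms.
I0119 21:27:20.919394 2254 caffe.cpp:275] Average Backward pass: 227.156 ms.
I0119 21:27:20.919428 2254 caffe.cpp:277] Average Forward-Backward: 453.203 ms.
"""

# Captures e.g. ("Forward pass", "233.941") or ("Forward-Backward", "481.099").
LINE = re.compile(r"Average ([\w-]+(?: pass)?): ([\d.]+) ms")


def parse_averages(log_text):
    """Map each 'Average ...' label in the log to its time in milliseconds."""
    return {name: float(ms) for name, ms in LINE.findall(log_text)}


def percent_difference(with_ms, without_ms):
    """Positive result means the cuDNN build took longer."""
    return 100.0 * (with_ms - without_ms) / without_ms


with_times = parse_averages(WITH_CUDNN)
without_times = parse_averages(WITHOUT_CUDNN)
for name in with_times:
    diff = percent_difference(with_times[name], without_times[name])
    print("%-17s %+.1f%%" % (name, diff))
```

With these particular logs, the differences come out to single-digit percentages in each case.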

I was surprised that cranking up the GPU clocks to their maximum didn’t shorten the times significantly. Interestingly, it means that there are bottlenecks elsewhere.

Compared with the Big Iron

The Caffe website states:

Caffe can process over 40M images per day with a single NVIDIA K40 or Titan GPU*. That’s 5 ms/image in training, and 2 ms/image in test. We believe that Caffe is the fastest CNN implementation available.

Those results are about an order of magnitude faster than the Jetson TK1. Why is that interesting?

While absolutely faster, the big GPU cards require a lot of power, probably starting at a minimum of 500 watts for an ultra performance GPU card. In comparison, a Jetson requires about 10 watts. Math is not my strong point, but I could imagine putting together a cluster of Jetsons that produces similar results in a much smaller power footprint.
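
As a back-of-the-envelope sketch of that idea in Python, using only the rough figures from this article (about 23 ms per image at about 10 watts for the Jetson, 2 ms per image at a guessed 500 watts for the big GPU): the function names are made up for illustration, and the calculation assumes perfect scaling across boards, which a real cluster would not achieve.

```python
import math


def images_per_second(ms_per_image):
    """Throughput implied by a per-image time in milliseconds."""
    return 1000.0 / ms_per_image


def images_per_joule(ms_per_image, watts):
    """Images processed per joule of energy (images/s divided by watts)."""
    return images_per_second(ms_per_image) / watts


# Rough figures taken from the article; all of them are estimates.
JETSON_MS, JETSON_WATTS = 23.0, 10.0
BIG_GPU_MS, BIG_GPU_WATTS = 2.0, 500.0

jetson_efficiency = images_per_joule(JETSON_MS, JETSON_WATTS)
big_gpu_efficiency = images_per_joule(BIG_GPU_MS, BIG_GPU_WATTS)

# Number of Jetsons needed to match the big GPU, assuming ideal scaling.
boards = math.ceil(images_per_second(BIG_GPU_MS) / images_per_second(JETSON_MS))

print("Jetson:  %.2f images per joule" % jetson_efficiency)
print("Big GPU: %.2f images per joule" % big_gpu_efficiency)
print("Jetsons to match throughput: %d (about %d watts)" % (boards, boards * JETSON_WATTS))
```

Under these assumptions, a dozen Jetsons would match the big card's throughput at roughly a quarter of the power, though the 500 watt figure in particular is only a guess.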

Conclusion

By leveraging an NVIDIA-supported library such as cuDNN, developers can use fast algorithms that take advantage of the GPU architecture without having to worry about the minutiae of hand writing CUDA code. The advantage multiplies as architectures and feature sets of the hardware change.

34 Comments

  1. I think I lost you on the comparison with Big Iron. The TK1 takes ~200ms per image and the K40 takes ~2ms per image. These are 2 orders of magnitude of difference, not one. But, perhaps, I’m missing something.

  2. I’m having a couple issues with the cuDNN install for CAFFE, and they all seem related to the MDB map size.

    The make all reported a large integer truncation warning for convert_mnist_data.cpp, and it also reported the same warning for db.cpp

    Then the make runtest errored out on db.hpp with a MDB_MAP_FULL: Environment mapsize limit reached.

    I believe the solution for both is to lower the mapsize to 1GB (1073741824).

    There is a setting for this in both cpp files.

    I’m not sure how this happened. I’m positive I downloaded the correct R1 version for the Jetson TK1, and it uses a 32bit version of Ubuntu.

    • Interesting. I encountered the truncation warning also and didn’t think too much about it. However in my case, ‘make runtest’ did not have any issues and performed all of the tests as seen in the video.

      Not much help, but the compilation and tests were done immediately after a clean install of L4T 21.2, CUDA 6.5 and OpenCV using JetPack 1.0. Caffe and cuDNN were installed using the gist scripts on GitHub noted in the blog post.

      • I haven’t been able to replicate the problem.

        I am going to try with cuDNN RC 3 with the new L4T 21.3 release and see if there are any issues, but it’s hard to address an issue I haven’t replicated.

      • Here’s something that might be causing the issue that people have reported on Github: “I think this issue is due to the Jetson being a 32-bit (ARM) device, and the constant LMDB_MAP_SIZE in src/caffe/util/db.cpp being too big for it to understand. Here’s the whole line:

        const size_t LMDB_MAP_SIZE = 1099511627776; // 1 TB

        The solution suggested by Боголюбский Алексей of using 2^29 (536870912) instead works at least well enough to get all the tests to run successfully”.


  3. Hello, when I get to the “make -j 4 all” step, I get the following errors:

    ubuntu@tegra-ubuntu:~/cudnn-6.5-linux-armv7-R1/caffe$ make -j 4 all
    CXX .build_release/src/caffe/proto/caffe.pb.cc
    CXX src/caffe/common.cpp
    CXX src/caffe/layers/deconv_layer.cpp
    CXX src/caffe/layers/dropout_layer.cpp
    In file included from ./include/caffe/util/device_alternate.hpp:40:0,
    from ./include/caffe/common.hpp:19,
    from src/caffe/common.cpp:5:
    ./include/caffe/util/cudnn.hpp:64:32: error: variable or field ‘createTensor4dDesc’ declared void
    inline void createTensor4dDesc(cudnnTensorDescriptor_t* desc) {
    ^
    ./include/caffe/util/cudnn.hpp:64:32: error: ‘cudnnTensorDescriptor_t’ was not declared in this scope
    ./include/caffe/util/cudnn.hpp:64:57: error: ‘desc’ was not declared in this scope
    inline void createTensor4dDesc(cudnnTensorDescriptor_t* desc) {
    ^
    ./include/caffe/util/cudnn.hpp:69:29: error: variable or field ‘setTensor4dDesc’ declared void
    inline void setTensor4dDesc(cudnnTensorDescriptor_t* desc,
    ^
    ./include/caffe/util/cudnn.hpp:69:29: error: ‘cudnnTensorDescriptor_t’ was not declared in this scope
    ./include/caffe/util/cudnn.hpp:69:54: error: ‘desc’ was not declared in this scope
    inline void setTensor4dDesc(cudnnTensorDescriptor_t* desc,
    ^
    ./include/caffe/util/cudnn.hpp:70:5: error: expected primary-expression before ‘int’
    int n, int c, int h, int w,
    ^
    ./include/caffe/util/cudnn.hpp:70:12: error: expected primary-expression before ‘int’
    int n, int c, int h, int w,
    ^
    ./include/caffe/util/cudnn.hpp:70:19: error: expected primary-expression before ‘int’
    int n, int c, int h, int w,
    ^
    ./include/caffe/util/cudnn.hpp:70:26: error: expected primary-expression before ‘int’
    int n, int c, int h, int w,
    ^
    ./include/caffe/util/cudnn.hpp:71:5: error: expected primary-expression before ‘int’
    int stride_n, int stride_c, int stride_h, int stride_w) {
    ^
    ./include/caffe/util/cudnn.hpp:71:19: error: expected primary-expression before ‘int’
    int stride_n, int stride_c, int stride_h, int stride_w) {
    ^
    ./include/caffe/util/cudnn.hpp:71:33: error: expected primary-expression before ‘int’
    int stride_n, int stride_c, int stride_h, int stride_w) {
    ^
    ./include/caffe/util/cudnn.hpp:71:47: error: expected primary-expression before ‘int’
    int stride_n, int stride_c, int stride_h, int stride_w) {
    ^
    ./include/caffe/util/cudnn.hpp:77:29: error: variable or field ‘setTensor4dDesc’ declared void
    inline void setTensor4dDesc(cudnnTensorDescriptor_t* desc,
    ^
    ./include/caffe/util/cudnn.hpp:77:29: error: ‘cudnnTensorDescriptor_t’ was not declared in this scope
    ./include/caffe/util/cudnn.hpp:77:54: error: ‘desc’ was not declared in this scope
    inline void setTensor4dDesc(cudnnTensorDescriptor_t* desc,
    ^
    ./include/caffe/util/cudnn.hpp:78:5: error: expected primary-expression before ‘int’
    int n, int c, int h, int w) {
    ^
    ./include/caffe/util/cudnn.hpp:78:12: error: expected primary-expression before ‘int’
    int n, int c, int h, int w) {
    ^
    ./include/caffe/util/cudnn.hpp:78:19: error: expected primary-expression before ‘int’
    int n, int c, int h, int w) {
    ^
    ./include/caffe/util/cudnn.hpp:78:26: error: expected primary-expression before ‘int’
    int n, int c, int h, int w) {
    ^
    ./include/caffe/util/cudnn.hpp:102:5: error: ‘cudnnTensorDescriptor_t’ has not been declared
    cudnnTensorDescriptor_t bottom, cudnnFilterDescriptor_t filter,
    ^
    ./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::setConvolutionDesc(cudnnConvolutionStruct**, int, cudnnFilterDescriptor_t, int, int, int, int)’:
    ./include/caffe/util/cudnn.hpp:105:70: error: there are no arguments to ‘cudnnSetConvolution2dDescriptor’ that depend on a template parameter, so a declaration of ‘cudnnSetConvolution2dDescriptor’ must be available [-fpermissive]
    pad_h, pad_w, stride_h, stride_w, 1, 1, CUDNN_CROSS_CORRELATION));
    ^
    ./include/caffe/util/cudnn.hpp:12:28: note: in definition of macro ‘CUDNN_CHECK’
    cudnnStatus_t status = condition; \
    ^
    ./include/caffe/util/cudnn.hpp:105:70: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
    pad_h, pad_w, stride_h, stride_w, 1, 1, CUDNN_CROSS_CORRELATION));
    ^
    ./include/caffe/util/cudnn.hpp:12:28: note: in definition of macro ‘CUDNN_CHECK’
    cudnnStatus_t status = condition; \
    ^
    ./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::createPoolingDesc(cudnnPoolingStruct**, caffe::PoolingParameter_PoolMethod, cudnnPoolingMode_t*, int, int, int, int, int, int)’:
    ./include/caffe/util/cudnn.hpp:117:13: error: ‘CUDNN_POOLING_AVERAGE_COUNT_INCLUDE_PADDING’ was not declared in this scope
    *mode = CUDNN_POOLING_AVERAGE_COUNT_INCLUDE_PADDING;
    ^
    ./include/caffe/util/cudnn.hpp:124:41: error: there are no arguments to ‘cudnnSetPooling2dDescriptor’ that depend on a template parameter, so a declaration of ‘cudnnSetPooling2dDescriptor’ must be available [-fpermissive]
    pad_h, pad_w, stride_h, stride_w));
    ^
    ./include/caffe/util/cudnn.hpp:12:28: note: in definition of macro ‘CUDNN_CHECK’
    cudnnStatus_t status = condition; \
    ^
    make: *** [.build_release/src/caffe/common.o] Error 1
    make: *** Waiting for unfinished jobs....
    In file included from ./include/caffe/util/device_alternate.hpp:40:0,
    from ./include/caffe/common.hpp:19,
    from src/caffe/layers/dropout_layer.cpp:5:
    ./include/caffe/util/cudnn.hpp:64:32: error: variable or field ‘createTensor4dDesc’ declared void
    inline void createTensor4dDesc(cudnnTensorDescriptor_t* desc) {
    ^
    ./include/caffe/util/cudnn.hpp:64:32: error: ‘cudnnTensorDescriptor_t’ was not declared in this scope
    ./include/caffe/util/cudnn.hpp:64:57: error: ‘desc’ was not declared in this scope
    inline void createTensor4dDesc(cudnnTensorDescriptor_t* desc) {
    ^
    ./include/caffe/util/cudnn.hpp:69:29: error: variable or field ‘setTensor4dDesc’ declared void
    inline void setTensor4dDesc(cudnnTensorDescriptor_t* desc,
    ^
    ./include/caffe/util/cudnn.hpp:69:29: error: ‘cudnnTensorDescriptor_t’ was not declared in this scope
    ./include/caffe/util/cudnn.hpp:69:54: error: ‘desc’ was not declared in this scope
    inline void setTensor4dDesc(cudnnTensorDescriptor_t* desc,
    ^
    ./include/caffe/util/cudnn.hpp:70:5: error: expected primary-expression before ‘int’
    int n, int c, int h, int w,
    ^
    ./include/caffe/util/cudnn.hpp:70:12: error: expected primary-expression before ‘int’
    int n, int c, int h, int w,
    ^
    ./include/caffe/util/cudnn.hpp:70:19: error: expected primary-expression before ‘int’
    int n, int c, int h, int w,
    ^
    ./include/caffe/util/cudnn.hpp:70:26: error: expected primary-expression before ‘int’
    int n, int c, int h, int w,
    ^
    ./include/caffe/util/cudnn.hpp:71:5: error: expected primary-expression before ‘int’
    int stride_n, int stride_c, int stride_h, int stride_w) {
    ^
    ./include/caffe/util/cudnn.hpp:71:19: error: expected primary-expression before ‘int’
    int stride_n, int stride_c, int stride_h, int stride_w) {
    ^
    ./include/caffe/util/cudnn.hpp:71:33: error: expected primary-expression before ‘int’
    int stride_n, int stride_c, int stride_h, int stride_w) {
    ^
    ./include/caffe/util/cudnn.hpp:71:47: error: expected primary-expression before ‘int’
    int stride_n, int stride_c, int stride_h, int stride_w) {
    ^
    ./include/caffe/util/cudnn.hpp:77:29: error: variable or field ‘setTensor4dDesc’ declared void
    inline void setTensor4dDesc(cudnnTensorDescriptor_t* desc,
    ^
    ./include/caffe/util/cudnn.hpp:77:29: error: ‘cudnnTensorDescriptor_t’ was not declared in this scope
    ./include/caffe/util/cudnn.hpp:77:54: error: ‘desc’ was not declared in this scope
    inline void setTensor4dDesc(cudnnTensorDescriptor_t* desc,
    ^
    ./include/caffe/util/cudnn.hpp:78:5: error: expected primary-expression before ‘int’
    int n, int c, int h, int w) {
    ^
    ./include/caffe/util/cudnn.hpp:78:12: error: expected primary-expression before ‘int’
    int n, int c, int h, int w) {
    ^
    ./include/caffe/util/cudnn.hpp:78:19: error: expected primary-expression before ‘int’
    int n, int c, int h, int w) {
    ^
    ./include/caffe/util/cudnn.hpp:78:26: error: expected primary-expression before ‘int’
    int n, int c, int h, int w) {
    ^
    ./include/caffe/util/cudnn.hpp:102:5: error: ‘cudnnTensorDescriptor_t’ has not been declared
    cudnnTensorDescriptor_t bottom, cudnnFilterDescriptor_t filter,
    ^
    ./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::setConvolutionDesc(cudnnConvolutionStruct**, int, cudnnFilterDescriptor_t, int, int, int, int)’:
    ./include/caffe/util/cudnn.hpp:105:70: error: there are no arguments to ‘cudnnSetConvolution2dDescriptor’ that depend on a template parameter, so a declaration of ‘cudnnSetConvolution2dDescriptor’ must be available [-fpermissive]
    pad_h, pad_w, stride_h, stride_w, 1, 1, CUDNN_CROSS_CORRELATION));
    ^
    ./include/caffe/util/cudnn.hpp:12:28: note: in definition of macro ‘CUDNN_CHECK’
    cudnnStatus_t status = condition; \
    ^
    ./include/caffe/util/cudnn.hpp:105:70: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
    pad_h, pad_w, stride_h, stride_w, 1, 1, CUDNN_CROSS_CORRELATION));
    ^
    ./include/caffe/util/cudnn.hpp:12:28: note: in definition of macro ‘CUDNN_CHECK’
    cudnnStatus_t status = condition; \
    ^
    ./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::createPoolingDesc(cudnnPoolingStruct**, caffe::PoolingParameter_PoolMethod, cudnnPoolingMode_t*, int, int, int, int, int, int)’:
    ./include/caffe/util/cudnn.hpp:117:13: error: ‘CUDNN_POOLING_AVERAGE_COUNT_INCLUDE_PADDING’ was not declared in this scope
    *mode = CUDNN_POOLING_AVERAGE_COUNT_INCLUDE_PADDING;
    ^
    ./include/caffe/util/cudnn.hpp:124:41: error: there are no arguments to ‘cudnnSetPooling2dDescriptor’ that depend on a template parameter, so a declaration of ‘cudnnSetPooling2dDescriptor’ must be available [-fpermissive]
    pad_h, pad_w, stride_h, stride_w));
    ^
    ./include/caffe/util/cudnn.hpp:12:28: note: in definition of macro ‘CUDNN_CHECK’
    cudnnStatus_t status = condition; \
    ^
    In file included from ./include/caffe/loss_layers.hpp:11:0,
    from ./include/caffe/common_layers.hpp:12,
    from ./include/caffe/vision_layers.hpp:10,
    from src/caffe/layers/dropout_layer.cpp:9:
    ./include/caffe/neuron_layers.hpp: At global scope:
    ./include/caffe/neuron_layers.hpp:501:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t bottom_desc_;
    ^
    ./include/caffe/neuron_layers.hpp:502:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t top_desc_;
    ^
    ./include/caffe/neuron_layers.hpp:584:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t bottom_desc_;
    ^
    ./include/caffe/neuron_layers.hpp:585:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t top_desc_;
    ^
    ./include/caffe/neuron_layers.hpp:669:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t bottom_desc_;
    ^
    ./include/caffe/neuron_layers.hpp:670:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t top_desc_;
    ^
    In file included from ./include/caffe/vision_layers.hpp:10:0,
    from src/caffe/layers/dropout_layer.cpp:9:
    ./include/caffe/common_layers.hpp:536:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t bottom_desc_;
    ^
    ./include/caffe/common_layers.hpp:537:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t top_desc_;
    ^
    In file included from src/caffe/layers/dropout_layer.cpp:9:0:
    ./include/caffe/vision_layers.hpp:249:10: error: ‘cudnnTensorDescriptor_t’ was not declared in this scope
    vector<cudnnTensorDescriptor_t> bottom_descs_, top_descs_;
    ^
    ./include/caffe/vision_layers.hpp:249:33: error: template argument 1 is invalid
    vector<cudnnTensorDescriptor_t> bottom_descs_, top_descs_;
    ^
    ./include/caffe/vision_layers.hpp:249:33: error: template argument 2 is invalid
    ./include/caffe/vision_layers.hpp:250:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t bias_desc_;
    ^
    ./include/caffe/vision_layers.hpp:450:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t bottom_desc_, top_desc_;
    ^
    In file included from ./include/caffe/util/device_alternate.hpp:40:0,
    from ./include/caffe/common.hpp:19,
    from ./include/caffe/blob.hpp:8,
    from ./include/caffe/filler.hpp:10,
    from src/caffe/layers/deconv_layer.cpp:3:
    ./include/caffe/util/cudnn.hpp:64:32: error: variable or field ‘createTensor4dDesc’ declared void
    inline void createTensor4dDesc(cudnnTensorDescriptor_t* desc) {
    ^
    ./include/caffe/util/cudnn.hpp:64:32: error: ‘cudnnTensorDescriptor_t’ was not declared in this scope
    ./include/caffe/util/cudnn.hpp:64:57: error: ‘desc’ was not declared in this scope
    inline void createTensor4dDesc(cudnnTensorDescriptor_t* desc) {
    ^
    ./include/caffe/util/cudnn.hpp:69:29: error: variable or field ‘setTensor4dDesc’ declared void
    inline void setTensor4dDesc(cudnnTensorDescriptor_t* desc,
    ^
    ./include/caffe/util/cudnn.hpp:69:29: error: ‘cudnnTensorDescriptor_t’ was not declared in this scope
    ./include/caffe/util/cudnn.hpp:69:54: error: ‘desc’ was not declared in this scope
    inline void setTensor4dDesc(cudnnTensorDescriptor_t* desc,
    ^
    ./include/caffe/util/cudnn.hpp:70:5: error: expected primary-expression before ‘int’
    int n, int c, int h, int w,
    ^
    ./include/caffe/util/cudnn.hpp:70:12: error: expected primary-expression before ‘int’
    int n, int c, int h, int w,
    ^
    ./include/caffe/util/cudnn.hpp:70:19: error: expected primary-expression before ‘int’
    int n, int c, int h, int w,
    ^
    ./include/caffe/util/cudnn.hpp:70:26: error: expected primary-expression before ‘int’
    int n, int c, int h, int w,
    ^
    ./include/caffe/util/cudnn.hpp:71:5: error: expected primary-expression before ‘int’
    int stride_n, int stride_c, int stride_h, int stride_w) {
    ^
    ./include/caffe/util/cudnn.hpp:71:19: error: expected primary-expression before ‘int’
    int stride_n, int stride_c, int stride_h, int stride_w) {
    ^
    ./include/caffe/util/cudnn.hpp:71:33: error: expected primary-expression before ‘int’
    int stride_n, int stride_c, int stride_h, int stride_w) {
    ^
    ./include/caffe/util/cudnn.hpp:71:47: error: expected primary-expression before ‘int’
    int stride_n, int stride_c, int stride_h, int stride_w) {
    ^
    ./include/caffe/util/cudnn.hpp:77:29: error: variable or field ‘setTensor4dDesc’ declared void
    inline void setTensor4dDesc(cudnnTensorDescriptor_t* desc,
    ^
    ./include/caffe/util/cudnn.hpp:77:29: error: ‘cudnnTensorDescriptor_t’ was not declared in this scope
    ./include/caffe/util/cudnn.hpp:77:54: error: ‘desc’ was not declared in this scope
    inline void setTensor4dDesc(cudnnTensorDescriptor_t* desc,
    ^
    ./include/caffe/util/cudnn.hpp:78:5: error: expected primary-expression before ‘int’
    int n, int c, int h, int w) {
    ^
    ./include/caffe/util/cudnn.hpp:78:12: error: expected primary-expression before ‘int’
    int n, int c, int h, int w) {
    ^
    ./include/caffe/util/cudnn.hpp:78:19: error: expected primary-expression before ‘int’
    int n, int c, int h, int w) {
    ^
    ./include/caffe/util/cudnn.hpp:78:26: error: expected primary-expression before ‘int’
    int n, int c, int h, int w) {
    ^
    ./include/caffe/util/cudnn.hpp:102:5: error: ‘cudnnTensorDescriptor_t’ has not been declared
    cudnnTensorDescriptor_t bottom, cudnnFilterDescriptor_t filter,
    ^
    ./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::setConvolutionDesc(cudnnConvolutionStruct**, int, cudnnFilterDescriptor_t, int, int, int, int)’:
    ./include/caffe/util/cudnn.hpp:105:70: error: there are no arguments to ‘cudnnSetConvolution2dDescriptor’ that depend on a template parameter, so a declaration of ‘cudnnSetConvolution2dDescriptor’ must be available [-fpermissive]
    pad_h, pad_w, stride_h, stride_w, 1, 1, CUDNN_CROSS_CORRELATION));
    ^
    ./include/caffe/util/cudnn.hpp:12:28: note: in definition of macro ‘CUDNN_CHECK’
    cudnnStatus_t status = condition; \
    ^
    ./include/caffe/util/cudnn.hpp:105:70: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
    pad_h, pad_w, stride_h, stride_w, 1, 1, CUDNN_CROSS_CORRELATION));
    ^
    ./include/caffe/util/cudnn.hpp:12:28: note: in definition of macro ‘CUDNN_CHECK’
    cudnnStatus_t status = condition; \
    ^
    ./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::createPoolingDesc(cudnnPoolingStruct**, caffe::PoolingParameter_PoolMethod, cudnnPoolingMode_t*, int, int, int, int, int, int)’:
    ./include/caffe/util/cudnn.hpp:117:13: error: ‘CUDNN_POOLING_AVERAGE_COUNT_INCLUDE_PADDING’ was not declared in this scope
    *mode = CUDNN_POOLING_AVERAGE_COUNT_INCLUDE_PADDING;
    ^
    ./include/caffe/util/cudnn.hpp:124:41: error: there are no arguments to ‘cudnnSetPooling2dDescriptor’ that depend on a template parameter, so a declaration of ‘cudnnSetPooling2dDescriptor’ must be available [-fpermissive]
    pad_h, pad_w, stride_h, stride_w));
    ^
    ./include/caffe/util/cudnn.hpp:12:28: note: in definition of macro ‘CUDNN_CHECK’
    cudnnStatus_t status = condition; \
    ^
    In file included from ./include/caffe/loss_layers.hpp:11:0,
    from ./include/caffe/common_layers.hpp:12,
    from ./include/caffe/vision_layers.hpp:10,
    from src/caffe/layers/deconv_layer.cpp:7:
    ./include/caffe/neuron_layers.hpp: At global scope:
    ./include/caffe/neuron_layers.hpp:501:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t bottom_desc_;
    ^
    ./include/caffe/neuron_layers.hpp:502:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t top_desc_;
    ^
    ./include/caffe/neuron_layers.hpp:584:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t bottom_desc_;
    ^
    ./include/caffe/neuron_layers.hpp:585:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t top_desc_;
    ^
    ./include/caffe/neuron_layers.hpp:669:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t bottom_desc_;
    ^
    ./include/caffe/neuron_layers.hpp:670:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t top_desc_;
    ^
    In file included from ./include/caffe/vision_layers.hpp:10:0,
    from src/caffe/layers/deconv_layer.cpp:7:
    ./include/caffe/common_layers.hpp:536:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t bottom_desc_;
    ^
    ./include/caffe/common_layers.hpp:537:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t top_desc_;
    ^
    In file included from src/caffe/layers/deconv_layer.cpp:7:0:
    ./include/caffe/vision_layers.hpp:249:10: error: ‘cudnnTensorDescriptor_t’ was not declared in this scope
    vector<cudnnTensorDescriptor_t> bottom_descs_, top_descs_;
    ^
    ./include/caffe/vision_layers.hpp:249:33: error: template argument 1 is invalid
    vector<cudnnTensorDescriptor_t> bottom_descs_, top_descs_;
    ^
    ./include/caffe/vision_layers.hpp:249:33: error: template argument 2 is invalid
    ./include/caffe/vision_layers.hpp:250:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t bias_desc_;
    ^
    ./include/caffe/vision_layers.hpp:450:3: error: ‘cudnnTensorDescriptor_t’ does not name a type
    cudnnTensorDescriptor_t bottom_desc_, top_desc_;
    ^
    make: *** [.build_release/src/caffe/layers/dropout_layer.o] Error 1
    make: *** [.build_release/src/caffe/layers/deconv_layer.o] Error 1

    Any advice? Thanks!

  4. Hello all. When I get to the “make -j 4 all” step, I get:
    CXX .build_release/src/caffe/proto/caffe.pb.cc
    CXX src/caffe/common.cpp
    CXX src/caffe/layers/deconv_layer.cpp
    CXX src/caffe/layers/dropout_layer.cpp

    Then I get a bunch of compile errors, about 300 or so lines of them, that have a lot to do with the ./include/caffe/ directory. Any advice? Thanks!

        • My first guess is that there is an issue with the working directory. In your post, you tried to compile from: ~/cudnn-6.5-linux-armv7-R1/caffe

          In the video, Caffe was compiled from ~/caffe. In the instructions in this article, there appears to be a missing ‘cd ~/’ before downloading Caffe. In the video, Caffe was already installed as demonstrated previously in another video. Try getting rid of Caffe from the cudnn directory, download Caffe into ~/ and build it there. Let me know if that works, so I can change the article. Try to get Caffe working before adding cuDNN support if you’re still having issues.

          • Okay, I downloaded it into the home directory and built, but it still didn’t do anything until I commented back the # USE_CDNN := 1 line and changed LMDB_MAP_SIZE from 1099511627776 to 536870912, then it built just fine. However, when I uncommented the USE_CDNN := 1 line, it gives me far fewer compiler errors, but still some errors. Any advice on how to make it compile with CUDNN support? Thanks!

  5. Okay, I cloned it into the home directory and built, but I was still getting the errors. However, when I commented the #USE_CDNN := 1 line and changed the LMDB_MAP_SIZE, it compiled successfully. But once I uncommented the USE_CDNN := 1 line and built, I got far fewer compile errors, but still errors. Any advice on how to get it to work with CUDNN support? Thanks!

      • It had different errors, but right now I am getting “make: Nothing to be done for ‘all’.” when I “make – j 4 all”.

        • $ make clean
          and then
          $ make -j 4 all
          Should force everything to recompile.

          If it doesn’t, you’ll have to find the object directory for Caffe and clean it out, and delete the executable.

          USE_CUDNN:=1

          Was the only thing that I did in the video to make it compile correctly. The cuDNN install could be bad; you might want to delete cuDNN and reinstall it. After that, I’m running out of ideas.
          Sorry you’re having these issues.

  6. I will try that. In my Makefile.config, instead of #USE_CDNN := 1, I had USE_CUDNN :=1. Maybe that’s something. I’ll try it out. You’ve been very helpful, thank you!
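
The spelling difference mentioned in this thread (USE_CDNN versus USE_CUDNN) is easy to miss by eye. Below is a minimal sketch of a check, assuming Caffe lives in ~/caffe as in the article. The name check_cudnn_flag is a made-up helper for illustration and is not part of Caffe.

```shell
# check_cudnn_flag is a hypothetical helper, not part of Caffe.
# It prints one word describing the state of the cuDNN switch in a Makefile.config.
check_cudnn_flag() {
    config=$1
    if [ ! -f "$config" ]; then
        echo "missing"        # no such file
    elif grep -Eq '^[[:space:]]*USE_CUDNN[[:space:]]*:=[[:space:]]*1' "$config"; then
        echo "enabled"        # correctly spelled and uncommented
    elif grep -q 'USE_CDNN' "$config"; then
        echo "misspelled"     # the USE_CDNN spelling seen in this thread
    else
        echo "disabled"       # commented out or absent
    fi
}

# Assumed location, taken from the article's ~/caffe layout.
check_cudnn_flag "$HOME/caffe/Makefile.config"
```

If it prints anything other than "enabled", fix the line in Makefile.config, then run make clean before rebuilding so every object file is recompiled.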

  7. Hey- I’m noticing a lot of comments here referencing errors of the form “cudnn* not declared in this scope”. I also had this issue and was confounded by it for a while. I only get the error when using cuDNN R2; it seems that R2 support isn’t in the branch of Caffe mentioned in this post.
    Still really unclear as to what was causing the problem specifically (error makes it look like caffe is just failing to import cudnn, despite environment variables) but it is definitely solved by using cudnn R1.
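
Since the article advises installing only one cuDNN release, it may help to see which cuDNN files are actually sitting in the CUDA library directory before rebuilding. This is a small sketch, assuming the /usr/local/cuda-6.5/lib path used in the article. The name list_cudnn_files is a made-up helper, and the sketch only lists files; it does not tell you which release they came from.

```shell
# list_cudnn_files is a hypothetical helper for illustration.
# The default directory is the install location used in the article.
list_cudnn_files() {
    libdir=${1:-/usr/local/cuda-6.5/lib}
    found=0
    for f in "$libdir"/libcudnn*; do
        [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
        echo "${f##*/}"           # print the file name without the directory
        found=1
    done
    [ "$found" -eq 1 ] || echo "no libcudnn files in $libdir"
}

list_cudnn_files
```

If more files show up than the single release you unpacked, removing them all and copying R1 again is one way to rule out a mixed install.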

  8. Hi all
    I have fully installed Caffe and cuDNN (following the video above, after fixing some issues). Now I want to test with my own data. How do I do that? What format must the data be in, and how do I import it into Caffe? Thanks a lot.

  9. Hey all,
    I am getting the following error on make -j 4 runtest when compiling with cuDNN, while it works perfectly without compiling with cuDNN.

    F0206 03:23:00.169116 12663 cudnn_softmax_layer.cpp:15] Check failed: status == CUDNN_STATUS_SUCCESS (1 vs. 0) CUDNN_STATUS_NOT_INITIALIZED
    *** Check failure stack trace: ***
    @ 0x43255060 (unknown)
    @ 0x43254f5c (unknown)
    @ 0x43254b78 (unknown)
    @ 0x43256f98 (unknown)
    @ 0x43c2b80c caffe::CuDNNSoftmaxLayer::LayerSetUp()
    @ 0x43c5c9ee caffe::SoftmaxWithLossLayer::LayerSetUp()
    @ 0x43beedfc caffe::Net::Init()
    @ 0x43befec0 caffe::Net::Net()
    @ 0x43bdccc2 caffe::Solver::InitTrainNet()
    @ 0x43bdd792 caffe::Solver::Init()
    @ 0x43bdd978 caffe::Solver::Solver()
    @ 0x2c5634 caffe::SolverTest::InitSolverFromProtoString()
    @ 0x2bf9f0 caffe::SolverTest_TestInitTrainTestNets_Test::TestBody()
    @ 0x3b2d28 testing::internal::HandleExceptionsInMethodIfSupported()
    @ 0x3ad2ba testing::Test::Run()
    @ 0x3ad34a testing::TestInfo::Run()
    @ 0x3ad422 testing::TestCase::Run()
    @ 0x3aec7a testing::internal::UnitTestImpl::RunAllTests()
    @ 0x3aee6c testing::UnitTest::Run()
    @ 0xd7a8e main
    @ 0x4417d632 (unknown)
    make: *** [runtest] Aborted

    Please help!
    Thanks a lot

  10. I noticed in your timing runs that the forward and total pass times are actually worse when using cuDNN as opposed to CUDA only. I have observed the same results using cuDNN v2 on the TK1. Do you have any thoughts on why there appears to be no acceleration with cuDNN on the TK1?

    • I think there was a disconnect between the Caffe development team and the cuDNN team over the changes from cuDNN V1 to V2. The cuDNN team was probably adding features and lost a little bit of performance, while at the same time Caffe wasn’t working towards V2 integration until well after it was out. With that said, it could just be a Tegra K1 thing; it may have performed much better on an actual GPU card.

      On the Jetson TX1 using v4 of cuDNN, there is much better performance using cuDNN than CUDA alone.

  11. Hi,

    I have successfully installed Caffe on my Jetson TK1 (L4T 21.4, CUDA 6.5, cuDNNv2). The tests for Caffe are working fine (make -j 4 all; make -j 4 test; make -j 4 runtest), and I was able to use Caffe to classify images from the ImageNet database following this example:

    http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb

    Then, I tried to use the deep residual network model ResNet-50 (from here: https://github.com/KaimingHe/deep-residual-networks) to classify images with the Jetson. So I adapted the previous python script and loaded the ResNet-50 model instead. Everything seemed to work fine until I ran the network for the classification in line 68

    output = net.forward()

    At this point ipython shuts down with the error message “Killed”. This is the last part of the output:

    I0414 13:15:48.842650 6304 net.cpp:219] conv1_relu does not need backward computation.
    I0414 13:15:48.842701 6304 net.cpp:219] scale_conv1 does not need backward computation.
    I0414 13:15:48.842742 6304 net.cpp:219] bn_conv1 does not need backward computation.
    I0414 13:15:48.842803 6304 net.cpp:219] conv1 does not need backward computation.
    I0414 13:15:48.842855 6304 net.cpp:219] input does not need backward computation.
    I0414 13:15:48.842905 6304 net.cpp:261] This network produces output prob
    I0414 13:15:48.843305 6304 net.cpp:274] Network initialization done.
    I0414 13:15:50.019062 6304 upgrade_proto.cpp:66] Attempting to upgrade input file specified using deprecated input fields: ../models/ResNet/ResNet-50-model.caffemodel
    I0414 13:15:50.019403 6304 upgrade_proto.cpp:69] Successfully upgraded file specified using deprecated input fields.
    W0414 13:15:50.019623 6304 upgrade_proto.cpp:71] Note that future Caffe releases will only support input layers and not input fields.

    mean-subtracted values: [(‘B’, 104.0069879317889), (‘G’, 116.66876761696767), (‘R’, 122.6789143406786)]

    In [7]: output = net.forward()
    Killed

    ubuntu@tegra-ubuntu:~/caffe/examples$

    I am not sure how to debug this issue (I googled it and some people suggest that this might be a problem with the compiler).

    Any ideas what could be causing this?

    The full python script can be downloaded from here: https://github.com/Lisandro79/JetsonCaffe.git

    Thanks a lot
