This is a little opinion piece on running Robot Operating System (ROS) with OpenCV versus OpenCV4Tegra on the NVIDIA Jetson TK1. First let’s talk about some of the advantages that OpenCV4Tegra has versus regular OpenCV.
As described on the Jetson/Installing OpenCV page, OpenCV4Tegra is a CPU- and GPU-accelerated version of the standard OpenCV library. OpenCV stands for “Open Source Computer Vision”; it is the de facto standard computer vision library, containing more than 2500 computer vision, image processing, and machine learning algorithms. You can read more about the specific performance benefits of using OpenCV4Tegra versus OpenCV at Jetson/Computer Vision Performance.
Note: As of this writing, the comparison is between the current 2.4.10 release of OpenCV and the 2.4.10 release of OpenCV4Tegra.
There are three versions of OpenCV that you can run on the Jetson:
- “Regular” OpenCV
- OpenCV with GPU enhancements
- OpenCV4Tegra with both CPU and GPU enhancements
“Regular” OpenCV is OpenCV compiled from the OpenCV repository with no hardware acceleration. It is typically not used on the Jetson, because GPU enhancements are available for OpenCV. OpenCV with GPU enhancements is designed for CUDA GPGPU acceleration; the GPU code is part of the standard OpenCV package. OpenCV4Tegra is a free, closed-source library available from NVIDIA which includes ARM NEON SIMD optimizations, multi-core CPU optimizations, and some GLSL GPU optimizations.
So why wouldn’t you always use OpenCV4Tegra? The answer lies in the OpenCV library itself: it contains two patented algorithms, SIFT and SURF, which live in opencv-nonfree. Because these are patented, NVIDIA does not include them in its distribution of OpenCV4Tegra. Therefore, if your code does not use SIFT or SURF, you can use OpenCV4Tegra and get the best performance.
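One way to find out which situation you are in is to ask the installed Python bindings whether they expose SIFT/SURF at all. The following is a minimal sketch, not NVIDIA's or OpenCV's own check. It assumes that OpenCV 2.4 bindings expose `cv2.SIFT` and `cv2.SURF` when the nonfree module is built in, and that OpenCV 3.x moves them to `cv2.xfeatures2d`. The helper name `find_patented_detectors` is made up for this example.

```python
def find_patented_detectors(cv_module):
    """Return the names of SIFT/SURF entry points that cv_module exposes.

    Assumptions (not verified against OpenCV4Tegra itself):
      - OpenCV 2.4 with nonfree exposes cv2.SIFT and cv2.SURF
      - OpenCV 3.x with opencv_contrib exposes cv2.xfeatures2d.SIFT_create
        and cv2.xfeatures2d.SURF_create
    """
    found = []

    # OpenCV 2.4 style: constructors live directly on the cv2 module.
    for name in ("SIFT", "SURF"):
        if hasattr(cv_module, name):
            found.append(name)

    # OpenCV 3.x style: factory functions live in the xfeatures2d submodule.
    contrib = getattr(cv_module, "xfeatures2d", None)
    if contrib is not None:
        for name in ("SIFT_create", "SURF_create"):
            if hasattr(contrib, name):
                found.append("xfeatures2d." + name)

    return found
```

After `import cv2`, calling `find_patented_detectors(cv2)` returns an empty list when neither algorithm is visible. A non-empty list only shows that the names exist; some builds may still refuse to run them.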
So why use SIFT and/or SURF? The quick answer is that SIFT and SURF are two of the most popular feature detection algorithms in use today. One application is Simultaneous Localization And Mapping (SLAM), used mostly in the robotics/drone world. One of the most popular packages in that area is Semi-Direct Monocular Visual Odometry (SVO), available on GitHub as rpg_svo.
Another application that uses SIFT/SURF is deep learning, for example with the Caffe Deep Learning Framework.
If you do not need SIFT/SURF in your application, the first alternative is to use OpenCV4Tegra and enjoy the best performance. There is a rub, but also a possible workaround.
When OpenCV4Tegra is installed, its packaging differs from what most packages, such as Robot Operating System (ROS), expect. Aptitude, the Debian package manager, reports that libopencv-* is not installed even though OpenCV4Tegra is. Aptitude sees a conflict with ‘upstream opencv’ (in this case regular old OpenCV) and requests that OpenCV4Tegra be uninstalled before OpenCV is installed. Jetson forum member davywb figured out a workaround and posted it to the Jetson forum. While the workaround is for ROS, it will probably work for most other packages too.
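To see which side of that conflict a given machine is on, it can help to list the installed OpenCV packages and sort them into the two families. This is a rough sketch that assumes a Debian/Ubuntu system with `dpkg-query` available. It also assumes that the NVIDIA packages have names starting with `libopencv4tegra` and the upstream ones start with `libopencv-`. Both function names are invented for the example, and this is not davywb's workaround.

```python
import subprocess


def split_opencv_packages(package_names):
    """Split package names into (tegra, upstream) OpenCV package lists.

    Assumption: NVIDIA's packages start with "libopencv4tegra"; any other
    package starting with "libopencv" is treated as upstream OpenCV.
    """
    tegra = sorted(n for n in package_names if n.startswith("libopencv4tegra"))
    upstream = sorted(
        n for n in package_names
        if n.startswith("libopencv") and not n.startswith("libopencv4tegra")
    )
    return tegra, upstream


def installed_package_names():
    """Return names of fully installed packages (Debian/Ubuntu only)."""
    out = subprocess.check_output(
        ["dpkg-query", "-W", "-f", "${Package} ${Status}\n"]
    ).decode()
    names = []
    for line in out.splitlines():
        name, _, status = line.partition(" ")
        # Status looks like "install ok installed"; keep only that final state.
        if status.split()[-1:] == ["installed"]:
            names.append(name)
    return names
```

On the Jetson, `split_opencv_packages(installed_package_names())` gives both lists at once. If both come back non-empty, you are probably looking at the conflict described above.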
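If you need SIFT/SURF, you can either build OpenCV from source or, as a last resort, add the SIFT/SURF code from the OpenCV library to OpenCV4Tegra (see Note 1).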
You can read more at the Jetson/Installing OpenCV page.
If you need SIFT/SURF, then you should just build OpenCV from source; otherwise, use OpenCV4Tegra.
Note 1: I would only use OpenCV4Tegra with the added SIFT/SURF code from the OpenCV library as a horrible last resort to wring the last bit of possible performance out of the little beast. The major issue with the combination is that it’s a maintenance nightmare; you have to track all the different versions of the OpenCV library and OpenCV4Tegra and merge them. As you know, maintenance is the real bill that you have to pay when writing code.
The mistake that most people make is premature optimization. Instead of building something that works and then fine-tuning it, programmers tend to spend too much time optimizing code which may not be a bottleneck at any time in the near future. Build something that works, figure out where the time is being spent, and then optimize. The closer to the metal you are (GPU and CPU optimizations), the more time-consuming (and error-prone) the project becomes. If you get something to work, you can most likely speed it up. If you can’t get to the point where it works at all, you are optimizing a failure path. It’s the old 80/20 rule: put the 80% in the bank first, see where you really get the most benefit from the last 20%, and work on that.
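“Figure out where the time is being spent” does not need heavy tooling to get started. Here is a minimal sketch of a per-stage timer, assuming Python 3 (for `time.perf_counter`). The class name `StageTimer` and the stage names in the usage note are hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulate wall-clock time per named stage of a pipeline."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def stage(self, name):
        # Time the body of the "with" block and add it to this stage's total.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start

    def report(self):
        """Return (stage, seconds, share_of_total) tuples, slowest first."""
        grand_total = sum(self.totals.values()) or 1.0
        rows = [(name, secs, secs / grand_total)
                for name, secs in self.totals.items()]
        return sorted(rows, key=lambda row: row[1], reverse=True)
```

Wrap each step of the processing loop in `with timer.stage("capture"):`, `with timer.stage("detect"):` and so on, then look at `timer.report()` after a few hundred frames. If the feature detector is not at the top of the list, swapping OpenCV builds is unlikely to be where the benefit is.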
Note 2: OpenCV 3.0 handles SIFT/SURF in a separate repository, opencv_contrib. This may make it easier in the future to combine OpenCV4Tegra with SIFT/SURF, but because OpenCV4Tegra is still at release 2.4.10, this remains to be seen.