Which way of accessing pixels in OpenCV is the fastest?

If you want to write custom high-performance image processing code, this is an important question. I’m working on a real-time video processing project and we need fast pixel access. In the old days before OpenCV 2.x, your only choice was direct pointer access into an IplImage. The only problem with that is that it’s easy to make a mistake and read or, god forbid, write outside of your image data segment. If you’re not careful, whammo!!! … SEGFAULT.

OpenCV now has an object-oriented C++ interface and a wonderful data container (cv::Mat) for working with image data. cv::Mat offers several methods to access data that are much safer than working directly with pointers. But how much impact does this have on performance?

Below are the results of a performance benchmark comparing various methods (the full code is at the end of the post):

-------- Results --------- (collected for a 640×480, 24-bit, 3-channel image over 100 full passes)
OpenCV iteration method 1a - at with Vec3b : 1000566 (usecs) : 602 % of min
OpenCV iteration method 1b - at with direct access : 809094 (usecs) : 487 % of min
OpenCV iteration method 2a - raw pointer with Vec3b and pointer helpers : 826061 (usecs) : 497 % of min
OpenCV iteration method 2b - raw pointer with Vec3b and pointer helpers + continuous : 824629 (usecs) : 496 % of min
OpenCV iteration method 3 - raw pointer access with raw step : 166014 (usecs) : 100 % of min
OpenCV iteration method 4 - raw pointer with pointer helpers : 196367 (usecs) : 118 % of min
OpenCV iteration method 5 - const matrix iterator with Vec3b : 2370288 (usecs) : 1427 % of min

As you might expect, raw pointer access is still the big winner. However, you can get a little more safety without giving up much performance by using the cv::Mat::ptr(int row) function, which I call a pointer helper in the results. This function gives you a pointer to the start of a row’s data, which is somewhat safer than computing the address yourself. In our test, the pointer helper on plain uchar data (method 4) adds only 18% overhead, while STL-like iteration (method 5) takes about 14 times as long as raw pointer access (1427% of the minimum)!
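
For reference, the "% of min" column is plain integer arithmetic on the measured microseconds. The helper below is a small sketch of that calculation; the name percentOfMin is my own and does not appear in the benchmark, which computes the same expression inline when it prints the results.

```cpp
#include <cassert>
#include <cstdint>

// Integer "percent of minimum": 100 * usecs / minUsecs, truncated toward zero.
// percentOfMin is a hypothetical helper name used only for this illustration.
inline std::uint64_t percentOfMin(std::uint64_t usecs, std::uint64_t minUsecs)
{
    assert(minUsecs > 0); // the fastest method must have taken a measurable time
    return 100 * usecs / minUsecs;
}
```

With the numbers from the results, percentOfMin(196367, 166014) gives 118 (the 18% overhead of method 4) and percentOfMin(2370288, 166014) gives 1427 (method 5).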

Another advantage of using cv::Mat::ptr(int row) is that you don’t have to ensure the data is continuous. Remember, a cv::Mat doesn’t have to occupy one continuous memory segment: there can be gaps between rows, for example when the matrix is a region of interest inside a larger image. When using raw pointers and the step size, you have to look out for that. However, since cv::Mat::ptr(int row) gives you a pointer to the beginning of the row, you won’t be affected even if the data is not continuous. (When the data is not continuous, the breaks always fall on row boundaries.)
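
To see why a per-row pointer stays correct on non-continuous data, here is a minimal sketch that does not use OpenCV at all. StridedImage is a hypothetical stand-in I made up for illustration: it only mimics the idea of a pixel buffer whose rows are step bytes apart, with optional padding after each row. Its ptr and isContinuous members imitate the roles of cv::Mat::ptr and cv::Mat::isContinuous, but they are not the OpenCV implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the parts of cv::Mat discussed here: a byte buffer
// whose rows may be followed by padding (step >= cols * channels).
struct StridedImage
{
    int rows;
    int cols;
    int channels;
    std::size_t step;                  // bytes from the start of one row to the next
    std::vector<std::uint8_t> data;

    StridedImage(int r, int c, int ch, std::size_t padding, std::uint8_t fill)
        : rows(r), cols(c), channels(ch),
          step(static_cast<std::size_t>(c) * ch + padding),
          data(static_cast<std::size_t>(r) * step, 0xFF) // padding bytes hold 0xFF
    {
        const std::size_t rowBytes = static_cast<std::size_t>(cols) * channels;
        for (int row = 0; row < rows; ++row)
            for (std::size_t i = 0; i < rowBytes; ++i)
                data[static_cast<std::size_t>(row) * step + i] = fill;
    }

    // Plays the role of a row-pointer helper: pointer to the first byte of a row.
    const std::uint8_t* ptr(int row) const
    {
        assert(row >= 0 && row < rows);
        return data.data() + static_cast<std::size_t>(row) * step;
    }

    bool isContinuous() const
    {
        return step == static_cast<std::size_t>(cols) * channels;
    }
};

// Row-by-row sum: correct whether or not the rows are padded.
inline std::int64_t sumByRows(const StridedImage& img)
{
    std::int64_t sum = 0;
    for (int row = 0; row < img.rows; ++row)
    {
        const std::uint8_t* p = img.ptr(row);
        for (int i = 0; i < img.cols * img.channels; ++i)
            sum += p[i];
    }
    return sum;
}

// Flat sum that treats the buffer as one long row. Only valid when the image
// is continuous; otherwise it reads padding bytes and misses pixels at the end.
inline std::int64_t sumFlat(const StridedImage& img)
{
    std::int64_t sum = 0;
    const std::size_t n = static_cast<std::size_t>(img.rows) * img.cols * img.channels;
    for (std::size_t i = 0; i < n; ++i)
        sum += img.data[i];
    return sum;
}
```

For a 4×5, 3-channel image filled with 1s, sumByRows returns 60 with or without padding, while sumFlat returns 60 only when there is no padding. That is the same reason a flat 1D loop should be guarded by a continuity check.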

In summary, do it this way:

cv::Size imgSize(640, 480);
cv::Mat image = cv::Mat(imgSize, CV_8UC3, cv::Scalar(1,1,1) );
int64_t sum = 0;
for (int row = 0; row < image.rows; ++row)
{
    // Pointer to the first byte of this row; valid even if the Mat is not continuous.
    const uchar *ptr = image.ptr(row);
    for (int col = 0; col < image.cols; ++col)
    {
        const uchar * uc_pixel = ptr;
        int a = uc_pixel[0];
        int b = uc_pixel[1];
        int c = uc_pixel[2];
        sum += a + b + c;
        ptr += 3; // advance one pixel: 3 bytes for CV_8UC3
    }
}

Here is the full benchmark code, written as a Boost.Test case:

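#include <climits>
#include <iostream>
#include <string>
#include <vector>
#include <boost/test/unit_test.hpp>
#include <opencv2/core/core.hpp>
// util::StopWatch and timestamp_t come from my own utility code (not shown here).

BOOST_AUTO_TEST_CASE ( cvMatPixelIterationMethods )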
{
    cv::Size imgSize(640, 480);
    cv::Mat image = cv::Mat(imgSize, CV_8UC3, cv::Scalar(1,1,1) );
    const int trials = 100;
    
    std::vector<std::string> names;
    std::vector<timestamp_t> usecs;
    timestamp_t min = UINT_MAX;
    
// OpenCV iteration method 1a - at with vec3b
    int64_t sum = 0;
    
    
    util::StopWatch stopWatch;
    stopWatch.start();
    for(int t = 0; t < trials ; ++t)
    {
        for (int row = 0; row < image.rows; ++row)
        {
            for (int col = 0; col < image.cols; ++col)
            {
                cv::Vec3b pixel = image.at<cv::Vec3b>(row, col);
                int a = pixel[0];
                int b = pixel[1];
                int c = pixel[2];           
                sum += a + b + c;
            }
        }
    }
    stopWatch.stop();
    BOOST_CHECK(sum == 3*imgSize.area()*trials);
    BOOST_TEST_MESSAGE("OpenCV iteration method 1a - at with Vec3b : " << stopWatch.elapsedTime_usec() << " usecs" );
    names.push_back("OpenCV iteration method 1a - at with Vec3b : ");
    usecs.push_back(stopWatch.elapsedTime_usec());
    if(stopWatch.elapsedTime_usec() < min) min = stopWatch.elapsedTime_usec();
    
// OpenCV iteration method 1b - at with direct access
    sum = 0;
    stopWatch.reset();
    stopWatch.start();
    for(int t = 0; t < trials ; ++t)
    {
        for (int row = 0; row < image.rows; ++row)
        {
            for (int col = 0; col < image.cols; ++col)
            {
                int a = image.at<cv::Vec3b>(row, col)[0];
                int b = image.at<cv::Vec3b>(row, col)[1];
                int c = image.at<cv::Vec3b>(row, col)[2];           
                sum += a + b + c;
            }
        }
    }
    stopWatch.stop();
    BOOST_CHECK(sum == 3*imgSize.area()*trials);
    BOOST_TEST_MESSAGE("OpenCV iteration method 1b - at with direct access : " << stopWatch.elapsedTime_usec() << " usecs" );
    names.push_back("OpenCV iteration method 1b - at with direct access : ");
    usecs.push_back(stopWatch.elapsedTime_usec());
    if(stopWatch.elapsedTime_usec() < min) min = stopWatch.elapsedTime_usec();

// OpenCV iteration method 2a - raw pointer with Vec3b and pointer helpers
    sum = 0;
    stopWatch.reset();
    stopWatch.start();
    for(int t = 0; t < trials ; ++t)
    {
        for (int row = 0; row < image.rows; ++row)
        {
            cv::Vec3b *ptr = image.ptr<cv::Vec3b>(row);
            for (int col = 0; col < image.cols; ++col)
            {
                cv::Vec3b pixel = ptr[col];
                int a = pixel[0];
                int b = pixel[1];
                int c = pixel[2];           
                sum += a + b + c;
            }
        }
    }
    stopWatch.stop();
    BOOST_CHECK(sum == 3*imgSize.area()*trials);
    BOOST_TEST_MESSAGE("OpenCV iteration method 2a - raw pointer with Vec3b and pointer helpers : " << stopWatch.elapsedTime_usec() << " usecs" );
    names.push_back("OpenCV iteration method 2a - raw pointer with Vec3b and pointer helpers : ");
    usecs.push_back(stopWatch.elapsedTime_usec());
    if(stopWatch.elapsedTime_usec() < min) min = stopWatch.elapsedTime_usec();
    
// OpenCV iteration method 2b - raw pointer with Vec3b and pointer helpers + continuous 
    sum = 0;
    stopWatch.reset();
    stopWatch.start();
    int cols = image.cols, rows = image.rows; // Fall back to row-by-row if the image is not continuous.
    if (image.isContinuous())
    {
        cols = image.rows * image.cols; // Loop over all pixels as 1D array.
        rows = 1;
    }
    
    for(int t = 0; t < trials ; ++t)
    {
        for (int row = 0; row < rows; ++row)
        {
            cv::Vec3b *ptr = image.ptr<cv::Vec3b>(row);
            for (int col = 0; col < cols; ++col)
            {
                cv::Vec3b pixel = ptr[col];
                int a = pixel[0];
                int b = pixel[1];
                int c = pixel[2];           
                sum += a + b + c;
            }
        }
    }
    stopWatch.stop();
    BOOST_CHECK(sum == 3*imgSize.area()*trials);
    BOOST_TEST_MESSAGE("OpenCV iteration method 2b - raw pointer with Vec3b and pointer helpers  + continuous : " << stopWatch.elapsedTime_usec() << " usecs" );
    names.push_back("OpenCV iteration method 2b - raw pointer with Vec3b and pointer helpers  + continuous : ");
    usecs.push_back(stopWatch.elapsedTime_usec());
    if(stopWatch.elapsedTime_usec() < min) min = stopWatch.elapsedTime_usec();
    
// OpenCV iteration method 3 - raw pointer access with raw step
    sum = 0;
    stopWatch.reset();
    stopWatch.start();
    uchar* uc_pixel = image.data;
    for(int t = 0; t < trials ; ++t)
    {
        for (int row = 0; row < image.rows; ++row)
        {
            uc_pixel = image.data + row*image.step;
            for (int col = 0; col < image.cols; ++col)
            {
                int a = uc_pixel[0];
                int b = uc_pixel[1];
                int c = uc_pixel[2];           
                sum += a + b + c;
                uc_pixel += 3;
            }
        }
    }
    stopWatch.stop();
    BOOST_CHECK(sum == 3*imgSize.area()*trials);
    BOOST_TEST_MESSAGE("OpenCV iteration method 3  - raw pointer access with raw step : " << stopWatch.elapsedTime_usec() << " usecs" );
    names.push_back("OpenCV iteration method 3  - raw pointer access with raw step : ");
    usecs.push_back(stopWatch.elapsedTime_usec());
    if(stopWatch.elapsedTime_usec() < min) min = stopWatch.elapsedTime_usec();
    
// OpenCV iteration method 4 - raw pointer with pointer helpers
    sum = 0;
    stopWatch.reset();
    stopWatch.start();
    for(int t = 0; t < trials ; ++t)
    {
        for (int row = 0; row < image.rows; ++row)
        {
            const uchar *ptr = image.ptr(row);
            for (int col = 0; col < image.cols; ++col)
            {
                const uchar * uc_pixel = ptr;
                int a = uc_pixel[0];
                int b = uc_pixel[1];
                int c = uc_pixel[2];           
                sum += a + b + c;
                ptr += 3;
            }
        }
    }
    stopWatch.stop();
    BOOST_CHECK(sum == 3*imgSize.area()*trials);
    BOOST_TEST_MESSAGE("OpenCV iteration method 4  - raw pointer with pointer helpers : " << stopWatch.elapsedTime_usec() << " usecs" );
    names.push_back("OpenCV iteration method 4  - raw pointer with pointer helpers : ");
    usecs.push_back(stopWatch.elapsedTime_usec());
    if(stopWatch.elapsedTime_usec() < min) min = stopWatch.elapsedTime_usec();
   
    
// OpenCV iteration method 5 - const matrix iterator with Vec3b
    sum = 0;
    stopWatch.reset();
    stopWatch.start();
    for(int t = 0; t < trials ; ++t)
    {
        cv::MatConstIterator_<cv::Vec3b> it = image.begin<cv::Vec3b>(), it_end = image.end<cv::Vec3b>();
        for(; it != it_end; ++it)
        {

            int a = (*it)[0];
            int b = (*it)[1];
            int c = (*it)[2];           
            sum += a + b + c;
        }
    }
    stopWatch.stop();
    BOOST_CHECK(sum == 3*imgSize.area()*trials);
    BOOST_TEST_MESSAGE("OpenCV iteration method 5 - const matrix iterator with Vec3b : " << stopWatch.elapsedTime_usec() << " usecs" );
    names.push_back("OpenCV iteration method 5 - const matrix iterator  with Vec3b : ");
    usecs.push_back(stopWatch.elapsedTime_usec());
    if(stopWatch.elapsedTime_usec() < min) min = stopWatch.elapsedTime_usec();    

    
    std::cout << "\n -------- Results --------- \n";
    for(size_t i = 0; i< names.size(); ++i)
    {
        std::cout << names[i] << " " << usecs[i] << " (usecs) : " << 100*usecs[i]/min << " % of min \n";
    }
    
}
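
The benchmark relies on util::StopWatch and timestamp_t, which are not defined anywhere in this post. If you want to run it yourself, something like the following could stand in for them. This is only a sketch built on std::chrono: the interface (start, stop, reset, elapsedTime_usec) is inferred from how the benchmark calls it, and treating timestamp_t as a 64-bit unsigned microsecond count is my assumption, not the original definition.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Assumed definition: an unsigned count of microseconds.
typedef std::uint64_t timestamp_t;

namespace util {

// Minimal stopwatch sketch; accumulates time across start()/stop() pairs.
class StopWatch
{
public:
    StopWatch() : running_(false), accumulated_(0) {}

    void start()
    {
        begin_ = Clock::now();
        running_ = true;
    }

    void stop()
    {
        if (running_)
        {
            accumulated_ += std::chrono::duration_cast<std::chrono::microseconds>(
                Clock::now() - begin_);
            running_ = false;
        }
    }

    void reset()
    {
        accumulated_ = std::chrono::microseconds(0);
        running_ = false;
    }

    // This sketch only reports time once the watch has been stopped.
    timestamp_t elapsedTime_usec() const
    {
        assert(!running_);
        return static_cast<timestamp_t>(accumulated_.count());
    }

private:
    typedef std::chrono::steady_clock Clock; // monotonic, unaffected by wall-clock changes
    Clock::time_point begin_;
    bool running_;
    std::chrono::microseconds accumulated_;
};

} // namespace util
```

I picked steady_clock here because a benchmark wants a monotonic clock. Keep in mind that timings from any stopwatch like this depend heavily on compiler optimization flags, so build in release mode before comparing methods.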