Tag Archives: Optimization

Optimizing C++ code (Visual C++ Team blog)

The Visual C++ Team blog has started a series on Optimizing C++ code.

  1. Introduction
  2. Overview
  3. Constant Folding
  4. Dead Code Elimination

Leave a comment

Filed under C++, Programming

How to profile performance by hand

I recently needed to profile some C++ code that was taking longer than expected to run. The code was running on a machine without a profiler, so I wrote a handy Timer class that dumps nested timings of each method. You can initialize it to write either to a file or to std::cout if the machine has a console.

I originally thought of this as a ‘Poor Man’s Profiler’, but having used it there are real benefits to taking the trouble to instrument your own code – you can use the file dumps to swiftly compare performance between code changes; you can print out performance statistics and take along to meetings.

#include <iostream>
#include <chrono>
class TimerOutput
{
  public:
    TimerOutput( const std::string file_path = "" ) :
      file_path_( file_path )
    {
      if ( !file_path_.empty() )
      {
        file_.open( file_path_ );
      }
    }

    std::ostream& Stream()
    {
      if (file_path_.empty() )
        return std::cout;
      else
        return file_;
    }

private:
    std::string file_path_;
    std::ofstream file_;
};

class Timer
{
public:
    Timer( const std::string& description ) :
      description_(description),
      start_(std::chrono::system_clock::now())
    {
        applyIndent();
        timer_output_->Stream() << "Start " << description_.c_str() << "\n";
        ++indent;
    }

    ~Timer()
    {
        --indent;
        const std::chrono::time_point<std::chrono::system_clock> finish = std::chrono::system_clock::now();
        auto milliseconds = std::chrono::duration_cast<std::chrono::milliseconds>(finish - start_).count();

        applyIndent();
        timer_output_->Stream() << "Finished " << description_.c_str()
            << ", took " << milliseconds << " (ms)" << "\n";
}

private:
    void applyIndent() const
    {
        for ( size_t i = 0; i < indent; ++i )
        {
            timer_output_->Stream() << "--";
        }
    }

    static size_t indent;
    static TimerOutput* timer_output_;
    std::string description_;
    const std::chrono::time_point<std::chrono::system_clock> start_;
};

#define INITIALIZE_PROFILING_TO_CONSOLE() \
  TimerOutput timer_output; \
  size_t Timer::indent = 0; \
  TimerOutput* Timer::timer_output_ = &timer_output;

#define INITIALIZE_PROFILING_TO_FILE( path ) \
  TimerOutput timer_output( path ); \
  size_t Timer::indent = 0; \
  TimerOutput* Timer::timer_output_ = &timer_output;

#define TIME( description, f ) \
  { \
    Timer profiler( description ); \
    f; \
  }

The obvious limitation of my solution is that it uses a class static to achieve the levels of nesting, so it won’t work on multi-threaded code – but it was great for my purposes. Here’s some sample code that shows it in action:

#include "stdafx.h"

#include <thread>
#include <fstream>

#include "..\MusingStudio\Profiler.h"

//INITIALIZE_PROFILING_TO_CONSOLE()
INITIALIZE_PROFILING_TO_FILE( "c:/temp/timings.txt" )

void method2()
{
  TIME( "method2",
    std::chrono::milliseconds short_wait( 5 );
    std::this_thread::sleep_for( short_wait );
  )
}

void method1()
{
 TIME( "method1",
  TIME( "loop",
    for ( int i = 0; i &lt; 5; ++i )
    {
        method2();
    }
  )

  TIME( "expensive algorithm",
    std::chrono::milliseconds wait( 100 );
    std::this_thread::sleep_for( wait );
  )
 )
}

void method3()
{
  TIME( "method3",
    std::chrono::milliseconds long_wait( 500 );
    std::this_thread::sleep_for( long_wait );
  )
}

int main(int argc, char* argv[])
{
  TIME( "main",
    method1();
    method3();
  )

  return 0;
}

Here’s the output from the sample code:
Profiler

Leave a comment

Filed under C++, C++ Code, Programming

Alexandrescu Video – Three Optimization Tips for C++

Andrei Alexandrescu gave this talk on optimization tips for C++. He claims you should aim to:

  • Measure performance against a baseline
  • Choose algorithms that use the least heavyweight operations
  • Reduce array writes

The slides are also available.

Measuring gives you a leg up on experts who don’t need to measure

He also presents a handy ordering of different integral/floating point operations in terms of cost. The techniques (which he has internalised to give a gut feel of whether some improvement will produce a speed-up) are illustrated with a classic interview question (convert an integer to ascii) that is heavily used at Facebook.

The underlying message of this presentation is that the cost of one engineer spending time producing some pretty esoteric code with massive performance benefits is more than outweighed by the savings in terms of data centre costs (due to improved speed, you can scale back spend on servers).

Leave a comment

Filed under C++, Video