Tag Archives: Boost

How to achieve pattern matching in C++ using boost::regex

Suppose you are parsing text input and need to handle the string representation of a field that could have several formats. In that case, using regular expressions is appealing because you can try a number of patterns sequentially (taking the most likely first) and exit when you get a match.

When I tried to do this in C++, I struggled to find an easy example to follow. Here’s the way I’ve been doing it, based on a simple example that parses a length that could be in any of several units of measure:

    #include <cstdlib>
    #include <iostream>
    #include <stdexcept>
    #include <string>

    #include <boost/format.hpp>
    #include <boost/regex.hpp>

    // Try each pattern in turn (most likely first) and return on the first match.
    void parse( const std::string& candidate )
    {
        boost::smatch value;

        if ( boost::regex_search( candidate, value, boost::regex( "(.*)ft(.*)in" )) )
        {
            // value[1] and value[2] hold the two captured groups
            auto feet = atol( value[1].str().c_str() );
            auto inches = atof( value[2].str().c_str() );
            std::cout << "Matched " << feet << " feet and " << inches << " inches\n";
            return;
        }

        if ( boost::regex_search( candidate, value, boost::regex( "(.*)m(.*)cm" )) )
        {
            auto metres = atol( value[1].str().c_str() );
            auto centimetres = atof( value[2].str().c_str() );
            std::cout << "Matched " << metres << " metres and " << centimetres << " centimetres\n";
            return;
        }

        if ( boost::regex_search( candidate, value, boost::regex( "([0-9]+)mm" )) )
        {
            auto millimetres = atol( value[1].str().c_str() );
            std::cout << "Matched " << millimetres << " millimetres\n";
            return;
        }

        // Only reached when none of the patterns matched
        throw std::runtime_error( (boost::format( "Failed to match candidate '%1%' with ft/in, m/cm or mm" ) % candidate).str() );
    }

This only uses a fraction of what can be done with regex; the point is to show how to use boost::regex. One gotcha is that, as per the code above, the captured groups written into value are indexed from 1, because index 0 holds the entire text matched by the pattern (which is not necessarily the whole candidate string). Another gotcha is that boost::regex is one of the few Boost libraries that isn’t implemented purely as templates in header files, so you also have to add the appropriate .lib to the linker inputs.

Filed under C++, C++ Code, Programming

How to transform between a double date-time and std::string in C++

One of the attractions of writing software using the .NET framework is the wealth of support for doing simple things like translating between different data formats. These tasks are typically much harder to achieve in C++ due to the lack of an equivalent framework. One such task that I came across the other day is that date-times are often represented by a double in Windows, where the integer part is the number of days since an epoch (30 December 1899 in the code below) and the fractional part is the time of day as a fraction of 24 hours. Even with access to the Boost library, I still had to do some work to produce a simple transformation in C++.

#include <cmath>
#include <stdexcept>
#include <string>

#include <boost/format.hpp>
#include <boost/date_time/gregorian/gregorian.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>

typedef double DateTime;

namespace
{
  boost::gregorian::date parse_date( DateTime date_time )
  {
    // Day zero of the double representation is 30 December 1899
    boost::gregorian::date dt = boost::date_time::parse_date<boost::gregorian::date>( "1899-12-30", boost::date_time::ymd_order_iso );
    dt += boost::gregorian::date_duration( static_cast<long>( floor(date_time) ) );
    return dt;
  }

  boost::posix_time::time_duration parse_time( DateTime date_time )
  {
    double fractionalDay = date_time - floor(date_time);
    long milliseconds = static_cast<long>( floor( fractionalDay * 24.0 * 60.0 * 60.0 * 1000.0 + 0.5) );
    return boost::posix_time::milliseconds( milliseconds );
  }
}

std::string to_date_string( DateTime date_time )
{
  boost::gregorian::date dt = parse_date( date_time );
  return (boost::format( "%d-%02d-%02d" ) % dt.year() % dt.month().as_number() % dt.day().as_number()).str();
}

DateTime from_date_string( const std::string& value )
{
  boost::gregorian::date epoch = boost::date_time::parse_date<boost::gregorian::date>( "1899-12-30", boost::date_time::ymd_order_iso);
  boost::gregorian::date dt = boost::date_time::parse_date<boost::gregorian::date>( value, boost::date_time::ymd_order_iso);

  boost::gregorian::date_duration diff = dt - epoch;
  return diff.days();
}

std::string to_date_time_string( DateTime date_time )
{
  boost::gregorian::date date_part = parse_date( date_time );
  boost::posix_time::time_duration time_part = parse_time( date_time );

  long long fractional_seconds = time_part.fractional_seconds();
  boost::date_time::time_resolutions resolution = time_part.resolution();
  if ( resolution == boost::date_time::micro )
  {
    fractional_seconds /= 1000;
  } 
  else
  {
    if (resolution != boost::date_time::milli)
      throw std::logic_error( "Unexpected time resolution" );
  }

  return (boost::format( "%d-%02d-%02d %02d:%02d:%02d.%03d" )
    % date_part.year() % date_part.month().as_number() % date_part.day().as_number()
    % time_part.hours() % time_part.minutes() % time_part.seconds() % fractional_seconds ).str();
}

DateTime from_date_time_string( const std::string& value )
{
  DateTime date = from_date_string( value );
 
  boost::posix_time::ptime t = boost::posix_time::time_from_string( value );
  double milliseconds = static_cast<double>(t.time_of_day().total_milliseconds());

  return date + (milliseconds / 24.0 / 60.0 / 60.0 / 1000.0);
}

Please comment if you know a more straightforward way to achieve this transformation, especially using Boost. Syntactically, the code could be simplified using C++11 auto, but I’ve spelt out the types explicitly throughout because I found it helpful to see which parts of the Boost library are being used.
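To make the arithmetic behind the double representation concrete, here is a minimal standard-library-only sketch of the split into whole days and time of day. The names SplitDateTime, split_date_time and format_time_of_day are hypothetical, and it assumes non-negative values measured from the 1899-12-30 epoch used above. It also carries into the next day when rounding to the nearest millisecond lands exactly on 24 hours, an edge case that is easy to overlook.

```cpp
#include <cmath>
#include <cstdio>
#include <string>

// Hypothetical result type: a date-time double split into its two parts.
struct SplitDateTime
{
    long days;          // whole days since the epoch (assumed 1899-12-30)
    long milliseconds;  // milliseconds since midnight, 0..86399999
};

// Splits the double into whole days and milliseconds since midnight,
// rounding the time of day to the nearest millisecond.
SplitDateTime split_date_time(double date_time)
{
    const long ms_per_day = 24L * 60L * 60L * 1000L;

    double whole = std::floor(date_time);
    long days = static_cast<long>(whole);
    long ms = static_cast<long>(std::floor((date_time - whole) * ms_per_day + 0.5));

    // Rounding can produce exactly one full day; carry it into the date part.
    if (ms >= ms_per_day)
    {
        ms -= ms_per_day;
        ++days;
    }

    SplitDateTime result = { days, ms };
    return result;
}

// Formats milliseconds since midnight as HH:MM:SS.mmm.
std::string format_time_of_day(long ms)
{
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "%02ld:%02ld:%02ld.%03ld",
                  ms / 3600000L, (ms / 60000L) % 60L, (ms / 1000L) % 60L, ms % 1000L);
    return buffer;
}
```

For example, 2.75 splits into 2 days and 64,800,000 milliseconds, which formats as 18:00:00.000, because 0.75 of a 24-hour day is 18 hours.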

Filed under C++, C++ Code, Programming

Flat Containers in Boost

Jon Kalb posted an article on flat containers in Boost.

Alex Stepanov, the STL’s creator, has been quoted as saying, “Use vectors whenever you can. If you cannot use vectors, redesign your solution so that you can use vectors.”

The Boost Container library has a family of flat_* containers that have associative container interfaces and semantics, but are implemented as sorted vectors.
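To illustrate the sorted-vector idea, here is a minimal sketch written with only the standard library. It is not how Boost implements its flat_* containers, and the class name FlatIntMap and its member functions are hypothetical. Lookups are binary searches over contiguous memory, while inserts pay for shifting the elements that follow the insertion point.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal "flat map" from int keys to strings, stored as a
// vector of pairs that is always kept sorted by key.
class FlatIntMap
{
    typedef std::pair<int, std::string> Entry;
    std::vector<Entry> entries_;

    // Orders an entry against a bare key for std::lower_bound.
    static bool key_less(const Entry& entry, int key) { return entry.first < key; }

public:
    // Inserts a new entry or overwrites an existing one.
    // Finding the position is O(log n); inserting may shift O(n) elements.
    void insert_or_assign(int key, const std::string& value)
    {
        std::vector<Entry>::iterator pos =
            std::lower_bound(entries_.begin(), entries_.end(), key, key_less);
        if (pos != entries_.end() && pos->first == key)
            pos->second = value;
        else
            entries_.insert(pos, Entry(key, value));
    }

    // Returns a pointer to the stored value, or a null pointer if the key is absent.
    const std::string* find(int key) const
    {
        std::vector<Entry>::const_iterator pos =
            std::lower_bound(entries_.begin(), entries_.end(), key, key_less);
        return (pos != entries_.end() && pos->first == key) ? &pos->second : 0;
    }

    std::size_t size() const { return entries_.size(); }
};
```

The trade-off is the usual one for sorted vectors: cheap, cache-friendly lookups and iteration in exchange for more expensive insertion and erasure.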

Filed under C++, Programming