How to achieve pattern matching in C++ using boost::regex

Suppose you are parsing text input and need to handle the string representation of a field that could have several formats. In that case, using regular expressions is appealing because you can try a number of patterns sequentially (taking the most likely first) and exit when you get a match.

When I tried to do this in C++, I found it hard to find an easy example to follow. Here’s the way I’ve been doing it, based on a simple example to parse a length that could be in any of several units of measure:

    #include <boost/algorithm/string_regex.hpp>

    void parse( const std::string& candidate )
    {
        boost::smatch value;

        if ( boost::regex_search( candidate, value, boost::regex( "(.*)ft(.*)in" )) )
        {
            auto feet = atol( value[1].str().c_str() );
            auto inches = atof( value[2].str().c_str() );
            std::cout << "Matched " << feet << " feet and " << inches << " inches\n";
        }

        if ( boost::regex_search( candidate, value, boost::regex( "(.*)m(.*)cm" )) )
        {
            auto metres = atol( value[1].str().c_str() );
            auto centimetres = atof( value[2].str().c_str() );
            std::cout << "Matched " << metres << " metres and " << centimetres << " centimetres\n";
        }

        if ( boost::regex_search( candidate, value, boost::regex( "([0-9]+)mm" )) )
        {
            auto millimetres = atol( value[1].str().c_str() );
            std::cout << "Matched " << millimetres << " millimetres\n";
        }

        throw std::runtime_error( (boost::format( "Failed to match candidate '%1%' with ft/in, m/cm or mm" ) % candidate).str() );
    }

This only uses a fraction of what can be done with regex – the point is to show how to use boost::regex. One gotcha is that, as per the code above, the string matches that are written into value are indexed from 1 – for some reason, the zero’th index accesses the whole candidate expression. Another gotcha is that boost::regex is one of the few boost libraries that isn’t just implemented using templates in a header file, so you also have to add the appropriate .lib into the linker inputs.

1 Comment

Filed under C++, C++ Code, Programming

One response to “How to achieve pattern matching in C++ using boost::regex

  1. Pingback: Regex – concrete use cases | musingstudio

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.