Suppose you are parsing text input and need to handle the string representation of a field that could have several formats. In that case, using regular expressions is appealing because you can try a number of patterns sequentially (taking the most likely first) and exit when you get a match.
When I tried to do this in C++, I struggled to find a simple example to follow. Here’s the way I’ve been doing it, using a simple example that parses a length that could be given in any of several units of measure:
#include <boost/format.hpp>
#include <boost/regex.hpp>
#include <cstdlib>
#include <iostream>
#include <stdexcept>
#include <string>

void parse( const std::string& candidate )
{
    boost::smatch value;

    // Try the most likely format first and return as soon as one matches.
    if ( boost::regex_search( candidate, value, boost::regex( "(.*)ft(.*)in" )) )
    {
        auto feet = atol( value[1].str().c_str() );
        auto inches = atof( value[2].str().c_str() );
        std::cout << "Matched " << feet << " feet and " << inches << " inches\n";
        return;
    }

    if ( boost::regex_search( candidate, value, boost::regex( "(.*)m(.*)cm" )) )
    {
        auto metres = atol( value[1].str().c_str() );
        auto centimetres = atof( value[2].str().c_str() );
        std::cout << "Matched " << metres << " metres and " << centimetres << " centimetres\n";
        return;
    }

    if ( boost::regex_search( candidate, value, boost::regex( "([0-9]+)mm" )) )
    {
        auto millimetres = atol( value[1].str().c_str() );
        std::cout << "Matched " << millimetres << " millimetres\n";
        return;
    }

    // None of the patterns matched, so report the failure to the caller.
    throw std::runtime_error( (boost::format( "Failed to match candidate '%1%' with ft/in, m/cm or mm" ) % candidate).str() );
}
This only uses a fraction of what can be done with regex; the point is to show how to use boost::regex. One gotcha is that, as in the code above, the captured groups written into value are indexed from 1: index 0 holds the whole of the text that the pattern matched (which, with regex_search, need not be the whole candidate string). Another gotcha is that boost::regex is one of the few Boost libraries that isn’t implemented purely as templates in header files (at least in older Boost releases), so you also have to add the appropriate .lib to the linker inputs.
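To make the indexing gotcha concrete, here is a minimal sketch. It swaps in std::regex from the C++11 standard library in place of boost::regex so that it builds without linking Boost; std::smatch follows the same indexing convention as boost::smatch. The names MatchParts and match_parts are hypothetical, introduced only for this illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper type for illustration only.
struct MatchParts
{
    std::string whole;  // value[0]: the text the whole pattern matched
    std::string first;  // value[1]: the first parenthesised group
};

// Searches candidate for a millimetre value and returns both index 0 and
// index 1 of the match results. Both strings are empty if nothing matched.
MatchParts match_parts( const std::string& candidate )
{
    static const std::regex pattern( "([0-9]+)mm" );
    std::smatch value;
    if ( !std::regex_search( candidate, value, pattern ) )
    {
        return {};
    }
    return { value[0].str(), value[1].str() };
}
```

Calling match_parts( "width: 1500mm approx" ) should give "1500mm" for index 0 and "1500" for index 1. Index 0 is the portion of the input that matched, not the whole input string, because regex_search is allowed to match a substring.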