
Regex Split problem, probably a no brainer.

Last post 05-05-2009, 2:11 PM by evildrome. 0 replies.
  •  05-05-2009, 2:11 PM 52669

    Hi All,

    I've got a VB program with a VC++ DLL that uses Boost 1_34_1 regular expressions. I am a programmer, but I'm not experienced with VC++ or regex.

    The program worked fine until the target HTML format (sPage) changed and I needed to update the regexes. I got them updated OK. I checked them all individually with RegexBuddy, and they all work. I replaced the old regexes with the new ones, but now, after the regex_split call, the output-iterator (oMessageInfo) contains only the output of the first regular expression.

    I looked at the Boost docs ( http://www.boost.org/doc/libs/1_31_0...gex_split.html ) for the regex_split command, which say:

    "Effects: Each version of the algorithm takes an output-iterator for output, and a string for input. If the expression contains no marked sub-expressions, then the algorithm writes one string onto the output-iterator for each section of input that does not match the expression. If the expression does contain marked sub-expressions, then each time a match is found, one string for each marked sub-expression will be written to the output-iterator."

    Unfortunately I have no idea what a 'marked sub-expression' is.
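
    My best guess is that a marked sub-expression is a parenthesised group that captures text, like ([0-9]*), whereas (?:...) groups without capturing. Below is a minimal sketch of that reading. It uses std::regex (C++11) as a stand-in for Boost, on the assumption that the two agree on this point; count_marked and check_marked_groups are names made up for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: how many marked (capturing) groups a pattern has.
// std::regex is a stand-in here, not the Boost class used in the DLL.
unsigned count_marked(const std::string& pattern) {
    const std::regex re(pattern);
    return re.mark_count();
}

// Hypothetical self-check showing which groups count as "marked".
void check_marked_groups() {
    assert(count_marked("abc") == 0);               // no parentheses at all
    assert(count_marked("(?:abc)") == 0);           // (?:...) does not mark
    assert(count_marked("#([0-9]*)") == 1);         // one capturing group
    assert(count_marked("(?:a)(b)(?:c)(d)") == 2);  // only (b) and (d) mark
}
```

    If that reading is right, the three adjacent string literals in my pattern form one regex with four marked sub-expressions: number, name, address and date.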

    Hopefully someone here can spot the schoolboy mistake.

    // Backslashes are doubled so that \s, \S, \/ and \# survive the C++ string literal.
    const std::string sMessages1 =
        // Message number
        "<td class=\"msgnumh smalltype\">\\#([0-9]*)<\\/td>"
        // Sender name and e-mail address
        "(?:[\\s\\S]*?)From:(?:<\\/em>)?(?:<\\/span>)?(?:[\\s]*)?(?:&quot;)?([^&]*)(?:&quot;)?(?:[\\s]*)?&lt;([^&]*)&gt;"
        // Date
        "(?:[\\s\\S]*?)Date:(?:<\\/em>)?(?:<\\/span>)?[\\s]*([^<]*)(?:<br>)?";

    boost::regex   oRegExMessages1(sMessages1, boost::regbase::normal | boost::regbase::icase);

    boost::regex_split(std::back_inserter(oMessageInfo), sPage, oRegExMessages1);
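
    For comparison, here is a rough sketch of what I expect the call to produce, going by the "Effects" text quoted above: one string per marked sub-expression for every match. It is written with std::regex rather than Boost, so it is an approximation of regex_split and not its actual implementation. collect_groups and demo are made-up names, and the pattern and input are simplified samples, not my real page.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: for every match of `re` in `text`, append one string
// per marked sub-expression, mimicking the documented regex_split effect.
std::vector<std::string> collect_groups(const std::string& text, const std::regex& re) {
    std::vector<std::string> out;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it) {
        const std::smatch& m = *it;
        assert(m.size() == re.mark_count() + 1);  // index 0 is the whole match
        for (std::size_t i = 1; i < m.size(); ++i) {
            out.push_back(m[i].str());
        }
    }
    return out;
}

// Simplified sample pattern (not the real one). Note the doubled backslashes:
// inside a C++ string literal, "\\s" reaches the regex engine as \s.
std::vector<std::string> demo() {
    const std::regex re("#([0-9]+)[\\s\\S]*?From:\\s*([^<]*)<", std::regex::icase);
    return collect_groups("#12 x From: Bob<br>#13 y from: Al<", re);
}
```

    With that sample input I would expect demo() to return "12", "Bob", "13", "Al" in that order, i.e. every group of every match, not just the first group.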
    


    Cheers,

    Wilson.