
help with VB.NET regex

Last post 02-08-2010, 1:40 PM by marraco. 6 replies.
  •  02-08-2010, 11:34 AM 59408

    help with VB.NET regex

    I have a function that takes a regular expression as a parameter.

     

    The function uses the regular expression to check whether the entire content of the file matches (it is a format check). Then the function returns the matches.

     

    An example of the text file is this key:value pair list:

    1:
    2:
    3:some value.
    4:other value
    5:
    6:
    7:
    8:
    9:
    10:another

    The regular expression is this:

     (?<Line>^(?<KeyNumber>\d+):(?<Value>[ ,\S]*(?:(?=\W$)|\z)))+

    The modifiers (which I cannot change) are:

    RegexOptions.Compiled Or _
    RegexOptions.CultureInvariant Or _
    RegexOptions.ExplicitCapture Or _
    RegexOptions.IgnoreCase Or _
    RegexOptions.Multiline Or _
    RegexOptions.Singleline

    The group "Line" should return each line (10 in total), and it works in Expresso:

    [Screenshot: Expresso test]

    But for a reason I cannot understand, it doesn't work in VB.NET. I'm sure that I used the six modifiers posted above (Multiline, CultureInvariant, etc.).

     

    This code only gets a single "line", instead of 10:

    Imports System.Text.RegularExpressions

    Public Class Form1

        Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
            Dim RegexString As String = "(?<Line>^(?<KeyNumber>\d+):(?<Value>[ ,\S]*(?:(?=\W$)|\z)))+"
            Dim ExampleText As String = String.Join(vbCrLf, New String() {"1:", _
                                                                          "2:", _
                                                                          "3:Some Value ", _
                                                                          "4:Other value", _
                                                                          "5:", _
                                                                          "6:", _
                                                                          "7:", _
                                                                          "8:", _
                                                                          "9:", _
                                                                          "10:Another"})
            Dim MyParser As New Regex(RegexString, _
                             RegexOptions.Compiled Or _
                             RegexOptions.CultureInvariant Or _
                             RegexOptions.ExplicitCapture Or _
                             RegexOptions.IgnoreCase Or _
                             RegexOptions.Multiline Or _
                             RegexOptions.Singleline)

            Dim Clases As Match = MyParser.Match(ExampleText)
            MsgBox("Total number of ""Line"" captures: " & Clases.Groups("Line").Captures.Count)
        End Sub
    End Class

     

    It makes no sense. I can't understand why it doesn't work. Can you help me?

  •  02-08-2010, 11:52 AM 59411 in reply to 59408

    Re: help with VB.NET regex

    I think your logic is wrong. In Expresso you are looking at the number of matches of a global match. In your code you are only looking for a single match (the first match) and have the message box reporting the number of captures of a single match.

    Michael

    "In theory, theory and practice are the same. In practice, they are not."
    Albert Einstein
  •  02-08-2010, 12:04 PM 59413 in reply to 59411

    Re: help with VB.NET regex

    mash:
    I think your logic is wrong. In Expresso you are looking at the number of matches of a global match. In your code you are only looking for a single match (the first match) and have the message box reporting the number of captures of a single match.

     

    I'm not sure if I understand your message.

    I see I posted this regex in Expresso:

    (?<Line>^(?<KeyNumber>\d+):(?<Value>.*(?:(?=\W$)|\z)))+

    and this other one in VB.NET:

    (?<Line>^(?<KeyNumber>\d+):(?<Value>[ ,\S]*(?:(?=\W$)|\z)))+

    That was a mistake, but the problem is the same with both regexes.

     

    Or maybe you are saying that this line:

    Clases.Groups("Line").Captures.Count
    will only return the first capture, even if all lines are matched? 

     

  •  02-08-2010, 1:05 PM 59416 in reply to 59413

    Re: help with VB.NET regex

     

    MyParser.Match(ExampleText) 

    will return only the first match. You need to use a MatchCollection object and iterate over the collection of matches. MSDN has some VB examples; search for MatchCollection.
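
    A minimal sketch of that suggestion, reusing the pattern and options from the original post. The module name MatchCollectionSketch and the shortened sample text are assumptions made for illustration; Regex.Matches and MatchCollection are the standard System.Text.RegularExpressions members.

```vbnet
Imports System
Imports Microsoft.VisualBasic
Imports System.Text.RegularExpressions

Module MatchCollectionSketch
    Sub Main()
        ' Pattern and options copied from the original post.
        Dim RegexString As String = "(?<Line>^(?<KeyNumber>\d+):(?<Value>[ ,\S]*(?:(?=\W$)|\z)))+"
        Dim MyParser As New Regex(RegexString, _
                         RegexOptions.Compiled Or _
                         RegexOptions.CultureInvariant Or _
                         RegexOptions.ExplicitCapture Or _
                         RegexOptions.IgnoreCase Or _
                         RegexOptions.Multiline Or _
                         RegexOptions.Singleline)

        ' Shortened sample text (assumption: same key:value layout as in the post).
        Dim ExampleText As String = String.Join(vbCrLf, _
            New String() {"1:", "2:", "3:Some Value ", "10:Another"})

        ' Matches returns every match in the input, not just the first one.
        Dim AllMatches As MatchCollection = MyParser.Matches(ExampleText)
        Console.WriteLine("Total number of matches: " & AllMatches.Count.ToString())

        ' Iterate over the collection and read the named groups of each match.
        For Each m As Match In AllMatches
            Console.WriteLine(m.Groups("KeyNumber").Value & " -> [" & m.Groups("Value").Value & "]")
        Next
    End Sub
End Module
```

    The count to check here is AllMatches.Count rather than Groups("Line").Captures.Count, since each line is expected to arrive as its own Match.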

  •  02-08-2010, 1:17 PM 59418 in reply to 59413

    Re: help with VB.NET regex

    My message is this. In your original post you say

    marraco:

    The group "Line" should return each line (10 in total), and it works in Expresso:

    And as your screenshot shows, you are getting 10 matches.

    Then you say

    marraco:

    This code only gets a single "line", instead of 10:

     Now if you look at the code you posted you are calling the "Match" method of the Regex object.

    marraco:

     Dim Clases As Match = MyParser.Match(ExampleText)

     

    Read the documentation for this method. From MSDN, http://msdn.microsoft.com/en-us/library/twcw2f1c.aspx:

    .NET Framework Class Library
    Regex.Match Method (String)

    Updated: October 2008

    Searches the specified input string for the first occurrence of the regular expression specified in the Regex constructor.

     

     

    You'll see that this method returns only the first match.

    Which is what I meant when I said your logic is bad. You've tested with a program doing global matches but coded for a single match, thinking you'll get the same results.

    You'll want to use the "Matches" method for global matches. Please review the documentation as that method returns a different object.
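
    To make the "different object" point concrete, here is a rough sketch contrasting the two calls: Match returns a single Match, while Matches returns a MatchCollection. The pattern is a simplified stand-in (an assumption, not the poster's regex), and the module and variable names are hypothetical. Match.NextMatch is shown as an alternative way to step through the remaining matches.

```vbnet
Imports System
Imports Microsoft.VisualBasic
Imports System.Text.RegularExpressions

Module MatchVersusMatchesSketch
    Sub Main()
        ' Simplified pattern (assumption): one "key:value" line per match.
        Dim Parser As New Regex("^(?<KeyNumber>\d+):(?<Value>[^\r\n]*)", RegexOptions.Multiline)
        Dim Sample As String = String.Join(vbCrLf, New String() {"1:", "2:", "3:some value"})

        ' Match returns one Match object: the first occurrence only.
        Dim FirstMatch As Match = Parser.Match(Sample)
        Console.WriteLine("First match: " & FirstMatch.Value)

        ' Matches returns a MatchCollection holding every occurrence.
        Dim AllMatches As MatchCollection = Parser.Matches(Sample)
        Console.WriteLine("Matches.Count: " & AllMatches.Count.ToString())

        ' Alternative: walk forward from the first match with NextMatch
        ' until Success becomes False.
        Dim Current As Match = FirstMatch
        While Current.Success
            Console.WriteLine(Current.Groups("KeyNumber").Value & " = " & Current.Groups("Value").Value)
            Current = Current.NextMatch()
        End While
    End Sub
End Module
```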


    Michael

    "In theory, theory and practice are the same. In practice, they are not."
    Albert Einstein
  •  02-08-2010, 1:33 PM 59419 in reply to 59416

    Re: help with VB.NET regex

    Sergei Z:

    MyParser.Match(ExampleText) 

    will return only the first match. You need to use a MatchCollection object and iterate over the collection of matches. MSDN has some VB examples; search for MatchCollection.

    Oh. I see. The weird thing is that it worked with this other file format:

    1 -999 -14
    2 -14 -10
    3 -10 -6
    4 -6 -2
    5 -2 2
    6 2 6
    7 6 10
    8 10 14
    9 14 999

    -9999

    and this expression, which is similar:

    (?:(?<Line>^(?<NºClass>\d+)\s+(?<Start>-?\d*(?:[\,\.]?\d+)?)\s+(?<End>-?\d*(?:[\,\.]?\d+)?)\s*\n?$.)*(?<EndOfFile>^-9999\n?.*$))

     

    Nonetheless, I switched to MatchCollection, and it works, so thanks Sergei!

    Have a bit of Latin rock as a reward!

    http://www.youtube.com/watch?v=82CwZl5721M

  •  02-08-2010, 1:40 PM 59422 in reply to 59418

    Re: help with VB.NET regex

    mash:
    ....

    I think your logic is wrong. In Expresso you are looking at the number of matches of a global match. In your code you are only looking for a single match (the first match) and have the message box reporting the number of captures of a single match.

     ...

    You'll see that this method returns only the first match.

    Which is what I meant when I said your logic is bad. You've tested with a program doing global matches but coded for a single match, thinking you'll get the same results.

    You'll want to use the "Matches" method for global matches. Please review the documentation as that method returns a different object.

     

    That clarifies it. I think that with the other file type I only got a single match, but the .Count property was counting the nested captured groups, so I mistook that count for the <Line> .Count.

     

    Thanks for your help. 

    P.S.: This one is for you:

     http://www.youtube.com/watch?v=OIoTCmCdCsA :)
