
Match the first 4 lines

Last post 03-08-2010, 5:54 PM by Aussie Susan. 1 replies.
  •  03-05-2010, 7:58 AM 60402

    Match the first 4 lines

    Hi

    I need a regex that matches only the first 4 lines of a message and ignores the rest. I am using this pattern: .*?:77E:.*:21:(\s\n).*?:32B.* but the problem is that tags 32B and 21 occur more than once in the message, so I keep matching the later occurrences and not the 4 lines I am interested in.

    :20:20100304BM2UNX31
    :12:598
    :77E:
    :21:
    :32B:JJJ0

    {1:F01NEDSZAJJXXXX6128088798}{2:O5981433100304UNEXZAJJAXXX28211270541003041433N}{3:{108:ZZZZZZZZ55067400}}{4:
    :20:20100304BM2UNX31
    :12:598
    :77E:
    :21:
    :32B:JJJ0,
    :52A:XXXXXXJJ
    :58A:XXXXXXJJ
    :33B:JJN196015980
    :23:100304JJJJJJJJJ2001
    :21:100304JJJJJJJJJ2008
    :32B:JJN495599436
    :52A:JJJJJJJJJJ
    :58A:JJJJJJJJJJ
    :33B:JJJ605845740
    :23:100304JJJJJJJJJ2005
    :21:100304JJJJJJJJJ2009
    :32B:ZZZ4418318228
    :52A:JJJJJJJJJJ
    :58A:JJJJJJJJJJ
    :33B:ZZZ15600296642
    :23:100304JJJJJJJJ2011
    :34E:YYY11488240698
    -}

  •  03-08-2010, 5:54 PM 60496 in reply to 60402

    Re: Match the first 4 lines

    Can you please read the posting guidelines in the sticky note at the beginning of this forum, particularly with respect to the regex variant you are using and the output you are expecting.

    I am not at all clear from your description, nor from your example pattern, what you are trying to achieve.

    If you just want the first 4 lines of the text you are presenting, and assuming that each line ends with a 'newline' character, and that the example text is EXACTLY how the input string looks, then

    ^(.*\n){4}

    with no options set will locate the first 4 lines without any regard for what is on those lines. Given your sample text, the output of this pattern is:

    :20:20100304BM2UNX31
    :12:598
    :77E:
    :21:

    If you want to start with the ":77E:" line and get 4 lines in total (that line plus the 3 that follow it) then:

    ^:77E:(.*\n){4}

    with the 'multiline' option set, will locate the 2 instances in your sample.
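
As a rough illustration of the 'multiline' option, the sketch below uses Python's `re.MULTILINE` flag, which lets `^` match at the start of every line. The sample is a shortened version of the text in the question: the long header line is abbreviated to "{4:" and the second block is cut off after a few tags, both purely to keep the example small.

```python
import re

# Shortened sample: the header line is abbreviated to "{4:" (an assumption
# made for brevity) and the second block stops after a few tags.
message = (
    ":20:20100304BM2UNX31\n"
    ":12:598\n"
    ":77E:\n"
    ":21:\n"
    ":32B:JJJ0\n"
    "\n"
    "{4:\n"
    ":20:20100304BM2UNX31\n"
    ":12:598\n"
    ":77E:\n"
    ":21:\n"
    ":32B:JJJ0,\n"
    ":52A:XXXXXXJJ\n"
    ":58A:XXXXXXJJ\n"
)

# re.MULTILINE makes "^" match at the start of each line, not just the string.
pattern = re.compile(r"^:77E:(.*\n){4}", re.MULTILINE)

# finditer + group(0) returns whole matches; findall would return only the group.
blocks = [m.group(0) for m in pattern.finditer(message)]

for block in blocks:
    print(repr(block))
```

Each match covers the ":77E:" line and the 3 lines after it, whatever they contain. In the first block that includes the empty line, because `.*` is happy to match nothing before the newline.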

    If it must start with the ":77E:" line and end with the one that begins ":32B:" on the 4th line after the beginning line, then:

    ^:77E:(.*\n){2}:32B:.*

    (again with the 'multiline' option set) will find the 2 instances where this occurs. BUT NOTE that in your sample text the ":32B:" line is only 2 lines after the ":77E:" line (hence the '{2}' in the pattern); if they really are to be some other fixed number of lines apart then you will need to alter the quantifier value.
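
A small Python sketch of this last pattern may help show why the later ":21:" and ":32B:" tags are left alone. As before, Python is an assumption (the original test was in a .NET tester), and the sample is a trimmed version of the text in the question with the header line abbreviated to "{4:".

```python
import re

# Trimmed sample containing both ":77E:" blocks plus one later
# ":21:" / ":32B:" pair that should NOT be matched.
message = (
    ":20:20100304BM2UNX31\n"
    ":12:598\n"
    ":77E:\n"
    ":21:\n"
    ":32B:JJJ0\n"
    "\n"
    "{4:\n"
    ":20:20100304BM2UNX31\n"
    ":12:598\n"
    ":77E:\n"
    ":21:\n"
    ":32B:JJJ0,\n"
    ":52A:XXXXXXJJ\n"
    ":21:100304JJJJJJJJJ2008\n"
    ":32B:JJN495599436\n"
)

# "^:77E:" pins the start; "(.*\n){2}" eats the rest of that line and the
# ":21:" line; ":32B:.*" then takes the ":32B:" line up to its end.
pattern = re.compile(r"^:77E:(.*\n){2}:32B:.*", re.MULTILINE)

hits = [m.group(0) for m in pattern.finditer(message)]

for hit in hits:
    print(repr(hit))
```

Because every match has to begin at a ":77E:" line, the ":32B:JJN495599436" line further down is never reached by the pattern.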

    I've tested these patterns with a .NET based regex tester. The patterns are fairly general and should work in many regex variants, but this is not guaranteed.

    By the way, you seem to be under a rather common misconception that the pattern needs to account for all characters in the string, because you have a ".*" or variant at the beginning and the end of your pattern. The way the regex engine works is that it will start at the first character in the string and try to match your pattern. If it cannot, then it skips that character and starts the matching process again at the second character. It will carry on like this until it either makes a match or gets to the end of the string (when it declares a failure). Also, the match contains only the characters needed to satisfy the pattern. Therefore the pattern need only specify those characters you actually want. If you look at my patterns they do exactly that. (Note that I have ended the last one with '.*' because you want all of the characters in the line that starts ":32B:" through to the end of the line, so by your own rules those characters belong in the match.)
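
The scanning behaviour described above can be seen in a short Python sketch. This is an illustration under assumptions: Python's `re.search` stands in for the engine's "try each position in turn" behaviour, and the `re.DOTALL` flag on the padded pattern is a guess at how a ".*" wrapper would have to be used to reach across lines.

```python
import re

message = (
    ":20:20100304BM2UNX31\n"
    ":12:598\n"
    ":77E:\n"
    ":21:\n"
    ":32B:JJJ0\n"
)

# Bare pattern: the engine moves through the string until ":32B:" fits,
# so the match starts where the tag starts, not at position 0.
bare = re.search(r":32B:.*", message)
print(bare.start(), repr(bare.group(0)))

# Padded pattern (assumed to run with DOTALL so "." crosses newlines):
# the leading ".*?" and trailing ".*" pull everything into the match.
padded = re.search(r".*?:32B:.*", message, re.DOTALL)
print(padded.start(), repr(padded.group(0)))
```

The bare pattern returns only the ":32B:" line, while the padded one returns the whole message, which is the opposite of what the question asks for.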

    Susan
