
Parse string pattern from text files

Last post 07-14-2008, 9:01 PM by Aussie Susan. 2 replies.
  •  07-14-2008, 11:30 AM 44115

    Parse string pattern from text files

    Hi,


    I am new to regex. I have used it on several occasions and it was extremely helpful for my purposes. Now I need to extract data from a very large log file (30,000 lines and up), then take this information line by line and store it in a data set.

     

    I have the part where I read the file line by line, but I am not sure how to construct my regex search pattern in order to get the info. I am using C#, and here is my input file. Thanks in advance.

     

    Input file:

     

    07.07.2008 00:00:00.168 | guardserv.exe/PROD | 2924 | INFO | 00000000 | Reading License Files in D:\tt\datfiles\.
    07.07.2008 00:00:00.183 | guardserv.exe/PROD | 2924 | INFO | 00000000 | Using License Files in D:\tt\datfiles\ with Update Count: 486 

    07.07.2008 00:37:21.692 | guardserv.exe/PROD | 2924 | INFO | 00000000 | License will expire. Exch:XXXX App:TT_XTRD_PRO-SUBSCR Expiry:15.07.2008
    07.07.2008 00:37:21.692 | guardserv.exe/PROD | 2924 | INFO | 00000000 | License will expire. Exch:XXXX App:TT_XTRD_PRO-SUBSCR Expiry:15.07.2008 

    07.07.2008 00:37:28.443 | guardian.exe/PROD | 3436 | INFO | 00000000 | Universal login for user N1MSPRINGER successful
    07.07.2008 00:37:28.489 | priceproxy.exe/PROD | 2400 | INFO | 00000000 | LoginUser for user TTORDMJMJYN1MSPRINGER addr xxx.xxx.x.xxx accepted.
    07.07.2008 00:37:28.489 | guardian.exe/PROD | 3436 | INFO | 00000000 | Universal login request for user N1MSPRINGER started
    07.07.2008 00:37:28.583 | guardian.exe/PROD | 3436 | INFO | 00000000 | Universal login for user N1MSPRINGER successful 

     

    I only need the line which is bolded (the LoginUser line from priceproxy.exe). My app reads the file line by line in a loop, and when it finds that line I need to parse out the following information:

     1. 07.07.2008 00:37:28.489
     2. priceproxy.exe/PROD
     3. TTORDMJMJYN1MSPRINGER
     4. xxx.xxx.x.xxx

     

    I would really appreciate it if someone could get me started with the regex search pattern.

     

    Here is how I loop through the lines in c#:

    Regex rC;  // the search pattern I still need to write

    foreach (Match m in rC.Matches(val))
    {
        // read the captured fields from m here
    }

     

     

     

  •  07-14-2008, 12:56 PM 44127 in reply to 44115

    Re: Parse string pattern from text files

    It would be ideal if I could also have them in separate match groups.

     

    Like in the following:

     

    string name = @"(?<Name>[\w-]*)\s*";
    string app = @"App:\s*(?<App>[\w-]*)\s*";
    string trader = @"Trader:\s*(?<Trader>([\w-]*\s)+)";
    string ul = @"UL:\s*(?<UL>[\w-]*)\s*";
    string workstation = @"Workstation:\s*(?<Workstation>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s*";
    string server = @"Server:\s*(?<Server>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})";
    Regex rC = new Regex(name + app + trader + ul + workstation + server);

  •  07-14-2008, 9:01 PM 44137 in reply to 44115

    Re: Parse string pattern from text files

    If I've understood correctly, then the pattern:

    ^(?<DateTime>[\d.]+\s[\d.:]+)
    \s+\|\s+
    (?<App>[\w./]+)
    \s+\|\s+
    (?<Trader>\d+)
    \s+\|\s+
    (?<UL>\w+)
    \s+\|\s+
    (?<Zeros>\d+)
    \s+\|\s+
    loginuser\sfor\suser\s
    (?<Username>\S+)
    \s+addr\s+
    (?<IP>\S+)
    \saccepted\.$

    will find both the specific line and its component parts. You will probably want the 'ignore case' option, and you will need to either put the whole pattern on a single line or set the 'ignore pattern whitespace' option. I've left it on multiple lines so that you can see how it hangs together. I'm not sure I completely understand your first reply, in that I can't match some of the fields in the line with the patterns you have suggested, but hopefully you should be able to take this and turn it into what you want.
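As a rough sketch of how the pattern above might be applied to one line at a time in C#: the class and method names (`LoginLineParser`, `TryParse`) are made up for illustration, and the choice of which four groups to return is an assumption based on the fields listed in the question. `RegexOptions.IgnorePatternWhitespace` is used here so the pattern can stay on multiple lines.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class LoginLineParser
{
    // The pattern from this post, left on multiple lines.
    // IgnorePatternWhitespace discards the line breaks and indentation;
    // IgnoreCase lets "loginuser" match "LoginUser".
    private static readonly Regex LoginLine = new Regex(@"
        ^(?<DateTime>[\d.]+\s[\d.:]+)
        \s+\|\s+
        (?<App>[\w./]+)
        \s+\|\s+
        (?<Trader>\d+)
        \s+\|\s+
        (?<UL>\w+)
        \s+\|\s+
        (?<Zeros>\d+)
        \s+\|\s+
        loginuser\sfor\suser\s
        (?<Username>\S+)
        \s+addr\s+
        (?<IP>\S+)
        \saccepted\.$",
        RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);

    // Returns true and fills fields with date/time, app, user name and address
    // when the line is a LoginUser line; otherwise returns false.
    public static bool TryParse(string line, out string[] fields)
    {
        Match m = LoginLine.Match(line);
        if (!m.Success)
        {
            fields = new string[0];
            return false;
        }

        fields = new[]
        {
            m.Groups["DateTime"].Value,
            m.Groups["App"].Value,
            m.Groups["Username"].Value,
            m.Groups["IP"].Value
        };
        return true;
    }
}
```

Inside the existing line-by-line loop, each line would be passed to `TryParse`; lines that return false can simply be skipped, and the four values of a matching line can be copied into a row of the data set.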

    In its present form, it can actually be used to parse the whole file in one go, and it will select just those lines that fit the required pattern. If you do this, you will need the 'multiline' option set so that "^" and "$" match at each line. Depending on the line endings used in the file, the last line of the pattern may also need "\r?" before the "$" to account for the carriage-return character.
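A minimal sketch of the whole-file variant, assuming the log has already been read into one string (for example with `File.ReadAllText`). The names `LoginLogScanner` and `Scan` are hypothetical. The pattern is the same one written on a single line, with `\r?` added before `$` so that it works with both CR-LF and LF line endings.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class LoginLogScanner
{
    // Multiline makes ^ and $ match at every line of the input.
    // In .NET, $ matches just before '\n', so \r? absorbs a carriage return.
    private static readonly Regex LoginLines = new Regex(
        @"^(?<DateTime>[\d.]+\s[\d.:]+)\s+\|\s+(?<App>[\w./]+)\s+\|\s+(?<Trader>\d+)\s+\|\s+(?<UL>\w+)\s+\|\s+(?<Zeros>\d+)\s+\|\s+" +
        @"loginuser\sfor\suser\s(?<Username>\S+)\s+addr\s+(?<IP>\S+)\saccepted\.\r?$",
        RegexOptions.IgnoreCase | RegexOptions.Multiline);

    // Returns one string[] per matching line: date/time, app, user name, address.
    public static List<string[]> Scan(string logText)
    {
        var rows = new List<string[]>();
        foreach (Match m in LoginLines.Matches(logText))
        {
            rows.Add(new[]
            {
                m.Groups["DateTime"].Value,
                m.Groups["App"].Value,
                m.Groups["Username"].Value,
                m.Groups["IP"].Value
            });
        }
        return rows;
    }
}
```

Whether this is preferable to the line-by-line loop probably depends on how large the file is, since this approach holds the whole log in memory at once.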

    Susan 
