Grabbing Data From Medical Form

Last post 05-07-2007, 9:23 AM by eSquire. 3 replies.
  •  05-06-2007, 12:23 PM 29499

    Grabbing Data From Medical Form

    I am using FileMaker Pro with a PHP plugin (http://www.scodigo.com/).

    Sample Data (fixed length):
      Acq On    : 18 Feb 2007  22:39                       Operator: FJN

      Sample    : Jones 1925 10                            Inst    : GC_MS


         1) 5aAndrostan-3a,17a-diol     11.24  241   186712   2500.00         0.02

        27) Stigmasterol                26.97  394     9158   2500.00 ng      0.02

        30) Cholesteryl Butyrate        30.59  368    23219   2500.00         0.02

    Goal:
    I want to get the data following the labels "Acq On:", "Operator:", etc.
    Acq On: = 18 Feb 2007  22:39
    Operator: = FJN

    I would also like to grab the data following "1)", "27)", "30)", etc.

    27) = Stigmasterol                26.97  394     9158   2500.00 ng      0.02

    I also need to parse the data following "27)".

    Possibly use a pattern count on any digit preceding a ")"?

    Is there a way to get x number of characters following the pattern match?
    Is there a way to get x number of characters following the pattern match to the end of line?


    Thanks,

    -Jeff

  •  05-06-2007, 7:19 PM 29506 in reply to 29499

    Re: Grabbing Data From Medical Form

    Okay. I'll try to help you and get the data without any surrounding whitespace.

    Here's your example once again in a monospace font:

      Acq On    : 18 Feb 2007  22:39                       Operator: FJN
      Sample    : Jones 1925 10                            Inst    : GC_MS

         1) 5aAndrostan-3a,17a-diol     11.24  241   186712   2500.00         0.02
        27) Stigmasterol                26.97  394     9158   2500.00 ng      0.02
        30) Cholesteryl Butyrate        30.59  368    23219   2500.00         0.02

    First of all, a pattern to get the text from the labels:

    Acq On    : (\d{1,2}\s\w+\s\d{4}\s+\d+:\d+)\s+Operator: (\S*)\s+Sample    : (.*?)(?=  )\s+Inst    : (\S*)

    Then a pattern for the xx) ... lines:

    ^ *(\d+)\) (.*?)(?=  | \d) +(\d+\.\d+) +(\d+) +(\d+) +(\d+\.\d+) +([a-z]*) +(\d+\.\d+)

    Use it with the multiline (m) modifier, so that ^ matches at the start of every line.

    I'm assuming you are using spaces, not tabs, in the text. I didn't know what else might show up in the ng (nanogram?) column, so I just allowed a string of lowercase Latin letters. If a line has no unit in that column, the respective match will simply be an empty string.

    Oh, and my best wishes to Mr or Mrs Jones: get well soon (I guess?)!

  •  05-06-2007, 11:18 PM 29509 in reply to 29506

    Re: Grabbing Data From Medical Form

    Thanks again for your quick response. I apologize for not explaining myself better. The data I extract will be placed in a database, so I need to match each pattern separately. If I could see an example that grabs x characters after the pattern, or everything up to the end of the line, I think I can get this done.

    Thanks again,
    -Jeff
  •  05-07-2007, 9:23 AM 29514 in reply to 29509

    Re: Grabbing Data From Medical Form

    Are you saying that you (or Scodigo for that matter) can't use the match array as the return value?

    If you're using preg_match($pattern, $subject, $matches) for the header, $matches[1] should contain the acquisition date, $matches[2] the operator, $matches[3] the sample name and $matches[4] the "Inst" (who/whatever that is).
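    As a rough sketch of that header lookup: the sample text is pasted in as a string and the header pattern from the earlier post is wrapped in / delimiters. The array keys (acquired, operator, sample, inst) are only placeholder names for illustration, not anything FileMaker or the plugin requires.

```php
<?php
// Header block from the sample data (fixed width, spaces not tabs).
$subject = "  Acq On    : 18 Feb 2007  22:39                       Operator: FJN\n"
         . "\n"
         . "  Sample    : Jones 1925 10                            Inst    : GC_MS\n";

// Header pattern from the earlier post, split over two strings for readability.
$pattern = '/Acq On    : (\d{1,2}\s\w+\s\d{4}\s+\d+:\d+)\s+Operator: (\S*)\s+'
         . 'Sample    : (.*?)(?=  )\s+Inst    : (\S*)/';

$header = [];
if (preg_match($pattern, $subject, $matches) === 1) {
    // $matches[0] is the whole match; the groups start at index 1.
    // The keys below are hypothetical field names.
    $header = [
        'acquired' => $matches[1], // "18 Feb 2007  22:39"
        'operator' => $matches[2], // "FJN"
        'sample'   => $matches[3], // "Jones 1925 10"
        'inst'     => $matches[4], // "GC_MS"
    ];
}

foreach ($header as $field => $value) {
    echo $field, ' = ', $value, "\n";
}
```

    Each value can then be written to its own database field, so the header needs only one match.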

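    If you use preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER) with the second pattern you should get a two-dimensional array. $matches[0] will contain the data from the first line, $matches[1] the data from the second line and so on. To get the data columns you would use something like $matches[0][1] (the number in front of the parenthesis on the first line), $matches[0][2] (the name of the substance on the first line) and so on. Without the PREG_SET_ORDER flag the array is grouped the other way round (the default is PREG_PATTERN_ORDER), so the same two values would be in $matches[1][0] and $matches[2][0].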
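    A minimal sketch of the row lookup, assuming the PREG_SET_ORDER flag is passed so that each element of $matches is one data line. The three sample lines are pasted in as a string; the keys in $rows are made-up column names, since the form itself does not label these columns.

```php
<?php
// The three numbered lines from the sample data (fixed width, spaces not tabs).
$subject = "     1) 5aAndrostan-3a,17a-diol     11.24  241   186712   2500.00         0.02\n"
         . "    27) Stigmasterol                26.97  394     9158   2500.00 ng      0.02\n"
         . "    30) Cholesteryl Butyrate        30.59  368    23219   2500.00         0.02\n";

// Row pattern from the earlier post, with the multiline (m) modifier.
$pattern = '/^ *(\d+)\) (.*?)(?=  | \d) +(\d+\.\d+) +(\d+) +(\d+) +(\d+\.\d+) +([a-z]*) +(\d+\.\d+)/m';

// PREG_SET_ORDER: $matches[0] is the first line, $matches[1] the second, ...
$count = preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);

$rows = [];
foreach ($matches as $m) {
    // Hypothetical column names; only 'no', 'name' and 'unit' are obvious from the form.
    $rows[] = [
        'no'   => (int) $m[1],
        'name' => $m[2],
        'col3' => $m[3],
        'col4' => $m[4],
        'col5' => $m[5],
        'col6' => $m[6],
        'unit' => $m[7], // "" when the line has no unit
        'col8' => $m[8],
    ];
}

echo $count, " lines matched\n";
echo $rows[1]['no'], ': ', $rows[1]['name'], ' (', $rows[1]['unit'], ")\n";
```

    The return value of preg_match_all is the number of lines matched, which also answers the question about counting the "digit followed by )" entries.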

    If you really just want to match a fixed number of characters, or everything up to the end of the line, use these patterns with the multiline modifier (m) switched on and without(!) the singleline modifier (s).

    Match a number of characters (in this example 18). This will catch trailing whitespace as well!

    Acq On    : .{18}
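
    A small variation on that pattern, as a sketch: wrapping .{18} in a capture group keeps the label out of the result, and rtrim() removes the padding. It uses the "Sample" label, where 18 characters include trailing spaces; the width of 18 is just the example value from above, not a known field width of the form.

```php
<?php
$line = "  Sample    : Jones 1925 10                            Inst    : GC_MS";

$raw = null;
$sample = null;

// Capture exactly 18 characters after the label; the group leaves the label out.
if (preg_match('/Sample    : (.{18})/', $line, $m) === 1) {
    $raw = $m[1];          // 18 characters, still padded with spaces
    $sample = rtrim($raw); // "Jones 1925 10"
}

echo '[', $raw, "]\n";
echo '[', $sample, "]\n";
```

    If a line has fewer than 18 characters after the label, .{18} does not match at all; .{1,18} would take whatever is there.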

    Match to the end of line (without newline or carriage return characters or trailing whitespace):

    Operator: (.*?)(?=\s*$)
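
    The same idea wrapped in a helper, as a sketch only. field_to_eol() is a made-up name, and the sample text has trailing spaces and a Windows line ending added on purpose to show that they are left out of the result.

```php
<?php
// Sample header with trailing spaces and a \r\n line ending on the first line.
$subject = "  Acq On    : 18 Feb 2007  22:39                       Operator: FJN   \r\n"
         . "  Sample    : Jones 1925 10                            Inst    : GC_MS\n";

// Hypothetical helper: returns the text between "<label> " and the end of
// that line, without trailing whitespace, or null if the label is not found.
function field_to_eol(string $label, string $subject): ?string
{
    // preg_quote() escapes regex metacharacters in the label.
    $pattern = '/' . preg_quote($label, '/') . ' (.*?)(?=\s*$)/m';
    return preg_match($pattern, $subject, $m) === 1 ? $m[1] : null;
}

$operator = field_to_eol('Operator:', $subject);  // "FJN"
$inst     = field_to_eol('Inst    :', $subject);  // "GC_MS"
$missing  = field_to_eol('Comment:', $subject);   // null, no such label

echo $operator, "\n", $inst, "\n";
```

    This only suits the last field on a line: for "Acq On    :" it would also return the padding and the "Operator: FJN" part, so the fixed-length pattern is the better fit there.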