
Need an expression to find all text between numeric identifiers

  •  05-13-2008, 11:05 AM

    Hello, I am using C# in Visual Studio 2008.  I have a text file of the entire Bible that I want to either load into a database or write to an XML document.  The format of the text is shown below, with my notes written as C# single-line comments to the right of the actual text (example: Genesis  //book name).

    GENESIS                      // I am going to capture these as literal strings as part of a loop.

    1:1 In the beginning...      // Chapters and verses are formatted as chapter number:verse number,
                                 // with the text following. I can capture the chapter/verse identifier
                                 // with the expression [0-9]{1,3}:[0-9]{1,3} without any problem.

    1:1 In the beginning...      // My problem is capturing the text BETWEEN two chapter/verse
    1:2 And blah blah....        // identifiers, even if there are multiple lines between verses.
    1:3 More blah, blah....      // For example, I just want to capture the text between 1:2 and 1:3.

    Sample Text:

    Genesis

    The Creation of the World
       1:1 In the beginning God created the heavens and the earth.
       1:2 Now the earth was without shape and empty, and darkness was over the surface of the watery deep, but the Spirit of God was moving over the surface of the water. 1:3 God said, "Let there be light." And there was light!

    Exodus

    1:1 These are the names of the sons of Israel who entered Egypt – each man with his household entered with Jacob: 1:2 Reuben, Simeon, Levi, and Judah, 1:3 Issachar, Zebulun, and Benjamin,

    Desired Result:

    Genesis                      // Will capture the book name first
    1:1                          // Will capture the chapter/verse identifier second
    Text between verses          // Will capture the text between verses third (multiline)

    All of this will be divided into fields in a database or XML document.  I just need help constructing an expression that captures all of the text between two chapter/verse identifiers.  I know that I will have to use nested loops or switch statements to get the data where it belongs.

    Randy H. Johnson
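
    One possible approach, offered as a rough sketch rather than a definitive answer: match the chapter/verse identifier, then lazily match everything up to (but not including) the next identifier or the end of the input, using a lookahead and RegexOptions.Singleline so that "." also matches line breaks. The class name VerseSplitter and the method name Split are made up for illustration. The sketch assumes the text has already been divided by book, and that no verse text itself contains something shaped like "3:16" followed by whitespace. A section heading that sits between two verses would be attached to the preceding verse and would need separate handling.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of any library.
public static class VerseSplitter
{
    // "id":   the chapter:verse identifier from the original post.
    // "text": everything after it, matched lazily, up to the next
    //         identifier (which must be followed by whitespace) or end of input.
    private static readonly Regex VersePattern = new Regex(
        @"(?<id>[0-9]{1,3}:[0-9]{1,3})\s+(?<text>.*?)(?=\s*[0-9]{1,3}:[0-9]{1,3}\s|\s*\z)",
        RegexOptions.Singleline); // Singleline lets '.' match newlines

    // Returns (identifier, verse text) pairs in the order they appear.
    // Assumption: bookText holds the text of a single book.
    public static List<KeyValuePair<string, string>> Split(string bookText)
    {
        var verses = new List<KeyValuePair<string, string>>();
        foreach (Match m in VersePattern.Matches(bookText))
        {
            // Collapse line breaks and runs of whitespace into single spaces.
            string text = Regex.Replace(m.Groups["text"].Value, @"\s+", " ").Trim();
            verses.Add(new KeyValuePair<string, string>(m.Groups["id"].Value, text));
        }
        return verses;
    }

    public static void Main()
    {
        string genesis =
            "The Creation of the World\n" +
            "   1:1 In the beginning God created the heavens and the earth.\n" +
            "   1:2 Now the earth was without shape and empty,\n" +
            "   and darkness was over the surface of the watery deep. " +
            "1:3 God said, \"Let there be light.\" And there was light!\n";

        foreach (KeyValuePair<string, string> verse in Split(genesis))
        {
            Console.WriteLine("Genesis | " + verse.Key + " | " + verse.Value);
        }
        // Prints:
        // Genesis | 1:1 | In the beginning God created the heavens and the earth.
        // Genesis | 1:2 | Now the earth was without shape and empty, and darkness was over the surface of the watery deep.
        // Genesis | 1:3 | God said, "Let there be light." And there was light!
    }
}
```

    The heading before 1:1 is skipped because the pattern only starts matching at an identifier. Each pair could then be written to a database row or an XML element together with the current book name.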