Dear all, I have been given a task to learn regular expressions and use them to split a long string into sentences and save them.
A sentence can end with a full stop ".", a question mark "?" or an exclamation mark "!". (For now I don't need to care about line breaks.)
I am using VB.NET and wrote the expression "([^\.\?\!]*)[\.\?\!]", then ran some tests.
The input string for testing is "I love icecream. The icecream is £5.99? I found it in www.yahoo.com! "
The result is that I got 6 sentences, given below:
sentence 1,---------I love icecream.
sentence 2,---------The icecream is £5.
sentence 3,---------99?
sentence 4,---------I found it in www.
sentence 5,---------yahoo.
sentence 6,---------com!
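To make this first result easy to reproduce, here is a minimal sketch in Python rather than my VB.NET code (that swap is only for brevity; the pattern string is exactly the one quoted above). The helper name find_pieces is made up for illustration, and I am assuming the leading spaces are trimmed for display, as in the listing above.

```python
import re

# Test string and first pattern, copied from the post.
TEXT = "I love icecream. The icecream is £5.99? I found it in www.yahoo.com! "
FIRST_PATTERN = r"([^\.\?\!]*)[\.\?\!]"


def find_pieces(pattern, text):
    """Return every full match of pattern in text, trimmed for display."""
    return [m.group(0).strip() for m in re.finditer(pattern, text)]


first_result = find_pieces(FIRST_PATTERN, TEXT)
for number, piece in enumerate(first_result, start=1):
    print(f"sentence {number},---------{piece}")
```

Every ".", "?" or "!" ends a match, because the pattern never looks at what comes after the mark.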
The problem is that it treats a period as the end of a sentence even when the period is in the middle of the sentence. I then changed my expression to "([^\.\?\!]*)[\.\?\!]\s", trying to split only where a stop mark is followed by a space.
With this I got 3 sentences (the count is correct), given below:
sentence 1,---------I love icecream.
sentence 2,---------99?
sentence 3,---------com!
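But some of them are not complete sentences: the text before the inner period is lost (for example "The icecream is £5." is missing from sentence 2).
Can anyone help me?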
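Here is a small sketch, again in Python rather than VB.NET, that reproduces the second result and also tries one possible direction. The candidate pattern and the names find_pieces, SECOND_PATTERN and CANDIDATE_PATTERN are only my assumptions for illustration, not a confirmed solution. The idea is that the body "[^\.\?\!]*" can never contain a stop mark, so a match cannot start before an inner period; the candidate lets the body contain any character and only ends a sentence when the mark is followed by whitespace or the end of the string.

```python
import re

# Test string and second pattern, copied from the post.
TEXT = "I love icecream. The icecream is £5.99? I found it in www.yahoo.com! "
SECOND_PATTERN = r"([^\.\?\!]*)[\.\?\!]\s"

# Hypothetical candidate: start at a non-space character, take as little
# text as possible, and stop at a mark followed by whitespace or the end.
CANDIDATE_PATTERN = r"\S.*?[.?!](?=\s|$)"


def find_pieces(pattern, text):
    """Return every full match of pattern in text, trimmed for display."""
    return [m.group(0).strip() for m in re.finditer(pattern, text)]


second_result = find_pieces(SECOND_PATTERN, TEXT)
candidate_result = find_pieces(CANDIDATE_PATTERN, TEXT)

for label, pieces in (("second", second_result), ("candidate", candidate_result)):
    print(label)
    for number, piece in enumerate(pieces, start=1):
        print(f"sentence {number},---------{piece}")
```

On this test string the candidate keeps "£5.99" and "www.yahoo.com" inside their sentences. It would still split after abbreviations such as "Mr. Smith", and "." does not match a line break by default, so it is only a starting point.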
Thanks