Eric Wise posted a small snippet, IsEmailAddress, that stars a long-winded regular expression to determine whether an input string is in fact an email address. With it, IsEmailAddress(“try.sending.this.email@____________________________________________.zzzzz”) would return True for the suspect email.
I don’t want this to come across as flaming Eric, but I do feel it’s important to make a few points. They’re not all aimed at Eric: the first one is certainly relevant to his snippet, but the rest are more general rants.
1st, if you really want to use the email address for sending email, there are far better ways to do this. I’ve blogged about this before but it’s worth repeating: validating every new email address and parsing your bounce logs is a sure-fire way to satisfy the requirement that the email address works and is at least accessible to the registering user (whether they own it or not is a discussion for some other day and some other blog). Validation is probably the best method, but if you’re not into it, you can try polling the SMTP server for the domain to see if the account exists. That might tell you the address is valid, but not necessarily that mail sent to it will reach the registering user.
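To make the validation idea concrete, here is a minimal sketch in Python, assuming "validation" means mailing the new address a token and only trusting the address once the user sends that token back. The names `SECRET_KEY`, `make_token`, and `confirm` are hypothetical, and using an HMAC for the token is just one reasonable choice, not something the original snippet or any particular library prescribes. Actually sending the mail and parsing bounce logs are left out.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real site would load this from
# configuration so tokens survive a restart.
SECRET_KEY = secrets.token_bytes(32)


def make_token(email: str) -> str:
    """Return a hex token tied to this exact address.

    The token would be embedded in a link mailed to the address.
    """
    digest = hmac.new(SECRET_KEY, email.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def confirm(email: str, token: str) -> bool:
    """True only if the token is the one generated for this address."""
    expected = make_token(email).encode("ascii")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, token.encode("utf-8"))
```

If the user comes back with a token that `confirm` accepts, you know the mail was delivered and somebody with access to that mailbox read it, which is more than any pattern match can tell you.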
2nd, regular expressions are a tool, and like any other tool they are only appropriate for certain situations. This isn’t to say you can’t go out of your way to use regex to solve problems at considerable expense to yourself, just for the sake of using regex. Religious zealots abound: look at the junior programmer in the next cube who just learned about the Facade pattern and is trying to use it on every project he encounters, right or wrong. Knowing the tools isn’t enough; knowing when it’s appropriate to use which tool is far more important.
3rd, if you’re really interested in what a valid email address is, go read RFC 822, browse down to page 7, and start looking through the lexical symbols. If you think it’s an easy thing to do in a regex, just check out the number of varying results on regex lib and you’ll see pages of tries. (And no, I’m not endorsing regex lib; I think it’s a fairly bad idea to insert a bit of code when you have no idea what it’s saying and just hope that it always works.) If you want to see what the real regex looks like, go here: http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html. Even that doesn’t tell you if the email exists, only that it meets the standard.
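As a small illustration of that last point, here is a sketch in Python with a deliberately naive pattern. The pattern and the name `is_email_address` are made up for this example; this is not Eric’s regex and it is nowhere near RFC 822. It just shows how happily a pattern match accepts the suspect address from the top of this post.

```python
import re

# Hypothetical, deliberately naive pattern: "something@something.letters".
# Note that \w also matches underscores, so a domain made of them passes.
NAIVE_EMAIL = re.compile(r"^[\w.+-]+@[\w.-]+\.[A-Za-z]{2,}$")


def is_email_address(candidate: str) -> bool:
    """True if the string merely looks like an email address."""
    return NAIVE_EMAIL.match(candidate) is not None


suspect = "try.sending.this.email@____________________________________________.zzzzz"
print(is_email_address(suspect))  # prints True
```

The function answers "does this look like an address?", which is a different question from "will mail sent here reach anyone?".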