Michael Ash's Regex Blog

Regex Musings

Making your regex code ready.

There are times when a regular expression you've written, or that someone has written for you, needs a little tweaking before you add it to your code, because the syntax of the language conflicts with your regex. For example, part of your regex pattern may contain a double quote while the language you are using uses double quotes as string delimiters. If you just cut and paste the pattern into your code, the pattern's quotation mark will terminate your string prematurely. The usual way to fix this is to escape the quotation mark in the pattern. This solution requires altering the regex, and how the character is escaped depends on the language being used. The regex itself allows you to escape a character with the \ character, but the language being used may or may not recognize that as an escape character in its own syntax. It may also be confusing later, when you look at the regex and can't remember why you escaped a character that the pattern itself doesn't need escaped. But there is another way: hex values.

Most regex implementations support a hex syntax, \x##, where each # is a hex digit.

So if you use \x22 instead of a double quote and \x27 instead of a single quote, your regexes become much closer to cut-and-paste ready.
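As a rough illustration, here is a minimal Python sketch of the idea. Python and its re module are my choice for the example, not something the post depends on, and the pattern and sample text are made up. It writes the same pattern twice: once with language-level escaping of the quotes, and once with \x22 and \x27, which needs no escaping at all.

```python
import re

# Match a value wrapped in either double or single quotes.

# 1. Language-level escaping: the \" is there only to keep Python's
#    string literal from ending early; the regex engine never sees the backslash.
escaped = "[\"']([^\"']*)[\"']"

# 2. Hex notation: \x22 is a double quote and \x27 is a single quote.
#    The raw string passes the backslashes through to the regex engine,
#    which resolves the hex escapes itself.
hexed = r"[\x22\x27]([^\x22\x27]*)[\x22\x27]"

# Hypothetical sample input, for illustration only.
text = 'name="regex" lang=\'py\''

print(re.findall(escaped, text))  # ['regex', 'py']
print(re.findall(hexed, text))    # ['regex', 'py']
```

The hex version contains no quote characters, so it can be pasted into a string delimited by either kind of quote without being edited.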

 

Another useful hex value is \x20, which is a space. This is especially useful in .NET, where a regex can be given the RegexOptions.IgnorePatternWhitespace option. Turning this option on allows end-of-line comments in the regex, but it also makes the engine ignore spaces typed into the pattern (except inside a character class), which is a problem if a space is part of what the pattern needs to match. So you could break a working regex if you later decide to add this option. This happened on RegexLib when the option was first turned on: a lot of patterns that were written before the switch was flipped suddenly stopped working.
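The same failure can be reproduced outside .NET. The sketch below uses Python's re.VERBOSE flag as a stand-in; it is a different engine, but it also ignores unescaped whitespace in the pattern and allows end-of-line comments. The patterns and test strings are invented for the example.

```python
import re

# A pattern with a literal typed-in space.
fragile = r"hello world"

# Without the flag, the space matches a space as expected.
assert re.search(fragile, "hello world")

# With re.VERBOSE the typed space is ignored, so the pattern
# now behaves like "helloworld" and the original match is lost.
assert re.search(fragile, "hello world", re.VERBOSE) is None
assert re.search(fragile, "helloworld", re.VERBOSE)

# Writing the space as \x20 survives the flag being turned on or off.
robust = r"hello\x20world"
assert re.search(robust, "hello world")
assert re.search(robust, "hello world", re.VERBOSE)

# It also leaves room for comments once the flag is on.
commented = r"""
    hello   # first word
    \x20    # exactly one space
    world   # second word
"""
assert re.search(commented, "hello world", re.VERBOSE)
```

A pattern written with \x20 from the start behaves the same whether or not the whitespace-ignoring option is later switched on.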

 

Speaking of .NET, when it comes to named groups you can't use the hex notation in place of the quotes in the single-quote syntax, (?'name'...). However, you can avoid any issue with single quotes by using the alternate angle-bracket syntax, (?<name>...).

Published Wednesday, May 18, 2005 11:44 AM by mash
