There are times when a regular expression you’ve written, or one
someone has written for you, needs a little tweaking before you add it to your code,
and the tweaking is required because the syntax of the language conflicts with
your regex. For example, part of
your regex pattern may contain a double quote while the language you are using uses
double quotes as string delimiters. If
you just cut and paste the pattern into your code, the pattern’s quotation mark will
terminate your string prematurely. Now
the usual way to fix it is to escape the quotation mark in the pattern. This solution
requires altering the regex, and how the character is escaped depends on the
language being used. The regex itself
allows you to escape a character with the \ character, but the language being used may or may not
recognize that as an escape character in its own syntax. It may also be confusing later, when you look
at the regex and can’t remember why you escaped a character that the pattern
itself doesn’t need escaped. But there is another way.
Hex values
Most regex implementations support a hex syntax, \x##, where each # is a hex digit.
So if you use \x22 instead of a double quote and \x27 instead of a
single quote, your regexes become more cookie-cutter ready.
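To make the quote substitution concrete, here is a minimal sketch in Python. Python is chosen only for illustration, and the sample text and variable names are made up. Python's re module accepts \x## escapes, so the same pattern can be written with or without literal quotes.

```python
import re

# Hypothetical sample text containing a double-quoted attribute value.
text = 'see <a href="https://example.com">link</a>'

# Literal double quotes in the pattern must be escaped for the language,
# because this string is itself delimited by double quotes.
pattern_escaped = "href=\"([^\"]*)\""

# The same pattern with \x22 in place of each double quote.
# The raw string hands \x22 to the regex engine, which reads it as ".
pattern_hex = r"href=\x22([^\x22]*)\x22"

match_escaped = re.search(pattern_escaped, text)
match_hex = re.search(pattern_hex, text)

print(match_escaped.group(1))  # https://example.com
print(match_hex.group(1))      # https://example.com

# \x27 stands in for a single quote in the same way.
single = re.search(r"\x27([^\x27]*)\x27", "say 'hello' now")
print(single.group(1))         # hello
```

The hex version can be pasted unchanged into a string delimited by either kind of quote, which is the point of the substitution.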
Another useful hex value is \x20, which is a space. This is especially useful in .NET, where
there is a regex option to ignore whitespace in the pattern (RegexOptions.IgnorePatternWhitespace). Turning this option on allows end-of-line comments
in the regex but, except inside a character class, it also ignores spaces typed
into the pattern, which is a problem if a space is part
of the text the pattern has to match. So you could
break a working regex if you later decide to add this option. This happened on
Regexlib when the option was first turned on. A lot of patterns that were written before
the switch was flipped suddenly stopped working.
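The breakage is easy to reproduce. The sketch below uses Python's re.VERBOSE flag as a stand-in for the .NET option; that is an assumption made for illustration, since the two flags behave alike for this purpose but are not the same API. The sample text and variable names are hypothetical.

```python
import re

text = "John Smith"

literal_space = r"(\w+) (\w+)"    # space typed directly into the pattern
hex_space = r"(\w+)\x20(\w+)"     # same pattern with the space written as \x20

# Without the whitespace-ignoring flag, both patterns match.
plain_literal = re.fullmatch(literal_space, text)
plain_hex = re.fullmatch(hex_space, text)

# With the flag on, the typed space is dropped from the pattern,
# so the first version no longer matches; the \x20 version still does.
verbose_literal = re.fullmatch(literal_space, text, re.VERBOSE)
verbose_hex = re.fullmatch(hex_space, text, re.VERBOSE)

print(plain_literal is not None)    # True
print(plain_hex is not None)        # True
print(verbose_literal is not None)  # False
print(verbose_hex is not None)      # True

# The flag is what makes end-of-line comments possible in the pattern.
commented = r"""
    (\w+)   # first name
    \x20    # the separating space, safe from whitespace stripping
    (\w+)   # last name
"""
commented_match = re.fullmatch(commented, text, re.VERBOSE)
print(commented_match.groups())     # ('John', 'Smith')
```

A pattern written with \x20 from the start behaves the same whether or not the flag is later turned on.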
Speaking of .NET, when it comes to named groups you can’t use
the hex notation for the quotes that delimit the group name in the single-quote syntax, (?'name'...). However, you can avoid any issue with single
quotes by using the alternate angle-bracket syntax, (?<name>...).