
Match accented Latin graphemes with PHP regex

Last post 03-12-2010, 2:50 AM by alexhux. 4 replies.
  •  03-11-2010, 3:52 AM 60684

    Match accented Latin graphemes with PHP regex

    Hi! I'm writing a short series of regexes to validate user input in a web form, using PHP.

    For example, I need to check that the string given by the user contains only lowercase and uppercase letters (A-Za-z), blank spaces, and accented letters like àèéìòù. I can't figure out how to do this. I guess I must use the \x{0000} syntax in a range, but I can't get it working.

    Any help would be greatly appreciated. Thanks a lot!

  •  03-11-2010, 10:53 AM 60775 in reply to 60684

    Re: Match accented Latin graphemes with PHP regex

    The following pattern will match the characters you've listed:

    ^[A-Za-z\x20àèéìòù]*$

    http://regexadvice.com/blogs/mash/archive/2008/01/31/A-touch-of-Character-Class.aspx discusses how the character class works.
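For readers who want to try this pattern from PHP, here is a minimal sketch. It assumes the script file and the input are both UTF-8 encoded, which is why the `u` modifier is added; the function name `isAllowedText` is hypothetical and not part of the original reply.

```php
<?php
// Hypothetical helper: true if $s consists only of ASCII letters, spaces,
// and the six accented letters from the question. The empty string is
// accepted because the pattern uses *.
function isAllowedText(string $s): bool
{
    // u: treat both the pattern and the subject as UTF-8, so each accented
    // letter in the class is one character rather than two separate bytes.
    return preg_match('/^[A-Za-z\x20àèéìòù]*$/u', $s) === 1;
}

var_dump(isAllowedText('Niccolò è qui')); // bool(true)
var_dump(isAllowedText('Señor'));         // bool(false): ñ is not in the class
```

Note that in PCRE, `$` also matches just before a single trailing newline. If that matters for validation, adding the `D` modifier makes `$` match only at the very end of the subject.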


    Michael

    "In theory, theory and practice are the same. In practice, they are not."
    Albert Einstein
  •  03-11-2010, 11:46 AM 60781 in reply to 60775

    Re: Match accented Latin graphemes with PHP regex

    I've found another way to do this, because sometimes writing the character to match directly doesn't work.

    I've fixed that problem with this regex:

     ^[A-Za-z\x{00C0}-\x{00FF} ]{2,30}$

    It matches all uppercase and lowercase letters, all the accented letters in the Latin-1 Supplement range (U+00C0 to U+00FF), and the space character, with no fewer than 2 and no more than 30 characters.
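A minimal sketch of how this range pattern might be used from PHP, assuming the input is UTF-8. The post does not say which modifiers were used; the `u` modifier below is an assumption, and `isValidName` is a hypothetical name.

```php
<?php
// Hypothetical helper built around the range pattern from the post.
function isValidName(string $s): bool
{
    // u: pattern and subject are treated as UTF-8, so \x{00C0}-\x{00FF}
    // means the code points U+00C0..U+00FF and {2,30} counts characters
    // instead of bytes.
    return preg_match('/^[A-Za-z\x{00C0}-\x{00FF} ]{2,30}$/u', $s) === 1;
}

var_dump(isValidName('Niccolò')); // bool(true)
var_dump(isValidName('à'));       // bool(false): one character, minimum is 2
var_dump(isValidName('R2D2'));    // bool(false): digits are not in the class
```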

  •  03-11-2010, 1:04 PM 60783 in reply to 60781

    Re: Match accented Latin graphemes with PHP regex

    alexhux:

    I've found another way to do this, because sometimes writing the character to match directly doesn't work.

    I've fixed that problem with this regex:

     ^[A-Za-z\x{00C0}-\x{00FF} ]{2,30}$ which matches all upper/lowercase letters + all Latin extended accented letters and a blank space (not less than 2 characters and not more than 30)

    Actually, writing the character directly should always work as far as the regex is concerned. However, you do have to escape metacharacters and adjust for the host language's syntax rules.

    Ranges, however, are easier to maintain.

    Just FYI: my reply only covered the characters you listed, so this may not matter for your solution, which accepts more characters. However, the range you are using includes two mathematical symbols, and my impression is that you only want letters. If so, you'll want to exclude U+00D7 (×, the multiplication sign) and U+00F7 (÷, the division sign).
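As an illustration of the escaping point, here is a small sketch that builds a character class from literal characters and lets `preg_quote` escape any that are special to the regex engine. The function name `buildClassPattern` and the sample character list are assumptions made for this example; UTF-8 source and input are assumed as well.

```php
<?php
// Hypothetical helper: builds a pattern that accepts ASCII letters, spaces,
// and whatever extra literal characters the caller supplies.
function buildClassPattern(string $extraChars): string
{
    // preg_quote escapes regex metacharacters such as - . and ] so they are
    // taken literally inside the class; '/' is passed because it is the
    // delimiter used below. Accented letters pass through unchanged.
    return '/^[A-Za-z ' . preg_quote($extraChars, '/') . ']*$/u';
}

$pattern = buildClassPattern("àèéìòù-'.]");

var_dump(preg_match($pattern, "D'Annunzio-Rossi") === 1); // bool(true)
var_dump(preg_match($pattern, 'a[b') === 1);              // bool(false): [ was not listed
```

The pattern is kept in a single-quoted PHP string so that PHP itself does not interpret backslashes or `$` before the regex engine sees them.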
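One way to leave out those two symbols is to split the range into three pieces so that U+00D7 and U+00F7 fall into the gaps. This is a sketch under the same assumptions as before (UTF-8 input, `u` modifier); `isLatin1LetterText` is a hypothetical name.

```php
<?php
// Hypothetical helper: the U+00C0..U+00FF range split into three pieces,
// skipping U+00D7 (multiplication sign) and U+00F7 (division sign).
function isLatin1LetterText(string $s): bool
{
    $pattern = '/^[A-Za-z\x{00C0}-\x{00D6}\x{00D8}-\x{00F6}\x{00F8}-\x{00FF} ]{2,30}$/u';
    return preg_match($pattern, $s) === 1;
}

var_dump(isLatin1LetterText('Günther Ørsted')); // bool(true)
var_dump(isLatin1LetterText('a × b'));          // bool(false): × is excluded
var_dump(isLatin1LetterText('a ÷ b'));          // bool(false): ÷ is excluded
```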


    Michael

  •  03-12-2010, 2:50 AM 60806 in reply to 60783

    Re: Match accented Latin graphemes with PHP regex

    You're right: the range I'm using contains more than just accented letters. Maybe I need to use a more specific range.

    I really only need the accented letters used in the Italian alphabet, so just a few more letters than the ones I specified, but being able to match other accented letters as well would be useful.
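Both options could be sketched as follows. The Italian letter set used here (à è é ì ò ù in both cases) is an assumption for illustration, not a list confirmed in this thread, and both function names are hypothetical. The broader variant uses the Unicode properties `\p{L}` (any letter) and `\p{M}` (combining marks), which accept letters from every script, not only Latin ones.

```php
<?php
// Hypothetical helper: ASCII letters, space, and an assumed set of Italian
// accented letters in lowercase and uppercase.
function isItalianText(string $s): bool
{
    return preg_match('/^[A-Za-zàèéìòùÀÈÉÌÒÙ ]{2,30}$/u', $s) === 1;
}

// Hypothetical helper: any Unicode letter or combining mark, plus space.
function isAnyLetterText(string $s): bool
{
    return preg_match('/^[\p{L}\p{M} ]{2,30}$/u', $s) === 1;
}

var_dump(isItalianText('Città'));    // bool(true)
var_dump(isItalianText('Müller'));   // bool(false): ü is not in the Italian set
var_dump(isAnyLetterText('Müller')); // bool(true)
var_dump(isAnyLetterText('a × b'));  // bool(false): × is a symbol, not a letter
```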
