
RegEx to match all words except some

Last post 12-12-2007, 7:28 PM by ddrudik. 3 replies.
  •  12-12-2007, 12:10 PM 37600

    RegEx to match all words except some

    Hi,

      I need a regex that matches a string made up of words (only letters of the English alphabet, upper and lower case, and digits) but excludes words like MS, IB, DS, etc.

    The programming language is Java. I tried \\w+(?:MS|IB|DS). I would appreciate your help.

     

    regards,

    RB
     

  •  12-12-2007, 12:17 PM 37603 in reply to 37600

    Re: RegEx to match all words except some

    To allow "words" you would need to allow spaces. Do you want to allow spaces in the string?

    If yes:

    (?!.*\b(?:MS|IB|DS)\b.*)^[a-zA-Z0-9 ]+$

    If no:

    (?!.*(?:MS|IB|DS).*)^[a-zA-Z0-9]+$

    Note that \w also matches the underscore character, in case you don't want to allow that.
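
Since the asker is working in Java, here is a minimal sketch of how the "spaces allowed" pattern above might be used to validate a whole string. The class name WholeStringCheck and the method isAllowed are made up for illustration; only the pattern comes from the post. Backslashes are doubled because the pattern is written as a Java string literal.

```java
import java.util.regex.Pattern;

class WholeStringCheck {
    // The "spaces allowed" pattern from the post, escaped for a Java string literal.
    // The negative lookahead rejects any string containing MS, IB or DS as a whole word.
    static final Pattern ALLOWED =
            Pattern.compile("(?!.*\\b(?:MS|IB|DS)\\b.*)^[a-zA-Z0-9 ]+$");

    // Hypothetical helper: true if the entire string passes the check.
    static boolean isAllowed(String s) {
        return ALLOWED.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("hello world 42")); // true
        System.out.println(isAllowed("I use MS tools")); // false: contains the word MS
        System.out.println(isAllowed("MSX is fine"));    // true: MSX is not the word MS
    }
}
```

This accepts or rejects the string as a whole; it does not pick out individual words.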


  •  12-12-2007, 6:12 PM 37619 in reply to 37603

    Re: RegEx to match all words except some

    ddrudik,

    I think your pattern will match the word but only if it is the only one on the line (the ^ and $ ensure that).

    The way I read the OP's question is that, given a text string such as

    I dont like the way that MS uses the IB structure

    the goal is to locate each individual word EXCEPT MS and IB. If so, something like

    (^|\s)(?!MS|IB|DS)[a-zA-Z0-9]+$?

    will result in the required matches.

    Susan
     

  •  12-12-2007, 7:28 PM 37620 in reply to 37619

    Re: RegEx to match all words except some

    If that's the case then possibly a modification of the pattern to:

    \b(?!(?:[MD]S|IB)\b)[a-zA-Z0-9]+\b

    I am unable to get a match with:

    (^|\s)(?!MS|IB|DS)[a-zA-Z0-9]+$?

    Although I see where you were going with that.

    That pattern will ignore words with underscores in them. If the asker wants to include those words too:

    \b(?!(?:[MD]S|IB)\b)\w+\b

    Words with apostrophes will also be excluded unless you modify the pattern to allow for them:

    \b(?!(?:[MD]S|IB)\b)[\w']+\b

    Test String:

    This is a test of a pattern's efficacy excluding DS, MS, and IB but not IBB or BIB.

    Results:

    Array
    (
        [0] => Array
            (
                [0] => This
                [1] => is
                [2] => a
                [3] => test
                [4] => of
                [5] => a
                [6] => pattern's
                [7] => efficacy
                [8] => excluding
                [9] => and
                [10] => but
                [11] => not
                [12] => IBB
                [13] => or
                [14] => BIB
            )
    
    )
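
The results above look like PHP output. For the asker's Java, a rough equivalent might look like the sketch below. The class name WordFilter and the method extractWords are made up for illustration; the pattern is the apostrophe-aware one from this post, with backslashes doubled for a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordFilter {
    // Matches a word unless the whole word is MS, DS or IB.
    // The \b inside the lookahead is what lets IBB through while blocking IB.
    static final Pattern WORD =
            Pattern.compile("\\b(?!(?:[MD]S|IB)\\b)[\\w']+\\b");

    // Hypothetical helper: collects every matching word in order of appearance.
    static List<String> extractWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        String test = "This is a test of a pattern's efficacy excluding "
                + "DS, MS, and IB but not IBB or BIB.";
        // Prints each matched word on its own line.
        for (String w : extractWords(test)) {
            System.out.println(w);
        }
    }
}
```

Run against the test string, this should yield the same fifteen words as the array above.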
    

     

