
How to create a regex expression that Places a space in 30+ words excluding <a href://"URL">

Last post 07-15-2008, 6:54 AM by ddrudik. 14 replies.
  •  07-04-2008, 5:41 AM 43751

    How to create a regex expression that Places a space in 30+ words excluding <a href://"URL">

    Hi guys

    I'm new here and was just wondering if you guys could give me a hand creating a regex expression.


    What I want is a regex that checks whether any word in a comment (handled in PHP) is over 30 characters long, so that it can insert a space into the word and it doesn't muck up my layout!

    This works for the above:

     preg_replace('/([^\s]{30})/', '$1 ', $comment);   -  Places space in 30+ words

     

     

    Then, after it finds a 30+ character word, it should check whether it is a URL so it doesn't mess with the links (<a href://"dont-mess-with-url">mess-with-name-of-link-if-you-want</a>)

    This is what I have so far, but it doesn't seem to work:

    preg_replace('/([^\s]{30})([^<a href:\/\/"][^\.\_a-z\?A-Z0-9\-]*[^">])(?=[^\s])/', '$1 ', $comment);     -   Places space in 30+ words excluding <a href://"URL">

     

    Kind regards

    Fastmode
     

  •  07-04-2008, 10:10 AM 43760 in reply to 43751

    Re: How to create a regex expression that Places a space in 30+ words excluding <a href://"URL">

    <?php
    $string='Then after it finds a 30+ word it then checks if it is a URL so it dosnt mess with the links <a href="http://dont-mess-with-url-dont-mess-with-url-dont-mess-with-url-">mess-with-name-of-link-if-you-want-mess-with-name-of-link-if-you-want</a>';
    function replfunc($match){
      if (preg_match('/<.*?>/s',$match[0])) {
        return $match[0];
      } else {
        return preg_replace('/\S{10}/','\0 ',$match[0]);
      }
    }
    $string=preg_replace_callback('/<.*?>|[^\s<>]{30,}/s','replfunc',$string);
    echo $string;
    ?>

    If you want to allow word wrapping without inserting a visible space, you might consider replacing the space in the replacement string with a <wbr> tag instead.


  •  07-04-2008, 7:21 PM 43769 in reply to 43760

    Re: How to create a regex expression that Places a space in 30+ words excluding <a href://"URL">

     Hey there, and thanks heaps for your post :)

     

    Would you mind explaining what this means?

    function replfunc($match){
      if (preg_match('/<.*?>/s',$match[0])) {
        return $match[0];
      } else {
        return preg_replace('/\S{10}/','\0 ',$match[0]);
      }
    }
    $string=preg_replace_callback('/<.*?>|[^\s<>]{30,}/s','replfunc',$string);

  •  07-04-2008, 8:12 PM 43770 in reply to 43769

    Re: How to create a regex expression that Places a space in 30+ words excluding <a href://"URL">

    preg_replace_callback PHP function:

    http://www.php.net/manual/en/function.preg-replace-callback.php

    The regular expression:

    (?s-imx:<.*?>|[^\s<>]{30,})

    matches as follows:
     
    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?s-imx:                 group, but do not capture (with . matching
                             \n) (case-sensitive) (with ^ and $ matching
                             normally) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      <                        '<'
    ----------------------------------------------------------------------
      .*?                      any character (0 or more times (matching
                               the least amount possible))
    ----------------------------------------------------------------------
      >                        '>'
    ----------------------------------------------------------------------
    |                        OR
    ----------------------------------------------------------------------
      [^\s<>]{30,}             any character except: whitespace (\n, \r,
                               \t, \f, and " "), '<', '>' (at least 30
                               times (matching the most amount possible))
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------
     

    After the initial matches are found, they are passed to the function one by one. The function tests whether the match is a <> block to be ignored; if not, it matches the text in 10-character segments and inserts a space after each segment ("\0 " means the match plus a space character).  Instead of a space character you might consider a replacement such as "\0<wbr>", which avoids inserting visible spaces into the text yet still supports word wrapping at 10-character intervals.
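    To see what the callback actually receives, here is a minimal sketch that lists the matches with preg_match_all. The sample string and the example.com URL are made up for illustration; only the pattern comes from the code above.

```php
<?php
// Made-up sample text: one short word, one tag with a long URL, one long link name.
$sample = 'short <a href="http://example.com/a-very-long-url-that-should-not-be-touched">'
        . 'a-very-long-link-name-that-may-be-broken-up</a>';

// Same pattern as in the reply: either a whole <...> tag, or a run of 30 or more
// characters that are not whitespace, '<' or '>'.
preg_match_all('/<.*?>|[^\s<>]{30,}/s', $sample, $m);

// $m[0] holds every full match in order; each one is what replfunc would get as $match[0].
print_r($m[0]);
```

    With this sample, "short" is skipped (too short and not a tag), the opening tag comes through as one match that the callback returns unchanged, the long link name is the only match that gets broken up, and </a> is matched as a tag as well.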

    Like most solutions, there are other ways to accomplish the same task. Here's a similar option that eliminates the extra preg_match call by using a capture group in the initial preg_replace_callback pattern:

    <?php
    $string='Then after it finds a 30+ word it then checks if it is a URL so it dosnt mess with the links <a href="http://dont-mess-with-url-dont-mess-with-url-dont-mess-with-url-">mess-with-name-of-link-if-you-want-mess-with-name-of-link-if-you-want</a>';
    function replfunc($match){
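      // Group 1 is only present when the tag alternative matched; !empty() avoids an undefined-index notice otherwise
      if (!empty($match[1])) {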
        return $match[0];
      } else {
        return preg_replace('/\S{10}/','\0<wbr>',$match[0]);
      }
    }
    $string=preg_replace_callback('/(<.*?>)|[^\s<>]{30,}/s','replfunc',$string);
    echo $string;
    ?>


  •  07-04-2008, 10:22 PM 43772 in reply to 43770

    Re: How to create a regex expression that Places a space in 30+ words excluding <a href://"URL">

    Thanks!! little bit of learning to do XD

     

    Hey, and one more thing: how can I make it so that IF it matches href=" it does NOT proceed?

    I know that [^abc] won't proceed when a, b or c is in the text, but how can I check that IF href=" is in the text as a WHOLE it does not proceed?

     

    Kind regards

    Fastmode 

  •  07-04-2008, 11:00 PM 43773 in reply to 43772

    Re: How to create a regex expression that Places a space in 30+ words excluding <a href://"URL">

    HTML tags <> are already ignored by the code provided, and that includes <a href...> tags. Please be more specific about your last question.


  •  07-04-2008, 11:15 PM 43774 in reply to 43773

    Re: How to create a regex expression that Places a space in 30+ words excluding <a href://"URL">

    I meant (new question): how can I check whether, let's say, "potatoes" is in a 30-letter word, and IF so leave it alone, but if potatoes isn't in the word, add a space?

    How can I achieve this? [^something] only checks single characters, not a combination of letters (words), e.g. potatoes 

  •  07-04-2008, 11:25 PM 43775 in reply to 43774

    Re: How to create a regex expression that Places a space in 30+ words excluding <a href://"URL">

    In the same question context, assuming you want to ignore <> tags as well as [^\s<>]{30,} strings that contain potatoes:

    A non-regex (stristr PHP function) method:

    <?php
    $string='Then after it finds a 30+ word it then checks if it is a URL so it dosnt mess with the links <a href="http://dont-mess-with-url-dont-mess-with-url-dont-mess-with-url-">something-potatoes-mess-with-name-of-link-if-you-want-mess-with-name-of-link-if-you-want</a>';
    function replfunc($match){
      // stristr is already case-insensitive, so strtolower is not needed; compare strictly against false
      if (!empty($match[1]) || stristr($match[0],'potatoes') !== false) {
        return $match[0];
      } else {
        return preg_replace('/\S{10}/','\0<wbr>',$match[0]);
      }
    }
    $string=preg_replace_callback('/(<.*?>)|[^\s<>]{30,}/s','replfunc',$string);
    echo $string;
    ?>

     


  •  07-04-2008, 11:32 PM 43776 in reply to 43775

    Re: How to create a regex expression that Places a space in 30+ words excluding <a href://"URL">

    OK, so isn't there a way of doing it in one nice line, like this?

    /([^\s<>]{25})(and-dosn't=potatoes)/

  •  07-04-2008, 11:44 PM 43778 in reply to 43776

    Re: How to create a regex expression that Places a space in 30+ words excluding <a href://"URL">

    If you try to do it in a single pattern, such as the following (note that this pattern is quite inefficient):

    (<.*?>)|(?:(?!.*potatoes.*)[^\s<>]){30,}

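    Then you will get matches such as the ones below. The lookahead only rejects positions that still have "potatoes" ahead of them, so the match simply starts one character later, at "otatoes", and the rest of the word is still matched: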

    $matches Array:
    (
        [0] => Array
            (
                [0] => <a href="http://dont-mess-with-url-dont-mess-with-url-dont-mess-with-url-">
                [1] => otatoes-mess-with-name-of-link-if-you-want-mess-with-name-of-link-if-you-want
                [2] => </a>
            )

        [1] => Array
            (
                [0] => <a href="http://dont-mess-with-url-dont-mess-with-url-dont-mess-with-url-">
                [1] =>
                [2] => </a>
            )

    )

    To test these patterns with your text:

    http://www.myregextester.com/?r=230
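    For what it's worth, one way to keep a single pattern from matching the tail of the word is to only let a match begin at the start of a word, and to test the whole word for "potatoes" from there. The following is a minimal sketch under that assumption; the function name break_long_words and the sample string are hypothetical, not part of the code posted earlier, and it uses an anonymous function, so it needs PHP 5.3 or later.

```php
<?php
// Hypothetical helper: breaks words of 30+ characters with <wbr>, but leaves
// tags and any long word containing $skipWord untouched.
function break_long_words($text, $skipWord = 'potatoes') {
    $pattern = '/(<.*?>)'                                        // group 1: a whole tag
             . '|(?<![^\s<>])'                                   // a word may only start after whitespace, < or >, or at the start of the text
             . '(?![^\s<>]*' . preg_quote($skipWord, '/') . ')'  // the word must not contain $skipWord anywhere
             . '[^\s<>]{30,}/si';                                // the long word itself; i mirrors the case-insensitive stristr check
    return preg_replace_callback($pattern, function ($m) {
        if (!empty($m[1])) {
            return $m[0];                                        // tag: return it unchanged
        }
        return preg_replace('/\S{10}/', '\0<wbr>', $m[0]);       // break every 10 characters
    }, $text);
}

$out = break_long_words(
    '<a href="http://x">something-potatoes-mess-with-name-of-link-if-you-want</a> '
    . 'plain-long-word-without-the-vegetable-in-it'
);
echo $out;
```

    Because the lookbehind stops a match from starting in the middle of a word, the "otatoes-..." tail shown above is no longer picked up. Whether this is nicer than the stristr check inside the callback is a matter of taste.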


  •  07-05-2008, 2:41 AM 43779 in reply to 43778

    Re: How to create a regex expression that Places a space in 30+ words excluding <a href://"URL">

     Oh, OK, thanks, it works!! But now I'm trying a different and simpler method and was wondering if you could give me a hand with this as well :)

    How come this doesn't exclude <a href://bla.com"> when I use it?

    /([^\s<>]{30})(?=[^\s])/ 

     

    thankyou 

  •  07-05-2008, 8:56 AM 43783 in reply to 43779

    Re: How to create a regex expression that Places a space in 30+ words excluding <a href://"URL">

    Here's a link to that regex tester again with your new pattern (note the explain checkbox, which will explain your pattern to you in the results window of the tester):

    http://www.myregextester.com/?r=231

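    [^\s<>]{30} doesn't exclude tags by itself; it was only used alongside the tag-matching pattern <.*?>, so that the tag alternative consumes each tag first and the character class only gets to match text outside of <.*?>

    (?=...) is a lookahead: it checks that the following text matches without including it in the match, so it won't help you with this one.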

    In the past an expression such as:

    (?<!<[^<>]*)pattern(?![^<>]*>)

    was recommended to find "pattern" not within <> tags, however since PHP does not support a variable-length (?<!) construct, it will not work for your task.

    I suspect that a preg_replace_callback function method is the simplest option available to you.
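    As a side note, the lookahead half of that expression is fixed-position and does work in PHP. If you are willing to assume that every ">" in the text closes a tag (no stray ">" characters in the comment itself), a sketch like the following skips anything that sits inside a tag. This is an illustration under that assumption, not a replacement for the callback approach.

```php
<?php
$string = '<a href="http://dont-mess-with-url-dont-mess-with-url-dont-mess-with-url-">'
        . 'mess-with-name-of-link-if-you-want-mess-with-name-of-link-if-you-want</a>';

// ([^\s<>]{30})   30 characters that are not whitespace, '<' or '>'
// (?=[^\s<>])     the word must continue, so no trailing space is added at the end of a word
// (?![^<>]*>)     the next angle bracket must not be '>', i.e. we are not inside a tag
//                 (assumption: every '>' in the input closes a tag)
$result = preg_replace('/([^\s<>]{30})(?=[^\s<>])(?![^<>]*>)/', '$1 ', $string);
echo $result;
```

    Here the URL inside the tag is left alone and the link text gets a space after every 30 characters. A comment containing something like "a > b" would break the assumption, which is one reason the callback version is the safer choice.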


  •  07-15-2008, 3:56 AM 44144 in reply to 43783

    Re: How to create a regex expression that Places a space in 30+ words excluding <a href://"URL">

    Lol sorry for being such a noob before hahaha.

    I finally get it now after a little bit of studying and your help :) so I'm using this and it works perfectly!

    <?php
    $string='Then after it finds a 30+ word it then checks if it is a URL so it dosnt mess with the links <a href="http://dont-mess-with-url-dont-mess-with-url-dont-mess-with-url-">mess-with-name-of-link-if-you-want-mess-with-name-of-link-if-you-want</a>';
    function replfunc($match){
      if (preg_match('/<.*?>/s',$match[0])) {
        return $match[0];
      } else {
        return preg_replace('/\S{10}/','\0 ',$match[0]);
      }
    }
    $string=preg_replace_callback('/<.*?>|[^\s<>]{30,}/s','replfunc',$string);
    echo $string;
    ?>

     

    Thanks a million and take care :D

  •  07-15-2008, 6:54 AM 44150 in reply to 44144

    Re: How to create a regex expression that Places a space in 30+ words excluding <a href://"URL">

    To avoid the preg_match function within the replfunc function you can use a capture group ( ) in the preg_replace_callback function:

    <?php
    $string='Then after it finds a 30+ word it then checks if it is a URL so it dosnt mess with the links <a href="http://dont-mess-with-url-dont-mess-with-url-dont-mess-with-url-">mess-with-name-of-link-if-you-want-mess-with-name-of-link-if-you-want</a>';
    function replfunc($match){
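      // Group 1 is only present when the tag alternative matched; !empty() avoids an undefined-index notice otherwise
      if (!empty($match[1])) {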
        return $match[1];
      } else {
        return preg_replace('/\S{10}/','\0 ',$match[0]);
      }
    }
    $string=preg_replace_callback('/(<.*?>)|[^\s<>]{30,}/s','replfunc',$string);
    echo $string;
    ?>

