Hi all,
I want to do something, but I don't know if it is possible. I have an HTML document from which I extract all the text that is inside tags, for example all the text inside a span, p, a, etc.
My regex is: (>)[\r\n\t]*([\w\s&;]+)[\r\n\t]*(<) and it works great. Now I don't want the text of some tags, which differ from the others by an attribute. In the following example, the attribute that distinguishes them is "class":
<body>
<span>Hello, this text is valid.</span>
<div class="NoValid">This text is not valid.</div>
<div>This text is valid.</div>
</body>
Which regex would get all the text except the text of the first div?
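To pin down the result I am after, here is a minimal Python sketch of one possible direction, not a definitive answer. It keeps the regex above and adds a negative lookbehind so that a ">" directly preceded by class="NoValid" is skipped. Python, the helper name extract_texts, and the added "." and "," in the character class (so the punctuated sample sentences match) are all assumptions for illustration. It only works when the class attribute is the last thing in the tag and is written exactly like that; an HTML parser would probably be more robust.

```python
import re

# Sample document from the question.
HTML = """<body>
<span>Hello, this text is valid.</span>
<div class="NoValid">This text is not valid.</div>
<div>This text is valid.</div>
</body>"""

# The original pattern with two assumed changes:
#  - (?<!class="NoValid") rejects a '>' that directly follows that attribute
#  - '.' and ',' are added to the character class so the sample text matches
PATTERN = re.compile(
    r'(?<!class="NoValid")(>)[\r\n\t]*([\w\s&;.,]+)[\r\n\t]*(<)'
)


def extract_texts(html):
    """Return the non-empty text captured between a '>' and the next '<'."""
    texts = [m.group(2).strip() for m in PATTERN.finditer(html)]
    # The pattern also matches the bare newlines between tags; drop those.
    return [t for t in texts if t]


if __name__ == "__main__":
    for text in extract_texts(HTML):
        print(text)
```

With the sample above, this should print the span text and the text of the second div, and skip the div with class="NoValid".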
Thanks in advance.