OK. Stick with me here. It is going to take me a few minutes to explain this.
I am writing a small application in VB.NET. The ultimate task is to make a web browser within a web browser. Basically, you open IE, Firefox, Safari, or whatever browser you like and go to the web application I am building. At the top of the web page, I have a toolbar, including a text box for the URL. The user types in the web page they want to go to, hits Enter, and the web form posts to itself. Then, on the postback to the server, my code looks similar to the code below.
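' Fetch the raw HTML of the page the user typed into the address box.
Dim _WebClient As New System.Net.WebClient
Dim _UTF8 As New System.Text.UTF8Encoding
Dim _URL As String = Me.txtAddress.Text
Dim _Content As String
' Send a browser-like User-Agent header with the request.
_WebClient.Headers.Add("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.0.3705)")
' Decode the downloaded bytes as UTF-8 (this assumes the page is UTF-8 encoded).
_Content = _UTF8.GetString(_WebClient.DownloadData(_URL))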
This works great to get the HTML of the page the user wants to see. The next step is to replace the href values of all links and the src values of all images. What I need is a little direction on the regular expressions to achieve this. Basically, I want every src and href attribute to point to one single form within my application. It will be something like http://intranet/web/?http://www.google.com/search?hl=en&q=news
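To make the goal concrete, here is a minimal sketch of the kind of rewrite described above. It is only an illustration under several assumptions: the class name LinkRewriter is hypothetical, the proxy prefix is taken from the example URL, only quoted href and src values are handled, and a regular expression is not a full HTML parser, so unusual markup may slip through. It also differs from the example URL in one respect: the target address is escaped with Uri.EscapeDataString so that its own ? and & characters do not get mixed up with the proxy page's query string, which means the receiving page would have to unescape it.

```vbnet
Imports System
Imports System.Text.RegularExpressions

' Hypothetical helper; the class name and the proxy prefix are assumptions.
Public Class LinkRewriter
    Private Const ProxyPrefix As String = "http://intranet/web/?"

    ' Matches href="..." or src='...'. Only quoted attribute values are handled.
    Private Shared ReadOnly AttributePattern As New Regex( _
        "\b(?<attr>href|src)\s*=\s*(?<quote>[""'])(?<url>.*?)\k<quote>", _
        RegexOptions.IgnoreCase Or RegexOptions.Singleline)

    ' Address of the page that was fetched; relative links resolve against it.
    Private ReadOnly _baseUri As Uri

    Public Sub New(ByVal pageUrl As String)
        _baseUri = New Uri(pageUrl)
    End Sub

    ' Returns the HTML with every matched href/src pointed at the proxy page.
    Public Function Rewrite(ByVal html As String) As String
        Return AttributePattern.Replace(html, AddressOf RewriteMatch)
    End Function

    Private Function RewriteMatch(ByVal m As Match) As String
        Dim absolute As Uri = Nothing
        ' Resolve relative URLs against the fetched page; leave unparsable values alone.
        If Not Uri.TryCreate(_baseUri, m.Groups("url").Value, absolute) Then
            Return m.Value
        End If
        ' Only proxy web links; skip javascript:, mailto:, and similar schemes.
        If absolute.Scheme <> Uri.UriSchemeHttp AndAlso absolute.Scheme <> Uri.UriSchemeHttps Then
            Return m.Value
        End If
        Dim quote As String = m.Groups("quote").Value
        ' Rebuild the attribute with the escaped target appended to the proxy prefix.
        Return m.Groups("attr").Value & "=" & quote & ProxyPrefix & _
               Uri.EscapeDataString(absolute.AbsoluteUri) & quote
    End Function
End Class
```

With the fetch code above, usage would be along the lines of _Content = New LinkRewriter(_URL).Rewrite(_Content). Resolving against the fetched page's address matters because many pages use relative links such as /images/logo.gif, which would otherwise end up pointing at the intranet server.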
Now, you might be asking, "Why is he doing this?" The answer is simple. My client has never allowed any of his employees to have web access. (Unbelievable, I know.) He has finally accepted the fact that sometimes an employee needs access to the web to research something. Still, he does not want to open it up completely. He wants me to build an extension to the existing intranet application that the employees use all day. This will work just as I described above. I have the application working, except I still need to get the href attribute values to point to the application I am building instead of the actual web page I fetched. Same thing with the src attribute values for the images. Now, I could do this with a bunch of string parsing, but I know that regular expressions are the best way of accomplishing my goal.
Sorry if I provided way more information than needed, or if I oversimplified my question. I just wanted to make sure everyone knew what I was trying to achieve.
Please help as soon as possible. I am past my due date. Thanks in advance.