VBScript Regular Expressions are, as in other languages, a very powerful tool, allowing you to find and manipulate patterns within strings easily and quickly. The syntax for them can often be a headache, but once you are familiar with them, you will find them invaluable.
One potential application for regular expressions in VBScript is to process text entered into a website via a form. Normal text replacement will allow you to filter out swear words and highlight specific phrases, but regular expressions allow you to go further. The example below demonstrates how to use them to turn any valid email addresses or URLs into clickable links programmatically.
The function itself is easy to call, like so:
strTextToProcess = create_links(strTextToProcess)
The create_links function makes use of another function, also included below, called "ereg_replace". This is a simple function to make regular expression text replacement easier, and you can find out more about it in my article about VBScript Regular Expressions.
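function create_links(strText)
    ' Pad with a space so that a link at the very start of the text is matched
    strText = " " & strText
    ' Absolute URLs with a scheme, such as http://...
    strText = ereg_replace(strText, "(^|[\n ])([\w]+?://[^ ,""\s<]*)", "$1<a href=""$2"">$2</a>")
    ' Bare www. and ftp. addresses, linked with an http:// prefix
    strText = ereg_replace(strText, "(^|[\n ])((www|ftp)\.[^ ,""\s<]*)", "$1<a href=""http://$2"">$2</a>")
    ' Email addresses, linked with mailto:
    strText = ereg_replace(strText, "(^|[\n ])([a-z0-9&\-_.]+?)@([\w\-]+\.([\w\-\.]+\.)*[\w]+)", "$1<a href=""mailto:$2@$3"">$2@$3</a>")
    ' Remove the padding space added above
    strText = right(strText, len(strText)-1)
    create_links = strText
end function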
function ereg_replace(strOriginalString, strPattern, strReplacement)
    ' Function replaces pattern with replacement
    dim objRegExp : set objRegExp = new RegExp
    objRegExp.Pattern = strPattern
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    ereg_replace = objRegExp.replace(strOriginalString, strReplacement)
    set objRegExp = nothing
end function
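As a quick illustration of how ereg_replace and its $1-style backreferences behave, here is a minimal sketch. It assumes the script is saved as a .vbs file and run with cscript under Windows Script Host, so it prints with WScript.Echo rather than response.write; the date string is made up for the example.

```vbscript
Option Explicit

' Same helper as in the article: replace every match of strPattern in
' strOriginalString with strReplacement, ignoring case.
Function ereg_replace(strOriginalString, strPattern, strReplacement)
    Dim objRegExp : Set objRegExp = New RegExp
    objRegExp.Pattern = strPattern
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    ereg_replace = objRegExp.Replace(strOriginalString, strReplacement)
    Set objRegExp = Nothing
End Function

' $1, $2 and $3 in the replacement refer to the first, second and third
' parenthesised groups of the pattern (illustrative input, not from the article).
WScript.Echo ereg_replace("2004-07-12", "(\d+)-(\d+)-(\d+)", "$3/$2/$1")
' Prints: 12/07/2004
```

The create_links function relies on the same mechanism: $1 puts back the whitespace that preceded the link, and $2 (plus $3 for email addresses) carries the matched address into the anchor tag.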
Finally, here is a demonstration of the above code in action:
strTextToProcess = "This simple pair of functions, from http://www.addedbytes.com, will take any text and convert valid URLs and email addresses into clickable links. Problems, feedback and suggestions should be sent to dave@addedbytes.com or posted in the comments section, which you can reach through the link below."
strTextToProcess = create_links(strTextToProcess)
response.write strTextToProcess
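The three lines above will output the following HTML, which a browser renders with the URL and the email address as clickable links:
This simple pair of functions, from <a href="http://www.addedbytes.com">http://www.addedbytes.com</a>, will take any text and convert valid URLs and email addresses into clickable links. Problems, feedback and suggestions should be sent to <a href="mailto:dave@addedbytes.com">dave@addedbytes.com</a> or posted in the comments section, which you can reach through the link below.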
23 Comments
Hi jack daniels.. I tried your code to make URLs and email addresses clickable automatically. It's working, and it's nice code, but I have one more problem to discuss: how do I open the hyperlinks in a blank document or a new page? Can we discuss this?
#1, matroxjr, Malaysia, 12 July 2004.
Good question. The first ereg_replace call in the create_links() function has a small part that says "$1<a href=""$2"">$2</a>". Just replace that with "$1<a target=""_blank"" href=""$2"">$2</a>" (note the doubled quotes, which VBScript needs inside a string).
#2, Dave Child, United Kingdom, 12 July 2004.
Hi Dave!
Great script! But there's a little bug that I don't think is intended...
If I post an url on a new row, like this:
http://nuss.se/
The url doesn't get "urlified"! The regexp seems to require a space before the url.
#3, Nisse, Sweden, 16 July 2004.
Curious. Thanks, Nisse, for pointing that out. I've updated the code slightly so it should work for any whitespace, including new lines (it should have done before, strangely).
#4, Dave Child, United Kingdom, 16 July 2004.
Thanks for a great script. However, I have a problem: if the URL is at the end of a sentence, the dot is included in the link, which leads to a faulty link.
#5, Joop, Sweden, 11 October 2004.
Yes - unfortunately, that is tricky to fix, as periods are legitimate characters in URLs. I'm working on a better version of this function though, and fixing that bug is one of the aims of the new version.
#6, Dave Child, United Kingdom, 12 October 2004.
Quoting #4 (Dave Child, 16 July 2004): "Curious. Thanks, Nisse, for pointing that out. I've updated the code slightly so it should work for any whitespace, including new lines (it should have done before, strangely)."
I still seem to have this problem.. What should I change to make it work?
#7, Bas, Netherlands, 18 February 2005.
Try replacing "\n" with "\r\n". That will likely do the trick if the script isn't working.
#8, Dave Child, United Kingdom, 19 February 2005.
Thanks for the quick reaction.. Sorry for my slow reaction..
But it doesn't work.. Something else I could try to make the links work when they are the first (and only thing) on a line?
#9, Bas, Netherlands, 3 March 2005.
The \r and \n stuff should match new lines and whitespace in VBScript. Let me do a little testing though and I'll see if I can work out a definitive solution.
#10, Dave Child, United Kingdom, 4 March 2005.
great thanks :)
I'll check to see if you found something.. :) Thanks for the effort!
#11, Bas, Netherlands, 5 March 2005.
Hi Dave.
Your script just saved my life! GREAT JOB!
About the "." bug, I've inserted the following line near the end of the first function, just before the create_links = strText line:
strText = replace(strText, ".""", """.")
It's not perfect, but at least it lets the href work.
Tks
Bruno
#12, Bruno, Brazil, 26 August 2005.
Here's a more complete version of Bruno's solution:
' Remove unwanted punctuation at the end of the links (both the HREF and the text inside the anchor tag).
DIM UNWANTED_PUNCTUATION
UNWANTED_PUNCTUATION = "\?<\[\].,!"
strText = ereg_replace(strText, "[" & UNWANTED_PUNCTUATION & "]+""", """")
strText = ereg_replace(strText, "([" & UNWANTED_PUNCTUATION & "]+)</a>", "</a>$1")
This has worked on all of the text I have tested so far.
#13, Brian Hanifin, United States, 22 February 2006.
The script works great, thank you! One issue I've found is that URLs after carriage returns don't get parsed.
For example:
--- start example ---
www.yahoo.com is a great search portal.
www.google.com is great search engine.
--- end example ---
www.yahoo.com gets parsed, but www.google.com does not.
Any suggestions?
#14, Mike, United States, 3 April 2006.
Anyone know what languages this solution will work for?
#15, Em, Ireland, 12 July 2006.
complete code newbie :( I think this is exactly what i need but how do I make it work on a text list?
#16, RADCOM, United Kingdom, 4 June 2007.
I realize that this is really old at this point, but it helped me greatly. The way that I solved the problem of the period at the end of a link was to add this line of code near the bottom of the create_links function.
strText = right(strText, len(strText)-1)
' *** New line
strText = Replace(strText, "." & chr(34) & ">", chr(34) & ">")
' *** End new line
create_links = strText
This looks through the textfield and replaces all instances of
.">
with just
">
effectively correcting and removing the trailing period from all of your links.
#17, Stephen Collins, United States, 12 August 2008.
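An alternative to cleaning up the trailing period after the fact is to stop the pattern from matching it in the first place. The sketch below is one possible approach, not the author's: it takes the first pattern from create_links and requires the URL to end in a character that is not sentence punctuation. The function name link_urls_trimmed and the exact set of excluded characters (. ! ? ; :) are assumptions made for this example, and it assumes the script is run with cscript under Windows Script Host.

```vbscript
Option Explicit

' Same helper as in the article.
Function ereg_replace(strOriginalString, strPattern, strReplacement)
    Dim objRegExp : Set objRegExp = New RegExp
    objRegExp.Pattern = strPattern
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    ereg_replace = objRegExp.Replace(strOriginalString, strReplacement)
    Set objRegExp = Nothing
End Function

' Hypothetical variant of the first replacement in create_links.
' The extra class at the end of the URL group, [^ ,""\s<.!?;:], forces the
' match to end on a character that is not . ! ? ; or :, so the regex engine
' backtracks past any trailing punctuation and leaves it outside the link.
Function link_urls_trimmed(ByVal strText)
    strText = " " & strText
    strText = ereg_replace(strText, _
        "(^|[\n ])([\w]+?://[^ ,""\s<]*[^ ,""\s<.!?;:])", _
        "$1<a href=""$2"">$2</a>")
    ' Drop the padding space added above
    link_urls_trimmed = Mid(strText, 2)
End Function

WScript.Echo link_urls_trimmed("See http://www.addedbytes.com/.")
' Prints: See <a href="http://www.addedbytes.com/">http://www.addedbytes.com/</a>.
```

Periods inside the URL are unaffected; only punctuation at the very end is left out. A URL that genuinely ends in one of the excluded characters would be cut short by one character, which is the trade-off of this approach.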
Very helpful! Thanks so much. It works great for me, except for when the URL is preceded by a parenthesis (a closing parenthesis is fine). E.g.:
www.dell.com --> Works.
Dell (www.dell.com) --> Doesn't work.
I've tried adding both parentheses symbols to the regex code to no avail. Any thoughts?
#18, Eric, United States, 15 August 2008.
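One likely reason that adding parentheses has no effect is that they need to go in two places: the opening parenthesis in the class that precedes the link, and both parentheses in the class of characters that end it. The sketch below shows this for the www pattern only. The function name link_www is made up for this example, and the script assumes cscript under Windows Script Host.

```vbscript
Option Explicit

' Same helper as in the article.
Function ereg_replace(strOriginalString, strPattern, strReplacement)
    Dim objRegExp : Set objRegExp = New RegExp
    objRegExp.Pattern = strPattern
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    ereg_replace = objRegExp.Replace(strOriginalString, strReplacement)
    Set objRegExp = Nothing
End Function

' Hypothetical variant of the second replacement in create_links.
' "(" is added to the characters allowed before a link, and "(" and ")" are
' added to the characters that end a link.
Function link_www(ByVal strText)
    strText = " " & strText
    strText = ereg_replace(strText, _
        "(^|[\n (])((www|ftp)\.[^ ,""\s<()]*)", _
        "$1<a href=""http://$2"">$2</a>")
    ' Drop the padding space added above
    link_www = Mid(strText, 2)
End Function

WScript.Echo link_www("Dell (www.dell.com) sells computers.")
' Prints: Dell (<a href="http://www.dell.com">www.dell.com</a>) sells computers.
```

The cost is that an address which itself contains parentheses will be truncated at the first one, so this is only worth doing if such addresses are rare in the text being processed.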
Thanks very much for this.
One thing to check, if you are having the "doesn't work when url is on a new line" problem (as I was): I replaced \n with \r\n as suggested above and still had the problem, then realised that before sending the text to the script, I was doing:
text = replace(text,vbcrlf,"<br>")
Placing this code AFTER the call to create_links(textsource) did the trick.
Thanks again
#19, Neil, United Kingdom, 16 August 2008.
Great script! The problem with links not being converted when they sit on a line of their own, or at the beginning or end of a line, is that the text being converted could be HTML, so the < and > of start and end tags need to be allowed for.
Try adding <> to the character class, giving [\n<> ]. I've tried this and it works.
#20, Neil Horton, United Kingdom, 9 December 2008.
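The new-line suggestions in this thread can be combined by letting the class before a link accept any whitespace (\s covers spaces, tabs, \r and \n) as well as the > that closes a tag such as <br>. This is a sketch for the www pattern only, under the assumption that the input does not already contain anchor tags (a www address directly after a > inside an existing link would be wrapped a second time). The function name link_www_any_break is made up for this example, and the script assumes cscript under Windows Script Host.

```vbscript
Option Explicit

' Same helper as in the article.
Function ereg_replace(strOriginalString, strPattern, strReplacement)
    Dim objRegExp : Set objRegExp = New RegExp
    objRegExp.Pattern = strPattern
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    ereg_replace = objRegExp.Replace(strOriginalString, strReplacement)
    Set objRegExp = Nothing
End Function

' Hypothetical variant of the second replacement in create_links.
' [\s>] replaces [\n ] so that a link is recognised after any whitespace
' character or straight after the closing > of a tag.
Function link_www_any_break(ByVal strText)
    strText = " " & strText
    strText = ereg_replace(strText, _
        "(^|[\s>])((www|ftp)\.[^ ,""\s<]*)", _
        "$1<a href=""http://$2"">$2</a>")
    ' Drop the padding space added above
    link_www_any_break = Mid(strText, 2)
End Function

Dim strSample
strSample = "www.yahoo.com is a great search portal.<br>www.google.com is a great search engine."
WScript.Echo link_www_any_break(strSample)
' Prints: <a href="http://www.yahoo.com">www.yahoo.com</a> is a great search portal.<br><a href="http://www.google.com">www.google.com</a> is a great search engine.
```

The same change to the leading class can be made in the other two patterns of create_links.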
Thanks - this is great. For me it did not work for email addresses of the form mailto:test@test.com
Any thoughts?
#21, Charles Heaps, United States, 6 March 2009.
It works great for me, though I made a few enhancements to the original code. I was using the old-fashioned way to achieve this, until I found this article. Thanks Dave!
#22, Mexabet, Australia, 7 March 2009.
Hi Dave -- Thank you for posting your script, perfect.
#23, Miles, United Kingdom, 14 January 2013.