"Regular expressions" are two words that can bring many to their knees, weeping, but they are not as scary as some would have you believe. With their roots in Perl, regular expressions in VBScript use a similar syntax, so you may already be familiar with the concepts here if you have played with regular expression matching before.
Below, you will find three sections. The first section, Reference, is a simple reference listing the most-used of the various symbols and characters used in regular expressions. The second section, Functions, has two functions in it that may make life easier for you. The third section, Examples, is where the fun begins - examples of regular expressions in action.
Reference
Character Sets and Grouping
- . - Any single character (except new line character, "\n")
- [] - Matches any single character from the enclosed set
- [^] - Matches any single character not in the enclosed set
- [A-Z] - Any upper case letter between A and Z
- [a-z] - Any lower case letter between a and z
- [0-9] - Any digit from 0 to 9
- () - Group section. A group can then be back-referenced in a replacement string with $1 to $n, where n is the number of groups
- | - Or. (ab)|(bc) will match "ab" or "bc"
Repetition
- + - One or more
- * - Zero or more
- ? - Zero or one
- {5} - Exactly five
- {1,3} - One to three
- {2,} - Two or more
Positioning
- ^ - Start of string
- $ - End of string
- \b - Word boundary (start or end of a word)
- \n - New line
- \r - Carriage return
Miscellaneous
- \ - Escape character
- \t - Tab
- \s - Any single white space character
- \w - Any single word character (equivalent of [A-Za-z0-9_])
Please note that the backslash escape character mentioned above only works inside a regular expression pattern; it is not an escape character in normal VBScript strings. Regular expression syntax is based upon Perl regular expression syntax. To escape a double quote inside a VBScript string, you double it. For example, the following will print out 'This is a "quoted" piece of text.'.
response.write("This is a ""quoted"" piece of text.")
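The two kinds of escaping can appear in the same line, which is where confusion tends to creep in. The following is a minimal sketch, not part of the functions below; the sample text and the variable names objRegExp, strText and blnFound are made up for illustration. The doubled quotes are VBScript string escaping, while the backslash before the period is regular expression escaping.

```vbscript
' Sketch: match a double-quoted phrase that is followed by a literal period.
Dim objRegExp, strText, blnFound
Set objRegExp = New RegExp

' VBScript turns each "" into a single quote, so the regular
' expression engine receives the pattern: "[^"]*"\.
objRegExp.Pattern = """[^""]*""\."
objRegExp.Global = False

' The string value is: She said "hello world". Then she left.
strText = "She said ""hello world"". Then she left."

' blnFound is True, because "hello world". appears in the text
blnFound = objRegExp.Test(strText)
Set objRegExp = Nothing
```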
Functions
The first of the functions below, ereg (named after the PHP function to keep me from going quite quite mad), is the one you will probably use most. Simply put, if you feed in a string, pattern, and choose whether or not you would like to ignore the case of letters in either, the function will return TRUE if the string contains the pattern, or FALSE if not.
function ereg(strOriginalString, strPattern, varIgnoreCase)
    ' Function matches pattern, returns true or false
    ' varIgnoreCase must be TRUE (match is case insensitive) or FALSE (match is case sensitive)
    dim objRegExp : set objRegExp = new RegExp
    with objRegExp
        .Pattern = strPattern
        .IgnoreCase = varIgnoreCase
        .Global = True
    end with
    ereg = objRegExp.test(strOriginalString)
    set objRegExp = nothing
end function
Next up we have ereg_replace. Like its shorter cousin, you need to feed it a string and a pattern, and choose your case sensitivity. This time, you must also add a replacement. This function will replace all instances of the pattern in the string with the replacement (if you change ".Global = True" to ".Global = False" then the function will only replace the first instance of the pattern).
function ereg_replace(strOriginalString, strPattern, strReplacement, varIgnoreCase)
    ' Function replaces pattern with replacement
    ' varIgnoreCase must be TRUE (match is case insensitive) or FALSE (match is case sensitive)
    dim objRegExp : set objRegExp = new RegExp
    with objRegExp
        .Pattern = strPattern
        .IgnoreCase = varIgnoreCase
        .Global = True
    end with
    ereg_replace = objRegExp.replace(strOriginalString, strReplacement)
    set objRegExp = nothing
end function
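To see what the ".Global" setting changes in practice, here is a minimal sketch that uses the RegExp object directly rather than ereg_replace. The sample string and the variable names strAll and strFirst are made up for illustration.

```vbscript
' Sketch: the same replacement with .Global switched on and then off.
Dim objRegExp, strAll, strFirst
Set objRegExp = New RegExp
objRegExp.Pattern = "[0-9]+"      ' one or more digits
objRegExp.IgnoreCase = True

' Global = True: every run of digits is replaced
objRegExp.Global = True
strAll = objRegExp.Replace("a1b22c333", "#")      ' "a#b#c#"

' Global = False: only the first run of digits is replaced
objRegExp.Global = False
strFirst = objRegExp.Replace("a1b22c333", "#")    ' "a#b22c333"

Set objRegExp = Nothing
```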
Examples
Example 1: Checking hexadecimal string
A hexadecimal number can be made up of any digit, and any letter, upper or lower case, between a and f, inclusive. So to check if a string is actually hexadecimal, the following will do quite nicely (strOriginalString is the original string to be tested):
<%
if ereg(strOriginalString, "[^a-f0-9\s]", True) = True then
    response.write "String is not hexadecimal."
else
    response.write "String is hexadecimal."
end if
%>
The pattern, "[^a-f0-9\s]", matches anything that is not in the set of characters specified (so if there is anything in the string that is not in that set, the function will return True). The characters specified are the digits 0 to 9 and all letters between a and f inclusive, and we've specified a case insensitive match, so upper case letters will be treated the same way. We are also allowing whitespace (new lines, spaces, carriage returns and tabs), which is what the "\s" represents in regular expressions.
Example string that returns False (and is therefore hexadecimal):
AAcc99
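The check above treats an empty string, or one containing only whitespace, as hexadecimal, because neither contains a disallowed character. If that is not what you want, one alternative is to describe what the whole string must look like, using the "^" and "$" anchors from the reference. This is only a sketch: the function name is_hex is made up for illustration, and unlike the example above it rejects whitespace.

```vbscript
' Sketch (is_hex is a hypothetical helper, not one of the article's functions):
' returns True only if the whole string is one or more hexadecimal digits.
Function is_hex(strValue)
    Dim objRegExp : Set objRegExp = New RegExp
    With objRegExp
        .Pattern = "^[0-9a-f]+$"   ' start of string, hex digits only, end of string
        .IgnoreCase = True         ' treat A-F the same as a-f
        .Global = False
    End With
    is_hex = objRegExp.Test(strValue)
    Set objRegExp = Nothing
End Function

' is_hex("AAcc99") returns True
' is_hex("12g4")   returns False ("g" is not a hexadecimal digit)
' is_hex("")       returns False (the + requires at least one character)
```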
Example 2: Masking the last section of an IP address
An IP address is made up of four sets of numbers separated by periods. It's common practice, if you are going to display visitor (or any) IP addresses on your site, to mask the last (fourth) set of numbers. Here's a way to use ereg_replace to do just this:
<%
strOriginalString = ereg_replace(strOriginalString, "([^0-9])([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.[0-9]{1,3}([^0-9])", "$1$2.$3.$4.***$5", True)
%>
This is a little more tricky, as you'd hopefully expect from a second example. It looks harder than it is though, so one step at a time. There are actually only a few entities in the pattern - they are just repeated. The most important is this: "([0-9]{1,3})". It matches a section of an IP address, and is enclosed in brackets so that this section can be used in the replacement of the pattern as well (otherwise we would not be able to keep the first three parts of the IP address to display). You can see these sections in use, referenced with "$2", "$3" and "$4" in the replacement. The pattern within the brackets simply says "between one and three digits between 0 and 9". The "([^0-9])" at each end matches a single character that is not a digit, so that a longer run of digits is not mistaken for part of an address; these two characters are put back with "$1" and "$5". Because a character is required on both sides, an address at the very start or very end of the string will not be matched.
The second repeated section is "\.". We use a backslash before the period to indicate that this period (the character following the backslash) is to be treated as a normal period. We call this an escaped character, and this is a fairly common practice. The period, unescaped (without the backslash), is used as a symbol representing "any character except the new line character".
Example input text:
My IP address is 123.456.78.9 but 4444.1.1.1 is just a bunch of random numbers, and so is 12.34.56, and 1.1.1.1 is another valid IP.
Example output text:
My IP address is 123.456.78.*** but 4444.1.1.1 is just a bunch of random numbers, and so is 12.34.56, and 1.1.1.*** is another valid IP.
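Because the pattern above requires a non-digit character on both sides of the address, an address at the very start or very end of the string is left alone. One possible workaround, sketched here under that assumption, is to let each end match either a non-digit or the start/end of the string. The function name mask_ip is made up for illustration, and it uses the RegExp object directly so that it stands on its own.

```vbscript
' Sketch (mask_ip is a hypothetical helper, not one of the article's functions):
' masks the fourth section of each IP address, including an address that
' sits at the very start or very end of the string.
Function mask_ip(strText)
    Dim objRegExp : Set objRegExp = New RegExp
    With objRegExp
        ' (^|[^0-9])  start of string, or one non-digit character
        ' ([^0-9]|$)  one non-digit character, or end of string
        .Pattern = "(^|[^0-9])([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.[0-9]{1,3}([^0-9]|$)"
        .IgnoreCase = True
        .Global = True
    End With
    mask_ip = objRegExp.Replace(strText, "$1$2.$3.$4.***$5")
    Set objRegExp = Nothing
End Function

' mask_ip("My Ip Address is: 127.0.0.1") returns "My Ip Address is: 127.0.0.***"
```

Like the original pattern, this does not check that each section is between 0 and 255, so "123.456.78.9" is still masked.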
Example 3: Making the second word of every sentence in a string bold, as long as the word before only contains upper case letters and the second word does not contain an even digit
Getting more interesting now, this example is not in the least bit useful in practice, but should prove to be a useful demonstration of the power of regular expressions. It sounds tough - but with regular expressions, it's a walk in the park.
<%
strOriginalString = ereg_replace(". " & strOriginalString, "(\.|!|\?)\s([A-Z]+)\s([^02468\s]+)\s", "$1 $2 <strong>$3</strong> ", False)
strOriginalString = mid(strOriginalString, 3) ' skip the two characters (". ") added above
%>
We start by adding an artificial period and space to the beginning of the string, just to make sure we catch the first sentence, and add a line to strip our extra characters out afterwards. We only want those sentences split with punctuation and a space, or we'll end up with bold decimals and it will be very messy indeed. So, we check for punctuation, followed by a space, followed by a word made entirely of capitals, followed by another space, followed by a second word that doesn't contain even digits or whitespace, followed by a space. If we find that, we replace it with the same items we picked up in brackets, only with a <strong></strong> tag pair around the second word.
Example input text:
THE quick brown fox jumped over the lazy dog? Many red balloons blew up! EVEN num2ber sentence. ODD num3ber sentence.
Example output text:
THE <strong>quick</strong> brown fox jumped over the lazy dog? Many red balloons blew up! EVEN num2ber sentence. ODD <strong>num3ber</strong> sentence.
34 Comments
Great article!
#1, Owen Michael, Norway, 12 December 2003.
it is good
very good exampels
#2, kiran kumar kalvagadda, United States, 7 February 2004.
Thanks for the ereg, and ereg_replace functions. I was looking for something simple and sweet and all 4guysfromrolla.com could offer me was babbling explanations and nothing that was too the point. Congrats.
#3, chris, United States, 28 July 2004.
it is good
#4, kiran kalvagadda, United States, 13 August 2004.
ereg_replace exactly what I was looking for - gracias
#5, kelly, United States, 23 June 2005.
Sweet. Great examples!!
#6, Shamiul Azom, United States, 6 August 2005.
Genius !
exactly what i was looking for
Many thanks
#7, Flo, France, 11 August 2005.
Thanks for the sample code....
#8, I dont know RegEx, United States, 8 December 2005.
Many a time your site has helped me out.
Great regexp examples.
Thank you.
#9, Shane, United States, 4 January 2006.
is there a way to replace a sTrInG with <bold>sTrInG</bold>, so keeping the original case-instances unmodified?
#10, admin, United Kingdom, 14 April 2006.
"is there a way to replace a sTrInG with <bold>sTrInG</bold>, so keeping the original case-instances unmodified"
Try storing each word in a variable rather than doing and search and replace. When you find the string, replace it with the
oldString = sTrInG
' do a search for oldString after changing it to upper case if needed, once you find it, replace it with the newString
newString = "<bold>" & oldString & "</bold>"
#11, zafar, Canada, 10 September 2006.
Very good explanation !!
I looked around for articles about regular expressions in vbscript and couldnt find anything as good as this.
Thanks !
#12, Michel, Netherlands, 28 January 2007.
I see the ereg command being used here. This is a PHP programming language command from what I can find on the web.
I need to do what ereg does but in vbscript. Do you have any vbscript code??
#13, davwsmith@eprod.com, United States, 7 June 2007.
Read the article, "davwsmith". I've named the VBScript functions after the PHP ones, as they are intended to do the same thing.
#14, Dave Child, United Kingdom, 7 June 2007.
thanks a lot...this page information is very helpful for me. i'm searching for this solution for past 3 days. thanks
#15, Jevean, India, 7 November 2007.
Great code!
I had tried several code samples (even on Microsoft's own website), and these are definitely working: easy to use, easy to understand, thanks a lot!
#16, Goulven Champenois, France, 5 December 2007.
Can anyone tell me regular expression for time values For e.g lets say reg exp for 03:00 PM ?
#17, Ansi, United States, 20 December 2007.
I need to create rule start with alphabet followed by alphabet or digit or underscore with minimum 5 characters length
i used ^[a-z][\w]{5,} but result TRUE on "test test"
any help?
#18, baliwebdesigner, Australia, 18 February 2008.
You rock! Thanks for this.
#19, Noobie Munky, United States, 21 February 2008.
Thanks for this. I just used a variation of your code to create a BBCode parser.
I can now let people load custom content without worrying about them breaking my code!
#20, Jennifer Harrowell, United Kingdom, 26 February 2008.
Probably the best example of regular expressions I've ever seen. Which is exactly why I keep feeling the need to refer back to it almost every day. Fantastic stuff!
#21, James Law, United Kingdom, 27 March 2008.
Read the article, "davwsmith". I've named the VBScript functions after the PHP ones, as they are intended to do the same thing.
I'm sorry, where's this article?
#22, Tommy, United States, 19 June 2008.
Tommy: It's above the comments. Search on this page for the phrase "named after the PHP function" and you'll find it.
#23, Dave Child, United Kingdom, 20 June 2008.
Why if the IP address is at the end of the string, this doesn't work ??
E.g. "My Ip Address is: 127.0.0.1" this does NOT work, while "My Ip Address is: 127.0.0.1 and I like good" does WORK...
#24, SPeedyNT, Italy, 1 October 2008.
A great article!
I especially like the reference, it's the easiest to understand reference I've seen so far :)
#25, Pepper, Canada, 3 February 2009.
Simply Superb
#26, Prakash V, India, 7 September 2009.
So useful! thanks for share it!
#27, waveland, Spain, 24 September 2009.
I love the enhancement to vb script regex replace.
However, I sure could use a little help.
I just can't seem to figure out how to do what I want.
I need to search HTML for href tags and replace all ampersands with "%26". I could do a simple vbscript replace but the problem is that I don't want to do the replace on the src tags, only on the href tags in the HTML.
Example Start String:
href="http://video.google.com/videosearch?q=test&um=1&ie=UTF-8&sa=N&hl=en&ta=wv"
<BR>
<img src="http://image.google.com/image?q=test&um=1&ie=UTF-8&sa=N&hl=en&ta=wv"
Example Result String:
href="http://video.google.com/videosearch?q=test%26um=1%26ie=UTF-8%26sa=N%26hl=en%26ta=wv"
<BR>
<img src="http://image.google.com/image?q=test&um=1&ie=UTF-8&sa=N&hl=en&ta=wv"
Any ideas?
Thanks in advance.
Replies: #29 and #30.
#28, Tracey Smith, USA, 20 January 2010.
#28 Sorry Scott, I wrote “%26” I meant “amp;” – but the issues is still the same. Replace all Xs with Y but only when X is in an href, in the string, not when it is in an img src.
#29, Tracey Smith, United States, 20 January 2010.
#28 Anyone have any ideas?
#30, Tracey Smith, United States, 20 January 2010.
Is there a way to not replace the matched pattern with something else, but to keep only what is matched and discard the rest?
#31, DennisA, Finland, 14 January 2011.
Thanks for these regular expressions snippets. They helped me with a regex problem I was facing, thanks so much.
Can't believe it... I am still coding in classic ASP for a client. :)
#32, Jasmine, 4 May 2012.
Hey, man, thanks for the clear writeup and understandable examples. Although I've been doing perl regular expressions for years, VBScript RegExp had me weeping. But you've helped me see the light!
#33, Rick Schwein, USA, 13 June 2012.
Worked easily in ASP Classic, which is basically VBS. I use it now for user input as my validation code.
#34, Jim Russell, USA, 26 March 2013.