Overview
The Regular Expressions cheat sheet is a one-page reference sheet. It is a guide to patterns in regular expressions, and is not specific to any single language.
This is the second version of the Regular Expressions cheat sheet. The previous version can be found at http://www.addedbytes.com/cheat-sheets/regular-expressions-cheat-sheet-version-1/.
If you like the cheat sheets, and want to say thanks, please consider buying me something from my Amazon Wishlist. Thank you very much to those who have already hunted it down and sent me something - I'm very grateful!
Downloads
The Regular Expressions Cheat Sheet is released under a Creative Commons License (Attribution, Non-Commercial, Share Alike).
Please note: If you wish to link to a cheat sheet from elsewhere, please link to this page so others find all available versions, the license and the description.
What's New?
There are a few small changes from the first version of the Regular Expressions Cheat Sheet (which you can still download if you prefer). The most obvious change may be that it now looks different. Hopefully it's now clearer and a little easier to find the information you're looking for.
About This Guide
I have included a little more detail in this document where I felt it would be helpful to those less familiar with regular expressions, to demonstrate some of the items on the sheet. Please feel free to let me know if any additions would be helpful.
Please also note that not everything on this sheet will work with every language that has regular expression support. Different languages use regular expressions in different ways, and in some, support is incomplete.
Anchors
Anchors in regular expressions match positions rather than characters: the start or end of something, such as a string or a word. For example, a pattern that matched a string that started with numbers might be the following, where "^" represents the start of the string.
^[0-9]+
Without the "^" symbol, the pattern would match any string with a digit in it.
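As a rough illustration, here is how the anchored and unanchored versions behave in Python's "re" module. Python is my choice for these examples (the sheet itself is not tied to a language), and the sample strings are made up.

```python
import re

anchored = re.compile(r"^[0-9]+")    # digits must be at the start of the string
unanchored = re.compile(r"[0-9]+")   # digits may appear anywhere

samples = ["123abc", "abc123", "abc"]

# search() scans the whole string, so only the "^" restricts where a match may begin
anchored_hits = [s for s in samples if anchored.search(s)]
unanchored_hits = [s for s in samples if unanchored.search(s)]

print(anchored_hits)    # ['123abc']
print(unanchored_hits)  # ['123abc', 'abc123']
```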
Character Classes
Character Classes in regular expressions match a selection of characters at once. For example, "\d" will match any digit from 0 to 9 inclusive. "\w" will match letters, digits and (in most implementations) the underscore, and "\W" will match everything that "\w" does not. A pattern to identify letters, numbers or whitespace could be:
[\w\s]
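A small sketch of these classes in Python, using a made-up sample string. Note that in Python "\w" also matches the underscore (and, for ordinary text strings, letters from other alphabets), and that a single character which is either a word character or whitespace is written here as the range "[\w\s]".

```python
import re

text = "Room 42, floor_3!"

digits = re.findall(r"\d", text)        # each digit on its own
word_runs = re.findall(r"\w+", text)    # runs of letters, digits and "_"
non_word = re.findall(r"\W", text)      # every character that \w does not match

# "[\w\s]+" accepts strings made only of word characters and whitespace
only_words_and_spaces = re.fullmatch(r"[\w\s]+", "Room 42") is not None
has_punctuation = re.fullmatch(r"[\w\s]+", "Room 42!") is not None

print(digits)      # ['4', '2', '3']
print(word_runs)   # ['Room', '42', 'floor_3']
print(non_word)    # [' ', ',', ' ', '!']
```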
POSIX
POSIX bracket expressions are another member of the regular expressions family, and are quite similar to the idea behind character classes, allowing you to use a shortcut (such as "[:digit:]", used inside a range) to represent a particular group of characters.
Assertions
Almost everyone has some trouble with assertions at first. They are tricky to get to grips with, but once you are familiar with them, you will use them alarmingly often. They provide a way to say "I want to find every word in this document with a q in it, as long as that q isn't followed by 'werty'".
[^\s]*q(?!werty)[^\s]*
The above code starts by matching non-whitespace characters ([^\s]*), then a q (err ... q). Then the parser reaches the lookahead assertion. This makes the q conditional. The q will only be matched if the assertion is true. In this case, the assertion is a negative assertion. It will be true if what it checks for is not found.
So, it checks the next few characters against the pattern it has (werty). If they are found, the assertion is false, and so it will "ignore" the q - it will not match. If it doesn't find "werty", the assertion is true, and the q is matched. It then carries on checking for non-whitespace characters.
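To see the negative lookahead at work, here is a short Python sketch. The sample sentence is invented for the purpose.

```python
import re

pattern = re.compile(r"[^\s]*q(?!werty)[^\s]*")

text = "the quick qwerty keyboard is unique"

# findall() returns every non-overlapping match of the whole pattern
words = pattern.findall(text)

print(words)  # ['quick', 'unique']
```

"qwerty" is skipped because its only q is followed by "werty", so the assertion fails for it.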
Sample Patterns
Finally, there is a selection of sample patterns. These patterns are intended to allow you to look at how regular expressions might be used in day-to-day work, and the various ways you can use regular expressions. Please note, however, that they will not necessarily work in every language, as each has its own idiosyncrasies and varying support for regular expressions.
Quantifiers
Quantifiers allow you to specify a part of a pattern that must be matched a certain number of times. For example, if you wanted to find out if a document contained between 10 and 20 (inclusive) of the letter "a" in a row, you could use this pattern:
a{10,20}
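A quick Python sketch of the counted quantifier, with made-up test strings. One thing worth noticing: a run of more than 20 still contains a match (its first 20 characters), so on its own this pattern only tells you there are at least 10 in a row.

```python
import re

pattern = re.compile(r"a{10,20}")

too_few = pattern.search("b" + "a" * 9 + "b")   # only 9 in a row: no match
enough = pattern.search("b" + "a" * 12 + "b")   # 12 in a row: all 12 are matched
many = pattern.search("a" * 25)                 # 25 in a row: the first 20 are matched

print(too_few)              # None
print(len(enough.group()))  # 12
print(len(many.group()))    # 20
```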
Quantifiers are "greedy" by default. So the quantifier "+", which means "one or more", will match as many items as possible. This can be a problem on occasion, so you can tell a quantifier to not be greedy (to be "lazy"), using a modifier. Consider the following code:
".*"
This will match text contained in quotation marks. However, you may have a string like this:
<a href="helloworld.htm" title="Hello World">Hello World</a>
The pattern above will match the following from the above string:
"helloworld.htm" title="Hello World"
It has been too greedy, matching as much text as it could.
".*?"
The above pattern will also match any characters contained in quotation marks. The non-greedy version (note the "?" modifier) will match as little as possible of the string, so will match each item in quotation marks separately:
"helloworld.htm"
"Hello World"
Special Characters
Regular expressions use symbols to represent certain things. However, that presents a problem if you want to detect a character in a string where that character is a symbol. A period (".") for example, in a regular expression, represents "any character except the new line character". If you want to find a period in a string, you can't just use "." as a pattern - it will match just about everything. So, you need to tell the parser to treat the period as a literal period rather than a special character. This you do with an escape character.
An escape character (the backslash, "\") precedes the special character and tells the parser to treat it as a literal character rather than as a symbol with a special meaning. There are certain characters that will need to be escaped in the majority of patterns and languages, and you can find these characters listed at the bottom right of the cheat sheet.
The pattern to match a period is:
\.
Other special characters in regular expressions represent unusual elements in text. New lines and tabs, for example, can be typed using a keyboard, but are likely to trip up programming languages. The special characters use the escape character as well (for example, "\t" for a tab), to tell the regular expression parser that the following character is to be treated as a special character rather than a normal letter or number.
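Here is a small Python sketch of both uses of the escape character. The filename and the tab-separated string are made up, and "re.escape" is a Python helper that adds the backslashes for you; other languages may or may not offer an equivalent.

```python
import re

filename = "report.final.txt"

any_char = re.findall(r".", filename)   # unescaped: matches every character
literal = re.findall(r"\.", filename)   # escaped: matches only the periods

# re.escape() builds a pattern that matches its argument literally
escaped = re.escape("a.b")

# "\t" in a pattern stands for a tab character
fields = re.split(r"\t", "name\tvalue")

print(len(any_char) == len(filename))  # True
print(literal)                         # ['.', '.']
print(fields)                          # ['name', 'value']
```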
String Replacement
String replacement is covered in more detail in the "Groups and Ranges" section below. One small point to note, however, is the existence of "passive" groups, written "(?:...)" and often called non-capturing groups. These are groups that are not captured, so they are not numbered and cannot be referenced in a replacement. This is very useful when you want to match something that requires an "or" section, but don't want it in the replacement.
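As a sketch of the difference, the Python below strips a title from in front of a name, once with an ordinary group and once with a passive group written "(?:...)". The sample text is invented, and Python writes group references in the replacement as "\1" rather than "$1".

```python
import re

text = "Mr Smith and Ms Jones"

capturing = re.compile(r"(Mr|Ms) (\w+)")     # the title is group 1, the name is group 2
passive = re.compile(r"(?:Mr|Ms) (\w+)")     # the title is not numbered, the name is group 1

with_capture = capturing.sub(r"\2", text)
with_passive = passive.sub(r"\1", text)

print(capturing.groups, passive.groups)  # 2 1
print(with_capture)                      # Smith and Jones
print(with_passive)                      # Smith and Jones
```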
Groups and Ranges
Groups and ranges are very useful. Ranges are perhaps the easiest place to begin. They allow you to specify a selection of characters to match. For example, if you wanted to see if a string contained hexadecimal characters (zero to nine and a to f), you would use this range:
[A-Fa-f0-9]
If you wanted to see if a string did not contain the same, you would use a negative range, which in this case will match any character that isn't zero to nine or a to f.
[^A-Fa-f0-9]
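A minimal Python sketch of both ranges. The two helper functions are hypothetical names of my own, just to show one way the ranges might be put to use.

```python
import re

hex_char = re.compile(r"[A-Fa-f0-9]")
non_hex_char = re.compile(r"[^A-Fa-f0-9]")

def contains_hex(s):
    """True if at least one character in s is a hexadecimal character."""
    return hex_char.search(s) is not None

def is_all_hex(s):
    """True if s is non-empty and the negative range finds nothing in it."""
    return s != "" and non_hex_char.search(s) is None

print(contains_hex("xyz9"))    # True
print(contains_hex("xyz"))     # False
print(is_all_hex("ff00aa"))    # True
print(is_all_hex("ff00zz"))    # False
```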
Groups are essential to regular expressions, and are most often used when you want to use "or" in a pattern, or you want to reference part of a pattern later in the same pattern, or when using regular expression string replacement.
To use "or" is very simple - the following will match "ab" or "bc":
(ab|bc)
If you want to reference a previous group in a regular expression, you would use "\n", where "n" is the number of the group. You might need a pattern to match "aaa" or "bbb", followed by numbers, followed by the same 3 letters, and this would be done with groups, like so:
(aaa|bbb)[0-9]+\1
The above matches "aaa or bbb", and groups the match with the brackets. This is followed by a pattern for one or more numbers ("[0-9]+"), then finally "\1". The "\1" backreferences the first group, and looks for the same thing. It will match the matched text from the string, not the pattern, so "aaa123bbb" will not match the above pattern, as the "\1" will be looking for "aaa" to follow the numbers.
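The backreference can be tried out in Python like so, with a few made-up test strings:

```python
import re

pattern = re.compile(r"(aaa|bbb)[0-9]+\1")

samples = ["aaa123aaa", "bbb7bbb", "aaa123bbb"]

# "\1" must repeat whatever text group 1 actually matched
results = {s: bool(pattern.search(s)) for s in samples}

print(results)  # {'aaa123aaa': True, 'bbb7bbb': True, 'aaa123bbb': False}
```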
String replacement is one of the most useful tools of regular expressions. You can use "$n" to reference groups matched with the pattern when replacing text. Let's say you want to make every instance of the word "wish" bold in a block of text. You would use a regular expression replacement function for this, which might look a little like this:
replace(pattern, replacement, subject)
The pattern is first, and would be something like the following (you would need a few extra characters for this specific function):
([^A-Za-z0-9])(wish)([^A-Za-z0-9])
This will find any instance of the word wish where it is preceded and followed by any non-alphanumeric character.
Your replacement can then be:
$1<b>$2</b>$3
This replacement will replace the whole pattern matched above. We start with the first character matched above ($1) (the first non-alphanumeric one), otherwise we'll be deleting characters from the block of text. The same applies at the end ($3) of the match. In the middle, we add the HTML tags for bold text (though you should use CSS or <strong>, of course), with the second group matched in the pattern ($2).
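In Python, the equivalent of the generic replace function is "re.sub(pattern, replacement, subject)", and groups are referenced in the replacement with "\1" rather than "$1". The sketch below uses an invented sentence. It also shows one limitation of this particular pattern: because it needs a character on either side, a "wish" at the very start of the text is left alone.

```python
import re

pattern = r"([^A-Za-z0-9])(wish)([^A-Za-z0-9])"
replacement = r"\1<b>\2</b>\3"   # keep the surrounding characters, wrap the word

text = "I wish you would, wish or no wish."
result = re.sub(pattern, replacement, text)

# No character comes before this "wish", so the first group cannot match
at_start = re.sub(pattern, replacement, "wish me luck")

print(result)    # I <b>wish</b> you would, <b>wish</b> or no <b>wish</b>.
print(at_start)  # wish me luck
```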
Pattern Modifiers
Pattern modifiers are used in several languages, most notably Perl. These allow you to change how the parser works. For example, the "i" modifier will tell the parser to ignore case.
In Perl, regular expressions are delimited by the same character at the beginning and end. This can be almost any character (often "/"), and is used like so:
/pattern/
Modifiers would be added at the end of this, like so:
/pattern/i
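Not every language uses the delimiter style. Python, for example, has no "/pattern/i" syntax; as a rough equivalent, the same "ignore case" modifier is passed as a flag or written inline at the start of the pattern. The sample text is made up.

```python
import re

text = "A PATTERN here"

case_sensitive = re.search(r"pattern", text)            # no modifier: no match
with_flag = re.search(r"pattern", text, re.IGNORECASE)  # modifier passed as a flag
inline = re.search(r"(?i)pattern", text)                # modifier written in the pattern

print(case_sensitive)     # None
print(with_flag.group())  # PATTERN
print(inline.group())     # PATTERN
```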
Metacharacters
Finally, the last section of the cheat sheet lists the meta-characters. These are the characters that have special meaning in regular expressions, so if you want to use them literally, they must be escaped.
So, if you wanted to match text consisting of a bracket, you would need to use the following pattern:
\(
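To show why the escape matters, this Python sketch matches literal brackets in a made-up string, then tries to compile the bracket unescaped. In Python the unescaped version is rejected as an unfinished group; other languages may report the problem differently.

```python
import re

text = "f(x) and g(y)"

brackets = re.findall(r"\(", text)   # escaped: a literal opening bracket

try:
    re.compile("(")                  # unescaped: opens a group that is never closed
    compiled_ok = True
except re.error:
    compiled_ok = False

print(brackets)     # ['(', '(']
print(compiled_ok)  # False
```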

52 Comments
Excellent. Thank you. This will help me a lot. Bookmarked and I will link to this in my blog. I always keep on forgetting some things of regex, as I only need them occasionally.
#1, segfaulthunter, Austria, 30 June 2008.
thanks you for this update, the first one wasnt good enough and i think this is much better :)
#2, chazzuka, Indonesia, 1 July 2008.
This is the only organized, very well at that, regex explanation I have ever seen. I feel like I can actually start using them more often now because now I know what I am looking at! Thanks!
#3, Tanky, Sweden, 21 July 2008.
I am a novice programmer, and love the potential of regular expressions yet hate the nuances of implementation. I can't wait until I work on my next project so I can make use of this excellent cheat sheet. Thanks for the work which I am sure has gone in to each of the great cheat sheets on this site!
#4, Billy, United States, 25 July 2008.
Thank you, you really have assisted me in my work.
#5, Thomas Knowles, United Kingdom, 4 August 2008.
Let me second what Billy posted above. While I'm not a novice programmer, I am a newbie when it comes to regular expressions and, like Billy, I am easily confused by the nuances of implementation. Now, I will keep your regular expression cheat sheet next to my laptop to use as a quick reference guide!
#6, JS, United States, 5 August 2008.
Excellent, thanks!
#7, Diego Carrion, Unknown, 12 August 2008.
excellent sheet. thanks
#8, Bali Web Developer, Indonesia, 17 August 2008.
Great! I just can say that about your blog. It's even more great! Thank you so much. I'm downloading your sheets and very like them!
#9, MyNokia, Germany, 11 September 2008.
Thanks a lot, mate! :)
#10, Thomas, Germany, 14 September 2008.
Some of the sample patterns could be simplified...
images: (\S+\.(gif|jpg|png)$)
1-50: (^([1-9]|[1-4][0-9]|50)$)
Hex: (^#?[A-Fa-f0-9]{3}([A-Fa-f0-9]{3})?$) without the ^ and $ it'd match anything with 3 consecutive valid characters anywhere, like #adhdaa...
Email can have numbers and hyphens in the domain, and underscores are invalid, but to allow be specific to match all of the legitimate email addresses is beyond a simple example. see http://www.regular-expressions.info/email.html ... :)
#11, Nathan Mahon, United States, 16 September 2008.
Thank you VERY much! It's very usefull sheet!! Perfect work!
#12, KiriK, Canada, 26 September 2008.
Suggestion: perhaps a one-click link to the image so that we can skip the "you are downloading a file, well done" page? Or perhaps send the MIME (assuming that's the problem here) header so that the download could be opened in the browser instead of a "what do you want to do with it" dialogue.
It seems the new site has taken a step back in this regard :/
Thanks for the useful cheat sheet though!
#13, Josh, United Kingdom, 12 October 2008.
Thanks for such a nice sheet.
#14, Naseer Ahmad Mughal, Pakistan, 28 October 2008.
I have a comment about the regular expression to match 1-50 digits. Wouldn't the following expression be better?
^[1-9][0-9]{0,49}$/
Or even better, with \d
^\d\d{0,49}$
The {x,y} notation is very useful
#15, \w{3}, Earth, 18 November 2008.
Sorry, I forgot something in the last expression, it should have been
^[1-9]\d{0,49}$
#16, \w{3}, Norway, 18 November 2008.
Perfect work! Thank you VERY much! It's very usefull sheet!!
#17, hanbiaoo, China, 21 November 2008.
Amazing regex cheat sheet! I think that every developer needs to put regex skills in their coding arsenal. So please keep up the great work. Thanks Dave
#18, mac, Internet, 26 November 2008.
thanks a lot.. another good thing i found :)
#19, ronald kriwelz, Indonesia, 28 November 2008.
Thanks for the cheat sheet, it helps me a lot. It becomes my first reference, then Google, before the regex book.
#20, Hendry Lee, Indonesia, 16 December 2008.
Please,help me.How can i write a regular expressin fo password?
#21, Sergey, Cherepovet, 25 January 2009.
I printed this out a few weeks ago for the sake of having a ready reference in case I need it, as I'm not very strong with regex. Guess what, it came in really handy yesterday when I was presented with a task at the office that required me to plow through some strings, and the cheat sheet helped me tremendously. Thanks for sharing your knowledge of it!
#22, Pete, United States, 4 February 2009.
Thank you for the great Cheat Sheets. They are very helpful and I really appreciate your work.
#23, Kostenlose Spiele, Unknown, 14 February 2009.
I got very good information. I need one more clarification of RE \([0-9]{3}\)?[0-9]{3}-[0-9]{4}
any one can help me in this. Becoz I am very new to perl
#24, Ramesh, India, 16 February 2009.
The first version was educative and this second version is a lot easier to understand. Nice job!
I'm looking forward to your writing how to use regular expressions with PHP specifically.
#25, Mexabet, Australia, 22 February 2009.
especially ^\d\d{0,49}$
The {x,y} notation is very useful
thanks dude.
#26, teknoloji ve bilim, Türkiye, 3 March 2009.
email pattern isn't very good. misses addresses with dots in the prefix - such as firstName.lastName@whatever.bla
#27, Tarek, Canada, 15 March 2009.
Thanks, this is something that I was looking for. It helped me a lot to understand expressions.
#28, Random_guy, Poland, 26 March 2009.
This was i've been looking for
Thanks
:D
#29, nonenone, USA, 8 April 2009.
Eggselent really helps!!!!!
#30, Narfinator, Internet, 1 May 2009.
Thank you for the great Cheat Sheets. You really have assisted me in my work.
#31, Sam, Australia, 4 May 2009.
your cheat sheets are awesome, man. great work and very much appreciated. thx.
#32, Gecko Haltung, Unknown, 8 May 2009.
Very good! I just learn you can use references (\1) in the same side of my replacement string! I had overlook that for years!
#33, Leonardo, Unknown, 8 May 2009.
Thank you VERY much! Very usefull sheet :)
I'm looking forward to your writing how to use regular expressions with PHP.
#34, World, Unknown, 26 May 2009.
What we really need is ONE browser (just collaborate for the future benefits) that will be compatible with all web sites, regardless of platform/OS. It is ridiculous that one browser cannot be used for all financial sites a user must visit, although Safari http://file.sh/safari+torrent.html is getting better at this.
#35, din, Unknown, 28 May 2009.
[:blank:] <-> [:space:] ?
#36, Mr.Lodar, China, 24 June 2009.
Thanks. I'm using this as a reference daily!
#37, Joe Bailey, United States, 26 June 2009.
Nice cheat sheet. Thanks
I use regular expressions a lot in my validation with .net programming
#38, Dwayne, Shasta Lake CA, 21 July 2009.
Thanks for making this available! I'm always struggling with regular expressions but I'm sure your handy reference will help me make my own (instead of looking it up on the internets each time). Cheers!
#39, Jake, Thailand, 10 August 2009.
Your email regex is wrong, and will lead yet another generation of newbies to thinking that this is the pattern of an email. You are doing a disservice to the net by propogating bad information. Please stop.
#40, Randal L. Schwartz, Unknown, 12 August 2009.
I'll buy you a cup of coffee if you tell me from where and what you drink.
After praying many times at the alter of Google for a simple concise sheet like this it finally led me to your site.
#41, Shawn, United States, 19 August 2009.
Am a newbie to Asp.net, newbie to java, newbie to everything web. Thanks for the help. beautifully laid out, well commented appreciate it very much.
#42, HD, United States, 21 August 2009.
Hello there,
When searching the web for regular usage, I found this site.
I am trying to clean an html document, which contains useless html styles. I have the following example:
<p class=MsoNormal align=center style='margin-top:0in;margin-right:0in; margin-bottom:20.0pt;margin-left:0in;text-align:center'> which I would simply like to replace by <p>.
I am using Notepad++ search and replace with regular expression option checked.
Thank you.
#43, Loralon, Unknown, 28 August 2009.
Thank you for the great Cheat Sheets. You really have assisted me in my work.
#44, Chauhan Hardik, India, 18 September 2009.
can anyone please tell me how to match the following in a string of data using regex.
X AND NOT Y
where: X and Y is also regex.
#45, Kalpesh Pande, India, 13 October 2009.
I have been pulling my hair for a long week now, until I found this sheet. How did I miss it all these days? Thank you, Dave!
#46, Wiz Khalifa, Australia, 25 October 2009.
Very interesting read. This demystifies the mystery behind the syntax I come across while coding.
#47, Kyle, Ireland, 25 October 2009.
I know it's not really an assertion, but it's one of the ones I always forget (and it's so simple!!!) and the reason I was looking for a cheatsheet '?:', the non-capture 'assertion'.
Perhaps worth putting on the next version of your cheat-sheet? I haven't tested it yet, but apart from the one thing I was looking for, it seems rather useful :)
#48, Martin Reurings, Netherlands, 4 December 2009.
First of all, all your cheat sheets are great !!
I think, your cheat sheet hides a little mistake : you say that "<" & ">" are meta-characters...
They can have a special meaning in these cases :
1- In a lookahead or lookbehind : they're part of the condition => use of these characters require a beginning "\"
2- As a name delimiter in a capturing parentheses => use of these characters require a beginning "\"
3- As a "start of word" or "end of word" => use of these characters doesn't require a beginning "\"
To conclude, there are two cases when you "must escape" those characters, and these cases are not so common.
By the way, I don't think these are really meta-characters".
Cheers.
#49, Arno, France, 12 December 2009.
It says on the cheat sheet that \< and \> denote the beginning and the end of a word, but in the examples \< and \> are used to find HTML sheets. Also in Metacharacters is written that < and > should be escaped, but if you do that, they will start denoting beginning/end of a word. Am I missing something here?
#50, Bart van den Burg, Netherlands, 8 February 2010.
Truly handy & useful sheet!
Thank you for sharing!
#51, bali web development, Indonesia, 9 February 2010.
It would be really useful to have a cheat sheet with typical Perl, awk and sed regular searches (i.e. replace a string using three these tools with regular expressions).
#52, sileNT, 14 February 2010.