
Email Address Validation

PLEASE NOTE: This function is now considered out of date. An updated version incorporating many of the comments below has been released under an open source license as a Google Code project: php-email-address-validation. There is more about this change in the post Email Address Validation Updated.

Many email address validators will actually throw up errors when faced with a valid, but unusual, email address. Many, for example, assume that an email address with a domain name extension of more than three letters is invalid. However, new TLDs such as ".info", ".name" and ".aero" are perfectly valid but longer than three characters. Many email address validators fail to take into account that you do not necessarily need a domain name in an email address - an IP address is fine.

The first step in writing a PHP script to validate email addresses is to work out exactly what is and is not valid. RFC 2822, which specifies what is and is not allowed in an email address, states that an email address must take the form "local-part @ domain".

The "local-part" of an email address must be between 1 and 64 characters in length and may be made up in any one of three ways. It can be made up of characters (and only these characters) from the following selection, though the period cannot be the first of these:

  • A to Z and a to z
  • 0 to 9
  • !
  • #
  • $
  • %
  • &
  • '
  • *
  • +
  • -
  • /
  • =
  • ?
  • ^
  • _
  • `
  • {
  • |
  • }
  • ~
  • .

Or, it can be made up of a quoted string containing any characters except "\". Older email addresses may be made up differently, and may contain a combination of the above. The following are all valid as the first part of an email address:

  • dave
  • +1~1+
  • {_dave_}
  • "[[ dave ]]"
  • dave."dave" (Note that this is considered an obsolete form of address - new addresses created should not be of this form, but it is still considered valid.)

The following, though similar, are all invalid:

  • -- dave -- (spaces are invalid unless enclosed in quotation marks)
  • [dave] (square brackets are invalid, unless contained within quotation marks)
  • .dave (the local part of an email address cannot start with a period)

The "domain" portion of the email address can also be made up in different ways. The most common form is a domain name, which is made up of a number of "labels", each separated by a period and between 1 and 63 characters in length. Labels may contain letters, digits and hyphens, but must not begin or end with a hyphen. (Officially, a label must begin with a letter, not a digit, but many domain names beginning with digits have been registered, so for the purposes of validation we will assume that digits are allowed at the start of a label.) A domain name, technically, need be only one label. In practice, however, domain names are made up of at least two labels, so for the purposes of validation we will check for two. A domain name may not be over 255 characters in total. The domain portion of an email address may also be an IP address, which can in turn be enclosed in square brackets.

In order to check that email addresses conform to these guidelines, we'll need to use regular expressions. First, we need to match the three possible forms of the local part of an email address, using the two patterns below (we'll add in escape characters later, when we put the function together):

  ^[A-Za-z0-9!#$%&'*+/=?^_`{|}~-][A-Za-z0-9!#$%&'*+/=?^_`{|}~\.-]{0,63}$
  ^"[^(\|")]{0,62}"$

We can use the two patterns we've defined here to check for obsolete local parts of email addresses too, saving ourselves from needing a third pattern.
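As a rough illustration, the two patterns can be tried against the example local parts listed earlier. This is only a sketch: it assumes a modern PHP where preg_match() stands in for ereg(), the helper name local_part_matches() is made up for this example, the hyphen is kept at the end of each character class so that it is read literally, and the quoted-string class is simplified to exclude just the backslash and the double quote.

```php
<?php
// Hypothetical helper: true if $local matches either of the two patterns above.
function local_part_matches(string $local): bool
{
    // Unquoted form: allowed characters only, no leading period, 1 to 64 long.
    $unquoted = '/^[A-Za-z0-9!#$%&\'*+\/=?^_`{|}~-][A-Za-z0-9!#$%&\'*+\/=?^_`{|}~.-]{0,63}$/D';
    // Quoted form: anything except a backslash or a double quote between the quotes.
    $quoted = '/^"[^\\\\"]{0,62}"$/D';

    return preg_match($unquoted, $local) === 1
        || preg_match($quoted, $local) === 1;
}

var_dump(local_part_matches('{_dave_}')); // bool(true)
var_dump(local_part_matches('.dave'));    // bool(false)
```

The obsolete mixed form (dave."dave") only passes once the local part has been split on periods and each piece is tested separately, which is what the finished function does.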

Next, we need to check the domain portion of the email address. It can either be an IP address or a domain name, so we can use the two patterns here to validate it:

  ^\[?[0-9\.]+\]?$
  ^[A-Za-z0-9][A-Za-z0-9-]*[A-Za-z0-9](\.[A-Za-z0-9][A-Za-z0-9-]*[A-Za-z0-9])+$

The second pattern matches domain names and, because labels may be made up entirely of digits, it also matches most unbracketed IP addresses. The function below still tests the first pattern separately so that bracketed addresses are recognised.
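The label rules described earlier (1 to 63 characters per label, no hyphen at either end, at least two labels, 255 characters overall) can also be checked label by label. The following is a minimal sketch under those stated rules, assuming preg_match() on a modern PHP; domain_name_ok() is a hypothetical name, not part of the original function.

```php
<?php
// Hypothetical helper: applies the label rules described above to a domain name.
function domain_name_ok(string $domain): bool
{
    if (strlen($domain) > 255) {
        return false; // whole name too long
    }
    $labels = explode('.', $domain);
    if (count($labels) < 2) {
        return false; // the article insists on at least two labels
    }
    foreach ($labels as $label) {
        // 1 to 63 letters, digits or hyphens, with no hyphen at either end.
        if (!preg_match('/^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$/D', $label)) {
            return false;
        }
    }
    return true;
}
```

Because digits are allowed anywhere in a label, a string such as 123.123.123.123 passes this check as well.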

Putting it all together gives us the following function. Call it like any normal function, and you will get back a value of "true" if the string entered is a valid email address, or "false" if the input was an invalid email address.

  function check_email_address($email) {
    // First, we check that there's one @ symbol, and that the lengths are right
    if (!ereg("^[^@]{1,64}@[^@]{1,255}$", $email)) {
      // Email invalid because wrong number of characters in one section, or wrong number of @ symbols.
      return false;
    }
    // Split it into sections to make life easier
    $email_array = explode("@", $email);
    $local_array = explode(".", $email_array[0]);
    for ($i = 0; $i < sizeof($local_array); $i++) {
      if (!ereg("^(([A-Za-z0-9!#$%&'*+/=?^_`{|}~-][A-Za-z0-9!#$%&'*+/=?^_`{|}~\.-]{0,63})|(\"[^(\\|\")]{0,62}\"))$", $local_array[$i])) {
        return false;
      }
    }
    if (!ereg("^\[?[0-9\.]+\]?$", $email_array[1])) { // Check if domain is IP. If not, it should be valid domain name
      $domain_array = explode(".", $email_array[1]);
      if (sizeof($domain_array) < 2) {
        return false; // Not enough parts to domain
      }
      for ($i = 0; $i < sizeof($domain_array); $i++) {
        if (!ereg("^(([A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9])|([A-Za-z0-9]+))$", $domain_array[$i])) {
          return false;
        }
      }
    }
    return true;
  }
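
ereg() was deprecated in PHP 5.3 and removed in PHP 7.0, so the function above will not run on a current PHP. What follows is a minimal sketch of how the same checks might look with preg_match(), assuming PHP 7.1 or later. The name check_email_address_pcre() is made up for this example, and the quoted-string class is simplified to exclude only the backslash and the double quote, so it is slightly more permissive than the original.

```php
<?php
// Hypothetical port of check_email_address() to preg_match(); same steps, PCRE syntax.
function check_email_address_pcre(string $email): bool
{
    // Exactly one @, with 1-64 characters before it and 1-255 after it.
    if (!preg_match('/^[^@]{1,64}@[^@]{1,255}$/D', $email)) {
        return false;
    }
    [$local, $domain] = explode('@', $email);

    // Each period-separated piece of the local part must be an atom or a quoted string.
    $piecePattern = '/^(?:[A-Za-z0-9!#$%&\'*+\/=?^_`{|}~-][A-Za-z0-9!#$%&\'*+\/=?^_`{|}~.-]{0,63}|"[^\\\\"]{0,62}")$/D';
    foreach (explode('.', $local) as $piece) {
        if (!preg_match($piecePattern, $piece)) {
            return false;
        }
    }

    // Anything made only of digits, periods and optional brackets is treated as an IP.
    if (preg_match('/^\[?[0-9.]+\]?$/D', $domain)) {
        return true;
    }

    // Otherwise it should be a domain name with at least two valid labels.
    $labels = explode('.', $domain);
    if (count($labels) < 2) {
        return false;
    }
    foreach ($labels as $label) {
        if (!preg_match('/^(?:[A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9]|[A-Za-z0-9]+)$/D', $label)) {
            return false;
        }
    }
    return true;
}
```

The /D modifier keeps $ from matching before a trailing newline, which the anchored ereg() patterns did not need to worry about in the same way.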

Using the function above is relatively simple, as you can see:

  if (check_email_address($email)) {
    echo $email . ' is a valid email address.';
  } else {
    echo $email . ' is not a valid email address.';
  }

You can now validate email addresses entered into your site against the specifications that define email addresses (more or less - domain names that start with a number are supposed to be invalid, but do exist).

Finally, please do remember that an email address that looks valid is not necessarily in use. A validation script is a good start to email address validation, but although it can tell you that an address is technically valid, it cannot tell you whether it is in use. You might benefit from checking in more depth, for example seeing if a domain name is registered. Even better, fire off an email to the address given by a user and get them to click a link to confirm it is real - the only way to be 100% sure.

180 comments

Hope I'm not making myself look really stupid here, but I think there might be a mistake in the following loop:

for ($i = 0; $i < sizeof($local_array); $i++) {
if (!ereg("^(([A-Za-z0-9!#$%&'*+/=?^_`{|}~-][A-Za-z0-9!#$%&'*+/=?^_`{|}~\.-]{0,63})|(\"[^(\\|\")]{0,62}\"))$", $local_array[0])) {
return false;
}
}

Shouldn't it be checking the regular expression against $local_array[$i] and not $local_array[0]?

Anyway, this is a great routine that saved me a lot of time. Thanks!
Thanks for the heads-up! I've changed it.
Stephane
France #3: August 1, 2004
I tried your email validation function.

It looks like it is missing a closing curly bracket at the end of the source code.

So I added one.

But the function still would not work.

It would stop the execution of my script, not giving any message.

Otherwise, your web site looks really good!!

Cheers
Stephane

mittiprovence@yahoo.se
Doe
Sweden #4: August 8, 2004
It does not miss a curly bracket. However I had to add a return true; statement at the end of the function otherwise the address would always fail.
Thanks for noticing the curly bracket, Stephane. It was missing, Doe, but as you say, the return true; was missing as well. Looks like I cut something out when pasting or editing! All fixed now.
Hong
United States #6: November 19, 2004
In the text, you mentioned that the length of a domain is 1 - 63. Should be up to 255 (from RFC 2821).

However, the code is correct on that.

Thanks for your work. Now I do not have to spend a lot of time in re-inventing the wheel.
Hong
United States #7: November 19, 2004
I forgot to mention that valid characters should also include lower cases "a-z". Of course we all know that. But somebody reading the text may think that "a-z" is not allowed.
Hong
United States #8: November 22, 2004
I think we should also check the last part of a domain to make sure that it has 2 - 4 letters.
Is the following piece right?

/* The last part of domain should have 2 - 4 characters. */
$i--;
if ( ! ereg ("[A-Za-z]{2,4}", $domain_array[$i]) )
return false;
}

Also this piece will check if a domain has a MX record:
// Check if mail exchange exists.
if ( ! getmxrr ($email_array[1], $mxrecords) )
return false;
Hi Hong.

Glad you like the function.

The last part of the domain can be various lengths - .museum, for example, is over 4 characters. I think it's important to use the specifications to define validity, rather than current domain options, as I want to avoid updating this type of thing if possible.

At the end of the day, the only accurate validation is a test email, but this kind of function can usually help pick up obvious typos. Restricting things too far is very bad indeed - this will just flag impossible email addresses.
Hong
United States #10: November 23, 2004
My bad.

After reading the RFCs, I know there's no 2-4 limit. But after I read another person's code, I thought there might be such thing defined elsewhere.

Thanks for clearing things up.
Hi!
Thanks for a good e-mail validator!

But I believe it needs some updating. Some time ago using the Scandinavian letters æ Æ, ø Ø, and å Å in e-mail and URLs became legal.

It would be really nice if you could add this to your code. I have not found any e-mail validator that validates addresses which include these letters.
I wonder whether a domain like this:

[59765

or this:

876487]

is valid. The code validates this domain, but I'm not sure whether that is correct.
I've translated it (in my own way) into Italian.
Great job, great way to check a mail address.
and this one is a good suggestion.
http://www.ilovejackdaniels.com/php/email-address-validation/comments/#comment8
I've got it!!
You mention "You might benefit from also checking in more depth, for example seeing if a domain name is registered." This is exactly what I am looking for - a way to check on-the-fly (I mean by the form processing script) whether an entered domain is valid.

Is there an easy way to do this in PHP? The only thing I have found is a PEAR class, but I don't have control of PHP on the machine I need the script on. And I don't really want to install PEAR just for this one thing.
Hi Wayne. Any whois script should be able to tell you if a domain is registered, and that's probably the easiest way to achieve what you want. Check out http://php.resourceindex.com - loads of scripts and functions there, and there are a few for whois.
I appreciate the email address validation function. However, I need a function that will validate an email address that may include the name. For example:

John Smith <john.smith@gcfl.net>
"Smith, John" <john.smith@gcfl.net>
john.smith@gcfl.net (John Smith)

It would be great if it extracted the name and provided it to the caller as well.
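One way this request might be approached is to peel the display name off first and then pass the remaining address to the validation function. The sketch below assumes PHP 7.1 or later and covers only the three forms listed in the comment; split_name_and_address() is a hypothetical helper, and it does not attempt nested comments or angle brackets inside quoted names.

```php
<?php
// Hypothetical helper: splits a display name from the address it accompanies.
// Returns ['name' => string, 'address' => string], or null if no form matches.
function split_name_and_address(string $input): ?array
{
    $input = trim($input);

    // Name <address>   or   "Name" <address>
    if (preg_match('/^(.*?)\s*<([^<>]+)>$/D', $input, $m)) {
        return ['name' => trim($m[1], " \t\""), 'address' => $m[2]];
    }
    // address (Name)
    if (preg_match('/^(\S+)\s*\(([^()]*)\)$/D', $input, $m)) {
        return ['name' => trim($m[2]), 'address' => $m[1]];
    }
    // Bare address with no name attached.
    if ($input !== '' && strpos($input, ' ') === false) {
        return ['name' => '', 'address' => $input];
    }
    return null;
}
```

The returned address has not been validated at this point; it would still need to go through check_email_address().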
chris
United States #17: April 21, 2005
I've noticed that it doesn't seem to catch a wrong character in the actual users name.
j*ohn@mac.com is valid for example.
Hi Chris.

I'm not sure I understand what you're saying. "j*ohn@mac.com" is a valid email address, and the validation function identifies that correctly.
 United Kingdom #19: May 13, 2005
Hi, I've just started learning PHP and MySQL and just wanted to say thanks for some great code!
 Czech Republic #20: May 18, 2005
Hi,

the first part of the function is not completely right:

instead of ereg:
"[^@]{1,64}@[^@]{1,255}"

should be:
"^[^@]{1,64}@[^@]{1,255}$"

otherwise it would accept e.g.:
"name@domain.sth,name2@domain2.sth"

Regards,
Petr
Nicely spotted, Petr. Thank you!
tepp yogi
France #22: May 18, 2005
hi,

it seems what nik (#12) is pointing at is that the ip address part of the validation is too loose. actually there are 2 problems :

1. the ip validation is too loose for a correct ip address

2. the optional brackets condition looks if an optional left bracket is there *or* if an optional right bracket is there, instead of making sure both conditions ( *and* ) are valid.

here is a proposed ip address validation string :

$domainIp = "(?:[0-9]{1,2}|[0-1][0-9]{2}|2[0-4][0-9]|25[0-5])(?:\\.(?:[0-9]{1,2}|[0-1][0-9]{2}|2[0-4][0-9]|25[0-5])){3}";

here is the solution for accepting optional brackets :
$domainIpWithOptionalBrackets = "(?:" . $domainIp . "|\\[" . $domainIp . "\\])";

cheers,

tepp.
tepp yogi
France #23: May 18, 2005
another thing that has been bothering me :

here is a quote from rfc 822, section 6.1 :
domain = sub-domain *("." sub-domain)
sub-domain = domain-ref / domain-literal
domain-ref = atom

does this mean that a valid email address could be of the form <tepp.yogi@[192.168.0.10].domain.com> ? worse, without the brackets ?

finally, i've got a couple of other questions for you :

1. it *seems* that the higher part (i.e. the .com part for domain.com) of a domain cannot contain any digits, cf. rfc 1123 section 2.1 :
[...] a valid host name can never have the dotted-decimal form #.#.#.#, since at least the highest-level component label will be alphabetic.
do you know if my assumption is correct ?

2. it also *seems* that when using an ip address, the brackets are not optional, cf. rfc 1123, section 2.1 :
[...] a dotted-decimal number must be enclosed within "[ ]" brackets for SMTP mail (see Section 5.2.17)
but then again, when you go to section 5.2.17 as suggested, it says nothing about brackets. any clues ?
Hi Tepp.

Clues, no. Confusion, yes. The RFCs can be a bit of a mess!

2822 has replaced 822. You could argue that an email validator should check against all previous RFCs - so that old email addresses are valid - however, 2822 does take this into account and I think it makes sense to stick to just one document for these purposes!

Similar applies for 1123. 2822 is based in part upon RFC 1123.

As for the domain part of the mail domain being loose - you and Nik are correct. It will validate domains of the type given when it should not. I recommend using the above regex from Tepp for the domain portion of the regex!
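For reference, Tepp's expression uses (?: ... ) groups, which are Perl-compatible syntax, so it would need preg_match() rather than ereg(). A minimal sketch of how it might be wrapped, assuming a modern PHP; domain_is_ipv4() is a hypothetical name used only for this example:

```php
<?php
// Hypothetical helper: true for a dotted-quad IPv4 address, bare or in matching brackets.
function domain_is_ipv4(string $domain): bool
{
    // One octet, 0 to 255 (leading zeros tolerated, as in the comment above).
    $octet = '(?:[0-9]{1,2}|[0-1][0-9]{2}|2[0-4][0-9]|25[0-5])';
    $ip    = $octet . '(?:\.' . $octet . '){3}';

    // Either no brackets at all or both of them; anchored so nothing may follow.
    return preg_match('/^(?:' . $ip . '|\[' . $ip . '\])$/D', $domain) === 1;
}
```

With this in place, the half-bracketed examples raised earlier ([59765 and 876487]) are rejected, as is 256.256.256.256.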
retro
Germany #25: June 13, 2005
Nice function
but the email abc.de.@gmx.at
doesn't work.
can you include this? thx
Kim
Norway #26: June 27, 2005
Small problem: in some ways the validator is too strict when validating local addresses, i.e. addresses that have no domain part and will be delivered to a local user or alias, or addresses where the domain part contains only one label, like user@mail.
Anonymous
United States #27: July 14, 2005
One other item... you can change the "length check" of your domain portion to be from {4,255}, as a valid domain will have at least 4 characters after the "@" sign (like a.us).
codeFiend
United States #28: August 12, 2005
this is a small thing, but I'd say it's easier to use foreach() instead of for() loops in this case, i.e.:

foreach ($local_array as $local_part)
{
  if (!ereg("^(([A-Za-z0-9!#$%&'*+/=?^_`{|}~-][A-Za-z0-9!#$%&'*+/=?^_`{|}~\.-]{0,63})|(\"[^(\\|\")]{0,62}\"))$", $local_part))
  {
    return false;
  }
}

and

foreach ($domain_array as $domain_part)
{
  if (!ereg("^(([A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9])|([A-Za-z0-9]+))$", $domain_part))
  {
    return false;
  }
}
The function is great, thanks a lot. I've extended it a bit to also recognize valid mailaddresses with trailing names (e.g. "John Smith <smith@company.com>").

function ValidateMail($emailAddress_str) {
$theMailAddress_str = $emailAddress_str;
$openBracket_num = strpos($emailAddress_str, '<');
$closeBracket_num = strpos($emailAddress_str, '>');

// check, if mailaddress has an illegal combination of brackets
if ( (($openBracket_num !== false) and ( $closeBracket_num === false )) or
(($openBracket_num === false) and ( $closeBracket_num !== false)) ) {
return false;
}

// check, if mailaddress has a name (e.g. 'John Smith <john@smith.com>')
// if so, get the emailaddress within the brackets for further checks
if (( $openBracket_num !== false ) and ( $closeBracket_num !== false )) {
$theMailAddress_str = substr( $emailAddress_str, ++$openBracket_num, $closeBracket_num - $openBracket_num );
}

// we now check that there's exactly one @ symbol, and that the lengths are right
if (!ereg("^[^@]{1,64}@[^@]{1,255}$", $theMailAddress_str)) {
return false;
}

// Split it into sections to make life easier
$email_array = explode("@", $theMailAddress_str);
$local_array = explode(".", $email_array[0]);
foreach ($local_array as $entry) {
if (!ereg("^(([A-Za-z0-9!#$%&'*+/=?^_`{|}~-][A-Za-z0-9!#$%&'*+/=?^_`{|}~\.-]{0,63})|(\"[^(\\|\")]{0,62}\"))$", $entry)) {
return false;
}
}

if (!ereg("^\[?[0-9\.]+\]?$", $email_array[1])) { // Check if domain is IP. If not, it should be valid domain name
$domain_array = explode(".", $email_array[1]);
if (sizeof($domain_array) < 2) {
return false; // Not enough parts to domain
}
foreach ($domain_array as $entry) {
if (!ereg("^(([A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9])|([A-Za-z0-9]+))$", $entry)) {
return false;
}
}
}
return true;
}
Karl
United States #30: September 12, 2005
Wouldn't the initial ereg call in your finalized function incorrectly reject an email address containing an at symbol in a quoted string for the local-part?

e.g. "my_test@your_site"@address.org

I believe RFC 2822 would allow this as a valid address given the definition of quoted-string.
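One possible way around this, offered only as a sketch, is to split the address at its last @ rather than counting @ symbols, and then validate the two halves separately. The helper name split_at_last_at() is made up for this example and PHP 7.1 or later is assumed; the local part would still need a pattern that accepts an @ inside a quoted string.

```php
<?php
// Hypothetical helper: splits an address at its LAST @ sign.
// Returns [local, domain], or null when there is no @ to split on.
function split_at_last_at(string $email): ?array
{
    $pos = strrpos($email, '@');
    if ($pos === false) {
        return null;
    }
    return [substr($email, 0, $pos), substr($email, $pos + 1)];
}

var_dump(split_at_last_at('"my_test@your_site"@address.org'));
```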
This function rocks and you have saved me a lot of time working out all this validation "stuff" for myself.

THANKS
cor
Netherlands #32: October 26, 2005
This function seems to accept user@domain.com@domain.com

You could easily do a count() of email_array to see if you have more than 1 '@'
 United Kingdom #33: December 9, 2005
Thanks for your code and all the comments! My solution, albeit potentially flawed, is a 2-stage one that is fast while providing some assurance that the domain is in use:-

1) Execute "preg_match( '/^[^@]{1,64}@[^@]{1,255}$/', $arg )" PURELY to calculate the right number of chars in the *whole* string;

2) After a preview of the email to let the user see any mistakes (and once they have pressed the "send this email" button), execute "checkdnsrr( $domain )" and use "mail()" if this returns true.

At the moment, this script doesn't work if the domain is instead an IP address. I intend to cater for this at some point in the future.

Thanks again.
Thanks for a great script!

The only problem I ran across was commented upon by another visitor and acknowledged by the author, but evidently never fixed in the example script on the article page.

After playing with this script, as part of a larger solution, for the last couple of weeks, I can confirm that:


function check_email_address($email) {
// First, we check that there's one @ symbol, and that the lengths are right
if (!ereg("[^@]{1,64}@[^@]{1,255}", $email)) {
// Email invalid because wrong number of characters in one section, or wrong number of @ symbols.
return false;
}
}


Should be changed to read:


function check_email_address($email) {
// First, we check that there's one @ symbol, and that the lengths are right
if (!ereg("^[^@]{1,64}@[^@]{1,255}$", $email)) {
// Email invalid because wrong number of characters in one section, or wrong number of @ symbols.
return false;
}
}


To me, this is of paramount importance, given the nature of email injection attacks.

Anyway, I really enjoy this site! Thanks again!
Is there a reason why you use ereg() instead of preg_match()? http://www.php.net/ereg says:
"Note: preg_match(), which uses a Perl-compatible regular expression syntax, is often a faster alternative to ereg()."
I do not know if this would be a circumstance where ereg() would be faster (the docs do say that it is *often* a faster alternative, implying that sometimes it may not be).

As an addition to #28: codeFiend who said that foreach() makes more sense than for(), I would add that, any functions in the top part of a for(), slow it down. I would take a look at http://www.php.lt/benchmark/phpbench.php
From what I can tell one should use:
$da_size = sizeof ( $domain_array );
for ($i = 0; $i < $da_size; $i++) {...}

Anyway, this is a great little piece of code, thanks!
Sebastien BLAISOT
France #36: February 2, 2006
Thanks for this great piece of code.

to complete it, i just added at the end (just before the "return true" statement) a DNS MX validation of the domain :

// DNS MX check of the specified domainname
if( !checkdnsrr($email_array[1], "MX") )
{
return false;
}
Sebastien BLAISOT
France #37: February 3, 2006
Oops, to be correct, code on comment #36 must be read :

// DNS check of MX of the specified domainname
if( !checkdnsrr($email_array[1], "MX") )
{
if( !checkdnsrr($email_array[1], "A")) {
return false;
}
}

if there is no MX, check if there is an A record (mail of the form user@machine.domain.tld is correct even if machine doesn't have an MX record)
Hi there,

I am just wondering If the code on posting is ready to cut and paste and use for email validation and also for domain check.
It is very consolidated code, but it takes to evaluate all complex expression code there.

Thanks.
sunita
United States #39: February 8, 2006
I tested this email is still valid.
a*b@nodomainavailable.com

which should not be!!!
According to the specs, a*b@nodomainavailable.com is a valid email, I believe. The local part of the email (a*b) can be made up of a selection of characters including, but not limited to, letters and *.
Dario
Italy #41: March 10, 2006
Thanks for the great script!
Is it possible to get the last version of the function....

With added domain validation and corrections..

Thanks in advance
ps: good job on the script
.. And for the DNS check... i think it is missing 1 return false...

should be :

if(!checkdnsrr($email_array[1], "MX")){
if(!checkdnsrr($email_array[1], "A")){
return false;
}
return false;
}
Sebastien Blaisot
France #44: April 12, 2006
I think you shouldn't add this return false.

If there is no MX record, but there is an A record, we have an IP address to forward the mail to, so the dns config is correct.

think of user@mymachine.mydomain.tld

this is correct, even if there's no MX record for mymachine.mydomain.tld but only an A record. The machine will get the mail on its own IP address (and thus should be configured to receive such mails).

When you put the other return false, you just say :
"Regardless of the presence of an A record, the email domain is not validated if there is no MX record", which is not what we want.
(and thus, the A record check becomes useless)
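The logic described in this exchange (accept the domain if it has an MX record, otherwise fall back to an A record) might be written as below. This is a sketch rather than a drop-in: domain_accepts_mail() and its $lookup parameter are assumptions introduced here so the decision can be tried without a network, with checkdnsrr() used when no lookup is supplied. PHP 7.1 or later is assumed.

```php
<?php
// Hypothetical helper: true if the domain has an MX record or, failing that, an A record.
// $lookup stands in for checkdnsrr() so the logic can be exercised without a network.
function domain_accepts_mail(string $domain, ?callable $lookup = null): bool
{
    $lookup = $lookup ?? 'checkdnsrr';

    if ($lookup($domain, 'MX')) {
        return true;
    }
    // No MX: mail falls back to the host's own address record.
    return (bool) $lookup($domain, 'A');
}

// Example with a stand-in lookup that only knows about A records.
$onlyA = function ($domain, $type) {
    return $type === 'A';
};
var_dump(domain_accepts_mail('mymachine.mydomain.tld', $onlyA)); // bool(true)
```

Adding an unconditional return false after the A check, as suggested in the comment before, would make the A lookup pointless, which is the point made above.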
Tr909
Netherlands #45: April 14, 2006
Sadly a@a.a validates as an email address in the function. As far as I can tell there are no 1 character domains and no 1 character TLDs (top level domains).
Scott
United States #46: April 15, 2006
First of all, a big thanks to Dave for the high-quality content.

Re: IP addresses, am I right in thinking that they're always in the format of four numbers between 0 and 255, separated by periods? If that's the case, then user@256.256.256.256 (with or without brackets) shouldn't validate.

Also, would it be useful to account for port numbers?
(e.g. user@xx.xx.xx.xx:543)

Not sure how important these points are, but I thought I'd mention them.

(P.S. Live comment preview makes me paranoid. I feel like someone's looking over my shoulder as I compose my thoughts.)
Callum
United Kingdom #47: May 9, 2006
I've been looking for ages for someone to go through PHP email validation in a readable but comprehensive way. I am more grateful to you than I can express in words. I'm not joking.
Rihard , pt
Portugal #48: May 17, 2006
can any1 post a full script with all sayd before?

ty ty ty
Liam Cody
Australia #49: May 25, 2006
I copied and pasted the code, and sadly, got an error:
Parse error: parse error, unexpected T_STRING in testing.php on line 3
my php file contains only the function as per the page.
Any clues on what's going wrong?
 United Kingdom #50: May 30, 2006
Works for me - thanks very much, a Google search came up with your site. Saved me a lot of time and will definitely have a look around the rest of the code.
Best wishes
THANKS VERY VERY MUCH
mikethecow
United Kingdom #52: June 14, 2006
Hi Dave

Excellent regex script - thanks. I spent DAYS trying to do this...!
hey, thanks man! I was going to do one of these myself and I wasn't looking forward to wrestling with regular expressions. (we have kind of an ugly history)
I just wanted to say thank you to the author and all of the people here who have pointed out flaws and made suggestions for improvement. At the end of the day, there is now a really solid function for checking something that nearly everyone who uses PHP has looked at and thought "what is the easiest and quickest way to do this without having to learn everything about Regex."

that was a rather long winded thank you :)

Thanks again.
Bart
Belgium #56: August 9, 2006
Why do you explode the local part, using the decimal point as the separator? In the regex on the next line, you also include the point, which is superfluous in my opinion.
Naldz
Philippines #57: August 15, 2006
nice script...tnx. This is really a great help.
Thanks Dave, works great!
(#1 tabbing between your forum input ins FF1.5 selects some weird hidden items.) Hey, this live comment preview thing looks really familiar... Er uh, back to my comment.

I just wanted to say thanks for this write up. It is the most comprehensive and clear write up about email addresses and validation I've read to date.

Thanks!
>(#1 tabbing between your forum input ins FF1.5 selects some weird hidden items.)

Thanks for pointing that out. It appears tabbing in Firefox also selects label elements as well as inputs. I've added tabindex to the comment form to enable proper tabbing between elements.

>Hey, this live comment preview thing looks really familiar

Really? What do you mean?

Glad you like the article.
> >Hey, this live comment preview thing looks really familiar
> Really? What do you mean?

- http://dev.wp-plugins.org/wiki/LiveCommentPreview

I suppose it is a fairly common thing to do these days, however :D
I searched Google for validating email address with PHP and this was the first that came up in the list, thanks!
Just went over all the comments and would say most of it is incorporated. The important stuff anyways, except #22.

Nice work.
Great job man, I actually just added this email address check in my own blog. Works perfect. Damn those comment spams!
read this
Ukraine #65: October 16, 2006
for ($i = 0; $i < sizeof($domain_array); $i++) if (!ereg("^(([A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9])|([A-Za-z0-9]{2,}))$", $domain_array[$i])) return false;
jam
United States #66: November 4, 2006
Perfect, thanks!
Haroon ur Rashid Malik
Pakistan #67: November 23, 2006
That's great. I now understand how to validate an email address and a form.
Thank you.
Harro
Belgium #68: November 27, 2006
Finally.. a proper e-mail check..

Most checks are too strict to allow official e-mail addresses.
Skin
Portugal #69: December 4, 2006
I loved it.
Thanks.
I'm yet to find any validation that allows RFC2822 (recursive) comments, but there again, they are nothing short of mad.
Hi, thanks for the in depth article.

I have a question though - where you test the local part of the domain name, you show the example on two lines (two tests) - in your code you then put it altogether.

I'm writing my application in vb.net, and I'm wondering if there is a difference for the escape characters between vb.net and php? For example - in this line...

"^(([A-Za-z0-9!#$%&'*+/=?^_`{|}~-][A-Za-z0-9!#$%&'*+/=?^_`{|}~\.-]{0,63})|(\"[^(\\|\")]{0,62}\"))$"

I get a warning on the first ^ from the right hand side of the string, what I was wondering was whether perhaps the \" before it was being used as an escape character because you wanted to test for the " or not? If so, I think I have to remove those and double up the quotes instead when they are in a string.

Sorry if this sounds a bit gargled, with such a long string and my exceptionally poor knowledge of regular expressions I cant explain this very well...
Erm...for what its worth - I'm not from Spain - I'm from the UK - not sure why it says Spain (please dont reply in Spanish eh!)
Heh, sorry Rob. I'm using a free IP to Country database, and it's not 100% accurate. It's pretty good, for the most part, but it does have issues. I'm working on improving it so it shows you where it think you are from in the comment entry form. Then you can delete it, correct it ... whatever.

In your specific case, the database has your IP as being part of a block belonging to a Spanish ISP.
To answer your question - yes, VB usually doubles up characters to escape them, though only in the case of string containers, not with regular expression symbols. So yes, you need to use "" in your pattern.
Hi Dave,

Thanks for such a prompt response! I've had your page open now for about 2 hours and have been bringing your example into my .net email class - just about finished.

Thanks for the info about the country - it wasn't a major moan or anything - just thought I'd mention it - I work for the NHS so it wouldn't surprise me in the slightest if we have bought in connectivity from some other country! :)

Doubling up the quotes and removing the preceding \ seems to work, so that was cool.

All I need to do now is add some checking for specific tld's as we only allow certain registrations on our software here...

Can you add "strings" to the match with regular expressions as well as the character ranges?

For example, if I wanted to check the domain part for nhs.uk or nhs.net - I can do this easily enough in vb so its no biggy, but whilst this email class is now regex central I thought I'd try and do it all the same ;)
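
Literal strings can sit in a pattern alongside character ranges, with alternatives separated by a pipe. Shown here in PHP, the language of the article, rather than VB.NET, a sketch for the two domains mentioned might look like this; domain_is_allowed() is a hypothetical name.

```php
<?php
// Hypothetical helper: true if the domain is, or ends in, one of the allowed suffixes.
function domain_is_allowed(string $domain): bool
{
    // (?:^|\.) makes sure "nhs.uk" is a whole suffix, not the tail of "notnhs.uk".
    return preg_match('/(?:^|\.)nhs\.(?:uk|net)$/iD', $domain) === 1;
}

var_dump(domain_is_allowed('example.nhs.net')); // bool(true)
var_dump(domain_is_allowed('notnhs.uk'));       // bool(false)
```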

Thanks again for the help and great article.
Dave, one more thing - I'm sure its something I'm doing wrong - but I cant get the test to validate when an IP address is entered...

For example: rob.meade@123.123.123.123 or indeed rob.meade@[123.123.123.123] both of which I believe are valid - yet each time I test it I get a false rather than a true...

I'm using this:

"^\[?[0-9\.]+\]?$"

and am testing on only the 'domain' part, thus 123.123.123.123 or [123.123.123.123]

I'm using the reference info from http://www.regular-expressions.info/reference.html to learn the special characters and they all seem correct to me...

Also - I don't see anything to restrict the IP to just 4 blocks of numbers, ie, we assume in the above that if its numbers and dots only then it must be an IP address, presumably I'd need to add some further code for testing for 4 blocks of numbers, and that they had a range between 0-255 or something?
/holds head in shame...

I had an If <regex test> = False rather than True - my apologies - feel free to delete #77/#76 to protect me from further embarrassment :)
With regards to the IP address validation - I am now trying this:

^\[?[0-9\.]+\]?$

as per your example in order to see if its "like" an IP address or a domain name...

Then, I found this on another site - which I believe tries to match the 4 blocks of numbers, with dots and the correct range for the numbers...

\[?\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b\]?

The problem I'm having is that whilst it will detect 256.123.123.123 as being invalid, it doesn't seem to detect 123.123.123.123.123 (i.e. an extra block) - which I thought was a bit odd...

Any thoughts anyone?
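
A likely cause: that longer pattern has no ^ and $ anchors, so the regex engine is free to find a valid four-block address inside the five-block string (the \b word boundary is satisfied between a digit and a dot). A minimal sketch in PHP with the pattern anchored; the function name looks_like_ipv4 is hypothetical, and the octet pattern is one common way of writing 0-255:

```php
<?php
// Hypothetical helper: true only when the WHOLE string is a dotted-quad
// IPv4 address, optionally wrapped in a matching pair of square brackets.
function looks_like_ipv4($domain)
{
    // One octet: 250-255, 200-249, 100-199, or 0-99 without a leading zero.
    $octet = '(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])';
    $ip    = $octet . '(?:\.' . $octet . '){3}';

    // ^ and $ force the match to span the whole string; the D modifier stops
    // $ from also matching just before a trailing newline.
    return preg_match('/^(?:' . $ip . '|\[' . $ip . '\])$/D', $domain) === 1;
}

var_dump(looks_like_ipv4('123.123.123.123'));     // bool(true)
var_dump(looks_like_ipv4('[123.123.123.123]'));   // bool(true)
var_dump(looks_like_ipv4('123.123.123.123.123')); // bool(false)
var_dump(looks_like_ipv4('256.123.123.123'));     // bool(false)
```

Writing the bracketed and unbracketed forms as two alternatives also rejects a lone opening or closing bracket, which a pattern using \[? and \]? would let through.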
 United Kingdom #79: January 2, 2007
NOTE: The PHP directive magic_quotes_gpc is ON by default, and it essentially runs addslashes() on all GET, POST, and COOKIE data.

This means that if a user in a form inputs

"dave"@domain.com

what you get from a $_POST variable is

\"dave\"@domain.com

The check_email_address() function will find this INVALID because the "\" is not allowed, however what the user actually entered was valid.

A solution to this is to check if magic_quotes_gpc is on and if so use the php stripslashes() function BEFORE checking if the email is valid, like this (where 'user_input' is the name of the <input> tag in your form):

if ( get_magic_quotes_gpc() ) {
$email = stripslashes( $_POST['user_input'] );
} else {
$email = $_POST['user_input'];
}

if ( check_email_address($email) ) { // email valid...

PS It would NOT be a good idea to add the following code to the check_email_address() function at the very beginning:

if ( get_magic_quotes_gpc() ) {
$email = stripslashes( $email );
}

this would limit the use of the function to emails passed directly from a $_POST variable. Other variables passed to this function may not have been through addslashes() or may already have had stripslashes() done, so the function could remove a "\" from an invalid address and then call it valid. The "loosely coupled, tightly bound" idea.
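
A minimal sketch of the same idea as a small reusable helper, keeping the unescaping outside the validator. The name unescape_input is made up for illustration. Note that magic quotes were removed in PHP 5.4 and get_magic_quotes_gpc() itself was removed in PHP 8.0, so the call is guarded here:

```php
<?php
// Hypothetical helper: undo addslashes()-style escaping, but only when the
// caller says the value was escaped. The validator itself stays untouched.
function unescape_input($value, $was_escaped)
{
    return $was_escaped ? stripslashes($value) : $value;
}

// Guarded so the script still runs where the function no longer exists.
// The @ hides the deprecation notice raised by PHP 7.4.
$magic_quotes_on = function_exists('get_magic_quotes_gpc')
    && @get_magic_quotes_gpc();

// What a form field would look like with magic quotes on:
$raw = '\\"dave\\"@domain.com';
var_dump(unescape_input($raw, true)); // string(17) ""dave"@domain.com"
```

In a real form handler the first argument would be $_POST['user_input'] and the second would be $magic_quotes_on.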
I'm being recommended this :

function checkmail($email){

return(preg_match("/^[-+_.[:alnum:]]+@((([[:alnum:]]|[[:alnum:]][[:alnum:]-]*[[:alnum:]])\.)+(ad|ae|aero|af|ag|ai|al|am|an|ao|aq|ar|arpa|as|at|au|aw|az|ba|bb|bd|be|bf|bg|bh|bi|biz|bj|bm|bn|bo|br|bs|bt|bv|bw|by|bz|ca|cc|cd|cf|cg|ch|ci|ck|cl|cm|cn|co|com|coop|cr|cs|cu|cv|cx|cy|cz|de|dj|dk|dm|do|dz|ec|edu|ee|eg|eh|er|es|et|eu|fi|fj|fk|fm|fo|fr|ga|gb|gd|ge|gf|gh|gi|gl|gm|gn|gov|gp|gq|gr|gs|gt|gu|gw|gy|hk|hm|hn|hr|ht|hu|id|ie|il|in|info|int|io|iq|ir|is|it|jm|jo|jp|ke|kg|kh|ki|km|kn|kp|kr|kw|ky|kz|la|lb|lc|li|lk|lr|ls|lt|lu|lv|ly|ma|mc|md|mg|mh|mil|mk|ml|mm|mn|mo|mp|mq|mr|ms|mt|mu|museum|mv|mw|mx|my|mz|na|name|nc|ne|net|nf|ng|ni|nl|no|np|nr|nt|nu|nz|om|org|pa|pe|pf|pg|ph|pk|pl|pm|pn|pr|pro|ps|pt|pw|py|qa|re|ro|ru|rw|sa|sb|sc|sd|se|sg|sh|si|sj|sk|sl|sm|sn|so|sr|st|su|sv|sy|sz|tc|td|tf|tg|th|tj|tk|tm|tn|to|tp|tr|tt|tv|tw|tz|ua|ug|uk|um|us|uy|uz|va|vc|ve|vg|vi|vn|vu|wf|ws|ye|yt|yu|za|zm|zw)$|(([0-9][0-9]?|[0-1][0-9][0-9]|[2][0-4][0-9]|[2][5][0-5])\.){3}([0-9][0-9]?|[0-1][0-9][0-9]|[2][0-4][0-9]|[2][5][0-5]))$/i",$email));
}

or

a different regex (after I commented about TLDs) :

/^[-+_.[:alnum:]]+@(?:(?:(?:[[:alnum:]]|[[:alnum:]][[:alnum:]-]*[[:alnum:]])\.)+(?:[a-z]+)$|(([0-9][0-9]?|[0-1][0-9][0-9]|[2][0-4][0-9]|[2][5][0-5])\.){3}([0-9][0-9]?|[0-1][0-9][0-9]|[2][0-4][0-9]|[2][5][0-5]))$/i

I'm not 100% convinced.
Chris, I don't blame you. That looks like it might work, but would still need updating whenever the TLDs or CCTLDs are changed.
Invalid account passed: teste@222
Here's a function that'll use preg_match and if that's passed it'll actually check if the domain name really exists.

Hope it helps...

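function check_email_mx($email)
{
    // Continue only if none of the malformed patterns match (two @ signs,
    // doubled dots, a dot next to the @, a leading dot) AND the overall
    // shape is plausible. TLDs may be longer than three letters (.info, .aero).
    if (!preg_match('/(@.*@)|(\.\.)|(@\.)|(\.@)|(^\.)/', $email) &&
        preg_match('/^.+\@(\[?)[a-zA-Z0-9\-\.]+\.([a-zA-Z]{2,}|[0-9]{1,3})(\]?)$/', $email))
    {
        $host = explode('@', $email);
        // The trailing dot makes the looked-up name fully qualified
        if (checkdnsrr($host[1].'.', 'MX')) return true;
        if (checkdnsrr($host[1].'.', 'A')) return true;
        if (checkdnsrr($host[1].'.', 'CNAME')) return true;
    }
    return false;
}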
Nice site by the way, I can really use your cheat-sheets
useful entry... thank you very much!!!!
markc1223@yahoo.com
Anonymous
United States #86: March 9, 2007
Great time saver. Thanks!!
Saved me a lot of time too :)
Anonymous
United States #88: March 15, 2007
the input "2@." will pass validation
The input "2@." will pass the first part of validation and fail on the domain validation, from my reading of it.
dns
United Kingdom #90: March 28, 2007
hi there
nice script, works great. i just have one thing to add -- please correct me if i'm wrong: a domain has to be at least 2 chars long, like mail@ab.com, and at the moment the validation works if i put mail@a.com

also, the alias should be at least two chars long, and so far it also validates a@ab.com.

i can't fix that, but i hope it helps improving ;)
I just wanted to say thanks for the php email validation script - it's working great so far!
 United States #92: March 30, 2007
thanks, it works great. no error unlike others

and really easy. I like this site alot. top quality stuff.
 United States #93: March 30, 2007
The two work great together. This function is from a comment here:

<?php


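function check_email_mx($email)
{
    // Continue only if none of the malformed patterns match (two @ signs,
    // doubled dots, a dot next to the @, a leading dot) AND the overall
    // shape is plausible. TLDs may be longer than three letters (.info, .aero).
    if (!preg_match('/(@.*@)|(\.\.)|(@\.)|(\.@)|(^\.)/', $email) &&
        preg_match('/^.+\@(\[?)[a-zA-Z0-9\-\.]+\.([a-zA-Z]{2,}|[0-9]{1,3})(\]?)$/', $email))
    {
        $host = explode('@', $email);
        // The trailing dot makes the looked-up name fully qualified
        if (checkdnsrr($host[1].'.', 'MX')) return true;
        if (checkdnsrr($host[1].'.', 'A')) return true;
        if (checkdnsrr($host[1].'.', 'CNAME')) return true;
    }
    return false;
}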

?>


add that function below the function from this site.
and then check this way


<?php
if (!check_email_address($email)) {
echo "<br/>-Email address appears to be invalid";

}
else
{
if (!check_email_mx($email)) {
echo "<br/>-Email address appears to be invalid";

}
}
?>


If it is valid there is no need to say so, right? We are just checking whether it is invalid.
Marcin Wiazowski
Poland #94: April 12, 2007
Hello,


I've found some errors:

1) "ereg" functions use POSIX regular expressions. "." (dot) has no special meaning inside [] in POSIX regular expressions, so you should use "." instead of "\." inside [] if you use "ereg" functions (but - as I remember - not, if you use "preg" functions). Correct this in your code and in examples in your article. If you fix errors listed below and use improvements listed below, there is no need to do anything - all dots inside [] will be removed.

2a) user@[123.123.123.123 and user@123.123.123.123] are not valid.
2b) domain names with IP addresses should be verified, too - this fixes an error referred in comment #88 (refer also to comments #22, #78).
Common 2a and 2b solution: You should replace
if (!ereg("^\[?[0-9\.]+\]?$", $email_array[1])) {
with
if (preg_match('/^(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3}$/', $email_array[1]) || preg_match('/^\[(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3}\]$/', $email_array[1])) { return true; } else {

3) quoted string in local part CAN contain "\" - refer to RFC. Quoted string is treated as without quotes, so there should be one or more chars inside quotes. You should replace (\"[^(\\|\")]{0,62}\") with (\"[^\"]{1,62}\") or better with (\"[^\"]+\") - maximum length is checked earlier.

4a) "ereg" functions are not binary-safe ("preg"s are binary safe). You should check for invalid 0x00 characters at the beginning of the function - using a binary-safe string search function like "strpos", directly by using "preg", or by replacing the first occurrence of "ereg" with "preg".
4b) characters 0x7F-0xFF are not allowed - refer to RFC. It would be nice to eliminate also control characters - 0x00-0x1F.
Common 4a and 4b solution: Put at the beginning: if (preg_match('/[\x00-\x1F\x7F-\xFF]/', $email)) { return false; }



Some improvements:

1) "." (dots) are eliminated in explode(".", $email_array[0]), so there is no need to include "." inside the regular expression in the code below explode - you can replace ([A-Za-z0-9!#$%&'*+/=?^_`{|}~-][A-Za-z0-9!#$%&'*+/=?^_`{|}~\.-]{0,63}) with ([A-Za-z0-9!#$%&'*+/=?^_`{|}~-]{1,63}) or better with ([A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+) - maximum length is checked earlier (referring to comment #56).

2) "+" is needless in domain name validation pattern - you can replace (!ereg("^(([A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9])|([A-Za-z0-9]+))$", $domain_array[$i])) with (!ereg("^(([A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9])|([A-Za-z0-9]))$", $domain_array[$i]))

3) "foreach" is faster than "for".

4) patterns enclosed with single quotes are processed faster than patterns enclosed with double quotes.

5) "preg"s are faster than "ereg"s (and are binary-safe, as mentioned above).



Below is a function with all of the above fixes and improvements. You can put this function in your article if you want :)

function check_email_address($email)
{
    // Check for invalid characters
    if (preg_match('/[\x00-\x1F\x7F-\xFF]/', $email))
        return false;

    // Check that there's one @ symbol, and that the lengths are right
    if (!preg_match('/^[^@]{1,64}@[^@]{1,255}$/', $email))
        return false;

    // Split it into sections to make life easier
    $email_array = explode('@', $email);

    // Check local part
    $local_array = explode('.', $email_array[0]);
    foreach ($local_array as $local_part)
        if (!preg_match('/^(([A-Za-z0-9!#$%&\'*+\/=?^_`{|}~-]+)|("[^"]+"))$/', $local_part))
            return false;

    // Check domain part
    if (preg_match('/^(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3}$/', $email_array[1])
        || preg_match('/^\[(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(\.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3}\]$/', $email_array[1]))
        return true; // If an IP address
    else
    { // If not an IP address
        $domain_array = explode('.', $email_array[1]);
        if (sizeof($domain_array) < 2)
            return false; // Not enough parts to be a valid domain

        foreach ($domain_array as $domain_part)
            if (!preg_match('/^(([A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9])|([A-Za-z0-9]))$/', $domain_part))
                return false;

        return true;
    }
}



Things to do:

1) "my_test@your_site"@address.org is a VALID email address (referring to comment #30), so "@" inside double quotes are allowed. Spaces inside double quotes are also allowed.
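
One possible first step towards this to-do item, offered as a sketch rather than a finished fix: split the address at the LAST "@" instead of using explode(). A domain can never contain "@", so the last one is always the separator. The name split_email is hypothetical:

```php
<?php
// Hypothetical helper: split an address into array(local part, domain) at
// the last "@", so an "@" inside a quoted local part is left alone.
// Returns null when there is no "@" at all.
function split_email($email)
{
    $at = strrpos($email, '@');
    if ($at === false) {
        return null;
    }
    return array(substr($email, 0, $at), substr($email, $at + 1));
}

print_r(split_email('"my_test@your_site"@address.org'));
// local part: "my_test@your_site" (quotes included), domain: address.org
```

This only covers the split. The length check that uses [^@] and the quoted-string pattern for the local part would still need to be loosened before such an address could pass the function above.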



Notices:

1) referring to comment #25: "." (dot) is not allowed at the beginning or at the end of a local part. Dots must also be separated - two or more adjacent dots are not allowed - refer to RFC. This is properly checked by the function above.

2) IP addresses as domain names can also be in following formats (but mail server may not recognize them!):
dword – http://3515261219/ (representations of the dot-less "dword" IP address can also be achieved by adding multiples of 4294967296)
octal – http://0321.0206.0241.0043/
hexadecimal – http://0xD1.0x86.0xA1.0x23/ or http://0xD186A123/
mixed - http://0321.0x86.161.0043


Regards :)
Nice! I've been looking for the email validation in PHP for a long time. I have only found it in JavaScript before now :D
sha
United States #96: April 18, 2007
Can some one look at my code?
Here is what I have so far and it works...
The question is how do I add the validators you all are mentioning. (the email validation) I am new to php. So some of this looks strange. Will I have to completely re do my code to add the validation?
see below:

<?php
$to = "email omitted";
$from = $_REQUEST['Email'] ;
$name = $_REQUEST['Name'] ;
$title = $_REQUEST['Title'];
$headers = "From: $from";
$subject = "Web Contact Data";
$send = "mailto:email omitted";

$fields = array();
$fields{"Name"} = "Name";
$fields{"Title"} = "Title";
$fields{"Email"} = "Email";
$fields{"Phone"} = "Phone";
$fields{"Message"} = "Message";

$body = "We have received the following information:\n\n"; foreach($fields as $a => $b){ $body .= sprintf("%20s: %s\n",$b,$_REQUEST[$a]); }

$headers2 = "omitted";
$subject2 = "Thank you for contacting us";
$autoreply = "Thank you for contacting us.
Do not reply, we will get back to you as soon as possible, usually within 48 hours. If you have any more questions, please consult our website";

if($from == '') {print "You have not entered an email, please go back and try again";}
else {
if($name == '') {print "You have not entered a name, please go back and try again";}
else {
$send ="mailto:email omitted";
$send = mail($to, $subject, $body, $headers);
$send2 = mail($from, $subject2, $autoreply, $headers2);
if($send)
{header( "Location: site omitted/test/thankyou.html" );}
else
{print "We encountered an error sending your mail, please notify omitted-email";}
}
}
?>

I appreciate your help. Thanks in advance.
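
It should not need a complete rewrite. One way to slot validation in, as a rough sketch only: collect the checks into a function that returns an error message (or null), and call it before mail(). The name contact_form_error is hypothetical, and the validator is passed in so that check_email_address from the article (or any other) can be used:

```php
<?php
// Sketch only: returns an error message, or null when the input looks fine.
// $is_valid_email is any callable that takes an address and returns
// true/false, e.g. 'check_email_address'.
function contact_form_error(array $input, $is_valid_email)
{
    $from = isset($input['Email']) ? trim($input['Email']) : '';
    $name = isset($input['Name']) ? trim($input['Name']) : '';

    if ($from === '') {
        return 'You have not entered an email, please go back and try again';
    }
    if (!call_user_func($is_valid_email, $from)) {
        return 'Email address appears to be invalid';
    }
    if ($name === '') {
        return 'You have not entered a name, please go back and try again';
    }
    return null;
}

// In the script above this would replace the nested if/else, roughly:
//   $error = contact_form_error($_REQUEST, 'check_email_address');
//   if ($error !== null) { print $error; }
//   else { /* build $body and call mail() as before */ }
```

Since $from ends up in the From: header, a validator that rejects control characters (including line breaks) also helps keep extra headers from being injected through the form.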
Anonymous
France #97: April 24, 2007
I looked through RFC 2822 and couldn't find anything against a one-character second-level domain name. In fact I found some existing domains; some are listed at: http://en.wikipedia.org/wiki/Single-letter_second-level_domains
Other TLDs may also allow such names, e.g. "d.cz".
Cacou
United States #98: May 4, 2007
I could be wrong, but isn't
x@xx.com.
valid? (note the . at the end)
At least xx.com. is a valid domain name.
Your code claims the above is not a correct email address.
@Cacou:

The period is a separator. It separates "labels", each of which must be between 1 and 63 characters in length. Therefore, according to the specs, "xx.com." is not a valid domain. It will probably resolve just fine, because to all intents and purposes it is just "xx.com".
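
Under that reading, the trailing period leaves an empty final label, which is what the check trips over. A minimal sketch, with a made-up function name and a label pattern that assumes letters, digits and inner hyphens only:

```php
<?php
// Hypothetical helper: every dot-separated label must be 1 to 63 characters,
// made of letters and digits, with hyphens allowed only in the middle.
function domain_labels_valid($domain)
{
    foreach (explode('.', $domain) as $label) {
        if (!preg_match('/^(?:[A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9]|[A-Za-z0-9])$/D', $label)) {
            return false;
        }
    }
    return true;
}

var_dump(explode('.', 'xx.com.'));        // three labels: "xx", "com", ""
var_dump(domain_labels_valid('xx.com'));  // bool(true)
var_dump(domain_labels_valid('xx.com.')); // bool(false), the last label is empty
var_dump(domain_labels_valid('d.cz'));    // bool(true), one-letter labels are fine
```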

@Anonymous (#97):

You are right - labels (parts of domains) can be 1 letter in length, officially.

@Marcin:

Thank you for taking the time to comment and to fix some of the outstanding issues with the function. I have added a note and link to your comment to the article so others can find this version.

@Everyone else:

Thank you :). It's very rewarding to see that almost three years after this was initially posted, it's still useful to people and still generating conversation. It's a tribute to all of you that this thread is almost at 100 comments, with so many people contributing thoughts, suggestions and improvements over such a long period.
Kristoffer
Denmark #100: May 5, 2007
I do not really have anything interesting to add to this, I just want to thank you for your work. This piece has saved me a lot of time and for that I am grateful.

Thank you :-)
Thanks for the great code. As the poster above me said, this saved me time. I actually just needed the domain validation part, so I went ahead and used it as standalone code!

good job !
venus, philippines
Philippines #102: May 21, 2007