How to validate an email address?
January 28th, 2010

Having worked on various web projects, I often run into a well-known problem: finding an effective regular expression (regexp) to check the validity of user-submitted email addresses.

In his blog, Fighting for a lost cause, Ian Dunn has compiled various regular expressions that try to address this problem. The author’s idea is great: using a set of valid/invalid emails and a simple unit test, he provides a good comparison of some of the most widely used regexps.

His philosophy is simple: “It’s better to accept a few invalid addresses than reject any valid ones, so I’m looking for 0 false-positives and as few false-negatives as possible.”
But I’ve noticed two problems:

  1. His “best” regexp doesn’t work in JavaScript (at the time of writing, JS doesn’t support advanced features like negative lookbehind).
  2. The method used to validate IP addresses is not correct (it doesn’t check that each number is within the 0-255 range).
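
To illustrate the second point, here is a minimal sketch in Python. The "naive" pattern is a hypothetical example of a check that ignores the 0-255 range (it is not taken from any of the compared regexps); the range-aware octet group is a simplified but equivalent form of the one used in the regex given later in this post. The function names are mine.

```python
import re

# Hypothetical naive check: any 1 to 3 digits per number, so 999 passes.
NAIVE_IP = re.compile(r"\d{1,3}(\.\d{1,3}){3}", re.ASCII)

# Range-aware group: 0-199, then 200-249, then 250-255.
OCTET = r"(1?\d{1,2}|2[0-4]\d|25[0-5])"
STRICT_IP = re.compile(OCTET + r"(\." + OCTET + r"){3}", re.ASCII)


def is_naive_ip(text):
    """True if text looks like four dot-separated groups of digits."""
    return NAIVE_IP.fullmatch(text) is not None


def is_strict_ip(text):
    """True if text is four dot-separated numbers, each in 0-255."""
    return STRICT_IP.fullmatch(text) is not None


print(is_naive_ip("999.1.1.1"))   # True: the range is not checked
print(is_strict_ip("999.1.1.1"))  # False
print(is_strict_ip("192.168.0.255"))  # True
```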

So I’ve decided to improve another existing regex, created by Warren Gaebel and already enhanced by Guillaume Arluison, by adding another test criterion: also checking the “real” validity of the IP address.

Here is my solution:
/^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*@([a-z0-9]([-a-z0-9_]?[a-z0-9])*(\.[-a-z0-9_]+)*\.(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org|pro|travel|mobi|[a-z]{2})|([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})(\.([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})){3})(:[0-9]{1,5})?$/i

This one works very well: it accepts 18/18 valid addresses (with a deep IP address check) and rejects 19/20 invalid ones. The one that slips through is a problem with checking the overall length.
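
One possible way to cover the length problem is to check lengths separately before applying the regex. The sketch below does this in Python with the regex above copied unchanged. The limits (64 characters for the part before the @, 254 for the whole address) are the commonly cited SMTP limits and are my assumption, as is the function name; neither comes from the test set mentioned above.

```python
import re

# The regex from this post, split over several lines for readability only.
EMAIL_RE = re.compile(
    r"^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*@"
    r"([a-z0-9]([-a-z0-9_]?[a-z0-9])*(\.[-a-z0-9_]+)*"
    r"\.(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org"
    r"|pro|travel|mobi|[a-z]{2})"
    r"|([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})"
    r"(\.([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})){3})"
    r"(:[0-9]{1,5})?$",
    re.IGNORECASE | re.ASCII,  # ASCII keeps \d and [a-z] to plain ASCII
)

# Assumed limits (not part of the original regex).
MAX_TOTAL = 254
MAX_LOCAL = 64


def is_valid_email(address):
    """Length checks first, then the regex on the whole string."""
    if len(address) > MAX_TOTAL:
        return False
    local, sep, _domain = address.rpartition("@")
    if not sep or len(local) > MAX_LOCAL:
        return False
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    return EMAIL_RE.fullmatch(address) is not None


print(is_valid_email("john.doe@example.com"))      # True
print(is_valid_email("user@192.168.1.1"))          # True
print(is_valid_email("user@256.1.1.1"))            # False
print(is_valid_email("a" * 65 + "@example.com"))   # False: local part too long
```

Doing the length checks first also means over-long input never reaches the regex engine.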

There’s just one small drawback: each time a new TLD longer than 2 characters is added, you’ll need to append it to the list in the regex. If you want a more generic solution, you can use this variant (note that this version does not check whether the TLD really exists):

/^[-a-z0-9~!$%^&*_=+}{\'?]+(\.[-a-z0-9~!$%^&*_=+}{\'?]+)*@([a-z0-9]([-a-z0-9_]?[a-z0-9])*(\.[-a-z0-9_]+)*\.([a-z]{2,6})|([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})(\.([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})){3})(:[0-9]{1,5})?$/i
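
The two regexes differ only in the TLD group. The Python sketch below rebuilds both from shared pieces to make that difference visible; the helper `build_email_re` and the constant names are mine, and `example.xyz` and `example.photography` are just sample domains chosen to show the effect of the list and of the 2 to 6 letter limit.

```python
import re

# Shared building blocks, copied from the two regexes in this post.
ATOM = r"[-a-z0-9~!$%^&*_=+}{\'?]+"
OCTET = r"([1]?\d{1,2}|2[0-4]{1}\d{1}|25[0-5]{1})"

# The only part that changes between the two versions.
TLD_LIST = ("aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net"
            "|org|pro|travel|mobi|[a-z]{2}")
TLD_GENERIC = "[a-z]{2,6}"


def build_email_re(tld):
    """Assemble the full regex around the given TLD sub-pattern."""
    local = ATOM + r"(\." + ATOM + r")*"
    host = r"[a-z0-9]([-a-z0-9_]?[a-z0-9])*(\.[-a-z0-9_]+)*\.(" + tld + ")"
    ip = OCTET + r"(\." + OCTET + r"){3}"
    pattern = "^" + local + "@(" + host + "|" + ip + ")(:[0-9]{1,5})?$"
    return re.compile(pattern, re.IGNORECASE | re.ASCII)


LISTED = build_email_re(TLD_LIST)
GENERIC = build_email_re(TLD_GENERIC)


def accepts(regex, address):
    return regex.fullmatch(address) is not None


print(accepts(LISTED, "user@example.xyz"))    # False: "xyz" is not in the list
print(accepts(GENERIC, "user@example.xyz"))   # True: any 2 to 6 letters pass
print(accepts(GENERIC, "user@example.photography"))  # False: more than 6 letters
```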

These two solutions should be usable in any language that provides Perl-style regular expressions (PCRE or a similar engine), on both the server and the client side (JavaScript, PHP, Perl, Python, Ruby, etc.).

Prelude SIM: Security Information Management system
January 18th, 2010

We live in an over-networked world where security is becoming more and more important to protect us from information theft, server downtime and other attacks.

Various solutions exist. I recently gave an internal presentation on the Prelude SIM (Security Information Management) system, a project I have contributed to. It’s an open source solution that lets you monitor your infrastructure in real time by correlating events from deployed sensors such as Snort (IDS), Samhain (file system integrity checker) or Prelude-LML (log analyzer), and helps you react quickly to a potential attack.

Here are my slides: Prelude SIM Talk