11.11.2005

Form data validation in PHP and JavaScript

I'm mentoring a few folks at work in web app development, specifically PHP. Today, the subject of form data validation came up when I pointed out to them that their app had none. No matter what data I entered for name, email address, etc., that data went right into the database (or at least an insert was attempted). It was hack city.

There are 3 different types of data validation checks one can do:
  • Syntactic validation
  • Semantic validation
  • Domain or model validation

Syntactic validation makes sure that the syntax of incoming data is correct. For instance, in a "first name" field you wouldn't expect numbers, ampersands or other strange characters. That garbage has to come out before the name can go into the database, so it's better to make the user fix their data than to clean it up yourself. Another common case is telephone numbers, social security numbers, etc. You don't want any letters or other non-digit characters in there.

Syntactic validation also covers minimum and maximum lengths for fields. If a new user must have a username that's 5 or more characters, that's a syntactic check. Same thing for maximum lengths. Checking for required fields (i.e., is the username field blank?) is syntactic validation too.
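Here's a rough sketch of those checks in JavaScript. The function names and the exact rules (ASCII-only names, 10-digit phone numbers) are my own assumptions for illustration, not rules from any particular app, and the same logic would need to be repeated in PHP on the server.

```javascript
// Hypothetical helpers; names and rules are assumptions for illustration.

// Required-field check: undefined, null, empty and whitespace-only all count as blank.
function isBlank(value) {
  return value === undefined || value === null || String(value).trim() === '';
}

// Length check, inclusive on both ends.
function hasLengthBetween(value, min, max) {
  const length = String(value).length;
  return length >= min && length <= max;
}

// Name check: must start with a letter, then letters, spaces, apostrophes or hyphens.
// ASCII-only is a simplification; real names need a wider character set.
function looksLikeName(value) {
  return /^[A-Za-z][A-Za-z' -]*$/.test(value);
}

// Remove the separators people commonly type into phone numbers.
function stripSeparators(value) {
  return String(value).replace(/[\s().-]/g, '');
}

// Phone check: exactly 10 digits once separators are removed (a US-style assumption).
function isPhoneNumber(value) {
  return /^\d{10}$/.test(stripSeparators(value));
}
```

With these in place, a username rule of "5 or more characters, at most 20" would be `!isBlank(username) && hasLengthBetween(username, 5, 20)`.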

Another common use of syntactic validation on the web is checking for valid email addresses and URLs. You need to make sure the email addresses folks are giving you are at least of the proper form so you can email them good marketing stuff later. You could also check that the email address actually exists, though that would fall under domain validation.
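A minimal sketch of "proper form" checks might look like the following. Both function names are hypothetical, the email pattern is deliberately loose (it only checks the overall shape, not every rule in the email specs), and the URL check assumes an environment with the built-in `URL` class (modern browsers and Node.js).

```javascript
// Loose shape check: something@something.something, no spaces, exactly one "@".
function looksLikeEmail(value) {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value);
}

// URL check using the built-in URL parser, accepting only http and https links.
function looksLikeHttpUrl(value) {
  try {
    const url = new URL(value);
    return url.protocol === 'http:' || url.protocol === 'https:';
  } catch (err) {
    // new URL() throws when the string cannot be parsed as an absolute URL.
    return false;
  }
}
```

Neither check tells you that the mailbox or the page exists; they only catch input that is obviously the wrong shape.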

Syntactic validation is the simplest form of validation. The beauty of it is that you can do most syntactic checks in JavaScript on the client side and then again in PHP on the server side. The client-side check gives the user quick feedback; the server-side check is the one you can actually trust, because anyone can skip your JavaScript and post whatever they like. This "double-checking" stumped my associates until I mocked up a quick form on my own machine that posted junk values to their test app, causing all kinds of MySQL errors. I've seen programmers who don't validate on both the client side and the server side, but those programmers are no longer working steadily.

Another form of data validation is semantic validation. Semantic checks make sure that one piece of information makes sense in relation to the other incoming information.

For example, when registering a new user, most web apps show the password box twice to make sure the user enters their desired password correctly. Checking that these two passwords are the same is an example of semantic validation. Another example is checking that a field which is required only because of another field's value is actually filled in. So, if the user specifies the country as US, they also have to specify a state; otherwise, they don't.
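Both of those examples can be sketched in a few lines. The function name and the form field names (`password`, `passwordConfirm`, `country`, `state`) are assumptions made up for this illustration.

```javascript
// Hypothetical form shape: { password, passwordConfirm, country, state }.
// Returns a list of error messages; an empty list means the semantic checks passed.
function checkRegistrationSemantics(form) {
  const errors = [];

  // The two password boxes must agree with each other.
  if (form.password !== form.passwordConfirm) {
    errors.push('The two passwords do not match.');
  }

  // State is required only when the country is US.
  const stateMissing = !form.state || form.state.trim() === '';
  if (form.country === 'US' && stateMissing) {
    errors.push('A state is required when the country is US.');
  }

  return errors;
}
```

Notice that each rule looks at more than one field at a time, which is what separates it from a syntactic check.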

Semantic validation rules can get very complex, very quickly. While it's easy to build up a library of reusable code of syntax validation rules, semantic validation rules often require some custom work on every form. Again, semantic validation rules are most effective when used on both the client-side and server-side.

The third kind of form data validation is Domain or model validation. Domain validation requires checking the incoming info against another source of acceptable values. This could be existing database records, a config file, or simply PHP code. This kind of validation is checking the "domain" or "model" to make sure this incoming info makes sense.

A great example of domain validation is when registering as a new user on a site. After passing all the syntactic and semantic checks for registering a new username and password on the site, the app has to check to make sure the username you are requesting isn't already taken. So, it queries the users table in the db to see if your requested username is there. If it is, then the app needs to alert you to what went wrong and give you another chance to suggest a username. So, our web app is checking your requested username against the domain of existing usernames in our system.
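The logic of that username check can be sketched as below. JavaScript is used here only to keep the examples in one language; the check itself belongs on the server, where the users table lives. The `Set` is a stand-in for that table, and the function names and the case-insensitive comparison are assumptions on my part.

```javascript
// Stand-in for the users table. In a real app this lookup would be a
// database query run on the server, not an in-memory set.
const existingUsernames = new Set(['alice', 'bob']);

// Assumption: usernames are compared case-insensitively, ignoring surrounding spaces.
function isUsernameTaken(username, takenUsernames) {
  return takenUsernames.has(username.trim().toLowerCase());
}

// Returns a result the app can use to tell the user what went wrong.
function checkRequestedUsername(username, takenUsernames) {
  if (isUsernameTaken(username, takenUsernames)) {
    return { ok: false, message: 'That username is already taken. Please pick another.' };
  }
  return { ok: true, message: '' };
}
```

The point is that the answer depends on data the form itself doesn't carry: the same input can be valid today and invalid tomorrow.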

Other examples of domain validation include checking that an article_id in a CMS exists before we edit or delete it. Domain validation is also used in many permission schemes in web apps (i.e., you can perform this action only if your user level is greater than 5 or something).

Domain validation will almost always need to be done on the server side, as that's where the domain or model usually resides. So you can't do domain validation in client-side JavaScript (well, you could, but it would be the most insecure app in the world).

Now, the trick with form data validation is that most of the checks you're doing on the data will be repeated in every web app you create. So your best bet is to create some type of class to take care of all this for you (more on this in a future article).
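To give a rough idea of what such a class could look like, here is one possible sketch. `FormValidator`, `addRule` and `validate` are hypothetical names I made up for this example, not an existing library, and a PHP version would follow the same outline.

```javascript
// Hypothetical FormValidator: holds a list of rules and reports every failure per field.
class FormValidator {
  constructor() {
    this.rules = [];
  }

  // test receives (value, allData) so a rule can also look at other fields.
  addRule(field, message, test) {
    this.rules.push({ field, message, test });
    return this; // returning this allows chained addRule calls
  }

  // Runs every rule and groups the failure messages by field name.
  validate(data) {
    const errors = {};
    for (const { field, message, test } of this.rules) {
      if (!test(data[field], data)) {
        (errors[field] = errors[field] || []).push(message);
      }
    }
    return { valid: Object.keys(errors).length === 0, errors };
  }
}

// Example setup for a registration form (field names are assumptions).
const registration = new FormValidator()
  .addRule('username', 'Username is required.', (v) => typeof v === 'string' && v.trim() !== '')
  .addRule('username', 'Username must be at least 5 characters.', (v) => String(v || '').length >= 5)
  .addRule('passwordConfirm', 'Passwords must match.', (v, all) => v === all.password);
```

Calling `registration.validate(formData)` returns an object with a `valid` flag and an `errors` map keyed by field, so the form can show each message next to the field it belongs to.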
