Monday, July 24. 2006
Writing secure PHP
PHP makes it very easy to add functionality to static websites, especially for the new developer who may not have any prior experience or formal instruction in programming. Sadly this often means that developers are not aware of potential security vulnerabilities in their applications. This article is intended as a short overview of some of the common pitfalls encountered and how to avoid them. In later articles I’ll be expanding on some of the areas to provide more detail.
Unvalidated user input
Of the four potential security problems I discuss in this article, not validating user input is the most dangerous. Not fully validating input allows a malicious visitor to cause very serious problems for your site, and even well-intentioned visitors could easily create problems without knowing it.
If you are to properly protect your site from a malicious attack you have to assume that every visitor may be malicious, and that no piece of user submitted data can be trusted. This includes data that you have already checked on the client’s computer, with JavaScript for example. Client-side validation should be there solely for your visitors’ convenience; a malicious visitor can easily bypass it.
The best way of validating data will depend on what you intend to do with it and on what form the data takes. If it is relatively short and should follow a defined pattern (an email address, postcode or telephone number, for example), then a regular expression will probably be the best way of validating the input. If it is some form of longer content, such as a forum post or blog comment, then a regular expression isn’t going to work, and validating it will probably be a matter of preventing the input from doing anything malicious rather than actually identifying it as malicious content. User submitted data that cannot be directly validated can be used in two main ways to damage your site.
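To make the pattern-based approach concrete, here is a minimal sketch of server-side checks for two short fields. The function names and the telephone pattern are my own assumptions for illustration; real telephone formats vary far more than this pattern allows. For email addresses the sketch uses PHP’s built-in filter_var rather than a hand-written regular expression.

```php
<?php
// Hypothetical helpers for illustration only; the names and the
// telephone pattern are assumptions, not part of any library.

// Email: delegate to PHP's built-in validation filter.
function isValidEmail(string $input): bool
{
    return filter_var($input, FILTER_VALIDATE_EMAIL) !== false;
}

// Telephone: an optional leading "+" followed by 7 to 15 digits.
// The D modifier stops "$" from matching before a trailing newline.
function isValidPhone(string $input): bool
{
    return preg_match('/^\+?[0-9]{7,15}$/D', $input) === 1;
}
```

Checks like these must run on the server for every request, whether or not the same rule has already been applied in the browser.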
The first potential problem occurs whenever user submitted data is to be stored in a database and is called SQL injection. PHP provides addslashes, which escapes the quote and backslash characters that SQL injection relies on, and stripslashes, which removes the backslashes again ready for displaying content back to your visitors. Be aware, however, that addslashes knows nothing about your database or its character set, so the escaping function supplied by your database extension, or better still a prepared statement, is the more reliable defence. Note also that if magic_quotes_gpc is on, which is the default, all data submitted through GET, POST and cookies will already have been escaped, and calling addslashes as well would escape it twice. It’s best to check. (magic_quotes_gpc was deprecated in PHP 5.3 and removed in PHP 5.4, so later versions never escape input automatically.)
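As an alternative to escaping by hand, a prepared statement keeps the submitted data separate from the SQL text altogether. The sketch below assumes the PDO extension with its SQLite driver, used here only so that the example is self-contained; the table, column and variable names are hypothetical.

```php
<?php
// In-memory SQLite keeps the sketch self-contained; a real site would
// connect to its own database. Table and column names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, author TEXT, body TEXT)');

// Input as a malicious visitor might submit it.
$author = "Robert'); DROP TABLE comments; --";
$body   = "It's a comment with 'quotes' in it.";

// Placeholders keep the data out of the SQL text, so quotes in the
// input are never interpreted as part of the statement.
$insert = $pdo->prepare('INSERT INTO comments (author, body) VALUES (:author, :body)');
$insert->execute([':author' => $author, ':body' => $body]);

// The values come back exactly as submitted, with no added backslashes.
$select = $pdo->prepare('SELECT author, body FROM comments WHERE author = :author');
$select->execute([':author' => $author]);
$row = $select->fetch(PDO::FETCH_ASSOC);
```

Because the data is bound rather than concatenated into the query, nothing needs to be escaped on the way in and nothing needs to be unescaped when it is read back.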
The second problem is cross-site scripting (XSS). This occurs when you display user submitted content on your site. Without validation, a malicious visitor could submit the following content, which would then be displayed directly on your site:
Here is a malicious piece of content.
<SCRIPT>document.location='http://www.badguys-r-us.com/';</SCRIPT>
Any visitor to your site would automatically be redirected to http://www.badguys-r-us.com/ and, if that site were a clone of your own, your visitors might not even realise and would happily log in as normal, giving away their password and other sensitive data. Again, PHP has functions which can prevent this type of attack: htmlspecialchars replaces the characters with special significance in HTML with their HTML entities. If you need your visitors to be able to submit HTML then you will need to strip out