PHP security: input validation and XSS

For many reasons, security included, it is essential to validate all user input and to protect the application from malformed input, whether intentional or accidental.

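First of all, some general advice:
  • Don’t use register_globals! A malicious user can change the value of script variables simply by modifying the query string.
  • Don’t use $_REQUEST: it merges GET, POST and (depending on configuration) cookie data, so values with the same name can silently overwrite each other. Use $_GET and $_POST explicitly.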

Number validation
Use the cast operator to validate numbers. If the malicious input is "'); drop table users", the (int) cast yields zero.

When you cast, remember the maximum size of an int (PHP_INT_MAX) to prevent overflow; for larger values you may use (float) instead of (int).
Remember: the decimal separator is the point, not the comma!
A value written with a comma (for example 3,14) is not numeric, and a (float) cast discards everything after the comma.
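The casting rules above can be checked with a minimal sketch. The helper name parse_decimal() is hypothetical, and accepting a comma as a decimal separator is an assumption about what the application wants, not something PHP does on its own.

```php
<?php
// (int) keeps only a leading integer prefix; anything else becomes 0.
$id = (int) "'); drop table users";

// A comma is not a decimal separator for PHP: the cast stops at it.
$truncated = (float) "3,14";

// Hypothetical helper: accept "3,14" or "3.14", reject everything else.
function parse_decimal(string $raw): ?float
{
    $normalized = str_replace(',', '.', trim($raw));
    // is_numeric() rejects strings such as "12abc" or "1.2.3".
    return is_numeric($normalized) ? (float) $normalized : null;
}
```

Returning null for invalid input, instead of silently falling back to zero as a bare cast does, lets the caller tell "0" apart from garbage.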
String validation
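First of all, the input may contain accented letters (French and Italian use them, for example à, è, ì, ò and ù).
Use
setlocale(LC_CTYPE, "french")
so that locale-aware functions treat accented characters as letters (the locale name is platform dependent: "french" on Windows, typically "fr_FR" on Unix-like systems). Note that setlocale() does not rewrite the string itself: converting accented characters into the corresponding unaccented ones (à->a, è->e, etc.) needs a separate transliteration step, for example iconv() with //TRANSLIT.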
To validate string inputs, use regular expressions!
preg_match($pattern, $string) // append the i modifier to the pattern for case-insensitive matching; the older ereg()/eregi() were removed in PHP 7
example:
if (preg_match('/^[0-9]{5}$/', $_POST['postcode']) === 1) { /* postcode valid */ }
'/^[0-9]{5}$/' means: a string that contains exactly 5 characters, each of which must be a digit
Don’t forget “^” and “$”, or “a12345b” will be considered valid.
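As a hedged illustration of why the anchors matter, the sketch below compares an anchored and an unanchored pattern with preg_match(). The function name is_valid_postcode() is an assumption made for this example. It uses \A and \z rather than ^ and $, because $ also matches just before a trailing newline.

```php
<?php
// Hypothetical helper: true only for exactly five ASCII digits.
function is_valid_postcode(string $input): bool
{
    // \A and \z anchor the match to the very start and end of the string.
    return preg_match('/\A[0-9]{5}\z/', $input) === 1;
}

// Without anchors the pattern only has to match somewhere inside the input,
// so "a12345b" is wrongly accepted.
$unanchored = preg_match('/[0-9]{5}/', 'a12345b');
```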
Use regular expressions to validate other types of input as well, such as e-mail addresses and URLs. For further details, see the examples on php.net and search on Google!
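For e-mail addresses and URLs, one option is PHP's built-in filter_var() in place of a hand-written regular expression. This is a substitute technique, not the one described above. The function names and the http/https restriction are assumptions for illustration.

```php
<?php
// filter_var() returns the filtered value on success and false on failure.
function is_valid_email(string $input): bool
{
    return filter_var($input, FILTER_VALIDATE_EMAIL) !== false;
}

function is_valid_http_url(string $input): bool
{
    if (filter_var($input, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    // FILTER_VALIDATE_URL accepts many schemes, so restrict them explicitly.
    $scheme = strtolower((string) parse_url($input, PHP_URL_SCHEME));
    return in_array($scheme, ['http', 'https'], true);
}
```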
File uploads
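As the second argument of
move_uploaded_file ( string $filename , string $destination )
build the path from a fixed upload directory plus basename($_FILES["fieldName"]["name"]) to increase security: basename() strips any directory components from the client-supplied name.
$filename must be $_FILES["fieldName"]["tmp_name"]
Don’t trust $_FILES["fieldName"]["type"]: it is sent by the client.
If you expect an image, use getimagesize(). If the file isn’t an image, the return value will be FALSE.
If the size of the $destination file is different from the size of the temp file ($filename), delete it! Record the size of the temp file before moving it, because it no longer exists at its original path afterwards.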
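The upload checks above could be combined roughly as follows. This is a sketch under stated assumptions: the function name store_uploaded_image(), the target directory passed in by the caller, and the decision to return null on any failure are all illustrative choices.

```php
<?php
// Hypothetical helper: moves an uploaded image into $targetDir.
// Returns the stored path, or null if any check fails.
function store_uploaded_image(array $file, string $targetDir): ?string
{
    if (($file['error'] ?? UPLOAD_ERR_NO_FILE) !== UPLOAD_ERR_OK) {
        return null;
    }
    $tmpPath = $file['tmp_name'];

    // Inspect the content instead of trusting $file['type'].
    if (@getimagesize($tmpPath) === false) {
        return null;
    }

    // basename() drops directory parts such as "../../" from the client name.
    $destination = rtrim($targetDir, '/') . '/' . basename($file['name']);

    // Record the size now: the temp file is gone after a successful move.
    $expectedSize = filesize($tmpPath);

    // Returns false unless $tmpPath really came from an HTTP upload.
    if (!move_uploaded_file($tmpPath, $destination)) {
        return null;
    }
    if (filesize($destination) !== $expectedSize) {
        unlink($destination);   // incomplete copy: discard it
        return null;
    }
    return $destination;
}
```

A call would look like store_uploaded_image($_FILES['fieldName'], $uploadDir), where $uploadDir is a directory outside the reach of user input.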
Do not use magic quotes!
Magic quotes do not correctly escape all special characters!
It is better to use the database-specific functions, such as
mysql_real_escape_string()
Pay attention to the current PHP configuration: if magic quotes are enabled (check with get_magic_quotes_gpc()), apply stripslashes() to the $_GET data before escaping it. Magic quotes were removed in PHP 5.4 and the mysql_* functions in PHP 7.0, so this concerns legacy installations only.
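On current PHP versions the usual replacement for manual escaping is a prepared statement. The sketch below uses PDO, which is a different technique from the mysql_real_escape_string() call named above. The users table, its columns and the function name find_user_by_name() are assumptions made for the example.

```php
<?php
// Hypothetical lookup: the table and column names are assumptions.
function find_user_by_name(PDO $db, string $name): ?array
{
    // The placeholder keeps the value out of the SQL text entirely,
    // so quotes in $name cannot change the structure of the query.
    $stmt = $db->prepare('SELECT id, name FROM users WHERE name = :name');
    $stmt->execute([':name' => $name]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}
```

Because the value travels separately from the query, input such as x' OR '1'='1 is compared literally against the name column and matches nothing.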
Do not allow the user to modify serialized data!
A malicious user may change an array with 100 elements, a:100:{}, into an array with millions of elements, which requires a lot of memory to unserialize.
If you need to pass serialized data, you should encrypt it with a keyed two-way cipher algorithm.
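The text recommends a keyed cipher. The sketch below substitutes a keyed signature (HMAC), which does not hide the data but lets the server reject any value the user has modified. The function names, the "payload.mac" token format and the restriction to arrays are assumptions for illustration.

```php
<?php
// Serialize $data and append an HMAC computed with a server-side secret.
function pack_signed(array $data, string $key): string
{
    $payload = base64_encode(serialize($data));
    return $payload . '.' . hash_hmac('sha256', $payload, $key);
}

// Return the original array, or null if the token was altered in any way.
function unpack_signed(string $token, string $key): ?array
{
    $parts = explode('.', $token, 2);
    if (count($parts) !== 2) {
        return null;
    }
    [$payload, $mac] = $parts;

    // Verify the signature BEFORE unserializing; hash_equals() is timing-safe.
    if (!hash_equals(hash_hmac('sha256', $payload, $key), $mac)) {
        return null;
    }
    $raw = base64_decode($payload, true);
    if ($raw === false) {
        return null;
    }
    // allowed_classes => false prevents object instantiation on unserialize.
    $data = unserialize($raw, ['allowed_classes' => false]);
    return is_array($data) ? $data : null;
}
```

Checking the signature first means a tampered a:100:{} header never reaches unserialize() at all.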
XSS
Cross-site scripting allows users to insert malicious code into the frontend via a form submission. When the resulting page displays the user data (examples: weblog comments or a site guestbook), the code will be executed or shown.
Examples of malicious inputs:
  • <? passthru("rm -rf *"); ?>
  • <iframe src="http://mysite.com/banner.php" … />
Some tips:
  • Convert characters to HTML entities. The “less than” and “greater than” signs are converted into HTML entities, preventing unexpected HTML formatting or PHP code execution.
    PHP function: htmlspecialchars() // converts the ampersand, double and single quotes, and the less-than and greater-than signs into HTML entities; before PHP 8.1, single quotes are converted only when the ENT_QUOTES flag is passed
  • Pay attention to single and double quotes if you compose tag attributes with user data:
    print "<a href='{$_GET['link']}'>text</a>";
    if $_GET['link'] is: #' onclick='alert("malicious js code")' title='
    the resulting link will execute user-defined JavaScript
  • If HTML tags are not allowed, you can use strip_tags() to remove all the tags from the user data. Note: strip_tags() does not convert special chars!
  • To remove only the attributes from user-submitted tags, use preg_replace()
  • If the user data is a URL, remember to check that it does not start with “javascript:” to prevent JavaScript code, or (better) validate the whole URL with a case-insensitive regular expression, e.g. preg_match() with the i modifier
  • To implement IP-based access control, you should also consider proxies. If the user is behind one, the original IP may be reported in the variable HTTP_X_FORWARDED_FOR. This header can be forged by the client, so validate it with ip2long(), then long2ip(), and check that the IP remains the same
  • HTTP_REFERER is not reliable: some browsers don’t send it
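Several of the tips above can be sketched in a few small helpers. The function names are hypothetical, and allowing only http and https links is an assumption about what the application accepts. It is a minimal illustration, not a complete XSS defence.

```php
<?php
// Escape a value for use in HTML text or inside a quoted attribute.
function escape_html(string $value): string
{
    // ENT_QUOTES converts single quotes as well as double quotes.
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Allow-list of URL schemes: "javascript:" and relative URLs are rejected.
function is_safe_link(string $url): bool
{
    $scheme = parse_url(trim($url), PHP_URL_SCHEME);
    return is_string($scheme)
        && in_array(strtolower($scheme), ['http', 'https'], true);
}

// The ip2long()/long2ip() round trip: true only for a canonical IPv4 address.
function is_valid_ipv4(string $ip): bool
{
    $long = ip2long($ip);
    return $long !== false && long2ip($long) === $ip;
}

// The attribute-injection payload from the list above, neutralized:
$link = "#' onclick='alert(\"malicious js code\")' title='";
$html = "<a href='" . escape_html($link) . "'>text</a>";
```

After escaping, the quotes in $link become entities, so the value can no longer close the href attribute and introduce an onclick handler.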