Why WordPress’ wp_kses() is better than PHP’s strip_tags() for security

In this video, I discuss why WordPress’ wp_kses() function is more powerful, flexible, smarter and more secure than PHP’s strip_tags() function.

Video Transcript

Hi, my name is Mike McLin, and in this video I am going to explain why WordPress’ wp_kses() function (if that’s how you pronounce it) is better than PHP’s strip_tags() function when it comes to sanitizing data for security. The purpose of both functions is to filter data by allowing only certain HTML tags and removing all others. We usually run one of these functions (or a similar one) whenever we accept user input, for example comments on a blog, to prevent unexpected or malicious behavior. OK, let’s get started.

Setting up

First off I’ll create a snippet of HTML, and assign it to the $string variable. This HTML consists of 2 paragraphs, the first one also containing an anchor link. If we echo out the string, we can see that it renders properly.
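The exact markup used in the video is not reproduced in this transcript, so here is a minimal sketch of what such a snippet might look like. The paragraph text, the example.com URL and the title attribute are all assumptions made for illustration:

```php
<?php
// Hypothetical sample markup: two paragraphs, the first containing an anchor link.
$string = '<p>This is a paragraph with a <a href="http://example.com" title="Example">link</a>.</p>'
        . '<p>This is a second paragraph.</p>';

// Echoing the raw string lets the browser render the HTML as-is.
echo $string;
```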

PHP strip_tags() function

Now, let’s place the $string variable inside of PHP’s strip_tags() function and echo the result. We can see that the text is still there; however, all of the HTML tags have been removed. The PHP strip_tags() function accepts 2 parameters. The first is the string to be filtered, and the second, optional parameter is a string listing all of the allowed tags. Let’s go ahead and allow the paragraph tag.

We can now see that the 2 paragraphs have been restored. Let’s also allow the anchor tag. Now, we can see that the anchor link has been restored.
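The three strip_tags() calls described above can be sketched roughly like this. The sample markup is hypothetical (it is not taken from the video), and the variable names are only for illustration:

```php
<?php
// Hypothetical sample markup (assumed; not the exact string from the video).
$string = '<p>This is a paragraph with a <a href="http://example.com" title="Example">link</a>.</p>'
        . '<p>This is a second paragraph.</p>';

// No second argument: every tag is removed, only the text remains.
$no_tags = strip_tags($string);

// Allow paragraphs only: the <p> tags survive, the anchor is stripped.
$only_p = strip_tags($string, '<p>');

// Allow paragraphs and anchors: the anchor comes back, attributes included.
$p_and_a = strip_tags($string, '<p><a>');

echo $no_tags, "\n", $only_p, "\n", $p_and_a, "\n";
```

Note that the allowed tags are passed as one string of tag names, not as an array.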

WordPress’ wp_kses() function

Now let’s go through the same steps using WordPress’ wp_kses() function. Just like strip_tags(), we’ll pass the string as the first parameter. However, unlike strip_tags(), the second parameter is not optional. It is required, and it needs to be an array, not a string. So, let’s just pass an empty array to strip out all tags.

Now, let’s go ahead and allow paragraphs and anchors. Let’s break the array out so it is easier to read. You’ll notice that the $allowed_tags variable is an array of tags, and each tag is a nested array. We store the allowed HTML attributes for each tag in the nested arrays. I’ll just leave them empty for now and see what we get.
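As a rough sketch, the two wp_kses() calls described so far might look like the following. The sample markup is hypothetical, and since wp_kses() only exists inside WordPress, the calls are wrapped in a function_exists() guard (an assumption added here so the file can also be loaded outside WordPress):

```php
<?php
// Hypothetical sample markup (assumed; not the exact string from the video).
$string = '<p>This is a paragraph with a <a href="http://example.com" title="Example">link</a>.</p>'
        . '<p>This is a second paragraph.</p>';

// Each allowed tag is a key; its value is an array of allowed attributes.
// An empty nested array means "allow the tag, but none of its attributes".
$allowed_tags = array(
    'p' => array(),
    'a' => array(),
);

// wp_kses() is only defined when WordPress is loaded.
if (function_exists('wp_kses')) {
    // Empty array: no tags are allowed at all.
    echo wp_kses($string, array());

    // <p> and <a> are allowed, but their attributes are not.
    echo wp_kses($string, $allowed_tags);
}
```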

Now it looks like everything rendered properly. We can see both of our paragraphs and our link. However, when I inspect the link, you can see that the href attribute is missing. That is because we didn’t add it to our anchor tag’s attribute array in the wp_kses() function. You can see that strip_tags() outputs the whole anchor tag as expected. While this might make wp_kses() a bit more cumbersome to set up initially, it also adds another level of control when filtering your data.

Let’s go ahead and allow the href attribute for anchors, along with the title attribute. You’ll notice that these have a value type of an array too. Basically everything in the wp_kses() $allowed_tags variable equals an array. Now, we can see that the anchor is outputting the href attribute as expected.
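A minimal sketch of the extended array, again with hypothetical sample markup and a function_exists() guard added so the snippet does not fail outside WordPress:

```php
<?php
// Hypothetical sample markup (assumed; not the exact string from the video).
$string = '<p>This is a paragraph with a <a href="http://example.com" title="Example">link</a>.</p>'
        . '<p>This is a second paragraph.</p>';

// Allowed attributes are keys of the tag's nested array,
// and each attribute's value is itself an array.
$allowed_tags = array(
    'p' => array(),
    'a' => array(
        'href'  => array(),
        'title' => array(),
    ),
);

// wp_kses() is only defined when WordPress is loaded.
if (function_exists('wp_kses')) {
    // The anchor now keeps its href and title attributes.
    echo wp_kses($string, $allowed_tags);
}
```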

When comparing these 2 functions, many people stop here and miss out on 2 other important advantages that wp_kses() has over strip_tags(). The first is that it is simply a smarter function. Let’s go ahead and add this little arrow, pointing to the link, and see what happens.

wp_kses is smarter than strip_tags

You can see that this completely broke the expected result from PHP’s strip_tags() function. The less-than symbol appeared to be an opening character for a potential HTML tag, a tag that wasn’t passed into the second parameter of the strip_tags() function, and therefore wasn’t rendered. The wp_kses() function rendered the HTML properly.

wp_kses is more secure than strip_tags

The other advantage wp_kses() has is probably its biggest advantage of all. It helps prevent JavaScript attacks. Many times, the purpose of these functions is to help prevent such attacks, and when somebody does something like this… The JavaScript is denied successfully. Instead of executing the JavaScript alert, it has just output the value as harmless plain text.
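The transcript does not show the injected code, so the payload below is a hypothetical stand-in. This sketch only illustrates the strip_tags() side, which runs in plain PHP. The wp_kses() call is guarded because it needs WordPress, and the $allowed_tags array is assumed from the earlier steps:

```php
<?php
// Hypothetical input with an injected script element (assumed payload).
$string = '<p>Hello</p><script>alert("XSS");</script>';

// <script> is not in the allowed list, so the tags are removed
// and their contents are left behind as plain text.
$filtered = strip_tags($string, '<p><a>');
echo $filtered, "\n";

// Assumed allow-list, matching the earlier steps.
$allowed_tags = array(
    'p' => array(),
    'a' => array('href' => array(), 'title' => array()),
);

// wp_kses() is only defined when WordPress is loaded.
if (function_exists('wp_kses')) {
    echo wp_kses($string, $allowed_tags), "\n";
}
```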

Now, let’s see what happens when we try something a little more clever. We’ll sneak our JavaScript code into our anchor’s href attribute and see what happens. When I click on the strip_tags() link, the JavaScript is executed. When I click on the wp_kses() link, the JavaScript is not executed, and we are simply sent to whatever URL is left in the link.

If I go back and inspect the anchors, you can see that the JavaScript is still present in the strip_tags() anchor’s href, while it has been removed from the wp_kses() anchor.
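Here is a small sketch of that case. The markup and payload are hypothetical. The strip_tags() part runs in plain PHP and shows that an allowed tag’s attributes pass through untouched; the wp_kses() call is guarded because it requires WordPress:

```php
<?php
// Hypothetical link with JavaScript hidden in the href (assumed payload).
$string = '<p>Click this <a href="javascript:alert(1)">link</a>.</p>';

// strip_tags() only looks at tag names, so the href is left exactly as it was.
$filtered = strip_tags($string, '<p><a>');
echo $filtered, "\n";

// Assumed allow-list, matching the earlier steps.
$allowed_tags = array(
    'p' => array(),
    'a' => array('href' => array(), 'title' => array()),
);

// wp_kses() is only defined when WordPress is loaded.
if (function_exists('wp_kses')) {
    // As described above, the javascript: prefix should not survive here.
    echo wp_kses($string, $allowed_tags), "\n";
}
```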

About the $allowed_protocols parameter in wp_kses

This occurs because wp_kses() actually has a 3rd parameter that can be passed to it, which is an array of allowed protocols. Nearly all popular protocols, like http, https and mailto, are accepted by default, but javascript isn’t. You’ll probably never want to override these default protocols, but if you do, you can always read more about them on the wp_kses() page in the WordPress Codex.
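If you ever do need to pass the 3rd parameter, a call might look roughly like this. The protocol list is purely illustrative (an assumption, not a recommendation), the sample markup is hypothetical, and the call is guarded because wp_kses() requires WordPress:

```php
<?php
// Hypothetical sample markup (assumed).
$string = '<p>Write to <a href="mailto:someone@example.com">us</a>.</p>';

// Assumed allow-list, matching the earlier steps.
$allowed_tags = array(
    'p' => array(),
    'a' => array('href' => array(), 'title' => array()),
);

// Illustrative custom protocol list; when this argument is omitted,
// WordPress falls back to its own defaults.
$allowed_protocols = array('http', 'https', 'mailto');

// wp_kses() is only defined when WordPress is loaded.
if (function_exists('wp_kses')) {
    echo wp_kses($string, $allowed_tags, $allowed_protocols);
}
```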

So, that’s the end of this tutorial. My name is Mike McLin, and I hope you have seen why wp_kses() is more powerful, flexible, smarter and more secure than PHP’s strip_tags() function.

Related Links

WordPress Codex: wp_kses
PHP.net Manual: strip_tags