Practice Safe Idempotent Methods

This article was first published in PHP Advent.

All web developers should be familiar with the GET and POST methods. These are the primary methods used in everyday development on the Web. Even if you know nothing about HTTP, you’ve at least seen form examples using either get or post as the value of the method attribute. All too often, though, I find that those who build web applications know far too little about the protocol that powers the Web: HTTP. I think all web developers should have at least a rudimentary understanding of the technology that earns their bacon.

Checking Out What’s Under the Hood

When you plug a URL into your browser’s address bar, what it really does is make an HTTP request, and often a series of them. So, let’s forget the browser for just a moment. Open up a command prompt and launch a telnet session to connect to phpadvent.org on port 80. The command and its output look something like this:

$ telnet phpadvent.org 80
Trying 66.225.209.21...
Connected to phpadvent.org.
Escape character is '^]'.

This should leave your cursor sitting there, awaiting input. So, type the following:

GET / HTTP/1.1
Host: phpadvent.org

When you reach the end of that last line, press Return twice; the blank line tells the server that the request headers are complete. You should see a response that looks like this:

HTTP/1.1 302 Found
Date: Thu, 18 Dec 2008 04:23:29 GMT
Server: Apache/2.2.6 (Unix)
X-Powered-By: PHP/5.2.5
Location: http://phpadvent.org/2008
Content-Length: 0
Connection: close
Content-Type: text/html

Without going into too much detail about this response, I want to point out the status code and the Location header. The status code in this case is 302 Found, which means that the resource requested—in this example, it’s /—exists temporarily at another location, specified by the Location header. The browser knows what to do with this, so it makes a second request for http://phpadvent.org/2008. Through telnet, we can make the raw request like this:

GET /2008 HTTP/1.1
Host: phpadvent.org

What we get back is a 200 OK response with a full HTML body.

This is standard redirection. I make a GET request, and the server tells me to request from a different location. The server itself does not perform the redirection. That is up to the client (browser).
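To make the division of labor concrete, here is a minimal Python sketch of a client following a redirect by hand. It does not talk to phpadvent.org; instead it starts a hypothetical stand-in server on localhost that answers / with a 302 and /2008 with a page. The names DemoHandler, fetch, and run_demo are mine, and for simplicity the sketch assumes the redirect stays on the same host and uses only the path from the Location header.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server: / redirects to /2008, which returns a page."""

    def do_GET(self):
        if self.path == "/":
            # The server only announces the new location; it fetches nothing itself.
            self.send_response(302)
            self.send_header("Location", "/2008")
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path == "/2008":
            body = b"<html>PHP Advent 2008</html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(host, port, path, max_redirects=5):
    """GET a path, following redirects on the client side.

    Returns (status, body, hops), where hops lists every path requested.
    Assumption: redirects stay on the same host, so only the path is used.
    """
    hops = [path]
    for _ in range(max_redirects + 1):
        conn = http.client.HTTPConnection(host, port, timeout=5)
        try:
            conn.request("GET", path)
            response = conn.getresponse()
            body = response.read()
            status = response.status
            location = response.getheader("Location")
        finally:
            conn.close()
        if status in (301, 302, 303, 307, 308) and location:
            # It is the client that issues the second request.
            path = urlsplit(location).path or "/"
            hops.append(path)
            continue
        return status, body, hops
    raise RuntimeError("too many redirects")


def run_demo():
    """Start the stand-in server on a free local port and fetch / from it."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", server.server_address[1], "/")
    finally:
        server.shutdown()
        server.server_close()
```

Calling run_demo() makes two requests, one for / and one for /2008, which mirrors the two requests typed into telnet above.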

Safety First

I won’t spend too much time talking about what GET and POST mean. You already know these methods. The GET method is used for retrieval, while the POST method sends data to a resource on the server that is responsible for processing it.

What’s important to note here is that the HTTP specification clearly states that GET “SHOULD NOT have the significance of taking an action other than retrieval.” (Steps up on the soap box.) Web developers violate this every time we create a link on a page that a user clicks on to rate something, increase a counter, purchase a book, etc. The fact of the matter is this: if it uses a GET request to take any action other than retrieval, then it’s wrong.

“But why is it wrong?” you ask.

It’s not wrong because someone sat in their ivory tower and mandated that it is so. The designers of HTTP defined GET as a “safe” method so that browsers can represent other methods, such as POST, in a special way that makes the user aware they are requesting a potentially unsafe action. This does not mean that GET cannot have side effects on the server, but it does mean that the user may not be aware that the request for those side effects was made and cannot be held responsible for it.

If web developers used POST for these kinds of actions rather than GET, then the browser could at least notify the user that some action is about to be taken, and the user could confirm or cancel the request.

Idempotence: Not a Sexual Dysfunction

This leads to the concept of idempotence. Pronounce it at your own risk.

HTTP methods that are said to be idempotent are so termed because “(aside from error or expiration issues) the side effects of N > 0 identical requests is the same as for a single request.” In layman’s terms, what this means is: if I make the same request ten times, the server ends up in exactly the same state as if I had made it once. GET is considered idempotent, as are HEAD, PUT, and DELETE, which I won’t be discussing in this post. POST is not idempotent.
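The definition can be checked mechanically. The following is a minimal sketch, not an HTTP implementation: it models the server’s state as a plain dictionary and the methods as hypothetical functions (put, delete, and post are my own stand-ins), then compares the state after one call with the state after ten identical calls.

```python
import copy


def put(store, key, value):
    """PUT-like: replace whatever is stored at key with value."""
    store[key] = value


def delete(store, key):
    """DELETE-like: remove the resource at key, if it exists."""
    store.pop(key, None)


def post(store, key, value):
    """POST-like: append a new entry to the collection at key."""
    store.setdefault(key, []).append(value)


def is_idempotent(operation, initial, *args, repeats=10):
    """Return True if `repeats` identical calls leave the same state as one call."""
    once = copy.deepcopy(initial)
    operation(once, *args)

    many = copy.deepcopy(initial)
    for _ in range(repeats):
        operation(many, *args)

    return once == many
```

With these stand-ins, is_idempotent(put, {}, "title", "Hello") and is_idempotent(delete, {"title": "Hello"}, "title") both hold, while is_idempotent(post, {}, "comments", "Nice post") does not, because each repeated POST adds another entry.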

Since GET is considered “safe” and for retrieval only, it is by nature idempotent. Every time I request a resource with GET, the server should simply return that resource, with no side effects. We break this property of GET when we attempt to make GET do more than simple retrieval.

Consider the URL http://example.org/count. Suppose that every request to this URL increments a counter and returns the new value. The first time I request this URL, the value returned might be “1,” but the tenth time I make this request, the value would be “10.” The property of idempotence dictates that the tenth identical request should leave the counter exactly where the first one left it, so the value returned on the tenth request should be the same as that returned on the first. Therefore, if I use a GET request to retrieve the value stored at http://example.org/count, it should always return the same value, provided no one makes a POST request in the meantime to increment the counter and change its state. Changing state is what POST is for.
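Here is one way the counter could be arranged so that GET stays safe. This is only a sketch: the Counter class and its handle method are hypothetical, and a real application would sit behind an actual web server rather than take the method name as a string.

```python
class Counter:
    """Hypothetical resource behind /count."""

    def __init__(self):
        self.value = 0

    def handle(self, method):
        """Return (status, body) for a request to /count."""
        if method == "GET":
            # Retrieval only: report the current value, change nothing.
            return 200, str(self.value)
        if method == "POST":
            # The state change lives here, where the client expects it.
            self.value += 1
            return 200, str(self.value)
        return 405, "Method Not Allowed"
```

Ten GET requests in a row against a fresh Counter all return the same value; only a POST moves the counter.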

Practice Safe Web

Following the HTTP specification precisely, respecting the “safe” and idempotent nature of GET and using POST to manipulate data, will make your web applications safer. Please don’t misunderstand, though: this is not a security measure that protects your sites against attacks. What your application gains in safety is the browser’s ability to notify the user that potentially unsafe actions are about to occur, while limiting an attacker’s ability to manipulate data through GET requests.

Furthermore, you’ll feel better about yourself because you’re doing the Right Thing™ by following the standard.