PHP sessions and serialisation

Submitted by Peter on Fri, 2010-09-03 01:27

PHP sessions and serialisation are related areas and are usually easy. Today, while working on archaic code from last century, I found a really horrible combination of limitations in PHP 5.2.x bouncing the project all over the place. I present the problems in the hope they will help someone somewhere.

I used to document PHP related problems at php.net, the official site for PHP. I found many of my helpful suggestions were deleted by the thought police. Now I place the same information on a site I control and the big search engines add the information into the search results almost as quickly as a direct addition to php.net.

Sessions

Sessions are a way to store data temporarily, commonly from page to page when you browse a Web site. PHP provides lots of ways to create and work with sessions. You can store the data using files, a database, or any method accessible by code. You can configure all the important features when you set up your Web site and you can override most of the settings from code to produce an extremely versatile system.

When data from your application is stored in sessions, PHP offers two techniques for serialising data into simple text strings for easy storage anywhere.

Serialisation

PHP offers three serialisation techniques, serialize(), session_encode(), and the WDDX functions such as wddx_serialize_value(), with two available for use in sessions.

session_encode() is the PHP default for serialising into and out of sessions but does not offer the flexibility to read the stored data outside of a session.

serialize() does everything almost exactly the same as session_encode() and lets you read session data outside of sessions but the two types of serialisation do not work together. You cannot replace session_encode() with serialize().
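The incompatibility is easier to see side by side. The following is a minimal sketch, assuming the default "php" value of session.serialize_handler: each top-level session variable is stored as its name, a "|", and the serialize() output for its value, while serialize() on the whole array wraps everything in one array record. The helper name encode_like_session() is made up for this illustration.

```php
<?php
// Hypothetical helper: mimics the layout written by the default "php"
// session handler without starting a session.
// Assumes the variable names contain no "|" character.
function encode_like_session($data)
{
    $out = '';
    foreach ($data as $name => $value) {
        $out .= $name . '|' . serialize($value);
    }
    return $out;
}

$data = array('user' => 'peter', 'visits' => 3);

$whole   = serialize($data);
$session = encode_like_session($data);

echo $whole, "\n";   // a:2:{s:4:"user";s:5:"peter";s:6:"visits";i:3;}
echo $session, "\n"; // user|s:5:"peter";visits|i:3;
```

The second string does not start with a valid serialize() record, which is why unserialize() rejects it.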

WDDX is a standard method using XML and is the most flexible. WDDX can be a little bit slower because the resultant string is longer. WDDX solved one of the problems I struck today.

Reading session data without a session

There are situations where you want to read session data without starting a session. Sessions require HTTP headers. You need different HTTP headers to redirect the visitor to a new page. Redirection fails after you set the headers that support sessions. You sometimes need to read the session data first, decide which direction to take, and only then start the session.

session_encode() has a matching session_decode() but you cannot use session_decode() to decode the session data outside of a session because session_decode() writes the decoded values directly into the current session data ($_SESSION) instead of returning them in a variable. You cannot control the output of session_decode().

serialize() has a matching unserialize(). unserialize() looks perfect for unserialising session data. When you use session_encode() to encode the session data, unserialize() does not decode the data. The examples at php.net show unserialize() decoding session data and the example used to work on a site I built a long time ago. Today with PHP 5.2.x, unserialize() does not work on the data encoded in a session.
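One workaround, if you have to stay with the default format, is to split the stored string yourself. This is only a sketch, assuming the default "php" session layout (name, "|", serialised value, repeated) and assuming that re-serialising a value gives the same number of bytes as the stored text. That holds for simple strings, integers, and arrays, but may not hold for every object or float. The function name decode_session_string() is invented for this example.

```php
<?php
// Hypothetical helper: decodes a string in the default "php" session
// layout into a plain array, without touching $_SESSION.
function decode_session_string($raw)
{
    $result = array();
    $offset = 0;
    $length = strlen($raw);

    while ($offset < $length) {
        $bar = strpos($raw, '|', $offset);
        if ($bar === false) {
            throw new Exception('No variable name found at offset ' . $offset);
        }
        $name   = substr($raw, $offset, $bar - $offset);
        $offset = $bar + 1;

        // unserialize() decodes the first value and skips the rest of the
        // string. Some PHP versions warn about the trailing text, hence the @.
        $value = @unserialize(substr($raw, $offset));
        if ($value === false && substr($raw, $offset, 4) !== 'b:0;') {
            throw new Exception('Cannot unserialize the value of ' . $name);
        }
        $result[$name] = $value;

        // Assumption: serialize() reproduces the stored bytes, so its
        // length tells us where the next variable name starts.
        $offset += strlen(serialize($value));
    }
    return $result;
}

$raw = 'user|s:5:"peter";visits|i:3;cart|a:1:{i:0;s:3:"pen";}';
print_r(decode_session_string($raw));
```

Objects in the session still need their classes loaded before you call the function.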

I replaced session_encode() with serialize() in one site a long time ago so I could use unserialize() to decode the data. The result is inefficient. You let PHP waste a lot of processing power encoding then you throw the encoded string away and create your own. You waste the same amount of processing bringing the data back. PHP should either let you specify serialize() to replace session_encode() or make the two compatible.

WDDX is a good reliable replacement for session_encode() and PHP makes the changeover easy. WDDX is supplied ready made in PHP for Windows. cPanel and similar Web site management programs build WDDX for you automatically when you tick the appropriate box on the requirements page. Some primitive operating systems make you compile WDDX before you can use WDDX.

WDDX solved all the access and compatibility problems in the site where I wanted to read the session data outside of the session. You tell PHP to use WDDX instead of the default with just one quick change to the PHP session settings. A 6000 byte session row increased to a bit over 8000 bytes when I switched to WDDX. There is no noticeable speed difference for this small amount of data. The flexibility of WDDX is worth more than a few CPU cycles.
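The one quick change is the session.serialize_handler setting. A minimal sketch, assuming a PHP build that includes the WDDX extension. The same setting can go in php.ini instead of code.

```php
<?php
// Assumes PHP was built with the WDDX extension.
if (!extension_loaded('wddx')) {
    die('WDDX is not available in this PHP build');
}

// Must run before session_start(). The php.ini equivalent is:
//   session.serialize_handler = wddx
ini_set('session.serialize_handler', 'wddx');

session_start();
```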

wddx_deserialize()

WDDX lets you decode the session data into an array of your choosing. The magic decoder function is named wddx_deserialize().

wddx_unserialize()

wddx_unserialize() is recommended in the PHP documentation but wddx_unserialize() is not in PHP 5.2.x and some other versions of PHP. wddx_deserialize() is in all the versions of PHP I use.
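Because the decoder name varies between PHP builds, one cautious approach is to test for the function before calling it. This is a sketch only. The helper decode_wddx_packet() and the $row array are hypothetical, with $row standing in for a session record you have already fetched from your own storage.

```php
<?php
// Hypothetical helper: decodes a WDDX packet with whichever decoder this
// PHP build provides. Returns null when neither function exists.
function decode_wddx_packet($packet)
{
    foreach (array('wddx_deserialize', 'wddx_unserialize') as $decoder) {
        if (function_exists($decoder)) {
            return call_user_func($decoder, $packet);
        }
    }
    return null;
}

// $row stands in for a session record read from the database.
$row = array(
    'data' => "<wddxPacket version='1.0'><header/><data><struct>"
        . "<var name='visits'><number>3</number></var>"
        . "</struct></data></wddxPacket>",
);

$session = decode_wddx_packet($row['data']);
if ($session === null) {
    echo "WDDX is not available\n";
} else {
    echo $session['visits'], "\n";
}
```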

Files or a database?

Files are fast and tend to remain equally fast as the load builds up, unless you are using an older version of Solaris where the system suddenly slows down worse than any Monday morning peak hour traffic.

Databases can be fast but all the different uses of your database will soon drag down the performance of your database. Performance tends to become worse faster as your site ages because you build up bigger database tables and indexes. You need to occasionally tune your databases to allow for the growth.

Databases are more accessible and flexible. You use files for simple session processing then you want some nice little extra feature and switch to storing your sessions in a database.

On shared hosting, sessions can be stored in disk directories you cannot view or search. If the sessions are stored in a database, you can always use your database browse facilities to view and search the session records.

MySQL

PHP offers three ways to access MySQL databases: the older mysql module for old versions of MySQL, the slightly newer mysqli module for modern versions of MySQL, and the PDO module for future access to MySQL. PDO works and is the interface used by the next version of Drupal, currently in alpha testing. PDO is not yet on all hosting servers.

mysqli is almost everywhere and is similar to the older interface, making a conversion to mysqli easy. The documentation for mysqli is out of alignment in a big way. mysqli can be used in an object oriented way and a procedural way. The documentation often describes an object oriented way that does not yet work. You have to search really hard to find anything about the older procedural approach that is currently the only reliable way to use mysqli.

If you have the latest version of PHP 5 with the latest version of MySQL, upgrade your database code from the old mysql procedural code to the newer mysqli procedural code. When everything is working and you have spare time, convert to the mysqli object oriented code to gain the benefits of object orientation. The conversion is relatively easy. There is an almost one to one relationship between the older functions and the newer methods. The main problem is documentation that does not match the way PHP works.
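The near one to one relationship looks roughly like this. It is a sketch of the correspondence as the manual describes it, not code from the project. The function names fetch_rows_procedural() and fetch_rows_object() are invented, and the old mysql calls appear only as comments for comparison.

```php
<?php
// Old mysql extension, shown only for comparison:
//   $result = mysql_query($sql, $link);
//   while ($row = mysql_fetch_assoc($result)) { ... }
//   mysql_free_result($result);

// mysqli, procedural style. $link comes from mysqli_connect().
// Note the connection is now the first argument, not the last.
function fetch_rows_procedural($link, $sql)
{
    $rows   = array();
    $result = mysqli_query($link, $sql);
    if (!($result instanceof mysqli_result)) {
        return $rows; // query failed or returned no result set
    }
    while ($row = mysqli_fetch_assoc($result)) {
        $rows[] = $row;
    }
    mysqli_free_result($result);
    return $rows;
}

// mysqli, object oriented style. $db comes from new mysqli(...).
function fetch_rows_object($db, $sql)
{
    $rows   = array();
    $result = $db->query($sql);
    if (!($result instanceof mysqli_result)) {
        return $rows; // query failed or returned no result set
    }
    while ($row = $result->fetch_assoc()) {
        $rows[] = $row;
    }
    $result->free();
    return $rows;
}
```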

You can eventually convert to PDO. The conversion from the object oriented version of mysqli should be easy because they are similar. Last time I tried to use PDO, the documentation was missing big chunks.

Spinning around

Today's project used PHP 5.2.x. The sessions kept failing in a variety of ways. This is not my code. It was written by one of those horrible creatures you read about in Lord of the Rings. The code used files for storage and is ancient rubbish I would not show to students, not even as an example of bad code, because it might infect their brains. It worked for years when untouched by human hands but fails when you make the slightest change. One tiny change in one place creates massive problems all over the code and you have to make hundreds of changes to get the site working again.

The session files are in a directory that is not accessible, making diagnosing the problems extremely difficult. The first change was to put the sessions into the database. The existing database code then showed it could not handle long text. Write off three hours fixing the database access code to handle long strings. In the end I switched to using binds instead of long SQL query strings.
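For readers who have not used binds, the idea is to send the long session string as a bound parameter instead of pasting it into the SQL text. A minimal sketch using mysqli prepared statements. The table and column names (sessions, id, data, updated) and the function name save_session_row() are assumptions for illustration, not the project's actual schema.

```php
<?php
// Hypothetical table: sessions (id, data, updated), with id as the
// primary key and data as a long text column.
// $db is an open mysqli connection.
function save_session_row($db, $id, $data)
{
    $stmt = $db->prepare(
        'REPLACE INTO sessions (id, data, updated) VALUES (?, ?, ?)'
    );
    if ($stmt === false) {
        return false;
    }

    $updated = time();

    // "ssi" declares the parameter types: string, string, integer.
    // The session data is never quoted or escaped into the SQL string.
    $stmt->bind_param('ssi', $id, $data, $updated);

    $ok = $stmt->execute();
    $stmt->close();
    return $ok;
}
```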

Then I found there is already a session table but that session table is used only for some sessions and it does not work reliably. I am faced with many more hours work to convert the unreliable partial sessions across to the standard sessions.

There are two other forms of sessions in use. I know about both of them and will leave them in place because they are just too hard to change.

The existing MySQL code did not work the correct way for the mysqli procedural approach and I started switching to the mysqli object oriented approach but the documentation for the object oriented approach did not match the way mysqli actually works. I gave up and fixed a number of errors in the existing mysqli code. I can now use my mysqli based session code to maintain the sessions and use the regular database code to read the session data any time I need something outside of sessions.

Getting the deserialisation working was painful until I switched to WDDX. I wasted three hours on the default PHP session decoding before switching to WDDX. WDDX required only one extra step and that is preloading classes required for any objects stored in sessions. The conversion to WDDX was over in 20 minutes.
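The preloading step matters for any PHP deserialiser, not only WDDX. The sketch below uses plain unserialize() as a stand-in because it runs without the WDDX extension. The Cart class and load_cart_class() are hypothetical, with the function playing the part of a require_once for the class file.

```php
<?php
// Hypothetical stand-in for: require_once 'Cart.php';
function load_cart_class()
{
    if (!class_exists('Cart', false)) {
        class Cart
        {
            public $items = array();
        }
    }
}

// A stored object whose class has not been loaded yet.
$stored = 'O:4:"Cart":1:{s:5:"items";a:1:{i:0;s:3:"pen";}}';

// Decoding before the class exists gives a placeholder object.
$early = unserialize($stored);
echo get_class($early), "\n"; // __PHP_Incomplete_Class

// Decoding after the class is loaded gives a usable Cart.
load_cart_class();
$late = unserialize($stored);
echo get_class($late), "\n";  // Cart
```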

I do not normally store objects in sessions. The existing code had a lot of objects in the session and most were not needed. I spent an hour removing all the ones that were not used and a few minutes adding the preload for the few objects still in the session. After the clean up, the sessions are so small I can easily find anything in them.

Content management system

If you use a modern free open source content management system (I recommend Drupal), you do not have to battle with all the problems I mention in this page because hundreds of developers have already built and tested the code for you.

Trying to maintain the worst of the 1990s amateur code shows you how pathetic a lot of the code was. The code was made worse by the fact that most of it was scribbled over by a different contractor every month. Think of chickens pecking at the code.

All those years of "Who cares, I will be working somewhere else next week." None of the beneficial open review you find in large open source projects. A good content management system should replace 90 percent or more of your code with thoroughly tested and community maintained code.

Conclusion

There are often several ways to do things in PHP and one way will work better than the others. In the areas of database access, sessions, and serialisation, there is a time lag of a year or more between the documentation and the code. You need to systematically test the alternatives to find the details missing from the documentation. One day you should dump the old code completely and move to a system where there are hundreds of other developers helping you.