The Twelve Factor App

Submitted by peter on Mon, 01/15/2018 - 21:00

There is a set of application development guidelines worth reading at https://12factor.net/. The site summarises the guidelines as follows.

The twelve-factor app is a methodology for building software-as-a-service apps

Methodologies help you move toward a target. You use a methodology when the methodology provides the best route to your target. Some of the twelve factors are common to other methodologies and guidelines. Choose one or all factors. Compare them to your current approach. You might find something to improve your current or future projects.

Here are my observations based on experience across many projects.

Twelve Factors

First, the twelve factors.

Codebase

One codebase tracked in revision control, many deploys

This is an excellent starting point. You should use version control when there is more than one developer or more than one release or more than one customer. You can then walk backwards from issues to the changes that caused the issues. You can help selected customers test a change before moving code from alpha status to beta status.

One version in version control requires really good comments in the change control system. The best approach is to have your issues tracked in an issue tracker and link issue items to code changes, including code change reversions. Use Git, perhaps hosted on GitHub, for version control, Mantis or JIRA for your issues, and provide a two-way link.

Your one code base then has a history of development and decisions about change. The instant you create a different code base, you lose the connection to reasons for decisions in the other code base. At some time in the future, you will have to merge code and will run up against incompatibility.

When do you need a different code base? You might need a new code base for an upgrade from Perl to PHP, something I did a few times in the 1990s. The few uses for new code bases are really extreme and are usually the result of many years without a code refresh. Put the old code base in version control and connect every item of new code to the equivalent functionality in the old code.

Whatever you do, put the existing code in Git and connect the code to the development documentation, your project management, and issue tracking. Type in the reason for the change before starting the change. If you use automated testing, you should be able to write a test for the change before writing the change.

Dependencies

Explicitly declare and isolate dependencies

Dependencies are the bits of code that you need but that are not in your application. You can configure your own computers to have everything. You cannot guarantee other people will have the same system level configuration. Consider one example. When you use Linux, you expect certain programs to always exist. I worked on two projects using Linux where common code was missing or had a different name/path.

When you use PHP, the PHP developers put their functions in front of system level code. You use their function. The PHP developers then choose between using system level functions or using the same code in PHP to remove system level dependencies.

Another example would be applications written in Java. The Java brag sheets tell us that Java is everywhere and that Java-based applications can just use the single local copy of the Java runtime, the JRE. Most Java applications work with only one version of the JRE, so the applications include their own copy of the JRE to avoid all the incompatibilities.

The Twelve Factor approach declares all dependencies and has them automatically included through a loader/manifest system appropriate to the programming language used for the application. Another approach is to check all the dependencies during the installation configuration step.

Both options occur too late because they both wait until the first use. You really need something you can send ahead to check the target system before the customer commits to your app.
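As a rough sketch of something you could send ahead, here is a small preflight check. It assumes Python is already on the target system, which is itself a dependency. The function name preflight and the idea of passing in lists of commands and modules are my own illustration, not something from 12factor.net.

```python
import importlib.util
import shutil
import sys


def preflight(min_python, commands, modules):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d or later is required" % min_python)
    for command in commands:
        # shutil.which searches PATH much as a shell would.
        if shutil.which(command) is None:
            problems.append("command not found on PATH: " + command)
    for module in modules:
        # find_spec locates a top-level module without importing it.
        if importlib.util.find_spec(module) is None:
            problems.append("Python module not installed: " + module)
    return problems
```

You would call it with the versions, commands, and modules your app really needs, for example preflight((3, 8), ["git"], ["json"]), and show the returned list to the customer before installation starts.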

Config

Store config in the environment

This is part of the "separate data and code" theme from the 1970s. Your configuration settings are data, not code. Anything defined as a constant is data. Put the data in a configuration file or a database.

The Config page at 12factor.net mentions environment variables. Environment variables are external to your application and should be treated as hostile, with very careful checking before use.
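A minimal sketch of that careful checking might look like the following. The variable name APP_PORT and the default of 8080 are hypothetical. The point is that the value is validated as untrusted input before the application uses it.

```python
import os


def read_port(environ=os.environ, name="APP_PORT", default=8080):
    """Read a TCP port from the environment, rejecting anything suspicious."""
    raw = environ.get(name)
    if raw is None:
        return default
    # Accept plain ASCII digits only: no signs, spaces, or other characters.
    if not (raw.isascii() and raw.isdigit()) or len(raw) > 5:
        raise ValueError(name + " must be a whole number")
    port = int(raw)
    if not 1 <= port <= 65535:
        raise ValueError(name + " must be between 1 and 65535")
    return port
```

Passing the environment in as a parameter keeps the check easy to test without touching the real environment.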

A configuration file has the advantage that it can be version controlled and distributed as a starting point for new installations. Think about an app wrapped in a Web site deployment using Apache. You might provide a file named apache2.default.conf. During installation, the customer copies apache2.default.conf as apache2.conf and applies their changes to apache2.conf. They always have the default as a reference. You could do something like that with all your config files.

Configurations in databases have advantages for updating and reporting settings. You need a file with the initial default settings. You might provide a settings export for use when reporting issues.

Something you will need to think about when scaling upwards is the deployment of settings across server sets when you have your app/Web site replicated across multiple data centres on several continents. When does your central change override a local setting?

Think about something like taxes and shipping changes. They vary from country to country. You distribute a change at the country level. Then you remember Canada where the rules for applying taxes vary from province to province. Did you allow for their special rules?
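One possible policy, sketched below, is that the most specific setting always wins: province over country, country over the global default. The function name lookup, the layout of the settings dictionary, and the numbers are all made up for illustration. They are not real tax rates.

```python
def lookup(settings, name, country=None, region=None):
    """Return the most specific value defined for a setting."""
    # Search from the most specific scope to the global default.
    for scope in ((country, region), (country, None), (None, None)):
        values = settings.get(scope, {})
        if name in values:
            return values[name]
    raise KeyError(name)


# Illustrative numbers only, not real tax rates.
SETTINGS = {
    (None, None): {"tax_rate": 0.0},
    ("CA", None): {"tax_rate": 0.10, "currency": "CAD"},
    ("CA", "BC"): {"tax_rate": 0.15},
}
```

With this layout, a change distributed at the country level never tramples a province level rule, and a province with no special rule falls back to the country setting.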

You can see similar problems when a Web browser is updated. Something from Apple, Google, or Mozilla tramples over your local settings. You spend days trying to find stupid internal settings. They should have a way to protect local settings or to show a before/after list and a way to revert their mistakes.

Backing services

Treat backing services as attached resources

The Twelve Factor page on backing services describes the ideal situation where you can treat every service as an equal resource and not differentiate between similar resources. In real life, similar resources can have different security and different interfaces.

At best you might get the same interface when two resources are exactly the same release. A common approach is to write a thin service to provide the connection to external services. You can then alter the interface as needed. An example might be a bulk mail service. You choose between two suppliers and decide to keep open the option to swap across to the other supplier. While you are using service A, you can write a thin interface service to bring in service B as if it were service A.
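The thin interface could be sketched like this. SupplierA and SupplierB are stand-ins I invented for two bulk mail suppliers with different interfaces. They do not represent any real service.

```python
class SupplierA:
    """Stand-in for the supplier currently in use."""

    def send(self, to, subject, body):
        return {"to": to, "subject": subject, "body": body, "via": "A"}


class SupplierB:
    """Stand-in for the alternative supplier, with a different interface."""

    def deliver(self, message):
        return dict(message, via="B")


class SupplierBAsA:
    """Thin adapter that lets the application use B as if it were A."""

    def __init__(self, supplier_b):
        self._supplier_b = supplier_b

    def send(self, to, subject, body):
        message = {"to": to, "subject": subject, "body": body}
        return self._supplier_b.deliver(message)


def send_welcome(mailer, address):
    # The application only ever calls the interface of supplier A.
    return mailer.send(address, "Welcome", "Thanks for signing up.")
```

The application code in send_welcome does not change when you swap suppliers. Only the object passed in changes, either SupplierA() or SupplierBAsA(SupplierB()).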

Build, release, run

Strictly separate build and run stages

The Twelve Factor Build, release, run page is describing development using a compiled language with code locked into binary files. What they are trying to achieve is also relevant to PHP and other interpreted languages. Lock up the code early and keep it locked from testing through release.

In the PHP approach, you would edit your part of the code, run your local tests, then commit the change to Git. The commit should then trigger an automated test of the whole application with your change. From that point on, the code moving through release should be the same as the code locked in Git.

There are times when an engineer might change distributed code to introduce diagnostics. The engineer should be able to instantly revert to the distributed code, instead of having to reverse out an edit.
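One way to spot diagnostic edits left behind is to compare the installed files with a manifest of hashes built at release time. This is only a sketch under that assumption. The function names build_manifest and changed_files are hypothetical, and a real release process might rely on Git itself for the comparison.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's relative path to the SHA-256 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root, manifest):
    """List paths that were edited, added, or removed since the manifest."""
    current = build_manifest(root)
    return sorted(
        path
        for path in set(current) | set(manifest)
        if current.get(path) != manifest.get(path)
    )
```

An empty list from changed_files means the code on the server is still the code that was released.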

Processes

Execute the app as one or more stateless processes

The 12factor.net Processes page could be describing a Drupal Web site. State data is stored in a database and the connecting data, the session, is stored in Memcached. Drupal compiles some data for reuse and stores the result in a temporary file.

The only thing blocking Drupal from the best of a Twelve Factor approach is running everything in one master process. While Drupal uses lazy loading to minimise resource usage, where lots of resources are needed, they are all loaded together. The online shop example mentioned in this page becomes a monster in Drupal. You really need to look at using something like Phalcon to split off subsets of processing as independent processes.

Port binding

Export services via port binding

The Twelve Factor approach says you have to extend your application to directly handle requests from a network port. This results in security problems that need special handling. In essence, you have to split internal traffic from external traffic and make sure your application sees only internal traffic. It is an overly complex approach for the projects I work on.

The 12factor.net Port binding page talks about using your application as a backing service for another application. I suggest you split your application in that case. The public part should be separate from the service provided to other applications. In effect, you build a server component and a client component.

The server component can be used internally through any type of network facing process. A service written in PHP could use the PHP built-in web server, Apache, or Nginx, depending on activity volume.

The client component would be locked up in Nginx to use the Nginx security and proxy server caching.

The 12factor.net site mentions code you can place in front of your code to provide network service port processing. They do not describe what you need in front of that for a public facing service. They do not mention the security problems. Nginx and Apache provide far more than what you get from a simple port interface library.
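For the internal server component, port binding can be sketched with nothing more than the Python standard library. The handler name and response below are hypothetical. The detail that matters is binding to the loopback address so that only a local proxy such as Nginx can reach the service, and the sketch does none of the other things Nginx and Apache do for you.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class StockHandler(BaseHTTPRequestHandler):
    """Answers every GET with a fixed body; a real service would do more."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=0):
    # 127.0.0.1 accepts internal traffic only; port 0 asks the operating
    # system for any free port.
    return HTTPServer(("127.0.0.1", port), StockHandler)
```

Calling make_server(8001).serve_forever() would then run the service until it is stopped, with the public facing work left to whatever sits in front of it.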

Concurrency

Scale out via the process model

Concurrency works best when designed in. Your application may have to split into parts to work concurrently. Look up the Unix process model and similar approaches. Read the Concurrency page at 12factor.net. Design for concurrent processes.

Think about an online shop. The shopping application has a user login, a shopping cart, a product database, a stock database, and a purchase history. Do you need all that code and all those databases loaded all the time?

You might split off the stock code as a system running the stock database. It does not need to know about past purchases or anything about the product photographs. The stock system just needs the stock code, the quantity available, and the number reserved in shopping carts. The other processes can ask questions and submit changes.

The purchase history is needed only when the user looks at previous invoices. The user system could activate for login then disappear during shopping, with the basic user name and id in the visitor's session. On average, there would be less resource usage. In the online shop example, only the catalogue process would be active during the extensive product browsing phase of shopping.
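The core of such a stock system is small. The sketch below keeps everything in memory where a real service would sit in front of the stock database, and the class and method names are my own invention.

```python
class StockService:
    """Tracks quantity on hand and quantity reserved, keyed by stock code."""

    def __init__(self):
        self._on_hand = {}
        self._reserved = {}

    def set_on_hand(self, code, quantity):
        self._on_hand[code] = quantity
        self._reserved.setdefault(code, 0)

    def available(self, code):
        """Quantity that can still be placed in a shopping cart."""
        return self._on_hand.get(code, 0) - self._reserved.get(code, 0)

    def reserve(self, code, quantity):
        """Reserve stock for a shopping cart; return False if not enough."""
        if quantity <= 0 or quantity > self.available(code):
            return False
        self._reserved[code] = self._reserved.get(code, 0) + quantity
        return True

    def release(self, code, quantity):
        """Return reserved stock, for example when a cart is abandoned."""
        self._reserved[code] = max(0, self._reserved.get(code, 0) - quantity)
```

Nothing in it knows about users, invoices, or photographs, which is what lets it run as its own process.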

Disposability

Maximize robustness with fast startup and graceful shutdown

"Disposability" seems like a strange word here. A disposable function is, to me, something that can be cancelled without an adverse effect, with no need for a graceful shutdown. Outside of the strange terminology, fast startup and graceful shutdown are good.

There are situations where a fast startup is achieved by deferring processing until needed. This is a "just in time" approach and can defer the processing until the first transaction, resulting in a slow transaction. A better approach is to defer the initial work during startup and then, if there are no transactions in the queue, start a dummy transaction to load the most common code and data.
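The dummy transaction idea can be sketched as follows. The Service class, its queue, and the catalogue it loads are placeholders I made up. The slow loading step is reduced to a trivial function so the example stays runnable.

```python
import queue


class Service:
    def __init__(self):
        self.requests = queue.Queue()
        self._catalogue = None  # expensive data, loaded on first use

    def _load_catalogue(self):
        # Placeholder for slow work such as reading files or a database.
        return {"demo": 1}

    def handle(self, item):
        if self._catalogue is None:
            self._catalogue = self._load_catalogue()
        return self._catalogue.get(item)

    def warm_up(self):
        """Run a dummy transaction only when no real work is waiting."""
        if self.requests.empty():
            self.handle("warm-up")
            return True
        return False
```

Startup stays fast because nothing is loaded in the constructor, and the first real transaction is slow only if it arrives before the warm-up has run.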

Dev/prod parity

Keep development, staging, and production as similar as possible

Keeping development, staging, and production similar is a number one priority in development and is a major reason for replacing the old-fashioned waterfall development process with something agile. While the formal Agile methodology might not fit your requirements, elements of the agile approach work everywhere.

In the waterfall approach, someone draws up a long list of changes, they are all developed, they are all pushed through staging together, and they all hit the customer at the same time. The long development time results in a mixed set of fixes added into the system at various points to keep the customer's systems working. When the big batch of changes arrives at the next stage, any local changes are lost and have to be redeveloped. This is a common complaint about Microsoft products, with each release reinstating errors because the new release was branched out before the fixes were developed.

In agile, the fixes go through the full development process. Because the update process is so fast, the fixes still arrive fast, typically in a weekly update cycle.

Logs

Treat logs as event streams

This is good advice for big systems where an application is really an aggregation of components running on the same system. You send the logs to an aggregation service then split out related messages. As an example, a payment error might have messages from a shop system, a stock system, a payment gateway, an invoicing system, and a database system.

When you have all related messages, you can tell what happened when the payment failed. Were the purchased items returned to available stock? Was the invoice cancelled, reversed out, or left as unpaid?
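Pulling the related messages together might look like the sketch below. It assumes every component writes one JSON object per line and includes a shared transaction_id and a time field. Those field names are my assumption, not part of the Twelve Factor text.

```python
import json
from collections import defaultdict


def group_by_transaction(lines, key="transaction_id"):
    """Group JSON log lines by transaction and order each group by time."""
    grouped = defaultdict(list)
    for line in lines:
        event = json.loads(line)
        grouped[event[key]].append(event)
    for events in grouped.values():
        events.sort(key=lambda event: event["time"])
    return dict(grouped)
```

Given the mixed stream from the shop, stock, and payment systems, the entry for one transaction then reads as the story of that payment from start to finish.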

Separately from your logging, you want to make sure messages end up in the right locations. The user should know exactly what happened. "Transaction failed" is not enough. If the item they placed in the shopping cart was sold to someone else, they should know that you ran out of stock.

One error that was common in Australia was the "transaction rejected" message in payment systems. When the network failed, some payment systems issued the same message as used for warning about possibly stolen cards and other bad behaviour. You would get an evil look from the shop assistant. A few seconds later, the customer at the next cash register would get the same rejection and the shop manager would say something about "network failure, it happens all the time".

Admin processes

Run admin/management tasks as one-off processes

This part of the Twelve Factor methodology refers to a limited set of batch administration tasks you might choose to run from a 1950s style command line. In real life, there are many administration tasks that are a better fit to a formal interactive administration interface.

Think about a product update from a wholesale supplier. There might be tens of thousands of changes. You run it as a batch change. If all the images are sourced separately, the update file could be relatively small and handled by a process in a Web page. When the product changes have release dates in the future, there is no rush to push the changes through immediately. A shop receiving daily adjustments from many suppliers can queue them in a page and watch them run as part of the daily administration.

A mixed approach

We looked at an online shop as an example of something you might redevelop. Assume you are starting with PHP. Phalcon is among the fastest PHP frameworks and works well for Web services type applications. In a private local network, you could put Phalcon in Apache or Nginx then run a tiny service dedicated to something like a stock database. The result could be a stock service used by anything else, including a big Content Management System like Drupal.

Drupal would shrink and become, on average, a little bit faster. Stock adjustments would slow down a tiny bit due to the call to the separate service. Given that the stock adjustments are only needed on a few pages, like checkout, the small gains on many pages would outweigh the small losses on a few pages.

Conclusion

The Twelve Factor approach is all good. Some parts are limited to specific types of applications and other parts apply to every type of development. Read all the pages at 12factor.net. Bookmark the pages you want to discuss with your development team.