Database abstraction is the process of separating database storage from your use of the data in your programming code. What, when, where, and why do you abstract? Here are the main reasons, methods, and answers.
What is abstraction? A form of translation combined with hiding. Think about flying on commercial airlines as a passenger. Do you know the cabin is partially pressurised to keep you alive at high altitudes? Most people do not know. It is one of the things you do not have to think about because someone else performs the work to make your life easier and safer.
Abstraction gives you the opportunity to separate out specialised work to special workers, or suppliers, and keep your application programmers focused on the data required for the application. You can hire a database specialist to work on the database and present an abstract view of the database to the application programmers. You can hire a web services expert to create specific Web services code that presents a simplified service to the application developers. Any connection or special processing can be simplified when presented to the application using the result.
There are two main reasons for abstraction. The first reason is to make life easier for the application programmers. The second reason is to maintain compatibility between databases.
Your application might focus on money. Your database might not store money in an appropriate format. Your database administrator might offer to create a special money format using stored procedures or some other trick available in the database. Your application programmers work with money. The database internally converts the money to a different format for storage and performs the reverse conversion when you read the data.
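The pair of conversions described above can be sketched in a few lines. This is only an illustration, written in Python rather than as a stored procedure, and it assumes the storage format is whole cents in an integer column. The function names money_to_storage and money_from_storage are made up for the example.

```python
from decimal import Decimal, ROUND_HALF_UP


def money_to_storage(amount: Decimal) -> int:
    """Convert a money amount to whole cents for an integer column (assumed format)."""
    cents = (amount * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(cents)


def money_from_storage(cents: int) -> Decimal:
    """Reverse conversion, performed when the data is read back."""
    return (Decimal(cents) / 100).quantize(Decimal("0.01"))


# The application only ever sees Decimal money values.
stored = money_to_storage(Decimal("19.99"))
restored = money_from_storage(stored)
```

The application programmers call the two functions and never need to know the column holds cents.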
A database level abstraction, commonly called a view, can remove data from your view of a table or combine multiple tables as if they are one. Think of a customer account that records the city, state, or country where the customer lives. You have a separate table supplying tax rates by city, state, or country. A view could return customer accounts with their tax rates added to their address.
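As a rough sketch of the customer and tax rate example, the code below builds such a view in an in-memory SQLite database through Python's sqlite3 module. The table names, column names, and tax rates are invented sample data, and the join is by state only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, state TEXT);
    CREATE TABLE tax_rate (state TEXT PRIMARY KEY, rate REAL);

    -- Invented sample data for the illustration.
    INSERT INTO customer VALUES (1, 'Alice', 'NSW'), (2, 'Bob', 'VIC');
    INSERT INTO tax_rate VALUES ('NSW', 0.10), ('VIC', 0.08);

    -- The view presents the two tables as if they were one.
    CREATE VIEW customer_with_tax AS
        SELECT c.id, c.name, c.state, t.rate
        FROM customer AS c
        JOIN tax_rate AS t ON t.state = c.state;
""")

# The application queries the view like an ordinary table.
rows = conn.execute(
    "SELECT name, state, rate FROM customer_with_tax ORDER BY id"
).fetchall()
```

The application code never mentions the tax_rate table, so the database specialist is free to reorganise it.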
A common use of views is to remove the year from your birth date. Human resources can see your full date of birth so they can calculate your retirement date. Everybody else can see the day and month of your birthday so they can celebrate your birthday but not your age, because your age is confidential.
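A birthday view of this kind might look like the sketch below, again using SQLite from Python as a stand-in database. The employee table, its columns, and the sample row are assumptions for the illustration, and the date is assumed to be stored as ISO text (YYYY-MM-DD).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT,
        date_of_birth TEXT  -- ISO format, visible to human resources only
    );
    INSERT INTO employee VALUES (1, 'Alice', '1980-03-15');

    -- Everybody else reads this view, which exposes month and day but not the year.
    CREATE VIEW employee_birthday AS
        SELECT id, name, strftime('%m-%d', date_of_birth) AS birthday
        FROM employee;
""")

public_row = conn.execute(
    "SELECT name, birthday FROM employee_birthday"
).fetchone()
```

In a real deployment you would also restrict permissions so that ordinary users can read the view but not the underlying table. SQLite has no user permissions, so that part is not shown.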
Perhaps you are converting from database brand A to database brand B. As an example, PostgreSQL has special field formats for geometric measurements, the type you might use in computer aided design or mapping. If you use them and then convert to MySQL or most other databases, you have to emulate the geometric measurements. Database abstraction lets you perform the emulation in one place outside of your application.
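One way to picture the emulation is a pair of functions that store a point as two ordinary numeric columns. The sketch below uses SQLite as the stand-in for a database with no point type. The Point class, the landmark table, and the save_point and load_point functions are all hypothetical names for the example.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Point:
    """What the application works with, whatever the database stores."""
    x: float
    y: float


def save_point(conn: sqlite3.Connection, name: str, point: Point) -> None:
    # Emulation: the point is split into two plain numeric columns.
    conn.execute(
        "INSERT INTO landmark (name, x, y) VALUES (?, ?, ?)",
        (name, point.x, point.y),
    )


def load_point(conn: sqlite3.Connection, name: str) -> Optional[Point]:
    row = conn.execute(
        "SELECT x, y FROM landmark WHERE name = ?", (name,)
    ).fetchone()
    return Point(*row) if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE landmark (name TEXT PRIMARY KEY, x REAL, y REAL)")
save_point(conn, "depot", Point(1.5, -2.0))
loaded = load_point(conn, "depot")
```

Only these two functions would change if you moved back to a database with a native point type.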
Consider a company with many applications using MySQL and one application using PostgreSQL. The company decides to reduce maintenance overheads by converting the lone PostgreSQL application to MySQL. They only need a single one-off conversion and can manage without database abstraction.
Now consider the same company with several applications using PostgreSQL and management arguing about the choice of MySQL or PostgreSQL. You start converting applications from PostgreSQL to MySQL, then management change their minds and you have to convert all the MySQL applications to PostgreSQL. A better solution would be to convert the applications to use an abstraction layer compatible with both MySQL and PostgreSQL. You can then switch from one database to the other and back any time management change their minds.
Think about the applications you create today and the databases of tomorrow. MySQL does not have a true boolean field and emulates boolean using a small integer. There are some simple databases with no boolean fields. Copying data between database types is dangerous because the data conversions may introduce errors. You can change your approach and introduce your own abstraction to store boolean fields as integers for compatibility with all databases.
You can perform abstraction in your own code, in a layer of abstraction software, in your programming language, in SQL, or in your database. You choose, and you can choose more than one place for abstraction.
You can add abstraction in your code with a simple function, perhaps one called integer_to_boolean. You get maximum flexibility and two main disadvantages. First, you might forget to use the function. Second, you have to tell your code the type of database used to store the data.
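A function like integer_to_boolean could be as small as the sketch below. The database argument and the two database names it accepts are assumptions for the illustration. The PostgreSQL branch assumes the driver already hands back a real boolean.

```python
def integer_to_boolean(value, database):
    """Convert a stored column value to True, False, or None (for NULL)."""
    if value is None:
        return None
    if database == "mysql":
        # Boolean emulated as a small integer: zero is false, anything else true.
        return int(value) != 0
    if database == "postgresql":
        # Assumption: the driver returns a native boolean, so pass it through.
        return bool(value)
    raise ValueError(f"unsupported database: {database}")


active = integer_to_boolean(1, "mysql")
```

Both disadvantages are visible here: nothing forces a programmer to call the function, and every call has to say which database the value came from.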
You can add abstraction in your code with a class, perhaps one called boolean_emulator. You get maximum flexibility, plus the class could return the data in several formats including printable strings. You still have the same two disadvantages: you might forget to use the class and you have to tell your class the type of database used to store the data.
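A class version might look like the following sketch. It is named BooleanEmulator here to follow Python naming conventions. The method names, the database argument, and the "yes"/"no" strings are assumptions made for the example.

```python
class BooleanEmulator:
    """Wrap a boolean column value and return it in several formats."""

    def __init__(self, stored, database="mysql"):
        self.database = database
        # NULL stays None; anything else is normalised to a real boolean.
        self.value = None if stored is None else bool(stored)

    def as_boolean(self):
        return self.value

    def as_integer(self):
        return None if self.value is None else int(self.value)

    def as_text(self, true_text="yes", false_text="no", null_text="unknown"):
        """Printable form, for reports or web pages."""
        if self.value is None:
            return null_text
        return true_text if self.value else false_text

    def to_storage(self):
        """Value to write back, which depends on the database type."""
        if self.value is None:
            return None
        if self.database == "postgresql":
            return self.value       # assumed native boolean column
        return int(self.value)      # emulated as a small integer


flag = BooleanEmulator(1)
label = flag.as_text()
```

The class still has to be told the database type, which is the second disadvantage mentioned above.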
Using PHP as a programming language example, PHP performs some data conversions automatically as a form of code simplification or abstraction. PHP also has a database abstraction called PDO. You can replace PHP MySQL code with PHP PDO code and instantly most of your database accesses can work on MySQL plus PostgreSQL and several other databases. PDO only abstracts the database access, not the database column conversions, leaving you with some additional work.
There are database abstraction layers, with ADOdb as an example of a well known layer with a long history. Some content management systems chose ADOdb as their abstraction layer. ADOdb used to be a leader in the field and now lags behind because ADOdb tries to maintain backward compatibility too far.
Databases offer some abstraction through their access layer. PostgreSQL and MySQL both let you access them directly for speed and accuracy. They both also let you choose ODBC as an independent database access method. ODBC returns some data abstracted because ODBC is not designed to handle every data type from every type of database.
Views are an abstraction offered by some databases and the flexibility varies from database to database. In some databases, a view is created by code you add to the database. Views created by simply hiding some data are a convenience when you first use them and are a roadblock when your database administrator leaves because your application programmers cannot find out how the data is maintained. The main advantage of a view is the ability to severely restrict the data visible to an application programmer.
Stored procedures are code stored in a database and run by the database, either when your application calls them or, when attached as triggers, automatically on updates. Look at the cost of stored procedures from the perspective of the manager. Your Web site is written in PHP. Your stored procedures may have to be written in Perl or some other language, adding to your maintenance cost. Your applications are written by application programmers but the stored procedures are written by someone who is not normally a programmer and may not be subject to the same testing or control. When your database administrator leaves, you may have to hire a very experienced replacement to ensure the stored procedures are maintained.
The best form of abstraction is a software layer you can replace when your requirements change. The layer should be written in the same language as your applications, assuming all your applications are written in the same language.
The layer should allow some choice of database. PostgreSQL and MySQL both cost less than Oracle. A company using Oracle would, at a minimum, look for an abstraction layer compatible with both Oracle and PostgreSQL to make PostgreSQL a future possibility.
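To make the idea of a replaceable layer concrete, here is a deliberately tiny sketch in Python. It is not a real product. The DatabaseLayer class, the {p} parameter marker, and the sqlite_layer factory are invented for the illustration, and SQLite is used only because it runs without a server. A real layer would also have to handle differences in SQL dialect and column types.

```python
import sqlite3


class DatabaseLayer:
    """Hypothetical minimal layer: applications call execute() and fetch_all() only."""

    def __init__(self, connection, placeholder="?"):
        self._connection = connection
        # Drivers disagree on the parameter marker: sqlite3 uses "?",
        # while many MySQL and PostgreSQL drivers for Python use "%s".
        self._placeholder = placeholder

    def _sql(self, template):
        # Applications write "{p}" wherever a parameter goes.
        return template.replace("{p}", self._placeholder)

    def execute(self, template, params=()):
        cursor = self._connection.cursor()
        cursor.execute(self._sql(template), params)
        self._connection.commit()
        return cursor.rowcount

    def fetch_all(self, template, params=()):
        cursor = self._connection.cursor()
        cursor.execute(self._sql(template), params)
        return cursor.fetchall()


def sqlite_layer(path=":memory:"):
    """Factory for one backend; switching database means swapping this function."""
    return DatabaseLayer(sqlite3.connect(path), placeholder="?")


db = sqlite_layer()
db.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT)")
inserted = db.execute("INSERT INTO account (name) VALUES ({p})", ("Alice",))
names = db.fetch_all("SELECT name FROM account WHERE name = {p}", ("Alice",))
```

The application code in the last four lines names no database at all. Only the factory function knows which one is in use.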
There is no perfect layer, only layers offering more independence than you have without a layer. If your staff know how to use a layer that provides most of what you want, that layer might be a better choice than an unknown layer offering more features but more conversion costs.
Look for a layer known to new staff. ADOdb is less than ideal in some areas but it is widely used and is the standard database abstraction layer for some content management systems. You can find more people with ADOdb experience than with many of the alternatives.
Look at frameworks. Some application development frameworks are impossible to use unless you use everything from the framework. Other frameworks are designed to let you use only what you want to use. Zend Framework is an example of a PHP framework where they encourage you to use only the parts you need. Look at the database abstraction within the frameworks. Choose the best parts or the whole framework.
Content management systems
Most new web sites are built on content management systems. Drupal is the best example of a modern content management system. Drupal recently grew from version 6 to version 7. Version 6 has a database abstraction layer that works with MySQL and has some compatibility with PostgreSQL, but not enough. Drupal 7 has better compatibility with PostgreSQL plus compatibility with several other databases including Oracle and Microsoft SQL Server.