Thursday, May 11, 2017

Migration-base vs State-based database development approach

There are two main approaches for the development of SQL databases:

  • Transformational (migrations)
  • State-based (states)

Transformational or migration-based approach

The idea of the transformational approach is based on database migration from one state to another. After first production release, we will assume that database has its the first state. This state is the starting point for further migrations.
Why do I think that the starting point for migrations should be the first production release? The answer is simple. While the application has not gone into the production stage, it does not make much sense to try to save entered test data. Keeping data complicates development and testing process on earlier stages of project delivery. In this case, it is better to re-create the database each time and insert reference and test data into it. 
The diagram below describes the migration-based approach:


In this method, developers concentrate on writing migrations (M1, M2 .. Mn) from one database state to another (S1, S2 .. Sn). There is a good example of this approach. It is EntityFramework's migration mechanism. For more information, you can navigate and check MSDN article https://msdn.microsoft.com/en-us/library/jj591621(v=vs.113).aspx.

State-based approach

This approach is based on the following paradigm: developers have to maintain the most current snapshot of the database structure. What does it mean? The most convenient way to explain this is to analyze the following example. Let's assume that we have the table below:

CREATE TABLE People (
    PersonID  INT NOT NULL PRIMARY KEY CLUSTERED,
    LastName  VARCHAR(255) NOT NULL,
    FirstName VARCHAR(255) NOT NULL,
    Address   VARCHAR(255) NULL,
    City      VARCHAR(255) NULL
);

At some moment product owner decided to add one more property to the People table. In migration-based world in this particular scenario developer has to create migration script with ALTER TABLE Statement, but in state-base world we need to follow it's the main paradigm, so we need to update table creation script with a new column. Please see below.

CREATE TABLE People (
    PersonID  INT NOT NULL PRIMARY KEY CLUSTERED,
    LastName  VARCHAR(255) NOT NULL,
    FirstName VARCHAR(255) NOT NULL,
    Address   VARCHAR(255) NULL,
    City      VARCHAR(255) NULL,
    Country   VARCHAR(255) NULL
);

So developer needs to keep database structure up-to-date and the tools will do all the work. Cool, right? One of such tools was introduced by Microsoft when they shipped Visual Studio database project.
Let's see how these tools work. On the diagram below, you can find database development schema used by Visual Studio database project.


Microsoft Database project and tools inside of it will be trying to sync the database with current database script. For more info about Microsoft Database Project use https://msdn.microsoft.com/en-us/library/xee70aty(v=vs.100).aspx link.

What approach to choose?

It is difficult to say which approach is better or worse for a particular project. Next, I will give some advantages and disadvantages of each approach, which in my opinion play a key role in the choice of the mechanism for developing a database on the project. And you can decide what is the best for you.

Advantages of migration-based approach

  • Understandable workflow of data migration
  • Full control on generated SQL-scripts

Disadvantages of migration-based approach

  • Need write the tool for deployment script generation or pay for existing one
  • Developer's onboarding process require more time for understanding how database upgrade works
  • The complex mechanism for keeping the database in the actual state. Developer has to revert database to valid state to perform migration because order of migrations is important

Advantages of state-based

  • Gives developers access to the current state of the database
  • It is easier to audit and view history of changes in source control
  • Force developers to keep actual state of the database in source control
  • Database script compilation, developers can identify issues with their code during compilation
  • Can detect if somebody did manual changes in the production database and sync it with actual schema in source control
  • Build-in mechanism for database refactoring

Disadvantages of state-based

  • Developers do not have a lot of control over deployment script generation
  • Poor support for user data migration
  • Additional review of generated script required, developers have to check generated script before deployment