DbDeploy (and its .NET counterparts) works fine when everyone is working on, and checking into, a single master branch. Here is an example of the normal use case, so that we can compare it to the more complicated use cases that follow.
DeveloperA needs to make a change to the database and sees that the delta scripts are up to 77, so they make delta script 78 and continue testing the code they wrote to work with those database changes. Meanwhile, DeveloperB also sees that the scripts are up to 77, so they make a script 78 and start testing with their code changes. Let’s say that DeveloperA finishes first and checks in. DeveloperB goes to check in, sees they are out of date, pulls the latest, runs the unit tests again, and blamo! – a failure (two script 78s). At this point they are faced with an annoyance that can be worked around: they need to roll back their script 78, run the other script 78, rename their own script to 79, then re-run the unit tests and check in.
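For reference, “sees the delta scripts are up to 77” just means looking at the changelog table the tool maintains. A minimal sketch, assuming a DbDeploy-style changelog table with a change_number column (the exact schema varies between the ports):

-- Hypothetical DbDeploy-style changelog; exact schema varies by port.
select max(change_number) as latest_applied from changelog;
-- Both developers see 77, so both independently create script 78.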
Let’s take the same scenario and use date-based numbering, or timestamping. The last delta script checked in on master is 140901.1047. Notice I have switched to using decimals as the script numbers, in the form yyMMdd.hhmm. DeveloperA wants to make script change 140907.1532 and DeveloperB wants to make 140908.0854. When DeveloperB goes to check in, they pull the latest and run the unit tests. At this point the tool could roll back 140908.0854, apply 140907.1532, then re-apply 140908.0854. Or, if you are “feeling lucky”, the tool could just “fill in the hole” and apply 140907.1532, leaving the other script alone. The tool could decide whether to roll back based on whether undo scripts exist for all of the scripts that would need to be rolled back: if they do, use the rollback; if not, just apply the missing script.
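Here is a rough sketch of how the “fill in the hole” check could work, assuming the same changelog table (with a decimal change_number). In a real tool the list of scripts would be read from the filesystem; here it is faked with a table variable:

-- Scripts present in the repository; a real tool would read these from disk.
declare @on_disk table (script_number decimal(10,4));
insert into @on_disk values (140901.1047), (140907.1532), (140908.0854);

-- Any script on disk but missing from the changelog is a "hole".
select d.script_number
from @on_disk d
where not exists (select 1 from changelog c
                  where c.change_number = d.script_number);
-- Returns 140907.1532. Apply it in place if we are "feeling lucky",
-- or roll back everything after it first if undo scripts exist.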
The problem gets much more complicated when there are multiple developers working on multiple feature branches. This is more like the Git branching model. In this scenario let’s say there are two teams, TeamA and TeamB. Each of the teams develops a set of scripts to support their particular feature.
Let’s say TeamA develops:
140903.1242
140909.1117
140909.1734
And TeamB develops:
140904.1512
140905.0933
140911.1802
Assuming TeamA checks into master first, when TeamB gets latest they merge in TeamA’s scripts. *This time*, after TeamB checks in, TeamA will *also* need to merge in TeamB’s scripts. However, both teams should end up with a database that has the correct scripts applied to it. The possibility exists that one team’s scripts will force another team’s scripts to change. Let’s say TeamA’s script 140903.1242 renames TableX.ColumnN to ColumnA, and TeamB’s script 140904.1512 uses ColumnN (in a new view that they have created, for example). When TeamB gets latest and tries to run the unit tests, blamo! – error in script. If we were “filling in the holes” it would actually be 140903.1242 that caused the error, and if we rolled back it would be 140904.1512 that would cause the error. The point is that one or more scripts that have already been applied may need to change to resolve the incompatibility.
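To make the incompatibility concrete, the two scripts might look something like this (TableX is from the scenario above; the view name is invented for illustration):

-- TeamA's 140903.1242: rename the column.
exec sp_rename 'TableX.ColumnN', 'ColumnA', 'COLUMN';

-- TeamB's 140904.1512: a new view written against the old column name.
-- Once the rename above has been applied, this script fails, and TeamB
-- has to edit an already-applied script to reference ColumnA instead.
create view TableXSummary as
    select ColumnN
    from TableX;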
Timestamping doesn’t solve everything, but it comes pretty close. One use case that isn’t supported by timestamping is solved by hashing. Take the case of a single developer or team working on their own machine, trying to figure out the right way to make a script change. They may try a first version of a script that adds a column with an integer type, but then realize that it must also allow NULLs. They *could* create a new script for every little change that they make, but that feels cumbersome, verbose, and confusing to humans trying to follow the chain of scripts. It would be nice if the tool helped in this scenario. So, in the case of a single developer, they change the script that has already been run (which, in fact, may not even be the last script that was run) to allow NULLs. The tool then sees that the script has changed, rolls back to just before that script, and runs it again to create the column with the correct type.
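For example (the column here is invented, just to show the kind of in-place edit the tool has to detect):

-- 140910.0915 as first written:
alter table Sales.Customers add LoyaltyPoints int not null;

-- 140910.0915 after being edited in place – same script number, new contents.
-- Hashing the file is enough to detect the change; the tool then rolls back
-- to just before this script and reapplies it.
alter table Sales.Customers add LoyaltyPoints int null;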
One last point. Now that we have scripts being applied and reapplied due to the two scenarios mentioned above, there is another change we need to make when authoring scripts so that we don’t lose data unnecessarily. Everyone knows to use RENAME instead of DROP and ADD. However, let’s take the simple case of adding a new column. The script might look like this:
alter table Sales.Customers add Gender char(1);
Of course we would also want an undo operation:
alter table Sales.Customers add Gender char(1);
--//@UNDO
alter table Sales.Customers drop column Gender;
After the script has been applied, and we have been using the database and adding gender information, if for some reason we need to roll back and forward again, we lose all our gender data. All we need to do is place the information into a temporary table prior to dropping the column. We need to save both the primary key and the dropped column. So a solution for SQL Server might look like this:
alter table Sales.Customers add Gender char(1);

-- If the undo below ran earlier in this session, restore the saved data.
if object_id('tempdb..#Sales_Customers_Gender') is not null
begin
    update c
    set c.Gender = tmp.Gender
    from Sales.Customers c
    inner join #Sales_Customers_Gender tmp on c.CustomerId = tmp.CustomerId;

    drop table #Sales_Customers_Gender;
end

--//@UNDO

-- Save the primary key and the column we are about to drop.
select CustomerId, Gender
into #Sales_Customers_Gender
from Sales.Customers;

alter table Sales.Customers drop column Gender;
It is a little more work, but it maintains the data, which is the primary goal of tools like DbDeploy anyway.