Showing posts with label SQL_DB. Show all posts
Showing posts with label SQL_DB. Show all posts
I can’t release my database project due an error “data loss could occur”
error code

Updating database (Failed)
*** Could not deploy package.
Warning SQL72015: The column [dbo].[Table1].[ColumnToRemove] is being dropped, data loss could occur.

This happens due the fact that my source table changed/removed a column that’s still available in the target table.

When deploying a database project, you can add an publish file with this deployment. In this file (it’s a xml file) you can set/adjust certain settings
First, we setup visual studio and secondly, we adjust je build pipeline

1) Visual Studio
Here we are going to create the publish file
  • Right click on the solution and click “Publish”
  • Then a new window pop-up, go to “Advanced”
  • Uncheck - 'Block incremental deployment if data loss might occur’ checkbox.
Publish settings - general

  • Go to the second tab and check - 'DROP objects in target but not in project' and click “ok”
Publish settings - drop

  • Then select “Create profile” and a new a ‘***.publish.xml’ will be added to the solution

Add publish file

  • The next step is, is to add the publish file to source control “add ignored file to source control”
Add to source control

2) DevOps 
First, we edit the build pipeline
  • Go to Azure DevOps -> Pipelines -> Pipelines
  • Then edit your pipeline (please see the blog of Joost for creating a pipeline)
  • Go to the task “Copy files” and add to ‘contents’ “**\*.publish.xml” 
Build pipeline

Last step, is to add the xml file to the release pipeline
  • Go to pipelines -> release
  • Then edit your pipeline and add the publish file
Release pipeline

In this post you learned how to add a Publish Profile file. In this file you can change the publish settings. In this example we did want to make it possible to truncate tables when a column has been removed.
But it can also be used to disable the deployment of security-related objects to our database, like:
  • ExcludeUsers
  • ExcludeLogins
  • ExcludeDatabaseRoles

I want to up- and downscale my Azure SQL Database within my ADF pipeline. You recently showed how do this with Powershell code in an Automation Runbook, but that is too much coding for me. You also showed how to do this with some simple TSQL code, but that requires a lot of database permissions for the ETL user. Is there a codeless solution in Azure Data Factory which doesn't require permissions within my Azure SQL Database?
Change the Tier of your SQL Azure DB

Yes: Rest API! And don't worry there is no coding required. For this example we will use the Web activity to call the Rest API of SQL Server. This doesn't require any coding or permissions within the database itself. However you need ofcourse some permissions to change the database Pricing Tier. For this we will be using managed identities for Azure resources: we will give the Azure Data Factory permissions to the Azure SQL Server.

For this example we assume you already have an ADF with a pipeline for your ETL which we will extend with an upscale and a downscale.
Stage multiple tables in a foreach loop

1) Assign permissions to ADF
The permissions will be granted to the Azure SQL Server and not to a specific database. To change the database Pricing Tier with ADF, we need a role that can only change the database settings, but nothing security related: Contributer, SQL DB Contributer or SQL Server Contributer.
  • Go to the Azure SQL Server of the database that you want to up- or downscale with ADF
  • In the left menu click on Access control (IAM)
  • Click on Add, Add role assignment
  • In the 'Role' drop down select 'SQL DB Contributer'
  • In the 'Assign access to' drop down select Data Factory
  • Search for your Data Factory, select it and click on Save
Note: if you forget this step you will get an error while executing the pipeline in ADF
Not enough permissions

"errorCode": "2108",
"message": "{\"error\":{\"code\":\"AuthorizationFailed\",\"message\":\"The client 'xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' with object id 'xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' does not have authorization to perform action 'Microsoft.Sql/servers/databases/write' over scope '/subscriptions/xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/Joost_van_Rossum/providers/Microsoft.Sql/servers/bitools2/databases/Stage' or the scope is invalid. If access was recently granted, please refresh your credentials.\"}}",
"failureType": "UserError",
"target": "Upscale Stage DB",
"details": []

2) Get URL Rest API
Now it is almost time to edit your ADF pipeline. The first step will be adding a Web activity to call the Rest API, but before we can do that we need to determine the URL of this API which you can find here.{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Sql/servers/{serverName}/databases/{databaseName}?api-version=2017-10-01-preview

Within this URL you need to replace all parts that start and end with a curly bracket: {subscriptionId}, {resourceGroupName}, {serverName} and {databaseName} (including the brackets themselves). Don't use a URL ( for the database server name, but use only the name:

3) JSON message for Rest API
The Rest API expects a JSON message with the pricing tier for which you can find the details here under SKU. For some reason the location is required in these messages. Here are two example which you need to adjust for your requirements:
{"sku":{"name":"S1","tier":"Standard"}, "location": "West Europe"}
{"sku":{"name":"Basic","tier":"Basic"}, "location": "West Europe"}

3) Upscale DB
Go to your existing ADF pipeline to add an upscale (and downscale) activity
  • Add a Web activity in front of the existing Lookup activity
  • Give it a descriptive name: Upscale Stage DB
  • Go to the Settings tab of the Web activity
  • For URL enter the URL you created in step 2
  • For Method choose PUT
  • Add a new Header called 'Content-Type' with the value 'application/json'
  • Copy the JSON message that you create in step 3 to the Body
  • Collapse Advanced at the bottum
  • Choose MSI (old name for Managed Instance) as authentication
  • Enter '' in the Resource field. This is the URL for Managed Identity
Add Web activity to call Rest API

4) Downscale DB
Repeat the previous step, but now add the downscale activity at the end
  • Add a second Web activity, but now after your ETL activity (foreach in this example)
  • Give it a descriptive name: Downscale Stage DB
  • Go to the Settings tab of the Web activity
  • For URL enter the URL you created in step 2
  • For Method choose PUT
  • Add a new Header called 'Content-Type' with the value 'application/json'
  • Copy the JSON message that you create in step 3 to the Body
  • Collapse Advanced at the bottum
  • Choose MSI (old name for Managed Instance) as authentication
  • Enter '' in the Resource field
Adding an upscale and downscale via Rest API

In this post you learned how to give a resource (ADF) access to an other resource (SQL Server) via Managed Identities. Then we showed you how to call a Rest API in an ADF pipline Web activity for which we didn't have to write any code at all. From all the different options we showed you to Up- and Downscale an Azure SQL Database this is probably the easiest and safest method especially when you want to incorporate in your ETL process. An other positive thing is that the Rest API call is synchronous which means it doesn't start the ETL before the database has been upscaled. So no extra wait activities required.
You recently showed how to up- or downscale my Azure SQL Database with Powershell code in an Automation Runbook, but is there a way to do this without Powershell?
Change the Tier of your SQL Azure DB

Changing the database Pricing Tier can also be done with some TSQL script. First we will show you how to do this in Sql Server Mangement Studio (SSMS) and then how you could also do this in Azure Data Factory (ADF).

To change the Pricing Tier in SSMS you first need to connect SSMS to your Azure SQL Database and make sure to select the right database (not master) in your query window.

A1) Retrieve pricing tier
First a query to retrieve the Pricing Tier (Edition, Server Objective and Max Size). This can be done with a hardcode name or by using DB_NAME() within the DATABASEPROPERTYEX() command. However since you cannot retrieve the data from database X when connected to database Y (it will return NULL) the second options is the easiest.
-- Hardcoded Database Name
SELECT DATABASEPROPERTYEX('bitools2', 'Edition') as Edition
, DATABASEPROPERTYEX('bitools2', 'ServiceObjective') as ServiceObjective
, DATABASEPROPERTYEX('bitools2', 'MaxSizeInBytes') as MaxSizeInBytes

-- Get current Database Name
, DATABASEPROPERTYEX(DB_NAME(), 'ServiceObjective') as ServiceObjective
, DATABASEPROPERTYEX(DB_NAME(), 'MaxSizeInBytes') as MaxSizeInBytes

Retrieve Database Pricing Tier

A2) Change pricing tier
The next step is to change the Pricing Tier with an ALTER DATABASE command in SSMS. If you go to the portal right after executing the ALTER statement you will see a message Updating database pricing tier. When executing the command in SSMS it immediately shows that it is completed, however it takes a few moments to actually finish the command.
-- Change Princing Tier to Basic (only when size < 2GB)

-- Change Pricing Tier to Standard S1 with size set to 250GB

-- Change Pricing Tier to Standard S1 (withoud mentioning edition and size)

-- Change Pricing Tier to lowest Standard (S0)
ALTER DATABASE [bitools2] MODIFY (EDITION ='Standard');

Pricing Tier changes after execution the alter script

Note: According to the documentation you could also use CURRENT (instead of the database name) to alter the current database. However that failed on my laptop (SSMS 18.4). Please write your suggestions in the comments.
The source database '' does not exist.

B) Azure Data Factory
You could also execute this script in ADF to upscale your Azure SQL Database before the ETL (or ELT) starts and then downscale it afterwards. For this example we will add these ADF activities in our existing stage pipeline from a previous blog post to upscale the sink database (our stage database).
Stage multiple tables in a foreach loop

B1) Upscale DB
First we will add a Stored Procedure activity which executes the ALTER script from above to upscale the stage database.
  • Add the stored procedure activity in front of the existing Lookup activity
  • Give it a descriptive name: Upscale Stage DB
  • Make sure to use the same Linked service connection as your sink (Stage database)
  • Use 'sp_execute' for the Stored Procedure name
  • Add a new string parameter named 'stmt' and add the ALTER query as value (change DB name)
-- Change Pricing Tier to Standard S1 with size set to 250GB

Upscale DB in ADF

B2) Wait for upscale to be finished
Now you want to wait until the database has been upscaled before you start the ETL because during the upscale some queries will be cancelled. For this we will execute a query on the MASTER database (because queries will be cancelled on the Stage DB). This query will check whether the view sys.dm_operation_status still has an active ALTER query on our Stage DB.
  • Add a Stored Procedure Activity between the Upscale and the Lookup
  • Give it a descriptive name: Wait for upscale
  • Use a connection to the master database as Linked Service
  • Use 'sp_execute' for the Stored Procedure name
  • Add a new string parameter named 'stmt' and add the query below as value (change DB name)
  • Connect all activities in the pipeline: Upscale => Wait => Lookup
-- Wait until pricing tier has changed (retry ever 30 second)
FROM sys.dm_operation_status
WHERE operation = 'ALTER DATABASE' -- Only look for ALTER operations
AND state = 1 -- which are still in progress
AND major_resource_id = 'Stage' -- on our database
WAITFOR DELAY '00:00:30';
Bijschrift toevoegen

B3) Downscale DB
After the foreach loop is done we can add an other Stored Procedure activity to downscale the Stage database.
  • Add the stored procedure activity after the existing ForEach activity
  • Give it a descriptive name: Downscale Stage DB
  • Make sure to use the same Linked service connection as your sink (Stage database)
  • Use 'sp_execute' for the Stored Procedure name
  • Add a new string parameter named 'stmt' and add the ALTER query as value (change DB name)
  • Connect the ForEach activity to the Downscale Acticity
-- Change Princing Tier to Basic (only when size < 2GB)
Upscale, Wait, ETL and Downscale

Note: A big downside of using this method in Azure Data Factory is that you need a lot of permissions on your database. Which in most cases you don't want do give to your ETL account. In a next blog post we will show you an other alternative: Rest API.


In this post you learned how to change the database Pricing Tier with some simple TSQL script instead of a Powershell. Probably most suitable during development to temporarily upscale your DB to make the queries faster. But it could also be very useful to integrate the scaling in your ETL (or ELT) proces to minimize the time for a more expensive/faster database. In a next blog post we will introduce you to the Rest API which is very suitable for ADF.

To save some money on my Azure bill, I want to downscale the vCore or the number of DTUs for my Azure SQL Databases when we don't need a high performance and then upscale them again when for example the ETL starts. How do you arrange that in Azure?
Change the Tier of your SQL Azure DB

A few years ago we showed you how to do this with some PowerShell code in an Azure Automation Runbook with the AzureRM modules. However these old modules will be be out of support by the end of 2020. So now it is time to change those scripts.

The script works for both Database Transaction Units (DTU) and Virtuals Cores (vCore). DTU uses a Service Tier (Basic, Standard and Premium) and then the actual Performance Level (B, S0 to S12, P1 to P15) which indicates the number of used DTUs. More DTUs means more performance and of course more costs.

The newer vCore model is now recommended by Microsoft. It is a bit more flexible and transparent: it actually shows you the processor, memory and IOPS. But the lowest vCore price is much more expensive than the lowest DTU price. So for development and testing purposes you probably still want to use DTU. The vCore model also uses a Service Tier (GeneralPurpose, BusinessCritical) and then number of vCores combined with the processor (e.x. GP_Gen4_1 or GP_Gen5_2).

Use the Microsoft documentation to compare both models, but a little comparison. S3 with 100 DTUs is about the same as General Purpose with 1 vCore. P1 with 125 DTUs is about the same as Premium with 1 vCore.

1) Create Automation Account
First we need to create an Automation Account. If you already have one with the Run As Account enabled then you can skip this step.
  • Go to the Azure portal and create a new resource
  • Search for automation
  • Select Automation Account
  • Choose a useful name for the Automation Account
  • Select your Subscription, Resource Group and the Region
  • For this example we will use the Azure Run As account. So make sure to enable it and then click on the Create button.
Create Azure Automation Account

2) Add Module Az.Sql
Before we start writing some code we need to add a PowerShell module called Az.Sql. This module contains methods we need in our code to upscale or downscale the Azure SQL database, but first we need to add Az.Accounts because Az.Sql depends on it.

If you forget this step you will get error messages while running your code that state that some of your commands are not recognized:
Get-AzSqlDatabase : The term 'Get-AzSqlDatabase' is not recognized as the name of a cmdlet, function, script 
file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct
and try again.
  • Go to the newly created Azure Automation Account
  • Go to Modules in the left menu
  • Click on the Browse Gallery button
  • Search for Az.Accounts
  • Click on Az.Accounts in the result and import this module
  • Also search for Az.Sql (but wait until Az.Accounts is actually imported)
  • Click on Az.Sql in the result and import this module
Adding a new module to your Automation Account

Note: if you are using an existing Automation Account then you probably already added Az.Accounts.

3) Create Runbook
Now we are ready to create a runbook in the Azure Automation Account and start writing some PowerShell code.
  • Go back to the overview page of your newly created Azure Automation Account
  • Click on Runbooks in the left menu
  • Click on the + Create a runbook button to create a new Runbook
  • Enter a descriptive name for the Runbook like UpDownScaleDb
  • Select PowerShell as Runbook type
  • Optionally add a description and click on the Create button
Create a Runbook

4) Edit Runbook code
Next edit the new Runbook if it wasn't already opened by the previous step. Copy the code below and paste it in the editor. Then study the code and its comments to understand the code that can upscale and downscale your Azure SQL DB. It exists of five parts:
  1. Parameters
  2. Log in to Azure
  3. Get current pricing tier
  4. Upscale or downscale
  5. Logging
To up or downscale the script needs five parameters. The first three parameters 'ResourceGroupName', ServerName'' and 'DatabaseName' are to indicate for which database the DTU or vCore needs to be changed. The fourth and fifth parameters 'Edition' and 'PricingTier' are used to up or downscale your DB. There are a couple of parameter validations which you could extend to make your script even more monkey proof.
Note: if you want to call this script via Azure Data Factory (ADF), then you need to change the parameter part. You can find all the details to do that in our blog posts about Runbook parameters and ADF and using the Webhook activity in ADF. If this is your first time creating a runbook then first try the standard script and then adjust it to your needs.

Log in to Azure
This is a standard piece of code that you will see in all of our examples. Please read our blog post about the Azure Run as Account for more detailed information.

Get current pricing tier
This piece of code tests whether it can find the database and gets its current pricing tier. It stores the current pricing tier and uses it later on for an extra check when upscaling or downscaling the DB.

Upscale or downscale
This is the actual code for changing the database pricing tier. There is an extra check to compare the current pricing tier with the new pricing tier. It now throws an error when they are equal. You could change that to write an warning instead of an error.
Note: you could also send emails to notify you of any errors

The last piece of code is for logging purposes. It shows you that it successfully changed pricing tier of the DB (or DWH) and how long it took to accomplish that.

# PowerShell code
# Parameters





# Keep track of time

# Log in to Azure with AZ (standard code)
Write-Verbose -Message 'Connecting to Azure'

# Name of the Azure Run As connection
$ConnectionName = 'AzureRunAsConnection'
# Get the connection properties
$ServicePrincipalConnection = Get-AutomationConnection -Name $ConnectionName

'Log in to Azure...'
$null = Connect-AzAccount `
-ServicePrincipal `
-TenantId $ServicePrincipalConnection.TenantId `
-ApplicationId $ServicePrincipalConnection.ApplicationId `
-CertificateThumbprint $ServicePrincipalConnection.CertificateThumbprint
if (!$ServicePrincipalConnection)
# You forgot to turn on 'Create Azure Run As account'
$ErrorMessage = "Connection $ConnectionName not found."
throw $ErrorMessage
# Something else went wrong
Write-Error -Message $_.Exception.Message
throw $_.Exception

# Getting the database for testing and logging purposes
$MyAzureSqlDatabase = Get-AzSqlDatabase -ResourceGroupName $ResourceGroupName -ServerName $ServerName -DatabaseName $DatabaseName
if (!$MyAzureSqlDatabase)
Write-Error "$($ServerName)\$($DatabaseName) not found in $($ResourceGroupName)"
Write-Output "Current pricing tier of $($ServerName)\$($DatabaseName): $($MyAzureSqlDatabase.Edition) - $($MyAzureSqlDatabase.CurrentServiceObjectiveName)"

# Set Pricing Tier Database
# Check for incompatible actions
if ($MyAzureSqlDatabase.Edition -eq $Edition -And $MyAzureSqlDatabase.CurrentServiceObjectiveName -eq $PricingTier)
Write-Error "Cannot change pricing tier of $($ServerName)\$($DatabaseName) because the new pricing tier is equal to current pricing tier"
Write-Output "Changing pricing tier to $($Edition) - $($PricingTier)"
$null = Set-AzSqlDatabase -DatabaseName $DatabaseName -ServerName $ServerName -ResourceGroupName $ResourceGroupName -Edition $Edition -RequestedServiceObjectiveName $PricingTier

# Show when finished
$Duration = NEW-TIMESPAN –Start $StartDate –End (GET-DATE)
Write-Output "Done in $([int]$Duration.TotalMinutes) minute(s) and $([int]$Duration.Seconds) second(s)"

Note: once your database grows to a certain size you can not downscale it to certain levels because there is a max database size for certain pricing tiers. For example if your DB is larger than 250 GB then you can note downscale it below S3. There are size checks that you could add to avoid errors.

5) Testing
Testing the functionality of your code can be done in the runbook editor. Click on the Test pane button above your script. After that you need to fill in the five parameters and hit the Start button to execute the script.
Testing a Powershell script

6) Scheduling Runbook
To schedule your runbook in Azure Automation you first need to publish it via the Runbook editor. After it has been published you can add a schedule to this runbook.
  • Edit the script in the runbook editor
  • Click on publish (the editor will close and you will be redirected to the overview page)
  • In the overview page click on Link to schedule
  • In the Schedule menu you can select an existing schedule or create a new one
  • In the Parameter menu you can provide the value for the five parameters
Add schedule to a runbook

Note: If you have multiple Azure SQL Database that you all want to up or downscale on the same time then you have a slight problem because you cannot reuse a schedule for the same runbook multiple times with different parameters (please upvote or add a comment). Workarounds:
  1. create multiple identical schedules (ugly but works)
  2. do everything in one big script (less flexible but works)
Log of runbook executions

Details of single execution

In this blog post you learned how to schedule an up- or scale of your Azure SQL DB (or DWH) to save money in case you don't need a high performace 24 hours a day. Scheduling is done in Azure Automation, but with some minor changes you can also do that via an ADF pipeline.

The Powershell scripting gives you the most flexibility for example to add additional tests and notifications, but for those who don't like scripting: in a next blog post we will show you how to accomplish this with TSQL code only. You can either use that for an on-the-fly up- or downscale or use it in a Store Procedure Activity in Azure Data Factory.
Before we can deploy our database project through the DTAP environment we need to build it to check whether the merged code of all developers still works. The result of the build is a dacpac file which can be used for the deployment. But what is a dacpac file and how do we create it in Azure DevOps?
Build Pipeline

First, Dacpac stands for Data Application Component Package. It is a single file containing a database model that is equal to the database project in Visual Studio. It contains all the information to create a database. You can rename it from .dacpac to .zip to see the content of the file.

1) Create empty build pipeline
There are several pipeline templates that you can use, but for this example we will start with an empty build pipeline.
  • Go to Pipelines (Builds) in the left menu
  • Click on New to create a New Build Pipeline
  • Use the classic editor link on the bottom
  • Select Azure Repos Git as the source
  • Then select the Team project, Repository and the default branch
  • Next choose an Empty job
Create empty Build Pipeline

Now give the new Build pipeline a useful name. Next is choosing the Agent pool. For this example we will use the default Microsoft hosted pool: Azure Pipelines. The agent for this pipeline will be 'vs2017-win2016'.

2) Add trigger
To 'Enable continuous integration' for the database project we will need to add a trigger. For this example we will use a Branch filter on 'Master' and a Path filter on the path of this database project in GIT. This trigger will automatically start the build of the database project when someone changes code in the master branch for this particularly project (if done right via a pull request from a feature branch to the master branch).
  • Go to the Triggers tab
  • Enable continuous integration by checking the checkbox
  • First set the branch filter to: Include and master (or the branch you want to use)
  • Click on + Add below Path filters
  • Set the path filter to: include and the path of your project in Git: DB/HST/HST_BB/*
Add trigger

3) Builds tasks: Build solution
Now we need to add tasks to the Agent job generated in step 1. The first task is MSBuild which will build the Visual Studio database project. Once succeeded, this will generate a dacpac file which will be used by the next tasks
  • Click on Agent job 1 and optionally change its name
  • Click on the + icon to add a task the Agent job
  • Search for msbuild in the seachbox
  • Select MSBuild (Build with MSBuild) and click Add
  • The only property you need to set is the filepath of the sqlproj file in GIT: DB/HST/HST_BB/HST_BB.sqlproj
Add MSBuild task

4) Builds tasks: Copy dacpac and publish files
With this task we will copy the result of the build task (a dacpac file) to the Artifact Staging Directory. We need to specify which file we want to copy and to which folder. For the target folder we will use a predefined variable pointing to the right folder. For the files we will use a wildcard filter. Note that we not only publish the dacpac file, but also the Publish profile xml file.

  • Click on Agent job 1 (or the new name you provided in the previous step)
  • Click on the + icon to add a second task the Agent job
  • Search for 'copy files' in the seachbox
  • Select Copy files (Copy files from .... folder paths) and click Add
  • In the contents field add two rows:
    • **\*.dacpac
    • **\*.publish.xml
  • As target folder add: $(Build.ArtifactStagingDirectory)
Copy dacpac to Artifact Staging folder

5) Builds tasks: Publish build artifact
The last step of the build pipeline is to publish all files in the Artifact Staging Directory and give them an Artifact Name. This Artifact Name will be used in the Release pipeline which we will explain in a subsequent post.
  • Click on Agent job 1 (or the new name)
  • Click on the + icon to add a second task the Agent job
  • Search for 'Publish build artifact' in the searchbox
  • Select Publish build artifact (Publish build .... file share) and click Add
  • Give the artifact a new name (instead of drop)
Publish the artifacts

Note: in this case the artifact is a dacpac file, but it could also be ispac for SSIS or a assembly for C#.

6) Test the pipeline
Now it's time to test your build pipeline by committing some changes to your master branch via Visual Studio (or to create a pull request to pull changes to your master branch). This will trigger an automatic build. If you don't want to make any changes at this moment, you could just click on (Save &) Queue in devops to do a build manually.
Dry run your build pipeline

In this post you saw how to build your database project automatically when someone changes something in the master branch (like a pull request). You don't have to check the result manually if you setup notifications in DevOps. Then you will receive an email when something succeeds of fails.

Bisides master branches you could also build other branches like feature or personal branches by using different filters. Or you could schedule your build to build each night. After setting up this basic example explorer and try out all the other options from the build pipeline. In a followup post we will show you how to deploy this built artifact through your DTAP environment using Continuous Deployment.

I'm using database projects combined with Azure DevOps CI CD and I want to deploy data through my DTAP enviroment for some reference tables. Another reason could be that you want to do some data preparations before deploying all database changes. How do I do that in Visual Studio Database projects?
Adding Pre- and Post-Deployment scripts

This is where Pre-Deployment and Post-Deployment scripts could help you out. It's all in the name, but a Pre-Deployment script will be execute before deploying the database project and a Post-Deployment script will be executed after that deployment. These are all just regular .sql files containing T-SQL code.

For example a Pre-Deployment script can be used to secure data in a temporary table before a table change. A Post-Deployment script can be used to copy the data back to the appropriate tables and then clean up the temporary table afterwards.

Deploying data through the DTAP environments is nothing more then creating a delete statement and some subsequent insert queries. If the reference table already exits you could use a Pre-Deployment script or else you should use a Post-Deployment script.

1) Creating a deployment script
To keep our database project clean we will first create a folder called 'Scripts' to store the Pre-Deployment and Post-Deployment scripts. If you have a lot of scripts you could also create two separate folders.

To add scripts to your project, right click new folder in your the project and choose: Add, New Item..., User Scripts and then Pre-Deployment Script or Post-Deployment Script. Then give the file a suitable name.
Adding Pre- or Post-Deployment scripts

Note: see the file property of the deployment file that is called Build Action. This is where you determine the purpose and execution moment of a regular '.sql' file. However you can have only one Pre-Deployment and one Post-Deployment file in a project. There is a workaround below if you need multiple files.

2) Adding code
Now you can add your T-SQL code, but make sure it is repeatable (in case of a crash or when deploying multiple times). So if you want create a temporary table, make sure it doesn't already exits by adding a drop if-exists construction before the create statement.

Below a script that pushes data through the DTAP environment for a reference table.
Post-Deployment Script Template
This file contains SQL statements that will be appended to the build script.
Use SQLCMD syntax to include a file in the post-deployment script.
Example: :r .\myfile.sql
Use SQLCMD syntax to reference a variable in the post-deployment script.
Example: :setvar TableName MyTable
SELECT * FROM [$(TableName)]

DELETE FROM [ref].[myTable]
INSERT [ref].[myTable] ([Column1], [Column2], [Column3]) VALUES ('bla 1', 'bla 2', 'bla 3')
INSERT [ref].[myTable] ([Column1], [Column2], [Column3]) VALUES ('bla a', 'bla b', 'bla c')

If you have a big deployment script and you want to divide the code over multiple files then you have to create one 'master' script to call the other 'child' scripts because you can only have one Pre-Deployment and one Post-Deployment script in a database project. For this example we added multiple child scripts in the same folder as the parent script and called those by adding ":r .\" in front of the file names.
Post-Deployment Script Template
This file contains SQL statements that will be appended to the build script.
Use SQLCMD syntax to include a file in the post-deployment script.
Example: :r .\myfile.sql
Use SQLCMD syntax to reference a variable in the post-deployment script.
Example: :setvar TableName MyTable
SELECT * FROM [$(TableName)]

There can be only one post-deployment script per database! (One script to rule them all...)

:r .\PostDeploymentFile1.sql
:r .\PostDeploymentFile2.sql
:r .\PostDeploymentFile3.sql

In this post you learned how to add some scripts before and after the general deployment of the database project. A good way to either add some reference data or to do some cleanup before you deploy your changes. Next take look at the publishing profile file for you SQL DB project.

I want my Data Warehouse databases in Azure DevOps just like my ETL projects and use Contunious Integration (CI) and Continuous Deliver (CD) for multiple environments. How do you do that?
SQL Database in DevOps

If you read a couple of forums and blogs about this subject then you will probably notice that 'all roads lead to Rome'. In this blog series we will explain our approach, but feel free to deviate.

Just for the record. With Continuous Integration (CI) we mean merging the code of all developers and building the project to check whether everything still works. Continuous Deliver will orchestrate the  deployment of the package created by CI through out the DTAP environment in an automated manner to reduce costs, but speed up the release proces.

For this example we work with four environments each with its own Azure SQL database: Development, Test, Acceptance and Production. The Development database is a shared database where all developers do there coding.

The database changes since the last compare will be pushed to a Visual Studio Database project. Which on its turn will be committed to Git. After that the project will be build in DevOps and then deployed through all environments.
DTAP environments

Note: This approach with a single shared development database only works for smaller teams or when the database work can be separated to avoid conflicts. The alternative is to give each developer their own local development database. This will take a little more syncing and merging effort though.

1) Visual Studio Database project
First, you either need Visual Studio 2017 or 2019. You have three options:
  1. SSDT standalone installer. This is Visual Studio 2017 with only the BI and DB project templates.
  2. Visual Studio 2017 (Community Edition is free). During installation locate Data storage and processing and select SQL Server Data Tools. Finding a download without an MSDN license is a bit tricky since 2019 is the current version
  3. Visual Studio 2019 (Community Edition is free). During installation locate Data storage and processing and select SQL Server Data Tools.

Data Storage and processing - SQL Server Data Tools

After installing you will find the SQL Server Database Project under SQL Server when you create a new project.
Create new Database Project

2) Import or Schema compare
Now that we have an empty database project we need to add the database items from our development database to this new project. You can either do an import of the database or do a Schema Compare between the database and the project.

When importing a new database you only have couple of options like not importing referenced logins or the folder structure. I would suggest not to import any logins because the are probably different on all environments. As folder structure I recommend Schema\Object Type because I think this is the clearest structure. Note that you can do an import only once.
Import Database

The second option is to do a schema compare which gives you much more options for tuning the synchronization. The best feature is that you can repeat this any time you like to update the Visual Studio Project.

Right click the project and choose Schema Compare... Then make sure the source database is on the left side and the Visual Studio project on the right side. Click on the options/settings icon to tune the comparison. In this case we will ignore users and role memberships.

Now press the compare button and review all changes especially when you work in a team. Then you probably only want your changes. Uncheck changes that are not yours or not ready to go to the next environment. By clicking on a change you can see the actual differences. Last step is to press the Update button and move all selected changes to the Visual Studio project. Before committing any changes to Git you should built your project to check if everything works.
Sql Schema Compare

Note: When working with multiple developers in a single development database you will probably encounter a couple of database changes which where not made by you. This is the point where you have to be selective on which changes you want to push to the database project.

Tip: When you have a lot of 'garbage' in your development database you could do a clean up by switching the source and target in the Sql Schema Compare and then push the update button. This will throw everything away that is not in your database project. It is probably a good idea to first create a backup.

3) Add Schema Compare to project
A great feature of the Sql Schema Compare file is that you can save it to your PC and then add it to your database project. The next time you only have to open it and click on the compare button.
Add Compare file to db project

4) Dacpac to create or update
If you built the database project in Visual Studio or DevOps it will generate a .dacpac file which can be used to create or update the next database in our DTAP environment. By default it can only update tables when they get bigger (extra columns or larger datatypes). This is to prevent loss of data. In a next post we will explain how to overcome this with Publish profiles or Pre-Deployment script.

In this introduction post you learned how to create a database project in Visual Studio and how to synchronize a database and a database project. In the next posts we will create a Build pipeline in DevOps to validate your committed database changes and after that we will create a Release pipeline to release your committed database changes through the DTAP environment.

A downside of database projects in Visual Studio is that it can become very slow if you have like thousands and thousands of database objects. For those very large databases (not in size, but in structure) you could also take a look at commercial database tools like those of redgate or SentryOne. However the majority of Data Warehouse databases will probably easily fit in Visual Studio Database projects.

As we slowly move from on-premises data warehouses with Microsoft SQL Server to cloud data warehouses in Microsoft Azure, we need to know more about the various options in Azure. You probably already used an Azure SQL Database, but Microsoft also introduced Azure SQL Data Warehouse. What are the differences between these two databases?
Azure SQL DB vs Azure SQL DW

Back in 2013, Microsoft introduced Azure SQL Database which has its origin in the on-premises Microsoft SQL Server. In 2015 (however public availability was in July 2016) Microsoft added SQL Data Warehouse to the Azure cloud portfolio which has its origin in the on-premises Microsoft Analytics Platform System (APS). This was a Parallel Data Warehouse (PDW) combined with Massively Parallel Processing (MPP) technology and included standard hardware. It is the 'big brother' of SQL Server, but with a slightly different purpose.

In this post we will briefly describe the differences between these two Microsoft Azure Services, but first Microsofts own definitions:
  • Azure SQL Database is a relational database-as-a service using the Microsoft SQL Server Engine (more);
  • Azure SQL Data Warehouse is a massively parallel processing (MPP) cloud-based, scale-out, relational database capable of processing massive volumes of data (more);

1) Purpose: OLAP vs OLTP
Although both Azure SQL DB and Azure SQL DW are cloud based systems for hosting data, their purpose is different. The biggest difference is that SQL DB is specifically for Online Transaction Processing (OLTP). This means operational data with a lot of short transactions like INSERT, UPDATE and DELETE by multiple people and/or processes. The data is most often highly normalized stored in many tables.

On the other hand SQL DW is specifically for Online Analytical Processing (OLAP) for data warehouses. This means consolidation data with a lower volume, but more complex queries. The data is most often stored de-normalized with fewer tables using a star or snowflake schema.

2) Achitecture
In order to make the differences more clear a quick preview of the architecture of Azure SQL Data Warehouse, where you see a whole collection of Azure SQL Databases and separated storage. The maximum number of compute notes at the moment is 60.
    Architecture SQL DW: Decouples (Blob) storage from compute (SQL DB)

    3) Storage size
    The current size limit of an Azure SQL Database is 4TB, but it has been getting bigger over the past few years and will probably end up around 10TB in the near future. On the other hand we have the Azure SQL Data Warehouse which has no storage limit at all (only the limit of your wallet), because the storage is separated from the compute.

    3) Pricing
    The pricing is also quite different. Where Azure SQL DB starts with €4,20 a month, Azure SQL DW starts around €900,- a month excluding the cost of storage which is included in SQL DB. The storage costs for Azure SQL DW are around €125,- per TB per month. And the maximum costs of a single SQL DB is around €13500,- where SQL DW ends around a massive €57000,- (excl. storage). But when you take a look at the architecture above, it should be no surprise that SQL DW is more expensive than SQL DB, because it consists of multiple SQL DBs.

    However, SQL DW has one big trick up its sleeve that SQL DB hasn't: you can pause it completely and then you only pay for storage. If you start your SQL DW with your ETL job and pause it right after you processed your Azure Analysis Services then you only need it a small percentage of the month.

    Note: prices are from July 2017

    4) DTU vs DWU
    SQL DB has 15 different pricing tiers which specify the number of Database Transaction Units  (DTU) and the storage size/type:
    - Basic
    - Standard (S0, S1, S2, S3)
    - Premium (P1, P2, P3, P4, P6, P11, P15)
    - Premium RS (PRS1, PRS2, PRS4, PRS6)
    Basic has only 5 DTUs and the highest number of DTUs is, at the time of writing, 4000.
    The term DTU is a bit vague. It is a mysterious combination of RAM, CPU and read-write rates, but basically if you want to double the performance of your current database you just need to double the number of DTU's for your database.

    SQL DW has 12 different pricing tiers and uses Data Warehouse Units (DWU) to specify the performance level.
    - DWU100, 200, 300, 400, 500, 600, 1000, 1200, 1500, 2000, 3000, 6000
    The term DWU is a little less vague, because if you divide that number by 100 you have the number of compute nodes available for that pricing tier. On the other hand the exact combination of CPU, memory and IOPS per compute note is unknown.

    Because both services have a different purpose it is a bit strange to compare the hardware, but according to this MSDN blog post 1 DWU is approximately 7,5 DTU.

    But there is also some similarity: for both services you can use the same script to change the pricing tier on the fly to either give the performance a real boost when needed or the save money in the quiet hours.

    5) Concurrent Connection
    Although SQL DW is a collection of SQL Databases the maximum number of concurrent connections is much lower than with SQL DB. SQL DW has a maximum of 1024 active connections where SQL DB can handle 6400 concurrent logins and 30000 concurrent sessions. This means that in the exceptional case where you have over a thousand active users for your dashboard you probably should consider SQL DB to host the data instead of SQL DW.
    For more details see the SQL DB Recource Limitations and SQL DW Recource Limitations.

    6) Concurrent Queries
    Besides the maximum connections, the number of concurrent queries is also much lower. SQL DW can execute a maximum of 32 queries at one moment where SQL DB can have 6400 concurrent workers (requests). This is where you see the differences between OLTP and OLAP.
    For more details see the SQL DB Recource Limitations and SQL DW Recource Limitations.

    7) PolyBase
    Azure SQL Data Warehouse supports PolyBase. This technology allows you to access data outside the database with regular Transact SQL. It can for example use a file in an Azure Blob Storage container as a (external) table. Other options are importing and exporting data from Hadoop or Azure Data Lake Store. Although SQL Server 2016 also supports PolyBase, Azure SQL Database does not (yet?) support it.

    8) Query language differences
    Although SQL DW uses SQL DB in the background there are a few minor differences when quering or creating tables:
    - SQL DW cannot use cross databases queries. So all your data should be in the same database.
    - SQL DW can use IDENTITY, but only for INT or BIGINT. Moreover the IDENTITY column cannot be used as part of the distribution key.
    - Also see this SQL DW list of unsupported table features.

    9) Replication
    SQL DB supports active geo-replication. This enables you to configure up to four readable secondary databases in the same or different location. SQL DW does not support active geo-replication, only Azure Storage replication. However this is not a live, readable, synchronized copy of your database! It's more like a backup.

    10) In Memory OLTP tables
    SQL DB supports in-memory OLTP. SQL DW is OLAP and does not support it.

    11) Always encrypted
    SQL DB supports Always Encrypted to protect sensitive data. SQL DW does not support it.

    Although Azure SQL DB looks much cheaper on a monthly basis, this doesn't mean you should always choose SQL DB by default. One big advantage is that you can pause SQL DW. For example a stage database is only used during the ETL process. Why should you always have this database up an running? Or why should your datamart stay online 24*7 if your end users only use Analysis Services to browse the data.

    On the other hand the original purpose of both services is different (OLTP vs OLAP), but this doesn't mean you should always use Azure SQL DW for your data warehouses. Depending on several factors like data size, complexity, required up-time and budget, Azure SQL DB could also host your data warehouse. You could even mix them in your project. For example Stage and Historical/Persistent stage in Azure SQL DW and your Datamart in Azure SQL DB.

    Please leave a comment if you know more significant differences that are worth mentioning.