Case
I want to start and stop my Azure Analysis Services from within Azure Data Factory, but I don't want to write code or use other Azure services like Azure Automation or Azure Logic Apps to do this. Is there an Azure Data Factory-only solution where we only use the standard pipeline activities from ADF?
Save some money on your Azure Bill by pausing AAS




















Solution
Yes, you can use the Web Activity to call the Rest API of Azure Analysis Services (AAS), but that requires you to give ADF permissions in AAS via its Managed Service Identity (MSI). If you already used our Process Model example, then this is slightly different (and easier).


1) Add ADF as contributor to AAS
Unlike processing one of the AAS models, we don't need SSMS to add ADF as a Server Administrator. Instead we will use Access control (IAM) in the Azure portal to make our ADF a contributor for the AAS that we want to pause or resume.
  • Go to your AAS in the Azure portal
  • In the left menu click on Access control (IAM)
  • Click on + Add and choose Add role assignment
  • In the new Add role assignment pane select Contributor as Role
  • In the Assign access to dropdown select Data Factory
  • Select the right Subscription
  • Now Select your Data Factory and click on the Save button
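If you prefer scripting over clicking through the portal, the same role assignment could also be done with a few lines of PowerShell. This is only a minimal sketch, assuming the Az modules are installed and you have owner rights on the AAS; the data factory name and subscription id are placeholders, the other names are the example names used further down in this post:
# PowerShell (sketch, assumes the Az modules and owner rights on the AAS resource)
# Look up the Managed Identity of the Data Factory (replace with your own names)
$adf = Get-AzDataFactoryV2 -ResourceGroupName "bitools" -Name "<your-data-factory>"

# Scope = the resource id of the Analysis Services server (same ids as in the Rest API URL below)
$scope = "/subscriptions/<xxx>/resourceGroups/bitools/providers/Microsoft.AnalysisServices/servers/bitools2"

# Make the Data Factory a Contributor on the Analysis Services server
New-AzRoleAssignment -ObjectId $adf.Identity.PrincipalId `
                     -RoleDefinitionName "Contributor" `
                     -Scope $scope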
Add ADF as Contributor to AAS















2) Add Web Activity
In your ADF pipeline you need to add a Web Activity to call the Rest API of Analysis Services. The first step is to determine the Rest API URL. In the string below, replace the <xxx> values with the subscription ID, resource group and server name of your Analysis Services. The Rest API method we will be using is 'Suspend', but you can replace that word with 'Resume' to start up the AAS:
https://management.azure.com/subscriptions/<xxx>/resourceGroups/<xxx>/providers/Microsoft.AnalysisServices/servers/<xxx>/Suspend?api-version=2017-08-01

Example:
https://management.azure.com/subscriptions/a74a173e-4d8a-48d9-9ab7-a0b85abb98fb/resourceGroups/bitools/providers/Microsoft.AnalysisServices/servers/bitools2/Suspend?api-version=2017-08-01

The second step is to create a JSON message for the Rest API. The Rest API doesn't actually use it, but the Web Activity requires a body when you use POST as the method, so you just need to create a dummy JSON message:
{
  "Dummy": "Dummy"
}
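If you first want to verify the URL and the call outside ADF, a minimal PowerShell sketch (assuming the Az module and an account with Contributor rights on the AAS) could look like this; it mimics what the Web Activity will do:
# PowerShell (sketch, assumes the Az module and an account with Contributor rights on the AAS)
# Acquire a bearer token for the Azure management API
$token = (Get-AzAccessToken -ResourceUrl "https://management.azure.com/").Token

# Build the Suspend URL (replace the placeholders with your own values)
$url = "https://management.azure.com/subscriptions/<xxx>/resourceGroups/<xxx>/providers/Microsoft.AnalysisServices/servers/<xxx>/Suspend?api-version=2017-08-01"

# Call the Rest API with a dummy body, just like the Web Activity will do
Invoke-RestMethod -Method Post -Uri $url -Headers @{ Authorization = "Bearer $token" } -Body '{"Dummy":"Dummy"}' -ContentType "application/json"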
  • Add the Web activity to your pipeline
  • Give it a descriptive name like Pause AAS (or Resume AAS)
  • Go to the Settings tab
  • Use the Rest API URL from above in the URL property
  • Choose POST as Method
  • Add the dummy JSON message from above in the Body property
  • Under advanced choose MSI as Authentication method
  • Add 'https://management.azure.com/' in the Resource property (note that this differs from the process example)
Web Activity calling the AAS Rest API



















Then Debug the Pipeline to check the suspend/resume action














3) Retrieve info
By changing the method type from POST to GET (the Body property will disappear) and removing the method (Suspend or Resume) from the URL, you can retrieve information about the AAS, such as its status and pricing tier. You could for example use that to first check the status before changing it.
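A hedged PowerShell sketch of such a GET call is shown below; the exact property names (state, sku) are assumptions based on the server definition that the management API returns, so inspect the output first:
# PowerShell (sketch, assumes the Az module and read permissions on the AAS resource)
$token = (Get-AzAccessToken -ResourceUrl "https://management.azure.com/").Token
$url   = "https://management.azure.com/subscriptions/<xxx>/resourceGroups/<xxx>/providers/Microsoft.AnalysisServices/servers/<xxx>?api-version=2017-08-01"

# GET returns the server definition as a JSON object
$aas = Invoke-RestMethod -Method Get -Uri $url -Headers @{ Authorization = "Bearer $token" }
$aas.properties.state   # e.g. Succeeded or Paused
$aas.sku.name           # pricing tier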
Retrieve service info via GET


Summary
In this post you learned how to pause and resume your Analysis Services to save some money on your Azure bill. The big advantage of this method is that you don't need other Azure services, which makes maintenance a little easier. In a future post we will also show you how to change the pricing tier.

Case
There are several methods to process Azure Analysis Services models, such as Logic Apps, Azure Automation Runbooks and even SSIS, but is there an Azure Data Factory-only solution where we only use the pipeline activities from ADF?
Process Azure Analysis Services




















Solution
Yes, you can use the Web Activity to call the Rest API of Azure Analysis Services (AAS), but that requires you to give ADF permissions in AAS via its Managed Service Identity (MSI).

1) Create ADF service principal
In the next step we need a user which we can add as a Server Administrator of AAS. Since we will not find the managed identity of ADF when we search for a user account, we will have to create one. This 'user' is called a service principal.
  • Go to ADF in the Azure portal (not the Author & Monitor environment)
  • In the left menu click on Properties which you can find under General
  • Copy the 'Managed Identity Application ID' and the 'Managed Identity Tenant' properties to a notepad and construct the following string for the next step:
    app:<Application ID>@<Tenant ID> (replace the placeholders with the copied properties)
    app:653ca9f9-855c-45df-bfff-3e7718159295@d903b4cb-ac8c-4e31-964c-e630a3a0c05e

Create app user from ADF for AAS















2) Add user as Server Administrator
Now we need to connect to your Azure Analysis Services via SQL Server Management Studio (SSMS) to add the user from the previous step as a Server Administrator. This cannot be done via the Azure portal.
  • Login to your AAS with SSMS
  • Right click your server and choose Properties
  • Go to the Security pane
  • Click on the Add... button
  • Add the service principal from the previous step via the Manual Entry textbox and click on the Add button
  • Click on OK to close the property window

Add Server Administrator via Manual Entry














After this step the 'user' will appear on the portal as well, but you cannot add it via the portal.
Analysis Services Admins
















3) Add Web Activity
In your ADF pipeline you need to add a Web Activity to call the Rest API of Analysis Services. The first step is to determine the Rest API URL. In the string below, replace the placeholders with the region, server name and model name of your Analysis Services. The Rest API method we will be using is 'refreshes':
https://<region>.asazure.windows.net/servers/<servername>/models/<modelname>/refreshes

Example:
https://westeurope.asazure.windows.net/servers/bitoolsserver/models/bitools/refreshes

The second step is to create a JSON message for the Rest API that tells AAS what to process. To fully process the entire model you can use this message:
{
  "Type": "Full",
  "CommitMode": "transactional",
  "MaxParallelism": 2,
  "RetryCount": 2,
  "Objects": []
}
Or you can process particular tables within the model with a message like this:
{
  "Type": "Full",
  "CommitMode": "transactional",
  "MaxParallelism": 2,
  "RetryCount": 2,
  "Objects": [
    {
      "table": "DimProduct",
      "partition": "CurrentYear"
    },
    {
      "table": "DimDepartment"
    }
  ]
}
See the documentation for all the parameters that you can use.
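If you want to try the refreshes call outside ADF first, a minimal PowerShell sketch could look like the block below. Note that the token audience 'https://*.asazure.windows.net' and the requirement that the calling identity is an AAS Server Administrator are assumptions based on the Resource property used later in this post:
# PowerShell (sketch, assumes the Az module and an identity that is Server Administrator on the AAS)
$token = (Get-AzAccessToken -ResourceUrl "https://*.asazure.windows.net").Token
$url   = "https://westeurope.asazure.windows.net/servers/bitoolsserver/models/bitools/refreshes"

# Same JSON body as the Web Activity will send
$body  = '{"Type": "Full", "CommitMode": "transactional", "MaxParallelism": 2, "RetryCount": 2, "Objects": []}'

Invoke-RestMethod -Method Post -Uri $url -Headers @{ Authorization = "Bearer $token" } -Body $body -ContentType "application/json"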
  • Add the Web activity to your pipeline
  • Give it a descriptive name like Process Model
  • Go to the Settings tab
  • Use the Rest API URL from above in the URL property
  • Choose POST as Method
  • Add the JSON message from above in the Body property
  • Under advanced choose MSI as Authentication method
  • Add 'https://*.asazure.windows.net' in the Resource property (note that this URL is different from the one used for suspending and resuming AAS)
Web Activity calling the AAS Rest API
























Then Debug the Pipeline to check the process result















4) Retrieve refreshes
By only changing the method type from POST to GET (the Body property will disappear) you can retrieve information about the processing status and use that information in subsequent pipeline activities.
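As a quick sketch, the same GET call from PowerShell would look like this; the shape of the returned refresh objects (for example a status field) is an assumption, so inspect the output first:
# PowerShell (sketch, assumes the Az module and Server Administrator rights on the AAS)
$token = (Get-AzAccessToken -ResourceUrl "https://*.asazure.windows.net").Token
$url   = "https://westeurope.asazure.windows.net/servers/bitoolsserver/models/bitools/refreshes"

# GET returns the recent refreshes of the model, including their status
$refreshes = Invoke-RestMethod -Method Get -Uri $url -Headers @{ Authorization = "Bearer $token" }
$refreshes | Select-Object -First 1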
Retrieve process status via GET


Summary
In this post you learned how to process your Analysis Services models with only Azure Data Factory. No other services are needed, which makes maintenance a little easier. In a future post we will also show you how to pause or resume your Analysis Services with the Rest API. With a few extra steps you can also use this method to refresh a Power BI dataset, but we will show that in a future post.







Case
I want to create an Azure Function with C# code. How do I create and deploy one in Azure (and use it in Azure Data Factory)?
Write C# in Visual Studio Code to create an Azure Function











Solution
In this blogpost we will create and deploy a very simple 'hello world' Azure Function with an HTTP trigger which you can extend to your own needs. After that we can use the Azure Data Factory pipeline with an Azure Function activity to execute it.

As an alternative you could also create an Azure Function with a Blob Storage trigger that executes when a new file arrives, but we would rather use that same trigger type to start an Azure Data Factory pipeline that then starts this Function, followed by other pipeline activities. This way we have one place that does the triggering/orchestration.

1) Create new Azure Function project
Please first follow the steps of our previous post on how to prepare Visual Studio Code for creating Azure Functions with C#. After that open Visual Studio Code and perform the steps below to create your first hello world Function.
  • In Visual Studio Code click on the Azure icon in the left menu.
  • In the newly opened pane click on the folder with the lightning icon on it to create a new project. (An additional function can be added to the project later on with the lightning-plus icon.)
  • Select the folder of the new project (or use the Browse... option)
  • Next select C# as coding language
  • Select HTTP trigger as the template for this example
  • Enter the Function name. This is the name of the function within the project (that can contain multiple functions). Example: myCSharpFunction
  • Provide a namespace: Bitools.Function
  • For this test example use anonymous as Authorization level
  • The project has been created, but there could be an additional action in step 2
Create new Azure Function project

















2) Unresolved dependencies
This extra step seems to be a bug in the Azure Function extension for C# in Visual Studio Code. After the project has been generated, Visual Studio Code will show the following error in the lower right corner. It seems to refer to some missing references.
There are unresolved dependencies.
Please execute the restore command to continue.













If you don't get this error then Microsoft probably solved the bug. When you do get it, the only thing you have to do is click the Restore button. After that some extra files will be added in the obj folder of your project (see the previous post to compare extension versions).
Execute the restore command

Spot the differences

























3) Code in myCSharpFunction.cs
The file 'myCSharpFunction.cs' contains your C# code. The name could be different if you gave your function a different name. Below you see the standard generated code with some extra comment lines in case you are new to C#. For this example we do not extend the code.
// This section lists the namespaces that this function will be using frequently,
// and saves the programmer from specifying a fully qualified name every time that
// a method that is contained within is used
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Logging;
using Newtonsoft.Json;

namespace Bitools.Function
{
    public static class myCSharpFunction
    {
        // Main function and entry point of this Azure Function
        [FunctionName("myCSharpFunction")]
        public static async Task<IActionResult> Run(
            [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post", Route = null)] HttpRequest req,
            ILogger log)
        {
            // Log information
            log.LogInformation("C# HTTP trigger function processed a request.");

            // Retrieve parameter 'name' from querystring
            string name = req.Query["name"];

            // Also try to retrieve the same parameter from the request body
            string requestBody = await new StreamReader(req.Body).ReadToEndAsync();
            dynamic data = JsonConvert.DeserializeObject(requestBody);
            // If not found in querystring then use requestbody
            name = name ?? data?.name;

            // If name is still empty throw an error that a name parameter
            // is expected else response with a greeting
            string responseMessage = string.IsNullOrEmpty(name)
                ? "This HTTP triggered function executed successfully. Pass a name in the query string or in the request body for a personalized response."
                : $"Hello, {name}. This HTTP triggered function executed successfully.";

            return new OkObjectResult(responseMessage);
        }
    }
}

4) Debug locally
Now we are going to test the Azure Function locally on our Windows device. There are multiple ways to start debugging. Pressing F5 is probably the easiest. See animated gif for more options.
  • In the Run menu on the top of the screen you will find the Start Debugging option. 
  • The terminal pane on the bottom will show a lot of details. Wait a few seconds for it to finish and click (while also pressing CTRL) on the green URL.
  • A new browser window will open and it shows the error output that it cannot find the name parameter.
  • In the browser add a querystring after the URL: ?name=Joost (or your own name of course). Now it will respond with a greeting
  • Close the browser and then hit the disconnect icon on top to stop debugging
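Besides the browser you could also hit the local endpoint from a terminal while debugging. A minimal sketch; the default local port 7071 and the /api/<functionname> route are assumptions based on the standard Functions host settings:
# PowerShell (sketch, assumes the Functions host is running locally on the default port 7071)
# Via the querystring
Invoke-RestMethod -Method Get -Uri "http://localhost:7071/api/myCSharpFunction?name=Joost"

# Or via a JSON body
Invoke-RestMethod -Method Post -Uri "http://localhost:7071/api/myCSharpFunction" -Body '{"name":"Joost"}' -ContentType "application/json"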
Debugging your function locally

















5) Create Azure Function in Azure Portal
Before you can deploy your newly created function you first need to create an Azure Function in the Azure portal.
  • Go to the Azure Portal and click on Create a resource
  • Search for function and select Function App
  • Click on the Create button
  • On the Basics tab you find the most important settings
  • Select your Subscription and Resource Group
  • Enter a unique Function App name
  • Select .NET Core as Runtime stack
  • Select 3.1 as Version 
  • Select the Region (probably the same as your Resource Group)
  • Optionally go to the Hosting tab for extra settings
  • Choose a new or existing Storage account
  • Change the Plan type (default: Serverless)
  • Optionally go to the Monitoring tab for extra settings
  • Disable or enable Application insights and change its name
  • Click the Review + create button
  • Review the settings and click on the Create button
Create new Azure Function (app) on Azure portal















Note 1: you cannot create an Azure Function with a Windows worker (.NET Core) if there is already a Linux worker (Python) in that same resource group and with the same region.

Note 2: you could also perform these steps within Visual Studio Code during deployment.

6) Deploy Azure Function to Azure Portal
Now that we have an (empty) Azure Functions app in the Azure portal we can deploy our newly created Azure Function to this resource.
  • In Visual Studio Code click on the Azure icon in the left menu.
  • In the newly opened pane click on the blue arrow (deploy) icon
  • In the drop down select your Azure Functions App from the previous step
Deploy Azure Functions from Visual Studio Code

















7) Testing in portal
Now that we have deployed our project to Azure Functions we can test it in the Azure Portal. For this example we will use the post method.
  • Go to the Azure Portal and then open your Azure Functions App
  • In the left menu click on Functions
  • In the list of functions click on your function (only one in this example)
  • In the left menu click on Code + Test
  • Click on the test button (top center)
  • Change the HTTP method to post
  • Select one of the keys
  • Enter a JSON message in the body: {"name":"Joost"} (name=case-sensitive)
  • Click on the Run button and see the result
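You could also call the deployed function directly instead of using the test pane. A minimal sketch; the URL format and the code parameter for the function key follow the standard conventions, and the placeholders below are not the real names from this post:
# PowerShell (sketch, replace the placeholders with your own Function App name and function key)
$uri = "https://<functionappname>.azurewebsites.net/api/myCSharpFunction?code=<functionkey>"
Invoke-RestMethod -Method Post -Uri $uri -Body '{"name":"Joost"}' -ContentType "application/json"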
Testing in the Azure Portal

















8) Executing from Azure Data Factory
Now if you want to execute this new Azure Function in Azure Data Factory with the Azure Function Activity you can follow the steps in this previous post. However without code changes it will return an error stating that the response is invalid: 3603 - Response Content is not a valid JObject
3603 - Response Content is not a valid JObject














At the moment the function is returning a plain string instead of a JObject (J = JSON), which is what ADF expects. Any other return type than a JObject will throw the error above. To overcome this we need a minor code change, adjusting the return construction at the end of the function.
// This section lists the namespaces that this function will be using frequently,
// and saves the programmer from specifying a fully qualified name every time that
// a method that is contained within is used
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Logging;
using Newtonsoft.Json;

namespace Bitools.Function
{
    public static class myCSharpFunction
    {
        // Main function and entry point of this Azure Function
        [FunctionName("myCSharpFunction")]
        public static async Task<IActionResult> Run(
            [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post", Route = null)] HttpRequest req,
            ILogger log)
        {
            // Log information
            log.LogInformation("C# HTTP trigger function processed a request.");

            // Retrieve parameter 'name' from querystring
            string name = req.Query["name"];

            // Also try to retrieve the same parameter from the request body
            string requestBody = await new StreamReader(req.Body).ReadToEndAsync();
            dynamic data = JsonConvert.DeserializeObject(requestBody);
            name = name ?? data?.name;

            // If name is still empty throw an error that a name parameter
            // is expected else response with a greeting
            return name != null
                ? (ActionResult)new OkObjectResult(new { message = "Hello " + name })
                : new BadRequestObjectResult("Pass a name in the query string or in the request body for a personalized response.");

            /*
            string responseMessage = string.IsNullOrEmpty(name)
                ? "This HTTP triggered function executed successfully. Pass a name in the query string or in the request body for a personalized response."
                : $"Hello, {name}. This HTTP triggered function executed successfully.";

            return new OkObjectResult(responseMessage);
            */
        }
    }
}

Below are a couple of screenshots showing how to configure and test this in Azure Data Factory. You might want to store the Function key in Azure Key Vault to avoid keys in your ETL/ELT code.
Set up the Azure Function Activity in Azure Data Factory

















After configuring the Azure Function activity you can hit the debug button and see the result. This output could then be used as input for successive pipeline activities.
Successfully executing and getting the response


















Conclusion
First a big thank you to colleague Walter ter Maten for always helping me out with the C# stuff. In this blog post you learned how to create, test and deploy your first (very basic) Azure Function App with C# code. Then we also showed you how to execute it from Azure Data Factory. In a couple of follow-up posts we will show you how to build some useful functions for DWH projects and show you some technical stuff like adding Azure Key Vault to the game. Also check out the Python version of this blogpost.


Case
I want to create an Azure Function with Python code. How do I create and deploy one in Azure?
Write Python in Visual Studio Code to create an Azure Function











Solution
In this blogpost we will create and deploy a very simple 'hello world' Azure Function with an HTTP trigger which you can extend to your own needs. After that we can use the Azure Data Factory pipeline with an Azure Function activity to execute it.

As an alternative you could also create an Azure Function with a Blob Storage trigger that executes when a new file arrives, but we would rather use that same trigger type to start an Azure Data Factory pipeline that then starts this Function, followed by other pipeline activities. This way we have one place that does the triggering/orchestration.

1) Create new Azure Function project
Please first follow the steps of our previous post on how to prepare Visual Studio Code for creating Azure Functions with Python. After that open Visual Studio Code and perform the steps below to create your first hello world Function.
  • In Visual Studio Code click on the Azure icon in the left menu.
  • In the newly opened pane click on the folder with the lightning icon on it to create a new project. (An additional function can be added to the project later on with the lightning-plus icon.)
  • Select the folder of the new project (or use the Browse... option)
  • Next select Python as coding language
  • Select the Python interpreter. In our example it is py 3.8.3 (if it's not in the list you need to browse to python.exe on your device)
  • Select HTTP trigger as the template for this example
  • Enter the Function name. This is the name of the function within the project (that can contain multiple functions)
  • For this test example use anonymous as Authorization level
  • The project has been created, but continue with the next step for some additional actions before you can start coding
Create new Azure Function project















2) Select interpreter and install Linter pylint
To finish the creation of the project we need to select the location of the interpreter and install a linter.
  • Click on the popup in the bottom left corner to select the interpreter. 
  • On the top in the middle you can now select the interpreter. Select the one in the .venv folder which is a subfolder of your project.
  • Next step is to install Linter pylint by clicking on the install button on the new popup in the bottom left corner.
  • Now wait a few seconds for the installation to finish
Select interpreter and install Linter pylint















3) Code in __init__.py
The file '__init__.py' contains your Python code. Below you see the standard generated code with some extra comment lines in case you are new to Python. For this example we do not extend the code.
# Import module for logging purposes
import logging

# Import module for Azure Functions and give it an alias
import azure.functions as func

# Main function and entry point of this Azure Function
def main(req: func.HttpRequest) -> func.HttpResponse:
    # Log information
    logging.info('Python HTTP trigger function processed a request.')

    # Retrieve parameter 'name' from querystring
    name = req.params.get('name')
    # If not found try to retrieve it from the request body
    if not name:
        try:
            # Check if there is a request body
            req_body = req.get_json()
        except ValueError:
            # On failure do nothing
            pass
        else:
            # On success try to retrieve name from request body
            name = req_body.get('name')

    # If a name was found then response with 'Hello [name]'
    if name:
        return func.HttpResponse(f"Hello {name}!")
    else:
        # If a name was not found response with an error message
        return func.HttpResponse(
            "Please pass a name on the query string or in the request body",
            status_code=400
        )

4) Debug locally
Now we are going to test the Azure Function locally on our Windows device. There are multiple ways to start debugging. Pressing F5 is probably the easiest. See animated gif for more options.
  • In the Run menu on the top of the screen you will find the Start Debugging option. 
  • The terminal pane on the bottom will show a lot of details. Wait a few seconds for it to finish and click (while also pressing CTRL) on the green URL.
  • A new browser window will open and it shows the error output that it cannot find the name parameter.
  • In the browser add a querystring after the URL: ?name=Joost (or your own name of course). Now it will respond with a greeting
  • Close the browser and then hit the disconnect icon on top to stop debugging
Debugging your function locally














5) Create Azure Function in Azure Portal
Before you can deploy your newly created function you first need to create an Azure Function in the Azure portal.

  • Go to the Azure Portal and click on Create a resource
  • Search for function and select Function App
  • Click on the Create button
  • On the Basics tab you find the most important settings
  • Select your Subscription and Resource Group
  • Enter a unique Function App name
  • Select Python as Runtime stack
  • Select the Python Version (3.8 in our example)
  • Select the Region (probably the same as your Resource Group)
  • Optionally go to the Hosting tab for extra settings
  • Choose a new or existing Storage account
  • Change the Plan type (default: Serverless)
  • Optionally go to the Monitoring tab for extra settings
  • Disable or enable Application insights and change its name
  • Click the Review + create button
  • Review the settings and click on the Create button

Create new Azure Function (app) on Azure portal


















Note 1: you cannot create an Azure Function with a Linux worker (Python) if there is already a Windows worker (C#) in that same resource group and with the same region. You will then get an error: LinuxWorkersNotAllowedInResourceGroup - Linux workers are not available in resource group bitools. Use this link to learn more https://go.microsoft.com/fwlink/?linkid=831180. Also see the Azure Functions documentation. Summary: don't mix C# and Python functions within the same resource group.

Note 2: you could also perform these steps within Visual Studio Code during deployment.

6) Deploy Azure Function to Azure Portal
Now that we have an (empty) Azure Functions app in the Azure portal we can deploy our newly created Azure Function to this resource.
  • In Visual Studio Code click on the Azure icon in the left menu.
  • In the newly opened pane click on the blue arrow (deploy) icon
  • In the drop down select your Azure Functions App from the previous step
Deploy Azure Functions from Visual Studio Code














7) Testing in portal
Now that we have deployed our project to Azure Functions we can test it in the Azure Portal. For this example we will use the post method.
  • Go to the Azure Portal and then open your Azure Functions App
  • In the left menu click on Functions
  • In the list of functions click on your function (only one in this example)
  • In the left menu click on Code + Test
  • Click on the test button (top center)
  • Change the HTTP method to post
  • Select one of the keys
  • Enter a JSON message in the body: {"name":"Joost"} (name=case-sensitive)
  • Click on the Run button and see the result
Testing in the Azure Portal

















8) Executing from Azure Data Factory
Now if you want to execute this new Azure Function in Azure Data Factory with the Azure Function Activity you can follow the steps in this previous post. However without code changes it will return an error stating that the response is invalid: 3603 - Response Content is not a valid JObject
3603 - Response Content is not a valid JObject
















At the moment the function is returning a plain string instead of a JObject (J = JSON), which is what ADF expects. Any other return type than a JObject will throw the error above. To overcome this we need two code changes. First we need to import the JSON module by adding import json at the top of the code with the other imports. Then we need to adjust the return of the hello world message. See the changes after the logging import and in the 'if name:' branch.
# Import module for logging purposes
import logging

# import json module to return a json message
import json

# Import module for Azure Functions and give it an alias
import azure.functions as func

# Main function and entry point of this Azure Function
def main(req: func.HttpRequest) -> func.HttpResponse:
    # Log information
    logging.info('Python HTTP trigger function processed a request.')

    # Retrieve parameter 'name' from querystring
    name = req.params.get('name')
    # If not found try to retrieve it from the request body
    if not name:
        try:
            # Check if there is a request body
            req_body = req.get_json()
        except ValueError:
            # On failure do nothing
            pass
        else:
            # On success try to retrieve name from request body
            name = req_body.get('name')

    # If a name was found then response with 'Hello [name]'
    if name:
        #return func.HttpResponse(f"Hello {name}!")
        message = {'message': f"Hello {name}!"}
        return json.dumps(message)
    else:
        # If a name was not found response with an error message
        return func.HttpResponse(
            "Please pass a name on the query string or in the request body",
            status_code=400
        )

Below are a couple of screenshots showing how to configure and test this in Azure Data Factory. You might want to store the Function key in Azure Key Vault to avoid keys in your ETL/ELT code.
Set up the Azure Function Activity in Azure Data Factory

















After configuring the Azure Function activity you can hit the debug button and see the result. This output could then be used as input for successive pipeline activities.
Successfully executing and getting the response


















Conclusion
First a big thank you to colleague Jasper Diefenbach for helping me out with the Python stuff. In this blog post you learned how to create, test and deploy your first (very basic) Azure Function App with Python code. Then we also showed you how to execute it from Azure Data Factory. In a couple of follow-up posts we will show you how to build some useful functions for DWH projects and show you some technical stuff like adding Azure Key Vault to the game. Also check out the C# version of this blogpost.


Case
I want to create Azure Functions on my Windows device, but which tools and extensions do I need to install?
Write Python or C# in Visual Studio Code to create Azure Functions









Solution
In this blogpost we will show you which tools you need to install to create an Azure Function with either .NET or Python code. The screenshots are of the current versions at the time of writing, but you might just want to take the latest stable version when downloading. In the upcoming Azure Functions posts we will create some basic Hello World functions to show the basics of creating and deploying your first Function. After that we will show some more useful functions for Data Warehousing developers, for example to convert Excel or XML files to a format that is easier to read for Azure Data Factory or Synapse PolyBase: CSV.

1) Download and install Visual Studio Code
For this blog post we will be using Visual Studio Code instead of the regular Visual Studio. Where this 'regular' Visual Studio is a so-called Integrated Development Environment (IDE), the newer Visual Studio Code is more of a lightweight source code editor. It is ideal for some coding with PowerShell, C# or Python. Use the link below to download Visual Studio Code and then install it.
https://code.visualstudio.com/download
Installing Visual Studio Code




















2) Install extensions for Python
If you want to use Python for your Azure Functions you need to install Python for Windows and the Python extension for Visual Studio Code. First download and install Python for Windows 64-bit. The default is a 32-bit version, but you can also find the 64-bit version slightly down the page (search for Windows x86-64 executable installer).
https://www.python.org/downloads/windows/
Install Python for Windows 64bit
















Then install the Python extension for Visual Studio Code. When clicking on the install button on the website it will ask to open it with Visual Studio Code. Within Visual Studio Code you have to click on the install button again. After installation it will ask to point to the previously installed Python interpreter.
https://marketplace.visualstudio.com/items?itemName=ms-python.python
Install Python extension for Visual Studio Code


















Note: You can also use the Extensions icon in the left menu of Visual Studio Code to search for this specific extension.

There is one last installation required: Linter Pylint, but Visual Studio Code will ask for it when creating your first Azure Function with Python code: Linter pylint is not installed.
Install Linter pylint for Visual Studio Code







Installing Linter pylint within Visual Studio Code














3) Install extensions for C#
If you want to create Azure Functions with C# then you first need to install the .NET Core SDK. Make sure to install the version to Build apps (Run Apps is not sufficient). The minimum version is .NET Framework 4.7.2 or .NET Core 2.2, but try the most recent version depending on your needs and the Runtime version of Azure Functions. If you forget this then you will receive an error while trying to create a C# Function.
Receive an error when the .NET Core SDK is not installed











Furthermore you should install the C# extension from Microsoft. This extension is not mandatory, but will be recommended when creating your first C# Azure Function.
Install C# extension for Visual Studio Code














Note: You can also use the Extensions icon in the left menu of Visual Studio Code to search for this specific extension.

4) Install extensions for Azure Functions
The next extension is Azure Functions for Visual Studio Code. When clicking on the install button on the website it will ask to open it in Visual Studio Code. Within Visual Studio Code you have to click on the install button again.
https://marketplace.visualstudio.com/items?itemName=ms-azuretools.vscode-azurefunctions
Azure Functions for Visual Studio Code














Note: You can also use the Extensions icon in the left menu of Visual Studio Code to search for this specific extension.

5) Install Azure Functions Core Tools
To be able to debug the Azure Function code locally we need the Azure Functions Core Tools, but to install those we first need to install the Node Package Manager (NPM), which is included in Node.js (more detailed info here).
https://nodejs.org/en/download/
Install NodeJS with NPM




















The last step of this installation is to open a Command Prompt (or PowerShell prompt) in Administrator mode to install the Azure Functions Core Tools. With the command npm -v you can check your npm version. Now use the following command for the installation (more detailed info here):
npm i -g azure-functions-core-tools@3 --unsafe-perm true
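After the installation you can verify from the same prompt that the tools are available and check which versions you got:
# Check the installed versions from the same (administrator) prompt
npm -v
func --version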
Install Azure Functions Core Tools via command prompt















Conclusion
In this introduction post you read which tools and extensions to install to create Azure Functions. Quite a lot of installations, but manageable when following the steps above. We focused on the most popular languages (in the DWH scene), C# and Python, but there are way more languages to choose from like Java(Script) or PowerShell, each with its own extensions.

As mentioned before the next post about Azure Functions will be about deploying your first simple function with Python or C#. After that we will focus on the more functional Azure Functions solutions, but with a focus on the DWH scene. Also bringing Azure Key Vault to the game is a must for Azure Functions.

Case
In a previous PolyBase example we stored the secret in Synapse to read data from the storage account. Is it possible to use Managed Identity instead of storing secrets in Synapse? However, I cannot find the Managed Identity of my SQL Server.
Azure Synapse Analytics with PolyBase reading Azure Storage Account














Solution
You can use a Managed Identity, but there are two requirements. First, this only works with 'StorageV2 (general purpose v2)'; 'BlobStorage' or 'Storage (general purpose v1)' will not work! Second, you need to register the SQL Server that hosts Synapse in your Active Directory. This will allow you to select your SQL Server within Access control (IAM).

1) Create Storage Account
Create an Azure Storage Account and make sure the type is StorageV2 (general purpose v2). The storage account of this example is called 'bitoolsstorage' and it has a container called 'mycontainer'. You can choose your own names, but these names will be used in the example code.















2) Create Synapse
Create a Synapse Data Warehouse including a SQL Server to host it. Our SQL Server is called 'bitoolssynapseserver' and our Synapse SQL Pool (data warehouse) is called 'synapsedwh'. Again choose your own names and change those in the example code below.
Synapse SQL Pool (data warehouse)












3) Register SQL Server in AD
Next step is to register the SQL Server that hosts your Synapse DWH in the Active Directory. This will allow you to find your SQL Server in the next step as a Managed Identity. At the moment of writing this needs to be done via PowerShell and cannot be done via the portal.

We will be using Cloud Shell (PowerShell in the portal), but you can also use PowerShell (ISE) on your Windows device but then you have to execute two extra commands (login and select subscription).

  • Click on the Cloud Shell icon in the upper right corner (next to the searchbox). This will start PowerShell in the portal. If this is the first time using it you first need to connect it to an Azure Storage Account.
  • Then execute the Set-AzSqlServer command. The first parameter is the resource group where SQL Server is located. The second parameter is the name of SQL Server (without .database.windows.net) and the last parameter will assign the Managed Identity.
# PowerShell
Set-AzSqlServer -ResourceGroupName "Joost_van_Rossum" -ServerName "bitoolssynapseserver" -AssignIdentity
Register SQL Server as Managed Identity
















If you are using PowerShell on your Windows device instead of Cloud Shell then use this code
# PowerShell
# Login to Azure (popup will appear)
Connect-AzAccount

# Select your subscription
Select-AzSubscription -SubscriptionId "2c67b23a-4ba2-4273-bc82-274a743b43af"

# Assign Managed Identity
Set-AzSqlServer -ResourceGroupName "Joost_van_Rossum" -ServerName "bitoolssynapseserver" -AssignIdentity

4) Storage Blob Data Contributor
Now it's time to give your SQL Server access to the Azure Storage Account. The role we need for this according to the documentation is 'Storage Blob Data Contributor', but I also tested it with 'Storage Blob Data Reader' and that works fine as well (since we are only reading data). Note: you need to be owner of the resource (group) to delegate access to others.
  • Go to your Storage Account from step 1
  • Click on Access control (IAM) in the left menu
  • Click on the + Add icon and choose Add role assignment
  • In the Role drop down select 'Storage Blob Data Contributor'
  • Leave the Assign access to drop down unchanged
  • In the Select box start typing the name of your SQL Server
  • Select your SQL Server and click on the Save button
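If you prefer scripting over the portal steps above, the role assignment could also be done from Cloud Shell. This is only a minimal sketch, assuming the Az module, owner rights on the storage account and the example resource names used in this post (adjust the resource group of the storage account to your own situation):
# PowerShell (sketch, assumes the Az module and owner rights on the storage account)
# Look up the managed identity of the SQL Server that hosts Synapse (created in step 3)
$sqlServer = Get-AzSqlServer -ResourceGroupName "Joost_van_Rossum" -ServerName "bitoolssynapseserver"

# Look up the storage account to use its resource id as the scope
$storage = Get-AzStorageAccount -ResourceGroupName "Joost_van_Rossum" -Name "bitoolsstorage"

# Assign the role to the managed identity of the SQL Server
New-AzRoleAssignment -ObjectId $sqlServer.Identity.PrincipalId `
                     -RoleDefinitionName "Storage Blob Data Contributor" `
                     -Scope $storage.Id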
Delegate access to the Managed Identity of SQL Server
















5) Master Key
We are finished in the Azure portal and now it's time to start with the actual PolyBase code. Start SQL Server Management Studio (SSMS), but make sure your Synapse is not paused.

The first step is to create a master key to encrypt any secrets, but only if you do not already have one (although we will not use any secrets). You can check that in the table sys.symmetric_keys. If a row exists where the symmetric_key_id column is 101 (or the name column is '##MS_DatabaseMasterKey##') then you already have a master key. Otherwise we need to create one. For Synapse a master key password is optional. For this example we will not use the password.
--Master key
IF NOT EXISTS (SELECT * FROM sys.symmetric_keys WHERE symmetric_key_id = 101)
BEGIN
    PRINT 'Creating Master Key'
    CREATE MASTER KEY;
END
ELSE
BEGIN
    PRINT 'Master Key already exists'
END


6) Credentials
Next step is to create a credential which will be used to access the Storage Account. For a Managed Identity you don't use secrets:
--Credential
CREATE DATABASE SCOPED CREDENTIAL bitools_msi
WITH
IDENTITY = 'Managed Service Identity'
;

Tip:
Give the credential a descriptive name so that you know what it is used for. You can find all database scoped credentials in the table sys.database_scoped_credentials:
--Find all credentials
SELECT * FROM sys.database_scoped_credentials


7) External data source
With the credential from the previous step we will create an External data source that points to the Storage Account and container where your file is located. Execute the code below where:
  • TYPE = HADOOP (because PolyBase uses the Hadoop APIs to access the container)
  • LOCATION = the connection string to the container in your Storage Account starting with abfss.
  • CREDENTIAL = the name of the credentials created in the previous step.
--Create External Data Source
CREATE EXTERNAL DATA SOURCE bitoolsstorage_abfss
WITH (
TYPE = HADOOP,
LOCATION = 'abfss://mycontainer@bitoolsstorage.dfs.core.windows.net',
CREDENTIAL = bitools_msi
);

Tip:
Give the external data source a descriptive name so that you know what it is used for. You can find all external data sources in the table sys.external_data_sources:
--Find all external sources
SELECT * FROM sys.external_data_sources

Notice that the filename or subfolder is not mentioned in the External Data Source. This is done in the External Table. This allows you to use multiple files from the same folder as External Tables.


8) External File format
Now we need to describe the format used in the source file. In our case we have a comma delimited file. You can also use this file format to supply the date format, compression type or encoding.
--Create External File Format
CREATE EXTERNAL FILE FORMAT TextFile
WITH (
    FORMAT_TYPE = DelimitedText,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',')
);

Tip:
Give the file format a descriptive name so that you know what it is used for. You can find all external file formats in the table sys.external_file_formats:
--Find all external file formats
SELECT * FROM sys.external_file_formats

9) External Table
The last step before we can start querying is creating the external table. In this create table script you need to specify all columns, data types and the filename that you want to read. The filename starts with a forward slash. You also need the data source from step 7 and the file format from step 8.
--Create External table
CREATE EXTERNAL TABLE dbo.sensordata (
    [Date] nvarchar(50) NOT NULL,
    [temp] INT NOT NULL,
    [hmdt] INT NOT NULL,
    [location] nvarchar(50) NOT NULL
)
WITH (
    LOCATION='/bitools_sample_data_AveragePerDayPerBuilding.csv',
    DATA_SOURCE=bitoolsstorage_abfss, -- from step 7
    FILE_FORMAT=TextFile -- from step 8
);
Note:
PolyBase does not like column name headers. It will treat the header like a regular data row and throw an error when the data type doesn't match. There is a little workaround for this with REJECT_TYPE and REJECT_VALUE. However this only works when the data type of the header is different from the data types of the actual rows. Otherwise you have to filter out the header row in a subsequent step.
--Create External table with header
CREATE EXTERNAL TABLE dbo.sensordata2 (
    [Date] DateTime2(7) NOT NULL,
    [temp] INT NOT NULL,
    [hmdt] INT NOT NULL,
    [location] nvarchar(50) NOT NULL
)
WITH (
    LOCATION='/bitools_sample_data_AveragePerDayPerBuilding.csv',
    DATA_SOURCE=bitoolsstorage_abfss,
    FILE_FORMAT=TextFile,
    REJECT_TYPE = VALUE, -- Reject rows with wrong datatypes
    REJECT_VALUE = 1 -- Allow 1 failure (the header)
);
You can find all external tables in the table sys.external_tables.
--Find all external tables
SELECT * FROM sys.external_tables
However you can also find the External Table (/the External Data Source/the External File Format) in the Object Explorer of SSMS.
SSMS Object Explorer

























10) Query external table
Now you can query the external table like any other regular table. However the table is read-only, so you cannot delete, update or insert records. If you update the source file then the data in this external table also changes instantly, because the file is read to get the data.
--Testing
SELECT count(*) FROM dbo.sensordata;
SELECT * FROM dbo.sensordata;
Querying an external table

























Conclusion
In this post you learned how to give the Managed Identity of SQL Server access to your Storage Account. This saves you some maintenance on secrets. And you learned how to use PolyBase to read files from that Storage Account using the Managed Identity.