How to deploy Azure Data Factory pipeline and its dependencies programatically using PowerShell

Since ADF unfortunately doesn’t provide a built-in way for automated deployment, to do this you have to write a custom script. Then you can run it on build server such as VSTS or TeamCity.

If you’re in the interactive mode, you need to login and select the corresponding subscription. If you run the script on build server, the subscription should be selected for you automatically.

Login-AzureRmAccount
Select-AzureRmSubscription -SubscriptionName YourSubName

Here’s the parameters you can’t get programmatically so need to specify explicitly. Last parameter is a path to JSON files that represent the resources to be deployed: linked services, datasets, and pipelines.

param(
    [Parameter(Mandatory=$true)][string]$ResourceGroupName,
    [Parameter(Mandatory=$true)][string]$Location,
    [Parameter(Mandatory=$true)][string]$DataFactoryName,
    [Parameter(Mandatory=$true)][string]$InputFolderPath
)

First, create the factory itself. Flag -Force helps to continue on error if it already exists. Then load it into a variable:

New-AzureRmDataFactory -ResourceGroupName $ResourceGroupName -Name $DataFactoryName -Location $Location -Force -ErrorAction Stop
$dataFactory = Get-AzureRmDataFactory -ResourceGroupName $ResourceGroupName -Name $DataFactoryName -ErrorAction Stop

Next three loops read files based on the following convention:

  • Linked services start with LinkedService, e.g. LinkedService-SQL.json
  • Datasets start with Dataset, e.g. Dataset-Orders.json
  • Pipelines start with Pipeline, e.g. Pipeline-CopyOrders.json

First, create linked services. If any references a gateway then create it as well. Since the cmdlet doesn’t support flag -Force, we can use -ErrorAction Continue to continue on error if it already exists:

$files = Get-ChildItem $InputFolderPath -Recurse -Include *.json -Filter LinkedService* -ErrorAction Stop
foreach ($file in $files)  
{
    Write-Output "Creating linked services from $($file.FullName)"
    New-AzureRmDataFactoryLinkedService -DataFactory $dataFactory -File $file.FullName -Force -ErrorAction Stop

    $json = Get-Content $file.FullName -Raw -ErrorAction Stop
    $svc = $json | ConvertFrom-Json
    $gwName = $svc.properties.typeProperties.gatewayName
    if ($gwName)
    {
        Write-Output "Creating gateway $($gwName) from $($file.FullName)" 
        New-AzureRmDataFactoryGateway -DataFactory $dataFactory -Name $gwName -ErrorAction Continue
    }
}

Then create datasets:

$files = Get-ChildItem $InputFolderPath -Recurse -Include *.json -Filter Dataset* -ErrorAction Stop
foreach ($file in $files)  
{
    Write-Output "Creating dataset from $($file.FullName)"
    New-AzureRmDataFactoryDataset -DataFactory $dataFactory -File $file.FullName -Force -ErrorAction Stop 
}

And finally, pipelines:

$files = Get-ChildItem $InputFolderPath -Recurse -Include *.json -Filter Pipeline* -ErrorAction Stop
foreach ($file in $files)  
{
    Write-Output "Creating pipeline from $($file.FullName)"
    New-AzureRmDataFactoryPipeline -DataFactory $dataFactory -File $file.FullName -Force -ErrorAction Stop 
}

That’s it. By this time all pipelines should be deployed, verified and started. Happy data movement!

This entry was posted in Programming and tagged , . Bookmark the permalink.

16 Responses to How to deploy Azure Data Factory pipeline and its dependencies programatically using PowerShell

  1. Arunachalam S says:

    Hello,
    I need to do build continous integration using VSTS and use this powershell for deploying my linked services, datasets, pipelines etc. But in the powershell script what is the input folder path i need to specify as I ADF files are in VSTS..i tried giving my local system path and it was throwing an error.

    • abatishchev says:

      First you need to run msbuild on your ADF project (adfproj), this will merge environment configs and produce respective folder, e.g. bin\Debug\Dev. Then take the content of Dev and deploy it using the PowerShell script. Its path is one of the input parameters.

  2. Arunachalam S says:

    this inputfolderpath i can specify if I’m deploying manually..but since im doing this through VSTS using powershell and building continous deployment..im not sure how this will pick the ADF files..any pointers for this would help..

    • abatishchev says:

      There is no difference between building locally and using VSTS. It just executes commands in given order which you configure using a web UI. The path is on a build agent. All same. This script I’m using at work to build and deploy using same VSTS as you use.

      • Arunachalam S says:

        Thanks for your reply. this is the PowerShell script I’m using for deploying from my local PowerShell:

        foreach($file in Get-ChildItem "C:\Users\a.sivananthan\Documents\Visual Studio 2015\Projects\ADFAutomation2018\ADFAutomation2018\" -filter "LinkedService")
        {
          New-AzureRmDataFactoryLinkedService -ResourceGroupName "myresource1" -DataFactoryName "ADFAutomationjan2018" -Name $file.BaseName -File $file.FullName -Force | Format-List
        }
        foreach($file in Get-ChildItem "C:\Users\a.sivananthan\Documents\Visual Studio 2015\Projects\ADFAutomation2018\ADFAutomation2018\" -filter "DataSet")
        {
         New-AzureRmDataFactoryDataset -ResourceGroupName "myresource1" -DataFactoryName "ADFAutomationjan2018" -Name $file.BaseName -File $file.FullName -Force | Format-List
        }
        foreach($file in Get-ChildItem "C:\Users\a.sivananthan\Documents\Visual Studio 2015\Projects\ADFAutomation2018\ADFAutomation2018\" -filter "Pipeline")
        {
         New-AzureRmDataFactoryPipeline -ResourceGroupName "myresource1" -DataFactoryName "ADFAutomationjan2018" -Name $file.BaseName -File $file.FullName -Force | Format-List
        }
        

        in this if i need to do the same thing using VSTS – continous deployment what should be the path? how will i find the path in builagent?. Im struggling to find this out as im new to VSTS and azure

  3. Arunachalam S says:

    I’m not sure about building my project using MSBUILD..do you have some links which provides steps for this..

    • abatishchev says:

      Use the dedicated task, point it to either sln or adfproj, not sure which one. The only caveat is that you need to check-in (put under the source control) the ADF build target file which is installed by VS somewhere under %LocalAppData%.

  4. Arunachalam S says:

    Hello, this is my build log file:

    2018-02-22T06:58:18.1439156Z ##[section]Starting: Build solution ***.sln
    2018-02-22T06:58:18.1640432Z ==============================================================================
    2018-02-22T06:58:18.1640649Z Task : Visual Studio Build
    2018-02-22T06:58:18.1640831Z Description : Build with MSBuild and set the Visual Studio version property
    2018-02-22T06:58:18.1640961Z Version : 1.126.0
    2018-02-22T06:58:18.1641080Z Author : Microsoft Corporation
    2018-02-22T06:58:18.1641237Z Help : More Information
    2018-02-22T06:58:18.1641378Z ==============================================================================
    2018-02-22T06:58:21.2726403Z ##[command]”D:\a_tasks\VSBuild_71a9a2d3-a98a-4caa-96ab-affca411ecda\1.126.0\ps_modules\MSBuildHelpers\vswhere.exe” -version [15.0,16.0) -latest -format json
    2018-02-22T06:58:22.2161642Z ##[command]”C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\MSBuild\15.0\Bin\msbuild.exe” “D:\a\1\s\ADFCICD1402.sln” /nologo /nr:false /dl:CentralLogger,”D:\a_tasks\VSBuild_71a9a2d3-a98a-4caa-96ab-affca411ecda\1.126.0\ps_modules\MSBuildHelpers\Microsoft.TeamFoundation.DistributedTask.MSBuild.Logger.dll”;”RootDetailId=8c7618ee-ebf7-4b9d-8642-4b0a294a2bd9|SolutionDir=D:\a\1\s”*ForwardingLogger,”D:\a_tasks\VSBuild_71a9a2d3-a98a-4caa-96ab-affca411ecda\1.126.0\ps_modules\MSBuildHelpers\Microsoft.TeamFoundation.DistributedTask.MSBuild.Logger.dll” /p:VisualStudioVersion=”15.0″ /p:_MSDeployUserAgent=”VSTS_7d845707-12c0-4a8d-8e30-4b61d5b5ca09_build_16_95″
    2018-02-22T06:58:28.3918074Z Building the projects in this solution one at a time. To enable parallel build, please add the “/m” switch.
    2018-02-22T06:58:32.0398442Z Build started 2/22/2018 6:58:31 AM.
    2018-02-22T06:58:37.0086183Z Project “D:\a\1\s\ADFCICD1402.sln” on node 1 (default targets).
    2018-02-22T06:58:37.0095953Z ValidateSolutionConfiguration:
    2018-02-22T06:58:37.0096347Z Building solution configuration “Debug|Any CPU”.
    2018-02-22T06:58:37.0109589Z ValidateProjects:
    2018-02-22T06:58:37.0109996Z The project “ADFCICD1402” is not selected for building in solution configuration “Debug|Any CPU”.
    2018-02-22T06:58:45.4896659Z Project “D:\a\1\s\ADFCICD1402.sln” (1) is building “D:\a\1\s\ADFAutomation\ADFAutomation.pssproj” (2) on node 1 (default targets).
    2018-02-22T06:58:45.4897294Z CleanupEmptyRefsFolder:
    2018-02-22T06:58:45.4898459Z Directory “D:\a\1\s\ADFAutomation\bin\Debug\” doesn’t exist. Skipping.
    2018-02-22T06:58:45.5234134Z Done Building Project “D:\a\1\s\ADFAutomation\ADFAutomation.pssproj” (default targets).
    2018-02-22T06:58:45.5874974Z Done Building Project “D:\a\1\s\ADFCICD1402.sln” (default targets).

    do i need to refer “D:\a\1\s\ADFAutomation\” this path in my PowerShell script?

    Correct me if I’m wrong. Thanks for your help.

      • Arunachalam S says:

        Thanks. I tried this yesterday but with no luck. See if you can provide any help in this.

        foreach($file in Get-ChildItem "$(Build.ArtifactStagingDirectory)" -filter LinkedService*)
        {
            New-AzureRmDataFactoryLinkedService -ResourceGroupName "ADFAutomationResource" -DataFactoryName "ADFCICD190218" -Name $file.BaseName -File $file.FullName -Force | Format-List
        }
        

        ADF files i took MSBUILD and used copyfiles to copy into “Build.ArtifactStagingDirectory” and then published the artifacts.
        I have this PowerShell script as a another project in my solution and trying to use this path Build.ArtifactStagingDirectory – but it throws an error stating:

        2018-02-23T06:56:44.5596446Z ##[error]The term ‘Build.ArtifactStagingDirectory’ is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is.

        Please let me know how to proceed in this case? Thanks.

      • abatishchev says:

        Don’t put VSTS variables into PowerShell script’s body as it’s not aware of them. Put them when you invoke the said script in VSTS task. That it, see the initial post: the script uses the path from a parameter, the parameter is populated from the task.

        Task type: Azure PowerShell
        Azure Connection Type: Azure Resource Manager
        Script Type: Script File Path
        Script Path: $(System.DefaultWorkingDirectory)\Daily\drop\AzureDataFactory\Deploy\Deploy-AzureDataFactory.ps1
        Script Arguments: -ResourceGroupName Default-Factory-WestUS -Location WestUS -DataFactoryName TestFactory1 -InputFolderPath     $(InputFolderPath)
        

        Note that the path to the script is adjusted.

  5. Arunachalam S says:

    Thanks for the guidance. Please let me know if you can help in this.

    I need to do Continuous Integration and Deployment for my Azure Data factory. For this in a visual studio solution I have two projects one for ADF json files (linked services, datasets etc) and another one PowerЫhell script for deploying this ADF into a Azure subscription.

    Steps followed –

    Took MSBUILD of ADF codes and used copy files task to copy into $(Build.ArtifactStagingDirectory). Used Publish artifacts task to publish in VSTS.

    Publish artifacts for PowerShell script as a separate build.

    In my release I have a Azure PowerShell script which will invoke these ADF files and deploy it in Azure subscription. I’m using “Build.ArtifactStagingDirectory” for referring my ADF files. But I’m getting the below error –
    The term ‘Build.ArtifactStagingDirectory’ is not recognized as the name of a cmdlet, function, script file, or operable program

    foreach($file in Get-ChildItem “$(Build.ArtifactStagingDirectory)” -filter “LinkedService”) { New-AzureRmDataFactoryLinkedService -ResourceGroupName “ADFAutomationResource” -DataFactoryName “ADFCICD190218” -Name $file.BaseName -File $file.FullName -Force | Format-List }

    Let me know how to proceed in this case as there are no sufficient links explaining this.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s