Azure Data Factory vs SSIS

As Azure Data Factory matures year after year, I think that it’s fair to compare Azure Data Factory and SQL Server Integration Services (SSIS). This blog post considers people, processes and technical characteristics when looking at Azure Data Factory vs SSIS. 

What is Azure Data Factory?  

To being, Azure Data Factory (ADF) helps you design your data movement solutions for on-premises or any cloud provider. It is a serverless offering that allows you to perform enterprise data movements and transformations. You can do this while governing Azure Data Factory using the Azure ecosystem.  

If you want to know more, check out this Azure Data Factory Introduction

What is SQL Server Integration Services? 

SQL Server Integration Services (SSIS) is a platform for building enterprise-level data integration and data transformations solutions. It comes as part of the SQL Server licensing offering. You can host it on-premises, in a VM that you manage (IaaS) or deployed as part of Azure Data Factory in a VM that Microsoft manages for you (PaaS).  

Azure Data Factory vs SSIS 

Both Azure Data Factory and ISS have similar descriptions. This is because the two tools were created with the same purpose. To understand the differences, let’s compare ADF and SSIS. 

The following table compares Azure Data Factory and SSIS, taking into account people, processes and technology characteristics. 

People / Processes / TechnologyAzure Data FactorySSIS (On-Premises or IaaS)
Training availableFree training available fom Microsoft and community resources growing every dayA lot of free and private training available
Likelihood you’ll enjoy working with it HighLow
Likelihood of finding developers in the marketHighHigh
Easy to learnYesYes
Easy to masterYes, but it involves making sure you are an expert in the pricing model No
Salary ranges as a developerIncreasingDecreasing due to decrease of offering
GovernanceFully integrated with Azure Purview and Azure logsIntegrated with SQL Server logs and SSISDB
Release cycleNew features every monthEvery 2 years (with each version of SQL Server)
Microsoft SupportIt depends on your Azure support level, but the support is extremely responsiveLegacy process, and it might require Software Assurance
Development toolAzure Portal or APIsVisual Studio
Integrated with Azure DevOps for CI/CDYesYes
Code version controlYesYes
Future proofYesNo
VolumeFrom megabytes to terabytesFrom megabytes to gigabytes
VelocityAny speed can be achieved by integrating with other services such as Azure DatabricksMainly batch processing
VarietyFrom CSV to Json, parquet or new file formatsIt does not integrate with new file formats like Json or parquet
Release date20152005
Platform fully managed by MicrosoftYesNo
Pricing modelPay as you go, it might be difficult to understand and optimize. It comes as part of an SQL Server license
Development environment pricingYou will be billed as part of the pay as you go offeringYes (SQL Server Developer Edition)
Number of connectorsMore than 85Around 7, it can be extended by getting third-party provider extensions (additional costs)
Primary design patternELT (pipelines) and ETL (data flows)ETL
Drag and drop development availableYesYes
Development optionsPipelines, mapping data flows, Power Query (Power Query) and you can execute SSIS workloads in ADFOnly SSIS
Parallelism by defaultYesNo
AlertsBuilt in with Azure (Email / SMS / Calls), it requires additional resources if you want to send detailed messagesBuilt in component to send email alerts. It allows you to customize the message
Definition languageJsonXML
Available to use Azure Synapse Analytics StudioYesNo
Integrated with Azure DatabricksYesNo
Integrated with Big Data and ML workloadsYesNo
Maturity stateStill maturing, more features and capabilities to be releasedAfter 16 years on the market, fully mature, cannot expect many more capabilities to be released
Data movement customizationThere are features like defining how many Data Integration Units to use, batch size or how many activities to run in parallel but it is challenging to troubleshoot.You have detailed control of the number of rows in each batch or the size in KB of each row which provides a great level of control on the data movement

Summary 

By looking at the comparison above, I suggest exploring Azure Data Factory as your first option for a data movement orchestration tool. 

If you are happy with your current workloads in SQL Server Integration Services, don’t forget that you can still execute them using Azure Data Factory. 

What’s Next?  

In upcoming blog posts, we’ll continue to explore some of the features within Azure Data Services.         

Please follow Tech Talk Corner on Twitter for blog updates, virtual presentations, and more!       

As always, please leave any comments or questions below.   

Check out these other blog posts

comment [ 0 ]
share
No tags 0

No Comments Yet.

Do you want to leave a comment?

Your email address will not be published. Required fields are marked *