Introduction
Dear Readers,
This post serves as an introduction to a six-part series focused on the Integration Runtime (IR) within Azure Data Factory (ADF). Over the course of these six articles, you’ll gain essential knowledge and practical skills to choose and implement the right Integration Runtime for your specific requirements. Each article will explore different aspects of IR, providing you with a comprehensive guide to handling various integration scenarios effectively.
What is Azure Data Factory (a.k.a. ADF)?
Azure Data Factory is a cloud-based data integration service that enables the creation and scheduling of data-driven workflows. Its primary strength lies in its capability to interact with multiple types of data sources and orchestrate data flows between them.
What is Integration Runtime?
The Integration Runtime (IR) is the compute infrastructure that ADF uses to provide data integration capabilities across various network environments. There are three types of integration runtimes offered by Data Factory:
- Azure Integration Runtime: Managed by Microsoft. In its default configuration, it can only access data sources that are reachable over public networks.
- Self-hosted Integration Runtime: Deployed and managed by you, allowing access to both public and private networks. You’re responsible for patching, scaling, and maintaining the infrastructure.
- Azure-SSIS Integration Runtime: A Microsoft-managed IR designed to run SQL Server Integration Services (SSIS) packages in ADF. This IR can access both public and private networks.
In this series, we will focus on the two more specialized IR types, the Self-hosted IR and the Azure-SSIS IR. Both offer greater flexibility: they can interact with private networks, and the Azure-SSIS IR can also execute SSIS packages.
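The distinctions above can be condensed into a small decision helper. This is only a sketch of the selection logic as described in this post, not an ADF API: the `Requirements` flags and the `choose_ir` function are hypothetical names introduced for illustration, and real projects will weigh further factors such as cost and maintenance effort.

```python
from dataclasses import dataclass
from enum import Enum


class IntegrationRuntime(Enum):
    """The three IR types described in this post."""
    AZURE = "Azure Integration Runtime"
    SELF_HOSTED = "Self-hosted Integration Runtime"
    AZURE_SSIS = "Azure-SSIS Integration Runtime"


@dataclass
class Requirements:
    # Hypothetical flags for illustration; they are not ADF settings.
    needs_private_network: bool
    runs_ssis_packages: bool


def choose_ir(req: Requirements) -> IntegrationRuntime:
    """Pick an IR type using only the criteria named in this post."""
    # SSIS packages can only run on the Azure-SSIS IR.
    if req.runs_ssis_packages:
        return IntegrationRuntime.AZURE_SSIS
    # Private-network sources without SSIS point to the Self-hosted IR.
    if req.needs_private_network:
        return IntegrationRuntime.SELF_HOSTED
    # Otherwise the Microsoft-managed Azure IR is sufficient.
    return IntegrationRuntime.AZURE
```

For example, `choose_ir(Requirements(needs_private_network=True, runs_ssis_packages=False))` returns `IntegrationRuntime.SELF_HOSTED`. The SSIS check comes first because the Azure-SSIS IR is the only type listed above that runs SSIS packages, whether or not a private network is involved.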
Context and Case Studies
To make the solutions in this series concrete, we’ll work through the following fictional scenarios:
Case Study 1: LucyCheck Ltd.
A medical analysis laboratory stores its files locally on a PC connected to the Internet. The medical regulatory authority, which operates on Azure and uses Azure Data Factory for its data integration processes, needs to retrieve specific data from the laboratory’s local system. This data will then be integrated into the regulatory authority’s Data Warehouse hosted on Azure.
How should this integration be implemented?
Case Study 2: Helory Swags Ltd.
A luxury children’s clothing retailer is in the process of migrating its infrastructure to Azure. According to company policy, all new development must now take place within Azure.
The management requires reporting that involves data from a Line of Business (LOB) system, which is still stored in an on-premises SQL Server and won’t be migrated until next year.
How can data from the on-premises SQL Server be integrated into the data warehouse hosted on Azure?
Case Study 3: Moera LinkedTeen Ltd.
A communications company has a hybrid infrastructure and several ADF instances. To fulfill various requests, these ADF instances need to extract data from the same on-premises SQL Server instance.
What is the best way to implement this integration?
Case Study 4: Aelynn Money Transfer Ltd.
A financial services company is migrating its entire infrastructure to Azure, except for one Line of Business (LOB) system, which will remain on-premises. The company’s data warehouse is populated with data from this LOB using a complex SSIS package. Due to time constraints, this SSIS package cannot be easily redeveloped in Azure Data Factory (ADF).
How can we ensure the continued use of this complex SSIS package in Azure while maintaining its connection to the on-premises LOB data?
Upcoming Solutions in the Series
The upcoming articles in the series will cover the following topics:
- Part 1: Security prerequisites for ADF access to on-premises environments.
- Part 2: ADF access to on-premises file systems.
- Part 3: ADF access to on-premises SQL Servers.
- Part 4: Implementing a shared ADF Integration Runtime.
- Part 5: Point-to-site (P2S) VPN setup.
- Part 6: Using SSIS IR in ADF to connect with on-premises SQL Servers.
Conclusion
By the end of this series, you will have a comprehensive understanding of both Self-hosted and Azure-SSIS Integration Runtimes in Azure Data Factory. Armed with this knowledge, you will be able to choose and implement the most suitable Integration Runtime for your organization’s needs.
Stay tuned for the next instalment of this exciting journey!