APPLIES TO: Azure Data Factory Azure Synapse Analytics
Conditional paths

Azure Data Factory and Synapse pipeline orchestration supports conditional logic, enabling users to take different paths based upon the outcome of a previous activity. Using different paths allows users to build robust pipelines and incorporate error handling into ETL/ELT logic. Four conditional paths are available:
| Name | Explanation |
| --- | --- |
| Upon Success (default) | Execute this path if the current activity succeeded |
| Upon Failure | Execute this path if the current activity failed |
| Upon Completion | Execute this path after the current activity completed, regardless of whether it succeeded |
| Upon Skip | Execute this path if the activity itself didn't run |
You can add multiple branches following an activity, with one exception: the Upon Completion path can't coexist with either the Upon Success or the Upon Failure path. For each pipeline run, at most one path is activated, based on the execution outcome of the activity.
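In a pipeline's JSON definition, each path corresponds to a value in the downstream activity's `dependsOn` array: `Succeeded`, `Failed`, `Completed`, or `Skipped`. A minimal sketch with hypothetical activity names (a Wait activity stands in for real work):

```json
{
    "name": "HandleFailure",
    "type": "Wait",
    "typeProperties": { "waitTimeInSeconds": 1 },
    "dependsOn": [
        {
            "activity": "MainActivity",
            "dependencyConditions": [ "Failed" ]
        }
    ]
}
```

Here HandleFailure sits on the Upon Failure path of MainActivity; swapping the condition for `Succeeded`, `Completed`, or `Skipped` selects one of the other paths.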
Error handling

Common error handling mechanisms

Try-Catch block

In this approach, the customer defines the business logic and adds only an Upon Failure path to catch any error from the previous activity. The pipeline is reported as succeeded if the Upon Failure path succeeds.
Do-If-Else block
In this approach, the customer defines the business logic and adds both an Upon Failure and an Upon Success path. The pipeline is reported as failed even if the Upon Failure path succeeds.
Do-If-Skip-Else block
In this approach, the customer defines the business logic and adds both an Upon Failure path and an Upon Success path, with a dummy activity attached to the Upon Skip path of the success branch. The pipeline is reported as succeeded if the Upon Failure path succeeds.
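The wiring can be sketched in pipeline JSON (activity names are hypothetical; Wait activities stand in for real business logic). The dummy activity depends on the success-branch activity with the `Skipped` condition:

```json
[
    { "name": "MainActivity", "type": "Wait",
      "typeProperties": { "waitTimeInSeconds": 1 } },
    { "name": "UponSuccessActivity", "type": "Wait",
      "typeProperties": { "waitTimeInSeconds": 1 },
      "dependsOn": [ { "activity": "MainActivity", "dependencyConditions": [ "Succeeded" ] } ] },
    { "name": "UponFailureActivity", "type": "Wait",
      "typeProperties": { "waitTimeInSeconds": 1 },
      "dependsOn": [ { "activity": "MainActivity", "dependencyConditions": [ "Failed" ] } ] },
    { "name": "DummyUponSkip", "type": "Wait",
      "typeProperties": { "waitTimeInSeconds": 1 },
      "dependsOn": [ { "activity": "UponSuccessActivity", "dependencyConditions": [ "Skipped" ] } ] }
]
```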
Summary table

| Approach | Defines | When activity succeeds, overall pipeline shows | When activity fails, overall pipeline shows |
| --- | --- | --- | --- |
| Try-Catch | Only Upon Failure path | Success | Success |
| Do-If-Else | Upon Failure path + Upon Success path | Success | Failure |
| Do-If-Skip-Else | Upon Failure path + Upon Success path (with a dummy Upon Skip activity at the end) | Success | Success |

How pipeline failures are determined
Different error handling mechanisms lead to different statuses for the pipeline: some pipelines fail, while others succeed. Pipeline success and failure are determined as follows, assuming the Upon Failure activity (and the dummy Upon Skip activity, where present) succeeds:

- In the Try-Catch approach, the Upon Failure activity catches and handles the error, so the pipeline reports success.
- In the Do-If-Else approach, when the main activity fails, the Upon Success branch is skipped; a leaf activity skipped because its dependency failed causes the pipeline to report failure, even though the Upon Failure branch succeeded.
- In the Do-If-Skip-Else approach, the dummy activity on the Upon Skip path runs when the Upon Success branch is skipped, so every leaf activity completes successfully and the pipeline reports success.
As we develop more complicated and resilient pipelines, it's sometimes necessary to introduce conditional execution into our logic: execute a certain activity only if certain conditions are met. The use cases are plentiful, for instance:
Here we explain some common conditional logic patterns and how to implement them in ADF.
Single activity

Here are some common patterns following a single activity. We can use these patterns as building blocks to construct complicated workflows.
Error handling

This pattern is the most common conditional logic in ADF. An error-handling activity is defined for the "Upon Failure" path and is invoked if the main activity fails. It should be incorporated as a best practice for all mission-critical steps that need fallback alternatives or logging.
Best effort steps
Certain steps, such as informational logging, are less critical, and their failures shouldn't block the whole pipeline. In such cases, adopt a best-effort strategy: add the next steps to the "Upon Completion" path to unblock the workflow.
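One possible wiring, assuming a hypothetical LogInfo activity whose failure shouldn't block the flow: the next step depends on it with the `Completed` condition, so it runs whether LogInfo succeeds or fails.

```json
{
    "name": "NextStep",
    "type": "Wait",
    "typeProperties": { "waitTimeInSeconds": 1 },
    "dependsOn": [
        {
            "activity": "LogInfo",
            "dependencyConditions": [ "Completed" ]
        }
    ]
}
```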
And
The first and most common scenario is conditional "and": continue the pipeline if and only if the previous activities succeed. For instance, you might have multiple copy activities that all need to succeed before moving on to the next stage of data processing. In ADF, this behavior is easy to achieve: declare multiple dependencies for the next step. Graphically, that means multiple lines pointing into the next activity. You can choose either the "Upon Success" path to ensure each dependency has succeeded, or the "Upon Completion" path to allow best-effort execution.
Here, the follow-up wait activity will only execute when both web activities were successful.
And here, the follow-up wait activity executes when ActivitySucceeded succeeds and ActivityFailed completes. Note that with the "Upon Success" path, ActivitySucceeded has to succeed, whereas ActivityFailed on the "Upon Completion" path runs with best effort; that is, it might fail.
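This second variant maps to a `dependsOn` array with one entry per dependency. A sketch reusing the activity names from the example:

```json
{
    "name": "WaitForBoth",
    "type": "Wait",
    "typeProperties": { "waitTimeInSeconds": 1 },
    "dependsOn": [
        { "activity": "ActivitySucceeded", "dependencyConditions": [ "Succeeded" ] },
        { "activity": "ActivityFailed", "dependencyConditions": [ "Completed" ] }
    ]
}
```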
Or
The second common scenario is conditional "or": run an activity if any of the dependencies succeeds or fails. Here we need "Upon Completion" paths, the If Condition activity, and the expression language.
Before we dive into code, we need to understand one more thing. After an activity has run and completed, you can reference its status with @activity('ActivityName').Status, which is either "Succeeded" or "Failed". We use this property to build conditional-or logic.
In some cases, you might want to invoke a shared error handling or logging step, if any of the previous activities failed. You can build your pipeline like this:
With two preceding activities:

```
@or(equals(activity('ActivityFailed').Status, 'Failed'), equals(activity('ActivitySucceeded').Status, 'Failed'))
```

With three, nest the `or` functions:

```
@or(or(equals(activity('ActivityFailed').Status, 'Failed'), equals(activity('ActivitySucceeded1').Status, 'Failed')), equals(activity('ActivitySucceeded2').Status, 'Failed'))
```
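In pipeline JSON, such an expression goes inside an If Condition activity that depends on the completion of all upstream activities. A sketch for the two-activity case (the inner error-handling activity is a hypothetical placeholder):

```json
{
    "name": "IfAnyFailed",
    "type": "IfCondition",
    "dependsOn": [
        { "activity": "ActivityFailed", "dependencyConditions": [ "Completed" ] },
        { "activity": "ActivitySucceeded", "dependencyConditions": [ "Completed" ] }
    ],
    "typeProperties": {
        "expression": {
            "value": "@or(equals(activity('ActivityFailed').Status, 'Failed'), equals(activity('ActivitySucceeded').Status, 'Failed'))",
            "type": "Expression"
        },
        "ifTrueActivities": [
            { "name": "SharedErrorHandling", "type": "Wait",
              "typeProperties": { "waitTimeInSeconds": 1 } }
        ]
    }
}
```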
Greenlight if any activity succeeded
When all your activities are best effort, you might want to proceed to next step if any of the previous activities succeeded. You can build your pipeline like this:
```
@or(equals(activity('ActivityFailed').Status, 'Succeeded'), equals(activity('ActivitySucceeded').Status, 'Succeeded'))
```
Complex scenarios

All activities need to succeed to proceed
This pattern combines two of the above: conditional "and" plus error handling. The pipeline proceeds to the next steps if all preceding activities succeed; otherwise, it runs a shared error-logging step. You can build the pipeline like this:
```
@and(equals(activity('ActivityFailed').Status, 'Succeeded'), equals(activity('ActivitySucceeded').Status, 'Succeeded'))
```
Common patterns

Try-Catch-Proceed
This pattern is equivalent to a try-catch block in coding. An activity might fail in a pipeline, and when it does, an error-handling job needs to run to deal with it. However, a single activity failure shouldn't block the next activities in the pipeline. For instance, I might attempt to run a copy job moving files into storage, and it might fail halfway through. In that case, I want to delete the partially copied, unreliable files from the storage account (my error-handling step), but I'm OK to proceed with other activities afterwards.
To set up the pattern:

- Add the Error Handling job to the "Upon Failure" path of First Activity.
- Add Next Activity to the "Upon Completion" path of First Activity.
Note
Each path (UponSuccess, UponFailure, and UponSkip) can point to any activity. Multiple paths can point to the same activity. For example, UponSuccess and UponSkip can both point to one activity while UponFailure points to a different one.
Error Handling job runs only when First Activity fails. Next Activity will run regardless of whether First Activity succeeds.
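The wiring described in the note can be sketched in pipeline JSON (names are hypothetical; Wait activities stand in for real work):

```json
[
    { "name": "ErrorHandling", "type": "Wait",
      "typeProperties": { "waitTimeInSeconds": 1 },
      "dependsOn": [ { "activity": "FirstActivity", "dependencyConditions": [ "Failed" ] } ] },
    { "name": "NextActivity", "type": "Wait",
      "typeProperties": { "waitTimeInSeconds": 1 },
      "dependsOn": [ { "activity": "FirstActivity", "dependencyConditions": [ "Completed" ] } ] }
]
```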
Generic error handling

Commonly, we have multiple activities running sequentially in the pipeline. If any fails, I need to run an error-handling job to clear the state and/or log the error. For instance, I might have sequential copy activities in the pipeline, and if any of them fails, I need to run a script job to log the pipeline failure.
To set up the pattern:
The last step, Generic Error Handling, will only run if any of the previous activities fails. It will not run if they all succeed.
You can add multiple activities for error handling.
Data Factory metrics and alerts