Sunday, April 25, 2021

Fault Handling Framework (FHF)

Faults can be handled in two ways:
1. Using Catch & CatchAll blocks

2. Using Fault Handling Framework 

Fault Handling Framework (FHF):

Fault handling allows a BPEL process service component to handle error messages or other exceptions returned by outside web services, and to generate error messages in response to business or runtime faults. There are two categories of BPEL faults:

1. Business faults – Business faults are application-specific faults that are generated when there is a problem with the information being processed (for example, when a member is not found in the database).

2. Runtime faults – Runtime faults are the result of problems within the running of the BPEL process service component or web service (for example, data cannot be copied properly because the variable name is incorrect). These faults are not user-defined, and are thrown by the system. They are generated for a variety of reasons, including the following:

• The process tries to use a value incorrectly.

• A logic error occurs (such as an endless loop).

• A Simple Object Access Protocol (SOAP) fault occurs in a SOAP call.

• An exception is thrown by the server.

 

Handling Faults with the Fault Management Framework:

Oracle SOA Suite provides a generic fault management framework for handling faults in BPEL processes.

If a fault occurs during runtime in an invoke activity in a process, the framework catches the fault and performs a user-specified action defined in a fault policy file associated with the composite or component.

• The fault management framework catches all faults (business and runtime) for an invoke activity.

• A fault policy file defines fault conditions and their corresponding fault recovery actions. Each fault condition specifies a particular fault or group of faults that it attempts to handle, and the corresponding action for it. A set of actions is identified by an ID in the fault policy file.

• A set of conditions invokes an action (known as a fault policy).

• Email or JMS alerts notify users of errors associated with a condition.

• A fault policy bindings file associates the policies defined in the fault policy file with the following:

1. SOA composite applications

2. BPEL process and Oracle Mediator service components

3. Reference binding components for BPEL processes and Oracle Mediator service components

-> We use this framework only to handle invocation faults; it cannot handle every type of fault.

-> It can be used for BPEL as well as Mediator components.

-> We should use it only for two BPEL templates (Async and OneWay).


The Fault Handling Framework is based on two files:
fault-policies.xml
fault-bindings.xml

fault-policies.xml:
-> This file defines one or more fault policies, each with conditions and actions.
-> In it we define which types of error we want to handle and which action to perform when that fault occurs.

A fault-handling policy consists of two main elements:
Fault condition
Activates the policy block. We specify the policy and its actions based on error codes, error messages, and so on.

Action(s)
Performed when the condition is satisfied.
An action for a fault may be to retry it a certain number of times at a specified interval, to mark it for recovery through human intervention, to call custom Java code, or simply to throw the fault back.
If the fault is rethrown, any explicit ‘catch’ block specified in our BPEL process is executed.

fault-bindings.xml:
This file associates a policy with a composite, component, or reference.
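
As a rough sketch of how one bindings file can attach policies at all three levels, the fragment below binds a composite-wide policy, a component-level policy, and a reference-level policy. Apart from CompositeFaultPolicy, every policy id, component name, and reference name here is a hypothetical placeholder, and each referenced policy is assumed to exist in the policy file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: ComponentFaultPolicy, ReferenceFaultPolicy, OrderProcess
     and CreditRatingService are hypothetical names. -->
<faultPolicyBindings version="2.0.1"
                     xmlns="http://schemas.oracle.com/bpel/faultpolicy">
  <!-- Policy applied to the composite as a whole -->
  <composite faultPolicy="CompositeFaultPolicy"/>
  <!-- Policy applied to one named BPEL or Mediator component -->
  <component faultPolicy="ComponentFaultPolicy">
    <name>OrderProcess</name>
  </component>
  <!-- Policy applied to one reference binding component -->
  <reference faultPolicy="ReferenceFaultPolicy">
    <name>CreditRatingService</name>
  </reference>
</faultPolicyBindings>
```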

Note:
-> Whenever the BPEL process throws an error, the framework checks whether a fault policy is bound for it in fault-bindings.xml.
-> If so, the matching action in fault-policies.xml is taken.
-> If no action is found, the fault is thrown and handled in the catch block.
-> The fault management framework files (fault-bindings.xml and fault-policies.xml) can be kept in the SOA MDS and referenced from the composite by path.
-> Fault handlers such as catch and catchAll sit inside a BPEL process and can catch all faults, but fault policies are only executed when an invoke activity fails.
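
To make the relationship between the two mechanisms concrete, here is a minimal BPEL 1.1 sketch of a scope whose invoke is covered by a fault policy and whose catch blocks act as the fallback once the policy rethrows the fault. The scope, partner link, port type, operation, and variable names are all hypothetical, and the variables are assumed to be declared elsewhere in the process.

```xml
<!-- Sketch only: every name below, and the ns1 namespace, is hypothetical. -->
<scope name="CallCreditService"
       xmlns="http://schemas.xmlsoap.org/ws/2003/03/business-process/"
       xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
       xmlns:ns1="http://example.com/credit">
  <faultHandlers>
    <!-- Reached when the fault policy rethrows a remote fault;
         remoteFaultVar is assumed to be declared with
         messageType bpelx:RuntimeFaultMessage -->
    <catch faultName="bpelx:remoteFault" faultVariable="remoteFaultVar">
      <empty name="HandleRemoteFault"/>
    </catch>
    <!-- Safety net for faults not caught above -->
    <catchAll>
      <empty name="HandleUnexpectedFault"/>
    </catchAll>
  </faultHandlers>
  <!-- A failure of this invoke is seen by the fault policy first -->
  <invoke name="InvokeCreditService" partnerLink="CreditRatingService"
          portType="ns1:CreditRatingPortType" operation="process"
          inputVariable="creditRequest" outputVariable="creditResponse"/>
</scope>
```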

Oracle SOA Suite Fault Handling Best Practices
1. Create a fault handler (catch block) for each partner link. For each partner link, have a catch block for all possible errors. The idea is not to let errors fall through to the catchAll block.
2. Keep catchAll for errors that cannot be anticipated at design time.
3. Classify errors into types: runtime, remote, binding, validation, business errors, etc.
4. Set up notification in production so that errors are sent to the concerned teams by email. Nobody should need to visit the console to find out the status of execution.
5. Use a catch block for non-partner-link errors.
6. Every retry defined in a fault policy causes a commit of the transaction. A dehydration point is reached and threads are released.
7. Automated recovery can be built by creating a fault table, persisting the queue, and having an agent re-submit the job (for example, a Timer agent that invokes the Java code we wrote to recover instances). This can also be achieved through scripts. Use only the PUBLISHED API of ESB or QUEUE (AQ etc.) for re-submission. Another example is to use WLST to change the RecoveryConfig MBean and configure the recovery window to retry all faulted instances.
8. Handle a rollback fault by providing ‘No Action’ in the fault policy.
9. Remember: Receive, OnMessage, OnAlarm, Wait, and CheckPoint (an activity that forces the Java thread to store its current state in the dehydration store) cause the current state to be stored in the dehydration store and the threads to be released.
10. Always use MDS to store fault policies and bindings, to increase their reuse across composites and projects.

Actions available in the fault-policies.xml file
1. Retry
2. Human intervention
3. Terminate
4. Java call
5. Replay scope
6. Rethrow fault

fault-policies.xml 


<faultPolicies xmlns="http://schemas.oracle.com/bpel/faultpolicy">
  <faultPolicy version="2.0.1" id="CompositeFaultPolicy" xmlns:env="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://schemas.oracle.com/bpel/faultpolicy">
    <Conditions>
      <!-- Conditions can be fine-grained to select actions based on error codes. If a remoteFault occurs, check whether it is a WSDLReadingError. If yes, rethrow it; otherwise retry it. -->
      <faultName xmlns:bpelx="http://schemas.oracle.com/bpel/extension" name="bpelx:remoteFault">
        <condition>
          <test>$fault.code/code="WSDLReadingError"</test>
          <action ref="ora-rethrow-fault"/>
        </condition>
        <condition>
          <action ref="ora-retry"/>
        </condition>
      </faultName>
      <faultName xmlns:bpelx="http://schemas.oracle.com/bpel/extension" name="bpelx:bindingFault">
        <condition>
          <action ref="java-fault-handler"/>
        </condition>
      </faultName>
      <faultName xmlns:bpelx="http://schemas.oracle.com/bpel/extension" name="bpelx:runtimeFault">
        <condition>
          <action ref="java-fault-handler"/>
        </condition>
      </faultName>
    </Conditions>
    <Actions>
      <!-- This action invokes a custom Java class to process faults. Depending on the returnValue, another action is invoked, as specified by the ref attribute. This demonstrates chaining of actions. -->
      <Action id="java-fault-handler">
        <javaAction className="com.beatech.faultapp.CustomFaultHandler" defaultAction="ora-human-intervention" propertySet="properties">
          <returnValue value="Manual" ref="ora-human-intervention"/>
        </javaAction>
      </Action>
      <!-- This action marks the instance as "Pending for Recovery" in the EM console -->
      <Action id="ora-human-intervention">
        <humanIntervention/>
      </Action>
      <!-- This action bubbles the fault up to the catch blocks -->
      <Action id="ora-rethrow-fault">
        <rethrowFault/>
      </Action>
      <!-- This action attempts 3 retries at intervals of 120 seconds -->
      <Action id="ora-retry">
        <retry>
          <retryCount>3</retryCount>
          <retryInterval>120</retryInterval>
          <retryFailureAction ref="java-fault-handler"/>
        </retry>
      </Action>
      <!-- This action causes the instance to terminate -->
      <Action id="ora-terminate">
        <abort/>
      </Action>
    </Actions>
    <!-- Properties can be used to pass values to the Java class as a Map -->
    <Properties>
      <propertySet name="properties">
        <property name="myProperty1">propertyValue1</property>
        <property name="myProperty2">propertyValue2</property>
        <property name="myPropertyN">propertyValueN</property>
      </propertySet>
    </Properties>
  </faultPolicy>
</faultPolicies>

For the custom Java fault handler, the javaAction element can also act as a switch on the return value to chain another action:

<javaAction className="com.beatech.faultapp.CustomFaultHandler" defaultAction="ora-rethrow-fault">
  <returnValue value="Rethrow" ref="ora-rethrow-fault"/>
  <returnValue value="Abort" ref="ora-terminate"/>
  <returnValue value="Retry" ref="ora-retry"/>
  <returnValue value="Manual" ref="ora-human-intervention"/>
</javaAction>

fault-bindings.xml

<faultPolicyBindings version="2.0.1" xmlns="http://schemas.oracle.com/bpel/faultpolicy">
<composite faultPolicy="CompositeFaultPolicy"/>
</faultPolicyBindings>

Finally, we have to add two properties to composite.xml to let the composite know about these files:
<property name="oracle.composite.faultPolicyFile">fault-policies.xml</property>
<property name="oracle.composite.faultBindingFile">fault-bindings.xml</property>

We can also refer to these files from the MDS:
<property name="oracle.composite.faultPolicyFile">oramds://apps/policy/fault-policies.xml</property>
<property name="oracle.composite.faultBindingFile">oramds://apps/policy/fault-bindings.xml</property>

To configure a fault policy document in JDeveloper 12c:

SOA Tier -> Faults -> Fault Policy Document

Open the composite.xml file and click the Edit Composite Fault Policies option.

Once you click OK, a fault-bindings.xml file is generated automatically according to the selection you made in the previous window. It is created in your project as fault-bindings.xml.



Q) Which Mediator static routing rules support fault policies?
Parallel rules only.

Fatal
Fatal runtime errors are generated by the server because of WSDL, XSD, and other internal errors.

Synchronous faults are defined in the WSDL operation.
Asynchronous faults are not defined in the WSDL and can be returned by using callback operations.


1. Retry action

<Action id="ora-retry">
   <retry>
      <retryCount>3</retryCount>
      <retryInterval>2</retryInterval>
      <exponentialBackoff/>
      <retryFailureAction ref="ora-java"/>
      <retrySuccessAction ref="ora-java"/>
   </retry>
</Action>

Retry a specified number of times.
Provide a delay between retries (in seconds).
Increase the interval with an exponential back-off.
Chain to a retry failure action if all N retries fail.
Chain to a retry success action if a retry is successful.

Note: Exponential back-off means the next retry attempt is scheduled at 2 x the delay, where the delay is the current retry interval.

For example, if the current retry interval is 2 seconds, the next retry attempt is scheduled at 4 seconds, the next at 8, and the next at 16, until the retryCount value is reached.

2. Condition with a test

<condition>
  <test>$fault.code="WSDLReadingError"</test>
  <action ref="ora-terminate"/>
</condition>

This condition checks the fault variable for code = "WSDLReadingError".
An action of ora-terminate is specified.

3. Condition without a test

<condition>
  <action ref="ora-rethrow"/>
</condition>

No test condition is provided. This is a catchAll condition for a given faultName.

4. Condition with alerts

<condition>
  <alert ref="ora-jms"/>
  <alert ref="ora-email"/>
  <action ref="ora-rethrow"/>
</condition>

Two user notification alerts are defined for the condition. Select the type of user notification alert to create when a fault occurs (for example, an email alert, a JMS queue alert, or a log file alert).

5. Human intervention action

<Action id="ora-human-intervention">
  <humanIntervention/>
</Action>

Human Intervention: Causes the current activity to stop processing. You can then go to Oracle Enterprise Manager Fusion Middleware Control and perform manual recovery actions on this instance.

6. Terminate action

<Action id="ora-terminate"><abort/></Action>

Terminate Process: Terminates the process.

7. Java action

<Action id="ora-java">
   <!-- This is a user-provided custom Java class -->
   <javaAction className="mypackage.myClass" defaultAction="ora-terminate">
      <returnValue value="REPLAY" ref="ora-terminate"/>
      <returnValue value="RETHROW" ref="ora-rethrow-fault"/>
      <returnValue value="ABORT" ref="ora-terminate"/>
      <returnValue value="RETRY" ref="ora-retry"/>
      <returnValue value="MANUAL" ref="ora-human-intervention"/>
   </javaAction>
</Action>

Java Code: Enables you to execute an external Java class.
returnValue: The Java class must implement a method that returns a string. The policy can chain to a new action based on the returned string.

8. Rethrow fault action

<Action id="ora-rethrow-fault"><rethrowFault/></Action>

Rethrow Fault: The framework sends the fault to the BPEL fault handlers (catch activities in scope activities). If none are available, the fault is sent up.

9. Replay scope action

<Action id="ora-replay-scope"><replayScope/></Action>

Replay Scope: Raises a replay fault.

getFaultAsString() XPath extension function

<catchAll>
   <sequence>
      <assign>
         <copy>
            <from expression="bpelx:getFaultAsString()"/>
            <to variable="faultVar" part="message"/>
         </copy>
      </assign>
      <reply faultName="ns1:myFault" variable="faultVar" .../>
   </sequence>
</catchAll>

Throwing Internal Faults with the Throw Activity
The throw activity has three elements: its name, the name of the fault, and the fault variable.
You cannot use a throw activity in an asynchronous process to communicate with a client.
The throw activity syntax is:
<throw name="delay" faultName="nsPrefix:fault-1" faultVariable="fVar"/>
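
A slightly fuller BPEL 1.1 sketch of the same idea, which populates the fault variable before throwing a business fault, could look like the following. The client namespace, the orderFault fault name, the OrderFaultMessage message type, and the payload part with its query path are hypothetical names chosen for illustration.

```xml
<!-- Sketch only: the client namespace, orderFault, OrderFaultMessage,
     the payload part and the query path are hypothetical. -->
<sequence name="RejectUnknownMember"
          xmlns="http://schemas.xmlsoap.org/ws/2003/03/business-process/"
          xmlns:client="http://example.com/OrderProcess">
  <!-- fVar is assumed to be declared in the enclosing scope with
       messageType client:OrderFaultMessage -->
  <assign name="SetFaultDetails">
    <copy>
      <from expression="'Member not found in the database'"/>
      <to variable="fVar" part="payload" query="/client:fault/client:message"/>
    </copy>
  </assign>
  <!-- Raises the business fault; an enclosing catch for
       client:orderFault can then handle it -->
  <throw name="ThrowOrderFault" faultName="client:orderFault" faultVariable="fVar"/>
</sequence>
```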

How to Catch BPEL Runtime Faults
1 Import RuntimeFault.wsdl into your process WSDL.
2 Declare a variable with messageType bpelx:RuntimeFaultMessage.
3 Catch it using the following syntax:
 <catch faultName="bpelx:remoteFault" | "bpelx:bindingFault" faultVariable="varName">
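
Steps 2 and 3 can be sketched in BPEL 1.1 roughly as follows, assuming RuntimeFault.wsdl has already been imported into the process WSDL (step 1) and that the RuntimeFaultMessage it defines has code, summary, and detail parts. The scope, variable, and activity names are hypothetical.

```xml
<!-- Sketch only: scope, variable and activity names are hypothetical. -->
<scope name="CallPartner"
       xmlns="http://schemas.xmlsoap.org/ws/2003/03/business-process/"
       xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
       xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <variables>
    <!-- Step 2: variable typed with the message from RuntimeFault.wsdl -->
    <variable name="runtimeFault" messageType="bpelx:RuntimeFaultMessage"/>
    <variable name="faultSummary" type="xsd:string"/>
  </variables>
  <faultHandlers>
    <!-- Step 3: catch the runtime fault into the variable -->
    <catch faultName="bpelx:bindingFault" faultVariable="runtimeFault">
      <assign name="CopyFaultSummary">
        <copy>
          <!-- Assumes the message exposes a "summary" part -->
          <from variable="runtimeFault" part="summary"/>
          <to variable="faultSummary"/>
        </copy>
      </assign>
    </catch>
  </faultHandlers>
  <!-- Placeholder for the partner link invoke that may fail -->
  <empty name="PartnerCallGoesHere"/>
</scope>
```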

The idempotent Property and Fault Handling
If the idempotent deployment descriptor property is set to false in the composite.xml file and the invocation of a partner link fails, recovery does not start from the invoke activity.
The outcomes listed below describe what happens when the idempotent property is set to false and the partner link invocation either succeeds or fails.

<property name="bpel.partnerLink.myPartnerLink.idempotent">false</property>
This setting causes the BPEL process to dehydrate immediately after execution of this activity and be recorded in the dehydration store.

If the invoke is successful:
-> The invoke activity is dehydrated immediately after execution and recorded in the dehydration store.

If the invoke is unsuccessful, and your BPEL process includes fault handling, such as a catchAll activity:
-> Recovery is started from the catchAll activity and not from the invoke activity.

If the invoke is unsuccessful, and your BPEL process includes a fault policy:
-> The fault policy is used to attempt recovery of the invoke activity. This is the recommended approach.


Important points when handling errors

If the error occurs in a synchronous service, return at least minimal error information such as the error code, description, and message ID.

If the services are mostly asynchronous, we need a common error logging/audit framework that receives errors through DB or JMS. Based on the error type, we can notify or persist the error.

The fault context variable is available only in the error handler.

When designing the fault schema object, consider including the following fields:

Error Code

Error Description

Error Reference (i.e. service name and operation)

Instance Id (you can obtain this with the built-in function fn:uuid)

Error Type (System Error or Business Error)

Error Notification
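
One possible shape for such a fault schema is sketched below as an XSD. The target namespace, the ServiceFault element name, and all child element names are hypothetical. They simply mirror the fields listed above and should be adapted to your own naming standards.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the namespace and all element names are hypothetical. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/common/fault"
           elementFormDefault="qualified">
  <xs:element name="ServiceFault">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="ErrorCode" type="xs:string"/>
        <xs:element name="ErrorDescription" type="xs:string"/>
        <!-- Which service and operation raised the error -->
        <xs:element name="ErrorReference">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="ServiceName" type="xs:string"/>
              <xs:element name="Operation" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="InstanceId" type="xs:string"/>
        <!-- System error or business error -->
        <xs:element name="ErrorType">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="SystemError"/>
              <xs:enumeration value="BusinessError"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
        <!-- Optional notification details, kept as free text here -->
        <xs:element name="ErrorNotification" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```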




