Friday, August 22, 2008

Correspond manual recovery back to the BPEL process

Looks complex, but is it?

Manual recovery of a BPEL process does not look like a trivial job at all.
And at first glance the manual recovery page of the BPEL Console does not seem to reveal much about which entry corresponds to which instance of a BPEL process.

Why do instances end up in manual recovery?

Well, this has to do with how the BPEL engine handles incoming messages.
The BPEL delivery service does 2 things:
  1. Uses JMS to register the message to be processed
  2. Saves the message in the dehydration store
Once a BPEL thread picks up the JMS message, the instance goes into UNRESOLVED state.
And when the instance is complete, the entry in the dehydration store goes to HANDLED state.

If the server shuts down or crashes, or the engine times out, it may come back unable to find the JMS message, as it might already have been consumed. The entry then stays in UNRESOLVED state and sits in manual recovery.

What does Manual Recovery show?

Manual recovery shows all instances that are in UNRESOLVED state.

So it also shows instances that are on their way from UNRESOLVED to HANDLED in the dehydration store but are not yet marked as HANDLED.

From what I have seen on a production box, a lot of instances come into manual recovery and disappear after a short while.

Should I recover everything that I see in Manual Recovery?

You should not recover every message that you see in manual recovery.
Some of these could be genuine messages that are still in flight.
Recover only those messages that have been in manual recovery for at least a duration X, where X is larger than the longest processing time expected in that particular enterprise system.
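
The "older than X" rule can be sketched in a few lines of Python. This is only an illustration: the entry layout (conversation ID plus the time it was received), the IDs and the 30 minute threshold are all made up, and how you get the list out of the console or database is left open.

```python
from datetime import datetime, timedelta

# Hypothetical snapshot of manual-recovery entries: (conversation id, time received).
# The tuple layout and the 30-minute threshold are assumptions for illustration.
def recovery_candidates(entries, now, min_age):
    """Return the ids of entries that have been waiting for at least min_age."""
    return [conv_id for conv_id, received in entries if now - received >= min_age]

now = datetime(2008, 8, 22, 12, 0)
entries = [
    ("conv-1", datetime(2008, 8, 22, 11, 58)),  # probably still in flight
    ("conv-2", datetime(2008, 8, 22, 10, 15)),  # stuck well beyond the threshold
]
print(recovery_candidates(entries, now, timedelta(minutes=30)))
```

Only "conv-2" is returned, as "conv-1" has been waiting for just 2 minutes and may simply be in flight.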

What and where do you recover?

Well, when you look at the manual recovery area there are 3 different tabs that you generally see in the BPEL console. Most often you would only recover from the first 2, which are for invokes and callbacks.

Generally I use the instance ID of the BPEL process to identify which BPEL instance a manual recovery entry corresponds to.
If there are instances in stale state in BPEL, then there would be a corresponding manual recovery entry.
The conversation ID of this manual recovery entry will also contain the instance ID of the BPEL process, along with the other information from which the conversation ID is built.

What if there is no instance ID but something like MD5{xyz...}?

These are messages for which instances have not yet been created. They can be safely recovered.

Relate BPEL process instance parent to child relationship

Why do I want to relate these?

Before I start, consider the scenario that prompted me to write this.
I have a sequence of BPEL processes that call each other.
A
--> B
----> C
-------> D
-------> E
----> F

I want a way to relate all the processes together in the BPEL console or in a bespoke application built on top of the BPEL database.

How to do this?

There are multiple ways to relate these BPEL processes.

Title approach

Well, the one I prefer is to set the title of the BPEL process instances to some logical value that makes sense to the business. That means all the BPEL processes from "A" to "F" will be named "XYZ".
This works well for bespoke applications.
But sometimes the BPEL process goes into manual recovery before the actual title of the process is set. How do you solve that?

Database approach

Well, go into the BPEL database and look.
Now how do I relate all the processes in the database?
First you need to go to the table "ORABPEL"."CUBE_INSTANCE".

There are 3 specific columns that you should know about.
"TITLE" - holds the title of the BPEL process instance as seen in the BPEL console

"ROOT_ID" - CIKEY (the instance ID shown in the BPEL console) of the root parent, i.e. the instance ID of A

"PARENT_ID" - the immediate parent's ID, i.e. the ID of the caller process

Sample Scenario

A is the parent
A calls B
B calls C and F
C calls D and E
-------------------------------------------------------
Process Name - Instance ID - Root ID - Parent ID --> Comment
-------------------------------------------------------
A - 1 - 1 - (blank) --> no parent, so parent ID is blank; root ID is its own instance ID, 1
B - 2 - 1 - 1 --> both parent ID and root ID are A's instance ID
C - 3 - 1 - 2 --> root is still 1, but parent now is B
D - 4 - 1 - 3 --> root is still 1, but parent now is C
E - 5 - 1 - 3 --> root is still 1, but parent now is C
F - 6 - 1 - 2 --> root is still 1, but parent now is B
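
To show how the PARENT_ID values alone are enough to rebuild the call tree, here is a small Python sketch over the sample rows above. The rows are typed in by hand and the function name is my own; in a bespoke application they would come from CUBE_INSTANCE.

```python
from collections import defaultdict

# Rows mirror the sample table: (process name, instance id, root id, parent id).
rows = [
    ("A", 1, 1, None),
    ("B", 2, 1, 1),
    ("C", 3, 1, 2),
    ("D", 4, 1, 3),
    ("E", 5, 1, 3),
    ("F", 6, 1, 2),
]

def render_tree(rows):
    """Return indented lines showing each process under its caller."""
    names = {cikey: name for name, cikey, _root, _parent in rows}
    children = defaultdict(list)
    for _name, cikey, _root, parent in rows:
        children[parent].append(cikey)

    lines = []

    def walk(cikey, depth):
        lines.append("  " * depth + names[cikey])
        for child in children[cikey]:
            walk(child, depth + 1)

    # Instances without a parent are the roots of their trees.
    for root in children[None]:
        walk(root, 0)
    return lines

print("\n".join(render_tree(rows)))
```

The output is the same A, B, C, D, E, F chain as in the scenario, indented by call depth.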

Sample Query

A sample query to find all the children in a parent BPEL tree uses the root ID:
================ Query START ==================================================
select * from cube_instance where root_id = {root parent instance ID}
================ Query END ====================================================


A query to find all root BPEL processes, i.e. the BPEL processes that are the initiators:

================ Query START =================
select * from cube_instance where parent_id is NULL
================ Query END ===================
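
If you want to try the two queries without touching a BPEL database, the sketch below runs them against an in-memory SQLite table from Python. This is only a stand-in: the table is cut down to four columns with simplified types, the second "A" row is made up, and the function names are my own.

```python
import sqlite3

# In-memory stand-in for ORABPEL.CUBE_INSTANCE, reduced to the columns used here.
# Column types are simplified; the real table has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cube_instance "
    "(cikey INTEGER, title TEXT, root_id INTEGER, parent_id INTEGER)"
)
conn.executemany(
    "INSERT INTO cube_instance VALUES (?, ?, ?, ?)",
    [
        (1, "A", 1, None),
        (2, "B", 1, 1),
        (3, "C", 1, 2),
        (4, "D", 1, 3),
        (5, "E", 1, 3),
        (6, "F", 1, 2),
        (7, "A", 7, None),  # a second, unrelated run of A
    ],
)

def instances_in_tree(conn, root_id):
    """All instances that belong to the tree started by root_id."""
    rows = conn.execute(
        "SELECT cikey FROM cube_instance WHERE root_id = ? ORDER BY cikey",
        (root_id,),
    )
    return [row[0] for row in rows]

def root_instances(conn):
    """All instances that have no parent, i.e. the initiators."""
    rows = conn.execute(
        "SELECT cikey FROM cube_instance WHERE parent_id IS NULL ORDER BY cikey"
    )
    return [row[0] for row in rows]

print(instances_in_tree(conn, 1))
print(root_instances(conn))
```

The first call lists instances 1 to 6, the second lists the two initiators 1 and 7.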


Why not use Tree Finder?

It's a good question to ask: why not use Tree Finder?

In the projects that I worked on, access to the BPEL console is restricted and access to BPEL is only through a bespoke J2EE application. But that application does not allow viewing of BPEL instances, as it is meant for functional people to see workflow tasks related to business errors.

The alternative that was implemented was a read-only user on the BPEL database, from where these queries help to relate BPEL instances, especially when performing manual recovery of stale instances and identifying the tree chain that has stalled in the flow.

Hope this answers the question raised as a comment here.

Thursday, August 21, 2008

Correlation for what? What forced it?

Is it complex?

Well, correlation is not as complex as it seems.
When I first looked at it, it did not make any sense.
But this is my experience with correlation, and it was not as hard as I thought it would be.
Though it was not difficult to get it working, I had to do a lot of groundwork, basically reading a couple of articles, before I understood what it was all about.
But once I did understand, it was a cakewalk.

What's correlation? What does it correlate?

To me as a maths student, correlation meant a linear relationship between two random variables.
I initially understood it as relating two seemingly unrelated instances in BPEL that are related in a business sense. And it did turn out that way.

Where to use?

When BPEL is in synchronous mode, the caller waits for the response, so the caller instance does the correlation in the background using WS-Addressing; basically it adds its own correlation to the header of the message it sends out.

It's in asynchronous execution, where the BPEL instance fires and forgets who received it, that you need to tell BPEL which instance to go back to. This is where BPEL correlation kicks in. What it basically does is use the data in the message to identify the instance. So the correlation value should be unique to the instance.
It does not always need to be A --> B --> C --> A; it can also be A --> B --> A if BPEL fires and forgets and B wants to send back to the caller.

What forced it?

First of all, correlation has forced itself into my workspace in 2 of the projects that I was involved in, each time for a different reason.

Circular call

The first was the generally known and widely written about case, i.e. A --> B --> C --> A: basically a chain of calls that ends up back at the caller, where it has to be figured out which instance of A the response from C should return to.

If a canonical data model is used to pass data across, then one or more of its elements in the message will be used to build the correlation; otherwise one of the fields in the message will be used for correlation.

FlowN

The second was an entirely different scenario.
I tried to use FlowN to spawn 'N' instances of 'ProcessB' from a master 'ProcessA'.
The call from 'ProcessA' to 'ProcessB' was an asynchronous call. After ProcessB completed, it would return its status back to ProcessA.

Without any correlation I was getting

"Conflicting receive.Another receive activity or equivalent (currently, onMessage branch in a pick activity) has already been enabled with the partnerLink "Process", operation name "onResult" and correlation set "" (or conversation id). Appendix A - Standard Faults in the BPEL 1.1 specification specifies a fault should be thrown under these conditions."

To solve this I had to create a correlation. The reason is that each of the branches in FlowN made exactly the same asynchronous call, and the receiving instance had to identify which branch of the FlowN the response was coming back to. And yes, creating the correlation did solve the problem.
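
A toy model in Python helps to see why the fault appears and why a correlation value removes it. This is not how the BPEL engine is implemented; the class names, the "order-N" values and the branch names are invented purely for illustration.

```python
class ConflictingReceive(Exception):
    """Raised when two receives wait on the same key (toy stand-in for the BPEL fault)."""

class ReceiveRegistry:
    """Toy model of pending receives keyed by (operation, correlation value)."""

    def __init__(self):
        self._waiting = {}

    def register(self, operation, correlation, branch):
        key = (operation, correlation)
        if key in self._waiting:
            raise ConflictingReceive(f"receive already enabled for {key}")
        self._waiting[key] = branch

    def deliver(self, operation, correlation):
        """Return the branch that was waiting for this reply."""
        return self._waiting.pop((operation, correlation))

# Without correlation, every FlowN branch waits on the same key.
plain = ReceiveRegistry()
plain.register("onResult", "", "branch-1")
try:
    plain.register("onResult", "", "branch-2")
    conflict = False
except ConflictingReceive:
    conflict = True

# With a value unique to each branch, every reply finds its way back.
correlated = ReceiveRegistry()
for n in (1, 2, 3):
    correlated.register("onResult", f"order-{n}", f"branch-{n}")
print(conflict, correlated.deliver("onResult", "order-2"))
```

The second registration with an empty correlation fails, just like the second FlowN branch did for me, while the keyed registry hands the "order-2" reply to "branch-2".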

But getting correlation to work in FlowN was not straightforward, as there were scope issues that I had to address and initially did not see coming. The only explanations about FlowN that I found through Google were in the Oracle forums. Hopefully I will write another blog post about correlation in FlowN.

Where to setup correlation?

Only the receiver should perform the correlation.
And the sender should only worry about sending enough data to perform correlation.

Related Articles that helped me:


There was another very good article that helped me but I am unable to find that now.

Friday, August 15, 2008

Perform Stress Test in BPEL Console

Where Do I find it?

There is a "Perform stress test" check box that you might have seen in the BPEL console.
It is the one that sits alongside the famous manual data feeding page "Post XML Message".

If you have not seen it, this is what I am referring to.
When a BPEL process is clicked, the "Initiate" tab opens.
Here you will see the payload as an HTML form; it is also possible to choose XML Source.

Right there at the bottom left you will find 2 check boxes:
"Save Test" and "Perform stress test".

Options that come up

When you tick "Perform stress test", a few options come up:
Number of concurrent threads?
Number of loops?
Constant delay between each invocation?

So if you set these values to 100, 50 and 1000 respectively, that would run 100 threads that each loop and send 50 requests, i.e. a total of 5000 requests into the system.
Each thread will wait 1000 milliseconds before posting a request.
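
The arithmetic is simple enough to put in a helper if you want to size a run up front. The function name is my own, and the waiting time assumes the delay is applied once per invocation in every thread, which is my reading of the option and not something I have checked against the console code.

```python
def stress_plan(threads, loops, delay_ms):
    """Return (total requests, minimum seconds each thread spends waiting)."""
    total_requests = threads * loops
    # Assumes one delay per invocation; the request time itself is not included.
    min_wait_seconds = loops * delay_ms / 1000
    return total_requests, min_wait_seconds

print(stress_plan(100, 50, 1000))
```

For the values above this gives 5000 requests, with each thread spending at least 50 seconds just waiting.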

When not usable?

If you have not guessed it, here it is.
This approach is not suitable when correlation exists in the BPEL process that you want to test.
If you ask me why, it is because the same input is fed in on every loop, so the correlation value is not unique to each instance.
So this might not work out.

What is the other option?

The other option that I have used is to use the Java API and pump messages into BPEL for such testing. I will most likely write another blog post on how to post messages using the Java API.

BPEL context properties for Client API invocation

Java Class Locator

The Java APIs of the "Locator" class are used to connect to Oracle BPEL.
The Locator requires BPEL context properties to be passed as a parameter as part of its initialization. So I need to know what the properties look like.

Properties look like this

The properties should look like
=================properties file START===========================================
orabpel.platform={Oracle Application Server Platform}
java.naming.factory.initial=com.evermind.server.rmi.RMIInitialContextFactory
java.naming.provider.url=opmn:ormi://{hostname}:{port}:{oc4jinstance}/orabpel
java.naming.security.principal={username}
java.naming.security.credentials={password}

=================properties file END=============================================


Sample properties
=================properties file START=======================================
orabpel.platform=ias_10g
java.naming.factory.initial=com.evermind.server.rmi.RMIInitialContextFactory
java.naming.provider.url=opmn:ormi://localhost:6003:home/orabpel
java.naming.security.principal=oc4jadmin
java.naming.security.credentials=welcome1

=================properties file END=========================================
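
Before handing such a file to the Locator, it can be worth checking that none of the five keys is missing or empty. Below is a small Python sketch of such a check. The parser is a simplified one of my own (only "key=value" lines and comment lines, not the full Java properties syntax) and the function names are made up.

```python
REQUIRED_KEYS = (
    "orabpel.platform",
    "java.naming.factory.initial",
    "java.naming.provider.url",
    "java.naming.security.principal",
    "java.naming.security.credentials",
)

def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        # Split on the first "=" only, so URLs containing ":" or "=" survive.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def missing_keys(props):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not props.get(key)]

sample = """\
orabpel.platform=ias_10g
java.naming.factory.initial=com.evermind.server.rmi.RMIInitialContextFactory
java.naming.provider.url=opmn:ormi://localhost:6003:home/orabpel
java.naming.security.principal=oc4jadmin
java.naming.security.credentials=welcome1
"""
print(missing_keys(parse_properties(sample)))
```

An empty list means all five values are present; it says nothing about whether the host, port or password are actually right.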


Where do I find the property values?

There are multiple ways to identify what these property values are.
One of the easiest ways is through the BPEL Console. Here is how I do it.

Open the BPEL console --> Click Goto BPEL Admin --> Login to the Admin console

The default page that opens is SERVER --> under the tab Configuration
the following values correspond to the following properties
"jndiProviderURL" = "java.naming.provider.url"
"bpelPlatform" = "orabpel.platform"

Thursday, August 14, 2008

Truncate Work Flow Tasks in BPEL

Why do I want to do this?

While in development, instances are purged using the "Purge All Instances" button in the Instances tab of the BPEL console.

What I find is that this does not seem to clear out any workflow tasks that are outstanding.

How I do this

To clean out all the outstanding WF tasks, use the following statements.

=============Query START ===================
truncate table wftaskhistory;
truncate table wfassignee;
truncate table wfattachment;
truncate table wfcomments;
truncate table wfmessageattribute;
truncate table wfnotification;
truncate table wfnotificationmessages;
truncate table wfroutingslip;
truncate table wftasktimer;
truncate table wftask;

=============Query END ===================


What does it do?

What this does is clear the workflow task tables in the back-end BPEL database.

Count Number of BPEL processes Instantiated in 15 minutes intervals

Why count the number of instances, and why in 15-minute intervals?

Why count the number of BPEL processes instantiated in 15-minute intervals? A good question.

Well, post go-live, the rate of arrival of BPEL processes per 15-minute interval is something that upper management is always interested in.
It can also be used for collecting performance stats.

Query

Well, here I have only used the "Hello World BPEL" process in the list. Add your own process names to the IN list to identify the rate of arrival of messages in your BPEL system.

=================Query START ====================
SELECT processname sourcesystem, time_interval timeinterval,
COUNT (1) numberoftransactions
FROM (
-- Label each instance with its creation hour plus its 15-minute slot.
-- Assumption: the thresholds 1, 2, 3 follow from the slot labels, and
-- :endtime is an assumed bind name for the upper date bound.
SELECT process_id processname, creation_date,
CASE
WHEN (EXTRACT (MINUTE FROM (creation_date)) / 15) < 1 THEN
TO_CHAR (creation_date, 'DD-MM-YYYY HH24') ||
'Hrs 1-(First 15 Minutes)'
WHEN (EXTRACT (MINUTE FROM (creation_date)) / 15) < 2 THEN
TO_CHAR (creation_date, 'DD-MM-YYYY HH24') ||
'Hrs 2-(15 to 30 Minutes)'
WHEN (EXTRACT (MINUTE FROM (creation_date)) / 15) < 3 THEN
TO_CHAR (creation_date, 'DD-MM-YYYY HH24') ||
'Hrs 3-(30 to 45 Minutes)'
ELSE TO_CHAR (creation_date, 'DD-MM-YYYY HH24') ||
'Hrs 4-(Last 15 Minutes)'
END AS time_interval
FROM orabpel.cube_instance
WHERE process_id IN ('Hello World BPEL')
AND creation_date > TO_DATE (:starttime, 'DD-MON-YYYY HH24:MI:SS')
AND creation_date < TO_DATE (:endtime, 'DD-MON-YYYY HH24:MI:SS')
)
GROUP BY processname, time_interval
ORDER BY time_interval DESC;
=================Query END ====================
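
The bucketing that the slot labels describe is easier to check outside the database. Here is a minimal Python sketch of the same idea: the hour of creation followed by one of the four labels. The function name is my own, and it takes plain Python datetimes, not rows from CUBE_INSTANCE.

```python
from datetime import datetime

LABELS = (
    "Hrs 1-(First 15 Minutes)",
    "Hrs 2-(15 to 30 Minutes)",
    "Hrs 3-(30 to 45 Minutes)",
    "Hrs 4-(Last 15 Minutes)",
)

def time_interval(creation_date):
    """Return the creation hour followed by the label of its 15-minute slot."""
    slot = creation_date.minute // 15  # 0..3
    return creation_date.strftime("%d-%m-%Y %H") + LABELS[slot]

print(time_interval(datetime(2008, 8, 14, 9, 37)))
```

An instance created at 09:37 lands in the third slot of hour 09, so all instances created between 09:30 and 09:44 end up with the same label and can be counted together.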

BPEL process states

Why Identify states?

Identifying the states of BPEL processes from the database can be quite handy for housekeeping jobs in production and for building a bespoke application to monitor BPEL servers.

Where to find?

Well, the states of the BPEL processes can be found in the database.
This is what I do and prefer, as it gives a lot more control over what you see and how you customize it.

Core table and columns

Well the core table to look at is "ORABPEL"."CUBE_INSTANCE"

The states of the BPEL processes are stored in the "STATE" column.
This is defined as INTEGER in the database and does not help at first glance, unless you know what each state means.

What do the states mean?

Well, this is what each integer in the "STATE" column means:
0 - INITIATED in BPEL
1 - OPEN and RUNNING in BPEL
2 - OPEN and SUSPENDED in BPEL
3 - OPEN and FAULTED in BPEL
4 - CLOSED and PENDING_CANCEL in BPEL
5 - CLOSED and COMPLETED in BPEL
6 - CLOSED and FAULTED in BPEL
7 - CLOSED and CANCELLED in BPEL
8 - CLOSED and ABORTED in BPEL
9 - CLOSED and STALE in BPEL
ELSE UNKNOWN in BPEL
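
For a bespoke monitoring application the same list can be kept as a lookup table. A minimal Python sketch, with a dictionary and function name of my own choosing, using the dotted state names for the values:

```python
# Integer values of the STATE column mapped to readable names, as listed above.
STATE_NAMES = {
    0: "initiated",
    1: "open.running",
    2: "open.suspended",
    3: "open.faulted",
    4: "closed.pending_cancel",
    5: "closed.completed",
    6: "closed.faulted",
    7: "closed.cancelled",
    8: "closed.aborted",
    9: "closed.stale",
}

def state_text(state):
    """Return the readable name of a STATE value, or 'unknown' for anything else."""
    return STATE_NAMES.get(state, "unknown")

print(state_text(9), state_text(42))
```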

Sample Query

A sample query to find the count of BPEL processes in each state between a start and an end date
=================== Query START ======================
SELECT

CASE state
WHEN 0 THEN 'initiated'
WHEN 1 THEN 'open.running'
WHEN 2 THEN 'open.suspended'
WHEN 3 THEN 'open.faulted'
WHEN 4 THEN 'closed.pending_cancel'
WHEN 5 THEN 'closed.completed'
WHEN 6 THEN 'closed.faulted'
WHEN 7 THEN 'closed.cancelled'
WHEN 8 THEN 'closed.aborted'
WHEN 9 THEN 'closed.stale'
ELSE 'unknown'
END state_text,
COUNT (1)
FROM orabpel.cube_instance
WHERE creation_date > (:startdate) AND creation_date < (:enddate)
GROUP BY state;
=================== Query END ======================

Wednesday, August 13, 2008

Disclaimer of content

Well, all the posts in this blog are what I have experienced in the projects that I have worked on, or have tested with simple examples in SOA Suite 10.1.3.3 MLR 7.

I would say they work in the environment that I have got.
And I am fairly certain that the environment I have got is a standard one that Oracle provides.

If you find that these do not work,
either you can play with it and make it work,
or you can choose to leave me a note with some suggestions that will help others understand the post better.

Or even better, take both options.

Don't hesitate to drop in a comment.