Parallel processing of Workflow Deadline Monitoring in 4.6C
Mike Gambier
madgambler at hotmail.com
Fri Nov 24 11:11:53 EST 2006
Hello fellow WUGgers,
We are faced with a bit of a new dilemma regarding WF Deadlines and seem to
be faced with some difficult choices.
Our old dilemma was this: we used to run RSWWDHEX every 15 minutes to pick
up our steps that had passed their deadline entries (SWWWIDH / SWWWIDEADL)
until this started to time out because the SELECT statement pulled too many
entries back (simply because we have so many Workflows running). We also had
an issue with the standard program respawning itself whilst its predecessor
job was still running, which caused us a bit of grief. We hear this last
issue has since been resolved in later SAP versions.
So, to fix these issues we cloned the program and built in a MAX HITS
parameter to reduce the number of deadlines it processed per run and added a
self-terminate subroutine to ensure no two jobs ran concurrently.
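
As a rough illustration of those two changes (not our actual ABAP; the
lock file, function names and record layout below are all hypothetical
stand-ins), the idea can be sketched in Python as a capped selection plus
a guard that makes a second run give up if its predecessor is still
active:

```python
import os
import tempfile


def select_due_deadlines(deadlines, now, max_hits):
    """Return at most max_hits overdue entries, oldest deadline first."""
    due = sorted((d for d in deadlines if d["due"] <= now),
                 key=lambda d: d["due"])
    return due[:max_hits]


class SingleInstanceGuard:
    """Hypothetical lock-file guard standing in for the self-terminate check."""

    def __init__(self, path):
        self.path = path
        self.acquired = False

    def __enter__(self):
        try:
            # O_EXCL makes creation fail if another run already holds the lock.
            fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            self.acquired = True
        except FileExistsError:
            self.acquired = False
        return self

    def __exit__(self, *exc):
        # Only the run that created the lock may remove it.
        if self.acquired:
            os.remove(self.path)
        return False


def run_monitor(deadlines, now, max_hits, lock_path):
    """Process one capped batch, or return None if a predecessor is running."""
    with SingleInstanceGuard(lock_path) as guard:
        if not guard.acquired:
            return None
        return select_due_deadlines(deadlines, now, max_hits)
```

Anything left over after the cap is simply picked up by the next scheduled
run, which is why the cap trades throughput per run for predictable run
times.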
But, even after these changes we are faced with a NEW dilemma with WF
Deadline Monitoring. Namely it has a nasty habit of loading up whatever
server the job is run on to progress the deadline! This manifests itself in
dialog process 'hogging' or excessive local tRFC entries in ARFCSSTATE where
it can't get hold of a dialog process to use on that particular server
(which can happen a lot if we have other heavy jobs running there). The load
then shifts to RSARFCEX which then struggles with the load as everything is
processed locally on whatever server it is run on.
Unlike the Event Queue there is no standard ready made Parallel processing
option for Deadlines that we know of, at least not in 4.6C. So we're
thinking of choosing one of these options:
1. Amend our Deadline Monitoring program (will require a mod to SAP code as
well) to redirect the first RFC with a new custom destination that can be
processed separately to 'normal' Workflow tRFCs, e.g. 'WORKFLOW_DL_0100'
instead of 'WORKFLOW_LOCAL_0100'. The new destination would be set up to
point to a completely different server than the one the Deadline Monitor job
is currently running on. This won't diminish the load on the server where
dialog processes are available but at least it will shift the load on
RSARFCEX when it runs. Obviously we would have to add a new run of
RSARFCEX for this new destination to our schedule.
2. Same as 1 (mod required) but the new destination will point to a server
group destination (rather than a single server) to spread the load across
multiple servers when the tRFCs are converted into qRFCs. Has the added
benefit of reusing the qRFC queue (and its standard config settings and
transactions) to buffer the start of each new deadline being processed. Once
a deadline step is executed, any tRFCs that result will be appended as
WORKFLOW_LOCAL_0100 as normal because they will result from subsequent calls
that will not be affected by our mod setting. End result should be the START of
each deadline process chain is distributed across multiple servers (and
therefore will spread the demand for dialog processes accordingly), but any
tRFCs that result will end up being chucked back into the 'local' pot.
Unfortunately this would mean that our version of SWWDHEX would pass the
baton on to RSQOWKEX (the outbound queue batch job) to actually progress the
deadline, i.e. do any real work. We would therefore have two batch jobs to
watch and have a noticeable delay between deadlines being selected and
deadlines actually being progressed. Whether we can live with this we just
don't know. The issue of different deadlines for the same Workflow being
progressed on different servers is also a concern but since we limit the
number of deadlines we process per run anyway, that is something we
already suffer from.
3. Dynamic destination determination (OSS Note 888279) applied to all
Workflow steps, not just deadlines. Scary stuff. Breaks the concept of a
single server 'owning' a deadline process chain in its entirety. Considering
the volumes of Workflow we have, we're uncertain as to what impact this will
have system-wide.
4. Redesign Deadline Monitoring to use the same persistence approach as
Event Delivery and have a deadline queue. Complete overhaul using SWEQUEUE
etc as a guide. Would be a lovely project to do but honestly we can't really
justify the database costs, code changes and testing.
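
To make the difference between options 1 and 2 concrete, here is a toy
Python sketch of where the START of each deadline chain would land. It is
only a model: the server names are made up, and plain round-robin is an
assumption standing in for however a real server group destination
balances its load.

```python
from collections import Counter
from itertools import cycle


def make_single_router(server):
    """Option 1: every deadline start goes to one fixed, different server."""
    def route_start(deadline_id):
        return server
    return route_start


def make_group_router(servers):
    """Option 2: starts are spread over a group (round-robin is assumed)."""
    ring = cycle(servers)

    def route_start(deadline_id):
        return next(ring)
    return route_start


def dispatch(deadline_ids, route_start):
    """Count how many deadline starts each server would receive."""
    starts = Counter()
    for deadline_id in deadline_ids:
        starts[route_start(deadline_id)] += 1
    return starts
```

With nine deadlines and three hypothetical servers, the single router
puts all nine starts on one box, while the group router gives each server
three. The sketch deliberately ignores the follow-on tRFCs, which in
option 2 would still be queued as WORKFLOW_LOCAL_0100.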
We are currently favouring option 2 as a realistic way forward as it seems
to offer the simplest way of shifting the load around to prevent a single
server from being hammered. It has risks and would require careful
monitoring of the qRFC queues, but it seems a safer bet than overloading
another single server (option 1), splitting up a single deadline chain
across multiple servers (option 3) or costing the earth and becoming
unsupportable (option 4).
Has anyone out there implemented option 3, the OSS Note? We'd love to
know...
Or, if you have any alternative suggestions we'd be interested to hear them
:)
Regards,
Mike GT