Parallel processing of Workflow Deadline Monitoring in 4.6C

Wed Jan 24 06:28:46 EST 2007

Hi,

For anyone who is interested: we have implemented the qRFC approach (option 
2 from my previous mail) and things look to be working well.

We have on average 60,000 RFC entries in ARFCSSTATE for our 'new' 
WORKFLOW_DEADLINE_100 queue and our version of SWWDHEX is churning through 
our deadlines pretty well. Load balancing seems to be working as well as it 
can with WF-BATCH dominating 2 dedicated servers most of the time. Our 
'normal' event RFCs seem to be stable too, which is good news.

Now for our next problem to tackle: we have 2 terabytes (!!) of data in our 
Workflow tables and only need about 300 GB of it to remain available.

We've been trying to get rid of our Workflow data using the SAP archiving 
objects but our net growth rate is such that we can only seem to keep up 
with new data coming in.
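
To put numbers on that: the tables only shrink when the archiving rate 
exceeds the inflow rate. Below is a minimal sketch of the arithmetic. The 
2000 GB and 300 GB figures come from this post; the function name and the 
daily rates are purely hypothetical and are not measurements from our 
system.

```python
def days_to_target(current_gb, target_gb, archived_gb_per_day, inflow_gb_per_day):
    """Estimate days until the tables shrink to target_gb.

    Returns None when archiving does not outpace new data, because the
    target is then never reached.
    """
    net_reduction = archived_gb_per_day - inflow_gb_per_day
    if net_reduction <= 0:
        return None  # only keeping up with inflow (or falling behind)
    return (current_gb - target_gb) / net_reduction


# Hypothetical rates: archiving exactly matches inflow, so no progress.
print(days_to_target(2000, 300, 10, 10))
# Hypothetical rates: archiving outpaces inflow by 17 GB per day.
print(days_to_target(2000, 300, 27, 10))
```

With equal rates the function returns None, which is the situation 
described above.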

SAP have suggested a radical 'heart surgery' approach involving an Oracle 
DBA getting his/her hands dirty to clone the 25 WF tables, slice and dice 
them and then reinstate the trimmed versions back in the system. All a bit 
scary to be honest...

I will post our progress here, as someone else might find it useful.

MGT

>From: "Mike Gambier" <madgambler at hotmail.com>
>Reply-To: "SAP Workflow Users' Group" <sap-wug at mit.edu>
>To: sap-wug at mit.edu
>Subject: Parallel processing of Workflow Deadline Monitoring in 4.6C
>Date: Fri, 24 Nov 2006 16:11:53 +0000
>
>Hello fellow WUGgers,
>
>We are faced with a bit of a new dilemma regarding WF Deadlines and have
>some difficult choices to make.
>
>Our old dilemma was this: we used to run RSWWDHEX every 15 minutes to pick
>up steps that had passed their deadlines (SWWWIDH / SWWWIDEADL) until it
>started to time out because the SELECT statement pulled too many entries
>back (simply because we have so many Workflows running). We also had an
>issue with the standard program respawning itself whilst its predecessor
>job was still running, which caused us a bit of grief. We hear this last
>issue has since been resolved in later SAP versions.
>
>So, to fix these issues we cloned the program, built in a MAX HITS
>parameter to reduce the number of deadlines it processes per run, and added
>a self-terminate subroutine to ensure no two jobs run concurrently.
>
>But even after these changes we are faced with a NEW dilemma with WF
>Deadline Monitoring. Namely, it has a nasty habit of loading up whatever
>server the job is run on to progress the deadlines! This manifests itself
>as dialog process 'hogging' or, where it can't get hold of a dialog process
>on that particular server (which can happen a lot if we have other heavy
>jobs running there), as excessive local tRFC entries in ARFCSSTATE. The
>load then shifts to RSARFCEX, which struggles because everything is
>processed locally on whatever server it is run on.
>
>Unlike the Event Queue there is no standard ready made Parallel processing
>option for Deadlines that we know of, at least not in 4.6C. So we're
>thinking of choosing one of these options:
>
>1. Amend our Deadline Monitoring program (will require a mod to SAP code as
>well) to redirect the first RFC to a new custom destination that can be
>processed separately from 'normal' Workflow tRFCs, e.g. 'WORKFLOW_DL_0100'
>instead of 'WORKFLOW_LOCAL_0100'. The new destination would be set up to
>point to a completely different server from the one the Deadline Monitor
>job is currently running on. This won't diminish the load on the server
>where dialog processes are available, but at least it will shift the load
>on RSARFCEX when it runs. Obviously we would have to add a new run of
>RSARFCEX with this new destination to our schedule.
>
>2. Same as 1 (mod required) but the new destination will point to a server
>group destination (rather than a single server) to spread the load across
>multiple servers when the tRFCs are converted into qRFCs. This has the
>added benefit of reusing the qRFC queue (and its standard config settings
>and transactions) to buffer the start of each new deadline being processed.
>Once a deadline step is executed, any tRFCs that result will be appended as
>WORKFLOW_LOCAL_0100 as normal because they will result from subsequent
>calls that will not be affected by our mod setting. The end result should
>be that the START of each deadline process chain is distributed across
>multiple servers (and therefore will spread the demand for dialog processes
>accordingly), but any tRFCs that result will end up being chucked back into
>the 'local' pot. Unfortunately this would mean that our version of SWWDHEX
>would pass the baton on to RSQOWKEX (the outbound queue batch job) to
>actually progress the deadline, i.e. do any real work. We would therefore
>have two batch jobs to watch and a noticeable delay between deadlines being
>selected and deadlines actually being progressed. Whether we can live with
>this we just don't know. The issue of different deadlines for the same
>Workflow being progressed on different servers is also a concern, but since
>we limit the number of deadlines we process per run anyway, that is
>something we already suffer from.
>
>3. Dynamic destination determination (OSS Note 888279) applied to all
>Workflow steps, not just deadlines. Scary stuff. Breaks the concept of a
>single server 'owning' a deadline process chain in its entirety.
>Considering the volumes of Workflow we have, we're uncertain as to what
>impact this will have system-wide.
>
>4. Redesign Deadline Monitoring to use the same persistence approach as
>Event Delivery and have a deadline queue. Complete overhaul using SWEQUEUE
>etc. as a guide. Would be a lovely project to do but honestly we can't
>really justify the database costs, code changes and testing.
>
>We are currently favouring option 2 as a realistic way forward as it seems
>to offer the simplest way of shifting the load around to prevent a single
>server from being hammered. It has risks and would require careful
>monitoring of the qRFC queues, but it seems a safer bet than overloading
>another single server (option 1), splitting up a single deadline chain
>across multiple servers (option 3) or costing the earth and becoming
>unsupportable (option 4).
>
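
The idea behind option 2 (buffer the start of each deadline in a queue, 
then hand the entries out across a server group) can be illustrated with 
the sketch below. It is only an illustration: drain_queue and the server 
names are hypothetical, and plain round-robin is an assumption made for 
simplicity, not a description of how SAP balances load across a server 
group.

```python
from collections import deque
from itertools import cycle


def drain_queue(queue, servers):
    """Assign each queued deadline to the next server in the group.

    Round-robin is assumed here purely for illustration.
    """
    assignments = {server: [] for server in servers}
    next_server = cycle(servers)
    while queue:
        deadline = queue.popleft()  # oldest entry first
        assignments[next(next_server)].append(deadline)
    return assignments


# Hypothetical deadline IDs and server names.
pending = deque(range(5))
print(drain_queue(pending, ["app1", "app2"]))
```

Only the first call of each deadline chain is distributed this way; any 
follow-on work stays local to whichever server picked the entry up.
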
>Has anyone out there implemented option 3, the OSS Note? We'd love to
>know...
>
>Or, if you have any alternative suggestions we'd be interested to hear them
>:)
>
>Regards,
>
>Mike GT
>
>_______________________________________________
>SAP-WUG mailing list
>SAP-WUG at mit.edu
>http://mailman.mit.edu/mailman/listinfo/sap-wug
