Parallel processing of Workflow Deadline Monitoring in 4.6C

Mike Gambier madgambler at hotmail.com
Wed Feb 21 05:03:21 EST 2007


Hi all,

Well, SAP's Oracle DBA, Stefan Kuhlmann (thanks, Stefan, if you're reading 
this!), worked his magic on our Workflow data on Sunday. In just over five 
and a half hours our live Production tables for Workflow were trimmed right 
down to 200 GB from just over 1.6 TB.

I don't know the figures for the indexes and how much space we clawed back 
from them, but overall everyone seems to be happy. Our previous run on a 
snapshot of Production showed a 2 TB gain in total, so I suspect we were 
pretty close to that overall.

To give you an idea of what we were up against, here are some table sizes 
and record counts from tables you'll probably all know:

Table         Size (KB)           Rows
SRRELROLES   16,588,160
SWEINSTCOU    1,330,176      9,212,962
SWELOG        1,114,112      6,376,382
SWELTS          303,104
SWEQUEUE        293,824
SWPNODELOG   71,023,616    502,301,486
SWPSTEPLOG  140,800,000    395,628,934
SWP_ADM_US    1,918,976
SWP_HEADER    8,227,840
SWP_JOIN      4,669,440
SWP_NODEWI   25,812,992
SWWBINDEF   129,307,264  1,251,610,755
SWWEI         2,655,232
SWWLOGHIST   90,001,408
SWWORGTASK      434,176
SWWRUNMETH   27,454,464
SWWSWPRET    13,524,992
SWWUSERWI    37,192,960    653,628,051
SWWWIDEADL   14,109,696
SWWWIHEAD   174,683,584    421,031,224
SWWWIRET     37,329,536
SWW_CONT    125,770,880  1,834,208,289
SWW_CONTOB   92,498,944    955,142,364

I don't recall ever seeing a single table with 1.8 BILLION records before! 
^^

Apparently these tables alone, when combined, are much larger than some 
entire systems that SAP normally works with.

We now have just over 70 MILLION entries in SWWWIHEAD (only!) and things are 
running much more smoothly.

No problems yet reported.

MGT

>From: "Mike Gambier" <madgambler at hotmail.com>
>Reply-To: "SAP Workflow Users' Group" <sap-wug at mit.edu>
>To: sap-wug at mit.edu
>Subject: Re: Parallel processing of Workflow Deadline Monitoring in 4.6C
>Date: Tue, 06 Feb 2007 09:44:49 +0000
>
>Hi all,
>
>Progress update on our open-heart surgery for Workflow : )
>
>SAP's Oracle DBA people got hold of a snapshot of our system over the 
>weekend, copied the 25 Workflow tables they wanted to hack to pieces and 
>proceeded to slice and dice all Workflow entries where the parent WF had 
>terminated.
>
>They then swapped the new copies of the tables with the system down, 
>renaming the original ones as a backup, and brought the system back up.
>
>The whole procedure took about 7 hours and our WF tablespaces dropped from 
>2 Terabytes to 560 Megabytes at a stroke.
>
>We are currently running integrity tests to see if anything has been broken 
>but so far so good.
>
>Considering our estimates show that we would have to run archiving 
>aggressively for 6 months to achieve the same result, this seems like an 
>acceptable alternative.
>
>The current plan is to do this for real in Production some time soon.
>
>We're still not sure whether our current archiving strategy can keep pace 
>with our net growth, though; perhaps we'll have a better idea once we reduce 
>the overall size of our WF tables.
>
>MGT
>
>>From: Susan Keohan <keohan at ll.mit.edu>
>>Reply-To: "SAP Workflow Users' Group" <sap-wug at mit.edu>
>>To: "SAP Workflow Users' Group" <sap-wug at mit.edu>
>>Subject: Re: Parallel processing of Workflow Deadline Monitoring in 4.6C
>>Date: Wed, 24 Jan 2007 06:35:08 -0500
>>
>>Hi Mike,
>>
>>Thanks for the update.... Please keep us posted.
>>
>>I am dealing with WF Table sizes and access time as well, but not on the
>>scale that you describe.  We have not yet implemented archiving on our
>>workitems, but our SRM 5.0 system spends a *lot* of time accessing
>>SWWWIHEAD in certain business objects (Bids, BUS2200).
>>
>>Still, I might learn a lesson from your experience.
>>
>>Mike Gambier wrote:
>> > Hi,
>> >
>> > For anyone who was interested, we have implemented the RFC queue approach
>> > (option 2 in my previous mail) and things look to be working well.
>> >
>> > We have on average 60,000 RFC entries in ARFCSSTATE for our 'new'
>> > WORKFLOW_DEADLINE_100 queue, and our version of SWWDHEX is churning
>> > through our deadlines pretty well. Load balancing seems to be working as
>> > well as it can, with WF-BATCH dominating 2 dedicated servers most of the
>> > time. Our 'normal' event RFCs seem to be stable too, which is good news.
>> >
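>> > In case anyone wants to keep an eye on the same thing, we use a quick
>> > ad-hoc count along these lines. This is only a rough sketch: the report
>> > name is made up, and it assumes the ARFCDEST and ARFCSTATE fields of
>> > ARFCSSTATE.
>> >
>> > REPORT zwf_rfc_backlog.
>> > * Sketch only: count outstanding tRFC LUWs per destination and status so
>> > * the new deadline destination can be compared with WORKFLOW_LOCAL_0100.
>> > DATA: gv_dest  LIKE arfcsstate-arfcdest,
>> >       gv_state LIKE arfcsstate-arfcstate,
>> >       gv_count TYPE i.
>> >
>> > SELECT arfcdest arfcstate COUNT( * )
>> >        INTO (gv_dest, gv_state, gv_count)
>> >        FROM arfcsstate
>> >        GROUP BY arfcdest arfcstate.
>> >   WRITE: / gv_dest, gv_state, gv_count.  "entries per destination/status
>> > ENDSELECT.
>> >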
>> > Now for our next problem to tackle: we have 2 terabytes (!!) of data in
>> > our Workflow tables and only need about 300 GB of it to remain available.
>> >
>> > We've been trying to get rid of our Workflow data using the SAP
>> > archiving objects, but our net growth rate is such that we can only just
>> > keep up with the new data coming in.
>> >
>> > SAP have suggested a radical 'heart surgery' approach involving an
>> > Oracle DBA getting his/her hands dirty to clone the 25 WF tables, slice
>> > and dice them and then reinstate the trimmed versions back in the
>> > system. All a bit scary to be honest...
>> >
>> > Will post the results here as to how we progress as someone else might
>> > find this useful.
>> >
>> > MGT
>> >
>> >> From: "Mike Gambier" <madgambler at hotmail.com>
>> >> Reply-To: "SAP Workflow Users' Group" <sap-wug at mit.edu>
>> >> To: sap-wug at mit.edu
>> >> Subject: Parallel processing of Workflow Deadline Monitoring in 4.6C
>> >> Date: Fri, 24 Nov 2006 16:11:53 +0000
>> >>
>> >> Hello fellow WUGgers,
>> >>
>> >> We are faced with a bit of a new dilemma regarding WF Deadlines, and
>> >> some difficult choices.
>> >>
>> >> Our old dilemma was this: we used to run RSWWDHEX every 15 minutes to
>> >> pick up the steps that had passed their deadline entries (SWWWIDH /
>> >> SWWWIDEADL), until this started to time out because the SELECT statement
>> >> pulled too many entries back (simply because we have so many Workflows
>> >> running). We also had an issue with the standard program respawning
>> >> itself whilst its predecessor job was still running, which caused us a
>> >> bit of grief. This last bit has been resolved in later SAP versions, we
>> >> hear.
>> >>
>> >> So, to fix these issues we cloned the program and built in a MAX HITS
>> >> parameter to reduce the number of deadlines it processed per run and
>> >> added a
>> >> self-terminate subroutine to ensure no two jobs ran concurrently.
>> >>
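>> >> In case it helps anyone, the two changes boil down to something like
>> >> the sketch below. It is schematic only: the job name is illustrative,
>> >> and the usual overdue date/time restriction from RSWWDHEX still has to
>> >> go into the WHERE clause.
>> >>
>> >> * 1) Cap the number of deadlines picked up per run
>> >> PARAMETERS p_maxhit TYPE i DEFAULT 500.
>> >>
>> >> DATA lt_dl LIKE swwwidh OCCURS 0 WITH HEADER LINE.
>> >>
>> >> * Same selection as RSWWDHEX (overdue restriction omitted here), just
>> >> * capped so the SELECT can no longer run away with itself.
>> >> SELECT * FROM swwwidh
>> >>          INTO TABLE lt_dl
>> >>          UP TO p_maxhit ROWS.
>> >>
>> >> * 2) Self-terminate if a previous run of the same job is still active
>> >> DATA lv_active TYPE i.
>> >>
>> >> SELECT COUNT( * ) FROM tbtco
>> >>        INTO lv_active
>> >>        WHERE jobname = 'Z_WF_DEADLINE_MONITOR'  "illustrative job name
>> >>        AND   status  = 'R'.                     "'R' = job is running
>> >>
>> >> IF lv_active > 1.              "this run plus at least one older run
>> >>   LEAVE PROGRAM.
>> >> ENDIF.
>> >>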
>> >> But, even after these changes we are faced with a NEW dilemma with WF
>> >> Deadline Monitoring. Namely, it has a nasty habit of loading up whatever
>> >> server the job is run on in order to progress the deadlines! This
>> >> manifests itself in dialog process 'hogging', or in excessive local tRFC
>> >> entries in ARFCSSTATE when it can't get hold of a dialog process on that
>> >> particular server (which can happen a lot if we have other heavy jobs
>> >> running there). The load then shifts to RSARFCEX, which then struggles
>> >> because everything is processed locally on whatever server it is run on.
>> >>
>> >> Unlike the Event Queue, there is no standard ready-made parallel
>> >> processing option for Deadlines that we know of, at least not in 4.6C.
>> >> So we're thinking of choosing one of these options:
>> >>
>> >> 1. Amend our Deadline Monitoring program (will require a mod to SAP code
>> >> as well) to redirect the first RFC to a new custom destination that can
>> >> be processed separately from 'normal' Workflow tRFCs, e.g.
>> >> 'WORKFLOW_DL_0100' instead of 'WORKFLOW_LOCAL_0100' (see the rough
>> >> sketch after this list). The new destination would be set up to point to
>> >> a completely different server from the one the Deadline Monitor job is
>> >> currently running on. This won't diminish the load on the server where
>> >> dialog processes are available, but at least it will shift the load of
>> >> RSARFCEX when it runs. Obviously we would have to add a new RSARFCEX run
>> >> with this new destination to our schedule.
>> >>
>> >> 2. Same as 1 (mod required), but the new destination will point to a
>> >> server group (rather than a single server) to spread the load across
>> >> multiple servers when the tRFCs are converted into qRFCs (again, see the
>> >> sketch after this list). This has the added benefit of reusing the qRFC
>> >> queue (and its standard config settings and transactions) to buffer the
>> >> start of each new deadline being processed. Once a deadline step is
>> >> executed, any tRFCs that result will be appended as WORKFLOW_LOCAL_0100
>> >> as normal, because they will come from subsequent calls that are not
>> >> affected by our mod. The end result should be that the START of each
>> >> deadline process chain is distributed across multiple servers (and
>> >> therefore spreads the demand for dialog processes accordingly), but any
>> >> tRFCs that result will end up being chucked back into the 'local' pot.
>> >> Unfortunately this would mean that our version of SWWDHEX passes the
>> >> baton on to RSQOWKEX (the outbound queue batch job) to actually progress
>> >> the deadline, i.e. do any real work. We would therefore have two batch
>> >> jobs to watch, and a noticeable delay between deadlines being selected
>> >> and deadlines actually being progressed. Whether we can live with that
>> >> we just don't know. The issue of different deadlines for the same
>> >> Workflow being progressed on different servers is also a concern, but
>> >> since we limit the number of deadlines we process per run anyway, that
>> >> is something we already live with today.
>> >>
>> >> 3. Dynamic destination determination (OSS Note 888279) applied to all
>> >> Workflow steps, not just deadlines. Scary stuff. Breaks the concept of a
>> >> single server 'owning' a deadline process chain in its entirety.
>> >> Considering the volumes of Workflow we have, we're uncertain as to what
>> >> impact this will have system-wide.
>> >>
>> >> 4. Redesign Deadline Monitoring to use the same persistence approach as
>> >> Event Delivery and have a deadline queue. Complete overhaul using
>> >> SWEQUEUE etc. as a guide. Would be a lovely project to do, but honestly
>> >> we can't really justify the database costs, code changes and testing.
>> >>
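>> >> To make options 1 and 2 a bit more concrete, the mod boils down to
>> >> something like this at the point where the deadline monitor fires off
>> >> the work item. Sketch only: the Z function module, its WI_ID parameter
>> >> and the queue name are placeholders, not the real workflow internals.
>> >>
>> >> DATA lv_wi_id LIKE swwwihead-wi_id.
>> >>
>> >> * Option 2 only: register a named qRFC queue for the following LUW so
>> >> * that RSQOWKEX / SMQ1 can manage it (must precede the tRFC call).
>> >> CALL FUNCTION 'TRFC_SET_QUEUE_NAME'
>> >>   EXPORTING
>> >>     qname = 'WORKFLOW_DEADLINE'.       "illustrative queue name
>> >>
>> >> * Options 1 and 2: same RFC as today, but aimed at a different SM59
>> >> * destination instead of WORKFLOW_LOCAL_0100. For option 1 that
>> >> * destination points at a single dedicated server, for option 2 at a
>> >> * logon/server group so the load spreads when the queue is processed.
>> >> CALL FUNCTION 'Z_START_DEADLINE_STEP'  "placeholder for the workflow
>> >>   IN BACKGROUND TASK                   "RFC module actually used
>> >>   DESTINATION 'WORKFLOW_DL_0100'
>> >>   EXPORTING
>> >>     wi_id = lv_wi_id.
>> >>
>> >> COMMIT WORK.                           "dispatches the tRFC/qRFC LUW
>> >>
>> >> The only moving part is the DESTINATION; everything after that first
>> >> call still goes via WORKFLOW_LOCAL_0100 as described above.
>> >>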
>> >> We are currently favouring option 2 as a realistic way forward, as it
>> >> seems to offer the simplest way of shifting the load around to prevent a
>> >> single server from being hammered. It has risks and would require
>> >> careful monitoring of the qRFC queues, but it seems a safer bet than
>> >> overloading another single server (option 1), splitting up a single
>> >> deadline chain across multiple servers (option 3) or costing the earth
>> >> and becoming unsupportable (option 4).
>> >>
>> >> Has anyone out there implemented option 3, the OSS Note? We'd love to
>> >> know...
>> >>
>> >> Or, if you have any alternative suggestions we'd be interested to hear
>> >> them
>> >> :)
>> >>
>> >> Regards,
>> >>
>> >> Mike GT
>> >>
>> >
>>
>>--
>>Susan R. Keohan
>>SAP Workflow Developer
>>MIT Lincoln Laboratory
>>244 Wood Street
>>LI-200
>>Lexington, MA. 02420
>>781-981-3561
>>keohan at ll.mit.edu
>
>


>_______________________________________________
>SAP-WUG mailing list
>SAP-WUG at mit.edu
>http://mailman.mit.edu/mailman/listinfo/sap-wug




