Tip: don't 'tweak' the Workflow Server Group without really checking

Wed Oct 11 06:12:56 EDT 2006

Interesting, I'm sure someone will google it sometime.

From my own experience with the Event Queue, I can add: only make
adjustments in production in small increments over several days, and
monitor especially closely during peak times (month-end).

Thanks for an excellent post!
Cheers,
Mike

On Wed, October 11, 2006 10:26, Mike Gambier wrote:
> Hi,
>
> This is not a question so feel free to ignore it if you're not interested.
>
> Recently our Basis people decided to play with the Server Group for the
> Event Delivery job in Workflow (SWEQSRV) without keeping a proper eye on
> the results. (The settings are visible in SWEQADM for those who don't
> know.)
>
> We have millions of Workflow instances running in our system (that's no
> joke) and deal with about 50,000 triggering events every day. That's okay
> though, because we have 16 Application Servers on tap and Workflow had
> access to four of them (before the changes) with 50 dialog processes on
> each, adding up to a whopping 200 dialog processes on demand. We used to
> be able to deliver up to 200 events per minute (12,000 per hour),
> depending on the nature of the events being delivered.
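
For anyone who wants to redo the sums, here is a minimal sketch of the arithmetic in the paragraph above. The figures are the ones quoted in the post; the variable names are only illustrative, and the last line assumes (unrealistically) that the peak delivery rate could be sustained.

```python
# Figures quoted in the post; variable names are illustrative only.
servers_in_group = 4
dialog_processes_per_server = 50
peak_events_per_minute = 200
events_per_day = 50_000

total_dialog_processes = servers_in_group * dialog_processes_per_server
peak_events_per_hour = peak_events_per_minute * 60
# Hours needed to deliver one day's events IF the peak rate were sustained
# (an assumption made here purely for the sake of the arithmetic).
hours_at_peak_for_daily_load = events_per_day / peak_events_per_hour

print(total_dialog_processes)                  # 200
print(peak_events_per_hour)                    # 12000
print(round(hours_at_peak_for_daily_load, 2))  # 4.17
```

The total alone says nothing about contention: the same 200 processes spread differently across servers can behave very differently when online users compete for them, which is the point of the post.
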
>
> The Basis guys decided to increase the number of App Servers in the Server
> Group but decrease the number of processes on each. Logical thinking, you
> may argue, as the total number of dialog processes remained at 200, but
> what happened was that Workflow started to lose the battle for dialog
> processes on ALL servers during the online day, because we have thousands
> of users hogging them.
>
> You may or may not know that if Workflow fails to secure a dialog process
> at runtime it creates a tRFC entry for WORKFLOW_LOCAL_100 in table
> ARFCSSTATE so that a batch job called RSARFCEX can run later on and pick
> up the slack. It does this for Events, Method Calls, New Tasks... pretty
> much everything.
>
> Had the Basis people looked a bit harder they would have spotted a sudden
> surge of entries being written to ARFCSSTATE following their changes.
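
A rough sketch of the kind of check that would catch such a surge, assuming you already collect a daily count of Workflow tRFC entries by whatever means you prefer. Nothing here talks to SAP: the function name `find_surges`, the 7-day window, the factor of 3 and the sample numbers are all made up for illustration.

```python
from statistics import mean


def find_surges(daily_counts, window=7, factor=3.0):
    """Return indexes of days whose count exceeds `factor` times the mean
    of the preceding `window` days."""
    surges = []
    for i in range(window, len(daily_counts)):
        baseline = mean(daily_counts[i - window:i])
        # Treat a baseline below 1 as 1 so a near-empty queue does not
        # flag a mere handful of entries as a surge.
        if daily_counts[i] > factor * max(baseline, 1):
            surges.append(i)
    return surges


# Made-up daily entry counts, NOT data from the system described above.
counts = [120, 90, 150, 110, 100, 130, 95, 105, 4000, 52000]
print(find_surges(counts))  # [8, 9]
```

A relative threshold is used rather than a fixed one because a normal level of entries will differ a lot between systems; tune the window and factor to your own baseline.
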
>
> It's been a month now and Workflows have 'stalled' everywhere. We have
> 20 MILLION tRFCs for Workflows that cannot be processed by RSARFCEX
> because SAP's code simply can't cope with that many records (it blows its
> own internal storage even trying to process a single day's work!!).
>
> Basis have reverted their changes and the tRFC queue is hardly being hit
> by Workflow at all anymore, so we're almost back to where we were. We are
> also trying to reprocess the tRFC queue using our own tools and asking SAP
> for help with theirs via OSS.
>
> Granted, the numbers we are dealing with are extreme, but I urge Workflow
> and Basis people alike to keep an eye on tRFCs for Workflow whenever they
> meddle with the SWEQADM settings for Parallel Event Delivery.
>
> MGT