Fall 2003 Workflow Extensions

By Lars Pind

This requirements and design document is primarily motivated by:

A client project developing the Simulation package (in CVS at openacs.org:/cvsroot openacs-4/contrib/packages/simulation), which is a workflow-based law simulation engine.
The need for an application that can handle the TIP voting process.

Hierarchical Workflows

Requirements

Use cases:

Leiden: We have several occurrences of the simple AskInfo-GiveInfo question/response pair. Defining simulation templates would be simplified if that were a reusable component.
TIP Voting: There's a master workflow case for the TIP itself. When voting, there'll be a sub-workflow case for each TIP member to vote on the issue, with timeouts so if they don't vote within a week, their vote is automatically 'Abstained'.

Design

Actions will no longer be atomic. An action can be "in progress" for a long time, while the child workflow(s) complete.
We will introduce an uber-state of a case, which can be 'active', 'completed', 'canceled', or 'suspended'.
When the action gets enabled, a callback will create child cases linked to this particular enabled action.
Whenever a child case changes its case_state, a callback on the parent action is invoked, which examines the state of all of its child cases and determines whether the parent action is complete and ready to fire or not. If the parent action is completed, any remaining 'active' child cases will be marked 'canceled'.
If the action should ever get un-enabled, a callback will cancel all remaining 'active' child cases.
If the action becomes enabled again, we will create new child cases.
A case which is a child of another case cannot leave the 'completed' or 'canceled' state, unless its parent enabled action is still enabled.

Data Model

create table workflow_action_children(
  child_id                  integer
                            constraint ...
                            primary key,
  action_id                 integer
                            constraint ...
                            not null
                            constraint ...
                            references workflow_actions(action_id)
                            on delete cascade,
  child_workflow            integer
                            constraint wf_action_child_wf_fk
                            references workflows(workflow_id)
);

create table workflow_action_child_role_map(
  parent_action_id          integer
                            constraint wf_act_chid_rl_map_prnt_act_fk
                            references workflow_actions(action_id),
  parent_role               integer
                            constraint wf_act_chid_rl_map_prnt_rl_fk
                            references workflow_roles(role_id),
  child_role                integer
                            constraint wf_act_chid_rl_map_chld_rl_fk
                            references workflow_roles(role_id),
  mapping_type              char(40)
                            constraint wf_act_chid_rl_map_type_ck
                            check (mapping_type in 
                                ('per_role','per_member','per_user'))
);

create table workflow_case_enabled_actions(
  enabled_action_id         integer
                            constraint wf_case_enbl_act_case_id_pk
                            primary key,
  case_id                   integer
                            constraint wf_case_enbl_act_case_id_nn
                            not null
                            constraint wf_case_enbl_act_case_id_fk
                            references workflow_cases(case_id)
                            on delete cascade,
  action_id                 integer
                            constraint wf_case_enbl_act_action_id_nn
                            not null
                            constraint wf_case_enbl_act_action_id_fk
                            references workflow_actions(action_id)
                            on delete cascade,
  enabled_state             char(40)
                            constraint wf_case_enbl_act_state_ck
                            check (enabled_state in ('enabled','running','completed','canceled','refused')),
  -- the timestamp when this action automatically fires
  fire_timestamp            timestamp
                            constraint wf_case_enbl_act_timeout_nn
                            not null,
  constraint wf_case_ena_act_case_act_un
  unique (case_id, action_id)
);

create table workflow_case_child_cases(
  case_id                 integer 
                          constraint wf_case_child_cases_case_fk
                          references workflow_cases
                          constraint wf_case_child_cases_case_pk
                          primary key,
  enabled_action_id       integer
                          constraint wf_case_child_cases_en_act_fk
                          references workflow_case_enabled_actions
                          constraint wf_case_child_cases_en_act_nn
                          not null
);

Enabled States Explained

The enabled_state of rows in workflow_case_enabled_actions can be in one of the following:

Enabled. The action is currently enabled.
Running. The action is currently running, specifically meaning that there are active child cases. (NOTE: Do we need this state?)
Completed. The action has completed executing. The row will still stay around so we have a history of what was executed when and we're able to count the number of times a given action was executed.
Canceled. The action was enabled, but the case's state changed before the action was triggered. (Note: This is not necessary, we could just delete the row instead.)
Refused. The action had its database-driven preconditions for being enabled met (e.g. enabled-in-states for FSM, input places with tokens in Petri, plus dependencies on other tasks met), but the "CanEnableP" callback refused to let the action become enabled. (Note: This is not necessary, we could just delete the row instead.)

When Enabled

When an action with child workflows is enabled, we start the child cases defined by the parent workflow, executing the initial action on each of them.

We create one case per role in workflow_action_children times one case per member/user for roles with a mapping_type of 'per_member'/'per_user'. If more than one role has a mapping_type other than 'per_role', we will create cases for the cartesian product of members/users of those roles in the parent workflow.
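The fan-out rule above can be sketched as follows. This is an illustration in Python rather than the package's own Tcl, and the function name and dictionary shapes are assumptions made for the example. It also assumes that group membership has already been expanded, so 'per_member' and 'per_user' roles are treated alike.

```python
from itertools import product


def child_case_assignments(role_members, mapping_types):
    """Return one {role: [party, ...]} dict per child case to create.

    role_members:  {role: [party, ...]} as assigned in the parent case
    mapping_types: {role: 'per_role' | 'per_member' | 'per_user'}
    """
    # Roles that fan out into one child case per member/user.
    fanned = [r for r in role_members if mapping_types[r] != "per_role"]
    # 'per_role' roles are copied whole into every child case.
    shared = {r: list(ms) for r, ms in role_members.items() if r not in fanned}
    cases = []
    # product() over zero iterables yields one empty tuple, so a parent with
    # only 'per_role' mappings still gets exactly one child case.
    for combo in product(*(role_members[r] for r in fanned)):
        case = dict(shared)
        case.update({r: [m] for r, m in zip(fanned, combo)})
        cases.append(case)
    return cases


cases = child_case_assignments(
    {"voter": ["alice", "bob"], "observer": ["carol", "dave"], "chair": ["erin"]},
    {"voter": "per_user", "observer": "per_member", "chair": "per_role"},
)
print(len(cases))  # 2 voters x 2 observers = 4 child cases
```

With two fanned-out roles the number of child cases grows multiplicatively, which is worth keeping in mind when deciding whether the cartesian product is really wanted.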

When Triggered

The action can be triggered by a timeout, by the user, by child cases reaching a certain state, or by all child cases being completed.

An example of "child cases reaching a certain state" would be the TIP voting process, where a 2/3 majority of 'Approved' votes is enough to determine the outcome, and we don't need the rest to vote anymore.

When triggered, all child cases with a case_state of 'active' are put into the 'canceled' state. All child cases have their 'locked_p' flag set to true, so they cannot be reopened.
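A minimal sketch of that clean-up step, in Python for illustration only. Child cases are modeled as plain dictionaries, and the function name is hypothetical.

```python
def close_child_cases(children):
    """Cancel still-active child cases and lock every child against reopening."""
    for child in children:
        if child["case_state"] == "active":
            child["case_state"] = "canceled"
        # Completed, suspended and canceled children are all locked too.
        child["locked_p"] = True
    return children


children = [
    {"case_id": 1, "case_state": "completed", "locked_p": False},
    {"case_id": 2, "case_state": "active", "locked_p": False},
]
close_child_cases(children)
```

Note that only 'active' children change state here; what should happen to 'suspended' children is not specified above, so the sketch leaves their state alone and merely locks them.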

Trigger Conditions

Requirements

If any change to any child workflow of a case attempts to trigger the parent action, the trigger condition would tell us whether to allow the trigger to go through.

The trigger condition could check to see if all child cases are completed, or it could check if there's enough to determine the outcome, e.g. a 2/3 approval.
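Both kinds of trigger condition can be written as small predicates. The sketch below is illustrative Python; the function names and the 'approved' vote value are assumptions for the example, not part of the design.

```python
def all_children_completed(child_states):
    """True when there is at least one child case and every one is 'completed'."""
    return bool(child_states) and all(s == "completed" for s in child_states)


def two_thirds_approved(votes, electorate_size):
    """True once 'approved' votes reach 2/3 of the whole electorate.

    Integer arithmetic avoids rounding trouble with 2/3 as a float.
    """
    approved = sum(1 for v in votes if v == "approved")
    return 3 * approved >= 2 * electorate_size


# Six of nine members have approved; the remaining votes cannot change the outcome.
print(two_thirds_approved(["approved"] * 6, 9))  # True
```

The guard against an empty list matters because `all()` is true for an empty sequence, which would otherwise fire a parent action that has no child cases yet.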

XXXXXXXXXXXXXXX

Child Case State Changed Logic

We execute the OnChildCaseStateChange callback, if any. This gets to determine whether the parent action is now complete and should fire.

We provide a default implementation, which simply checks whether all child cases are in the 'completed' state, and if so, fires.

NOTE: What do we do if any of the child cases are canceled? Consider them complete and move on with the parent workflow? Cancel the parent workflow?

NOTE: Should we provide this as internal workflow logic or as a default callback implementation? If we leave this as a service contract with a default implementation, then applications can customize. But would that ever be relevant? Maybe this callback is never needed.

Case State

Requirements

We want to be able to suspend a case, to reopen it later, without having to create an explicit state in the workflow for this. Suspending the case means it doesn't show up on people's task lists or in reminder emails until it's un-suspended.
In the UI, we want to be able to distinguish between cases that are considered active and complete, even if the closed ones could be reopened to haunt us later. A good example is bug-tracker, where bugs in "open" or "resolved" states are considered active and should be counted as bugs needing attention, whereas those in "closed" state are complete and do not.
A case can be canceled, which is the same as suspended, except it doesn't resurface unless someone actively reopens it.
Child cases must be locked down so they cannot be reactivated when the parent workflow has moved on to some other state.

Design

create table workflow_cases(
  ...
  state                     char(40)
                            constraint workflow_cases_state_ck
                            check (state in ('active', 'completed',
                            'canceled', 'suspended'))
                            default 'active',
  locked_p                  boolean default 'f',
  suspended_until           timestamptz,
  ...
);

Cases can be active, completed, suspended, or canceled.

They start out as active. For FSMs, when they hit a state with complete_p = t, the case is moved to 'completed'.

Users can choose to cancel or suspend a case. When suspending, they can type in a date, on which the case will spring back to 'active' life.

When a parent workflow completes an action with a sub-workflow, the child cases that are 'completed' are marked 'closed', and the child cases that are 'active' are marked 'canceled'.

The difference between 'completed' and 'closed' is that completed does not prevent the workflow from continuing (e.g. bug-tracker 'closed' state doesn't mean that it cannot be reopened), whereas a closed case cannot be reactivated (terminology confusion alert!).
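The case-state rules can be sketched as below. This is illustrative Python with hypothetical function names, and it assumes that 'closed' is represented by the locked_p flag rather than by a fifth value of the state column, which is one possible reading of the table definition above.

```python
from datetime import datetime


def wake_if_due(case, now):
    """Return a suspended case to 'active' once its suspended_until date passes."""
    until = case.get("suspended_until")
    if case["state"] == "suspended" and until is not None and until <= now:
        case["state"] = "active"
        case["suspended_until"] = None
    return case


def reopen(case):
    """Reactivate a completed, canceled, or suspended case unless it is locked."""
    if case["locked_p"]:
        # Assumption: a locked case is what the text calls 'closed'.
        raise PermissionError("case was closed by its parent workflow")
    case["state"] = "active"
    case["suspended_until"] = None
    return case


case = {"state": "suspended", "locked_p": False,
        "suspended_until": datetime(2003, 10, 1)}
wake_if_due(case, datetime(2003, 10, 2))
print(case["state"])  # active
```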

Conditional Transformation For Atomic Actions

create table workflow_action_fsm_output_map(
  action_id                 integer
                            not null
                            references workflow_actions(action_id)
                            on delete cascade,
  output_short_name         varchar(100),
  new_state                 integer
                            references workflow_fsm_states,
  constraint ...
  primary key (action_id, output_short_name)
);

Callback: Action.OnFire -> (output): Executed when the action fires. Output can be used to determine the new state of the case (see below).

The callback must enumerate all the values it can possibly output (similar construct to the GetObjectType operation on other current workflow service contracts), and the callback itself must return one of those possible values.

The workflow engine will then allow the workflow designer to map these possible output values of the callback to new states, in the case of an FSM, or similar relevant state changes for other models.
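As a rough sketch of how the engine might apply such a mapping for an FSM: the Python below uses a hypothetical callback class and state names, and stands in for rows of workflow_action_fsm_output_map with a plain dictionary.

```python
class VoteTally:
    """Hypothetical OnFire implementation with two declared outputs."""

    def __init__(self, approved):
        self.approved = approved

    def get_outputs(self):
        return ["approved", "rejected"]

    def on_fire(self):
        return "approved" if self.approved else "rejected"


def next_state(callback, output_map, current_state):
    """Map the callback's output to a new FSM state, or stay put if unmapped."""
    output = callback.on_fire()
    if output not in callback.get_outputs():
        raise ValueError("undeclared output: %r" % (output,))
    return output_map.get(output, current_state)


# {output_short_name: new_state}, as the workflow designer would configure it.
output_map = {"approved": "accepted", "rejected": "declined"}
print(next_state(VoteTally(True), output_map, "voting"))  # accepted
```

Rejecting undeclared outputs keeps the designer's mapping exhaustive; whether an unmapped but declared output should leave the state unchanged, as it does here, is a design choice the text leaves open.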

Service Contract

workflow.Action_OnFire:
  OnFire -> string
  GetObjectType -> string
  GetOutputs -> [string]

GetOutputs returns a list of short_names and pretty_names (possibly localizable, with #...# notation) of possible outputs.

Note

The above table could be merged with the current workflow_fsm_actions table, which only contains one possible new state, with a null output_short_name.

Conditional Transformation Based on Child Workflows

create table workflow_outcomes(
  outcome_id                integer
                            constraint ...
                            primary key,
  workflow_id               integer
                            constraint wf_outcomes_wf_fk
                            references workflows(workflow_id),
  short_name                varchar(100)
                            constraint wf_outcomes_short_name_nn
                            not null,
  pretty_name               varchar(200)
                            constraint wf_outcomes_pretty_name_nn
                            not null
);

create table workflow_fsm_states(
  ...
  -- If this is non-null, it implies that the case has completed with
  -- the given output, for use in determining the parent workflow's
  -- new state
  outcome                   integer
                            constraint ...
                            references workflow_outcomes(outcome_id),
  ...
);

Gated Actions

Requirements

An action does not become available until a given list of other actions have completed. The advanced version is that you can also specify for each of these other tasks how many times they must have been executed.

Also, an action can at most be executed a certain number of times.

Design

create table workflow_action_dependencies(
  action_id                 integer
                            constraint wf_action_dep_action_fk
                            references workflow_actions(action_id),
  dependent_on_action       integer
                            constraint wf_action_dep_dep_action_fk
                            references workflow_actions(action_id),
  min_n                     integer default 1,
  max_n                     integer,
  constraint wf_action_dep_act_dep_pk
  primary key (action_id, dependent_on_action)
);

When an action is about to be enabled, and before calling the CanEnableP callback, we check that each action it depends on has the required number of rows in workflow_case_enabled_actions with enabled_state 'completed'.

The second requirement, a maximum number of times an action can be executed, could be solved with a row in the above table that makes the action dependent upon itself, with the given max_n value.
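The dependency check can be sketched like this, in Python for illustration. The function name is hypothetical, the dependency rows are modeled as dictionaries, and the sketch assumes that max_n means "not enabled once the count has reached max_n", which is the reading the self-dependency trick requires.

```python
def dependencies_met(action_id, dependencies, completed_counts):
    """Check the gating rows for one action.

    dependencies:     rows shaped like workflow_action_dependencies
    completed_counts: {action_id: number of 'completed' rows for this case}
    """
    for dep in dependencies:
        if dep["action_id"] != action_id:
            continue
        n = completed_counts.get(dep["dependent_on_action"], 0)
        if dep["min_n"] is not None and n < dep["min_n"]:
            return False
        # Assumption: the action is blocked once the count has reached max_n.
        if dep["max_n"] is not None and n >= dep["max_n"]:
            return False
    return True


deps = [
    {"action_id": "close", "dependent_on_action": "review", "min_n": 2, "max_n": None},
    # Self-dependency: 'retry' may run at most three times.
    {"action_id": "retry", "dependent_on_action": "retry", "min_n": 0, "max_n": 3},
]
print(dependencies_met("close", deps, {"review": 1}))  # False
print(dependencies_met("retry", deps, {"retry": 2}))   # True
```

Under this reading a self-dependency row needs a min_n of 0 (or null), since the column default of 1 would require the action to have already run once before it could ever be enabled.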

Enable Condition Callback

Action.CanEnableP -> (CanEnableP): Gets called when an action is about to be enabled, and can be used to prevent the action from actually being enabled.

It is called after all database-driven enable preconditions have been met, i.e. FSM enabled-in-state, and "gated on"-conditions.

This will only get called once per case state change, so if the callback refuses to let the action become enabled, it will not be asked again until the next time an action is executed.

If the callback returns false, the enabled_state of the row in workflow_case_enabled_actions will be set to 'refused' (NOTE: Or the row will be deleted?).

Non-User Triggered Actions

Requirements

Some actions, for example those with child workflows, may not want to allow users to trigger them.

Design

create table workflow_actions(
  ...
  user_trigger_p          boolean default 't',
  ...
);

If user_trigger_p is false, we do not show the action on any user's task list.

Resolution Codes

Requirements

The bug-tracker has resolution codes under the "Resolve" action. It would be useful if these could be customized.

In addition, I saw one other dynamic-workflow product (TrackStudio) on the web, and they have the concept of resolution codes included. That made me realize that this is generally useful.

In general, a resolution code is a way of distinguishing different states, even though those states are identical in terms of the workflow process.

Currently, the code to make these happen is fairly clumsy, what with the "FormatLogTitle" callback which we invented.

Design

create sequence ...

create table workflow_action_resolutions(
  resolution_id           integer 
                          constraint wf_act_res_pk
                          primary key,
  action_id               integer
                          constraint wf_act_res_action_fk
                          references workflow_actions(action_id)
                          on delete cascade,
  sort_order              integer
                          constraint wf_act_res_sort_order_nn
                          not null,
  short_name              varchar(100)
                          constraint wf_act_res_short_name_nn
                          not null,
  pretty_name             varchar(200)
                          constraint wf_act_res_pretty_name_nn
                          not null
);

create index workflow_act_res_act_idx on workflow_action_resolutions(action_id);

create table workflow_action_res_output_map(
  action_id               integer
                          not null
                          references workflow_actions(action_id)
                          on delete cascade,
  acs_sc_impl_id          integer
                          not null
                          references acs_sc_impls(impl_id)
                          on delete cascade,
  output_value            varchar(4000),
  resolution_id           integer
                          not null
                          references workflow_action_resolutions(resolution_id)
                          on delete cascade
);

-- FK index on action_id
-- FK index on acs_sc_impl_id
-- FK index on resolution_id

Assignment Notifications

Requirements

When someone is assigned to an action, we want the notification email to say "You are now assigned to these tasks".

Design

We'd need to postpone the notifications until we have fully updated the workflow state to reflect the changed state, to determine who should get the normal notifications, and who should get personalized ones.

The notifications package doesn't support personalized notifications, but we could use acs-mail/acs-mail-lite to send them out instead, and exclude the recipients from the normal notifications if they have instant notifications set up.

Assignment Reminders

Requirements

We want to periodically send out email reminders with a list of actions the user is assigned to, asking them to come do something about it. There should be a link to a web page showing all these actions.

For each action we will list the action pretty-name, the name of the case object, the date it was enabled, the deadline, and a link to the action page, where they can do something about it.

Trying to Sum Up

Logic to Determine if Action is Enabled

Executed when any action in the workflow has been executed, to determine which actions are now enabled.

If there are any rows in workflow_case_enabled_actions for this case with enabled_state 'running', no actions can be enabled, so the action is not enabled.
Is the model-specific precondition met, e.g. are we in one of the action's enabled-in states? If not, the action is not enabled.
Are other preconditions met, e.g. if the action is gated on other actions having executed a minimum number of times, or itself having executed a maximum number of times? If not, the action is not enabled.
Execute the CanEnableP callback. If it returns false, the action is not enabled.
The action is enabled.

If the action is enabled:

If there are any rows in workflow_case_enabled_actions for this action with enabled_state of 'enabled', the action was already enabled before. Quit.
Otherwise start the "Enabled Action Logic" below.

If the action is not enabled:

If there are any rows in workflow_case_enabled_actions for this action with enabled_state of 'enabled', the action was enabled before. Update the row to set 'enabled_state' to 'canceled'.
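The decision steps above can be sketched as two small functions. This is illustrative Python, not the package's Tcl; the dictionary shapes, function names, and the `gates_met` parameter are assumptions made for the example. The rows of workflow_case_enabled_actions are reduced to a mapping from action name to enabled_state.

```python
def is_enabled(action, case, gates_met, can_enable_p=None):
    """Run the precondition checks in the order listed above."""
    # 1. A running action (one with active child cases) blocks everything.
    if "running" in case["enabled_states"].values():
        return False
    # 2. Model-specific precondition: FSM enabled-in states.
    if case["fsm_state"] not in action["enabled_in_states"]:
        return False
    # 3. Gating on other actions (or on the action itself).
    if not gates_met(action, case):
        return False
    # 4. The callback is consulted last, only if everything else passed.
    if can_enable_p is not None and not can_enable_p(action, case):
        return False
    return True


def reconcile(action, case, enabled_now):
    """Compare the fresh decision with the stored row and update it."""
    states = case["enabled_states"]
    was_enabled = states.get(action["name"]) == "enabled"
    if enabled_now and not was_enabled:
        states[action["name"]] = "enabled"
        return "start_enabled_action_logic"
    if not enabled_now and was_enabled:
        states[action["name"]] = "canceled"
        return "canceled"
    return "unchanged"


case = {"fsm_state": "open", "enabled_states": {}}
action = {"name": "resolve", "enabled_in_states": {"open"}}
print(reconcile(action, case, is_enabled(action, case, lambda a, c: True)))
```

Keeping the callback as the final step means application code is never asked about an action that the database-driven checks have already ruled out.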

Enabled Action Logic

Executed when an action which was previously not enabled becomes enabled.

Insert a row into workflow_case_enabled_actions with enabled_state = 'enabled', with the proper fire_timestamp: timeout = null -> fire_timestamp = null; timeout = 0 -> fire_timestamp = current_timestamp; timeout > 0 -> fire_timestamp = current_timestamp + timeout.
If the action has a timeout of 0, then call workflow::case::action::execute and quit.
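The three timeout cases reduce to one expression once null is handled separately. A minimal sketch in Python, modeling the timeout as a `timedelta`; the function names are hypothetical.

```python
from datetime import datetime, timedelta


def fire_timestamp(timeout, now):
    """Return when the action fires automatically, or None if it never does."""
    if timeout is None:
        return None
    # timeout == 0 yields `now`; a positive timeout yields a later time.
    return now + timeout


def fires_immediately(timeout):
    """True when the action should be executed as soon as it is enabled."""
    return timeout is not None and timeout == timedelta(0)


now = datetime(2003, 10, 1, 12, 0)
print(fire_timestamp(timedelta(days=7), now))  # 2003-10-08 12:00:00
```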

Un-Enabled Action Logic

Executed when an action which was previously enabled is no longer enabled, because the workflow's state was changed by some other action.

If the action has any child cases, these will be marked canceled.

Action Execute Logic

Executed when an enabled action is triggered.

If the action has a non-null child_workflow, create child cases. For each role which has a mapping_type of 'per_member' or 'per_user', create one case per member/user of that role. If more than one role has a per_member/per_user setting, then the cartesian product of child cases is created (DESIGN QUESTION: Would this ever be relevant?)
If there is any ActionEnabled callback, execute that (only the first, if multiple exist), and use workflow_action_fsm_output_map to determine which new state to bump the workflow to, if any.

Child Case State Changed Logic

We execute the OnChildCaseStateChange callback, if any. This gets to determine whether the parent action is now complete and should fire.

We provide a default implementation, which simply checks whether all child cases are in the 'completed' state, and if so, fires.

NOTE: What do we do if any of the child cases are canceled? Consider them complete and move on with the parent workflow? Cancel the parent workflow?

On Fire Logic

When the action finally fires.

If there's any OnFire callback defined, we execute this.

If the callback has output values defined, we use the mappings in workflow_action_fsm_output_map to determine which state to move to.

After firing, we execute the SideEffect callbacks and send off notifications.

DESIGN QUESTION: How do we handle notifications for child cases? We should consider the child case part of the parent in terms of notifications, so when a child action executes, we notify those who have requested notifications on the parent. And when the last child case completes, which will also complete the parent action, we should avoid sending out duplicate notifications. How?

Callback Types

(Not needed) Action.OnEnable -> (output): Gets called when an action is enabled. Output can be used to determine the new state of the case (see below), in particular for an in-progress state.
(Not needed) Action.OnUnEnable: Gets called when an action that used to be enabled is no longer enabled. Is not called when the action fires.
(Not needed) Action.OnChildCaseStateChange -> (output, CompleteP): Called when a child changes its case state (active/completed/canceled/suspended). Returns whether the parent action has now completed. Output can be used to determine the new state of the case (see below).

NOTE: Cloning

We need to update the new_from_spec and generate_spec procedures to output and parse all the new properties from this spec which get implemented.

Implemented: Timers

Requirements

Use cases:

A student has one week to send a document to another role. If he/she fails to do so, a default action executes.
An OpenACS OCT member has one week to vote on a TIP. If he/she does not vote within that week, a default "Abstain" action is executed.

The timer will always be of the form "This action will automatically execute x amount of time after it becomes enabled". If it is later un-enabled (disabled) because another action (e.g. a vote action in the second use case above) was executed, then the timer will be reset. If the action later becomes enabled, the timer will start anew.

Design

We currently do not have any information on which actions are enabled, and when they're enabled. We will probably need a table, perhaps one just for timed actions, in which a row is created when a timed action is enabled, and the row is deleted again when the state changes.

Extending workflow_actions

create table workflow_actions(
    ...
    -- The number of seconds after having become enabled the action
    -- will automatically execute
    timeout                 interval
    ...
);

DESIGN NOTE: The 'interval' datatype is not supported in Oracle.

The Enabled Actions Table

create table workflow_case_enabled_actions(
    case_id                 integer
                            constraint wf_case_enbl_act_case_id_nn
                            not null
                            constraint wf_case_enbl_act_case_id_fk
                            references workflow_cases(case_id)
                            on delete cascade,
    action_id               integer
                            constraint wf_case_enbl_act_action_id_nn
                            not null
                            constraint wf_case_enbl_act_action_id_fk
                            references workflow_actions(action_id)
                            on delete cascade,
    -- the timestamp when this action fires
    execution_time          timestamptz
                            constraint wf_case_enbl_act_timeout_nn
                            not null,
    constraint workflow_case_enabled_actions_pk
    primary key (case_id, action_id)
);

The Logic

After executing an action, workflow::case::action::execute will:

Delete all actions from workflow_case_enabled_actions which are no longer enabled.
If the timeout is zero, execute immediately.
Insert a row for all enabled actions with timeouts which are not already in workflow_case_enabled_actions, with fire_timestamp = current_timestamp + workflow_actions.timeout_seconds.

NOTE: We need to keep running, so if another automatic action becomes enabled after this action fires, it will fire as well.

The Sweeper

The sweeper will find rows in workflow_case_enabled_actions with fire_timestamp < current_timestamp, ordered by fire_timestamp, and execute them.

It should do a query to find the action to fire first, then release the db-handle and execute it. Then do a fresh query to find the next, etc. That way we will handle the situation correctly where the first action firing causes the second action to no longer be enabled.
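The query-one, execute-one loop can be sketched as below. This is illustrative Python with an in-memory list standing in for workflow_case_enabled_actions and plain numbers standing in for timestamps; the function names are hypothetical.

```python
def make_table_query(table):
    """Build a 'find the first due row' query over an in-memory table."""
    def find_next_due(now):
        due = [r for r in table if r["fire_timestamp"] < now]
        return min(due, key=lambda r: r["fire_timestamp"]) if due else None
    return find_next_due


def sweep(find_next_due, execute, now):
    """Fire due actions one at a time, re-querying after each execution.

    `execute` is expected to remove or complete the row it fires, along with
    any rows it un-enables; otherwise the same row would be found again.
    """
    fired = []
    while True:
        row = find_next_due(now)  # fresh query on every iteration
        if row is None:
            return fired
        execute(row)
        fired.append(row)


table = [
    {"case_id": 1, "action_id": "abstain", "fire_timestamp": 10},
    {"case_id": 1, "action_id": "remind", "fire_timestamp": 20},
    {"case_id": 2, "action_id": "abstain", "fire_timestamp": 99},
]


def execute(row):
    # In this example, firing any action on a case un-enables its other actions.
    table[:] = [r for r in table if r["case_id"] != row["case_id"]]


fired = sweep(make_table_query(table), execute, now=50)
print([r["action_id"] for r in fired])  # ['abstain']
```

Had the sweeper fetched both due rows up front, it would have tried to fire 'remind' on case 1 after 'abstain' had already un-enabled it.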

The Optimization

Every time the sweeper runs, at least one DB query will be made, even if there are no timed actions to be executed.

Possible optimizations:

We keep an NSV with the timestamp (in [clock seconds] format) and (case_id, action_id) of the first action to fire. That way, the sweeper need not hit the DB at all most of the time. When a new timed action is inserted, we compare with the NSV, and update if the new action fires before the old action. When the timed action referred to in the NSV is either deleted because it gets un-enabled, or executed, we'll clear the NSV, causing the next hit to the sweeper to execute the query to find the (case_id, action_id, fire_timestamp) of the first action to fire. Finally, we would need an NSV value to represent the fact that there are no rows in this table, so we don't keep executing the query in that case.
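A rough sketch of that caching scheme follows. It is written in Python with an ordinary object standing in for the NSV, so it says nothing about locking or AOLserver specifics; the class, method, and sentinel names are all hypothetical.

```python
UNKNOWN = "unknown"  # cache was cleared; the DB must be consulted
EMPTY = "empty"      # the table is known to hold no timed actions


class NextFireCache:
    def __init__(self, query_first):
        # query_first() returns (fire_time, case_id, action_id) or None.
        self._query_first = query_first
        self._value = UNKNOWN
        self.queries = 0  # counts DB hits, for demonstration only

    def on_insert(self, fire_time, case_id, action_id):
        if self._value == UNKNOWN:
            return  # the next sweep will re-query anyway
        if self._value == EMPTY or fire_time < self._value[0]:
            self._value = (fire_time, case_id, action_id)

    def on_remove(self, case_id, action_id):
        # Only clearing the cached row forces a re-query.
        cached = self._value not in (UNKNOWN, EMPTY)
        if cached and self._value[1:] == (case_id, action_id):
            self._value = UNKNOWN

    def due(self, now):
        """Return the first action to fire if it is due, else None."""
        if self._value == UNKNOWN:
            self.queries += 1
            first = self._query_first()
            self._value = EMPTY if first is None else first
        if self._value == EMPTY or self._value[0] >= now:
            return None
        return self._value


rows = []
cache = NextFireCache(lambda: min(rows) if rows else None)
cache.due(100)
cache.due(100)
print(cache.queries)  # 1: the second sweep is answered from the cache
```

The separate EMPTY marker is what keeps an idle system from re-running the query on every sweep, as the last sentence above calls for.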