Process Visualisation

Introduction

This article gives you an overview of some visualisations that you can generate with rprom.

We use the eventdataR package to get sample data for visualisation. This package provides several event log datasets that we can draw on.

First of all, we build an empty Transition System object:

library(rprom)  # provides the TransitionSystem class
tsobj = new('TransitionSystem')

Now, we pick an event log from the eventdataR package and feed its data to our object:

library(dplyr)  # provides %>% and filter() used below
traffic_fines_df = eventdataR::traffic_fines %>% as.data.frame()

traffic_fines_df %>% head %>% knitr::kable()

case_id	activity	lifecycle	resource	timestamp	amount	article	dismissal	expense	lastsent	matricola	notificationtype	paymentamount	points	totalpaymentamount	vehicleclass	activity_instance_id	.order
A1	Create Fine	complete	561	2006-07-24	350	157	NIL	NA	NA	NA	NA	NA	0	0.0	A	1	1
A1	Send Fine	complete	NA	2006-12-05	NA	NA	NA	110	NA	NA	NA	NA	NA	NA	NA	2	2
A100	Create Fine	complete	561	2006-08-02	350	157	NIL	NA	NA	NA	NA	NA	0	0.0	A	3	3
A100	Send Fine	complete	NA	2006-12-12	NA	NA	NA	110	NA	NA	NA	NA	NA	NA	NA	4	4
A100	Insert Fine Notification	complete	NA	2007-01-15	NA	NA	NA	NA	P	NA	P	NA	NA	NA	NA	5	5
A100	Add penalty	complete	NA	2007-03-16	715	NA	NA	NA	NA	NA	NA	NA	NA	NA	NA	6	6

We treat the completion of an activity as the status of a case, so we keep only the events whose lifecycle is complete. You also need to tell the data feeder which columns hold the case ID, the status and the start time:

tsobj$feed.transition_events(traffic_fines_df %>% filter(lifecycle == 'complete'),
                             caseID_col = 'case_id',
                             status_col = 'activity',
                             startTime_col = 'timestamp')

Process maps:

The first and most valuable process visualisation is a process map.

A process map is a directed graph containing nodes and edges. Each node is associated with a status and each edge denotes a transition from one status to another.

Method plot.process.map can be used to plot the process map for the transition system:

tsobj$plot.process.map()

## 
##  Aggregating nodes ...Done! 
## 
##  Aggregating links ...Done!

Process Map Measures:

Rate

Each edge (link or transition) in the process map has a source status and a destination status. By default, method plot.process.map uses rate as the measure shown on the edges. On each edge, you will see the rate (ratio) of transitions from the source status to the destination status, as a percentage of all outgoing transitions from the source status. The number shown on each node (status) is the total number of incoming transitions to that status.

Frequency

When you choose frequency as the measure, the incoming edges of each status are annotated with the frequencies of entries to that status, while the outgoing edges are annotated with the exit frequencies (transitions from that status to other statuses). Select this option by setting argument measure to freq.
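To make the rate and frequency measures concrete, here is a minimal base R sketch that derives them by hand from a tiny event log. The data and all variable names are hypothetical, and this is not how rprom computes its measures internally; it only illustrates the definitions given above.

```r
# Toy event log (hypothetical data, not taken from traffic_fines)
events <- data.frame(
  case_id   = c("c1", "c1", "c1", "c2", "c2", "c3", "c3", "c3"),
  status    = c("Create Fine", "Send Fine", "Payment",
                "Create Fine", "Payment",
                "Create Fine", "Send Fine", "Add penalty"),
  timestamp = as.Date(c("2006-01-01", "2006-01-10", "2006-02-01",
                        "2006-01-02", "2006-01-05",
                        "2006-01-03", "2006-01-12", "2006-03-15")),
  stringsAsFactors = FALSE
)

# Order events by case and time, then pair each event with the next
# event of the same case to obtain the transitions
events <- events[order(events$case_id, events$timestamp), ]
n <- nrow(events)
same_case <- events$case_id[-n] == events$case_id[-1]
transitions <- data.frame(
  source      = events$status[-n][same_case],
  destination = events$status[-1][same_case],
  stringsAsFactors = FALSE
)

# Frequency measure: number of transitions per source -> destination pair
edge_counts <- as.data.frame(table(transitions), stringsAsFactors = FALSE)
edge_counts <- edge_counts[edge_counts$Freq > 0, ]

# Rate measure: share of the source's outgoing transitions on each edge
out_totals <- tapply(edge_counts$Freq, edge_counts$source, sum)
edge_counts$rate <- edge_counts$Freq / as.numeric(out_totals[edge_counts$source])

# Node value: total incoming transitions per status
node_incoming <- tapply(edge_counts$Freq, edge_counts$destination, sum)

print(edge_counts)
print(node_incoming)
```

In this toy log, two of the three transitions leaving Create Fine go to Send Fine, so that edge carries a frequency of 2 and a rate of 2/3.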

Time

Besides frequencies, you can also use time as the measure to focus on the durations of statuses and transitions. You can specify the time unit via argument time_unit:

tsobj$plot.process.map(measure = 'time', time_unit = 'hour')

The value shown on each status (node) is the average time that cases linger in that status. The value shown on each edge is the average transition time, that is, the average time spent in the source status before moving to the destination status.
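The difference between the node and edge values of the time measure can be sketched in base R on a small hypothetical event log. The data and variable names below are illustrative assumptions, and rprom's own aggregation may differ in details such as how the last status of a case is handled.

```r
# Toy event log with timestamps (hypothetical data)
events <- data.frame(
  case_id   = c("c1", "c1", "c1", "c2", "c2", "c2", "c3", "c3"),
  status    = c("A", "B", "C", "A", "B", "C", "A", "C"),
  timestamp = as.POSIXct(c("2006-01-01 00:00:00", "2006-01-01 10:00:00",
                           "2006-01-02 00:00:00", "2006-01-05 00:00:00",
                           "2006-01-05 20:00:00", "2006-01-06 02:00:00",
                           "2006-01-07 00:00:00", "2006-01-07 05:00:00"),
                         tz = "UTC"),
  stringsAsFactors = FALSE
)

# Pair consecutive events of the same case and measure the time in between
events <- events[order(events$case_id, events$timestamp), ]
n <- nrow(events)
same_case <- events$case_id[-n] == events$case_id[-1]
gap_hours <- as.numeric(difftime(events$timestamp[-1], events$timestamp[-n],
                                 units = "hours"))
transitions <- data.frame(
  source      = events$status[-n][same_case],
  destination = events$status[-1][same_case],
  hours       = gap_hours[same_case],
  stringsAsFactors = FALSE
)

# Edge value: average time in the source status before moving to the destination
edge_time <- aggregate(hours ~ source + destination, data = transitions, FUN = mean)

# Node value: average time in a status regardless of where the case goes next.
# The final status of each case has no observed exit, so it is not included.
node_time <- aggregate(hours ~ source, data = transitions, FUN = mean)

print(edge_time)
print(node_time)
```

Here status A lasts 10, 20 and 5 hours across the three cases, so its node value is about 11.7 hours, while the edge from A to B averages only the two cases that actually moved to B (15 hours).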

Case Filtering:

Showing every transition in the process map can make it look noisy and complicated. We may need to filter out cases with unusual traces to get a clearer picture of the process.

The most common filtering is to filter out cases which have uncommon transitions. These are what we call noisy cases. These cases follow traces different from the most frequent traces of the process and thus deviate from the main process.

Case filtering can be done via method set.filter.case. To filter out noisy cases, set argument freq_rate_cut to a value between 0 and 1. Usually, a value close to 1.0 is chosen. For example, if you set it to 0.9, the 10% of cases with the least frequent traces will be eliminated:

tsobj$set.filter.case(freq_rate_cut = 0.9)

tsobj$plot.process.map()

## 
##  Aggregating nodes ...Done! 
## 
##  Aggregating links ...Done!

Once you apply a case filter to your object, all measures and charts are affected by it until you run method reset.filters or run set.filter.case again with different filtering arguments. For complete information about case filtering, refer to the package API reference.
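To illustrate the idea behind a frequency rate cut, the following base R sketch keeps the most frequent traces until they cover the requested share of cases. The data, the variable names and the exact cut-off rule (for example, how ties are handled) are assumptions made for illustration; rprom's own implementation may behave differently.

```r
# Ten hypothetical cases: six follow A > B, three follow A > C > B, one follows A > D
traces <- c(rep(list(c("A", "B")), 6),
            rep(list(c("A", "C", "B")), 3),
            list(c("A", "D")))
events <- data.frame(
  case_id = rep(paste0("c", seq_along(traces)), lengths(traces)),
  status  = unlist(traces),
  stringsAsFactors = FALSE
)

# Collapse the ordered statuses of each case into a single trace string
case_traces <- tapply(events$status, events$case_id, paste, collapse = " > ")

# Rank traces by how many cases follow them and accumulate their share of cases
trace_freq <- sort(table(case_traces), decreasing = TRUE)
cum_share  <- cumsum(as.numeric(trace_freq)) / length(case_traces)

# Keep the most frequent traces that together cover at most the cut
# (a small tolerance guards against floating point rounding)
freq_rate_cut <- 0.9
kept_traces <- names(trace_freq)[cum_share <= freq_rate_cut + 1e-9]
kept_cases  <- names(case_traces)[case_traces %in% kept_traces]
filtered_events <- events[events$case_id %in% kept_cases, ]

print(kept_traces)
print(length(kept_cases))
```

With a cut of 0.9, the two common traces cover 90% of the cases and are kept, while the single case following A > D is dropped as noise.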

Process Map Plotters:

rprom currently supports two plotters for rendering the process map graph: grviz and visNetwork. The default plotter is grviz, which is based on Graphviz and is available via the R package rvis. To use the visNetwork plotter, you need to have the R package visNetwork installed:

if(!require(visNetwork)){install.packages('visNetwork')}

visNetwork plots become noisy and unreadable when the graph is large. It is therefore best to apply a frequency rate cut before plotting, to filter out cases with low-frequency traces and simplify the process:

tsobj$reset.filters()
tsobj$set.filter.case(freq_rate_cut = 0.8)

tsobj$plot.process.map(plotter = 'visNetwork')

## 
##  Aggregating nodes ...Done! 
## 
##  Aggregating links ...Done!

## Loading required package: visNetwork

The layout of the plot generated by grviz looks better. On the other hand, visNetwork lets you drag the nodes and enable physics.

You can customize your plot by specifying parameters to the config argument.

For example, for the visNetwork plot, you can select hierarchical layout, change direction, edge smoothing type and enable physics for nodes and edges:

tsobj$reset.plots()
customized = list(layout = 'hierarchical', node.physics.enabled = TRUE, direction = 'left.right',
                  link.smooth = list(enabled = TRUE, type = 'curvedCCW'))

tsobj$plot.process.map(plotter = 'visNetwork', config = customized, width = "800px", height = "1200px")

Refer to the documentation of the rvis package to see how to specify argument config in order to customize your plot. rvis uses a unified configuration format for all the plotters it supports. You can also use visNetwork functions directly to customize your plot. For example, you can change the edge arrow sizes:

tsobj$reset.plots()

tsobj$plot.process.map(plotter = 'visNetwork') %>% 
  visNetwork::visEdges(arrows = list(to = list(enabled = TRUE, scaleFactor = 1.5)))

For complete instructions, refer to the documentation of the visNetwork package.

Process Sankey

A Sankey chart is another type of visualisation, which shows the process with more emphasis on its flows.

tsobj %>% plot_process_sankey()

## Loading required package: networkD3

Process Tree

Another type of visualisation ideal for showing a process is an interactive Sankey-Tree.

To generate this visualisation, you will need to install the packages treemap, d3r and sankeytreeR. The code below installs treemap and d3r with install.packages; sankeytreeR needs to be installed from the sankeytree repository on GitHub:

if(!require(treemap)){install.packages('treemap')}

## Loading required package: treemap

if(!require(d3r)){install.packages('d3r')}

## Loading required package: d3r

if(!require(sankeytreeR)){devtools::install_github('https://github.com/timelyportfolio/sankeytree.git')}

## Loading required package: sankeytreeR

tsobj %>% plot_process_tree()

## 
##  Aggregating traces ...Done!

Nima Ramezani