Complex symbolic sequence clustering and multiple classifiers for predictive process monitoring

Verenich, Ilya, Dumas, Marlon, La Rosa, Marcello, Maggi, Fabrizio Maria, & Di Francescomarino, Chiara (2016) Complex symbolic sequence clustering and multiple classifiers for predictive process monitoring. In Business Process Management Workshops: BPM 2015, 13th International Workshops, Revised Papers (Lecture Notes in Business Information Processing, Volume 256), Springer, Innsbruck, Austria, pp. 218-229.

This paper addresses the following predictive business process monitoring problem: given the execution trace of an ongoing case, and given a set of traces of historical (completed) cases, predict the most likely outcome of the ongoing case. In this context, a trace is a sequence of events with corresponding payloads, where a payload consists of a set of attribute-value pairs. An outcome is a label associated with a completed case, for example a label indicating whether the case completed “on time” (with respect to a given desired duration) or “late”, or whether it led to a customer complaint. The paper tackles this problem with a two-phase approach. In the first phase, prefixes of historical cases are encoded as complex symbolic sequences and clustered. In the second phase, a classifier is built for each cluster. To predict the outcome of an ongoing case at runtime from its (incomplete) trace, we select the cluster(s) closest to that trace and apply the corresponding classifier(s), taking into account the Euclidean distance of the trace from the center of each cluster. We consider two families of clustering algorithms – hierarchical clustering and k-medoids – and use random forests for classification. The approach was evaluated on four real-life datasets.
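The two-phase idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: prefixes are taken to be already encoded as numeric vectors (the complex symbolic sequence encoding is not reproduced here), the k-medoids routine is a simplified version with a deterministic start, and a per-cluster label count stands in for the random forests used in the paper. All function names (`k_medoids`, `train`, `predict`) and the sample data are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points, medoids):
    """Group point indices by their nearest medoid."""
    clusters = [[] for _ in medoids]
    for idx, p in enumerate(points):
        nearest = min(range(len(medoids)), key=lambda c: euclidean(p, medoids[c]))
        clusters[nearest].append(idx)
    return clusters


def k_medoids(points, k, iterations=10):
    """Simplified k-medoids (assumption: the first k points seed the medoids)."""
    medoids = list(points[:k])
    for _ in range(iterations):
        clusters = assign(points, medoids)
        updated = []
        for c, members in enumerate(clusters):
            if not members:
                updated.append(medoids[c])  # keep the medoid of an empty cluster
                continue
            # New medoid: the member with the smallest total distance to the others.
            best = min(members, key=lambda i: sum(euclidean(points[i], points[j])
                                                  for j in members))
            updated.append(points[best])
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(points, medoids)


def train(points, labels, k):
    """Phase 1: cluster the encoded prefixes. Phase 2: one model per cluster.

    Stand-in model: a count of outcome labels per cluster (not a random forest).
    """
    medoids, clusters = k_medoids(points, k)
    models = [Counter(labels[i] for i in members) for members in clusters]
    return medoids, models


def predict(vector, medoids, models, closest=1):
    """Combine the models of the closest clusters, weighted by inverse distance."""
    ranked = sorted(range(len(medoids)), key=lambda c: euclidean(vector, medoids[c]))
    votes = Counter()
    for c in ranked[:closest]:
        weight = 1.0 / (euclidean(vector, medoids[c]) + 1e-9)
        total = sum(models[c].values())
        for label, count in models[c].items():
            votes[label] += weight * count / total
    return votes.most_common(1)[0][0]


# Hypothetical encoded prefixes of completed cases and their outcomes.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["on time", "on time", "on time", "late", "late", "late"]

medoids, models = train(points, labels, k=2)
print(predict((1.1, 1.0), medoids, models))             # single closest cluster
print(predict((4.9, 5.0), medoids, models, closest=2))  # two clusters, distance-weighted
```

Setting `closest` above 1 mirrors the abstract's option of consulting several nearby clusters. The inverse-distance weighting is one plausible way to take the Euclidean distance into account, not necessarily the scheme used in the paper.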



ID Code: 91194
Item Type: Conference Paper
Refereed: Yes
Additional URLs:
DOI: 10.1007/978-3-319-42887-1_18
ISBN: 978-3-319-42886-4
Subjects: Australian and New Zealand Standard Research Classification > INFORMATION AND COMPUTING SCIENCES (080000) > INFORMATION SYSTEMS (080600)
Divisions: Past > QUT Faculties & Divisions > Faculty of Science and Technology
Current > Schools > School of Information Systems
Current > QUT Faculties and Divisions > Science & Engineering Faculty
Copyright Owner: Copyright 2015 [Please consult the author]
Deposited On: 13 Dec 2015 23:17
Last Modified: 25 Oct 2016 05:31
