Tampere University of Technology

TUTCRIS Research Portal

Mining itemset-based distinguishing sequential patterns with gap constraint

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Scientific, peer-review

Details

Original language: English
Title of host publication: Database Systems for Advanced Applications - 20th International Conference, DASFAA 2015, Hanoi, Vietnam, April 20-23, 2015, Proceedings, Part I
Publisher: Springer Verlag
Pages: 39-54
Number of pages: 16
Volume: 9049
ISBN (Print): 9783319181196
Publication status: Published - 2015
Publication type: A4 Article in a conference publication
Event: 20th International Conference on Database Systems for Advanced Applications, DASFAA 2015 - Hanoi, Viet Nam
Duration: 20 Apr 2015 to 23 Apr 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9049
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 20th International Conference on Database Systems for Advanced Applications, DASFAA 2015
Country: Viet Nam
City: Hanoi
Period: 20/04/15 to 23/04/15

Abstract

Mining contrast sequential patterns, which are sequential patterns that characterize a given sequence class and distinguish that class from another given sequence class, has a wide range of applications including medical informatics, computational finance and consumer behavior analysis. In previous studies on contrast sequential pattern mining, each element in a sequence is a single item or symbol. This paper considers a more general case where each element in a sequence is a set of items. The associated contrast sequential patterns are called itemset-based distinguishing sequential patterns (itemset-DSPs). After discussing the challenges of mining itemset-DSPs, we present iDSP-Miner, a mining method with various pruning techniques, for mining itemset-DSPs that satisfy given support and gap constraints. In this study, we also propose a concise border-like representation (with exclusive bounds) for sets of similar itemset-DSPs and use that representation to improve the efficiency of our proposed algorithm. Our empirical study using both real data and synthetic data demonstrates that iDSP-Miner is effective and efficient.
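
To make the notion of an itemset-based pattern with a gap constraint concrete, the following is a minimal Python sketch. It is not iDSP-Miner and includes none of its pruning or its border-like representation; it only checks whether one candidate pattern is distinguishing. The function names, the thresholds `alpha` and `beta`, and the reading of "gap" as the number of elements skipped between consecutive matched elements are assumptions made for illustration, not definitions taken from the paper.

```python
from typing import FrozenSet, Sequence, Tuple

Itemset = FrozenSet[str]


def matches(pattern: Sequence[Itemset], seq: Sequence[Itemset],
            gap: Tuple[int, int]) -> bool:
    """Return True if `pattern` occurs in `seq` with every gap in [gap_min, gap_max].

    A pattern element matches a sequence element when it is a subset of it.
    The gap between two matched positions i < j is assumed to be j - i - 1.
    """
    gap_min, gap_max = gap
    if not pattern:
        return True
    # Positions where the first pattern element can be matched.
    ends = [i for i, elem in enumerate(seq) if pattern[0] <= elem]
    for p in pattern[1:]:
        # Keep positions j that extend some earlier partial match within the gap bounds.
        ends = [j for j, elem in enumerate(seq)
                if p <= elem and any(gap_min <= j - i - 1 <= gap_max for i in ends)]
        if not ends:
            return False
    return bool(ends)


def support(pattern, dataset, gap) -> float:
    """Fraction of sequences in `dataset` that contain `pattern` under `gap`."""
    return sum(matches(pattern, s, gap) for s in dataset) / len(dataset)


def is_distinguishing(pattern, pos, neg, gap, alpha, beta) -> bool:
    """Assumed criterion: frequent in `pos` (>= alpha), infrequent in `neg` (<= beta)."""
    return support(pattern, pos, gap) >= alpha and support(pattern, neg, gap) <= beta


# Toy data (hypothetical): each sequence is a list of itemsets.
fs = frozenset
pos = [[fs("ab"), fs("c"), fs("ad")], [fs("a"), fs("b"), fs("d")], [fs("ab"), fs("d")]]
neg = [[fs("a"), fs("c"), fs("c"), fs("d")], [fs("b"), fs("c")]]
pattern = [fs("a"), fs("d")]

print(is_distinguishing(pattern, pos, neg, gap=(0, 1), alpha=0.6, beta=0.2))
```

In this toy data the pattern occurs in every positive sequence with at most one skipped element, while the only negative sequence containing both items needs a gap of two, so the pattern is accepted under the assumed thresholds.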

Keywords

  • Contrast mining, Itemset, Sequential pattern