Tampere University of Technology

Automating transformations in data vault data warehouse loads

Research output: Chapter in Book/Report/Conference proceeding > Chapter > Scientific > peer-review

Details

Original language: English
Title of host publication: Information Modelling and Knowledge Bases XXVIII
Publisher: IOS Press
Pages: 215-230
Number of pages: 16
Volume: 292
ISBN (Electronic): 9781614997191
Publication status: Published - Dec 2016
Publication type: A3 Part of a book or another research book
Event: International conference on information modelling and knowledge bases
Duration: 1 Jan 2000 → …

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 292
ISSN (Print): 0922-6389

Conference

Conference: International conference on information modelling and knowledge bases
Period: 1/01/00 → …

Abstract

Data warehousing is the process of integrating multiple data sources into one for, e.g., reporting purposes. An emerging modeling technique for this is the data vault method. Using data vault produces many structurally similar data processing modifications in the transform phase of ETL work. Is it possible to automate the creation of these transformations? Based on our study, the answer is mostly affirmative. Data vault modeling imposes certain constraints on data warehouse entities. These model constraints, together with the principles for populating data vault tables, can be used to generate transformation code. The populating principles can be gathered from the original relational database model and data flow metadata, and can then be used to create general templates for each entity. Nevertheless, the use of data flow metadata can be only partially automated, and it involves the only manual work phases in the process. In the end, the actual transformation code can be generated automatically. In this paper, we describe in detail how the automation procedure is created and analyze the practical problems based on our experience with a PL/SQL proof-of-concept implementation. To the best of our knowledge, a similar approach has not yet been described in the scientific literature.
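
To make the idea of template-based generation concrete, the following is a minimal sketch in Python rather than the PL/SQL used in the proof of concept. It is not the authors' implementation: the template text, the metadata keys (`hub`, `business_key`, `source_table`, `source_column`, `record_source`), and the table names are all hypothetical. It only illustrates how one general template per entity type, filled in from metadata, could yield a load statement for a data vault hub.

```python
from string import Template

# Hypothetical template for loading a data vault hub table.
# The column layout (business key, load_date, record_source) is an
# assumption for illustration, not taken from the paper.
HUB_TEMPLATE = Template(
    "INSERT INTO $hub ($business_key, load_date, record_source)\n"
    "SELECT DISTINCT s.$source_column, CURRENT_TIMESTAMP, '$record_source'\n"
    "FROM $source_table s\n"
    "WHERE NOT EXISTS (\n"
    "    SELECT 1 FROM $hub h WHERE h.$business_key = s.$source_column\n"
    ");"
)


def generate_hub_load(meta: dict) -> str:
    """Fill the hub template from one entity's metadata.

    Raises KeyError if a required metadata field is missing, which
    mirrors the point that incomplete metadata needs manual attention.
    """
    return HUB_TEMPLATE.substitute(meta)


# Hypothetical metadata for a single entity, as it might be gathered
# from the source relational model and data flow descriptions.
customer_meta = {
    "hub": "hub_customer",
    "business_key": "customer_no",
    "source_table": "stg_customer",
    "source_column": "cust_no",
    "record_source": "CRM",
}

if __name__ == "__main__":
    print(generate_hub_load(customer_meta))
```

Because every hub follows the same structural pattern, only the metadata dictionary changes from one entity to the next; links and satellites would need templates of their own under the same scheme.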

Keywords

  • code generation, data vault, database modeling, ELT, ETL
