In an ongoing industry-university collaboration, we are developing a language-parametric framework for mining code idioms in legacy systems. This modular framework has a pipeline architecture and a language-parametric meta representation of the artefacts used by each of its five components: source code importer, mining preprocessor, pattern miner, pattern matcher, and modernisation assistant. The pipeline enables reuse of its components across systems and languages, and allows project partners to work on each component separately. For example, novel pattern mining techniques can be explored independently of the languages to which they will be applied and of the modernisation assistant in which they will be used. Our first results on mining Java and COBOL code are promising, although challenges remain in making the framework and its constituent components truly scalable, customisable, and language independent.
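The five-stage pipeline described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the framework's actual design: the `Node` class stands in for the language-parametric meta representation, `toy_importer` replaces real Java and COBOL parsers with a keyword lookup, and `mine` simply counts identical statement shapes rather than applying the pattern mining techniques explored in the project. The point is only that stages two to five operate on the shared representation, so swapping the importer is enough to change the source language.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical language-neutral tree node (stand-in meta representation)."""
    kind: str
    children: List["Node"] = field(default_factory=list)


def shape(node: Node) -> Tuple:
    """Serialise a subtree into a hashable, comparable shape of node kinds."""
    return (node.kind, tuple(shape(child) for child in node.children))


def toy_importer(keywords: Dict[str, str]) -> Callable[[str], Node]:
    """Stage 1 (assumed): build an importer mapping language tokens to neutral kinds."""
    def importer(source: str) -> Node:
        statements = []
        for line in source.splitlines():
            tokens = line.split()
            if tokens:
                kinds = [keywords.get(token, "other") for token in tokens]
                statements.append(Node("statement", [Node(k) for k in kinds]))
        return Node("unit", statements)
    return importer


def preprocess(tree: Node) -> List[Node]:
    """Stage 2 (assumed): select the subtrees to mine, here one per statement."""
    return tree.children


def mine(statements: List[Node], min_support: int) -> List[Tuple]:
    """Stage 3 (assumed): keep shapes occurring at least min_support times."""
    counts = Counter(shape(s) for s in statements)
    return [pattern for pattern, n in counts.items() if n >= min_support]


def match(pattern: Tuple, statements: List[Node]) -> List[int]:
    """Stage 4 (assumed): return indices of statements with exactly this shape."""
    return [i for i, s in enumerate(statements) if shape(s) == pattern]


def assist(pattern: Tuple, hits: List[int]) -> str:
    """Stage 5 (assumed): report where a mined pattern occurs."""
    return f"pattern {pattern} occurs at statements {hits}"


def run_pipeline(source: str, importer: Callable[[str], Node],
                 min_support: int = 2) -> List[str]:
    """Chain the five stages; only the importer depends on the source language."""
    statements = preprocess(importer(source))
    return [assist(p, match(p, statements)) for p in mine(statements, min_support)]


# Two hypothetical importers that share every downstream stage.
java_like = toy_importer({"if": "branch", "return": "exit"})
cobol_like = toy_importer({"IF": "branch", "GOBACK": "exit"})

if __name__ == "__main__":
    for report in run_pipeline("if x\nreturn y\nif z", java_like):
        print(report)
    for report in run_pipeline("IF X\nGOBACK\nIF Z", cobol_like):
        print(report)
```

Because the miner, matcher, and assistant only see `Node` trees, either of them could be replaced in isolation, which mirrors the separation of work among project partners mentioned in the abstract.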
Original language: English
Title of host publication: Seminar on Advanced Techniques & Tools for Software Evolution
Number of pages: 6
Publication status: Published - 2019
Event: 12th Seminar on Advanced Techniques & Tools for Software Evolution - Free University of Bozen-Bolzano, Bolzano, Italy
Duration: 8 Jul 2019 → …
Conference number: 12


Conference: 12th Seminar on Advanced Techniques & Tools for Software Evolution
Abbreviated title: SATToSE
Period: 8/07/19 → …

    Research areas

  • Pattern Mining, Frequent Tree Mining, Source Code Regularities

ID: 46729089