Supporting Topic Map Creation Using Data Mining Techniques
Keywords:
information systems development, data mining, Term Crawling, topic map, Clustering Hierarchy Projection
Abstract
There is increasing interest in automating the creation of semantic structures, especially topic maps, by taking advantage of existing structured information resources. This article gives an overview of the most popular method, which is based on RDF triples, and suggests a way to automate topic map creation from unstructured information sources. The method can be applied in the information systems development domain when analysing vast unstructured data repositories in preparation for system design, or when migrating large amounts of unstructured data from legacy systems. The paper presents two innovative methods, Term Crawling (TC) and Clustering Hierarchy Projection (CHP), which are applied to build a topic map from free-text documents held in local repositories and downloaded from the Internet. Both methods originate from data mining techniques for knowledge discovery. A sample tool that uses the described techniques has been implemented. Preliminary results achieved on the test collection are presented in the concluding sections of the article.
How to Cite
Abramowicz, W., Kaczmarek, T., & Kowalkiewicz, M. (1). Supporting Topic Map Creation Using Data Mining Techniques. Australasian Journal of Information Systems, 11(1). https://doi.org/10.3127/ajis.v11i1.147
Section
Research on Information Systems Development
Copyright (c) 1969 Witold Abramowicz, Tomasz Kaczmarek, Marek Kowalkiewicz

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
AJIS publishes open-access articles distributed under the terms of a Creative Commons Non-Commercial and Attribution License, which permits non-commercial use, distribution, and reproduction in any medium, provided the original author and AJIS are credited. All other rights, including granting permissions beyond those in the above license, remain the property of the author(s).