The preparation of existing real-world datasets for publication as high-quality semantic web data is a complex task that requires the concerted execution of a variety of processing steps using a range of different tools. With both changing input data and evolving requirements on the produced output, schema and data transformation becomes a significant engineering task. We argue that achieving a robust and flexible transformation process requires a high-level declarative description that can be used to drive the entire tool chain. We have implemented this idea for the deployment of ontology-based data access (OBDA) solutions, in which semantically annotated views that integrate multiple data sources in different formats are created from an ontology and a collection of mappings. We exemplify our approach and show how a single declarative description helps to orchestrate a complete tool chain, from the download of datasets through to their installation for a variety of tool applications, including data and query transformation processes and reasoning services. Our case study is based on several publicly available tabular and relational datasets concerning the operations of the petroleum industry in Norway. We include a discussion of the relative performance of the tools used in our case study, and an overview of lessons learnt for the practical deployment of OBDA on real-world datasets.
This item's license is: Attribution-NonCommercial-NoDerivatives 4.0 International