Keywords: DATALOG
Issue Date: 23-Apr-2018
Publisher: Università degli studi Roma Tre
Abstract: The impressive amount of available corporate information motivates a renewed momentum of knowledge representation and reasoning (KRR) formalisms, which make it possible to harness such knowledge and put it into action for many tasks. In the "early times", the importance of capturing knowledge about schema and data transformations was clearly recognized and led to the development of logic-based formalisms, e.g., schema mappings, for data management purposes. More recently, modern companies wish to maintain knowledge in the form of a knowledge graph and to use and manage it to achieve efficient and scalable reasoning over Big Data with acceptable computational complexity. This is giving rise to new systems, namely knowledge graph management systems (KGMS), based on logic-based KRR formalisms, which aim at supporting traditional data management activities as well as more complex reasoning tasks, computing analytics and leveraging artificial intelligence algorithms. The construction of such KRR languages and reasoning systems motivates a line of research in logic-based approaches that fosters fully-fledged languages with features useful in practice while achieving a good compromise between expressive power and computational complexity. The dissertation presents theoretical results, techniques of practical utility and software systems that apply logic-based approaches to modern data management and reasoning scenarios. In particular, we present five main contributions. We start from the application of schema mappings in the specific domain of statistical data processing and adopt a logic formalism to bridge the gap between the definition of statistical programs in terms of high-level domain-specific constructs and the actual execution of such programs in specialized systems. As a conceptual contribution, we argue for the equivalence between statistical data processing tasks and data exchange.
Practically, we introduce new principled optimization techniques based on schema mapping composition and present a real-world system for statistical data processing, which is currently in use at the Bank of Italy. Then, we focus on large-scale data integration and introduce an approach, based on consolidated and novel notions of the inverse of schema mappings, to manage scenarios with very many sources, where a manual specification of schema mappings would be infeasible. In particular, we observe that in such scenarios the sources, although different, are very similar to one another. We exploit this to solve the data integration task with respect to a single representative source and then automatically extend the solution to the other sources. In order to optimize the effectiveness of adopting schema mappings in complex corporate settings, as a further contribution, we propose a solution for the reuse of data transformations, based on a meta-level approach. We consider "more general" schema mappings, namely meta-mappings, each of which captures a number of ordinary mappings. We formally characterize the reuse problem through the notion of fitness and introduce a searchable repository of transformations. Our repository relies on a novel notion of coverage, which measures the adherence of meta-mappings to a specific scenario and allows efficient indexing. The approach has been implemented in our system, GAIA, and empirically tested in large-scale synthetic and real-world settings. We then examine the scientific and technological achievements in KRR languages and argue for the concept of KGMS. As a central contribution, we propose a set of desiderata for KGMSs, with special reference to the requirements for ontological reasoning, and present a language and a system, the Vadalog Reasoner, that achieve such desiderata. Our language, Vadalog, is based on Warded Datalog±, a recent member of the Datalog family that balances very good expressive power against computational complexity.
We present the Vadalog Reasoner in detail, introducing principled techniques and data structures that leverage the theoretical underpinnings of Warded Datalog± to achieve high-performance reasoning, by means of aggressive recursion and termination control, which exploits the periodicities in reasoning graphs. We also present the details of the system architecture, which brings key techniques and patterns from DBMS implementation practice into the context of reasoning. Our system is evaluated experimentally on settings with millions of objects with sophisticated interactions.
Access Rights: info:eu-repo/semantics/openAccess
Appears in Collections: X_Dipartimento di Ingegneria
T - Tesi di dottorato

Files in This Item:
File: tesi_bellomarini.pdf
Size: 3.34 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.