Please use this identifier to cite or link to this item: http://hdl.handle.net/2307/4408
Title: A model oriented approach to heterogeneity
Authors: Bugiotti, Francesca
Advisor: Atzeni, Paolo
Keywords: database
Issue Date: 19-Apr-2012
Publisher: Università degli studi Roma Tre
Abstract: Data heterogeneity is a major issue in any context where software deals directly with data. The most general expectation of any complex system is so-called seamless integration, where data can be accessed, retrieved and handled with uniform techniques, tools and algorithms. In this sense, integration can be considered the opposite of heterogeneity. It is a property of data that gives a measure of the degree of coherent exploitability. Indeed, the more uniform pieces of data are, the more information can be retrieved from them. Moving from highly heterogeneous to integrated data is the subject of data integration, the discipline that formalizes all the processes and the technical and algorithmic steps needed to transform data. Heterogeneity is a twofold issue: it has a technical connotation and a theoretical one. Data can be distributed across different data sources, stored in various formats and encoding conventions, and queried via incompatible programming interfaces. All these kinds of problems can be addressed by designing suitable adapter architectures, in which a number of connectors homogenize the data from a technical perspective. On the other hand, the core problem with heterogeneity is that data can be intrinsically different because different data models (collections of structural entities) are adopted to organize them. A relational database instance is intrinsically different from an XML file or from a collection of objects in a NoSQL key-store repository. The divergences are not merely technical, but representational. The aim of this work is to address data integration techniques from a number of perspectives. From the theoretical perspective, we consider Model Management as the framework for formalizing translation problems. A schema, an instance of a certain model, is translated into another schema, an instance of a target model.
We recognize the need for a model-independent solution to schema and data translation and, in general, to model management problems. Hence we present MIDST, a tool born from many years of experience in schema and data translation and based on a metalevel approach. From the performance perspective, we appreciate the value of runtime environments, where translations are not performed outside the system with an import-translate-export process; by contrast, we illustrate, as a novel contribution, MIDST-RT, an evolution of MIDST in which translations are performed at runtime and even generate views of data. Data heterogeneity also poses more dynamic problems, linked with data evolution and maintenance. As an important example, we adopt the perspective of model management to present a model-independent solution to the round-trip engineering problem, which exemplifies the typical propagation of changes among related schemas. MISM, a model management framework based on MIDST (and MIDST-RT), is then illustrated. The contribution in this field is the use of MIDST as a model management platform. Today, particular consideration is given to the market demand for highly specialized data processors that perform best in specific cases such as web content retrieval, document search, object serialization and parallel computation. NoSQL engines promise exceptional performance in non-transactional fields and leverage simplified but peculiar data models. Therefore a core goal of data integration is to provide techniques and tools that facilitate interaction with these systems. In the final part of this work, we extend our metalevel approach to encompass NoSQL systems and present several experimental results on them, also addressing still unexplored indexing strategies.
In conclusion, this thesis provides contributions in the three fields of data integration mentioned above: a theoretical model management framework is proposed; performance is addressed through a change of paradigm within metamodel approaches, in which translations are performed directly in the target system; and NoSQL systems are addressed and encompassed in the proposed metamodel.
URI: http://hdl.handle.net/2307/4408
Access Rights: info:eu-repo/semantics/openAccess
Appears in Collections: X_Dipartimento di Informatica e automazione
T - Tesi di dottorato

Files in This Item:
File: A_model_oriented_approach_to_heterogeneity.pdf; Size: 1.95 MB; Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.