Please use this identifier to cite or link to this item: http://hdl.handle.net/2307/4550
DC Field | Value | Language
dc.contributor.advisor | Merialdo, Paolo | -
dc.contributor.author | Blanco, Lorenzo | -
dc.date.accessioned | 2015-05-25T13:50:59Z | -
dc.date.available | 2015-05-25T13:50:59Z | -
dc.date.issued | 2011-03-26 | -
dc.identifier.uri | http://hdl.handle.net/2307/4550 | -
dc.description.abstract | The web contains a huge amount of structured information provided by a large number of web sites. Since current search engines are not able to fully recognize this kind of data, this abundance of information is an enormous opportunity to create new applications and services. To exploit structured web data, several challenging issues must be addressed, ranging from gathering web pages, through data extraction and integration, to the characterization of conflicting data. Three design criteria are critical for techniques that aim to work at web scale: scalability (in terms of computational complexity), an unsupervised approach (as human intervention cannot be involved at web scale), and domain independence (to avoid custom solutions). The thesis of this dissertation is that the redundancy of information provided by web sources can be leveraged to create a system that locates the pages of interest, extracts and integrates the information, and handles the data inconsistency that the redundancy naturally implies. | it_IT
dc.language.iso | en | it_IT
dc.publisher | Università degli studi Roma Tre | it_IT
dc.title | Extraction, integration and probabilistic characterization of web data | it_IT
dc.type | Doctoral Thesis | it_IT
dc.subject.miur | Settori Disciplinari MIUR::Ingegneria industriale e dell'informazione::SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI | it_IT
dc.subject.miur | Ingegneria industriale e dell'informazione | -
dc.subject.isicrui | Categorie ISI-CRUI::Ingegneria industriale e dell'informazione::Information Technology & Communications Systems | it_IT
dc.subject.isicrui | Ingegneria industriale e dell'informazione | -
dc.subject.anagraferoma3 | Ingegneria industriale e dell'informazione | it_IT
dc.contributor.referee | Laender, Alberto | -
dc.contributor.referee | Srivastava, Divesh | -
dc.rights.accessrights | info:eu-repo/semantics/openAccess | -
dc.description.romatrecurrent | X_Dipartimento di Informatica e automazione | *
item.grantfulltext | restricted | -
item.languageiso639-1 | other | -
item.fulltext | With Fulltext | -
Appears in Collections: X_Dipartimento di Informatica e automazione
T - Tesi di dottorato
Files in This Item:
File | Description | Size | Format
Extraction_ integration and probabilistic characterization of web data.pdf | | 4.64 MB | Adobe PDF

Page view(s): 157 (last week: 0; last month: 0), checked on Nov 21, 2024
Download(s): 55, checked on Nov 21, 2024



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.