Extraction, integration and probabilistic characterization of web data

Please use this identifier to cite or link to this item: http://hdl.handle.net/2307/4550

Title:	Extraction, integration and probabilistic characterization of web data
Author:	Blanco, Lorenzo
Advisor:	Merialdo, Paolo
Referees:	Laender, Alberto; Srivastava, Divesh
Issue Date:	26-Mar-2011
Publisher:	Università degli studi Roma Tre
Abstract:	The web contains a huge amount of structured information provided by a large number of web sites. Since current search engines are not able to fully recognize this kind of data, this abundance of information is an enormous opportunity to create new applications and services. To exploit structured web data, several challenging issues must be addressed, ranging from gathering web pages, through data extraction and integration, to the characterization of conflicting data. Three design criteria are critical for techniques that aim to work at web scale: scalability (in terms of computational complexity), an unsupervised approach (as human intervention cannot be involved at web scale), and domain independence (to avoid custom solutions). The thesis of this dissertation is that the redundancy of information provided by web sources can be leveraged to create a system that locates the pages of interest, extracts and integrates the information, and handles the data inconsistency that the redundancy naturally implies.
URI :	http://hdl.handle.net/2307/4550
Access Rights:	info:eu-repo/semantics/openAccess
Appears in Collections:	X_Dipartimento di Informatica e automazione T - Tesi di dottorato

File	Description	Size	Format
Extraction_ integration and probabilistic characterization of web data.pdf		4.64 MB	Adobe PDF	View/Open

All items stored in DSpace are protected by copyright.