Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/15761

Title: Web 2.0 Content Extraction
Authors: Mohannmad Waqar
Issue Date: 2010
Publisher: International Conference for Internet Technology and Secured Transactions (ICITST-2010) in London, UK
Abstract: This paper presents a simple, efficient, and extensible solution for content extraction from Web 2.0 pages. Web 2.0 is perceived as the second generation of web technologies. It has undoubtedly made a significant impact by enriching the end-user experience and allowing programmers to write more interactive, desktop-like applications for the web. However, it has also introduced new issues for researchers in the field of information retrieval and has made retrieving information from the web more difficult, time-consuming, and challenging. Web pages contain a lot of clutter besides the original article, and several methods have been developed to extract the main content. However, these methods were designed for the traditional model of the web and would fail to work on Web 2.0 content. Given the evident popularity of Web 2.0, the volume of Web 2.0 content on the web will rise sharply in the coming years. In this paper we propose a new solution to this problem, based on open source components, which will make Web 2.0 content extraction more efficient and reduce the utilization of precious system resources. The paper also presents a high-level logical design for implementing such a system through available open source components.
URI: http://hdl.handle.net/123456789/15761
Appears in Collections: College of Computer and Information Sciences

Files in This Item:

File: Mohannmad Waqar -conf-1.doc
Size: 37.5 kB
Format: Microsoft Word

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.