King Saud University Repository, College of Computer and Information Sciences

Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/19380

Title: Towards Developing Automatic Name-Entity-Recognition System for Arabic Text
Authors: Naji, Fadl Dahan Abdu
Touir, Dr. Ameur
Keywords: Name-Entity-Recognition System
Issue Date: 29-Jan-2011
Abstract: Named Entity Recognition (NER) has emerged as an effective Natural Language Processing (NLP) technology that adds substantial value to applications such as Information Extraction (IE), Information Retrieval (IR), Question Answering (QA), and text clustering. NER identifies proper names in text and classifies them into named-entity types such as people, locations, and organizations. There are two main approaches to NER: the rule-based approach, which relies on linguistic knowledge, in particular grammar rules, and the machine learning approach. This research aims to build an automatic NER system for the Arabic language using Machine Learning (ML) approaches, which include models such as Maximum Entropy (ME), Decision Tree (DT), Support Vector Machines (SVM), and the Hidden Markov Model (HMM). Among these models, we used the HMM to build our system because it relies on contextual structure. ML approaches let us work with an unrestricted domain and adapt a suitable learning method to the nature and difficulties of certain characteristics of the Arabic language. Arabic has no orthographic case distinction, whereas English text is mixed-case and therefore offers obvious clues, such as an initial capital letter, to the presence of a name constituent.
URI: http://hdl.handle.net/123456789/19380
Appears in Collections:College of Computer and Information Sciences
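
Illustration (not taken from the thesis): the abstract says an HMM was chosen because it relies on contextual structure. The sketch below shows one way that idea might look in practice: Viterbi decoding over a toy HMM whose transition probabilities let the previous tag influence the next one, with no capitalization feature at all. The tag set, the transliterated tokens, every probability, and the UNSEEN floor are hypothetical values set by hand for this example; a real system would estimate them from an annotated Arabic corpus, and the thesis may define its model differently.

```python
import math

# Hypothetical tag set and hand-set probabilities, for illustration only.
TAGS = ["O", "PER", "LOC"]

# P(tag at the first position)
START = {"O": 0.6, "PER": 0.2, "LOC": 0.2}

# P(next tag | previous tag): this is where context enters the model.
TRANS = {
    "O":   {"O": 0.6, "PER": 0.2, "LOC": 0.2},
    "PER": {"O": 0.7, "PER": 0.2, "LOC": 0.1},
    "LOC": {"O": 0.8, "PER": 0.1, "LOC": 0.1},
}

# P(word | tag) over a tiny transliterated vocabulary (assumed, not real data).
EMIT = {
    "O":   {"zara": 0.4, "fi": 0.4},
    "PER": {"ahmad": 0.8},
    "LOC": {"al-riyadh": 0.8},
}

UNSEEN = 1e-6  # crude probability floor for words a tag never emitted


def viterbi(tokens):
    """Return the most probable tag sequence for tokens under the toy HMM."""
    if not tokens:
        return []
    # best[i][tag] = (log-probability of the best path ending in tag, previous tag)
    best = [{}]
    for tag in TAGS:
        emit = math.log(EMIT[tag].get(tokens[0], UNSEEN))
        best[0][tag] = (math.log(START[tag]) + emit, None)
    for i in range(1, len(tokens)):
        best.append({})
        for tag in TAGS:
            emit = math.log(EMIT[tag].get(tokens[i], UNSEEN))
            prev, score = max(
                ((p, best[i - 1][p][0] + math.log(TRANS[p][tag]) + emit) for p in TAGS),
                key=lambda pair: pair[1],
            )
            best[i][tag] = (score, prev)
    # Backtrack from the best final tag to recover the full path.
    tag = max(TAGS, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(tokens) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


print(viterbi(["zara", "ahmad", "al-riyadh"]))  # ['O', 'PER', 'LOC']
```

Working in log space avoids numeric underflow on longer sentences. The fixed UNSEEN floor is only a stand-in for proper smoothing of unseen words.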

Files in This Item:

File        Description  Size     Format
Thesis.pdf               2.14 MB  Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

