King Saud University Repository >
King Saud University >
Science Colleges >
College of Computer and Information Sciences >

Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/19380

Title: Towards Developing Automatic Name-Entity-Recognition System for Arabic Text
Authors: Naji, Fadl Dahan Abdu
Touir, Dr. Ameur
Keywords: Name-Entity-Recognition System
Issue Date: 29-Jan-2011
Abstract: Named Entity Recognition (NER) has emerged as an effective Natural Language Processing (NLP) technology that can provide high value to several kinds of applications, such as Information Extraction (IE), Information Retrieval (IR), Question Answering (QA), and text clustering. NER is responsible for identifying proper names in text and classifying them into named-entity types such as people, locations, and organizations. There are two main approaches to NER: one is based on linguistic knowledge, in particular grammar rules, and is hence called rule-based; the other is based on machine learning techniques. In this research we aim to build an automatic Named Entity Recognition system for the Arabic language using Machine Learning (ML) approaches, which include models such as Maximum Entropy (ME), Decision Tree (DT), Support Vector Machines (SVM), and the Hidden Markov Model (HMM). Among these models we used the HMM to build our system because it relies on the context structure. ML approaches let us work with an unrestricted domain and adapt a suitable machine learning model to the nature of the Arabic language and the difficulties posed by some of its characteristics. The Arabic language does not exhibit differences in orthographic case, whereas English text is mixed-case and therefore offers obvious clues, such as initial capital letters, that indicate the presence of a name constituent.
URI: http://hdl.handle.net/123456789/19380
Appears in Collections: College of Computer and Information Sciences

Files in This Item:

File          Description   Size      Format
Thesis.pdf                  2.14 MB   Adobe PDF

All research items are protected by copyright, with all rights reserved.

