Deep neural networks for Arabic information extraction

Saadi, Abdelhalim; Belhadef, Hacene

doi:10.1108/sasbe-03-2019-0031

Author(s):	Abdelhalim Saadi, Hacene Belhadef
Medium:	journal article
Language(s):	English
Published in:	Smart and Sustainable Built Environment, October 2020, n. 4, v. 9
Page(s):	467-482
DOI:	10.1108/sasbe-03-2019-0031
Abstract:
Purpose: This paper presents a system based on deep neural networks for extracting particular entities from natural language text. A massive amount of textual information is now available electronically, and that volume makes it difficult to find or extract the relevant information.

Design/methodology/approach: The study presents an original system that extracts Arabic named entities by combining a deep neural network-based part-of-speech tagger with a neural network-based named entity extractor. First, the system determines the grammatical class of each word with high precision, based on the word's context; this module serves as the disambiguation step. A second module then extracts the named entities.

Findings: Using deep neural networks in natural language processing requires tuning many hyperparameters, which is a time-consuming process. Statistical methods such as the Taguchi method are highly desirable for dealing with this problem. In this study, the system was successfully applied to Arabic named entity recognition, with a reported accuracy of 96.81 per cent, which the authors state is better than the state-of-the-art results.

Research limitations/implications: The system is designed and trained for the Arabic language, but the architecture can be used for other languages.

Practical implications: Information extraction systems are developed for different applications, such as analysing newspaper articles and databases for commercial, political and social purposes. They can also be built on top of an information retrieval (IR) system, which eliminates irrelevant documents and paragraphs.

Originality/value: The proposed system can be regarded as the first attempt to use two deep neural networks together to increase accuracy. It can also be built on top of an IR system, which eliminates irrelevant documents and paragraphs. This filtering reduces the large number of documents from which relevant information has to be extracted by the information extraction system.
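The abstract names the Taguchi method as a way to cut the cost of hyperparameter tuning but gives no details. As a minimal sketch of the general idea, the Python below uses the standard L9 orthogonal array to test four three-level factors in 9 runs instead of the full 81 combinations, then picks the best level of each factor from its mean score. The factor names, their levels and the `toy_score` function are hypothetical placeholders, not values or code from the paper.

```python
from itertools import combinations

# Standard Taguchi L9 orthogonal array: 9 runs, 4 factors, 3 levels (0-indexed).
L9 = [
    (0, 0, 0, 0), (0, 1, 1, 1), (0, 2, 2, 2),
    (1, 0, 1, 2), (1, 1, 2, 0), (1, 2, 0, 1),
    (2, 0, 2, 1), (2, 1, 0, 2), (2, 2, 1, 0),
]

# Hypothetical hyperparameters and levels (assumptions, not from the paper).
FACTORS = {
    "learning_rate": [0.001, 0.01, 0.1],
    "hidden_units": [64, 128, 256],
    "dropout": [0.1, 0.3, 0.5],
    "embedding_dim": [50, 100, 200],
}


def is_orthogonal(array, levels=3):
    """Check that every pair of columns holds each level pair equally often."""
    expected = len(array) // (levels * levels)
    for a, b in combinations(range(len(array[0])), 2):
        counts = {}
        for row in array:
            key = (row[a], row[b])
            counts[key] = counts.get(key, 0) + 1
        if len(counts) != levels * levels or set(counts.values()) != {expected}:
            return False
    return True


def build_configs(factors, array=L9):
    """Map each row of level indices to a concrete hyperparameter setting."""
    names = list(factors)
    return [{n: factors[n][lvl] for n, lvl in zip(names, row)} for row in array]


def best_levels(factors, configs, scores):
    """Main-effects analysis: per factor, keep the level with the highest mean score."""
    best = {}
    for name, levels in factors.items():
        means = {}
        for level in levels:
            vals = [s for c, s in zip(configs, scores) if c[name] == level]
            means[level] = sum(vals) / len(vals)
        best[name] = max(means, key=means.get)
    return best


def toy_score(cfg):
    """Synthetic stand-in for training and validating a model (illustrative only)."""
    return (-abs(cfg["learning_rate"] - 0.01) * 10
            - abs(cfg["hidden_units"] - 128) / 256
            - abs(cfg["dropout"] - 0.3)
            - abs(cfg["embedding_dim"] - 100) / 200)


configs = build_configs(FACTORS)
scores = [toy_score(c) for c in configs]
best = best_levels(FACTORS, configs, scores)
print(best)
```

In real use, `toy_score` would be replaced by a function that trains the tagger or extractor with the given settings and returns validation accuracy. The main-effects step assumes the factors act roughly independently; strong interactions between hyperparameters would call for a larger array or confirmation runs.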

Structurae cannot offer the full text of this publication at this time. The full text is available from the publisher. DOI: 10.1108/sasbe-03-2019-0031.

About this record
Reference-ID:	10779846
Published on:	12.05.2024
Modified on:	12.05.2024
