Adaptation, Comparison, and Improvement of Metaheuristic Algorithms to the Part-of-Speech Tagging Problem
Part-of-Speech Tagging (POST) is a complex task in the preprocessing of Natural Language Processing applications. Tagging has been tackled from statistical information and rule-based approaches, making use of a range of methods. Most recently, metaheuristic algorithms have gained attention while bei...
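- Authors:
- Solano-Jiménez, Miguel Alexis
Tobar-Cifuentes, Jose Julio
Sierra-Martínez, Luz Marina
Cobos-Lozada, Carlos Alberto
- Resource type:
- Article
- Publication date: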
- 2020
- Institution:
- Universidad Pedagógica y Tecnológica de Colombia
- Repository:
- RiUPTC: Repositorio Institucional UPTC
- Language:
- eng
spa
- OAI Identifier:
- oai:repositorio.uptc.edu.co:001/14291
- Online access:
- https://revistas.uptc.edu.co/index.php/ingenieria/article/view/11762
https://repositorio.uptc.edu.co/handle/001/14291
- Keywords:
- computational intelligence
computational linguistics
evolutionary computing
heuristic algorithms
natural language processing
parts of speech tagging
search methods
algoritmos heurísticos
computación evolutiva
etiquetado de partes del discurso
inteligencia computacional
lingüística computacional
métodos de búsqueda
procesamiento de lenguaje natural
- Rights
- License
- http://purl.org/coar/access_right/c_abf151
id | REPOUPTC2_69d2c2f3ee47c3f6811a45b77eef6f51 |
---|---|
oai_identifier_str | oai:repositorio.uptc.edu.co:001/14291 |
network_acronym_str | REPOUPTC2 |
network_name_str | RiUPTC: Repositorio Institucional UPTC |
repository_id_str | |
dc.title.en-US.fl_str_mv | Adaptation, Comparison, and Improvement of Metaheuristic Algorithms to the Part-of-Speech Tagging Problem |
dc.title.es-ES.fl_str_mv | Adaptación, comparación y mejora de algoritmos metaheurísticos al problema de etiquetado de partes del discurso |
title | Adaptation, Comparison, and Improvement of Metaheuristic Algorithms to the Part-of-Speech Tagging Problem |
spellingShingle | Adaptation, Comparison, and Improvement of Metaheuristic Algorithms to the Part-of-Speech Tagging Problem computational intelligence computational linguistics evolutionary computing heuristic algorithms natural language processing parts of speech tagging search methods algoritmos heurísticos computación evolutiva etiquetado de partes del discurso inteligencia computacional lingüística computacional métodos de búsqueda procesamiento de lenguaje natural |
title_short | Adaptation, Comparison, and Improvement of Metaheuristic Algorithms to the Part-of-Speech Tagging Problem |
title_full | Adaptation, Comparison, and Improvement of Metaheuristic Algorithms to the Part-of-Speech Tagging Problem |
title_fullStr | Adaptation, Comparison, and Improvement of Metaheuristic Algorithms to the Part-of-Speech Tagging Problem |
title_full_unstemmed | Adaptation, Comparison, and Improvement of Metaheuristic Algorithms to the Part-of-Speech Tagging Problem |
title_sort | Adaptation, Comparison, and Improvement of Metaheuristic Algorithms to the Part-of-Speech Tagging Problem |
dc.subject.en-US.fl_str_mv | computational intelligence computational linguistics evolutionary computing heuristic algorithms natural language processing parts of speech tagging search methods |
topic | computational intelligence computational linguistics evolutionary computing heuristic algorithms natural language processing parts of speech tagging search methods algoritmos heurísticos computación evolutiva etiquetado de partes del discurso inteligencia computacional lingüística computacional métodos de búsqueda procesamiento de lenguaje natural |
dc.subject.es-ES.fl_str_mv | algoritmos heurísticos computación evolutiva etiquetado de partes del discurso inteligencia computacional lingüística computacional métodos de búsqueda procesamiento de lenguaje natural |
description | Part-of-Speech Tagging (POST) is a complex task in the preprocessing stage of Natural Language Processing applications. Tagging has been tackled with statistical and rule-based approaches, using a range of methods. More recently, metaheuristic algorithms have gained attention, having been applied with good results in a wide variety of knowledge areas. For this reason, they were applied in this research to the POST problem, to assign the best sequence of tags (roles) to the words of a sentence based on statistical information. The process was carried out in two cycles, each comprising four phases, in which the following metaheuristic algorithms were adapted to the tagging problem: Particle Swarm Optimization, Jaya, Random-Restart Hill Climbing, and a memetic algorithm that uses Global-Best Harmony Search as a global optimizer and Hill Climbing as a local optimizer. Preliminary experiments (using cross-validation) were carried out to tune the parameters of each algorithm, which was then evaluated on the complete datasets of the tagged corpora IULA (Spanish), Brown (English), and Nasa Yuwe (Nasa). The results obtained by the proposed taggers were compared using the non-parametric Friedman and Wilcoxon statistical tests, which confirmed that the proposed memetic algorithm, GBHS Tagger, obtained better precision. The proposed taggers are an important contribution to POST for traditional languages (English and Spanish), for non-traditional languages (Nasa Yuwe), and for their application areas. |
publishDate | 2020 |
dc.date.accessioned.none.fl_str_mv | 2024-07-05T19:11:56Z |
dc.date.available.none.fl_str_mv | 2024-07-05T19:11:56Z |
dc.date.none.fl_str_mv | 2020-09-18 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/article |
dc.type.coar.fl_str_mv | http://purl.org/coar/resource_type/c_2df8fbb1 |
dc.type.coarversion.fl_str_mv | http://purl.org/coar/version/c_970fb48d4fbd8a85 |
dc.type.version.spa.fl_str_mv | info:eu-repo/semantics/publishedVersion |
dc.type.coarversion.spa.fl_str_mv | http://purl.org/coar/version/c_970fb48d4fbd8a234 |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | https://revistas.uptc.edu.co/index.php/ingenieria/article/view/11762 10.19053/01211129.v29.n54.2020.11762 |
dc.identifier.uri.none.fl_str_mv | https://repositorio.uptc.edu.co/handle/001/14291 |
url | https://revistas.uptc.edu.co/index.php/ingenieria/article/view/11762 https://repositorio.uptc.edu.co/handle/001/14291 |
identifier_str_mv | 10.19053/01211129.v29.n54.2020.11762 |
dc.language.none.fl_str_mv | eng spa |
dc.language.iso.spa.fl_str_mv | eng spa |
language | eng spa |
dc.relation.none.fl_str_mv | https://revistas.uptc.edu.co/index.php/ingenieria/article/view/11762/9627 https://revistas.uptc.edu.co/index.php/ingenieria/article/view/11762/9660 https://revistas.uptc.edu.co/index.php/ingenieria/article/view/11762/10015 |
dc.rights.coar.fl_str_mv | http://purl.org/coar/access_right/c_abf2 |
dc.rights.coar.spa.fl_str_mv | http://purl.org/coar/access_right/c_abf151 |
rights_invalid_str_mv | http://purl.org/coar/access_right/c_abf151 http://purl.org/coar/access_right/c_abf2 |
dc.format.none.fl_str_mv | application/pdf application/pdf application/xml |
dc.publisher.en-US.fl_str_mv | Universidad Pedagógica y Tecnológica de Colombia |
dc.source.en-US.fl_str_mv | Revista Facultad de Ingeniería; Vol. 29 No. 54 (2020): Continuous Publication; e11762 |
dc.source.es-ES.fl_str_mv | Revista Facultad de Ingeniería; Vol. 29 Núm. 54 (2020): Publicación Continua; e11762 |
dc.source.none.fl_str_mv | 2357-5328 0121-1129 |
institution | Universidad Pedagógica y Tecnológica de Colombia |
repository.name.fl_str_mv | Repositorio Institucional UPTC |
repository.mail.fl_str_mv | repositorio.uptc@uptc.edu.co |
_version_ | 1839633823757762560 |
spelling | Copyright (c) 2020 Miguel Alexis Solano-Jiménez, Jose Julio Tobar-Cifuentes, Luz Marina Sierra-Martínez, Ph. D., Carlos Alberto Cobos-Lozada, Ph. D.; Solano-Jiménez, Miguel Alexis; Tobar-Cifuentes, Jose Julio; Sierra-Martínez, Luz Marina; Cobos-Lozada, Carlos Alberto; 001/14291; 2025-07-18 11:53:37.502; metadata.only |
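
The Random-Restart Hill Climbing tagger named in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the tag set, the emission and transition probabilities, the scoring function, and every function name below are hypothetical, chosen only to show how a tag sequence might be improved one position at a time against statistical information.

```python
import math
import random

TAGS = ["DET", "NOUN", "VERB"]
FLOOR = 1e-6  # probability assumed for pairs missing from the toy tables

# Hypothetical toy statistics, invented for illustration only
# (not taken from the IULA, Brown, or Nasa Yuwe corpora).
EMIT = {("the", "DET"): 0.9, ("dog", "NOUN"): 0.8, ("dog", "VERB"): 0.1,
        ("barks", "VERB"): 0.6, ("barks", "NOUN"): 0.3}
TRANS = {("<s>", "DET"): 0.6, ("<s>", "NOUN"): 0.3, ("<s>", "VERB"): 0.1,
         ("DET", "DET"): 0.05, ("DET", "NOUN"): 0.9, ("DET", "VERB"): 0.05,
         ("NOUN", "DET"): 0.1, ("NOUN", "NOUN"): 0.3, ("NOUN", "VERB"): 0.6,
         ("VERB", "DET"): 0.5, ("VERB", "NOUN"): 0.4, ("VERB", "VERB"): 0.1}


def score(words, tags):
    """Log-probability of a tag sequence under the toy bigram statistics."""
    total, prev = 0.0, "<s>"
    for word, tag in zip(words, tags):
        total += math.log(TRANS.get((prev, tag), FLOOR))
        total += math.log(EMIT.get((word, tag), FLOOR))
        prev = tag
    return total


def hill_climb(words, tags):
    """Steepest ascent: apply the best single-tag change until none improves."""
    current, current_score = list(tags), score(words, tags)
    while True:
        best, best_score = None, current_score
        for i in range(len(words)):
            for tag in TAGS:
                if tag == current[i]:
                    continue
                candidate = current[:i] + [tag] + current[i + 1:]
                s = score(words, candidate)
                if s > best_score:
                    best, best_score = candidate, s
        if best is None:  # local optimum reached
            return current, current_score
        current, current_score = best, best_score


def rrhc_tag(words, restarts=10, seed=0):
    """Random-restart hill climbing: keep the best local optimum found."""
    rng = random.Random(seed)
    best, best_score = None, -math.inf
    for _ in range(restarts):
        start = [rng.choice(TAGS) for _ in words]
        tags, s = hill_climb(words, start)
        if s > best_score:
            best, best_score = tags, s
    return best, best_score
```

With these toy tables, `rrhc_tag(["the", "dog", "barks"])` returns the tag sequence `["DET", "NOUN", "VERB"]`. A memetic variant of the kind the description mentions would, under the same assumptions, call a local search such as `hill_climb` on candidates produced by a global optimizer rather than on random starting points.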