Concept attribute labeling and context-aware named entity recognition in electronic health records

Extracting valuable knowledge from Electronic Health Records (EHR) represents a challenging task due to the presence of both structured and unstructured data, including codified fields, images and test results. Narrative text in particular contains a variety of notes which are diverse in language an...


Authors:
Pomares-Quimbaya, Alexandra
González, Rafael A.
Muñoz, Óscar
Garcia-Pena, A.A.
Daza Rodríguez, Julián Camilo
Sierra Múnera, Alejandro
Labbé, Cyril
Resource type:
Part of book
Publication date:
2020
Institution:
Pontificia Universidad Javeriana
Repository:
Repositorio Universidad Javeriana
Language:
N/A
OAI Identifier:
oai:repository.javeriana.edu.co:10554/57112
Online access:
http://hdl.handle.net/10554/57112
http://dx.doi.org/10.4018/978-1-7998-1204-3.ch017
Keywords:
Rights
License
Attribution-NonCommercial 4.0 International
id JAVERIANA2_0a5fe1e42c54edf97909fb6f024d5717
oai_identifier_str oai:repository.javeriana.edu.co:10554/57112
network_acronym_str JAVERIANA2
network_name_str Repositorio Universidad Javeriana
repository_id_str
dc.title.spa.fl_str_mv Concept attribute labeling and context-aware named entity recognition in electronic health records
dc.creator.fl_str_mv Pomares-Quimbaya, Alexandra
González, Rafael A.
Muñoz, Óscar
Garcia-Pena, A.A.
Daza Rodríguez, Julián Camilo
Sierra Múnera, Alejandro
Labbé, Cyril
dc.contributor.corporatename.none.fl_str_mv Pontificia Universidad Javeriana. Facultad de Medicina. Departamento de Medicina Interna. Cardiología
Pontificia Universidad Javeriana. Facultad de Medicina. Departamento de Medicina Interna. Medicina Interna
dc.contributor.javerianateacher.none.fl_str_mv Garcia-Pena, A.A.
Muñoz, Óscar
description Extracting valuable knowledge from Electronic Health Records (EHR) is challenging because they combine structured and unstructured data, including codified fields, images and test results. Narrative text in particular contains a variety of notes that are diverse in language and detail and full of ad hoc terminology, including acronyms and jargon. This is especially challenging in non-English EHR, where there is a dearth of annotated corpora or trained case sets. This paper proposes an approach to named entity recognition (NER) and concept attribute labeling for EHR that takes into consideration the contextual words around the entity of interest to determine its sense. The approach composes three different NER methods and analyzes the context (neighboring words) using an ensemble classification model. This helps to disambiguate NER results and to label each concept as confirmed, negated, speculative, pending or antecedent. Results show an improvement in recall and a limited impact on precision for the NER process.
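The description above names two steps: composing the output of several NER methods, and reading the words around each entity to assign an attribute. The following is only a toy sketch of that idea and is not the chapter's implementation. The three recognizers, the English cue words, the window size and all function names are hypothetical, and a plain cue-word lookup stands in for the ensemble classification model mentioned in the description.

```python
from typing import Callable, Dict, List, Set, Tuple

Span = Tuple[int, int]  # token indices, end exclusive
Recognizer = Callable[[List[str]], List[Span]]

# Hypothetical cue words (assumption); the chapter uses a trained ensemble
# classifier over the context instead of a fixed lookup like this one.
CUES: Dict[str, Set[str]] = {
    "negated": {"no", "denies", "without"},
    "speculative": {"possible", "probable", "suspected"},
    "pending": {"pending", "scheduled"},
    "antecedent": {"history", "previous", "prior"},
}


def dictionary_ner(tokens: List[str]) -> List[Span]:
    """Toy recognizer 1: exact match against a tiny term list."""
    terms = {"diabetes", "pneumonia", "arthritis"}
    return [(i, i + 1) for i, t in enumerate(tokens) if t.lower() in terms]


def suffix_ner(tokens: List[str]) -> List[Span]:
    """Toy recognizer 2: words ending in the suffix '-itis'."""
    return [(i, i + 1) for i, t in enumerate(tokens) if t.lower().endswith("itis")]


def acronym_ner(tokens: List[str]) -> List[Span]:
    """Toy recognizer 3: short all-uppercase tokens."""
    return [(i, i + 1) for i, t in enumerate(tokens) if t.isupper() and 2 <= len(t) <= 4]


def compose(tokens: List[str], recognizers: List[Recognizer]) -> List[Span]:
    """Union of all recognizer outputs; duplicate spans collapse in the set."""
    spans: Set[Span] = set()
    for recognize in recognizers:
        spans.update(recognize(tokens))
    return sorted(spans)


def label_attribute(tokens: List[str], span: Span, window: int = 3) -> str:
    """Label an entity from the `window` tokens on each side of it."""
    start, end = span
    neighbors = tokens[max(0, start - window):start] + tokens[end:end + window]
    context = {t.lower() for t in neighbors}
    for attribute, cues in CUES.items():
        if context & cues:
            return attribute
    return "confirmed"  # default when no cue word is nearby


def annotate(sentence: str) -> List[Tuple[str, str]]:
    """Return (entity text, attribute) pairs for one whitespace-tokenized sentence."""
    tokens = sentence.split()
    spans = compose(tokens, [dictionary_ner, suffix_ner, acronym_ner])
    return [(" ".join(tokens[s:e]), label_attribute(tokens, (s, e))) for s, e in spans]


if __name__ == "__main__":
    for text in ["Patient denies pneumonia", "history of diabetes", "possible bronchitis"]:
        print(text, "->", annotate(text))
```

Taking the union of the recognizers favors recall, since an entity found by any one method is kept. Precision then depends on how well the context step filters and labels the candidates.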
publishDate 2020
dc.date.created.none.fl_str_mv 2020
dc.date.accessioned.none.fl_str_mv 2021-09-13T13:47:20Z
dc.date.available.none.fl_str_mv 2021-09-13T13:47:20Z
dc.type.local.spa.fl_str_mv Book chapter
dc.type.coar.none.fl_str_mv http://purl.org/coar/resource_type/c_3248
format http://purl.org/coar/resource_type/c_3248
dc.identifier.isbn.spa.fl_str_mv 9781799812043 / 9781799812050 (Electronic)
dc.identifier.uri.none.fl_str_mv http://hdl.handle.net/10554/57112
dc.identifier.doi.spa.fl_str_mv http://dx.doi.org/10.4018/978-1-7998-1204-3.ch017
dc.identifier.instname.spa.fl_str_mv instname:Pontificia Universidad Javeriana
dc.identifier.reponame.spa.fl_str_mv reponame:Repositorio Institucional - Pontificia Universidad Javeriana
dc.identifier.repourl.spa.fl_str_mv repourl:https://repository.javeriana.edu.co
dc.language.iso.spa.fl_str_mv N/A
language N/A
dc.relation.ispartofbook.spa.fl_str_mv Data Analytics in Medicine: Concepts, Methodologies, Tools, and Applications
dc.rights.licence.*.fl_str_mv Attribution-NonCommercial 4.0 International
dc.rights.uri.*.fl_str_mv http://creativecommons.org/licenses/by-nc/4.0/
dc.rights.coar.spa.fl_str_mv http://purl.org/coar/access_right/c_abf2
dc.format.spa.fl_str_mv PDF
dc.format.mimetype.spa.fl_str_mv application/pdf
dc.publisher.spa.fl_str_mv IGI Global
institution Pontificia Universidad Javeriana
bitstream.url.fl_str_mv http://repository.javeriana.edu.co/bitstream/10554/57112/3/Concept-Attribute-Labeling-and-Context-Aware-Named.pdf
http://repository.javeriana.edu.co/bitstream/10554/57112/2/license.txt
http://repository.javeriana.edu.co/bitstream/10554/57112/4/Concept-Attribute-Labeling-and-Context-Aware-Named.pdf.jpg
bitstream.checksum.fl_str_mv f2c5a3379f2f52def4c3303763bf4fa8
2070d280cc89439d983d9eee1b17df53
9543e93e29037f1a9ae9acb4474e23a6
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
repository.name.fl_str_mv Repositorio Institucional - Pontificia Universidad Javeriana
repository.mail.fl_str_mv repositorio@javeriana.edu.co
_version_ 1814337521080860672
spelling https://orcid.org/0000-0002-3606-2102
https://orcid.org/0000-0001-5401-0018
ORIGINAL Concept-Attribute-Labeling-and-Context-Aware-Named.pdf, Book chapter, application/pdf, 1046035, metadata only access
LICENSE license.txt, text/plain; charset=utf-8, 2603, open access
THUMBNAIL Concept-Attribute-Labeling-and-Context-Aware-Named.pdf.jpg, IM Thumbnail, image/jpeg, 7442, open access
2023-07-11 16:02:04.627