Study of a visual perception system with multiple models for visual disability assistance

The project presents a technological solution to assist people with visual impairments through a mobile application developed with Flutter. This application integrates advanced computer vision models (YOLOv11 and YOLO-World) to detect and recognize objects in real time. Once objects in the environment are identified using the phone's camera, the application generates detailed audio descriptions with the support of Gemini AI, allowing users to better understand their surroundings. The system is based on a hybrid architecture that combines local processing on the device with cloud services, ensuring efficiency and scalability. It has been tested and validated with real users to ensure its practical usability. The use of established datasets such as COCO and LVIS allows the models to recognize a wide variety of everyday objects, improving the richness and relevance of the information provided to users. The development process followed the agile SCRUM methodology, which allowed for iterative improvements in development, accessibility, validation, and user experience. The project demonstrates how the integration of generative artificial intelligence with robust object detection models can contribute to greater independence and mobility for people with visual impairments in diverse real-world environments.
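The detection-to-speech pipeline described above can be sketched in Python. This is a hypothetical illustration only: the function name, the detection format, and the phrasing are assumptions, not the project's actual code. In the real application the labels would come from YOLOv11 or YOLO-World running on the camera feed, and the spoken description would be produced by Gemini AI rather than by simple string formatting.

```python
from collections import Counter

def summarize_detections(labels):
    """Turn a list of detected object labels (e.g. class names reported
    by a YOLO model for one camera frame) into a short spoken-style
    description. Hypothetical helper, not the project's actual code."""
    if not labels:
        return "No objects detected."
    counts = Counter(labels)  # e.g. {"chair": 2, "person": 1}
    parts = [f"{n} {label}" + ("s" if n > 1 else "")
             for label, n in counts.items()]
    return "I can see: " + ", ".join(parts) + "."

# Example: labels as a detector might report them for one frame
print(summarize_detections(["chair", "chair", "person"]))
# prints "I can see: 2 chairs, 1 person."
```

In the hybrid architecture the paper describes, a lightweight step like this could run locally on the device, while the richer contextual descriptions are delegated to the cloud service.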

Full description

Authors:
Apresa Echeverría, Rubens André
Heras Gómez, Camilo Andrés
Linero Caro, Jaymed Daniel
Resource type:
article
Publication date:
2025
Institution:
Universidad del Norte
Repository:
Repositorio Uninorte
Language:
spa (Spanish)
OAI identifier:
oai:manglar.uninorte.edu.co:10584/13409
Online access:
http://hdl.handle.net/10584/13409
Keywords:
Yolo
Multiple models
Visual Disability Assistance
visual
Rights:
Universidad del Norte
Advisors:
Nieto Bernal, Wilson
Salazar Silva, Augusto
Date accessioned: 2025-06-05T14:57:03Z
Date available: 2025-06-05T14:57:03Z
Date issued: 2025-06-03
Type: article (COAR resource type: http://purl.org/coar/resource_type/c_6501)
Access rights: open access (COAR: http://purl.org/coar/access_right/c_abf2)
Publisher: Barranquilla, Universidad del Norte, 2025
Files:
Lookie.jpeg: "Application running on a mobile device" (image/jpeg, 21,656 bytes, MD5 015fc8067a6e0bbdd445a465786a1156)
https://manglar.uninorte.edu.co/bitstream/10584/13409/1/Lookie.jpeg
Reconocimiento.jpeg: "Application interpreting the environment" (image/jpeg, 78,709 bytes, MD5 afd82808f02bb99f25c58bd6d11fe9d9)
https://manglar.uninorte.edu.co/bitstream/10584/13409/2/Reconocimiento.jpeg
license.txt: DSpace sample non-exclusive distribution license (text/plain, 1,748 bytes, MD5 8a4605be74aa9ea9d79846c1fba20a33)
https://manglar.uninorte.edu.co/bitstream/10584/13409/3/license.txt
Repository: Repositorio Digital de la Universidad del Norte
Contact: mauribe@uninorte.edu.co