Study of a visual perception system with multiple models for visual disability assistance

The project presents a technological solution to assist people with visual impairments through a mobile application developed with Flutter. This application integrates advanced computer vision models (YOLOv11 and YOLO-World) to detect and recognize objects in real time. Once objects in the environment are identified using the phone's camera, the application generates detailed audio descriptions with the support of Gemini AI, allowing users to better understand their surroundings. The system is based on a hybrid architecture that combines local processing on the device with cloud services, ensuring efficiency and scalability. It has been tested and validated with real users to ensure its practical usability. The use of established datasets such as COCO and LVIS allows the models to recognize a wide variety of everyday objects, improving the richness and relevance of the information provided to users. The development process followed the agile SCRUM methodology, which allowed for iterative improvements in development, accessibility, validation, and user experience. The project demonstrates how the integration of generative artificial intelligence with robust object detection models can contribute to greater independence and mobility for people with visual impairments in diverse real-world environments.
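The detection-to-speech pipeline described above can be sketched in Python. This is a hypothetical illustration only: the function name, the detection format, and the phrasing are assumptions, not the project's actual code. In the real application the labels would come from YOLOv11 or YOLO-World running on the camera feed, and the spoken description would be produced by Gemini AI rather than by simple string formatting.

```python
from collections import Counter

def summarize_detections(labels):
    """Turn a list of detected object labels (e.g. class names reported
    by a YOLO model for one camera frame) into a short spoken-style
    description. Hypothetical helper, not the project's actual code."""
    if not labels:
        return "No objects detected."
    counts = Counter(labels)  # e.g. {"chair": 2, "person": 1}
    parts = [f"{n} {label}" + ("s" if n > 1 else "")
             for label, n in counts.items()]
    return "I can see: " + ", ".join(parts) + "."

# Example: labels as a detector might report them for one frame
print(summarize_detections(["chair", "chair", "person"]))
# prints "I can see: 2 chairs, 1 person."
```

In the hybrid architecture the paper describes, a lightweight step like this could run locally on the device, while the richer contextual descriptions are delegated to the cloud service.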

Full description

Authors:
Apresa Echeverría, Rubens André
Heras Gómez, Camilo Andrés
Linero Caro, Jaymed Daniel
Resource type:
article
Publication date:
2025
Institution:
Universidad del Norte
Repository:
Repositorio Uninorte
Language:
spa (Spanish)
OAI identifier:
oai:manglar.uninorte.edu.co:10584/13409
Online access:
http://hdl.handle.net/10584/13409
Keywords:
Yolo
Multiple models
Visual Disability Assistance
visual
Rights:
Universidad del Norte
Advisors:
Nieto Bernal, Wilson
Salazar Silva, Augusto
Date accessioned: 2025-06-05T14:57:03Z
Date available: 2025-06-05T14:57:03Z
Date issued: 2025-06-03
Type: article (COAR resource type: http://purl.org/coar/resource_type/c_6501)
Access rights: open access (COAR: http://purl.org/coar/access_right/c_abf2)
Publisher: Barranquilla, Universidad del Norte, 2025
Files:
Lookie.jpeg: "Application running on a mobile device" (image/jpeg, 21,656 bytes, MD5 015fc8067a6e0bbdd445a465786a1156)
https://manglar.uninorte.edu.co/bitstream/10584/13409/1/Lookie.jpeg
Reconocimiento.jpeg: "Application interpreting the environment" (image/jpeg, 78,709 bytes, MD5 afd82808f02bb99f25c58bd6d11fe9d9)
https://manglar.uninorte.edu.co/bitstream/10584/13409/2/Reconocimiento.jpeg
license.txt: DSpace sample non-exclusive distribution license (text/plain, 1,748 bytes, MD5 8a4605be74aa9ea9d79846c1fba20a33)
https://manglar.uninorte.edu.co/bitstream/10584/13409/3/license.txt
Repository: Repositorio Digital de la Universidad del Norte
Contact: mauribe@uninorte.edu.co