Evaluation of SQL injection (SQLi) attack detection strategies in web applications using machine learning. Industry semester

ABSTRACT: This work evaluates artificial-intelligence-based strategies for detecting SQL injection attacks in order to produce a recommendation for improving the web application firewall of AizoOn Technology Consulting (Mithril). To achieve this, detection techniques known as Naïve Baye...

Authors:
Taborda Echeverri, Santiago
Resource type:
Undergraduate thesis
Publication date:
2024
Institution:
Universidad de Antioquia
Repository:
Repositorio UdeA
Language:
eng
OAI Identifier:
oai:bibliotecadigital.udea.edu.co:10495/40601
Online access:
https://hdl.handle.net/10495/40601
Keywords:
Bosques aleatorios
Random Forest
Seguridad computacional
Computer Security
Procesamiento de datos
http://vocabularies.unesco.org/thesaurus/concept522
Aprendizaje automático (inteligencia artificial)
Machine learning
Análisis de regresión logística
Logistic regression analysis
Integración numérica - procesamiento de datos
Numerical integration - data processing
Inteligencia artificial
Artificial intelligence
Data processing
SQL injection (SQLi)
Web Application Firewall
One-Class SVM
AizoOn Technology Consulting
https://id.nlm.nih.gov/mesh/D000093743
https://id.nlm.nih.gov/mesh/D016494
Rights
openAccess
License
http://creativecommons.org/licenses/by-nc-nd/2.5/co/
id UDEA2_9498e4b67254671d8ae3af024eff93d9
oai_identifier_str oai:bibliotecadigital.udea.edu.co:10495/40601
network_acronym_str UDEA2
network_name_str Repositorio UdeA
repository_id_str
dc.title.spa.fl_str_mv Evaluation of SQL injection (SQLi) attack detection strategies in web applications using machine learning. Industry semester
dc.title.translated.spa.fl_str_mv Evaluación de estrategias de detección de ataques de inyección SQL (SQLi) en aplicaciones web basadas en técnicas de inteligencia computacional. Semestre de industria
dc.creator.fl_str_mv Taborda Echeverri, Santiago
dc.contributor.advisor.none.fl_str_mv Vergara Tejada, Jaime Alberto
Triana Maldonado, Jhonny Alexander
dc.contributor.author.none.fl_str_mv Taborda Echeverri, Santiago
dc.subject.decs.none.fl_str_mv Bosques aleatorios
Random Forest
Seguridad computacional
Computer Security
dc.subject.unesco.none.fl_str_mv Procesamiento de datos
http://vocabularies.unesco.org/thesaurus/concept522
dc.subject.lemb.none.fl_str_mv Aprendizaje automático (inteligencia artificial)
Machine learning
Análisis de regresión logística
Logistic regression analysis
Integración numérica - procesamiento de datos
Numerical integration - data processing
Inteligencia artificial
Artificial intelligence
dc.subject.agrovoc.none.fl_str_mv Data processing
dc.subject.proposal.spa.fl_str_mv SQL injection (SQLi)
Web Application Firewall
One-Class SVM
AizoOn Technology Consulting
dc.subject.meshuri.none.fl_str_mv https://id.nlm.nih.gov/mesh/D000093743
https://id.nlm.nih.gov/mesh/D016494
description ABSTRACT: This work evaluates artificial-intelligence-based strategies for detecting SQL injection attacks in order to produce a recommendation for improving the web application firewall of AizoOn Technology Consulting (Mithril). To this end, four detection techniques were selected (Naïve Bayes, logistic regression, random forests, and one-class support vector machines) based on their relevance and demonstrated effectiveness in the scientific literature and on the company's expressed interests. The techniques were implemented on a hybrid database that integrates public data from the "SQL Injection Dataset" available on Kaggle with data processed by Mithril. Building this database involved data analysis, preprocessing, and conditioning, and the integrated data proved useful for implementing the machine learning models. Hyperparameter tuning was then performed to identify the best configuration for each model, which increased detection capability and minimized false positives. The models were evaluated and benchmarked using performance metrics such as accuracy, precision, recall, and F1-score. Based on the results, the logistic regression model is recommended for implementation in Mithril, as it achieved the best performance, with an accuracy and F1-score of 99.45%.
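The four metrics named in the abstract can all be derived from the counts of true/false positives and negatives. The block below is a minimal sketch of that calculation, not the code used in the thesis; the names `Metrics` and `binary_metrics` and the convention that label 1 marks an SQLi sample are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    `positive` is the label treated as the attack class (assumed to be 1).
    Ratios with a zero denominator are reported as 0.0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Metrics(accuracy, precision, recall, f1)
```

For the hypothetical labels `y_true = [1, 1, 1, 0, 0, 0, 0, 1]` and `y_pred = [1, 1, 0, 0, 0, 1, 0, 1]` there are 3 true positives, 1 false positive, 1 false negative and 3 true negatives, so all four metrics equal 0.75. These numbers are illustrative and are not results from the work.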
publishDate 2024
dc.date.accessioned.none.fl_str_mv 2024-07-16T18:58:39Z
dc.date.available.none.fl_str_mv 2024-07-16T18:58:39Z
dc.date.issued.none.fl_str_mv 2024
dc.type.spa.fl_str_mv Tesis/Trabajo de grado - Monografía - Pregrado
dc.type.coar.spa.fl_str_mv http://purl.org/coar/resource_type/c_7a1f
dc.type.redcol.spa.fl_str_mv https://purl.org/redcol/resource_type/TP
dc.type.coarversion.spa.fl_str_mv http://purl.org/coar/version/c_b1a7d7d4d402bcce
dc.type.driver.spa.fl_str_mv info:eu-repo/semantics/bachelorThesis
dc.type.version.spa.fl_str_mv info:eu-repo/semantics/draft
format http://purl.org/coar/resource_type/c_7a1f
status_str draft
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/10495/40601
url https://hdl.handle.net/10495/40601
dc.language.iso.spa.fl_str_mv eng
language eng
dc.relation.issupplementedby.spa.fl_str_mv https://github.com/taechsantiago/ml_sqli_evaluation.git
dc.rights.uri.*.fl_str_mv http://creativecommons.org/licenses/by-nc-nd/2.5/co/
dc.rights.uri.spa.fl_str_mv https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.rights.accessrights.spa.fl_str_mv info:eu-repo/semantics/openAccess
dc.rights.coar.spa.fl_str_mv http://purl.org/coar/access_right/c_abf2
rights_invalid_str_mv http://creativecommons.org/licenses/by-nc-nd/2.5/co/
https://creativecommons.org/licenses/by-nc-nd/4.0/
http://purl.org/coar/access_right/c_abf2
eu_rights_str_mv openAccess
dc.format.extent.spa.fl_str_mv 64 pages
dc.format.mimetype.spa.fl_str_mv application/pdf
dc.publisher.spa.fl_str_mv Universidad de Antioquia
dc.publisher.place.spa.fl_str_mv Medellín, Colombia
dc.publisher.faculty.spa.fl_str_mv Facultad de Ingeniería. Ingeniería de Telecomunicaciones
institution Universidad de Antioquia
bitstream.url.fl_str_mv https://bibliotecadigital.udea.edu.co/bitstreams/c7e876e8-26ac-437b-bb63-ff5955d6f69e/download
https://bibliotecadigital.udea.edu.co/bitstreams/b006ee66-3998-4e36-a9df-5a3eecf0ce4a/download
https://bibliotecadigital.udea.edu.co/bitstreams/1bf533f9-cdfb-4c99-a60a-45fa2cf2e174/download
https://bibliotecadigital.udea.edu.co/bitstreams/c89b071f-5d54-4202-92ad-2461fed2e9b5/download
https://bibliotecadigital.udea.edu.co/bitstreams/608b201d-12da-4d29-8cd9-f332ee388e9e/download
bitstream.checksum.fl_str_mv b88b088d9957e670ce3b3fbe2eedbc13
8a4605be74aa9ea9d79846c1fba20a33
4766ffcdde9b3d1746d8d71176dda236
d58216eb693918282ef1d4673ccaa37a
6f912d154cec6e67807bddd662e4334b
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
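The MD5 values above can be used to check that a downloaded bitstream arrived intact. The following is a minimal sketch under stated assumptions: the helper names `md5_hex` and `matches_checksum` are hypothetical, and the file is assumed to have already been saved locally. MD5 is suitable here only as an integrity check against accidental corruption, not as protection against deliberate tampering.

```python
import hashlib


def md5_hex(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so large files do not need to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's MD5 digest with a checksum string from the record."""
    return md5_hex(path) == expected_hex.strip().lower()
```

The checksums appear in the same order as the bitstream URLs, so a locally saved copy of the third bitstream could be checked with `matches_checksum(local_path, "4766ffcdde9b3d1746d8d71176dda236")`, where `local_path` is whatever path the file was saved to.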
repository.name.fl_str_mv Repositorio Institucional de la Universidad de Antioquia
repository.mail.fl_str_mv aplicacionbibliotecadigitalbiblioteca@udea.edu.co
_version_ 1851052581985452032
spelling Degree level: Pregrado (undergraduate). Degree awarded: Ingeniero de Telecomunicaciones. Main file: TabordaSantiago_2024_MachineLearningSqli.pdf (application/pdf).