Computer Architecture Simulation on GPU

ABSTRACT : Due to the power wall in computer architectures [33], processor manufacturers have adopted a strategy of increasing the number of cores on modern computers to enhance their performance and reduce power consumption [100][93][21]. As a result, research and innovation are now focused on inte...

Authors:
Buitrago Paniagua, John Byron
Resource type:
Doctoral thesis
Publication date:
2024
Institution:
Universidad de Antioquia
Repository:
Repositorio UdeA
Language:
eng
OAI Identifier:
oai:bibliotecadigital.udea.edu.co:10495/38543
Online access:
https://hdl.handle.net/10495/38543
Keywords:
Simulación por computadores
Computer simulation
Métodos de simulación
Simulation methods
Arquitectura de computadores
Computer architecture
Rights
embargoedAccess
License
https://creativecommons.org/licenses/by-nc-sa/4.0/
id UDEA2_080ed90b8a592b57902861dd1439fffd
oai_identifier_str oai:bibliotecadigital.udea.edu.co:10495/38543
network_acronym_str UDEA2
network_name_str Repositorio UdeA
repository_id_str
dc.title.spa.fl_str_mv Computer Architecture Simulation on GPU
dc.title.translated.spa.fl_str_mv Simulación de Arquitectura de Computadores en GPUs
dc.creator.fl_str_mv Buitrago Paniagua, John Byron
dc.contributor.advisor.none.fl_str_mv Velasquez Ricardo, Rivera Fredy
Velásquez Vélez, Ricardo Andrés
dc.contributor.author.none.fl_str_mv Buitrago Paniagua, John Byron
dc.contributor.researchgroup.spa.fl_str_mv Sistemas Embebidos e Inteligencia Computacional (SISTEMIC)
dc.subject.lemb.none.fl_str_mv Simulación por computadores
Computer simulation
Métodos de simulación
Simulation methods
Arquitectura de computadores
Computer architecture
description ABSTRACT: Due to the power wall in computer architectures [33], processor manufacturers have adopted a strategy of increasing the number of cores in modern computers to enhance performance and reduce power consumption [100][93][21]. As a result, research and innovation now focus on integrating more cores on a single chip and improving communication and memory systems, which produces a complex design space for these architectures [93][110]. Because processor complexity is outpacing processor performance, computer architecture simulators tend to slow down over time [28][36], which calls for more advanced simulation strategies and tools. Computer architects have proposed various techniques to speed up their simulators by reducing their workload and, separately, by exploiting their parallelism [89][41][104][15][54][29][55][34]. This work aims to accelerate the detailed simulation of the uncore in a multicore architecture that supports multithreaded workloads. It proposes a methodology that integrates the core-model technique [16], which abstracts core behavior and reduces the time needed to simulate the cores, with techniques that leverage the parallel capacity of GPU platforms to speed up the simulation. The methodology covers establishing a core model and a target uncore, their parallelization, and their interaction, as well as distributing simulator tasks and optimizing GPU runtime. The methodology is evaluated through a case study that simulates architectures from 1 to 128 cores to assess how the speedup scales and to estimate the simulation's accuracy. The reference model is a sequential, GEM5-based CPU version. Accuracy is assessed by comparing the simulation cycles and the number of misses for each cache in the architecture.
The experimental results show that the speed of the GPU-based simulation relative to the CPU version increases with the number of simulated cores. The parallel simulator achieves a speedup starting at 64 cores. In most of the results, the relative error of the simulation cycles is below 10%. Accuracy for the number of misses is consistent across all cache levels, with a relative error below 5% in most cases. However, performance and accuracy depended on the workload type: some simulations showed relative errors of around 40% for the simulation cycles, and some benchmarks did not achieve any speedup at 128 cores.
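The accuracy and speedup figures quoted in the abstract are ratios taken against the sequential CPU reference. The record does not state the exact formulas used in the thesis, so the following is only a minimal sketch that assumes the conventional definitions: relative error as |simulated - reference| / |reference|, and speedup as reference run time divided by accelerated run time. The function names and all numbers are hypothetical and chosen purely to illustrate the arithmetic.

```python
def relative_error(simulated: float, reference: float) -> float:
    """Relative error of a simulated metric against its reference value."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return abs(simulated - reference) / abs(reference)


def speedup(reference_seconds: float, accelerated_seconds: float) -> float:
    """Speedup of the accelerated run; values above 1.0 mean it is faster."""
    if accelerated_seconds <= 0:
        raise ValueError("accelerated run time must be positive")
    return reference_seconds / accelerated_seconds


# Hypothetical values (not taken from the thesis), for illustration only.
cycles_error = relative_error(simulated=1_080_000, reference=1_000_000)
misses_error = relative_error(simulated=9_700, reference=10_000)
gpu_speedup = speedup(reference_seconds=120.0, accelerated_seconds=80.0)

print(f"cycles relative error: {cycles_error:.1%}")
print(f"misses relative error: {misses_error:.1%}")
print(f"speedup: {gpu_speedup:.2f}x")
```

Under these assumed definitions, a speedup below 1.0 corresponds to the case the abstract describes where a benchmark does not benefit from the GPU version.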
publishDate 2024
dc.date.accessioned.none.fl_str_mv 2024-03-11T19:32:41Z
dc.date.available.none.fl_str_mv 2024-03-11T19:32:41Z
dc.date.issued.none.fl_str_mv 2024
dc.type.spa.fl_str_mv Tesis/Trabajo de grado - Monografía - Doctorado
dc.type.coar.spa.fl_str_mv http://purl.org/coar/resource_type/c_db06
dc.type.redcol.spa.fl_str_mv https://purl.org/redcol/resource_type/TD
dc.type.coarversion.spa.fl_str_mv http://purl.org/coar/version/c_b1a7d7d4d402bcce
dc.type.driver.spa.fl_str_mv info:eu-repo/semantics/doctoralThesis
dc.type.version.spa.fl_str_mv info:eu-repo/semantics/draft
format http://purl.org/coar/resource_type/c_db06
status_str draft
dc.identifier.uri.none.fl_str_mv https://hdl.handle.net/10495/38543
url https://hdl.handle.net/10495/38543
dc.language.iso.spa.fl_str_mv eng
language eng
dc.rights.uri.spa.fl_str_mv https://creativecommons.org/licenses/by-nc-sa/4.0/
dc.rights.uri.*.fl_str_mv http://creativecommons.org/licenses/by/2.5/co/
dc.rights.accessrights.spa.fl_str_mv info:eu-repo/semantics/embargoedAccess
dc.rights.coar.spa.fl_str_mv http://purl.org/coar/access_right/c_f1cf
rights_invalid_str_mv https://creativecommons.org/licenses/by-nc-sa/4.0/
http://creativecommons.org/licenses/by/2.5/co/
http://purl.org/coar/access_right/c_f1cf
eu_rights_str_mv embargoedAccess
dc.format.extent.spa.fl_str_mv 124 pages
dc.format.mimetype.spa.fl_str_mv application/pdf
dc.publisher.spa.fl_str_mv Universidad de Antioquia
dc.publisher.place.spa.fl_str_mv Medellín, Colombia
dc.publisher.faculty.spa.fl_str_mv Facultad de Ingeniería. Doctorado en Ingeniería Electrónica y de Computación
institution Universidad de Antioquia
bitstream.url.fl_str_mv https://bibliotecadigital.udea.edu.co/bitstreams/771fc3b5-31f6-4c88-a035-73484e02c33a/download
https://bibliotecadigital.udea.edu.co/bitstreams/674ef0f2-20e7-41e0-8f86-21ee2f5ecf88/download
https://bibliotecadigital.udea.edu.co/bitstreams/15568646-143a-4731-bcc1-9ff0de6959ab/download
https://bibliotecadigital.udea.edu.co/bitstreams/46f5c82c-8782-4b5b-bfe1-1d5165d24329/download
https://bibliotecadigital.udea.edu.co/bitstreams/59774f0e-8af0-45c9-ba28-29b2cffe2a84/download
bitstream.checksum.fl_str_mv 1646d1f6b96dbbbc38035efc9239ac9c
8a4605be74aa9ea9d79846c1fba20a33
331ecb5fcb715540069034b9410b8317
e2716a0b3eab0217016a4a3274cb8863
c31e1f459db6e615db042b1f8cc23fd7
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositorio Institucional de la Universidad de Antioquia
repository.mail.fl_str_mv aplicacionbibliotecadigitalbiblioteca@udea.edu.co
_version_ 1851052354530443264
spelling COL0010717
Doctorado
Doctor en Ingeniería Electrónica y de Computación
CC-LICENSE license_rdf application/rdf+xml; charset=utf-8 927
LICENSE license.txt text/plain; charset=utf-8 1748
ORIGINAL BuitragoJohn_2024_ComputerArchitectureSimulation.pdf Tesis doctoral application/pdf 1959291
TEXT BuitragoJohn_2024_ComputerArchitectureSimulation.pdf.txt Extracted text text/plain 100315
THUMBNAIL BuitragoJohn_2024_ComputerArchitectureSimulation.pdf.jpg Generated Thumbnail image/jpeg 6459
2025-03-26 21:01:44.547