Computer Architecture Simulation on GPU

ABSTRACT: Due to the power wall in computer architectures [33], processor manufacturers have adopted a strategy of increasing the number of cores in modern computers to improve performance and reduce power consumption [100][93][21]. As a result, research and innovation now focus on inte...

Authors: Buitrago Paniagua, John Byron

Resource type: Doctoral thesis

Publication date: 2024

Institution: Universidad de Antioquia

Repository: Repositorio UdeA

Language: eng

id	UDEA2_080ed90b8a592b57902861dd1439fffd
oai_identifier_str	oai:bibliotecadigital.udea.edu.co:10495/38543
network_acronym_str	UDEA2
network_name_str	Repositorio UdeA
repository_id_str
dc.title.spa.fl_str_mv	Computer Architecture Simulation on GPU
dc.title.translated.spa.fl_str_mv	Simulación de Arquitectura de Computadores en GPUs
title	Computer Architecture Simulation on GPU
spellingShingle	Computer Architecture Simulation on GPU Simulación por computadores Computer simulation Métodos de simulación Simulation methods Arquitectura de computadores Computer architecture
title_short	Computer Architecture Simulation on GPU
title_full	Computer Architecture Simulation on GPU
title_fullStr	Computer Architecture Simulation on GPU
title_full_unstemmed	Computer Architecture Simulation on GPU
title_sort	Computer Architecture Simulation on GPU
dc.creator.fl_str_mv	Buitrago Paniagua, John Byron
dc.contributor.advisor.none.fl_str_mv	Velasquez Ricardo, Rivera Fredy Velásquez Vélez, Ricardo Andrés
dc.contributor.author.none.fl_str_mv	Buitrago Paniagua, John Byron
dc.contributor.researchgroup.spa.fl_str_mv	Sistemas Embebidos e Inteligencia Computacional (SISTEMIC)
dc.subject.lemb.none.fl_str_mv	Simulación por computadores Computer simulation Métodos de simulación Simulation methods Arquitectura de computadores Computer architecture
topic	Simulación por computadores Computer simulation Métodos de simulación Simulation methods Arquitectura de computadores Computer architecture
description	ABSTRACT: Due to the power wall in computer architectures [33], processor manufacturers have adopted a strategy of increasing the number of cores in modern computers to improve performance and reduce power consumption [100][93][21]. As a result, research and innovation now focus on integrating more cores on a single chip and improving communication and memory systems, which yields a complex design space for these architectures [93][110]. Because processor complexity is outpacing processor performance, computer architecture simulators tend to slow down over time [28][36], which calls for more advanced simulation strategies and tools. Computer architects have proposed various techniques to speed up their simulators by reducing their workload and, separately, by exploiting their parallelism [89][41][104][15][54][29][55][34]. This work aims to accelerate a detailed simulation of the uncore in a multicore architecture that supports multithreaded workloads. It proposes a methodology that integrates the core models technique [16], which abstracts core behavior and reduces the time needed to simulate the cores, with techniques that leverage the parallel capacity of GPU platforms to speed up the simulation. The methodology includes establishing a core model, a target uncore, their parallelization process, and their interaction. It also covers simulator task distribution and optimization of GPU process runtime. The proposed methodology is evaluated through a case study that simulates architectures from 1 to 128 cores to evaluate speedup scaling and estimate the simulation's accuracy. The reference model is a sequential, GEM5-based CPU version. Accuracy is assessed by comparing the simulation cycles and the number of misses for each cache in the architecture.
The experimental results show that the speedup of the GPU-based simulation over the CPU version grows with the number of simulated cores. The parallel simulator begins to show a speedup at 64 cores. For most results, the relative error in simulation cycles is below 10%. The number of misses is a consistent metric across all cache levels, with a relative error below 5% in most cases. However, performance and accuracy depended on workload type: some simulations showed relative errors of around 40% in simulation cycles, and some benchmarks achieved no speedup at 128 cores.
publishDate	2024
dc.date.accessioned.none.fl_str_mv	2024-03-11T19:32:41Z
dc.date.available.none.fl_str_mv	2024-03-11T19:32:41Z
dc.date.issued.none.fl_str_mv	2024
dc.type.spa.fl_str_mv	Thesis/Degree work - Monograph - Doctorate
dc.type.coar.spa.fl_str_mv	http://purl.org/coar/resource_type/c_db06
dc.type.redcol.spa.fl_str_mv	https://purl.org/redcol/resource_type/TD
dc.type.coarversion.spa.fl_str_mv	http://purl.org/coar/version/c_b1a7d7d4d402bcce
dc.type.driver.spa.fl_str_mv	info:eu-repo/semantics/doctoralThesis
dc.type.version.spa.fl_str_mv	info:eu-repo/semantics/draft
format	http://purl.org/coar/resource_type/c_db06
status_str	draft
dc.identifier.uri.none.fl_str_mv	https://hdl.handle.net/10495/38543
url	https://hdl.handle.net/10495/38543
dc.language.iso.spa.fl_str_mv	eng
language	eng
dc.rights.uri.spa.fl_str_mv	https://creativecommons.org/licenses/by-nc-sa/4.0/
dc.rights.uri.*.fl_str_mv	http://creativecommons.org/licenses/by/2.5/co/
dc.rights.accessrights.spa.fl_str_mv	info:eu-repo/semantics/embargoedAccess
dc.rights.coar.spa.fl_str_mv	http://purl.org/coar/access_right/c_f1cf
rights_invalid_str_mv	https://creativecommons.org/licenses/by-nc-sa/4.0/ http://creativecommons.org/licenses/by/2.5/co/ http://purl.org/coar/access_right/c_f1cf
eu_rights_str_mv	embargoedAccess
dc.format.extent.spa.fl_str_mv	124 pages
dc.format.mimetype.spa.fl_str_mv	application/pdf
dc.publisher.spa.fl_str_mv	Universidad de Antioquia
dc.publisher.place.spa.fl_str_mv	Medellín, Colombia
dc.publisher.faculty.spa.fl_str_mv	Facultad de Ingeniería. Doctorado en Ingeniería Electrónica y de Computación
institution	Universidad de Antioquia
bitstream.url.fl_str_mv	https://bibliotecadigital.udea.edu.co/bitstreams/771fc3b5-31f6-4c88-a035-73484e02c33a/download https://bibliotecadigital.udea.edu.co/bitstreams/674ef0f2-20e7-41e0-8f86-21ee2f5ecf88/download https://bibliotecadigital.udea.edu.co/bitstreams/15568646-143a-4731-bcc1-9ff0de6959ab/download https://bibliotecadigital.udea.edu.co/bitstreams/46f5c82c-8782-4b5b-bfe1-1d5165d24329/download https://bibliotecadigital.udea.edu.co/bitstreams/59774f0e-8af0-45c9-ba28-29b2cffe2a84/download
bitstream.checksum.fl_str_mv	1646d1f6b96dbbbc38035efc9239ac9c 8a4605be74aa9ea9d79846c1fba20a33 331ecb5fcb715540069034b9410b8317 e2716a0b3eab0217016a4a3274cb8863 c31e1f459db6e615db042b1f8cc23fd7
bitstream.checksumAlgorithm.fl_str_mv	MD5 MD5 MD5 MD5 MD5
repository.name.fl_str_mv	Repositorio Institucional de la Universidad de Antioquia
repository.mail.fl_str_mv	aplicacionbibliotecadigitalbiblioteca@udea.edu.co
_version_	1851052354530443264
