Reducing Latency and Enhancing Accuracy in LLM Inference through Firmware-Level Optimization. International Journal of Signal Processing, Embedded Systems and VLSI Design, [S. l.], v. 5, n. 02, p. 26–36, 2025. DOI: 10.55640/ijvsli-05-02-02. Available at: https://www.academicpublishers.org/journals/index.php/ijvsli/article/view/5873. Accessed: 5 Oct. 2025.