Reducing Latency and Enhancing Accuracy in LLM Inference through Firmware-Level Optimization. (2025). International Journal of Signal Processing, Embedded Systems and VLSI Design, 5(02), 26-36. https://doi.org/10.55640/ijvsli-05-02-02