Reducing Latency and Enhancing Accuracy in LLM Inference through Firmware-Level Optimization. ijvsli. 2025;5(02):26-36. doi:10.55640/ijvsli-05-02-02