Abstract

Designing high-performance, energy-efficient AI accelerators requires significant engineering effort, and as the rapidly evolving field of machine learning produces new models, the current approach of designing ad hoc accelerators does not scale. In this talk, I will present our ongoing research on a high-level synthesis (HLS)-based framework for design-space exploration and generation of hardware accelerators for AI. Given architectural parameters such as datatype, scaling granularity, compute parallelism, and memory sizes, the framework generates a performant, fabrication-ready accelerator. Accelerators generated with this framework have been taped out in several chips targeting a range of workloads, including convolutional neural networks and transformer networks. I will describe the generator framework and show how it can also serve as a benchmarking tool for designs leveraging emerging technologies.
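
For illustration only, the following minimal Python sketch shows the kind of architectural parameters such a generator might take and how a design-space sweep over them could be expressed. All names and values here are hypothetical assumptions, not the framework's actual interface.

    # Hypothetical illustration: the abstract does not specify the framework's
    # configuration interface; this sketch only shows the kinds of parameters
    # (datatype, scaling granularity, parallelism, memory sizes) it mentions.
    from dataclasses import dataclass

    @dataclass
    class AcceleratorConfig:
        datatype: str = "int8"                     # numeric format of the datapath (assumed example)
        scaling_granularity: str = "per_channel"   # e.g. per-tensor vs. per-channel scaling
        pe_rows: int = 16                          # compute parallelism: processing-element array rows
        pe_cols: int = 16                          # compute parallelism: processing-element array columns
        weight_buffer_kib: int = 512               # on-chip weight memory size
        activation_buffer_kib: int = 256           # on-chip activation memory size

    # A design-space exploration flow could sweep such configurations and
    # evaluate each candidate before committing one to HLS and tape-out.
    candidates = [
        AcceleratorConfig(datatype=dt, pe_rows=p, pe_cols=p)
        for dt in ("int8", "fp16")
        for p in (8, 16, 32)
    ]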