9.5 A 6K-MAC Feature-Map-Sparsity-Aware Neural Processing Unit in 5nm Flagship Mobile SoC

2021
On-device machine learning is critical for mobile products because it enables real-time applications (e.g., AI-powered camera features) that need to be responsive, always available (i.e., not dependent on network connectivity), and privacy preserving. The platforms used in such scenarios have limited computing resources, power, and memory bandwidth. The demand for on-device machine learning has triggered wide development of efficient neural-network accelerators that promise high energy and area efficiency compared to general-purpose processors such as CPUs. Support for a comprehensive range of neural networks is equally important because the field of deep learning is evolving rapidly, as depicted in Fig. 9.5.1. Recent work on neural-network accelerators has focused on improving energy efficiency while delivering the high performance required by real-time applications. For example, weight zero-skipping and pruning have been deployed in recent accelerators [2]–[7]. SIMD- or systolic-array-based accelerators [2]–[4], [6] provide the flexibility to support various types of compute across a wide range of deep neural network (DNN) models.
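As a rough illustration of the feature-map-sparsity idea named in the title, the sketch below models zero-skipping in software: products are only accumulated for nonzero activations, so ReLU-induced sparsity directly reduces the number of MAC operations issued. This is a minimal conceptual sketch assuming integer activations and weights; the function name sparse_mac and the random test data are illustrative and do not describe the NPU's actual hardware dataflow.

```python
import numpy as np

def sparse_mac(feature_map, weights):
    """Accumulate weight * activation products, skipping zero activations.

    Software model of feature-map zero-skipping: a real NPU would gate or
    re-schedule MAC lanes in hardware, whereas here we simply skip zero
    entries to show how sparsity reduces the multiply count.
    """
    acc = 0
    macs_issued = 0
    for a, w in zip(feature_map.ravel(), weights.ravel()):
        if a == 0:          # zero activation: no multiply is issued
            continue
        acc += int(a) * int(w)
        macs_issued += 1
    return acc, macs_issued

# Example: a ReLU-style feature map with many zeros cuts the MAC count.
fm = np.maximum(np.random.randint(-8, 8, size=64), 0)  # zeros appear after ReLU
wt = np.random.randint(-8, 8, size=64)
result, macs = sparse_mac(fm, wt)
print(f"dot product = {result}, MACs issued = {macs} of {fm.size}")
```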