Wednesday, October 27, 2021
In recent years interest in sparse neural networks has steadily increased, accelerated by NVIDIA’s inclusion of dedicated hardware support in their recent Ampere GPUs. Sparse networks feature both limited interconnections between the neurons and restrictions on the number of neurons that are permitted to become active. By introducing this weight and activation sparsity, significant simplification of the computations required to both train and use the network is achieved. These sparse networks can achieve equivalent accuracy to their traditional ‘dense’ counterparts but have the potential to outperform the dense networks by an order of magnitude or more. In this presentation we start by discussing the opportunity associated with sparse networks and provide an overview of the state-of-the-art techniques used to create them. We conclude by presenting new software algorithms that unlock the full potential of sparsity on current hardware platforms, highlighting 100X speedups on FPGAs and 20X on CPUs and GPUs.