Optimizing Memory-Access Patterns for Deep Learning Accelerators

Hongbin Zheng, Sejong Oh, Huiqing Wang, Preston Briggs, Jiading Gai, Animesh Jain, Yizhi Liu, Rich Heaton, Randy Huang, Yida Wang

Computer Science · Non-peer-reviewed Preprint

Published 2020-02-27

Abstract

Deep learning (DL) workloads are moving towards accelerators for faster processing and lower cost. Modern DL accelerators are good at handling the large-scale multiply-accumulate operations that dominate DL workloads; however, it is challenging to make full use of the compute power of an accelerator because the data must be properly staged in a software-managed scratchpad memory. Failing to do so can result in significant performance loss. This paper proposes a systematic approach that leverages the polyhedral model to analyze all operators of a DL model together to minimize the number of memory accesses. Experiments show that our approach can substantially reduce the impact of memory accesses required by common neural-network models on a homegrown AWS machine-learning inference chip named Inferentia, which is available through Amazon EC2 Inf1 instances.
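
To make concrete why analyzing operators together can reduce memory accesses, the following toy accounting model compares off-chip traffic for a chain of elementwise operators run one at a time against the same chain run tile by tile out of a scratchpad. It is only an illustrative sketch, not the paper's polyhedral analysis: the function names, the assumption that every operator is elementwise, and the assumption that one tile fits in the scratchpad are all hypothetical simplifications.

```python
def dram_accesses_unfused(num_elements: int, num_ops: int) -> int:
    """Count off-chip element accesses when each operator runs separately.

    Assumption: every operator reads its whole input from main memory
    and writes its whole output back before the next operator starts.
    """
    return num_ops * 2 * num_elements


def dram_accesses_fused(num_elements: int, num_ops: int, tile: int) -> int:
    """Count off-chip element accesses when the operator chain is fused.

    Assumption: a tile of `tile` elements fits in the scratchpad, so all
    `num_ops` operators can be applied to it before it is written back.
    """
    if tile <= 0:
        raise ValueError("tile must be positive")
    accesses = 0
    for start in range(0, num_elements, tile):
        size = min(tile, num_elements - start)
        accesses += size  # load the tile from main memory once
        # All num_ops operators work on the scratchpad copy here,
        # so they add no off-chip traffic in this model.
        accesses += size  # store the final result once
    return accesses


if __name__ == "__main__":
    n, ops = 1024, 3
    print("unfused:", dram_accesses_unfused(n, ops))
    print("fused:  ", dram_accesses_fused(n, ops, tile=128))
```

In this model the fused count does not depend on the number of operators, while the unfused count grows linearly with it. Real operators such as convolutions have reuse and layout constraints that this sketch ignores.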

Keywords

Computer Science
