
When Selecting Memory for AI, You Must Choose…Wisely

Sept. 3, 2021
AI places heavy demands on memory, so choosing the right memory architecture becomes a critical step in the design process.

This article is part of our Library Series: System Design: Memory for AI

What you’ll learn:

  • The benefits of on-chip memory.
  • Dealing with the capacity issues of on-chip memory.
  • HBM vs. GDDR: Determining the best option.

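In part 3 of this series, we explored how the roofline model can help determine whether a given AI architecture is limited by its compute performance or by its memory bandwidth. With this data, designers can make informed decisions about which type of memory system is the best fit.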
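As a rough illustration of the roofline idea, the sketch below classifies a workload as compute-bound or memory-bound by comparing its operational intensity (operations per byte moved) with the machine's ridge point. The function names and the peak figures are assumptions chosen for illustration, not numbers taken from this series.

```python
def attainable_flops(peak_flops, mem_bw_bytes, intensity):
    """Roofline: performance is capped by peak compute or by bandwidth x intensity."""
    return min(peak_flops, mem_bw_bytes * intensity)


def limiting_factor(peak_flops, mem_bw_bytes, intensity):
    """Return which resource limits a kernel with the given FLOPs per byte."""
    ridge = peak_flops / mem_bw_bytes  # intensity at which the two roofs meet
    return "compute" if intensity >= ridge else "memory bandwidth"


# Hypothetical accelerator: 100 TFLOP/s peak compute, 256 GB/s memory bandwidth.
PEAK = 100e12
BW = 256e9

for name, intensity in [("low-reuse layer", 10.0), ("high-reuse layer", 1000.0)]:
    perf = attainable_flops(PEAK, BW, intensity)
    limit = limiting_factor(PEAK, BW, intensity)
    print(f"{name}: {perf / 1e12:.2f} TFLOP/s, limited by {limit}")
```

A workload that lands on the bandwidth-limited side of the ridge is the kind that benefits most from a faster memory system.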

A variety of common memory systems are being used in high-performance AI applications, each with its own unique set of benefits and challenges. More than anything, choosing the “right” solution depends on the application and your use case.

On-Chip Memory: All Business

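On-chip memory is the highest-bandwidth, most power-efficient solution available. It can deliver tens of terabytes per second of memory bandwidth, and modern reticle-sized processors can reach capacities in the hundreds of megabytes. In addition, the short distance that data must travel between on-chip memory and the compute units greatly reduces access latency and further improves power efficiency.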

The low latency and high bandwidth of on-chip memory allow for extremely high utilization of compute engines, making it well-suited to high-performance, low-power applications, especially when processing in handheld and battery-operated devices.

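Although the performance and power efficiency of on-chip memory are unmatched, its primary drawback is limited capacity. The storage capacity of on-chip memory is far lower than that of external DRAM solutions, which today can reach into the tens of gigabytes when multiple DRAMs are used.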

A number of interesting innovations have emerged that make better use of the limited capacity of on-chip memory, including reduced precision data types and recalculating intermediate results to avoid occupying on-chip storage. However, the tremendous growth in training sets and model sizes continues to outpace these innovations, resulting in on-chip memory being better equipped for AI inference tasks than for AI training tasks.

Because of these tradeoffs, on-chip memory is a great solution when running inference tasks on smaller neural networks that fit within the capacity of the memory, or when inferencing in environments where multiple chips can work together to provide a solution. If this isn’t the case, it’s best to pursue other external memory options, such as high bandwidth memory (HBM) and graphics double data rate (GDDR).
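To make the capacity question concrete, here is a minimal sketch that estimates the weight footprint of a network at different precisions and checks it against an on-chip budget. The parameter count and the 300-MB capacity are hypothetical, and the check ignores activations and working buffers, so treat it as a first-pass estimate only.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_footprint_bytes(num_params, dtype):
    """Bytes needed to hold the weights alone at the given precision."""
    return num_params * BYTES_PER_PARAM[dtype]


def fits_on_chip(num_params, dtype, on_chip_bytes):
    """True if the weights fit; activations and buffers are ignored here."""
    return weight_footprint_bytes(num_params, dtype) <= on_chip_bytes


# Hypothetical values: a 100M-parameter network and 300 MB of on-chip memory.
PARAMS = 100_000_000
ON_CHIP = 300 * 10**6

for dtype in BYTES_PER_PARAM:
    mb = weight_footprint_bytes(PARAMS, dtype) / 10**6
    fits = fits_on_chip(PARAMS, dtype, ON_CHIP)
    print(f"{dtype}: {mb:.0f} MB of weights, fits on chip: {fits}")
```

With these assumed numbers, the network only fits once the weights drop from 32-bit to 16-bit or 8-bit precision, which is the effect reduced-precision data types are meant to achieve.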

HBM: Complex Power

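HBM is the newest high-volume DRAM solution and has been rapidly adopted in AI designs. HBM uses stacking within the device to achieve high capacity, together with an extremely wide interface (1,024 data wires) running at a relatively low data rate (2 Gb/s per wire in HBM2), to achieve extremely high bandwidth with good signal integrity. The unique combination of stacking and a wide, slow interface enables HBM to reach very high performance while maintaining good power efficiency. With more capacity than on-chip memory, HBM offers the best combination of bandwidth and power efficiency among external memory solutions.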
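The interface figures above translate directly into peak bandwidth. The short sketch below does the arithmetic for one HBM2 device using the 1,024 data wires and 2 Gb/s per wire quoted in the text; the function name is a hypothetical helper, and the result is a theoretical peak rather than a measured figure.

```python
def interface_bandwidth_gbytes(data_wires, gbits_per_wire):
    """Peak bandwidth in GB/s: wires x per-wire rate (Gb/s) / 8 bits per byte."""
    return data_wires * gbits_per_wire / 8


hbm2 = interface_bandwidth_gbytes(data_wires=1024, gbits_per_wire=2)
print(f"HBM2 device: {hbm2:.0f} GB/s")  # prints "HBM2 device: 256 GB/s"
```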

The area and power advantages that result from the HBM architecture come at an additional design and manufacturing cost. The numerous I/Os require a fine pitch that necessitates the use of an additional silicon interposer, substrate, and intricate stacking within the DRAM and between components in the system, adding extra cost and complexity before being assembled onto the PCB. Keeping the silicon cool and addressing the system engineering challenges associated with stacking add further challenges to implementing HBM2 solutions.

However, for organizations with the engineering skill to implement HBM memory systems, and with the ability to amortize the added costs, HBM2 can be a great choice for systems that need an external memory solution.

GDDR6: The All-Rounder

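GDDR was created for the graphics industry 20 years ago. It offers a good middle ground between on-chip memory and HBM DRAM in terms of bandwidth, power efficiency, cost, and reliability. GDDR leverages the more familiar high-volume manufacturing and assembly techniques used for traditional DRAMs like DDR, making it a good solution for balancing performance and complexity.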

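In contrast to HBM DRAMs, which implement a large number of data wires running at modest data rates, GDDR6 DRAMs take the opposite approach: they use 32 data wires running at 16 Gb/s, eight times the speed of HBM2 DRAMs. Having fewer data wires eliminates the need for additional components like an interposer. However, running at higher data rates introduces signal-integrity and power-efficiency challenges.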
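A quick way to see the "few fast wires versus many slow wires" tradeoff is to work out how many GDDR6 devices it takes to match one HBM2 device. The sketch below uses the per-device figures quoted in the text (32 wires at 16 Gb/s for GDDR6, 1,024 wires at 2 Gb/s for HBM2); the helper names are hypothetical, and real systems may be configured differently.

```python
import math


def device_bandwidth_gbytes(data_wires, gbits_per_wire):
    """Peak per-device bandwidth in GB/s (8 bits per byte)."""
    return data_wires * gbits_per_wire / 8


def devices_needed(target_gbytes, per_device_gbytes):
    """Smallest whole number of devices that reaches the target bandwidth."""
    return math.ceil(target_gbytes / per_device_gbytes)


gddr6 = device_bandwidth_gbytes(data_wires=32, gbits_per_wire=16)   # 64 GB/s
hbm2 = device_bandwidth_gbytes(data_wires=1024, gbits_per_wire=2)   # 256 GB/s

count = devices_needed(hbm2, gddr6)
print(f"GDDR6 devices to match one HBM2 device: {count}")
print(f"Total data wires: {count * 32} (GDDR6) vs. 1024 (HBM2)")
```

Under these figures, four GDDR6 devices reach the same 256 GB/s with 128 data wires in total, which is why the interposer can be avoided but each wire has to run much faster.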

Those issues can be managed with carefully designed PHYs, packages, and boards. Furthermore, GDDR DRAM devices don’t utilize stacking, further simplifying the manufacturing process and reducing cost. As a result, GDDR offers a good balance of performance, power efficiency, and cost.

SoC Considerations for Choosing Between HBM2 and GDDR6

When designing a processor that uses GDDR or HBM, one must consider some important tradeoffs. In addition to the aforementioned differences between the DRAMs themselves, there are other disparities in how processors connect to these DRAMs.

Among the most important differences are those related to the PHY circuits on the SoC, which connect it to the DRAMs. For equivalent GDDR6 and HBM2 memory systems that deliver 256 GB/s of memory bandwidth, GDDR6 PHYs require between 1.5 and 1.75 times the area on the SoC compared to HBM2 PHY circuits delivering the same performance.

In terms of power, the differences are even more pronounced: GDDR6 PHYs consume between three-and-a-half and four-and-a-half times as much power as the HBM2 PHY at the same bandwidth. From the point of view of an SoC designer, this large disparity in power and area favors HBM2 memory systems. However, the added cost and implementation complexity of HBM2 memory systems can make GDDR6 the more attractive choice.
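To get a feel for what these ratios mean in a budget, the sketch below scales a baseline HBM2 PHY figure by the area and power ranges quoted above. The ratios come from the text; the baseline area and power values are purely hypothetical placeholders, so substitute figures from your own PHY datasheets.

```python
# Ratios quoted in the text for GDDR6 PHYs relative to an HBM2 PHY at 256 GB/s.
AREA_RATIO = (1.5, 1.75)
POWER_RATIO = (3.5, 4.5)


def gddr6_range(hbm2_value, ratio):
    """Scale an HBM2 PHY figure by the (low, high) GDDR6 ratio."""
    low, high = ratio
    return hbm2_value * low, hbm2_value * high


# Hypothetical HBM2 PHY baseline, for illustration only.
HBM2_PHY_AREA_MM2 = 10.0
HBM2_PHY_POWER_W = 2.0

area = gddr6_range(HBM2_PHY_AREA_MM2, AREA_RATIO)
power = gddr6_range(HBM2_PHY_POWER_W, POWER_RATIO)
print(f"GDDR6 PHY area:  {area[0]:.1f} to {area[1]:.1f} mm^2")
print(f"GDDR6 PHY power: {power[0]:.1f} to {power[1]:.1f} W")
```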

Whether you choose HBM2 or GDDR6 ultimately depends on what matters most in the system at hand. If you’re prepared to handle the cost and engineering complexity of an HBM2 implementation, it’s the best route to take. But for systems that prioritize cost and more mainstream manufacturing methods, GDDR6 is an excellent solution. There’s no wrong answer when it comes to picking a high-bandwidth memory solution for your application.

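Both on-chip and external memory solutions offer high bandwidth and low latency to meet the needs of today’s most demanding applications. Choose wisely, and your efforts will be rewarded.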
