Design of Circuits for Analog in-Memory Computing

Vignali, Riccardo

The rapid rise of Artificial Intelligence (AI) has highlighted the need for computing architectures capable of performing low-energy computations on large data sets. The widely used von Neumann architecture has proven inadequate to meet the computational demands of modern Deep Neural Networks (DNNs), mainly because of the physical separation between processing and memory units. This separation, commonly referred to as the ``memory wall'', constitutes a severe communication bottleneck. In recent years, several alternative approaches have been proposed to overcome this limitation. Among these, Analog In-Memory Computing (AiMC) based on resistive memory devices has emerged as a promising way to improve both the energy efficiency and the throughput of Matrix-Vector Multiplication (MVM) operations, which are among the most frequent and energy-demanding tasks during neural network inference. AiMC enables the parallel execution of all multiply-and-accumulate operations required for MVM directly where the network parameters (weights) are stored. This is achieved by exploiting Ohm's law to realize multiplications at the device level and Kirchhoff's current law to perform summations along the bitlines (BLs). Within this framework, Phase-Change Memories (PCMs) exhibit attractive properties such as non-volatility, multi-bit storage, small cell footprint, and CMOS compatibility, enabling the fabrication of dense embedded arrays that retain neural network weights across power cycles. However, PCMs also suffer from key non-idealities, including non-linear I--V characteristics, programming variability, and conductance drift over time. These effects degrade the accuracy of MVM computations and require dedicated circuit- and system-level compensation techniques. This thesis proposes solutions to several challenges in the design of PCM-based AiMC hardware accelerators, and provides techniques that can be extended to other resistive memory technologies.
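As a minimal sketch of the MVM principle described above, consider an idealized array with linear devices, no wire resistance, and no drift. The symbols are illustrative assumptions and are not taken from the thesis: $V_i$ is the voltage applied to input line $i$, $G_{ij}$ is the conductance of the cell connecting input line $i$ to bitline $j$, and $N$ is the number of input lines. Ohm's law gives the current through each cell, and Kirchhoff's current law sums these currents on the shared bitline:
\[
I_{ij} = G_{ij}\, V_i , \qquad I_j = \sum_{i=1}^{N} I_{ij} = \sum_{i=1}^{N} G_{ij}\, V_i .
\]
Under these assumptions the vector of bitline currents equals the product of the transposed conductance matrix and the input voltage vector, so one analog read computes a full MVM. The PCM non-idealities listed above (non-linear I--V characteristics, programming variability, conductance drift) appear as deviations from this ideal relation.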
The first part investigates strategies for input and weight encoding and develops co-design methodologies for digital-to-analog and analog-to-digital converters. Subsequently, a PCM-based AiMC test chip, fabricated in 28 nm FD-SOI, is presented and characterized. This first prototype demonstrates the capability to compute with eight independent 512×512 weight sets, incorporates compensation mechanisms against cell- and array-level non-idealities, and achieves competitive storage-energy efficiency per unit area compared to prior works. A second prototype, under fabrication at the time of writing, has been developed as a dedicated AiMC macro suitable for integration in larger systems. For this design, an improved version of the previously adopted CCO-based ADC topology is proposed to mitigate the impact of the non-idealities observed in the first implementation, together with a trimming scheme that locally calibrates the BL peripheral circuitry. In addition, a detailed power consumption model for both the ADC and the BL-MVM operations has been developed to support security analysis, assessing the robustness of the system against external attackers attempting to retrieve sensitive information stored inside the memory tile, such as neural network weights. Finally, an enhanced version of the BL voltage regulator, used to bias the BLs at the proper operating voltage, has been designed to further improve efficiency, memory access time, and accuracy relative to the previous versions.