Ever-increasing computing needs in signal processing and inference are straining present-day digital computers because of their high power consumption and latency. In principle, analog signal processing and computing can offer significant power-efficiency, latency, and area advantages over digital computing architectures. However, prior analog computing systems have been limited by their sensitivity to process, voltage, and temperature (PVT) variations, by noise, and by performance degradation as they grow in size (lack of scalability), which has hindered their development and usability in modern systems. The overarching vision of this research thrust is to develop and demonstrate scalable, low-noise, PVT-insensitive, analog-friendly computing architectures that enable high computing speeds and efficiencies. Under this vision, we are specifically pursuing RF correlators (dot-product calculators) and matrix-vector multipliers (MVMs), which are at the heart of modern radar processing, artificial intelligence (AI), and machine learning (ML) applications. In digital systems, these operations follow a multiply-and-accumulate (MAC) architecture and require a number of binary operations that grows polynomially with the required compute precision, making them extremely computationally intensive for large vector sizes and high precision.
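To make the cost argument concrete, the following is a minimal sketch of a digital MAC dot product and MVM that counts the single-bit AND operations needed to form partial products. It assumes unsigned operands and a schoolbook (shift-and-add) multiplier, and it ignores the adder tree, which contributes a count of similar order. The function names `shift_add_multiply`, `mac_dot`, and `matvec` are illustrative and do not come from the work described here.

```python
def shift_add_multiply(a, b, bits):
    """Multiply two unsigned `bits`-wide integers the schoolbook way.

    Returns (product, and_ops), where and_ops counts the single-bit ANDs
    needed to form the partial products (assumption: adders not counted).
    """
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    product = 0
    and_ops = 0
    for i in range(bits):
        b_i = (b >> i) & 1
        # One partial product is `bits` single-bit ANDs (a_j AND b_i).
        partial = a if b_i else 0
        and_ops += bits
        product += partial << i
    return product, and_ops


def mac_dot(x, w, bits):
    """Dot product as a chain of multiply-and-accumulate steps."""
    acc = 0
    total_ops = 0
    for x_i, w_i in zip(x, w):
        p, ops = shift_add_multiply(x_i, w_i, bits)
        acc += p            # accumulate
        total_ops += ops    # bits**2 ANDs per multiply
    return acc, total_ops


def matvec(matrix, x, bits):
    """Matrix-vector product as one MAC dot product per row."""
    results = [mac_dot(row, x, bits) for row in matrix]
    return [r for r, _ in results], sum(ops for _, ops in results)


if __name__ == "__main__":
    for bits in (4, 8, 16):
        value, ops = mac_dot([3, 5, 7], [2, 4, 6], bits)
        print(bits, value, ops)
```

Under these assumptions, the AND count per multiply is the square of the bit width, so doubling the precision quadruples the partial-product work for every element of the vector.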
Ultra-Efficient Analog Correlators Using Charge Thresholding
Cross-correlation, like any pattern-matching computation, has sufficient ensemble redundancy to tolerate minor approximation errors in individual MAC operations. Hence, mechanisms other than true multiplication, such as a simple thresholding function, can be used to estimate the true cross-correlation R with accuracy similar to that of MAC-based correlators. We proposed that, unlike MAC-based digital correlators, multiplier-less (non-MAC) analog correlators can be implemented in a more energy-efficient, hardware-friendly, and scalable manner using universal conservation laws (charge, current, mass, energy, etc.) and the thresholding/limiting nonlinearities commonly available in modern CMOS devices. To demonstrate this architecture, we developed a margin-propagation (MP) compute-based analog correlator that relies on the naturally available device thresholding function. It performs real-time correlation of 5 GS/s data with a precision of 8-bit ENOB and a compute efficiency of 152 TOPS/W, and was presented at ISSCC 2024. Through live demonstrations at ISSCC 2024, we established that such an analog correlator can enhance the performance of several wireless and radar systems, including code-division multiple access (CDMA) communication, impulse radars, spread-spectrum radars, and wideband spectrum sensing.
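The claim that ensemble redundancy tolerates crude per-sample operations can be illustrated with a classic multiplier-less estimator, the hard-limiting (polarity-coincidence) correlator. This is a swapped-in textbook technique chosen for illustration, not the charge-thresholding circuit described above. The sketch assumes zero-mean, unit-variance, jointly Gaussian inputs, for which the arcsine (Van Vleck) relation maps the sign-agreement statistic back to the true correlation coefficient. All function names are hypothetical.

```python
import math
import random


def correlated_gaussian_pair(rho, n, seed=0):
    """Draw n samples of two unit-variance Gaussians with correlation rho."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(rho * x + scale * rng.gauss(0.0, 1.0))
    return xs, ys


def mac_correlation(xs, ys):
    """Reference estimate: mean of true products (one MAC per sample)."""
    return sum(x * y for x, y in zip(xs, ys)) / len(xs)


def sign_correlation(xs, ys):
    """Multiplier-less estimate: +1 if signs agree, -1 otherwise, averaged."""
    agree = sum(1 for x, y in zip(xs, ys) if (x >= 0.0) == (y >= 0.0))
    return (2 * agree - len(xs)) / len(xs)


def arcsine_correct(r_sign):
    """Invert the arcsine law (assumes jointly Gaussian inputs)."""
    return math.sin(0.5 * math.pi * r_sign)


if __name__ == "__main__":
    xs, ys = correlated_gaussian_pair(rho=0.6, n=100_000, seed=1)
    print("MAC estimate: ", mac_correlation(xs, ys))
    print("sign estimate:", arcsine_correct(sign_correlation(xs, ys)))
```

Each sample contributes only a one-bit comparison, yet averaging over many samples recovers the correlation to within the statistical error of the MAC-based estimate.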
3D-NAND Flash-Based Analog Matrix-Vector Multipliers (MVMs)
Concepts from MP-based computing can be extended to compute-in-memory (CIM) architectures. Most CIM architectures proposed to date rely on non-volatile memory (NVM) elements arranged in a NOR-based configuration; however, NVMs in a NAND configuration can be more energy- and area-efficient. In particular, 3D-NAND flash memories are attractive because of their potential for ultra-high memory density and ultra-low cost per bit of storage. To this end, we are developing a NAND-flash-based CIM architecture that combines conventional 3D-NAND flash with thresholding-based approximate computing.
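For readers unfamiliar with MP, here "MP" is taken to mean margin propagation, as in the MWSCAS 2023 title listed under Related Publications. The primitive is commonly formulated in the margin-propagation literature as finding the value z that satisfies sum_i max(0, x_i - z) = gamma for inputs x_i and a budget gamma > 0. The thresholded sum plays the role of a conservation constraint, which is why it maps naturally onto charge or current in a circuit. The sketch below solves that constraint in software. It assumes this general formulation and is not a description of the circuits or the CIM architecture above. The name `margin_propagation` is illustrative.

```python
def margin_propagation(x, gamma):
    """Return z such that sum(max(0, x_i - z) for x_i in x) == gamma.

    Assumption: this is the generic MP constraint from the literature,
    solved here by sorting; hardware would reach z by physical settling.
    """
    if not x:
        raise ValueError("x must be non-empty")
    if gamma <= 0:
        raise ValueError("gamma must be positive")

    xs = sorted(x, reverse=True)
    prefix = 0.0
    for k, x_k in enumerate(xs, start=1):
        prefix += x_k
        # If exactly the k largest inputs exceed z, then
        # sum_{i<=k} (x_i - z) = gamma  =>  z = (prefix - gamma) / k.
        z = (prefix - gamma) / k
        next_largest = xs[k] if k < len(xs) else float("-inf")
        if z >= next_largest:
            return z
    raise AssertionError("unreachable: the last iteration always returns")


def thresholded_sum(x, z):
    """Left-hand side of the MP constraint, useful for checking a solution."""
    return sum(max(0.0, x_i - z) for x_i in x)


if __name__ == "__main__":
    inputs = [3.0, 1.0, 0.5]
    for gamma in (1.0, 3.0, 6.0):
        z = margin_propagation(inputs, gamma)
        print(gamma, z, thresholded_sum(inputs, z))
```

Only comparisons, additions, and a thresholding nonlinearity appear in the constraint itself, so no multiplier is needed to evaluate it.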
Related Publications:
- K. Rashed, A. C. Undavalli, S. Chakrabartty, A. Nagulu and A. Natarajan, “A Scalable and Instantaneously Wideband 5GS/s RF Correlator based on Charge Thresholding achieving 8-bit ENOB and 152 TOPS/W Compute Efficiency,” in IEEE International Solid-State Circuits Conference (ISSCC), 2024.
- A. C. Undavalli, G. Cauwenberghs, A. Natarajan, S. Chakrabartty and A. Nagulu, “ADC-less 3D-NAND Compute-in-Memory Architecture using Margin Propagation,” in Midwest Symposium on Circuits and Systems (MWSCAS), Aug. 2023.