OPTICAL CONNECTIVITY

UNLOCKING AI'S POTENTIAL: THE IMPACT OF OPTICAL CONNECTIVITY ON LARGE LANGUAGE MODELS

Christian Urricariet, Senior Director of Product Marketing for Integrated Photonics Solutions at Intel, looks at how generative AI and large language models are driving unprecedented requirements for AI infrastructure and its ability to keep pace with model sizes. New interconnect technologies are needed to enable massive scaling of bandwidth, cluster size, and fabric. A new class of optical interfaces offers a solution: high-bandwidth, low-latency GPU links at a cost and power efficiency that can scale. As an example of this emerging segment, Intel's recently demonstrated Optical Compute Interconnect (OCI) chiplet offers a glimpse of the future: compact, power-efficient co-packaged optical I/O with the performance to enable scaling of AI resources.
In the field of artificial intelligence, the development of large language models (LLMs) has been a game-changer, enabling a range of complex applications that can understand and generate human-like text. However, the computational demands of these models, which can have tens of billions of parameters or more, are immense. Training and running such models require distributing tasks across many GPUs, which often leads to inefficient utilization due to the memory-intensive nature of the workloads. To address this, larger batch sizes are used, but this solution introduces latency and requires even more GPUs to maintain parallel processing efficiency.
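To see why a single accelerator cannot hold such a model, consider a rough back-of-the-envelope estimate. The sketch below uses illustrative assumptions (a 70-billion-parameter model, FP16 weights and gradients, FP32 Adam optimizer state, an 80 GB accelerator), not figures from this article:

```python
# Back-of-the-envelope memory estimate for training a large language model.
# All numbers are illustrative assumptions, not figures from the article.

PARAMS = 70e9            # model size: tens of billions of parameters
BYTES_FP16 = 2           # FP16/BF16 weights and gradients
BYTES_FP32 = 4           # FP32 optimizer state

weights   = PARAMS * BYTES_FP16
gradients = PARAMS * BYTES_FP16
optimizer = PARAMS * BYTES_FP32 * 3   # two Adam moments + FP32 master weights

total_gb   = (weights + gradients + optimizer) / 1e9
gpu_hbm_gb = 80          # e.g. a contemporary 80 GB accelerator

print(f"Training state: ~{total_gb:.0f} GB")
print(f"Minimum GPUs just to hold it: {total_gb / gpu_hbm_gb:.0f}+")
```

That is over a terabyte of training state before counting activations, which grow with batch size; larger batches improve per-GPU utilization but push the GPU count, and the traffic between GPUs, still higher.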
The traditional approach of simply scaling up the number of GPUs is not without challenges. High-density GPU configurations consume significant power and necessitate specialized cooling solutions and infrastructure. Even less dense systems are limited by the performance constraints of copper interconnects and require additional switching levels for larger deployments.

Optical connectivity has emerged as a promising solution to these limitations. It provides high-bandwidth, low-latency connections between GPUs, which not only lowers power consumption but also facilitates better cooling and more efficient data transfer. This advancement is poised to accelerate the training and inference of LLMs by reducing latency and improving throughput and GPU utilization. Moreover, it allows for more flexible GPU placement, which is essential for reducing power and thermal loads, thereby improving the total cost of ownership.

Kernel parallelism, which distributes computations across multiple GPUs, benefits greatly from optical connectivity. The reduced latency in communication between GPUs leads to improved performance and efficiency, making it possible to train larger and more complex AI models.
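As a rough illustration of that sensitivity, the toy model below estimates per-step time as compute plus transfer plus per-hop switching latency. Every parameter (payload, bandwidths, latencies, hop counts) is an illustrative assumption, not a measurement from the article:

```python
# Toy model of why lower-latency, higher-bandwidth GPU links help parallelism.
# step_time = compute + (data moved / link bandwidth) + per-hop latency.
# All parameters below are illustrative assumptions.

def step_time_ms(compute_ms, payload_gb, bw_gbps, latency_us, hops):
    transfer_ms = payload_gb * 8 / bw_gbps * 1e3   # GB -> Gb, seconds -> ms
    return compute_ms + transfer_ms + hops * latency_us / 1e3

compute_ms = 10.0      # per-step GPU compute time
payload_gb = 0.5       # activations/gradients exchanged per step

copper  = step_time_ms(compute_ms, payload_gb, bw_gbps=400,  latency_us=2.0, hops=3)
optical = step_time_ms(compute_ms, payload_gb, bw_gbps=1600, latency_us=0.5, hops=1)

for name, t in [("copper + extra switching", copper), ("co-packaged optics", optical)]:
    print(f"{name}: {t:.2f} ms/step, GPU utilization {compute_ms / t:.0%}")
```

Under these assumed numbers, the faster, flatter fabric roughly doubles the fraction of each step spent computing rather than communicating, which is the effect the article attributes to optical GPU-to-GPU links.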
Despite the potential of optical connectivity, the current generation of pluggable optical transceiver modules used in data centers and early AI clusters cannot meet the scaling requirements of emerging AI workloads due to their size, cost, and power consumption. New integrated optical connectivity solutions co-packaged with GPUs are required, providing the higher bandwidth density, energy efficiency, low latency, lower cost, and extended reach needed to scale AI infrastructure. These so-called Optical Compute Interconnect (OCI) applications can be enabled by aggressive integration along three different vectors. Firstly, increased integration at the PIC level, including on-chip DWDM lasers and semiconductor optical amplifiers (SOAs), to efficiently support scaling through increases in the number of wavelengths, a larger number of fibers, and, optionally, on-chip laser sparing. The use of silicon photonics process technology for the PIC leverages the existing high-yielding, scalable manufacturing and testing infrastructure of CMOS processes, and the hybrid integration of the laser on chip allows the delivery of fully tested Known-Good-Die (KGD) for