Will Optical Switches Become a Key Element in High-Performance AI/ML Datacenter Networks?

Sunday, 24 March, 16:00 – 18:30

Room 6C

The application of optical switching in data center networks has been studied extensively. Google’s recent announcement of an optical circuit switch deployed in a production data center has renewed interest in the requirements and challenges of optical switching. In addition, generative AI models are advancing rapidly, with parameter counts growing exponentially. This requires GPU clusters with very high bandwidth density and low energy consumption, which has revived research on photonic switch fabrics co-integrated and optically interconnected with multiple GPUs/TPUs/CPUs. This workshop discusses the challenges and opportunities of optical switching for large-scale data center networks, especially GPU clusters and HPC networks, at three levels: system networking (capacity demand, latency, fast configuration and control schemes, flexibility and scalability, cost, power consumption), optical switch architectures (switch radix, topology, size and scale, performance), and device performance (loss, bandwidth, switching speed, crosstalk, and integration platform). We will explore innovative photonic technologies and network architectures that enable optical switching for AI/ML applications. Some of the topics we intend to explore in this workshop are:

1. What are the requirements and challenges for the broad adoption of optical circuit switching?

2. How do optical and electrical switch systems co-exist to enable scaling and optimize the cost-to-performance metric for AI/ML systems?

3. Will semiconductor-based optical switches (e.g., silicon photonics) play a significant role in large-scale integrated optical switching, following the deployment of MEMS-based OCS systems? What hurdles need to be overcome?

4. Is there a role for fast optical switching in AI/ML systems? What are the requirements for fast optical switching?

5. What novel system architectures and packaging techniques, involving co-packaged optics, optical interposers, and I/Os with GPUs/CPUs/TPUs, are needed for future AI/ML computing?

Organizers

Qixiang Cheng, Cambridge University, United Kingdom

Wenhua Lin, Intel Corp., United States

Kazuhiro Ikeda, AIST, Japan

Speakers: Session 1

Keren Bergman, Columbia University, United States

Ben Lee, NVIDIA, United States

Shu Namiki, AIST, Japan

George Papen, UCSD/Google, United States

Stefano Stracca, Ericsson, Italy

Speakers: Session 2

Darius Bunandar, Lightmatter, United States

Richard Penty, Cambridge University, United Kingdom

Daniel Perez-Lopez, iPronics, Spain

Ming Wu, University of California, Berkeley, United States
