Artificial intelligence? Generalized intelligence? Superintelligence? Delivering on this promise requires ever larger models, enormous training datasets, and millions of GPUs. Distributed training enables workloads to be spread over large geographic regions, separated by hundreds or even thousands of kilometers. Novel networks are being introduced to interconnect these compute clusters, resolving power-density problems while introducing new challenges such as restricted bandwidth, network latency, transceiver complexity, and power dissipation.
The key questions to address in this workshop are:
- Which design constraints (latency, total bandwidth, power consumption) limit distributed training the most?
- What is the practical limit on the distance between geo-distributed AI data centers? (See the back-of-envelope latency sketch after this list.)
- In which direction should optical transceiver technologies (DSP, FEC, etc.) evolve?
- How are transport-layer technologies enabling or hindering the design of large training clusters?
- Does hollow-core fiber or optical switching help?
- Can standards keep up?
- Will we see training clusters further evolve to support “Agentic AI”?
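One reason the distance question matters is fiber propagation delay. As a rough illustration (not part of the workshop material): assuming a typical silica-fiber group index of about 1.47, a minimal sketch of the resulting one-way and round-trip delays looks like this; the distances chosen are arbitrary examples.

```python
# Back-of-envelope fiber propagation delay between geo-distributed sites.
# Illustrative assumptions: ~1.47 group index for standard single-mode fiber;
# distances are example values, not figures from the workshop description.

SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum, km/s
FIBER_GROUP_INDEX = 1.47            # typical silica single-mode fiber

def one_way_delay_ms(distance_km: float) -> float:
    """One-way propagation delay over standard fiber, in milliseconds."""
    return distance_km * FIBER_GROUP_INDEX / SPEED_OF_LIGHT_KM_S * 1e3

for km in (100, 500, 1000, 2000):
    print(f"{km:>5} km: one-way {one_way_delay_ms(km):.2f} ms, "
          f"round-trip {2 * one_way_delay_ms(km):.2f} ms")
```

At roughly 5 µs per kilometer one-way, a 1000 km separation adds about 10 ms of round-trip latency before any switching or transceiver processing, which frames the trade-offs the questions above raise.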
Organizers
- Brandon Buscaino, Ciena, Canada
- Sergejs Makovejs, Corning, United Kingdom
- Jeffrey Rahn, Meta, USA
- Jesse Simsarian, Nokia, USA