By Sameh Boujelbene
OFC, the foremost global event in optical communications and networking, draws attendees from around the world who want to learn about the latest advancements and innovations in the industry. While OFC covers a broad range of industries and applications for optics, my focus is specifically on the data center market, for both intra- and inter-data center applications. As OFC approaches, I'd like to use this blog to discuss some of the key topics that will take center stage at the conference.
1. The increasing role of optics in AI Networks:
The scale of emerging large AI applications appears to be expanding exponentially, with the number of parameters that these applications have to process growing 1000X every 2 to 3 years. Consequently, the average size of AI clusters in terms of number of accelerators is quadrupling every 2 years, evolving from a typical size of 256 to 1000, then rapidly to 4K, and now some clusters boast 32K and 64K accelerators.
Another key aspect is the amount of bandwidth per accelerator, which is expected to grow from 200/400/800 Gbps today to over 1 Tbps in the near future. In summary, the traffic growth within AI networks is bolstered not only by growth in cluster size but also by increasing bandwidth per accelerator. As a result, the network bandwidth in AI clusters is growing at an astonishing rate, with a tenfold increase every two years in certain Cloud Service Provider (SP) networks.
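To make the compounding concrete, here is a minimal arithmetic sketch. It assumes that aggregate network bandwidth can be approximated as the number of accelerators multiplied by the bandwidth per accelerator, which is a simplification rather than a model taken from the report. The function names are hypothetical, and the growth factors are the ones quoted above. Note that the tenfold figure applies to certain Cloud SP networks, so the implied per-accelerator growth is illustrative only.

```python
def aggregate_bandwidth_tbps(accelerators: int, gbps_per_accelerator: float) -> float:
    """Approximate aggregate bandwidth (Tbps) as accelerators x per-accelerator Gbps."""
    return accelerators * gbps_per_accelerator / 1000.0


def annualized(factor: float, years: float) -> float:
    """Convert a growth factor over `years` into an equivalent yearly factor."""
    return factor ** (1.0 / years)


# Growth factors quoted in the text, both over a two-year period.
CLUSTER_GROWTH_2Y = 4.0    # cluster size quadruples every 2 years
NETWORK_GROWTH_2Y = 10.0   # network bandwidth grows tenfold every 2 years

# If total bandwidth = cluster size x per-accelerator bandwidth, the growth
# factors multiply, so the per-accelerator factor is the ratio of the two.
implied_per_accelerator_growth_2y = NETWORK_GROWTH_2Y / CLUSTER_GROWTH_2Y

if __name__ == "__main__":
    print("Implied per-accelerator growth over 2 years:", implied_per_accelerator_growth_2y)
    print("Yearly cluster growth factor:", annualized(CLUSTER_GROWTH_2Y, 2))
    print("Yearly network growth factor:", round(annualized(NETWORK_GROWTH_2Y, 2), 3))
    # Example sizes and speeds taken from the ranges mentioned in the text.
    print("32K accelerators at 800 Gbps:", aggregate_bandwidth_tbps(32 * 1024, 800), "Tbps")
```

Under that assumption, a 4x increase in cluster size combined with a 10x increase in total bandwidth implies roughly a 2.5x increase in bandwidth per accelerator over the same two years.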
In our recently published “AI Networks for AI workloads” Advanced Research Report, we forecast that by 2025, the majority of ports in AI networks will be 800 Gbps, and by 2027, the majority of ports will be 1600 Gbps, showing a very fast adoption of the highest speeds available in the market. This pace of migration is almost twice as fast as what we usually see in the traditional front-end network that is used to connect general-purpose servers.
Unfortunately, higher optical speeds come with a significant increase in cost and power consumption. Substantial investments in AI infrastructure are accelerating the development of innovative optical connectivity solutions tailored to meet the demands of AI clusters while solving some of the cost and power consumption challenges. Various solutions and strategies addressing these challenges will be explored at OFC this year.
2. The state of 1.6 Tbps optics and potential path to 3.2 Tbps:
At OFC 2023, numerous 1.6 Tbps optical components and transceivers based on 200 G per lambda were introduced. We anticipate further technology demonstrations of such 1.6 Tbps products at this year's OFC. While we don't anticipate volume shipment of 1.6 Tbps until 2025/2026, the industry must already begin efforts towards achieving 3.2 Tbps and exploring various paths and options to reach this milestone. This sense of urgency arises from a combination of factors, including the exponential growth in bandwidth demand within AI clusters and the escalating power and cost concerns associated with higher speeds. We expect multiple discussions around the potential path to achieve 3.2 Tbps to take place at OFC this year.
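As a rough illustration of why the path to 3.2 Tbps is an open question, the sketch below computes how many optical lanes a module needs at a given per-lane rate. The 1.6 Tbps at 200 G per lambda case comes from the text. The other combinations, including the 400 G per-lane rate, are hypothetical options shown purely for arithmetic and are not a statement of what the industry will adopt. The function name is also an assumption.

```python
def lanes_required(module_gbps: int, lane_gbps: int) -> int:
    """Number of lanes needed to reach module_gbps at lane_gbps per lane."""
    if module_gbps % lane_gbps != 0:
        raise ValueError("module speed must be a whole multiple of the lane rate")
    return module_gbps // lane_gbps


# Module speeds discussed in the text, against illustrative per-lane rates.
options = {
    (module, lane): lanes_required(module, lane)
    for module in (1600, 3200)
    for lane in (100, 200, 400)
}

if __name__ == "__main__":
    for (module, lane), lanes in sorted(options.items()):
        print(f"{module} Gbps module at {lane} G per lane: {lanes} lanes")
```

The arithmetic shows the basic trade-off: staying at 200 G per lane doubles the lane count for 3.2 Tbps, while keeping the lane count at eight would require doubling the per-lane rate.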
3. Linear Drive Pluggable Optics vs. Co-Packaged Optics vs. Coherent Optics:
Pluggable optics are expected to account for an increasingly significant portion of power consumption at a system level, exceeding 50% of the switch system power at 51.2 Tbps and beyond. This issue will be further exacerbated as Cloud SPs build their next-generation AI networks and continue to push for higher speeds.
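The power share can be reasoned about with simple arithmetic, sketched below. The module count follows directly from dividing switch capacity by port speed. Everything else is an assumption: the function names are hypothetical, and the 800 W figure for the rest of the switch system is a placeholder chosen only to make the example run, not a measurement or a vendor specification.

```python
def pluggable_modules(switch_tbps: float, port_gbps: int) -> int:
    """Number of pluggable modules needed to expose the full switch capacity."""
    return round(switch_tbps * 1000 / port_gbps)


def optics_power_share(modules: int, watts_per_module: float, other_system_watts: float) -> float:
    """Fraction of total system power drawn by the pluggable optics."""
    optics_watts = modules * watts_per_module
    return optics_watts / (optics_watts + other_system_watts)


def breakeven_watts_per_module(modules: int, other_system_watts: float, share: float = 0.5) -> float:
    """Per-module power at which optics reach `share` of total system power.

    Solves n*w / (n*w + rest) = share for w.
    """
    return share * other_system_watts / ((1.0 - share) * modules)


if __name__ == "__main__":
    modules = pluggable_modules(51.2, 800)  # 51.2 Tbps switch with 800 Gbps ports
    OTHER_SYSTEM_WATTS = 800.0              # PLACEHOLDER assumption, not a measured value
    threshold = breakeven_watts_per_module(modules, OTHER_SYSTEM_WATTS)
    print("Modules:", modules)
    print("Per-module power at which optics hit 50%:", threshold, "W")
    print("Share at that power:", optics_power_share(modules, threshold, OTHER_SYSTEM_WATTS))
```

Because the module count is fixed by switch capacity and port speed, the optics share depends entirely on per-module power relative to the rest of the system, which is why lower-power approaches attract so much attention.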
At OFC 2023, Linear Drive Pluggable Optics (LPOs) were introduced, sparking a series of testing activities. At OFC 2024, we are looking forward to hearing the latest updates on LPOs and whether they hold promise beyond 112 G SerDes lanes. Additionally, coherent optics will be explored as part of the market's efforts to reduce power and cost compared to conventional pluggable solutions.
Meanwhile, Co-Packaged Optics (CPOs) remain in development, with industry speculation suggesting that CPOs may eventually become the only solution capable of enabling higher speeds.
About Sameh Boujelbene
Sameh Boujelbene joined Dell’Oro Group in 2011, where she currently oversees research in the areas of Ethernet Campus Switch, Ethernet Data Center Switch, and AI Networks for AI Workloads. During her tenure at the firm, Ms. Boujelbene expanded her research programs to address data center interconnect, AI/ML workloads, and digital transformation. She has published articles and been cited in various industry and trade publications, and she is a frequent speaker at industry conferences and events.
Posted: 16 February 2024 by Sameh Boujelbene