Servers and Routers: To Aggregate or Disaggregate?
By Casimer DeCusatis | Posted: 2 September 2015 12:53:21 PM
One topic of note at the MIT Workshop on Photonics in Servers and Routers, commonly known as “Majorca at MIT,” was the assertion, made five years ago at the first Majorca meeting, that network architectures would migrate from the campus LAN into the data center, and then into boards and chips.
Increased Traffic Required New Solutions
This was partially correct; most data center networks simply transplanted the hierarchical switch model from their campus LAN, with its combination of oversubscribed access, aggregation, and core switches, each of which processes the entire TCP/IP stack. This worked well as long as most network traffic ran between clients and servers (so-called north-south traffic). More recently, however, we’ve seen the emergence of workloads that require servers to spend most of their bandwidth talking to each other (so-called east-west traffic). Examples include mobile communication, social networking, and big data. Many large data centers discovered that the campus LAN was never designed to handle this kind of traffic. Moving data packets up and down a network tree and processing the full stack at each node significantly increases latency, which directly impacts performance. The problem is even more pronounced when your data center is the size of a warehouse, so the issue was exacerbated by the advent of cloud computing.
There’s good reason why Google and other cloud service providers pioneered adoption of SDN to flatten their networks, reduce latency, and enable dynamic bandwidth control to reduce oversubscription. While adoption of SDN remains in its early stages, the technology has proven ready for prime time; large production use cases of SDN networks are becoming increasingly common. Although the full impact of SDN likely will not be felt for the better part of a decade, the design of data center networks has taken a profound turn that the original Majorca meeting couldn’t have anticipated.
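To make the latency argument concrete, here is a minimal Python sketch that counts the switches an east-west packet crosses in a three-tier access/aggregation/core tree versus a flattened two-tier fabric. The two-tier shape, the function names, and the per-hop latency figure are all illustrative assumptions, not numbers or designs presented at the workshop.

```python
# Illustrative model only. The per-hop latency and the two-tier "flattened"
# topology are assumptions made for this sketch, not workshop data.

PER_HOP_LATENCY_US = 5.0  # hypothetical cost of full-stack processing at one switch


def three_tier_hops(same_access: bool, same_aggregation: bool) -> int:
    """Switches crossed between two servers in an access/aggregation/core tree."""
    if same_access:
        return 1  # both servers hang off the same access switch
    if same_aggregation:
        return 3  # access -> aggregation -> access
    return 5      # access -> aggregation -> core -> aggregation -> access


def flattened_hops(same_edge: bool) -> int:
    """Switches crossed between two servers in a flattened two-tier fabric."""
    return 1 if same_edge else 3  # edge -> fabric -> edge


def latency_us(hops: int, per_hop_us: float = PER_HOP_LATENCY_US) -> float:
    """Total switching latency if every hop costs the same amount."""
    return hops * per_hop_us


if __name__ == "__main__":
    tree = three_tier_hops(same_access=False, same_aggregation=False)
    flat = flattened_hops(same_edge=False)
    print(f"three-tier worst case: {tree} hops, {latency_us(tree)} us")
    print(f"flattened worst case:  {flat} hops, {latency_us(flat)} us")
```

Under these assumptions the worst-case east-west path drops from five switch traversals to three. Real fabrics differ in per-hop cost and in how much of the stack each switch processes, so treat the result as a way to reason about the trend rather than as a measurement.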
Disaggregation of Server Hardware Components
The ability to rapidly reconfigure data center resources has led some researchers to propose disaggregating server hardware components, using an alternative design in which racks of processors, storage, and other components are interconnected by a high-speed optical network. The promise of disaggregated systems is a more modular, flexible design that promotes open innovation from third-party vendors. There is precedent for this approach, since many data centers already optimize workloads by clustering similar jobs together on subsets of the data center, which can then be tuned for better performance. Taken to its logical conclusion, this trend leads to disaggregated hardware. Today, the Open Compute Project (OCP), proposed by Intel, Facebook, and Cisco, offers just such an opportunity. As discussed at OFC 2015, the power consumption for optics on a printed circuit board is about a factor of ten lower than what is required for optics integration in a data center rack.
Facebook Discusses their Disaggregation Plans
During the workshop, Facebook discussed their disaggregation plans using a hybrid SDN network (with some localized control plane functions). To enable this approach, Facebook is working on an extension to the Quad SFP (QSFP) optical transceiver form factor, essentially doubling the capacity to create an Octal SFP (OSFP). This will support Facebook’s strategic direction of bi-directional, single-mode fiber links for 95% of their connectivity within a data center. Other disaggregation examples were given, such as Ericsson’s development of a NEBS-optional cloud computing rack with optics on board.
Another Alternative: Use of Microservers
Alternatively, some workshop participants proposed a future architecture based on microservers. Essentially the opposite of disaggregation, this approach compresses an entire rack-mounted server into a single chip or hand-held module. If this can be done without sacrificing performance, and at a low enough cost point, it might be possible to compress an entire cloud data center into a small box. With all the servers so close together, optics would only be required for I/O to peripheral devices. Further, since most of the energy used in a data center goes to data transport rather than computation, this architecture should be very energy efficient. The workshop included a live demo of a general-purpose microserver based on a network processor.
The debate on disaggregation vs microservers continued well beyond the end of this workshop; what do you think the future data center will look like? Drop me a line on Twitter (@Dr_Casimer).