Virtualization in Multi-Tenant Datacenters (text)
  
  
  
    
      
                             Welcome back. 
In this lesson, we'll talk about multi-tenant datacenters. 
We'll talk about what a multi-tenant datacenter is, as 
well as the role of network virtualization in multi-tenant datacenters. 
We'll talk about this in the context of a Nicira product called NVP, which is described in a recent NSDI paper.
We'll talk about two specific challenges with virtualizing the network in multi-tenant datacenters: achieving fast forwarding speeds in software, and scaling the control architecture. We'll then talk about the role of SDN in network virtualization more generally.
A multi-tenant datacenter is a single 
physical datacenter that's shared by many tenants. 
In the case of Amazon EC2 or Rackspace, 
the tenants might be different customers such as yourself. 
In the case of a service provider such as Google, Yahoo, or Microsoft, the tenants might actually be different applications or services, such as Mail or Search. Finally, developers might develop software on a datacenter that's also shared with production infrastructure.
A multi-tenant datacenter presents many challenges. First, different workloads require different topologies and services. For example, some workloads assume a flat layer-two topology, whereas others might assume layer-three routing.
Second, a tenant's address space may overlap with the addressing of the physical network or of other tenants, so there need to be ways to resolve these potential clashes.
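To make the overlap problem concrete, here's a small, purely illustrative Python sketch (the tenant names and addresses are made up, and this is not code from NVP): two tenants reuse the same private address, and the virtualization layer disambiguates by keying every lookup on the pair (tenant, address) rather than on the address alone.

# Illustrative sketch (not from the paper): two tenants can reuse the same
# private address space because the virtualization layer keys all lookups
# on (tenant_id, address) rather than on the address alone.

# Both tenants use 10.0.0.0/24 internally.
tenant_vms = {
    ("tenant-A", "10.0.0.5"): {"host": "hv1", "mac": "aa:aa:aa:00:00:05"},
    ("tenant-B", "10.0.0.5"): {"host": "hv2", "mac": "bb:bb:bb:00:00:05"},
}

def locate(tenant_id, virtual_ip):
    """Resolve a tenant-scoped address to its physical location."""
    return tenant_vms[(tenant_id, virtual_ip)]

# The same virtual IP maps to different physical locations per tenant.
assert locate("tenant-A", "10.0.0.5")["host"] == "hv1"
assert locate("tenant-B", "10.0.0.5")["host"] == "hv2"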
In a multi-tenant datacenter, each physical host runs multiple virtual machines. Each host has a hypervisor with an internal switch that determines whether traffic sent by a virtual machine is destined for another virtual machine on the same host, or for a virtual machine behind some other hypervisor in the network.
To implement this architecture, we need a network hypervisor that exposes the right types of abstractions to tenants.
A network hypervisor essentially provides two abstractions. One is a control abstraction, which allows each tenant to define a set of logical network data-plane elements that they can control, providing the illusion of complete ownership over the data plane. The second abstraction the hypervisor provides is a packet abstraction, whereby packets sent by a tenant's endpoints should see the same service as they would in a native network with physical addresses and physical data-plane elements.
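As a rough sketch of the control abstraction, and only that, the following hypothetical Python data model shows a tenant declaring logical data-plane elements it appears to own; none of these class or field names come from NVP's actual API.

# A minimal sketch (names are hypothetical, not NVP's API) of the control
# abstraction: each tenant declares logical data-plane elements it "owns",
# and the network hypervisor is responsible for realizing them on the
# shared physical network.
from dataclasses import dataclass, field

@dataclass
class LogicalSwitch:
    name: str
    ports: list = field(default_factory=list)   # names of attached VMs

@dataclass
class LogicalNetwork:
    tenant: str
    switches: list = field(default_factory=list)

    def add_switch(self, name, ports):
        self.switches.append(LogicalSwitch(name, list(ports)))

# Tenant A believes it owns a simple flat L2 network.
net_a = LogicalNetwork("tenant-A")
net_a.add_switch("ls0", ["web-vm", "db-vm"])

# Tenant B independently defines its own topology, even reusing the same names.
net_b = LogicalNetwork("tenant-B")
net_b.add_switch("ls0", ["app-vm"])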
In Nicira's NVP, the network hypervisor implements these abstractions by setting up tunnels between each pair of host hypervisors. This, of course, makes unicast possible, but broadcast and multicast are not automatically implemented as a result. To cope with this shortcoming, multicast and broadcast are implemented as overlay services on top of the pairwise tunnels.
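Here's a small Python sketch of that idea, under the assumption of a toy set of hypervisors: a full pairwise tunnel mesh carries unicast directly, and broadcast is emulated by replicating a packet over every tunnel leaving the source hypervisor.

# A sketch (assumptions, not NVP code) of the pairwise tunnel mesh:
# unicast maps directly onto a tunnel, while broadcast is emulated by
# replicating the packet over every tunnel from the source hypervisor.
from itertools import combinations

hypervisors = ["hv1", "hv2", "hv3", "hv4"]

# One tunnel per unordered pair of hypervisors: n*(n-1)/2 tunnels.
tunnels = {frozenset(pair) for pair in combinations(hypervisors, 2)}

def send_unicast(src_hv, dst_hv, packet):
    assert frozenset((src_hv, dst_hv)) in tunnels
    print(f"{src_hv} -> {dst_hv} over tunnel: {packet}")

def send_broadcast(src_hv, packet):
    # Broadcast is an overlay service: replicate over each unicast tunnel.
    for dst_hv in hypervisors:
        if dst_hv != src_hv:
            send_unicast(src_hv, dst_hv, packet)

send_broadcast("hv1", "ARP who-has 10.0.0.7")

This also makes the scaling concern visible: the number of tunnels grows roughly as n-squared in the number of hypervisors.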
To perform packet processing, a centralized SDN controller configures the virtual switches on each host, and the logical datapath for each flow of traffic is implemented entirely on the sending host. The physical network simply sees IP packets and performs essentially no custom processing whatsoever; it simply forwards each packet over a tunnel. The tunnel endpoints are the virtual switches running in the hosts' hypervisors.
In NVP, these virtual switches are implemented with Open vSwitch. The controller cluster can modify the flow table entries on these virtual switches and also set up tunnels between each pair of hypervisors. All of the packet processing for the logical datapath is configured by the central controllers, which modify the flow table entries on these virtual switches.
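To give a feel for the kind of state being pushed, here's a hedged sketch that drives the standard Open vSwitch command-line tools from Python. This is not NVP's actual control channel, and the bridge name, addresses, and port numbers are invented for illustration.

# A hedged sketch of the kind of state a controller installs in Open vSwitch.
# It is NOT NVP's control channel, just the standard OVS CLI driven from
# Python for illustration. Bridge, port, MAC, and IP values are made up.
import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Create a tunnel port to a peer hypervisor (GRE used here as an example
#    encapsulation).
run(["ovs-vsctl", "add-port", "br-int", "gre0", "--",
     "set", "interface", "gre0", "type=gre",
     "options:remote_ip=192.0.2.12"])

# 2. Install a flow entry: traffic for a remote VM's MAC is tagged with the
#    tenant's tunnel key and sent out the tunnel port (OpenFlow port 2 here).
run(["ovs-ofctl", "add-flow", "br-int",
     "priority=100,dl_dst=aa:bb:cc:00:00:07,"
     "actions=set_tunnel:1000,output:2"])

In NVP itself, the controllers program this state over the network using the OpenFlow protocol rather than by running local commands.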
Here's an overview of the controller structure in NVP. The host hypervisors and physical gateways provide the controller with location and topology information. Service providers can then configure the controller, and the forwarding state is pushed to the Open vSwitch instances on each host using the OpenFlow protocol.
This design poses two main challenges. One is the design of the datapath: software switching at the end host needs to be fast, but by default, switching in software can be slow. The second challenge is scaling the controller computation: there are potentially n-squared tunnels, one between each pair of hosts, so the architecture must provide a way to scalably compute the logical datapaths and tunnels for each pair of hosts.
To make the datapath fast, NVP performs exact-match lookups on flows in the kernel. A user-space program matches against the full flow table, but once a flow is matched in user space, a corresponding exact-match entry, without any wildcards, is installed in the kernel. Future packets belonging to the same flow can then be matched entirely in the kernel, providing faster forwarding.
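Here's a simplified Python sketch of that two-tier lookup; it's only meant to illustrate the caching idea, not Open vSwitch's actual data structures.

# A simplified sketch of the fast-path idea (not Open vSwitch's actual
# implementation): a slow "user-space" lookup over wildcarded rules, and a
# fast "kernel" cache of exact-match entries installed on the first packet.
kernel_cache = {}   # exact 5-tuple -> action

wildcard_rules = [
    # (match predicate over a 5-tuple, action)
    (lambda f: f[3] == 80,  "send_to_tunnel:web"),
    (lambda f: True,        "drop"),
]

def user_space_lookup(flow):
    for predicate, action in wildcard_rules:
        if predicate(flow):
            return action

def forward(flow):
    """flow is an exact 5-tuple: (src_ip, dst_ip, src_port, dst_port, proto)."""
    action = kernel_cache.get(flow)
    if action is None:
        # Miss: consult the full wildcard table, then cache an exact match.
        action = user_space_lookup(flow)
        kernel_cache[flow] = action
    return action

f = ("10.0.0.5", "10.0.0.7", 43211, 80, "tcp")
forward(f)            # first packet: slow path, installs a cache entry
assert f in kernel_cache
forward(f)            # later packets: hit the exact-match cache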
A related challenge involves forwarding encapsulated packets in hardware, which requires some additional tricks that are described in the paper referenced at the beginning of the lesson.
Controller computation is scaled by decomposing the computation into two layers: logical controllers and physical controllers. The logical controllers compute flows and tunnels for the logical datapaths, expressed as what are called universal flows. The physical controllers communicate with the hypervisors, gateways, and service nodes. This decomposition allows the logical controllers to operate on logical abstractions of the network, without knowledge of the physical topology or of the pairwise tunnel mesh between all of the hypervisors in the datacenter.
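The following Python sketch illustrates the decomposition; the data structures and names are invented for illustration, loosely following the paper's terminology. A logical controller emits a universal flow over logical ports, and a physical controller localizes it for a particular hypervisor using port-location bindings that only it knows.

# A sketch of the two-layer decomposition (invented data structures).
# Logical controllers emit "universal" flows over logical ports; physical
# controllers localize them using the port-to-hypervisor bindings they learn.

# Logical controller output: no physical locations appear here.
universal_flows = [
    {"logical_switch": "ls0", "dst_logical_port": "lp2", "action": "forward"},
]

# State known only to the physical controllers.
port_bindings = {"lp1": "hv1", "lp2": "hv3"}            # logical port -> hypervisor
tunnel_ports  = {("hv1", "hv3"): 2, ("hv3", "hv1"): 2}  # local port of the tunnel

def localize(universal_flow, local_hv):
    """Translate a universal flow into a flow for one hypervisor's switch."""
    dst_hv = port_bindings[universal_flow["dst_logical_port"]]
    if dst_hv == local_hv:
        return {"match": universal_flow, "action": "deliver_to_local_vif"}
    return {"match": universal_flow,
            "action": f"output:tunnel_port_{tunnel_ports[(local_hv, dst_hv)]}"}

print(localize(universal_flows[0], "hv1"))  # -> send over the tunnel to hv3
print(localize(universal_flows[0], "hv3"))  # -> deliver locally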
The use of network virtualization in the context of multi-tenant datacenters illustrates a distinction between network virtualization and software-defined networking. A common misconception is that network virtualization and SDN are the same thing. In fact, network virtualization predates SDN, and network virtualization doesn't require SDN. Some examples of network virtualization that predate SDN include overlay networks and VLANs.
However, SDN can make it much easier to orchestrate a multi-tenant datacenter, because it's much easier to virtualize an SDN switch than a physical switch. Virtualizing the network makes it possible to run a separate controller for each virtual network, and SDN makes it straightforward to partition the space of all flows. As we've seen in this lesson, network virtualization can also make use of software switches, and SDN controllers can orchestrate packet processing in datapaths that are implemented completely in software.
In short, you could almost think of network virtualization as a killer app for SDN, simply because SDN makes the type of orchestration required for managing the combined network, compute, and storage in a multi-tenant datacenter much easier than it would be without a logically centralized control infrastructure. The rise of virtualization in multi-tenant datacenters has created a need for network virtualization, in addition to the virtualization that we already had for compute and storage, and SDN plays some important roles in configuring those logical datapaths and tunnels.
In some ways, NVP represents an interesting extreme design point, because all of the configuration is in fact happening on the hosts, and the network is once again dumb. You might want to think about the trade-offs between the NVP architecture and an alternative architecture where some of the datapath processing is performed in the network, as well as in the virtual switches on the hosts.
