When building out network labs, multiple people often need access to the lab. The most common approach today is to use something like EVE-NG or GNS3 to provide that access.
There are two downsides to this method. The first is that your server is exposed to the internet, and if your usernames/passwords aren't strong enough, your server can become compromised. The second is that you may not want everyone to be able to add to or edit the lab topology.
The solution to this is using Containerlab and ZeroTier. This setup is great for things like testing new hires, training classes, or for providing lab access to others on a limited basis.
What is Containerlab?
Containerlab is a container orchestration tool for managing container-based networking labs. It doesn't just support container-based network operating systems, though. Through vrnetlab, there is support for a wide variety of commonly used NOSes: Mikrotik RouterOS, Nokia SROS, Juniper vMX and vQFX, and many more.
Configs are text-based, making it easy to add or update links between nodes. The lab does need to be destroyed and redeployed when adding or removing links. With some tooling to generate configs, it's easy to spin up a 500+ node lab in under 30 minutes.
What is ZeroTier?
ZeroTier is a mesh VPN that allows for Layer 2 connectivity. It is typically used in a remote worker scenario to provide access to internal company resources. In our case, we use it to provide remote lab access.
Setting it up
This assumes that Containerlab and ZeroTier are already set up on the lab server. One thing that will be needed on the lab server is to enable IP forwarding; otherwise, the lab server won't be able to forward packets between the ZeroTier interface and the lab management networks.
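On most Linux distributions this can be done with sysctl; a minimal example is below (the file name under /etc/sysctl.d is just a convention):

# Enable IPv4 forwarding now and persist it across reboots
sudo sysctl -w net.ipv4.ip_forward=1
echo "net.ipv4.ip_forward = 1" | sudo tee /etc/sysctl.d/99-lab-forwarding.conf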
Sample labs showing the changes needed between labs are available in the GitHub repo. The main differences between the labs are the lab names and management networks. In the sample labs, the management IP addresses are hardcoded to make access easier. This isn't required, but it helps when you're not using DNS.
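As a rough sketch of what one pod's topology file might look like (node kinds, images, and exact key names here are placeholders and may differ slightly between Containerlab versions):

# pod1.clab.yml -- hypothetical example, not the repo's actual file
name: pod1
mgmt:
  network: pod1-mgmt          # dedicated management network per pod
  ipv4-subnet: 172.16.1.0/24  # pod1 management subnet
topology:
  nodes:
    r1:
      kind: nokia_srlinux     # placeholder kind/image
      image: ghcr.io/nokia/srlinux
      mgmt-ipv4: 172.16.1.11  # hardcoded management address
    r2:
      kind: nokia_srlinux
      image: ghcr.io/nokia/srlinux
      mgmt-ipv4: 172.16.1.12
  links:
    - endpoints: ["r1:e1-1", "r2:e1-1"]

A second pod would use a different lab name, management network, and subnet (for example 172.16.2.0/24).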
Multiple Pod Network layout
Once the labs are deployed, you should be able to SSH into all of the nodes from the server.
Then you need to set up a ZeroTier network. The first step is to add a managed route pointing the lab networks at the lab server's ZeroTier IP address (we set the server's IP address to 172.22.0.1).
172.16.0.0/16 is the lab network(s); 172.22.0.0/16 is the user network.
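In ZeroTier Central, the managed routes then look something like this:

Managed Routes
  172.22.0.0/16                    ## the ZeroTier network itself (user network)
  172.16.0.0/16 via 172.22.0.1     ## lab networks reachable through the lab server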
If you're not worried about preventing cross-pod access, you are all set. But if you want to secure access to each pod so end users can't reach each other's lab pods, some rules need to be added to the ZeroTier network. Those can be found in the GitHub repo in the file named zerotier.rules.
You can also prevent user-to-user traffic.
To give a user access to their pod, assign them the correct capabilities for that pod; a sketch of what such rules might look like is shown below.
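For illustration only, a capability-based rule set might look roughly like the following; the actual, tested rules are in zerotier.rules in the repo, and the names, IDs, and subnets here are made up:

## Hypothetical sketch -- see zerotier.rules in the repo for the real rule set
## Always allow ARP so users can resolve the lab server (172.22.0.1)
accept ethertype arp;

## Capability granted to the members assigned to pod1
cap pod1_access
  id 1
  accept ipdest 172.16.1.0/24;   ## traffic toward pod1's management network
  accept ipsrc 172.16.1.0/24;    ## return traffic from pod1
;

## Anything not accepted above (including user-to-user and cross-pod traffic) is dropped
drop;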
Using the lab
With the lab deployed and the security features in place, you should be able to access your pod's nodes via SSH or Winbox.
Super interesting screenshot of a blank Winbox
With the ZeroTier rules in place, you are not able to reach the other running pods.
Do you have a 3-tier, switched, or vendor-proprietary data center design?
Does it rely on spanning tree or proprietary solutions to eliminate spanning tree?
Not sure how to migrate to a new architecture without serious downtime?
If you answered yes to any of these questions, then this post is for you. We'll be looking at deploying an EVPN/VxLAN data center fabric and migrating from a Cisco FabricPath environment to the new design.
Although we will be focusing on a FabricPath migration, many, if not all, of the principles apply to migrating a 3-tier architecture.
1. Building the new Data Center Fabric
2. Connecting the current FabricPath and new fabric
3. Migrating switched virtual interfaces
4. Migrating various types of physical devices
Building the new Data Center Fabric
The easiest part of designing and building the new fabric is the physical topology. This should be a symmetric topology so you can easily take advantage of equal-cost multipath (ECMP) and add additional switches with ease. This is also known as a spine/leaf or Clos topology. The basic idea is that leafs connect to spines and spines connect to super spines. A leaf/spine should not connect to another switch of the same type, except for multi-chassis LAG or virtual port-channel (vPC) at the access layer if you're utilizing this.
Next you must decide on routing protocols. We will not examine Layer 2, as this will be a completely routed fabric, eliminating the need for any STP in your data center. Remember: if you're not Facebook, Amazon, Netflix, or Google (FANG) or some other webscaler, you probably don't have FANG problems. There is no need to run a BGP underlay and learn to turn all the associated knobs to make that work, nor to troubleshoot complex problems like path hunting.
For this reason we will look at utilizing Intermediate System to Intermediate System (ISIS) as the underlay with internal Border Gateway Protocol (iBGP) as the overlay.
We prefer ISIS as an underlay network for data centers because:
it is easier to scale than OSPF
it has been extensible from the beginning (Type-Length-Values for additional capabilities)
it offers better stability at scale
On a vPC leaf pair, the NVE source loopback carries a secondary address. This secondary loopback address is the virtual IP (VIP) advertised for traffic destined to the vPC pair. Single-attached or routed links will be advertised with the physical IP (PIP) of the leaf, so traffic returns to that specific leaf and not to the pair.
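As a rough sketch of the underlay on one vPC leaf (the NET, interface numbers, and addressing are examples only), the ISIS configuration and loopbacks might look like this:

## Hypothetical underlay sketch -- NET, interfaces, and addressing are examples only
feature isis

router isis UNDERLAY
  net 49.0001.1001.2700.0004.00
  is-type level-2

interface loopback0
  description underlay router-id / BGP peering
  ip address 100.127.0.4/32
  ip router isis UNDERLAY

interface loopback1
  description NVE source
  ip address 100.127.1.4/32                ## PIP -- unique per leaf
  ip address 100.127.1.45/32 secondary     ## VIP -- shared by the vPC pair
  ip router isis UNDERLAY

interface Ethernet1/49
  description uplink to spine
  no switchport
  mtu 9216
  medium p2p
  ip address 100.126.0.1/31
  ip router isis UNDERLAY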
iBGP as the overlay
The overlay is pretty straightforward. We will run iBGP with loopback peerings to exchange EVPN routes. EVPN scales significantly better than other VxLAN control-plane approaches, so we will not explore flood-and-learn or static assignment.
We will be utilizing vPC at the access layer for the remainder of the post. There are other methods for dual-attached devices, such as EVPN multihoming, but since this post is focused on a Cisco-specific FabricPath migration, they will not be discussed.
See the example configuration below for how the VIP/PIP advertisement mentioned earlier is enabled.
Leaf BGP and NVE
interface nve1
  no shutdown
  host-reachability protocol bgp
  advertise virtual-rmac             ## for advertising the VIP
  source-interface loopback1

router bgp 8675309
  router-id 100.127.0.4
  address-family l2vpn evpn
    advertise-pip                    ## for advertising the PIP if single attached
  neighbor 100.127.0.0
    remote-as 8675309
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
Connecting the current FabricPath DC and new fabric
The first thing to do is decide on the physical point of interconnection. You'll want to ensure you choose a place where you have enough ports for a dual-sided vPC, with enough bandwidth to cover traffic between the new and old environments until the migrations are complete.
Next we have to think about the Layer 2 protocols in play. Since spanning tree isn't running on either side, we need to take special care to make sure we do not introduce a Layer 2 loop.
The EVPN/VxLAN side will not do anything with STP BPDUs, but there is a requirement on the FabricPath side that it remains the root bridge. This is because the entire FabricPath domain looks like one physical bridge. If a port in the FabricPath domain receives a superior BPDU, a root-guard of sorts is enacted and the Classical Ethernet (CE) edge port begins blocking.
Why do we care if STP doesn't pass over the EVPN fabric? If the FabricPath environment is interconnected with the new fabric at two points, a loop can form through the new fabric back into the FabricPath domain. This is a situation we want to avoid.
It can be avoided by:
only having one interconnect
manually pruning VLANs at the two or more points of interconnect to ensure each VLAN remains on exactly ONE path (see the sketch below)
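As a hedged sketch of the second option (port-channel numbers and VLAN ranges are examples only), the EVPN side of two interconnects might look like this, with each VLAN allowed on only one of them:

## Hypothetical interconnect sketch -- port-channels and VLAN ranges are examples only
interface port-channel100
  description dual-sided vPC to FabricPath interconnect A
  switchport mode trunk
  switchport trunk allowed vlan 100-149    ## only these VLANs use interconnect A
  vpc 100

interface port-channel200
  description dual-sided vPC to FabricPath interconnect B
  switchport mode trunk
  switchport trunk allowed vlan 150-199    ## the rest use interconnect B
  vpc 200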
Migrating Switched Virtual Interfaces
Our preferred method of migrating SVIs from the old FabricPath environment to the new fabric is to:
build all of the new Distributed Anycast Gateways (DAGs) on the new fabric (see the configuration sketch after this list)
keep them shut down
establish a L3 adjacency via BGP for routing traffic back to the existing exit points until the migrations are complete
add the VLANs being migrated to the dual-sided vPC
shut down the SVIs on the FabricPath side and "no shut" the DAGs on the new fabric
manually clear ARP on any hosts that did not update to the new DAG MAC
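A rough sketch of a DAG SVI on the new fabric (the VLAN, VRF name, anycast MAC, and addressing are placeholders) is shown below; note it stays shut down until the cutover:

## Hypothetical DAG sketch -- VLAN, VRF name, MAC, and addressing are examples only
fabric forwarding anycast-gateway-mac 2020.0000.00aa

interface Vlan100
  shutdown                                 ## keep down until the FabricPath SVI is shut
  vrf member TENANT-A
  ip address 10.0.100.1/24                 ## same gateway IP as the legacy SVI
  fabric forwarding mode anycast-gateway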
Migrating physical devices
Most physical devices are "easy": there is no option but to physically move cables, and you know this will result in a brief outage while the new uplinks come online.
However, with HA pairs of devices it is possible to migrate by moving the standby unit, waiting for HA to re-establish, forcing a failover, moving the formerly active unit, and then failing back to the primary unit. This will test your HA setup as well as provide a seamless migration.
If you have new compute and storage you can migrate your workloads directly to the new environment and age out the legacy compute/storage.
Finally, ensure there are no devices still in use in the old environment and decommission it.
If you have questions or need assistance, do not hesitate to reach out to us at IP ArchiTechs.