Migrating from fabricpath to EVPN/VxLAN

Introduction

Do you have a 3 tier, switched, or vendor proprietary data center design?

Does it rely on spanning tree or proprietary solutions to eliminate spanning tree?

Not sure how to migrate to a new architecture without serious downtime?

If you answered yes to any of these questions, then this post is for you. We’ll be looking at deploying an EVPN/VxLAN data center fabric and migrating from a Cisco fabricpath environment to the new design.

Although we will be focusing on a fabricpath migration, many, if not all, of the principles apply to migrating a 3 tier architecture. We will cover:

1. Building the new Data Center Fabric
2. Connecting the current fabricpath and new fabric
3. Migrating switched virtual interfaces
4. Migrating various types of physical devices

Building the new Data Center Fabric

The easiest part of designing and building the new fabric is the physical topology. It should be symmetric so that you can take advantage of equal cost multipath and add switches with ease. This is also known as a spine/leaf or Clos topology. The basic idea is that leafs connect to spines and spines connect to super spines. A leaf or spine should not connect to another switch of the same type, except for multichassis LAG or virtual port-channel (vPC) peers at the access layer if you’re utilizing them.
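
The "no links between switches of the same type" rule can be checked against a cabling plan before anything is racked. Below is a minimal Python sketch of such a check. The switch names, the tier labels, and the `same_tier_violations` helper are all hypothetical and only serve to illustrate the rule; they do not come from any real tool.

```python
# Hypothetical inventory: switch name -> tier. Names are illustrative only.
TIERS = {"leaf1": "leaf", "leaf2": "leaf", "spine1": "spine", "spine2": "spine"}


def same_tier_violations(links, tiers, vpc_peer_links=frozenset()):
    """Return the links that join two switches of the same tier.

    Links listed in vpc_peer_links (as frozensets of the two endpoints) are
    the allowed exception for vPC / multichassis LAG at the access layer.
    """
    violations = []
    for a, b in links:
        if tiers[a] == tiers[b] and frozenset((a, b)) not in vpc_peer_links:
            violations.append((a, b))
    return violations


links = [
    ("leaf1", "spine1"), ("leaf1", "spine2"),
    ("leaf2", "spine1"), ("leaf2", "spine2"),
    ("leaf1", "leaf2"),    # vPC peer-link, allowed
    ("spine1", "spine2"),  # spine-to-spine, not allowed
]
vpc = {frozenset(("leaf1", "leaf2"))}

print(same_tier_violations(links, TIERS, vpc))  # [('spine1', 'spine2')]
```

In this sketch the leaf-to-leaf link passes only because it is explicitly declared as a vPC peer-link, which mirrors the single exception described above.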




ISIS as an underlay routing protocol

Next you must decide on routing protocols. We will not examine layer 2, as this will be a completely routed fabric, eliminating the need for any STP in your data center. Remember, if you’re not Facebook, Amazon, Netflix, or Google (FANG) or some other webscaler, you probably don’t have FANG problems. There is no need to run a BGP underlay and learn to turn all the associated knobs to make it work, nor to troubleshoot complex problems like path hunting.

For this reason we will look at utilizing Intermediate System to Intermediate System (ISIS) as an underlay with Internal Border Gateway Protocol (iBGP) as an overlay.

We prefer ISIS as an underlay network for data centers because:

  • it is easier to scale than OSPF
  • it has been extensible from the beginning (Type Length Values carry additional capabilities)
  • it offers better stability at scale

On vPC leafs, a secondary loopback address enables the advertisement of a virtual IP address (VIP) for traffic destined to the vPC pair. Single attached or routed links will advertise the physical IP address (PIP) of the leaf so traffic returns to that specific leaf and not the pair.

iBGP as the overlay

The overlay is straightforward. We will run iBGP with loopback peerings to exchange EVPN routes. EVPN scales significantly better than the other VxLAN control plane options, so we will not explore flood and learn or static assignment.

We will be utilizing vPC on the access layer for the remainder of the post. There are other methods for dual attached devices, such as EVPN multihoming, but since this post is specific to Cisco fabricpath migrations they will not be discussed.

The example configuration below shows how the VIP and PIP mentioned earlier are advertised.

Leaf BGP and NVE

interface nve1
  no shutdown
  host-reachability protocol bgp
  ! for advertising the VIP
  advertise virtual-rmac
  source-interface loopback1

router bgp 8675309
  router-id 100.127.0.4
  address-family l2vpn evpn
    ! for advertising the PIP if single attached
    advertise-pip
  neighbor 100.127.0.0
    remote-as 8675309
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended

Connecting the current fabricpath DC and new fabric

The first thing to do is decide on the physical point of interconnection. You’ll want to choose a place where you have enough ports for a dual-sided vPC, with enough bandwidth to carry lateral traffic between the new and old environments until the migrations are complete.

Next we have to think about the layer 2 protocols in play. Since neither side relies on spanning tree, we need to take special care not to introduce a layer 2 loop.

The EVPN/VxLAN side will not do anything with STP BPDUs, but the fabricpath side must remain the root bridge. This is because the entire fabricpath domain looks like one physical bridge. If a port in the fabricpath domain receives a superior BPDU, a root-guard of sorts is enacted and the Classical Ethernet (CE) edge port begins blocking.

Why do we care if STP doesn’t pass over the EVPN fabric? If the fabricpath environment is interconnected at two points then there will be a loop back to the fabricpath domain. This is a situation we want to avoid.

It can be avoided by:

  1. only having one interconnect
  2. manually pruning VLANs at the two or more points of interconnect to ensure each VLAN remains on exactly ONE path
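
When pruning manually, it helps to audit the allowed-VLAN lists before bringing up a second interconnect. The following is a minimal Python sketch of that audit. The interconnect names, the VLAN IDs, and the `vlans_not_on_one_path` helper are hypothetical; in practice the allowed lists would come from your own trunk configurations.

```python
from collections import defaultdict


def vlans_not_on_one_path(allowed_vlans, migrating_vlans):
    """Find migrating VLANs that are not carried on exactly one interconnect.

    allowed_vlans: interconnect name -> set of VLAN IDs allowed on that trunk.
    Returns {vlan: [interconnects carrying it]} for every offending VLAN.
    """
    carried = defaultdict(list)
    for interconnect, vlans in allowed_vlans.items():
        for vlan in vlans:
            carried[vlan].append(interconnect)
    return {
        vlan: sorted(carried.get(vlan, []))
        for vlan in sorted(migrating_vlans)
        if len(carried.get(vlan, [])) != 1
    }


# Hypothetical allowed-VLAN lists for two points of interconnect.
allowed = {
    "interconnect-A": {10, 20, 30},
    "interconnect-B": {30, 40},
}

print(vlans_not_on_one_path(allowed, {10, 20, 30, 40, 50}))
# {30: ['interconnect-A', 'interconnect-B'], 50: []}
```

Here VLAN 30 is flagged because it would loop across both interconnects, and VLAN 50 is flagged because it is not carried anywhere.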

Migrating Switched Virtual Interfaces

Our preferred method of migrating SVIs from the old fabricpath environment to the new fabric is to:

  • build all of the new Distributed Anycast Gateways (DAG) on the new fabric
    • keep them shut down
  • establish an L3 adjacency via BGP for routing traffic back to exit points until the migrations are complete
  • add the VLANs being migrated to the dual-sided vPC
  • shut down the SVIs on the fabricpath side and no shut the DAGs on the new fabric
  • manually clear ARP on any hosts that did not update with the new DAG MAC

Migrating physical devices

Most of the physical devices are “easy” since there is no option but to physically move cables and you know this will result in a slight outage while the new uplinks come online.

However, with HA pairs of devices it is possible to migrate by moving the standby unit, waiting for HA to reestablish, forcing a failover, moving the formerly active unit, and then “failing” back to the primary unit. This will test your HA setup as well as provide a seamless migration.

If you have new compute and storage you can migrate your workloads directly to the new environment and age out the legacy compute/storage.

Finally, ensure there are no more devices in use on your old environment and decommission the devices.

If you have questions or need assistance, do not hesitate to reach out to us at IP ArchiTechs.

MikroTik – RouterOSv7 first look – MLAG on CRS 3xx switches

What is MLAG?

Multi-Chassis Link Aggregation Group or MLAG is an idea that’s been around for a while.

It allows for the ability to form LACP channels across multiple physical switches.

Wikipedia’s article on MLAG shows a few different topology examples.


Vendor implementations are proprietary but the idea of MLAG was first mentioned in 802.1AX-2008 in 2008.

It first started to become popular in data center networking in the late 2000s.

What makes the addition of MLAG to MikroTik’s RouterOS feature set notable is that it lowers the barrier to entry for this particular feature.

CRS 3xx switches are very inexpensive (starting at $149 USD) and may very well be the lowest cost MLAG capable hardware available on the market.



Introduced in 7.1beta6

MLAG has been asked for by the MikroTik community a number of times and the most active feature request thread started here in 2020:

new feature request MLAG!!! – MikroTik

MikroTik added several version 7 beta releases in 2021 and included MLAG for all CRS 3xx series switches in 7.1beta6 on May 18th, 2021.

Overview of protocol requirements

MLAG is fairly consistent across vendors: each implementation needs a link between the physical devices to manage the MLAG groups. In MikroTik’s implementation, the interfaces on this link are called peer ports, and they carry ICCP.

Here are a few terms for MikroTik MLAG:

ICCP (Inter Chassis Control Protocol) – Responsible for determining active/secondary switches and maintaining and updating the bridge table between physical switches.

Peer port – An interface that will be used as a peer port. Both peer devices use inter-chassis communication over the peer ports to establish MLAG and update the host table. The Peer port should be isolated on a different untagged VLAN using a pvid setting. The Peer port can be configured as a bonding interface.

System-id – The lowest MAC address between both peer bridges will be used as the system-id. This system-id is used for (R)STP bridge identifier.

Active-role – The peer with the lowest bridge MAC address will be acting as a primary device. The primary device is responsible for sending the correct LACP system ID on all MLAG ports.
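
The "lowest MAC wins" rule described for system-id and active-role can be illustrated with a few lines of Python. This is only a sketch of the rule as stated above, not MikroTik's implementation. The peer names and the second MAC address are hypothetical; the first MAC is the system-id shown in the lab output later in this post.

```python
def mac_to_int(mac):
    """Convert 'AA:BB:CC:DD:EE:FF' to an integer so MACs compare numerically."""
    return int(mac.replace(":", ""), 16)


def elect(bridge_macs):
    """bridge_macs: peer name -> bridge MAC address.

    Returns (system_id, primary_peer), both derived from the lowest MAC.
    """
    primary = min(bridge_macs, key=lambda name: mac_to_int(bridge_macs[name]))
    return bridge_macs[primary].upper(), primary


# Hypothetical peers; only the first MAC appears in the lab output.
peers = {
    "peer-a": "48:8F:5A:3A:44:BA",
    "peer-b": "48:8F:5A:3A:45:10",
}

print(elect(peers))  # ('48:8F:5A:3A:44:BA', 'peer-a')
```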

mlag-id – An integer from 0 to 4294967295, it is used to set the MLAG ID for bonding interfaces. The same MLAG ID should be used on both peer devices to successfully create a single MLAG.
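
Because a mismatched mlag-id between peers prevents a single MLAG from forming, a quick consistency check on the two bonding configurations can save troubleshooting time. The Python sketch below assumes the bonding name to mlag-id mappings have already been extracted from each switch; the `mlag_id_problems` helper and the sample values are hypothetical.

```python
MLAG_ID_MAX = 4294967295  # upper bound of the mlag-id range given above


def mlag_id_problems(peer_a, peer_b):
    """peer_a / peer_b: bonding name -> mlag-id as configured on each peer.

    Returns a list of human-readable problems (empty when consistent).
    """
    problems = []
    for peer, bonds in (("A", peer_a), ("B", peer_b)):
        for bond, mlag_id in bonds.items():
            if not 0 <= mlag_id <= MLAG_ID_MAX:
                problems.append(f"peer {peer} {bond}: mlag-id {mlag_id} out of range")
    ids_a, ids_b = set(peer_a.values()), set(peer_b.values())
    # Any mlag-id present on only one peer cannot form an MLAG.
    for mlag_id in sorted(ids_a ^ ids_b):
        owner = "A" if mlag_id in ids_a else "B"
        problems.append(f"mlag-id {mlag_id} configured only on peer {owner}")
    return problems


# Hypothetical example with a typo on the second peer (102 instead of 101).
print(mlag_id_problems({"Po1": 100, "Po2": 101}, {"Po1": 100, "Po2": 102}))
# ['mlag-id 101 configured only on peer A', 'mlag-id 102 configured only on peer B']
```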

MikroTik’s requirements for ICCP and MLAG are:

  • RouterOS ICCP does not require an IP configuration
  • ICCP should be isolated from the rest of the network using a dedicated untagged VLAN
  • Peer ports can also be configured as LACP bonding interfaces
  • MLAG requires STP or RSTP to be enabled


In order to present a single MAC address for the L2 spanning tree topology, ICCP functions on top of the peer ports to manage the MLAG/LACP system-id.

The system-id is used as the MAC address presented to the LACP client for RSTP/MSTP bridge identification.

reference for images and MLAG definitions: Multi-chassis Link Aggregation Group – RouterOS – MikroTik Documentation


Lab Example

In order to test the new MLAG functionality, we decided to set up a lab with CRS326-24S+2Q switches and CCR2004-1G-12S+2XS routers.

Below is the lab physical and logical topology.

Configuring an MLAG Group

Configure Bond and MLAG ID on CSW-01

/interface bonding
add mlag-id=100 mode=802.3ad name=Po1 slaves=sfp-sfpplus1


Configure Bond and MLAG ID on CSW-02

/interface bonding
add mlag-id=100 mode=802.3ad name=Po1 slaves=sfp-sfpplus1


* Apply each configuration step below on both switches to complete the MLAG setup for mlag-id 100. *


Configure the bridge and enable VLAN filtering. Add MLAG bonded interfaces and peer port to the bridge.

/interface bridge
add name=Bridge-MLAG vlan-filtering=yes
/interface bridge port
add bridge=Bridge-MLAG interface=Po1
add bridge=Bridge-MLAG interface=qsfpplus1-1 pvid=777


Configure a VLAN to be used over the MLAG

/interface bridge vlan
add bridge=Bridge-MLAG tagged=Po1 vlan-ids=3000


Set the peer port

/interface bridge mlag
set bridge=Bridge-MLAG peer-port=qsfpplus1-1


Validate the MLAG group


Show the status of the MLAG group, including the primary and secondary roles, and verify the system-id that the client LACP receives.

######## MLAG Switches - 2 x CRS326 ############

[[email protected]] > interface/bridge/mlag/monitor 
       status: connected
    system-id: 48:8F:5A:3A:44:BA
  active-role: primary

[[email protected]] > interface/bridge/mlag/monitor                        
       status: connected
    system-id: 48:8F:5A:3A:44:BA
  active-role: secondary

######## NON-MLAG LACP Router ############

[[email protected]] > /interface bonding monitor Po1
                    mode: 802.3ad
            active-ports: sfp-sfpplus3,sfp-sfpplus4
          inactive-ports: 
          lacp-system-id: 48:8F:5A:00:4F:80
    lacp-system-priority: 65535
  lacp-partner-system-id: 48:8F:5A:3A:44:BA
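
The key check in the output above is that the router's lacp-partner-system-id matches the MLAG system-id reported by both switches. A minimal Python sketch of automating that comparison follows. It assumes the monitor output has already been captured as text, and the `parse_monitor` helper is a hypothetical name, not part of RouterOS or any library.

```python
def parse_monitor(text):
    """Parse RouterOS 'key: value' monitor output into a dict.

    Lines without a value (for example an empty inactive-ports) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep:
            fields[key] = value.strip()
    return fields


mlag = parse_monitor("""
       status: connected
    system-id: 48:8F:5A:3A:44:BA
  active-role: primary
""")

bond = parse_monitor("""
                    mode: 802.3ad
            active-ports: sfp-sfpplus3,sfp-sfpplus4
          inactive-ports:
          lacp-system-id: 48:8F:5A:00:4F:80
    lacp-system-priority: 65535
  lacp-partner-system-id: 48:8F:5A:3A:44:BA
""")

# The LACP client should see the shared MLAG system-id as its partner.
print(mlag["system-id"] == bond["lacp-partner-system-id"])  # True
```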


Configurations

RTR-01

/interface bridge
add name=Lo0
/interface bonding
add mode=802.3ad name=Po1 slaves=sfp-sfpplus3,sfp-sfpplus4 transmit-hash-policy=layer-2-and-3
/interface vlan
add interface=Po1 name=vlan3000 vlan-id=3000
/routing table
add fib name=""
/ip address
add address=100.126.0.1/29 interface=vlan3000 network=100.126.0.0
add address=100.127.0.1 interface=Lo0 network=100.127.0.1
/ipv6 address
add address=200:100:126::1 interface=vlan3000
add address=200:100:127::1/128 advertise=no interface=Lo0
/system identity
set name=RTR-01

RTR-02

/interface bridge
add name=Lo0
/interface bonding
add mode=802.3ad name=Po1 slaves=sfp-sfpplus3,sfp-sfpplus4 transmit-hash-policy=layer-2-and-3
/interface vlan
add interface=Po1 name=vlan3000 vlan-id=3000
/routing table
add fib name=""
/ip address
add address=100.126.0.2/29 interface=vlan3000 network=100.126.0.0
add address=100.127.0.2 interface=Lo0 network=100.127.0.2
/ipv6 address
add address=200:100:126::2 interface=vlan3000
add address=200:100:127::2/128 advertise=no interface=Lo0
/system identity
set name=RTR-02

CSW-01

/interface bridge
add name=Bridge-MLAG vlan-filtering=yes
/interface bonding
add mlag-id=100 mode=802.3ad name=Po1 slaves=sfp-sfpplus1
add mlag-id=101 mode=802.3ad name=Po2 slaves=sfp-sfpplus2
/interface bridge mlag
set bridge=Bridge-MLAG peer-port=qsfpplus1-1
/interface bridge port
add bridge=Bridge-MLAG interface=Po1
add bridge=Bridge-MLAG interface=qsfpplus1-1 pvid=777
add bridge=Bridge-MLAG interface=Po2
/interface bridge vlan
add bridge=Bridge-MLAG tagged=Po1,Po2 vlan-ids=3000
/system identity
set name=CSW-01

CSW-02

/interface bridge
add name=Bridge-MLAG vlan-filtering=yes
/interface bonding
add mlag-id=100 mode=802.3ad name=Po1 slaves=sfp-sfpplus1
add mlag-id=101 mode=802.3ad name=Po2 slaves=sfp-sfpplus2
/interface bridge mlag
set bridge=Bridge-MLAG peer-port=qsfpplus1-1
/interface bridge port
add bridge=Bridge-MLAG interface=Po1
add bridge=Bridge-MLAG interface=qsfpplus1-1 pvid=777
add bridge=Bridge-MLAG interface=Po2
/interface bridge vlan
add bridge=Bridge-MLAG tagged=Po1,Po2 vlan-ids=3000
/system identity
set name=CSW-02