IP Infusion: EVPN-MPLS first look on GA 6.0

IP Infusion just released OcNOS version 6.0, and the release notes, as well as the press release, show a focus on EVPN with an MPLS data plane. Don't forget that EVPN and VxLAN aren't mutually exclusive; EVPN runs on, and was originally designed for, an MPLS data plane. I recently discussed this on a podcast, "EVPN doesn't need VxLAN," if you want to know more on that topic.

Let's take a look at a basic EVPN-VPWS and EVPN-VPLS deployment. Since we're looking at an MPLS data plane, we'll utilize ISIS-SR for MPLS label distribution, as segment routing is increasingly replacing LDP and RSVP-TE.

IGP and Label Distribution

First let’s look at the IGP setup and label distribution as everything else will be built on top of this.

ipi-1.lab.jan1.us.ipa.net#show run int lo
interface lo
 ip address 127.0.0.1/8
 ip address 100.127.0.1/32 secondary
 ipv6 address ::1/128
 ipv6 address 2001:db8::1/128
 prefix-sid index 101
 ip router isis UNDERLAY
 ipv6 router isis UNDERLAY
!

We have to set an index to create the node-SID for this device; in this case we use 101.

ipi-1.lab.jan1.us.ipa.net#show run segment-routing
segment-routing
 mpls sr-prefer
 global block 16000 23999

Since our segment routing global block (SRGB) starts at 16000, the node-SID becomes 16101: the SID is the SRGB start plus the index. Additionally, mpls sr-prefer makes the router prefer SR labels over LDP or RSVP-TE labels.
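The SID arithmetic can be sketched in a few lines of Python (a hypothetical helper for illustration, not part of OcNOS):

```python
# Node-SID = SRGB base + prefix-SID index (an absolute label within the SRGB).
SRGB_BASE = 16000
SRGB_END = 23999

def node_sid(index: int, base: int = SRGB_BASE, end: int = SRGB_END) -> int:
    """Return the absolute label advertised for a given prefix-SID index."""
    sid = base + index
    if not base <= sid <= end:
        raise ValueError(f"index {index} falls outside the SRGB {base}-{end}")
    return sid

print(node_sid(101))  # ipi-1 -> 16101
print(node_sid(102))  # ipi-2 -> 16102
```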

ipi-1.lab.jan1.us.ipa.net#show run isis
router isis UNDERLAY
 is-type level-1-2
 metric-style wide
 mpls traffic-eng router-id 100.127.0.1
 mpls traffic-eng level-1
 mpls traffic-eng level-2
 capability cspf
 dynamic-hostname
 fast-reroute ti-lfa level-1 proto ipv4
 fast-reroute ti-lfa level-2 proto ipv4
 net 49.0015.1001.2700.0001.00
 segment-routing mpls
!

Finally, we have to enable ISIS for segment routing.

ipi-1.lab.jan1.us.ipa.net#show clns neighbors

Total number of L1 adjacencies: 1
Total number of L2 adjacencies: 1
Total number of adjacencies: 2
Tag UNDERLAY:  VRF : default
System Id      Interface   SNPA                State  Holdtime  Type Protocol
ipi-2.lab.jan1.us.ipa.net xe48        3c2c.99c0.00aa      Up     26        L1L2 IS-IS
ipi-1.lab.jan1.us.ipa.net#show mpls ilm-table
Codes: > - installed ILM, * - selected ILM, p - stale ILM
        K - CLI ILM, T - MPLS-TP, s - Stitched ILM
       S - SNMP, L - LDP, R - RSVP, C - CRLDP
       B - BGP , K - CLI , V - LDP_VC, I - IGP_SHORTCUT
       O - OSPF/OSPF6 SR, i - ISIS SR, k - SR CLI
       P - SR Policy, U - unknown

Code    FEC/VRF/L2CKT    ILM-ID      In-Label    Out-Label   In-Intf    Out-Intf/VRF       Nexthop
     LSP-Type
   i>   100.127.0.1/32     4           16101       Nolabel     N/A        N/A              127.0.0.1
     LSP_DEFAULT
   B>   evpn:1             3           17          Nolabel     N/A        N/A              127.0.0.1
     LSP_DEFAULT
   B>   evpn:100           1           16          Nolabel     N/A        N/A              127.0.0.1
     LSP_DEFAULT
   B>   evpn:1             2           640         Nolabel     N/A        N/A              127.0.0.1
     LSP_DEFAULT
   P>   100.127.0.2/32     7           20          3           N/A        xe48             100.126.0.2
     LSP_DEFAULT
   i>   100.126.0.2/32     5           26240       3           N/A        xe48             100.126.0.2
     LSP_DEFAULT
   i>   100.127.0.2/32     6           16102       3           N/A        xe48             100.126.0.2
     LSP_DEFAULT

Now we can see that we have a CLNS/ISIS neighbor with ipi-2 as well as learned labels. We can see both devices' node-SIDs in the label table on ipi-1.


BGP EVPN Setup

Next we can build EVPN on top of the underlay to begin delivering services. First we have to build an EVPN BGP session between the two routers.

ipi-1.lab.jan1.us.ipa.net#show run bgp
!
router bgp 65000
 neighbor 100.127.0.2 remote-as 65000
 neighbor 100.127.0.2 update-source lo
 !
 address-family l2vpn evpn
 neighbor 100.127.0.2 activate
 exit-address-family
 !
ipi-1.lab.jan1.us.ipa.net#show bgp l2vpn evpn summary
BGP router identifier 100.127.0.1, local AS number 65000
BGP table version is 32
1 BGP AS-PATH entries
0 BGP community entries

Neighbor                 V   AS   MsgRcv    MsgSen TblVer   InQ   OutQ    Up/Down   State/PfxRcd     AD  MACIP
MCAST    ESI  PREFIX-ROUTE
100.127.0.2              4 65000 22856      22856      32      0      0  6d18h34m               2      1      0
     1      0      0

EVPN-VPWS

Next we can start building services on top. First we'll build an EVPN-VPWS service.

ipi-1.lab.jan1.us.ipa.net:
!
evpn mpls enable
!
evpn mpls vtep-ip-global 100.127.0.1
!
mac vrf BLUE
 rd 100.127.0.1:1
 route-target both evpn-auto-rt
!
evpn mpls id 100 xconnect target-mpls-id 2
 host-reachability-protocol evpn-bgp BLUE
!
interface xe46.10 switchport
 encapsulation dot1q 10
 access-if-evpn
  map vpn-id 100
!

EVPN MPLS has to be enabled. *IMPORTANT*: this requires a reboot. Next, the VTEP IP needs to be set. These are global settings for the environment.

For the creation of the service, we'll start by making a MAC VRF to generate the information needed to create an EVPN type-2 (MAC-IP) route.

Since this is VPWS, it is configured as a cross-connect (xconnect) and a target is defined. The target is the remote PE's vpn-id, in this case 2.

Finally, it is assigned to a switchport, which has to be of type access-if-evpn. This maps back to the EVPN MAC VRF via the xconnect. Anything arriving on xe46.10 with a dot1q tag of 10 is placed into this tunnel.
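The pairing logic amounts to each PE's configured target-mpls-id matching the other side's local vpn-id. A toy model of that check (purely illustrative, hypothetical function name):

```python
# Toy model of EVPN-VPWS xconnect pairing: the circuit comes up when each
# PE's target-mpls-id matches the remote PE's local vpn-id.
def vpws_up(local_a: int, target_a: int, local_b: int, target_b: int) -> bool:
    return target_a == local_b and target_b == local_a

# ipi-1: evpn mpls id 100 xconnect target-mpls-id 2
# ipi-2 (assumed mirror config): evpn mpls id 2 xconnect target-mpls-id 100
print(vpws_up(100, 2, 2, 100))  # True -> NW-SET in 'show evpn mpls xconnect'
print(vpws_up(100, 2, 3, 100))  # False -> ids don't pair up
```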

ipi-1.lab.jan1.us.ipa.net#show evpn mpls xconnect
EVPN Xconnect Info
========================
AC-AC: Local-Cross-connect
AC-NW: Cross-connect to Network
AC-UP: Access-port is up
AC-DN: Access-port is down
NW-UP: Network is up
NW-DN: Network is down
NW-SET: Network and AC both are up

Local                            Remote       Connection-Details

================================ ============ ==========================================================================
=========
VPN-ID       EVI-Name      MTU   VPN-ID       Source       Destination                   PE-IP           MTU   Type   NW
-Status
================================ ============ ==========================================================================
=========
100          ----          1500  2            xe46.10      --- Single Homed Port ---     100.127.0.2     1500  AC-NW  NW
-SET

Total number of entries are 1
ipi-1.lab.jan1.us.ipa.net#show evpn mpls xconnect tunnel
EVPN-MPLS Network tunnel Entries
Source           Destination      Status        Up/Down       Update        local-evpn-id remote-evpn-id
========================================================================================================
100.127.0.1      100.127.0.2      Installed     01:31:06      01:31:06      100           2

Total number of entries are 1

The tunnels are up, installed, and ready for forwarding. In the BGP table we can see the type-1 (Ethernet auto-discovery) routes that signal the VPWS circuit.

ipi-1.lab.jan1.us.ipa.net#show bgp l2vpn evpn vrf BLUE
BGP table version is 1, local router ID is 100.127.0.1
Status codes: s suppressed, d damped, h history, a add-path, * valid, > best, i - internal,
              l - labeled, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

[EVPN route type]:[ESI]:[VNID]:[relevent route informantion]
1 - Ethernet Auto-discovery Route
2 - MAC/IP Route
3 - Inclusive Multicast Route
4 - Ethernet Segment Route
5 - Prefix Route

    Network          Next Hop            Metric    LocPrf	Weight     Path  Peer          Encap
* i  [1]:[0]:[2]:[16]  100.127.0.2          0        100       0    i  100.127.0.2     MPLS
*>   [1]:[0]:[100]:[16]
                       100.127.0.1          0        100       32768  i  ----------      MPLS

Total number of prefixes 2

For VPWS the circuit is signaled via EVPN type-1 routes between the PEs; no MAC learning or type-2 advertisement is required.

ipi-1.lab.jan1.us.ipa.net#show evpn mpls mac-table
========================================================================================================================
=================
                                                     EVPN MPLS MAC Entries
========================================================================================================================
=================
VNID       Interface VlanId    In-VlanId Mac-Addr       VTEP-Ip/ESI                    Type            Status     MAC mo
ve AccessPortDesc
________________________________________________________________________________________________________________________
_________________


Total number of entries are : 0

Since this is VPWS, there are no MACs learned on the device.

[email protected]# run ping 172.16.0.2
PING 172.16.0.2 (172.16.0.2): 56 data bytes
64 bytes from 172.16.0.2: icmp_seq=0 ttl=64 time=21.531 ms
64 bytes from 172.16.0.2: icmp_seq=1 ttl=64 time=22.124 ms

Success! The CEs can reach each other over the EVPN-VPWS circuit.

EVPN-VPLS

Now we'll build an EVPN-VPLS service. The BGP setup is the same, so we'll focus solely on the differences, the first being the vpn-id creation.

mac vrf ORANGE
 rd 100.127.0.1:2
 route-target both evpn-auto-rt
!
evpn mpls id 1
 host-reachability-protocol evpn-bgp ORANGE
!

There is no endpoint defined as an xconnect. All that is necessary is to bind the MAC VRF to the EVPN vpn-id.

interface xe46.100 switchport
 encapsulation dot1q 100
 access-if-evpn
  map vpn-id 1
!

Again, a switchport defined as an access-if-evpn is necessary. It is then mapped to the vpn-id for the VPLS service. In this case, anything coming in with a dot1q tag of 100 will be placed into vpn-id 1.

ipi-1.lab.jan1.us.ipa.net#show evpn mpls mac-table
========================================================================================================================
=================
                                                     EVPN MPLS MAC Entries
========================================================================================================================
=================
VNID       Interface VlanId    In-VlanId Mac-Addr       VTEP-Ip/ESI                    Type            Status     MAC mo
ve AccessPortDesc
________________________________________________________________________________________________________________________
_________________

1          xe46.100  ----      ----      84c1.c132.5031 100.127.0.1                    Dynamic Local   -------    0
   -------
1          ----      ----      ----      84c1.c132.5032 100.127.0.2                    Dynamic Remote  -------    0
   -------

Total number of entries are : 2

Since this is a VPLS service, MACs are learned both locally and remotely. The remote MAC is the MAC of the remote CE, learned via EVPN from the VTEP 100.127.0.2.

ipi-1.lab.jan1.us.ipa.net#show bgp l2vpn evpn mac-ip vrf ORANGE
ESI                            Eth-Tag     Mac-Address    IP-Address                              VNID/LABEL     L3VNID
   Nexthop         GW-Type         Encap
0                              1           84c1:c132:5031 --                                      17             0
   100.127.0.1     --              MPLS
0                              1           84c1:c132:5032 --                                      17             0
   100.127.0.2     --              MPLS

The type-2 routes are populated in the BGP table.

[email protected]# run ping 192.168.0.2
PING 192.168.0.2 (192.168.0.2): 56 data bytes
64 bytes from 192.168.0.2: icmp_seq=0 ttl=64 time=21.894 ms
64 bytes from 192.168.0.2: icmp_seq=1 ttl=64 time=22.159 ms

Success! We have reachability across the service.

Conclusion

IP Infusion is continuing to build out its EVPN/MPLS and segment routing feature sets. It is exciting to see them mature as traditional LDP/VPLS deployments move to EVPN/MPLS. If you need assistance with the transition from LDP to segment routing, or from VPLS to EVPN, reach out to IP Architechs.

BGP communities for traffic steering – part 2: State Management across Data Centers

This post has been a while in the making and follows up on an article about BGP communities that can be found here. We then followed it up with more discussion about firewall design and placement, or lack thereof, on this podcast, which inspired me to finish up "part 2".

Anyone who has ever run active/active data centers has come across the problem of how to manage state. There are a few ways to handle it:

  • Ignore it and prepare yourself for a late night at the worst time.
  • Take everyone's word that systems will never have to talk to a system in a different security zone in the remote DC.
  • Utilize communities and BGP policy to manage state, which is what we'll focus on here.

One of the biggest reasons we see for stretching a virtual routing and forwarding (VRF) instance is to move DC-to-DC flows of the same security zone below the FWs. This reduces the load on the firewall and makes for easier rule management. However, it does introduce a state problem.

We’ll be using the smallest EVPN-multisite deployment you’ve ever seen with Nexus 9000v and Fortinet FWs.

Inter vrf intra data center

The first flow we'll look at is transitioning VRFs in the same data center. In this example, and all work going forward, vrf BLUE is allowed to initiate to vrf ORANGE; however, vrf ORANGE cannot initiate communication to vrf BLUE.

Assuming your firewall rules are correct this “just works” and is no different than running your standard deployment.

vrf-BLUE-1#ping 192.168.10.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.10.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 4/5/8 ms

Initial request

dc1-leaf-1# show ip route 192.168.10.0/24 vrf BLUE
IP Route Table for VRF "BLUE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.10.0/24, ubest/mbest: 1/0
    *via 172.16.0.1, [20/0], 17:29:08, bgp-65100, external, tag 65110

Fortinet-1 routing table

Return traffic

dc1-leaf-1# show ip route 192.168.1.0/24 vrf ORANGE
IP Route Table for VRF "ORANGE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.1.0/24, ubest/mbest: 1/0
    *via 172.16.0.5, [20/0], 17:30:21, bgp-65100, external, tag 65110

Inter DC intra vrf flow

Here is the flow that normally starts this conversation. There is a desire to move same-security-zone flows and/or large traffic flows (replication) between DCs below the FWs. This can reduce load on the FWs and make rulesets easier to manage, since you don't have to write a lot of exceptions for inbound flows on your untrusted interface.

vrf-BLUE-1#ping 192.168.2.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.2.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/18/23 ms

Initial request

dc1-leaf-1# show ip route 192.168.2.0/24 vrf BLUE
IP Route Table for VRF "BLUE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.2.0/24, ubest/mbest: 1/0
    *via 100.127.0.255%default, [200/1], 19:24:36, bgp-65100, internal, tag 6520
0, segid: 3003000 tunnelid: 0x647f00ff encap: VXLAN

Since we utilized EVPN-Multisite to extend the vrfs between DCs (to be covered in a later blog) the first stop is the border gateway. This is abstracted on the flow diagram but can be seen on the original BGP layout.

dc1-border-leaf-1# show ip route 192.168.2.0/24 vrf BLUE
IP Route Table for VRF "BLUE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.2.0/24, ubest/mbest: 1/0
    *via 100.127.1.255%default, [20/1], 19:30:27, bgp-65100, external, tag 65200
, segid: 3003000 tunnelid: 0x647f01ff encap: VXLAN
dc2-border-leaf-1# show ip route 192.168.2.0/24 vrf BLUE
IP Route Table for VRF "BLUE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.2.0/24, ubest/mbest: 1/0
    *via 100.127.1.2%default, [200/0], 19:30:59, bgp-65200, internal, tag 65200,
 segid: 3003000 tunnelid: 0x647f0102 encap: VXLAN
dc2-leaf-1# show ip route 192.168.2.0/24 vrf BLUE
IP Route Table for VRF "BLUE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.2.0/24, ubest/mbest: 1/0, attached
    *via 192.168.2.1, Vlan2000, [0/0], 20:00:30, direct, tag 3000

This traffic never reaches the FW on the way there and the same behavior happens on the return path. I’m not going to show every hop on the way as it’s identical but in reverse.

Inter vrf inter DC

Here is the flow that causes a problem. When you change VRFs and change DCs without any other considerations, there is an asymmetric path, which introduces a state problem. After defining and analyzing the problem, we'll walk through a solution.

vrf-BLUE-1#ping 192.168.20.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.20.2, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

Initial request

dc1-leaf-1# show ip route 192.168.20.0/24 vrf BLUE
IP Route Table for VRF "BLUE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.20.0/24, ubest/mbest: 1/0
    *via 172.16.0.1, [20/0], 17:50:50, bgp-65100, external, tag 65110

Fortinet-1 routing table

A vrf change has occurred and we're now in vrf ORANGE after starting in vrf BLUE.

dc1-leaf-1# show ip route 192.168.20.0/24 vrf ORANGE
IP Route Table for VRF "ORANGE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.20.0/24, ubest/mbest: 1/0
    *via 100.127.0.255%default, [200/1], 18:54:34, bgp-65100, internal, tag 6520
0, segid: 3003001 tunnelid: 0x647f00ff encap: VXLAN

We're going to skip the border gateways as nothing exciting happens there.

dc2-leaf-1# show ip route 192.168.20.0/24 vrf ORANGE
IP Route Table for VRF "ORANGE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.20.0/24, ubest/mbest: 1/0, attached
    *via 192.168.20.1, Vlan2001, [0/0], 18:58:30, direct, tag 3001

Now we hit the connected route on dc2-leaf-1 as we expected. Remember that we initiated state on fortinet-1.

Return traffic

Okay, now that we made it to vrf-ORANGE-2, what happens to the return traffic?

dc2-leaf-1# show ip route 192.168.1.0/24 vrf ORANGE
IP Route Table for VRF "ORANGE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.1.0/24, ubest/mbest: 1/0
    *via 172.16.1.5, [20/0], 17:47:05, bgp-65200, external, tag 65210

Fortinet-2 routing table

The first thing the return traffic does is try to switch VRFs back to vrf BLUE. However, fortinet-2 doesn't have state for this flow. Since vrf ORANGE can't initiate communication with vrf BLUE and there is no state in fortinet-2, the traffic is dropped on the default rule.
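The failure mode can be illustrated with a toy stateful-firewall model (purely illustrative, not vendor code): the forward path builds a session on fortinet-1, but the return path consults fortinet-2, which has no matching entry.

```python
# Toy model of the asymmetric-state problem: two independent stateful FWs
# that do not share session tables.
class StatefulFW:
    def __init__(self, name):
        self.name = name
        self.sessions = set()

    def forward(self, src, dst, initiation_allowed):
        """Permit if the flow may initiate (and record a session), or if the
        packet is return traffic of a known session; otherwise drop."""
        if initiation_allowed:
            self.sessions.add((src, dst))
            return True
        return (dst, src) in self.sessions  # return traffic of a known flow

fortinet1, fortinet2 = StatefulFW("fortinet-1"), StatefulFW("fortinet-2")

# BLUE -> ORANGE may initiate: the request transits fortinet-1 and builds state.
print(fortinet1.forward("192.168.1.2", "192.168.20.2", initiation_allowed=True))   # True
# The reply hairpins through DC2's firewall instead: ORANGE may not initiate,
# and fortinet-2 holds no session, so the packet hits the default drop rule.
print(fortinet2.forward("192.168.20.2", "192.168.1.2", initiation_allowed=False))  # False
```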

The Solution

The first thing we're going to do is set a community on generation of the type-5 route. This is done by matching a tag of $L3VNI-VLAN-ID and setting a community of $ASN:$L3VNI-VLAN-ID.

vlan 2000
  name BLUE-DATA
  vn-segment 2002000
vlan 2001
  name ORANGE-DATA
  vn-segment 2002001
vlan 3000
  name VRF-BLUE
  vn-segment 3003000
vlan 3001
  name VRF-ORANGE
  vn-segment 3003001

route-map RM-CON-BLUE permit 10
  match tag 3000
  set community 65100:3000
route-map RM-CON-ORANGE permit 10
  match tag 3001
  set community 65100:3001
vrf context BLUE
  vni 3003000
  rd auto
  address-family ipv4 unicast
    route-target both auto
    route-target both auto evpn
vrf context ORANGE
  vni 3003001
  rd auto
  address-family ipv4 unicast
    route-target both auto
    route-target both auto evpn

interface Vlan2000
  no shutdown
  vrf member BLUE
  ip address 192.168.1.1/24 tag 3000
  fabric forwarding mode anycast-gateway

interface Vlan2001
  no shutdown
  vrf member ORANGE
  ip address 192.168.10.1/24 tag 3001
  fabric forwarding mode anycast-gateway

interface Vlan3000
  no shutdown
  vrf member BLUE
  ip forward

interface Vlan3001
  no shutdown
  vrf member ORANGE
  ip forward
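The community values follow a simple $ASN:$L3VNI-VLAN-ID convention; a small helper (hypothetical, for illustration) makes the mapping explicit:

```python
# Build the origin community exactly as the route-maps do:
# match tag <L3VNI VLAN id> -> set community <local ASN>:<L3VNI VLAN id>.
def origin_community(asn: int, l3vni_vlan: int) -> str:
    return f"{asn}:{l3vni_vlan}"

# DC1 (AS 65100) tagging routes from vrf BLUE (VLAN 3000) and ORANGE (3001):
print(origin_community(65100, 3000))  # 65100:3000 -> matched by DC1-BLUE-CL
print(origin_community(65100, 3001))  # 65100:3001 -> matched by DC1-ORANGE-CL
# DC2 (AS 65200) tags the same way with its own ASN:
print(origin_community(65200, 3000))  # 65200:3000 -> matched by DC2-BLUE-CL
```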

By setting the policy logic correctly, we can force the traffic to always utilize the FW in the data center where the flow originated.

dc1-leaf-1# show run rpm

!Command: show running-config rpm
!Running configuration last done at: Sun Mar 20 15:08:38 2022
!Time: Sun Mar 20 15:43:29 2022

version 9.3(3) Bios:version
ip community-list standard DC1-BLUE-CL seq 10 permit 65100:3000
ip community-list standard DC1-ORANGE-CL seq 10 permit 65100:3001
ip community-list standard DC2-BLUE-CL seq 10 permit 65200:3000
ip community-list standard DC2-ORANGE-CL seq 10 permit 65200:3001
route-map BLUE-TO-FW-IN permit 10
  match community DC1-ORANGE-CL
route-map BLUE-TO-FW-IN permit 20
  match community DC2-ORANGE-CL
  set local-preference 120
route-map BLUE-TO-FW-OUT permit 10
  match community DC1-BLUE-CL DC2-BLUE-CL
route-map ORANGE-TO-FW-IN permit 10
  match community DC1-BLUE-CL
route-map ORANGE-TO-FW-IN permit 20
  match community DC2-BLUE-CL DC2-ORANGE-CL
  set local-preference 80
route-map ORANGE-TO-FW-OUT permit 10
  match community DC1-ORANGE-CL DC2-ORANGE-CL
route-map RM-CON-BLUE permit 10
  match tag 3000
  set community 65100:3000
route-map RM-CON-ORANGE permit 10
  match tag 3001
  set community 65100:3001

dc1-leaf-1# show run bgp

!Command: show running-config bgp
!Running configuration last done at: Sun Mar 20 15:08:38 2022
!Time: Sun Mar 20 15:44:05 2022

version 9.3(3) Bios:version
feature bgp

router bgp 65100
  neighbor 100.127.0.0
    remote-as 65100
    update-source loopback0
    address-family l2vpn evpn
      send-community extended
  vrf BLUE
    address-family ipv4 unicast
      advertise l2vpn evpn
      redistribute direct route-map RM-CON-BLUE
    neighbor 172.16.0.1
      remote-as 65110
      address-family ipv4 unicast
        send-community
        route-map BLUE-TO-FW-IN in
        route-map BLUE-TO-FW-OUT out
  vrf ORANGE
    address-family ipv4 unicast
      redistribute direct route-map RM-CON-ORANGE
    neighbor 172.16.0.5
      remote-as 65110
      address-family ipv4 unicast
        send-community
        route-map ORANGE-TO-FW-IN in
        route-map ORANGE-TO-FW-OUT out
dc2-leaf-1# show run rpm

!Command: show running-config rpm
!Running configuration last done at: Sun Mar 20 15:13:30 2022
!Time: Sun Mar 20 15:45:25 2022

version 9.3(3) Bios:version
ip community-list standard DC1-BLUE-CL seq 10 permit 65100:3000
ip community-list standard DC1-ORANGE-CL seq 10 permit 65100:3001
ip community-list standard DC2-BLUE-CL seq 10 permit 65200:3000
ip community-list standard DC2-ORANGE-CL seq 10 permit 65200:3001
route-map BLUE-TO-FW-IN permit 10
  match community DC2-ORANGE-CL
route-map BLUE-TO-FW-IN permit 20
  match community DC1-ORANGE-CL
  set local-preference 120
route-map BLUE-TO-FW-OUT permit 10
  match community DC1-BLUE-CL DC2-BLUE-CL
route-map ORANGE-TO-FW-IN permit 10
  match community DC2-BLUE-CL
route-map ORANGE-TO-FW-IN permit 20
  match community DC1-BLUE-CL DC1-ORANGE-CL
  set local-preference 80
route-map ORANGE-TO-FW-OUT permit 10
  match community DC1-ORANGE-CL DC2-ORANGE-CL
route-map RM-CON-BLUE permit 10
  match tag 3000
  set community 65200:3000
route-map RM-CON-ORANGE permit 10
  match tag 3001
  set community 65200:3001

dc2-leaf-1# show run bgp

!Command: show running-config bgp
!Running configuration last done at: Sun Mar 20 15:13:30 2022
!Time: Sun Mar 20 15:45:40 2022

version 9.3(3) Bios:version
feature bgp

router bgp 65200
  neighbor 100.127.1.0
    remote-as 65200
    update-source loopback0
    address-family l2vpn evpn
      send-community
      send-community extended
  vrf BLUE
    address-family ipv4 unicast
      redistribute direct route-map RM-CON-BLUE
    neighbor 172.16.1.1
      remote-as 65210
      address-family ipv4 unicast
        send-community
        route-map BLUE-TO-FW-IN in
        route-map BLUE-TO-FW-OUT out
  vrf ORANGE
    address-family ipv4 unicast
      redistribute direct route-map RM-CON-ORANGE
    neighbor 172.16.1.5
      remote-as 65210
      address-family ipv4 unicast
        send-community
        route-map ORANGE-TO-FW-IN in
        route-map ORANGE-TO-FW-OUT out
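The inbound route-maps above boil down to local-preference decisions keyed on the origin community. A toy decision function for dc1-leaf-1 (hypothetical, modeling only the policy shown; unmatched routes fall off the route-map and are denied, modeled here as None):

```python
# Sketch of dc1-leaf-1's inbound policy toward the FWs (AS 65100).
# Default local-preference is 100; highest local-preference wins best path.
def dc1_blue_to_fw_in(communities):
    if "65100:3001" in communities:        # DC1-ORANGE-CL: keep default LP
        return 100
    if "65200:3001" in communities:        # DC2-ORANGE-CL: boost to 120
        return 120
    return None                            # no permit clause matched -> deny

def dc1_orange_to_fw_in(communities):
    if "65100:3000" in communities:        # DC1-BLUE-CL: keep default LP
        return 100
    if {"65200:3000", "65200:3001"} & set(communities):  # remote-DC routes
        return 80                          # demote below the fabric path
    return None

# A DC2 ORANGE prefix heard from fortinet-1 gets LP 120, beating the iBGP/EVPN
# copy at LP 100, which pins BLUE -> ORANGE traffic through the local FW.
print(dc1_blue_to_fw_in(["65200:3001"]))    # 120
print(dc1_orange_to_fw_in(["65200:3000"]))  # 80
```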

Here is the result of this implementation:

vrf-BLUE-1#ping 192.168.20.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.20.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/18/22 ms

Let's look at the routing tables now.

dc1-leaf-1# show ip route 192.168.20.0/24 vrf BLUE
IP Route Table for VRF "BLUE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.20.0/24, ubest/mbest: 1/0
    *via 172.16.0.1, [20/0], 00:40:23, bgp-65100, external, tag 65110

fortinet-1 routing table

We've now changed VRFs to vrf ORANGE.

dc1-leaf-1# show ip route 192.168.20.0/24 vrf ORANGE
IP Route Table for VRF "ORANGE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.20.0/24, ubest/mbest: 1/0
    *via 100.127.0.255%default, [200/1], 20:54:07, bgp-65100, internal, tag 6520
0, segid: 3003001 tunnelid: 0x647f00ff encap: VXLAN

Again, we'll skip over the border gateways.

dc2-leaf-1# show ip route 192.168.20.0/24 vrf ORANGE
IP Route Table for VRF "ORANGE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.20.0/24, ubest/mbest: 1/0, attached
    *via 192.168.20.1, Vlan2001, [0/0], 20:57:54, direct, tag 3001

Return traffic

Now the return traffic will go back to fortinet-1 where we have the original state instead of fortinet-2.

dc2-leaf-1# show ip route 192.168.1.0/24 vrf ORANGE
IP Route Table for VRF "ORANGE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.1.0/24, ubest/mbest: 1/0
    *via 100.127.1.255%default, [200/2000], 00:43:36, bgp-65200, internal, tag 6
5100, segid: 3003001 tunnelid: 0x647f01ff encap: VXLAN

Skipping over the border gateways, we land back at dc1-leaf-1.

dc1-leaf-1# show ip route 192.168.1.0/24 vrf ORANGE
IP Route Table for VRF "ORANGE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.1.0/24, ubest/mbest: 1/0
    *via 172.16.0.5, [20/0], 19:54:44, bgp-65100, external, tag 65110

And we arrive back at fortinet-1, where we have a valid session.

We switch VRFs back to vrf BLUE and hit the connected route.

dc1-leaf-1# show ip route 192.168.1.0/24 vrf BLUE
IP Route Table for VRF "BLUE"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]
'%<string>' in via output denotes VRF <string>

192.168.1.0/24, ubest/mbest: 1/0, attached
    *via 192.168.1.1, Vlan2000, [0/0], 22:19:11, direct, tag 3000

Conclusion

That was a lot of work to meet the goal of utilizing both data centers, allowing vrf-to-vrf communication below the firewalls, and not breaking state.

However, it is manageable, and it gives a few other benefits:

  • Being able to take an entire DC's firewall stack offline without losing connectivity
  • Less load on the FWs
  • Less FW rule complexity

But with this comes increased routing complexity, so as always there are tradeoffs! Make sure you analyze them against your business needs before proceeding.

If you’d like to know more or need help with that contact us at IP Architechs.