MikroTik CHR – Breaking the 100G barrier

Introduction

The world is a strange place today. With the Covid-19 crisis raging all over the world, most ISPs are fighting a daily battle to deliver more bandwidth.

  • Work from home
  • Online schools
  • Increasing content consumption

All of this pushed ISPs to their bandwidth limits, leaving them no option but to look for upgrades to keep up with ever-growing bandwidth demand. There they ran into another set of problems in this completely new and strange world: chip shortages, logistics issues and labor health problems drove prices up and wiped out stock availability. Here at IP ArchiTechs, we spend a lot of our time finding good solutions for our customers and helping them get through these hard times. Whether it is a regular team meeting or just a chat with colleagues, on almost any occasion someone brings up a new idea to improve capacity and performance for our customers.

Starting with the question of what platform is available today, and of course ready to ship immediately after checkout and payment, one answer was right in front of me: x86 servers, dozens of them. They are left over from the days when we bought new hardware just because the new generation was 10 to 15% faster than the previous one, with minor or no new features. Looking across the three DDR4-capable server generations, Xeon E5 v3, Xeon E5 v4 and the 1st-generation Scalable processor family, we could not see epic or dramatic improvements between them, and that keeps these servers widely available in reseller stock.

That’s why I mainly focused on x86 while running performance tests with the new RouterOS 7, to see what is achievable with that combination of software and hardware. In this, my first article on StubArea51, I will show you how we achieved 200Gbps of aggregate throughput, along with the testing methodology. Enjoy.


https://iparchitechs.com/contact

Hardware

When it comes to hardware, I chose what are, in my opinion, the most readily available servers on eBay:

  • Dell R430
  • Dell R630
  • Dell R730

This does not mean that you can’t choose something equivalent from a different manufacturer; all of them have almost the same specs, just a different brand. So, let’s dive into the specs of these three and see the differences between them.

Pick your Box

Dell R430 or equivalent

  • 1U chassis
  • Short-depth chassis (23.9 in)
  • Supports dual CPU configuration with Intel v3 and v4 processors
  • Supports up to two PCIe 3.0 cards
  • Embedded network card, integrated (soldered) on the motherboard

Dell R630 or equivalent

  • 1U chassis
  • Chassis depth (27.57 in)
  • Supports dual CPU configuration with Intel v3 and v4 processors
  • Supports up to three PCIe 3.0 cards
  • Network card is in Dell nDC format, with three options: 4x 1Gb, 2x 1Gb + 2x 10Gb SFP+, or 4x 10Gb SFP+

Dell R730 or equivalent

  • 2U chassis
  • Chassis depth (26.92 in)
  • Supports dual CPU configuration with Intel v3 and v4 processors
  • Supports up to seven PCIe 3.0 cards
  • Network card is in Dell nDC format, with three options: 4x 1Gb, 2x 1Gb + 2x 10Gb SFP+, or 4x 10Gb SFP+

The Dell R730 is the densest platform of the three. It has 7 PCIe slots plus one integrated nDC card, where you can also terminate your 1Gb or 10Gb connections and attach them to RouterOS 7. For our testing, we went with the Dell R730 chassis.

Pick your CPU

When it comes to the CPU recommendations, we had these processors available in our lab:

  • 2x Intel Xeon E5-2630v3
  • 2x Intel Xeon E5-2680v3
  • 2x Intel Xeon E5-2650v4
  • 2x Intel Xeon E5-2667v4
  • 2x Intel Xeon E5-2699v4

We decided to use a mid-range processor, and our pick for all the tests was 2x Intel Xeon E5-2650v4. Here are the specs of the chosen CPU:

  • Base frequency: 2.2 GHz
  • Turbo frequency: 2.9 GHz
  • Bus speed: 9.6 GT/s
  • PCIe lanes: 40

The chassis we selected for testing has 7 PCIe slots, 4 full-height and 3 low-profile. Below you can see how the slots are associated with the CPUs:

As you can see from the picture above, we have 4 slots tied to CPU2 and 3 tied to CPU1. The NDC card is not included in that count; the NDC is always tied to CPU1. A natural conclusion would be that we have good load balancing between the CPUs.

Unfortunately, this is not actually good for us, so let’s talk about why, and about how the CPUs communicate with each other. The CPUs are interconnected with a QPI bus, and each CPU has its own memory controller. Traffic between the CPUs is called “snoop traffic”. The bad thing about this type of traffic is that, in our use case, crossing the QPI link introduces asymmetric forwarding paths and adds latency.

For example: if you have two NICs, one tied to CPU1 and the other tied to CPU2, a TCP connection will flow from NIC1 to CPU1, cross the QPI link to CPU2, and finally leave through NIC2. This causes:

  • TCP streams getting out of sync
  • Packet reordering
  • Latency issues

One way to work around this is to keep the NIC groups separated and cable them to the switch accordingly. If you have three NICs tied to CPU2, the best approach is to group only those into one LACP bond and not to mix in NICs that are tied to CPU1. This minimizes the impact of the issues listed above; a quick way to check which CPU a NIC belongs to is shown below.
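
To see which CPU a given NIC actually hangs off before building the bond, here is a minimal sketch, assuming a Linux host or guest that can see the physical NICs (for example one of the Ubuntu traffic generators described later); it simply reads the NUMA node that sysfs reports for each interface:

# Print the NUMA node (CPU socket) each physical NIC is attached to.
# Typically node 0 corresponds to CPU1 and node 1 to CPU2; -1 means the
# platform did not report a node for that device.
for nic in /sys/class/net/*; do
    [ -e "$nic/device/numa_node" ] || continue   # skip bonds, VLANs and other virtual interfaces
    printf '%s -> NUMA node %s\n' "$(basename "$nic")" "$(cat "$nic/device/numa_node")"
done

NICs that report the same node can then share one LACP group without their traffic having to cross the QPI link.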

Pick your memory

Memory configuration is not crucial in our use case. We kept it in Advanced ECC mode, configured with two banks per CPU. We had a 32GB stick in slots A1, A2, B1 and B2, which gave us plenty of memory (128GB), more than we need. You will probably be fine with four 16GB or even 8GB sticks (if you can find them anywhere).

Peripherals and RAID controller

On the Intel Xeon v3 and v4 processors you have 40 PCIe lanes available per CPU, and our idea is to keep them as free as possible. You will not need a RAID controller or any other peripherals attached to your RouterOS box. We used the SATA port on the motherboard to connect an SSD and installed the software on it.

Pick your NICs

Choosing the right NIC is essential for your MikroTik RouterOS CHR. We went through a lot of different options, but one really stood out: the Intel X520 dual 10Gbps SFP+ card, a PCIe 2.0 x8 card.

The reasons we went with the Intel X520 are, first, that we have a lot of them in our lab, and second, that they are widely available and affordable. This card has been supported by MikroTik for a long time and has the all-important drivers that allow pass-through from the hypervisor directly to the guest OS.

We installed six of them in our chosen box (a fixed form factor server), populating slots 1 through 6. This setup gives us 120Gbps of potential throughput (six cards x two 10Gbps ports), which is perfect for what we are trying to accomplish: breaking the 100Gbps barrier with our MikroTik CHR router.

Pick your PSUs and iDRAC

PSUs are also an important factor, giving you the necessary redundancy for your system. The Platinum version is a good option because its energy efficiency will save you money. Always have two PSUs for a fully redundant setup.

iDRAC (or equivalent) comes in handy, especially when you need to recover your router remotely and see what is happening. We have enterprise licenses in our lab, so we were able to log in to the remote console through the web and access the chassis keyboard, mouse and monitor remotely.

Software

Hypervisor: ESXi version 7.

Mikrotik: 7.1beta6

Ubuntu Server 20.04

Ubuntu Server was used to run the iPerf3 server and client instances on the other servers, which generated the traffic.
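
iPerf3 is not part of a default Ubuntu Server install; a one-line sketch for pulling it in on each traffic generator (assuming the standard Ubuntu repositories):

sudo apt update && sudo apt install -y iperf3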

LAB Setup

After choosing all the necessary components for the chassis that will run the MikroTik CHR, let’s see what else we need for the lab. We need the following:

  • 12x Traffic generators
  • 10Gbit Switch
  • Lots of 10Gb DAC cables
  • Ethernet Patch cables
  • 1Gbit Switch for ESXi management
  • 100Mbps Switch for iDrac management
  • Console Server for remote access to the switch console ports
  • UPS

Traffic generators used:

  • 8x Dell R430:
    • 2x Intel Xeon E5-2630v3
    • 128GB RAM
    • QLogic® 57810 dual-port 10Gb
  • 1x Dell R630:
    • 2x Intel Xeon E5-2680v3
    • 128GB RAM
    • Intel X520 dual-port 10Gb SFP+
  • 1x Dell R630:
    • 2x Intel Xeon E5-2630v3
    • 128GB RAM
    • Dell nDC 2x1Gb i350 + X520 2x SFP+
  • 2x Fujitsu RX2530 M2:
    • 2x Intel Xeon E5-2667v4
    • 128GB RAM
    • Mellanox® ConnectX®-3 dual-port 10Gb Direct Attach/SFP+

All the traffic generators had ESXi 7.0b installed, running Ubuntu Server 20.04 as a VM with pass-through enabled for the NICs.
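
As a sanity check, here is a minimal sketch (assuming an Ubuntu guest with Intel X520 ports; the QLogic and Mellanox cards will report their own controller names) to confirm from inside a traffic generator VM that the 10Gb ports really were passed through rather than presented as virtual adapters:

# Passed-through X520 ports show up as Intel 82599-based 10GbE controllers
# instead of VMware vmxnet3 adapters.
lspci -nn | grep -i ethernet

# Confirm the native ixgbe driver is bound to them.
lspci -k | grep -A 3 -i 82599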

10Gbit Switch

Our primary choice for the switch was the Dell S4048-ON, with 48 SFP+ and 6 QSFP+ ports, capable of handling 1.44 Tbps, which is more than enough for our tests.

All the other equipment varies, but it is based on MikroTik switches, a Digi console server and APC UPSs.

Software configurations

VMware ESXi

There are no real configuration changes in ESXi beyond the default settings; the only thing configured is pass-through, which is enabled on the 10Gbit NICs.

Mikrotik VM configuration

MikroTik RouterOS CHR configuration: the Ethernet interfaces are renamed, bundled into a single 802.3ad (LACP) bond (Po0), and a VLAN for each traffic-generator pair rides on top of the bond:

/interface ethernet
set [ find default-name=ether8 ] mtu=1520 name=ether1
set [ find default-name=ether1 ] mtu=1520 name=ether2
set [ find default-name=ether2 ] mtu=1520 name=ether3
set [ find default-name=ether3 ] mtu=1520 name=ether4
set [ find default-name=ether4 ] mtu=1520 name=ether5
set [ find default-name=ether5 ] mtu=1520 name=ether6
set [ find default-name=ether6 ] mtu=1520 name=ether7
set [ find default-name=ether7 ] mtu=1520 name=ether8
set [ find default-name=ether1 ] disable-running-check=no mtu=1520 name=ether9
/interface bonding
add mode=802.3ad name=Po0 slaves=ether1,ether2,ether3,ether4,ether5,ether6,ether7,ether8 transmit-hash-policy=\
    layer-3-and-4
/interface vlan
add interface=Po0 name=VLAN3000-SERVER-01-IPERF-01-SERVER-03 vlan-id=3000
add interface=Po0 name=VLAN3001-SERVER-03-SERVER-01 vlan-id=3001
add interface=Po0 name=VLAN3002-SERVER-01-IPERF-02-SERVER-04 vlan-id=3002
add interface=Po0 name=VLAN3003-SERVER-04-SERVER-01 vlan-id=3003
add interface=Po0 name=VLAN3004-SERVER-06-SERVER-07 vlan-id=3004
add interface=Po0 name=VLAN3005-SERVER-07-SERVER-06 vlan-id=3005
add interface=Po0 name=VLAN3006-SERVER-05-SERVER-08 vlan-id=3006
add interface=Po0 name=VLAN3007-SERVER-08-SERVER-05 vlan-id=3007
add interface=Po0 name=VLAN3009-SERVER-09-SERVER-10 vlan-id=3009
add interface=Po0 name=VLAN3010-SERVER10-SERVER-09 vlan-id=3010
add interface=Po0 name=VLAN3011-SERVER-11-SERVER-12 vlan-id=3011
add interface=Po0 name=VLAN3012-SERVER12-SERVER-11 vlan-id=3012
/system identity
set name=RouterOS

Ubuntu Server 20.04

Sample Ubuntu traffic generator netplan configuration

network:
  bonds:
    bond0:
      interfaces:
      - ens160f0
      - ens160f1
      parameters:
        lacp-rate: slow
        mode: 802.3ad
        transmit-hash-policy: layer3+4
  ethernets:
    ens160f0: {}
    ens160f1: {}
    ens224:
      dhcp4: true
  version: 2
  vlans:
    bond0.3000:
      addresses:
      - 10.1.1.2/30
      routes:
      - to: 10.1.3.0/30
        via: 10.1.1.1
      - to: 10.6.7.0/30
        via: 10.1.1.1
      id: 3000
      link: bond0
      nameservers:
        addresses: []
        search: []
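
After dropping a file like the one above into /etc/netplan/, a minimal sketch for activating and verifying it on the traffic generator (assuming the bond and VLAN names from the sample):

# Validate and apply the netplan configuration.
sudo netplan try      # reverts automatically unless the change is confirmed
sudo netplan apply

# Verify that the 802.3ad bond negotiated LACP and that both members are active.
cat /proc/net/bonding/bond0

# Check that the VLAN sub-interface came up with its address.
ip addr show bond0.3000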

iPerf3

Client:

The client configuration runs three separate iPerf3 instances with 10 parallel streams each, to better utilize multi-core performance:

iperf3 -c <server IP> -T p1 -P 10 -t 0 -p 5101 & 
iperf3 -c <server IP> -T p2 -P 10 -t 0 -p 5102 & 
iperf3 -c <server IP> -T p3 -P 10 -t 0 -p 5103
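
Here -T tags each instance's output so the three can be told apart, -P 10 opens 10 parallel TCP streams per instance, -p points each instance at its own server port, and the trailing '&' backgrounds the first two commands so all three instances run concurrently.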

Server:

Three separate instances are also run on the server side:

iperf3 -s -p 5101 & 
iperf3 -s -p 5102 & 
iperf3 -s -p 5103
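
Before starting the clients, a small sketch (assuming the ports above) to confirm that all three server instances are listening:

# Each iperf3 server instance should show a LISTEN socket on its port.
ss -ltn | grep -E ':(5101|5102|5103)'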

TEST RESULTS

L2/L3 Diagram

Finally, we ran the tests, and the results are great. We were able to push almost 10 million packets per second and 101Gbps of full-duplex throughput (202Gbps total aggregate). The picture below was taken while the test was running:

We also ran another type of test, to see how the new CHR v7 performs with and without Hyper-Threading enabled. See the screenshots below:

HyperThreading On

HyperThreading Off

This test gave us an interesting result: the new v7 CHR performs much better with Hyper-Threading enabled.

Conclusion

RouterOS v7 CHR performs much better with the new kernel and new routing engine. With v6 we were able to do a maximum of 27Gbps in our labs. CPU multi-core utilization is much better, with the load balanced across the cores. It also supports newer hardware like the Intel XL710, which we will be testing next.

So, I must conclude: we did break the 100G barrier with MikroTik!

All tests were done without measuring:

  • Out-of-order packets
  • Retransmissions
  • Latency

We are working on setting up Cisco TRex traffic generators to measure RFC 2544 performance of the MikroTik CHR virtual router.

Special thanks to the whole IPA crew for the support, especially to:

Kevin-Wan Myers – IPA Jedi Master (thanks for the force)

Derek Mr Anderson – Supreme commander of our Matrix

Matrix Vincman Shuele – (he actually took the glove off Thanos, but Iron Man wanted to show off)

Whitebox networking – coming soon to an edge near you?


What is whitebox networking and why is it important?


A brief history of the origins of whitebox

One of the many interesting conversations to come out of my recent trip to Network Field Day 14 (NFD14) hosted by Gestalt IT was a discussion on the future of whitebox. As someone who co-founded a firm that consults on whitebox and open networking, it was a topic that really captivated me and generated a flurry of ideas on the subject. This will be the first in a series of posts about my experiences and thoughts on NFD14.

Whitebox is a critical movement in the network industry that is reshaping the landscape of what equipment and software we use to build networks. At the dawn of the age of IT in the late 80s and early 90’s, we used computing hardware and software that was proprietary – a great example would be an IBM mainframe.

Then we evolved into the world of x86 and along came a number of operating systems that we could choose from to customize the delivery of applications and services. Hardware became a commodity and software became independent of the hardware manufacturer.

ONIE – The beginning of independent network OS

A few years ago, an initiative called the Open Network Install Environment, or ONIE for short, was started by Cumulus Networks and shortly thereafter received support from NFD14 presenter Big Switch Networks, among others. It was the first open project to develop a common multi-vendor framework for separating a network operating system from the hardware – just like we did with x86 and computing in the late 80s and early 90s.

The importance of whitebox in 2017 and beyond

Whitebox is critically important to the future of networking because it is forcing all incumbent network hardware/software vendors to compete in an entirely different way. The idea that a tightly integrated network operating system on proprietary hardware is essential to maintaining uptime and availability is rapidly fading.

Tech giants like Google, Facebook and LinkedIn have proven that commodity and open network hardware/software can scale and support the most mission-critical environments. This has given early adopters of whitebox networking the confidence to deploy it in the enterprise data center.

Whitebox is only a data center technology…right?

White box has been so disruptive in the data center, why wouldn’t we want it everywhere?

As we visited the presenters of NFD14 who had been developing whitebox software and hardware (Big Switch Networks and Barefoot Networks), one idea in particular stood out in my mind – White box has been so disruptive in the data center, why wouldn’t we want it everywhere?

The use case for whitebox in the data center has grown so much in the last few years, that it’s now a major part of the conversation when selecting a vendor for new data centers that I’m involved with designing. This is a huge leap forward as most companies would have struggled with considering whitebox technology as recently as two or three years ago.

A clear advantage to come out of the hard work that whitebox vendors have done in marketing and selling data centers on the proposition of commodity hardware and independent operating systems is that it’s paved the way for whitebox to make it to the branch and edge of the network.

Vendor position on whitebox as an edge technology

The idea that whitebox can be used outside of the data center seems to be prevalent among the vendors we met with during NFD14. This is a question I specifically posed to both Barefoot and Big Switch and the answer was the same – both companies have developed technology that is reshaping the network engineering landscape and they would like to see it go beyond the boundaries of the data center.

I would expect that in the next 12 to 18 months we are going to see use cases targeted at the edge and maybe even the campus distribution/core. A small leaf-spine architecture would be well suited to run most enterprise campuses, and the CAPEX/OPEX benefits would be enormous. Couple that with smaller edge switches and you’ve got a really strong argument for ditching the incumbent network vendor and breaking out of vendor lock-in.

And it makes a lot of sense for whitebox companies to expand into this market: there are already edge platforms in the world of commodity switching that support 48 ports of PoE copper with 4x 10 Gbps uplinks. Aside from the obvious cost savings, imagine taking some of the orchestration/automation systems that are being used right now for the data center and applying them to problems like edge port provisioning or Network Access Control. Being able to support large wireless rollouts and controllers in a more automated fashion would also be a huge win.

Role in Network Function Virtualization and the virtual enterprise branch

Network Function Virtualization (NFV) has gotten a lot of press lately as more and more organizations virtualize routers, firewalls, WAN optimizers and now SD-WAN appliances.

One of the endgames that I see coming out of the whitebox revolution is the marrying of NFV and whitebox edge switching to create a virtual branch in a box. The value to an enterprise is enormous here as the underlying hardware can be used in a longer design cycle while the software running on top can be refreshed to solve new business requirements as needed.

The other major benefit of hardware abstraction is that it simplifies orchestration and automation when interfaces are VLAN based and not tied to a specific port/number.

What are the challenges to getting whitebox into the hands of the average enterprise for edge or campus use?

While not an exhaustive list, these are a few of the challenges to getting whitebox into the average enterprise.

  • Perception is one of the key challenges to moving towards whitebox in the edge. Most enterprises tend to be hesitant to be early adopters unless there is a clear business advantage to doing so.
  • Vendor lock in is another major barrier as most enterprises tend to stay within one vendor for routing and switching.
  • Confidence is a key part of the hardware selection process, and this is an area where I feel whitebox is gathering serious momentum. That will help make the edge an easier sell as it becomes a more common use case.
  • Training and support is always one of the first questions a network team asks, to gauge how easy it will be to support the new deployment and to get the techies up to speed on its care and feeding. This is another area of whitebox that has seen a huge amount of growth in the last few years. Big Switch has a fantastic lab for learning data center deployments, and hopefully as the use cases expand we’ll see the same high-quality training for those areas as well.

Closing thoughts

Whitebox at the edge is closer than ever before and there are small pockets of actual deployments which are mostly in the service provider world.

However, we’ll likely have to wait until the use cases are specifically developed and marketed towards the enterprise before there is significant momentum in adoption. As a designer of whitebox solutions, it’s something that I’ll continue to push for and evaluate, as it has the potential to make an enormous impact in enterprise networking.

That said, the future of whitebox is incredibly bright and 2017 should bring a significant amount of growth and more movement towards commodity switching as a mainstream technology in all areas of networking.