A reference guide for new & existing ISPs that need to understand network functions and separation.
“How do I add redundancy?” “How do I scale?” “How do I reduce downtime and operational costs?”
These are questions that I get asked practically every day as a consulting network architect who designs and builds ISPs.
In most cases the answer is the same whether the ISP uses fixed wireless broadband, copper or fiber to deliver the last mile – separation of network functions.
This illustrated guide is intended to define the topic and create visual context for each function using a network drawing. It’s the first in a new series on this subject.
A new series of content
This topic is deep and there is a lot to unpack, so this will be the first segment in a series of blog posts and videos covering function separation.
Large ISPs typically already embrace the philosophy of separating network functions, so the focus of this series will be to help new or growing regional ISPs understand the design intent and the challenges/costs of running networks that don’t separate network functions.
I typically spend more time in the enterprise data center than most of our team members, and this comes with its own unique set of problems. One discussion that never fails to come up is “Where do I put the firewalls (FWs)?” That is typically followed by “I have a disaster recovery or backup site with FWs there as well.” This inevitably leads to a state management problem. Let’s look at how we can utilize BGP to address this problem:
what a BGP standard community is
the BGP best path selection process
how to utilize communities to steer traffic
This is something most service providers deal with on a daily basis but can be new to an enterprise.
BGP Standard communities
A BGP community is a route attribute that essentially provides extra information about a route, such as where it came from (location, type, organizational role), so that someone can take action on it or glean information from it.
By definition, a community is a 32-bit number that can be included with a route and, when utilizing the new community format, is displayed as (0-65535):(0-65535). It is recommended to utilize the new community format rather than the old community format, which is just a number. It is typical to utilize $YOURASN:$VALUE, such as 65000:1000, which would be a community within ASN 65000 where the value 1000 signifies something such as “came from datacenter 1”. This ensures that you know the community was originated by your organization.
There are some well-known communities that have global meaning. BGP communities are also what is called an optional transitive BGP attribute, meaning that they can be passed from autonomous system (AS) to autonomous system. It is typically recommended to strip communities sent from other organizations to prevent interference with your local policy. Telia has a great looking glass utility (https://lg.telia.net) that gives information on the communities they’ve attached to your routes.
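To make the two formats concrete, here is a minimal Python sketch of how the ASN:VALUE notation maps onto the single 32-bit number. The function names are illustrative only and do not come from any router OS or library.

```python
def encode_community(asn: int, value: int) -> int:
    """Pack ASN:VALUE into the single 32-bit number (old format)."""
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits (0-65535)")
    # The ASN occupies the high 16 bits, the value the low 16 bits.
    return (asn << 16) | value


def decode_community(raw: int) -> str:
    """Render a 32-bit community in the new ASN:VALUE format."""
    if not 0 <= raw <= 0xFFFFFFFF:
        raise ValueError("a standard community is a 32-bit number")
    return f"{raw >> 16}:{raw & 0xFFFF}"


raw = encode_community(65000, 1000)
print(raw)                    # old format: 4259841000
print(decode_community(raw))  # new format: 65000:1000
```

The old-format number is hard to read at a glance, which is a practical reason to prefer the ASN:VALUE display.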
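The stripping recommendation can be sketched as a simple filter. This is an illustration in Python, not router configuration: the function name, the keep_well_known switch, and the received community 64500:20 (from a documentation ASN) are assumptions made for the example. The three well-known values are the ones defined in RFC 1997.

```python
LOCAL_ASN = 65000  # example ASN used in the text

# Well-known communities defined in RFC 1997.
WELL_KNOWN = {
    "65535:65281": "NO_EXPORT",
    "65535:65282": "NO_ADVERTISE",
    "65535:65283": "NO_EXPORT_SUBCONFED",
}


def strip_foreign(communities, local_asn=LOCAL_ASN, keep_well_known=True):
    """Keep only communities set by our own ASN (and optionally well-known ones)."""
    kept = []
    for community in communities:
        asn = int(community.split(":")[0])
        if asn == local_asn or (keep_well_known and community in WELL_KNOWN):
            kept.append(community)
    return kept


# Hypothetical set of communities received on a route from another organization.
received = ["64500:20", "65000:1000", "65535:65281"]
print(strip_foreign(received))
```

Whether to honor well-known communities from a peer is a policy decision, which is why it is a switch here rather than fixed behavior.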
BGP path selection
Now that we know what a community is, let’s look at the BGP path selection process. It’s not uncommon for vendors to have vendor-specific selection criteria, with “weight” being at the top. However, local preference is typically the first criterion, with the highest value being the best. Then come locally originated routes, shortest AS path, origin check, and Multi Exit Discriminator (MED). There are more selection criteria than this, but we will focus on local preference (LP) since it is at the top of the list.
A difference in local preference will ALWAYS win over the other criteria, with the exception of a weight value where applicable. For example, in a Cisco environment the highest weight beats the LP; if the weights are equal, the decision moves on to LP.
You might have noticed that there is no “community” value in the path selection process. A community does not influence selection by itself; instead, we are going to combine the community value with a policy to take an action on a route. Let’s tie this back to firewalls, security zones, and policy in the next section.
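The ordering described above can be sketched as a sort key. This is a simplified Python model for illustration, not any vendor’s implementation: it covers only the criteria named in this post, compares MED across all routes (real implementations usually compare it only between routes from the same neighboring AS), and the Route fields and next-hop names are hypothetical.

```python
from dataclasses import dataclass, replace

# Lower rank is preferred: IGP < EGP < incomplete.
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Route:
    next_hop: str
    weight: int = 0              # vendor-specific and local to the router
    local_pref: int = 100        # 100 is a common default
    locally_originated: bool = False
    as_path_len: int = 0
    origin: str = "igp"
    med: int = 0


def preference_key(route: Route):
    """Tuple ordered like the selection process; the smallest tuple wins."""
    return (
        -route.weight,                 # highest weight first
        -route.local_pref,             # then highest local preference
        not route.locally_originated,  # then locally originated routes
        route.as_path_len,             # then shortest AS path
        ORIGIN_RANK[route.origin],     # then lowest origin type
        route.med,                     # then lowest MED
    )


def best_path(routes):
    return min(routes, key=preference_key)


via_fw1 = Route("fw1", local_pref=200, as_path_len=5)
via_fw2 = Route("fw2", local_pref=100, as_path_len=1)

# LP is evaluated before AS path length, so the longer path wins here.
print(best_path([via_fw1, via_fw2]).next_hop)

# A higher weight on the other route overrides the LP decision.
print(best_path([via_fw1, replace(via_fw2, weight=10)]).next_hop)
```

Because the tuple is compared left to right, a later criterion only matters when every earlier one is tied, which is the same reason LP overrides everything below it.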
Modifying behavior with communities and LP
There are two ways to extend a security zone through a firewall. You can either route and utilize a Virtual Routing and Forwarding instance (VRF), or switch and put the Switched Virtual Interface (SVI) on the firewall. Switching limits your options as you scale and is not recommended. For this reason we will focus on utilizing VRFs. In this case we have VRF Blue with community 65000:1 and VRF Orange with community 65000:2. These are segregated by firewalls and have different security policies.
With two firewalls it is easy to introduce a state problem and have packets rejected because of invalid state. The problem appears when a flow goes out through one firewall and its return traffic comes back through the other: firewall 2 does not have a session for the flow. “How come we don’t run an HA pair?” is a common question. The firewalls might already be an HA pair and you’re preparing for a migration, they might be from different vendors, or they might be in different data centers completely.
So now I can tie communities and local preference (LP) together to pin flows to firewalls and manage state. I can raise or lower the LP based on the community to ensure routing always returns to the firewall with the active session state. If the firewalls are in different data centers, you can even do this to move flows of the same security level below the firewalls and maintain state across the DCs. Doing so involves advanced network techniques and policy configurations, which IParchitechs has successfully implemented in dozens of DCs in spite of vendor assertions that this is not feasible.
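As a rough sketch of the idea, the policy can be modeled as a lookup from community to local preference. This Python model is an assumption-laden illustration, not the configuration used in production: the firewall names fw1 and fw2, the LP values 200 and 100, and the PINNING table are all hypothetical. Only the VRF names and the communities 65000:1 and 65000:2 come from the text.

```python
PREFERRED_LP = 200  # assumed value, anything above the backup value works
BACKUP_LP = 100     # assumed value, matches a common default

# Community carried by routes originated in each VRF (from the text).
VRF_COMMUNITY = {"blue": "65000:1", "orange": "65000:2"}

# Hypothetical pinning: (receiving VRF, community on the route) -> firewall.
PINNING = {
    ("blue", "65000:2"): "fw1",    # blue -> orange traffic
    ("orange", "65000:1"): "fw1",  # orange -> blue traffic (the return path)
}


def local_pref(receiving_vrf, communities, learned_via):
    """LP to set on a route, based on its community and the firewall it came through."""
    for community in communities:
        pinned = PINNING.get((receiving_vrf, community))
        if pinned is not None:
            return PREFERRED_LP if learned_via == pinned else BACKUP_LP
    return BACKUP_LP


def is_symmetric(pinning, vrf_community):
    """True if every zone pair uses the same firewall in both directions."""
    owner = {community: vrf for vrf, community in vrf_community.items()}
    for (vrf, community), firewall in pinning.items():
        reverse = (owner[community], vrf_community[vrf])
        if pinning.get(reverse) != firewall:
            return False
    return True


print(local_pref("blue", ["65000:2"], "fw1"))  # preferred path
print(local_pref("blue", ["65000:2"], "fw2"))  # backup path
print(is_symmetric(PINNING, VRF_COMMUNITY))
```

The symmetry check captures the point of the exercise: both directions of a flow must prefer the same firewall, otherwise the return traffic lands on a firewall with no session for it.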