Situational Awareness for Network Migrations

At IP Architechs we perform a lot of network migrations, and it is no secret that network migrations and maintenance windows can be among the most nerve-racking events for engineers, managers, and business leaders, for a variety of reasons.

For engineers, the uncertainty might stem from fear of failure, an inability to predict the outcome due to complexity, preparation rushed to meet a deadline, or a litany of other reasons.

For managers and business leaders, it might be more along the lines of: what happens if this goes wrong, how will this affect my bottom line, are there going to be thousands of trouble tickets come 8 or 9 a.m. when everyone hits the office, and so on.

The Preparation

We’re going to look at this from the perspective of the engineer throughout. The prep work is probably one of the most important contributors to success. This is where you do many things, including but not limited to:

  • building and testing the configuration to be implemented
  • making a rollback plan: this might be something as simple as moving a cable and shutting an interface, or it might be a multistep, multi-device plan
  • knowing the situation surrounding the window

Let’s explore understanding the situation surrounding the window some more. I’ll use a real example here to help.

We were getting ready to change the internet edge deployment at an enterprise. We did all the prep and rollback planning. However, we were given a few constraints on downtime by the business. Additionally, all of the product teams had to join the call for verification because of the business impact of the relatively small routing change. The next opportunity was going to be a few months out due to change freezes and the coordination of resources necessary.

So what did we learn by engaging outside of the technical realm?

  • We had tight timeframes which placed an increased emphasis on planning
  • We needed to have plans for things that could go wrong and resolution paths based on downtime constraints
  • Although it was a low-impact routing change, it was a high-impact business change
  • We needed to have clearly defined decision points on what would be cause for a rollback
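
One way to make those decision points concrete before the window opens is to write them down as a small function. The sketch below is only an illustration, not what we ran that night: the names `WindowState` and `next_action`, the four outcomes, and all of the minute values are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class WindowState:
    """Snapshot of a maintenance window (all fields are hypothetical)."""
    minutes_elapsed: int          # time since the change started
    window_minutes: int           # total downtime the business agreed to
    rollback_minutes: int         # rehearsed time needed to roll back
    service_healthy: bool         # are the verification checks passing now?
    remaining_work_minutes: int   # estimated time for the steps still to do


def next_action(state: WindowState) -> str:
    """Return 'continue', 'hold', 'troubleshoot', or 'rollback'."""
    # Slack is the time left after reserving enough to roll back.
    slack = (state.window_minutes
             - state.minutes_elapsed
             - state.rollback_minutes)
    if not state.service_healthy:
        # Troubleshoot only while the rollback reserve is untouched.
        return "troubleshoot" if slack > 0 else "rollback"
    if state.remaining_work_minutes <= slack:
        return "continue"
    # Healthy, but not enough slack to finish: stop at this stable state.
    return "hold"


if __name__ == "__main__":
    state = WindowState(minutes_elapsed=40, window_minutes=120,
                        rollback_minutes=20, service_healthy=True,
                        remaining_work_minutes=30)
    print(next_action(state))
```

The point of the design is that the rollback time is reserved up front, so troubleshooting can never eat into it.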

The Execution

All the prep is done and it’s time to execute the change. We put in the first couple of lines of the script and everything is going well. We get to the point where we need to clean up the old configuration. Then every engineer’s nightmare happens: everything starts to go down.

Okay, what do we do now? We know from the situation that we don’t have a lot of time to work through the problem. We need to stay calm and start working through the decision trees we made during the planning process.

Some quick troubleshooting revealed that when we removed the no-longer-used virtual routing and forwarding (VRF) instance, it shut down the ports that were now in the global table. We put the VRF back, still unused, and everything began to work as expected again.

Next the debate began: should we get TAC on the line to assist? There were still a few items to knock out in the change window to avoid a complete rollback. A majority of people wanted to “chase the rabbit” of why the VRF deletion brought down the interfaces. However, this would not have been a good use of our time. If we had gotten TAC on the line and gone down that rabbit hole, there is no telling where it would have led or how long it would have taken. The facts were that leaving the unused VRF in place, although the extra config was annoying, didn’t affect performance as far as we could tell, and we needed to get through the rest of the migration.

After a short debate we all agreed that, based on the circumstances of the migration, the coordination efforts, the business drivers, and the work still to be done, we would continue down the migration path. We also collected the necessary logs for an initial case with TAC and opened a ticket in the morning. Would we get the same level of information and troubleshooting on that problem? No, but we were able to complete the migration and follow up on the weird behavior at a safer time.

Conclusion

Sometimes, under different circumstances, the right decision would be to get TAC on the line and work through the issue. The owners might decide everything can stay down until it’s working as planned, or anything in between. Often, factors like physical access or travel will allow for longer downtime and troubleshooting.

It is important to know the situation around the migration, including why it’s happening and who’s involved, and to stay aware of those factors during the migration so you can make informed decisions with the owner and make everyone successful.

If you need help planning your migrations, reach out to us.


Netbox IPAM/DCIM – What all Network Engineers beg for!

We found it!!!

Have you ever sat at your desk, hoping for a miracle, that somebody somewhere will develop a fully comprehensive application for tracking network information? I know I have, along with millions of fellow network professionals, I have to assume. What exactly am I referring to? IP addresses, VLANs, VRFs, rack elevations, and on and on. We all have to keep up with this information; for most it lives in spreadsheets, for some in notepads, and others try to lock it all away in the vast empty space we call a brain.

So, the stage is set. Yes, there are claims of applications that can keep track of what your CORE router IP address is and what VLAN you assigned to one of your customers, or even where in the bloody rack it sits in relation to your other devices. Some can even keep track of which VRF routing table your management lies in, along with which physical port it connects to. Going a little further, maybe the application claims to give you a basic map layout you can refer to…

BUT, very few paid applications actually combine most of these functions into one, and very few, if any, open source projects do at all. I can think of maybe one or two programs, such as iTop or phpIPAM, that combine some useful features such as IPAM and documentation pools.

Which brings us to Netbox.

Netbox is a Swiss Army knife, a gem, a diamond in the rough. It combines all the features every person in the networking world needs, wants and should have. We found Netbox on packetlife.net, which is run by Jeremy Stretch, who also developed Netbox. If you want to read more about how it came to fruition, take a look here.

Basically, this is what Netbox does, and it does it extremely well. It’s also open source and completely FREE:

  • IPAM – IP Address Management
  • DCIM – Data Center Infrastructure Management
  • Single Converged Database
  • Circuit Provider Management
  • VLAN Management
  • VRF Management
  • Multi-Site (tenancy)
  • Rack Elevation
  • Connection Management – Interfaces/Console/Power
  • Customization Header for Logos, etc.
  • And More!

Here are a few screen shots to highlight some of the above features:

 

[Screenshots: Main, Devices, IP_Space, Circuits, Connections, vlans]

 

Hopefully, if you are as geeky as we are, you are champing at the bit to give this puppy a try. To that end, there are a couple of options for giving it a go.

  1. Follow the written documentation provided by Jeremy. I have to say, the instructions are pretty spot on. They are lengthy, though, given the components needed in Linux to allow Netbox to work. You can find the documentation here if you wish to try it yourself. I will not be going over the installation steps in this post because they are covered in the provided link; though have no fear, there is the second option…
  2. I took on all the brain hurt and built a virtual machine with Netbox installed and configured for you; just follow the steps below and voila. Currently I have it exported to an OVF, which you can use with VMware ESX or VMware Workstation.

Just follow these easy steps and you will have Netbox up and running in about 15 minutes (vs. ???; I can’t remember how much time I spent, but still!). These steps are for VMware ESXi using the vSphere client.

  1. Download the OVF from here
  2. Select ‘Deploy OVF Template’ from the file menu in vSphere
  3. Browse for the downloaded OVF then click next
  4. Click next again
  5. Give your netbox server a VMware name
  6. Choose which Datastore to install to
  7. Pick whether you would like Thin provisioning or Thick provisioning (if you don’t know what this means, you probably should not be using VMware)
  8. Click next
  9. Click Finish

This gets you a provisioned server with Netbox installed, but don’t power it up yet; there are still a few more steps to complete.

You will need to add an Ethernet Adapter.

  • Right click your server
  • Select ‘Edit Settings’
  • Click on ‘Add’
  • Select ‘Ethernet Adapter’
  • Follow the prompts and finish

Now you can start the server and open the console to watch it boot. Perform the final couple of steps and you will be up and running.

Once the server is at the login prompt, go ahead and log in using these credentials (all usernames and passwords for the site and database are the same):

Username: netadmin

Password: netadmin

At the #, type ‘ifconfig’ and find your current IP address (hopefully assigned by the DHCP server on your network, if you added the network adapter as above) and note it.

ifconfig

Again at the #, open the configuration file using nano (my personal preference); you could substitute your own editor, such as vi.

sudo nano /opt/netbox/netbox/netbox/configuration.py

The only parameter you need to change is ALLOWED_HOSTS, which needs to contain the IP address of the server and/or the DNS name you want to assign. This is a security precaution so that only web requests addressed to the IP or DNS name configured in this file are allowed. Once you have edited it, save and exit.
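
For reference, ALLOWED_HOSTS is a plain Python list of strings, so the edited line might look roughly like the sketch below. The address `192.0.2.10` and the name `netbox.example.com` are placeholders I made up for illustration; use the IP you noted from ifconfig and whatever DNS name you actually plan to assign. The small `classify_host` helper is not part of Netbox; it is just a hypothetical sanity check you could run on your entries.

```python
import ipaddress

# Placeholder values (assumptions for this example only): replace them with
# your server's real IP address and/or the DNS name you want to use.
ALLOWED_HOSTS = ['192.0.2.10', 'netbox.example.com']


def classify_host(entry):
    """Return 'ip' if the entry parses as an IP address, otherwise 'name'."""
    try:
        ipaddress.ip_address(entry)
        return 'ip'
    except ValueError:
        return 'name'


# Print each entry with its type so a typo in an address stands out.
for entry in ALLOWED_HOSTS:
    print(entry, '->', classify_host(entry))
```

An address with a typo (say, a missing octet) will show up as a 'name' rather than an 'ip', which is a quick hint to recheck it.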

Next, we need to restart a service called supervisor.

sudo service supervisor restart

That’s it, you’re done. Open your favorite web browser and go to your server IP/DNS name to log in. Creds are posted above.
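
If you would rather check from a script than a browser, a minimal sketch along these lines can tell you whether the web front end is answering. It assumes the site is served over plain HTTP at the root URL, which may not match your setup, and the function name `netbox_is_up` is my own invention. The `opener` parameter exists only so the function can be exercised without a live server.

```python
import urllib.error
import urllib.request


def netbox_is_up(base_url, opener=urllib.request.urlopen, timeout=5):
    """Return True if an HTTP GET to base_url answers with status 200."""
    try:
        with opener(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses
        # (urlopen raises HTTPError for responses such as 400 or 500).
        return False
```

You would call it as `netbox_is_up("http://192.0.2.10/")`, swapping in your own address. If the server is running but this returns False, a 400 Bad Request caused by a Host value that is not listed in ALLOWED_HOSTS is one thing worth ruling out.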

In summary, Netbox seems to be the solution many of us are looking for to keep us straight in the networking life. I for one will be glad to get away from spreadsheets, documents strewn about, and incoherent scribble by other people, and move to a centralized repository of cohesive information and network bliss!