NOC Best Practices: Taking Your Operations Team from Zero to Hero
Building a high performing Network Operations Center from the ground up
We now live in an era of business driven by applications and tools which help increase the productivity of the modern workforce. These applications and tools have also become mandatory for many businesses to operate.
In the past, business applications were a way to boost convenience and productivity, and losing access to them was a mere annoyance for end users. Today, if users lose access to their tools, many of which are cloud-based, they become unable to perform basic duties and productivity can come to a screeching halt.
The increasing importance of applications to businesses is forcing IT teams to get as close to 100% network uptime as possible to prevent profit loss. One way many businesses approach this is to leverage a 24/7/365 NOC (network operations center) to keep a vigilant eye on the health of their network and perform as much preventative maintenance as possible. A NOC also serves as tier 1-3 support for the organization and handles problem tickets as they relate to the network and applications.
In short, if the network is the heartbeat of the company, the NOC is the cardiologist.
With this in mind, it’s easy to see why the NOC is such an integral part of the success and profitability of a modern business — whether a company builds its own or outsources (a tough decision which we’ll explore soon in another article). Regardless of which route you choose, it’s essential that your NOC performs at a high level and provides end users with quick resolutions to any issues.
As with many elements of technology, NOC best practices are not a luxury — they’re a necessity. Keeping that in mind, we will explore some of the key elements that make a NOC successful.
Location, location, location! We have all heard this cliche before in business, but it couldn’t be more true than when discussing where to build a successful NOC. While many elements come into play, three top priorities should be considered when choosing a location for your network operations center.
Think data center! You need a facility that is either already incredibly secure, or can quickly be made that way. All exterior entrances should be secured with a key (at a bare minimum), and segregated interior security is preferred. NOC employees should have access only to the areas they need to enter to perform their duties.
In addition, there should be a log (preferably digital) of everybody who enters the NOC, including when they enter and leave, what their purpose is, and who they are affiliated with. Are they a vendor, a partner, an employee or a guest? Video surveillance is also essential in order to validate the log and monitor activities on the premises.
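As a minimal sketch of what such a digital log might record, assuming a simple in-memory structure (the class, field, and function names here are hypothetical and not tied to any particular access-control product):

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import List, Optional


class Affiliation(Enum):
    VENDOR = "vendor"
    PARTNER = "partner"
    EMPLOYEE = "employee"
    GUEST = "guest"


@dataclass
class AccessLogEntry:
    name: str
    affiliation: Affiliation
    purpose: str
    entered_at: datetime
    left_at: Optional[datetime] = None  # None while the person is still inside

    def sign_out(self, when: datetime) -> None:
        """Record the exit time, rejecting times earlier than the entry."""
        if when < self.entered_at:
            raise ValueError("sign-out time cannot precede sign-in time")
        self.left_at = when


def still_inside(log: List[AccessLogEntry]) -> List[str]:
    """Names of everyone who has entered but not yet signed out."""
    return [entry.name for entry in log if entry.left_at is None]
```

A list of these entries can then be checked against video footage, for example by comparing `still_inside(log)` with who is actually visible on camera.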
Your NOC is always monitoring your network and has full access to your company’s information in order to support the needs of your users. This fact makes it vitally important for the NOC to be as secure as you’ve made your network. No company wants to be the next one on the news because they allowed a major security breach in their network, so every effort should be made to secure your network as a whole as well as the central hub that’s monitoring it.
Many things are outside of our control, so identifying those elements we can’t control and taking the proper steps to avoid issues is essential when it comes to NOC management. For instance, you can’t control the weather, natural disasters, the power grid or Joe-Backhoe cutting fiber during road construction. This doesn’t mean you’re off the hook though. You can focus your efforts on locations that are less susceptible to these issues or better equipped to deal with them.
Again, think data center. Generators can be made available in the case of a power outage, diverse WAN connections can help prevent road construction from taking your NOC out of the game, and choosing a building that can structurally withstand severe weather (and isn’t prone to flooding) can give you better odds of weathering storms or other natural disasters.
3. Labor Source
One often overlooked element is proximity to a plentiful and qualified labor source. Unfortunately, it’s common for there to be a high level of churn when it comes to NOC employees (we propose ways to improve this later in this article), so it’s important that you are located near an abundance of potential replacements or someplace where qualified employees may be willing to move.
Cities and metro areas are an obvious good choice because of a high local population density, but it’s also important to consider the qualifications needed for the positions you will need to fill. Is there a college or technical school nearby that can provide a steady stream of entry-level employees? Is your location in a district full of other technology companies, data centers, or NOCs whose own churn could offer you qualified prospects?
Don’t overlook the importance of having a readily available labor source. Finding the right people could be a game-changer in your NOC (or be a pain in the posterior).
Depending on the scope of work for your NOC, it’s important to know exactly what skills your employees will need. Will your NOC be offering tier 1-3 support, or will a helpdesk or partner be taking some of the tickets off their hands? Understanding exactly what type of tickets your NOC will handle will drive the skill set for the technicians you will need.
Keep in mind that not all tickets are created equal. Many tickets can be handled by a lower wage employee with the help of an extensive knowledge base, while other tickets will need a network engineer to handle. If you have a historic breakdown of tickets that you can reference to discover what percentage of your tickets will need each knowledge-level, you can more easily develop a strategy around staffing your NOC to make sure you have the right people in place to make it a true asset.
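To illustrate the kind of historic breakdown described above, here is a small sketch that computes the share of tickets resolved at each support tier. It assumes each past ticket is tagged with the tier that ultimately resolved it. The function name and the sample history are invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def tier_breakdown(resolved_tiers: Iterable[int]) -> Dict[int, float]:
    """Return the percentage of tickets resolved at each support tier."""
    counts = Counter(resolved_tiers)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tier: 100.0 * n / total for tier, n in sorted(counts.items())}


# Hypothetical history: the tier that ultimately resolved each ticket.
history = [1] * 70 + [2] * 20 + [3] * 10
print(tier_breakdown(history))  # {1: 70.0, 2: 20.0, 3: 10.0}
```

With a breakdown like this hypothetical one, roughly seven in ten tickets could be handled by tier 1 staff working from the knowledge base, which gives a starting point for deciding how many technicians versus engineers each shift needs.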
It may seem out of place to discuss work environment for something like a network operations center, but with this being a high churn job, it’s important to keep good employees when you find them. Many companies invest in hiring a good ergonomics specialist to make sure the workspace is as comfortable as possible. This applies to everything from comfortable desks and chairs to user-friendly keyboards and monitors. Comfortable employees are happy employees!
Another important consideration is climate control. Most NOCs will have a strategy around humidity and climate control as these are important for equipment performance; however, it can lead to a very cold environment for employees, especially at night. Most complaints about NOC climate come from team members working the 2AM-4AM shift. Finding good people isn’t easy, so all efforts should be made to keep them comfortable and happy once you find them.
Also, consider things like a break room or on-site activities during breaks for employees’ mental health. Working in a network operations center can be a tedious job, which mostly consists of looking at a monitor all day long. Encouraging employees to get away and decompress during their shift can help them stick around and be more productive while on the job.
On top of having the technical skillset and environment to solve the problems that your NOC team will be bombarded with, it’s also important that potential employees have the personality to excel in what can be a highly stressful job.
Keep in mind that users don’t call the NOC when things are going great; they call when things are broken. This means that NOC technicians will often deal with annoyed or angry users. Not only do they need the patience to deal with these people, they also need the social skills to handle them in a professional manner and the ability to walk them through what needs to be done to solve the problem. NOC technicians and engineers will have to navigate a sea of short-tempered users who don’t understand the technical issues causing their problem and just want them fixed. Not everybody is equipped to handle these situations.
Another factor to consider when interviewing potential employees is their current life situation. Many NOCs operate 24/7/365, meaning there will always be night-shifts and holiday shifts. While most people may be willing to start working the night-shift as long as they move to day-shifts at some point, finding employees who embrace the night-shift will be important as those times will always need to be covered. Also, finding staff willing to alternate holidays and be away from their family will be important. Making your needs clear during the interview process will ensure that all shifts get covered and your retention rate doesn’t keep you running in circles looking for new employees.
There are essentially two major types of monitoring: infrastructure monitoring and user experience monitoring. Understanding both of these types and knowing how each affects your company’s productivity is essential to a successful long-term NOC strategy.
Infrastructure monitoring covers servers, network and data center equipment. This creates a snapshot of a network’s overall health that will allow the NOC to identify problems as they arise and remotely address them. It’s essential to have a full understanding of network architecture and which diagnostics most affect the experience of end users. This will allow the NOC to focus on the metrics most important to keeping the workflow moving and the users happy.
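One simple way to focus on the metrics that matter is a threshold check, sketched below. The metric names and limits are hypothetical placeholders. A real NOC would pull readings from its monitoring platform and tune the thresholds to its own environment.

```python
from typing import Dict, List

# Hypothetical alert thresholds (percent); tune these to your own environment.
THRESHOLDS: Dict[str, float] = {
    "cpu_pct": 90.0,
    "disk_pct": 85.0,
    "packet_loss_pct": 1.0,
}


def find_alerts(device: str, metrics: Dict[str, float]) -> List[str]:
    """Return one alert message per metric that exceeds its threshold."""
    alerts = []
    for name, limit in THRESHOLDS.items():
        value = metrics.get(name)  # metrics the device did not report are skipped
        if value is not None and value > limit:
            alerts.append(f"{device}: {name}={value} exceeds {limit}")
    return alerts
```

For example, `find_alerts("core-sw-1", {"cpu_pct": 95.0, "disk_pct": 40.0})` flags only the CPU reading, so operators see the one metric that needs attention rather than every value collected.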
User experience monitoring simulates user behavior and activities to replicate problems and find effective solutions. This step is essential for NOC productivity because it will allow technicians and engineers to experience the problems being described by users and find the most effective solutions. This process can also drive future knowledge base articles and help identify areas for improvement should issues become persistent.
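A rough sketch of this idea, assuming each simulated user action can be expressed as a plain Python callable (the function name and the time budget are illustrative, not taken from any specific monitoring tool):

```python
import time
from typing import Callable, List, Tuple


def run_synthetic_check(
    steps: List[Tuple[str, Callable[[], None]]],
    max_seconds: float,
) -> List[str]:
    """Run each simulated user step; report steps that fail or run too slowly."""
    problems = []
    for name, action in steps:
        start = time.monotonic()
        try:
            action()
        except Exception as exc:  # a failing step is exactly what we want to surface
            problems.append(f"{name}: failed ({exc})")
            continue
        elapsed = time.monotonic() - start
        if elapsed > max_seconds:
            problems.append(f"{name}: slow ({elapsed:.2f}s)")
    return problems
```

In practice each step would drive a real client, such as loading a login page or opening a shared file, so that the NOC sees the same failure or slowdown a user would report.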
Finding the right ticketing system is imperative to keeping a productive workflow for issues brought to the NOC by users. There are many solutions available nowadays and it’s impossible to recommend just one without a full understanding of the types of tickets most common to your network and the full scope of what your NOC will be monitoring.
Essentially, your ticketing system will keep track of all open issues the NOC is working, what has been done on each issue, what is left to be done, who is handling the ticket, and the urgency assigned to the issue. If used correctly, your ticketing system will allow technicians to work tickets together or hand tickets off with all of the relevant information being in one place to prevent any confusion or delays.
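The fields listed above map naturally onto a simple record. The following is only a sketch of that shape with hypothetical names. Commercial ticketing systems define their own schemas.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Ticket:
    ticket_id: int
    summary: str
    urgency: str   # e.g. "low", "medium", "high"
    assignee: str  # who is currently handling the ticket
    work_done: List[str] = field(default_factory=list)  # what has been done
    remaining: List[str] = field(default_factory=list)  # what is left to do

    def log_work(self, note: str) -> None:
        """Append a note describing work performed on the ticket."""
        self.work_done.append(note)

    def hand_off(self, new_assignee: str) -> None:
        """Reassign the ticket, recording the handoff in its history."""
        self.work_done.append(f"handed off from {self.assignee} to {new_assignee}")
        self.assignee = new_assignee
```

Because the handoff is written into the same history as the work notes, the next technician picks up the ticket with everything in one place.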
Choosing the right ticketing system is important, but it’s worth nothing if your employees don’t know how to use it correctly. For this reason, every step must be documented in the ticketing system and your team must be trained.
A centralized and extensive knowledge base will allow many tickets to be resolved by the first person on the phone. Developing an understanding of the most common problems faced by users and building out a knowledge base to address common issues is a time investment that will pay out enormous dividends in the long run.
It’s important to keep in mind that building out your knowledge base isn’t a one-time project. NOC employees should keep a record of tickets and recurring issues so that if a problem becomes persistent, the knowledge base can be updated to address it without the need for escalation. The knowledge base is a dynamic and ever-evolving tool that shouldn’t be ignored. For every article added to the knowledge base, you will avoid countless escalations and your users will experience a faster resolution to their problems.
Daily and monthly reports of all tickets are essential for NOC productivity. These reports will keep NOC managers and the entire IT team informed on what issues are being worked, where they are in the process, and who’s working them. These reports not only create a single pane of glass for the entire team to see what their users are experiencing, but can also help to identify trends. Spotting trends can reveal areas for improvement or development, as well as areas where users may need additional training.
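As a small illustration of trend spotting, the sketch below ranks ticket categories by how often they appear in a reporting period. The category labels and sample data are made up. A real report would read them from the ticketing system.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_issues(categories: Iterable[str], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n most frequent ticket categories with their counts."""
    return Counter(categories).most_common(n)


# Hypothetical categories from one day's tickets.
todays_tickets = ["password reset"] * 5 + ["vpn"] * 3 + ["printer"]
print(top_issues(todays_tickets, n=2))  # [('password reset', 5), ('vpn', 3)]
```

A category that stays at the top of this list from one report to the next is a natural candidate for a knowledge base article or user training.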
Automation is the friend of IT when it comes to tickets. Identifying tickets that can be handled by an automated system can lead to instant answers for users and keep NOC operators free to handle more complex issues. Many issues are so common and routine that it makes no sense for a live person to be handling them. These can range from password resets to disk space cleanup or even remote service restarts.
Each company is different and there are many other potential issues that can be automated, depending on the company’s network and applications. Automation lowers average ticket time, improves end-user experience, and leads to a faster MTTR (mean time to recovery).
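One way to picture this is a dispatcher that sends routine ticket categories to automated handlers and queues everything else for a person. This is a sketch under stated assumptions: the category names and handler functions are hypothetical stand-ins, and real handlers would call your directory service or remote-management tooling.

```python
from typing import Callable, Dict


# Hypothetical handlers; each returns a message describing what was done.
def reset_password(user: str) -> str:
    return f"password reset link sent to {user}"


def clean_disk(user: str) -> str:
    return f"disk cleanup scheduled for {user}'s workstation"


# Ticket categories considered routine enough to automate.
AUTOMATED: Dict[str, Callable[[str], str]] = {
    "password_reset": reset_password,
    "disk_cleanup": clean_disk,
}


def route_ticket(category: str, user: str) -> str:
    """Handle routine categories automatically; queue the rest for a person."""
    handler = AUTOMATED.get(category)
    if handler is None:
        return f"queued for a technician: {category}"
    return handler(user)
```

Here `route_ticket("password_reset", "jdoe")` is answered immediately, while an unrecognized category such as `"wan_outage"` falls through to a technician, which keeps operators free for the complex issues.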
Ready, Set, Implement!
A NOC is a complex environment and the best practices listed here are only a starting point of what it takes to build a successful network operations center. When deployed correctly, a NOC can keep employees productive and profitable by decreasing downtime created by technical failures. A company’s network and applications will only become more integral to driving profits in the future, so you’ll want to start finding ways to make sure employees have access to the tools they need to be productive now.
Stay ahead of the game by making sure your users have the support they need to stay productive and profitable. Our managed NOC monitoring services are well ahead of the curve.
About Enable IP
EnableIP is a telecom solutions provider founded by Wired Networks’ founder Jeremy Kerth and head engineer Steve Roos after they realized there was a deep market need for helping mid-size businesses establish better uptime rates for their Wide Area Networks (WANs). Armed with the best-in-class carriers and partners, Jeremy and Steve set out with a bold plan: Guarantee better uptime rates than the industry standard of only 99.5%.
Their bold plan became a reality. EnableIP’s solutions guarantee clients 99.99% (even 99.999%) network uptime. But we don’t stop there. Many telecom providers promise high availability network solutions but fail to deliver because they’re in the business of providing services, not solutions.
That’s the EnableIP difference: We deliver highly available networks by providing a complete system (called “Cloud Assurance”) that ensures 99.99% or above uptime.
We deliver this bold promise by:
- Owning the entire customer experience. From pricing, contracting, ordering and provisioning to installing, servicing and billing—we do it all! This means no stressful negotiations, confusing setups, or finger pointing if something goes wrong. We actually deliver on our promise.
- Managing the entire system, and monitoring and resolving issues as they occur, so you can focus on your business and not your network.
The Enable IP solution is like no other. Contact us to get started and experience the difference of a system that truly delivers on its 99.99% network uptime promise.