For an MSP, it is imperative to maximize profits while minimizing costs. However, cutting corners to lower capital and operating expenses can be disastrous: platforms can suffer from lower availability and stability, which leads to unsatisfied customers. The area where many MSPs remain least optimized is how they use their available resources.
For instance, in an on-premises, non-virtualized platform, each workload tends to run on a physical server and will have its own volatile and permanent memory. The server and available memory will have been calculated on a ‘best guess’ basis – what will be needed to support the workload when it hits an average peak rather than what it will need when it hits an abnormal peak.
Even an average peak usually means that the workload will be running below the CPU and volatile memory levels allocated to it. Research has shown that the overall CPU load for a physical server is rarely above 10 percent, and volatile memory usage can be even lower. For an organization, that’s a lot of capital, real estate, energy, and maintenance costs tied up in systems that do little to nothing. As such, sharing resources across workloads is needed to increase overall usage levels.
MSPs have an advantage
By managing resources effectively, an MSP massively undercuts what a prospect or customer can do on a dedicated platform.
The first example is where the MSP has many customers (preferably running different, counter-cyclical workloads). For the sake of argument, let’s assume that it has 1,000 customers. If each of these customers ran its workload on its own server, it would need to architect that server to meet its specific requirements. The MSP, by contrast, knows that it has 1,000 workloads to deal with. With a bit of planning and continuous monitoring and analysis, it can figure out the actual base load of those 1,000 workloads. Some will essentially be in sleep mode; some will be barely running; some will be peaking – but each workload will sleep, barely run, or peak at different times.
Therefore, a good starting point is to create a platform where the available overall resource is enough to meet the average needs of those 1,000 workloads, then slowly add more. Then, as one of the 1,000 workloads peaks beyond its normal average, some of those added resources can be allocated to meet its needs – and then taken back into the resource pool when the workload’s peak has finished.
This resource overage generally needs to be only around 20 percent – far less than would be required on a customer’s on-premises platform. And the larger the customer base, the more the peaks of its varied workloads cancel each other out, so the overage can be pulled back further. For example, a 10,000-customer MSP will find that a single workload peak barely impacts the overall platform, and a 5-10 percent overage could be enough.
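The saving from pooling can be sketched with a toy calculation. The numbers below (1,000 workloads, averages of 5-15 units, peaks at 3-5x average, a 20 percent overage) are illustrative assumptions, not measured figures:

```python
import random

def pooled_capacity(workloads, overage=0.20):
    """Pooled sizing: the sum of average loads plus a shared overage buffer."""
    avg_total = sum(w["avg"] for w in workloads)
    return avg_total * (1 + overage)

def dedicated_capacity(workloads):
    """Dedicated sizing: every customer buys a server sized for its own peak."""
    return sum(w["peak"] for w in workloads)

random.seed(1)
# Hypothetical fleet: each workload averages 5-15 units but peaks at 3-5x that.
fleet = [{"avg": random.uniform(5, 15)} for _ in range(1000)]
for w in fleet:
    w["peak"] = w["avg"] * random.uniform(3, 5)

pooled = pooled_capacity(fleet, overage=0.20)
dedicated = dedicated_capacity(fleet)
print(f"pooled:    {pooled:,.0f} units")
print(f"dedicated: {dedicated:,.0f} units")
print(f"saving:    {1 - pooled / dedicated:.0%}")
```

Because workload peaks rarely coincide, the pooled platform only ever has to cover the combined average plus a modest buffer, while dedicated servers must each cover their own worst case.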
Optimize resources through cloud platforms
The second example of effective resource management is where the MSP uses a public cloud platform, such as AWS or Azure. These platforms are already experts in resource planning, so an MSP using one of them should make the most of what is available to minimize resource overage. This can include writing clauses into customer contracts stating that abnormal workload peaks, while fully supported, will incur premium charges for obtaining extra resources from the cloud. This allows the MSP to pay minimum amounts to the cloud provider for the basic package but gain access to premium, lower-cost (lower-performance), or scavenged resources from the cloud’s resource pool if required.
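The sourcing logic can be sketched as a cheapest-first plan: use already-paid-for headroom, then cheap pre-emptible (scavenged) capacity, and only then premium on-demand resources. The pool names and prices here are assumptions for illustration, not any provider's real API or rates:

```python
def source_extra_capacity(units_needed, base_free, spot_available,
                          spot_price=0.3, premium_price=1.0):
    """Illustrative sketch: satisfy a peak from the cheapest sources first."""
    plan, cost = [], 0.0
    take = min(units_needed, base_free)       # headroom in the basic package
    if take:
        plan.append(("base", take))
        units_needed -= take
    take = min(units_needed, spot_available)  # cheap, pre-emptible capacity
    if take:
        plan.append(("spot", take))
        cost += take * spot_price
        units_needed -= take
    if units_needed:                          # last resort: premium on demand
        plan.append(("premium", units_needed))
        cost += units_needed * premium_price
    return plan, cost

# A 10-unit peak against 4 free base units and 4 spot units:
plan, cost = source_extra_capacity(10, base_free=4, spot_available=4)
print(plan, cost)  # → [('base', 4), ('spot', 4), ('premium', 2)] 3.2
```

The premium charge passed through to the customer covers exactly the capacity that could not be met from cheaper pools.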
However, such resource management is not something that can be handled by sticking a finger in the air or by manual means. Advanced workload monitoring and management via suitable systems management software with automated resource management capabilities will be required. The software must also be able to identify workloads that are misbehaving (through memory leaks or CPU runaway) and either throttle these workloads or shut them down completely.
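A minimal sketch of that detection logic might keep a short rolling window of samples per workload and flag two signatures: memory that only ever grows (a likely leak) and CPU pinned at its limit for the whole window (runaway). The class, thresholds, and verdicts are assumptions, not a real monitoring product's API:

```python
from collections import deque

class WorkloadWatchdog:
    """Illustrative sketch: flag leak-like memory growth or runaway CPU."""
    def __init__(self, window=5, cpu_limit=0.95):
        self.window = window
        self.cpu_limit = cpu_limit
        self.cpu_samples = deque(maxlen=window)
        self.mem_samples = deque(maxlen=window)

    def record(self, cpu, mem):
        self.cpu_samples.append(cpu)
        self.mem_samples.append(mem)

    def verdict(self):
        if len(self.mem_samples) < self.window:
            return "ok"          # not enough history to judge yet
        mem = list(self.mem_samples)
        if all(b > a for a, b in zip(mem, mem[1:])):
            return "throttle"    # memory rose every sample: likely leak
        if min(self.cpu_samples) >= self.cpu_limit:
            return "shutdown"    # CPU pinned for the whole window: runaway
        return "ok"

wd = WorkloadWatchdog()
for cpu, mem in [(0.99, 100), (0.99, 120), (0.99, 150), (0.99, 180), (0.99, 220)]:
    wd.record(cpu, mem)
print(wd.verdict())  # → "throttle" (the memory-leak check fires first)
```

In practice the verdicts would feed the platform's automation, which applies cgroup-style limits or kills the offending process.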
Another way to optimize resource management is to use workload replatforming. Workloads that do not require the best resources at any time can be moved to a part of the MSP’s platform that has slower CPUs and memory, freeing up better resources for more demanding workloads. If that original workload suddenly requires a boost, it can rapidly be placed back onto a more expensive part of the platform.
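A replatforming policy can be as simple as a threshold rule on recent demand. The tiers and thresholds below are illustrative assumptions about how such a placement decision might look:

```python
def choose_tier(recent_cpu_avg, promote_at=0.6, demote_at=0.2):
    """Illustrative placement rule: busy workloads get the fast hardware,
    quiet ones get parked on older, cheaper kit (thresholds are assumptions)."""
    if recent_cpu_avg >= promote_at:
        return "fast"      # demanding: newer CPUs and memory
    if recent_cpu_avg <= demote_at:
        return "economy"   # quiet: slower, cheaper part of the platform
    return "standard"

print(choose_tier(0.75))  # → "fast"
print(choose_tier(0.05))  # → "economy"
```

A real policy would add hysteresis so a workload hovering near a threshold does not bounce between tiers, and would account for the cost of the move itself.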
In the end, a low, relatively fixed platform cost allows the MSP to tailor solutions on top of it that match each customer’s budget while ensuring the customer’s needs are met on an ongoing basis.
Photo: jaboo2foto / Shutterstock