I have previously written about how important it is to be flexible with customers, allowing the occasional traffic burst to go unnoticed – well, at least as far as the invoice is concerned.
The problem is when that traffic burst should not be happening. Here, responsibility for what is going on should be shared between the customer and the MSP, but weighted heavily toward the MSP.
The problem with traffic bursts is that not all such activity is benevolent. Sure, it could be either the customer’s latest marketing play bringing in thousands of new prospects, or all those happy buyers leaving glowing feedback on the customer’s site.
But it could equally be the start of a distributed denial of service (DDoS) attack, or a brute force port sweep looking for a means of gaining access to the customer’s site in order to do, well, anything.
You could put in your Terms and Conditions that it is up to the customer to monitor for such problems and to deal with them. However, it is likely that they won’t do anything, and this then gives you a problem.
MSPs are targets
Being an MSP means that you have lots of customers. This makes you a bigger target for those with malicious intent. If they can find an underlying opening on your platform, they can gain access to all those customers. If they can find a specific issue for one customer, it may just be that others on your platform have the same issue.
It also means that one attack on one customer can cause traffic issues for everyone. Unless you have carved up your bandwidth using VPNs and traffic shaping, all that lovely bandwidth can get sucked up into one major DDoS or other attack.
So, there is a simple answer. As soon as you see such a problem, block the traffic.
But what happens if this is a false positive, and the supposed DDoS attack is in reality massive traffic from prospects wanting to spend money with your customer? I wouldn’t like to be the account manager in the next review meeting saying, “we did it for your own good, honestly.”
No – what you must do is assess each issue on the balance of probability. Yes, there will be activity which can be stated as being, without doubt, malicious. By all means, block it.
Best practices for suspicious activity
There will be other activity which may look suspect, but you are not sure. The first thing that may seem obvious is to contact the customer and ask if they know what is going on. This is probably wrong. The customer will have to take time themselves to try to figure out what is happening, and during that time, the activity continues. If it is malicious, every second brings something possibly calamitous closer.
What I recommend is that activity which looks as if it is probably malicious, but cannot be solidly proven to be so, is throttled. This can be done in a couple of ways.
For the standard customer, you ramp back the traffic that they are allowed per discrete amount of time. You can either queue the excess traffic (which has the unfortunate side effect of driving visitors away, as the site appears to be far too slow) or allow a certain amount of the traffic through while redirecting the rest to a holding page that simply states the site is having some problems that are being worked on and asks visitors to come back later. Both ways allow for some activity to continue.
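To make the "allow some through, send the rest to a holding page" option concrete, here is a minimal sketch using a token bucket, which is one common way of capping traffic per unit of time. The class name, the `handle_request` helper and the rate and capacity figures are all illustrative assumptions rather than anything a particular platform provides; in practice this logic would usually sit in a load balancer or edge proxy.

```python
class TokenBucket:
    """Allows roughly `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate            # tokens added per second (assumed figure)
        self.capacity = capacity    # maximum burst size (assumed figure)
        self.tokens = capacity      # start with a full bucket
        self.updated = now          # timestamp of the last refill

    def allow(self, now):
        # Refill in proportion to the time elapsed, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        # Spend one token per request if one is available.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def handle_request(bucket, now):
    """Serve the request if within the throttle, else send to the holding page."""
    return "serve" if bucket.allow(now) else "holding_page"


# Hypothetical throttle for one customer: 2 requests/second, bursts of 3.
bucket = TokenBucket(rate=2.0, capacity=3.0, now=0.0)
burst = [handle_request(bucket, now=0.0) for _ in range(5)]
later = [handle_request(bucket, now=1.0) for _ in range(3)]
print(burst)
print(later)
```

Timestamps are passed in explicitly so the behaviour is easy to reason about; a live system would more likely read a monotonic clock. Tightening `rate` is the "ramp back" dial: the lower it goes, the more visitors see the holding page instead of the site.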
However, another way for premium customers (and a way to upsell) is to spin up a new instance of the affected function on a different part of the overall platform and redirect the traffic to that area – an area that can be more effectively airlocked from the rest of the customer’s site, making it less of an issue should the activity prove to be malicious.
Indeed, through packet inspection, load balancing can be put in place, with ‘normal’ traffic still being directed to the original functional instance and only the possibly malicious traffic being offloaded to the secondary instance.
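The routing decision described above might look something like the sketch below. Everything in it is an assumption made for illustration: the `Request` fields, the scoring rules, the thresholds and the three destination names. Real packet inspection would draw on far richer signals, but the shape of the decision is the same: block what is malicious without doubt, offload what is merely suspect to the isolated instance, and send the rest to the original instance.

```python
from dataclasses import dataclass

PRIMARY = "primary"        # the original functional instance
QUARANTINE = "quarantine"  # the airlocked secondary instance
BLOCK = "block"            # dropped outright


@dataclass(frozen=True)
class Request:
    source_ip: str
    path: str
    requests_last_minute: int  # observed request rate from this source


def suspicion_score(req, known_bad_ips):
    """Toy scoring rules; the weights and thresholds are hypothetical."""
    if req.source_ip in known_bad_ips:
        return 1.0  # treated as malicious without doubt
    score = 0.0
    if req.requests_last_minute > 100:
        score += 0.5  # unusually high request rate
    if req.path in ("/admin", "/login"):
        score += 0.3  # probing sensitive endpoints
    return score


def route(req, known_bad_ips, suspect_at=0.5, block_at=1.0):
    """Pick a destination on the balance of probability."""
    score = suspicion_score(req, known_bad_ips)
    if score >= block_at:
        return BLOCK
    if score >= suspect_at:
        return QUARANTINE
    return PRIMARY


bad_ips = {"203.0.113.9"}  # address from a documentation-only range
print(route(Request("198.51.100.4", "/", 5), bad_ips))
print(route(Request("198.51.100.7", "/", 500), bad_ips))
print(route(Request("203.0.113.9", "/", 1), bad_ips))
```

Keeping the thresholds as parameters means they can be agreed with each customer, which matters given how costly a false positive can be.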
As always, the key is to offer the customer what they need – a capability to continue operations when there is a possibility of things not being as they seem.
Applying steps such as throttling and redirection buys time for both your MSP and the customer to delve deeper into what is causing the traffic. Together, you can decide whether to shut down the streams or allow them through. Then, if needed, you can charge an agreed amount extra for the added resources.
Photo: Ryoji Iwata / Unsplash