Alert fatigue is a real and growing issue for security operations teams. With more threats to deal with, more endpoints to consider, and more tools that beep (often with false alarms), teams are burning out, which can lead to missed intrusions and serious business consequences.
But threats aren’t going away anytime soon, and the number of endpoints to manage is only going to increase, so how can you mitigate alert fatigue in your security operations? Even more important, how can you streamline the alerting process to fully optimize your team’s resources?
In this post, we’ll explore six ways to help. Whether you adopt all of them or only a few, each additional layer can help ease the burden of alert fatigue.
1. Fine-Tune Your Alerts
The first step is ensuring your alerts are firing in a way that will provide you with useful and contextual data from which you can take action. Begin by taking inventory of the type of alerts you want to see. This might include:
- Phishing attempts
- Privilege escalations
- New vulnerabilities
- Compromised file scans
Then, develop a scale that will determine (based on priority levels) how your team will be notified and required to respond. Let’s say you’re operating on a three-tier scale and define alerts as such:
|Level|Severity|Notification|Required response|
|Level 1|Low|Email|Respond within 24 hours|
|Level 2|Medium|ChatOps (e.g. a ping on Slack)|Respond within 1 hour|
|Level 3|High|ChatOps + text message + phone call|Respond immediately, even if at 2 a.m.|
You can then map the alerts you want to see to these severity levels in your security tools. For example, let’s say a key file is compromised, which is a Level 3 alert for your organization. Your on-call team will receive a notification at whatever time of day the alert fires (yes, even 2 a.m.) and will know to respond immediately. On the other hand, let’s say too many failed login attempts is a Level 1 alert for your organization. Your team will get an email, but they aren’t required to respond until normal working hours.
Categorizing alerts this way may take a little work up front, but it will save your team from working on the wrong things at the wrong time, which will allow them to prioritize better and make sure the most dangerous threats are taken care of first.
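To make the mapping concrete, here is a minimal Python sketch of the three-tier scale. The alert type names, the `SeverityPolicy` class, and the choice to treat unmapped alerts as Level 2 are all assumptions for illustration; your own security tools will have their own way of expressing this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeverityPolicy:
    label: str
    channels: tuple              # how the team is notified
    respond_within_minutes: int  # 0 means "respond immediately"


# The three tiers from the scale above.
POLICIES = {
    1: SeverityPolicy("Low", ("email",), 24 * 60),
    2: SeverityPolicy("Medium", ("chatops",), 60),
    3: SeverityPolicy("High", ("chatops", "sms", "phone"), 0),
}

# Hypothetical alert types mapped to levels; use your own names here.
ALERT_LEVELS = {
    "key_file_compromised": 3,
    "failed_logins_exceeded": 1,
}

# Assumption: anything not yet mapped is treated as Medium until triaged.
DEFAULT_LEVEL = 2


def policy_for(alert_type: str) -> SeverityPolicy:
    """Look up how an alert type should be delivered and how fast to respond."""
    return POLICIES[ALERT_LEVELS.get(alert_type, DEFAULT_LEVEL)]
```

Calling `policy_for("key_file_compromised")` returns the Level 3 policy, so the on-call team is paged on every channel, while `policy_for("failed_logins_exceeded")` returns the email-only, 24-hour policy.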
2. Consolidate Jobs That Fit a Certain Parameter
Taking step one a bit further, it can also help to consolidate related jobs. This will help streamline alerting and investigations. Let’s say you run three different security scans: one on your network, one on your applications, and another on key files. Rather than running each of them separately, consolidate and run them simultaneously. This way, if the scans find issues, your team can see all of the issues across your entire environment at once and take action with the big picture in mind, rather than getting one-off alerts that may not be as contextual or meaningful on their own anyway.
Consolidating backup jobs is another good example. If you regularly run backup jobs on systems and data, align them so that they all run on the same day and time. This way, if there are any failures or issues, you will be able to see them all at once and take action with more information at hand.
This batched approach to running jobs, receiving alerts, and investigating can save a lot of time and provide your team with the information it needs to respond effectively.
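As a rough illustration of the batched approach, the sketch below merges findings from several scans into one digest grouped by asset, so a single notification carries the whole picture. The scan names, asset names, and the `build_digest` helper are hypothetical.

```python
from collections import defaultdict


def build_digest(scan_results):
    """Merge findings from several scans into one view per asset.

    scan_results maps a scan name to a list of (asset, finding) pairs.
    """
    by_asset = defaultdict(list)
    for scan_name, findings in scan_results.items():
        for asset, finding in findings:
            by_asset[asset].append(f"[{scan_name}] {finding}")
    return dict(by_asset)


# Made-up results from three scans that ran in the same window.
results = {
    "network": [("web-01", "open port 23")],
    "application": [("web-01", "outdated library")],
    "key_files": [("db-01", "checksum mismatch on config file")],
}

digest = build_digest(results)
for asset, findings in digest.items():
    print(asset, findings)
```

Here the two `web-01` findings arrive together instead of as separate one-off alerts, which makes it easier to judge whether they are related.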
3. Streamline Alerts with ChatOps
When alerts are firing from different tools (most often directly to your email inbox), it can be difficult to catch them all and connect the dots to uncover real security issues. Even worse, if the right person doesn’t see the alert at the right time, that could mean a missed intrusion or delayed response to a serious issue.
Instead, we recommend implementing a process called ChatOps, whereby you pipe all of your security alerts into a single, open communication channel, such as Slack. Creating a #security channel that all of your alerts flow into can give your team a single-pane-of-glass view into alert activity, rather than requiring them to hop between systems or dig through their inboxes. Within the chat service, they can also discuss alerts in real time, escalate them, and assign them to the right people.
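One common way to get alerts into a Slack channel is an incoming webhook, which accepts a JSON body with a `text` field. The sketch below only builds the request; the webhook URL is a placeholder, and the message format and `build_slack_request` helper are assumptions rather than a prescribed integration.

```python
import json
import urllib.request

# Placeholder: replace with the incoming-webhook URL for your own #security channel.
WEBHOOK_URL = "https://hooks.slack.com/services/REPLACE/WITH/YOURS"


def build_slack_request(source, summary, level, url=WEBHOOK_URL):
    """Build (but do not send) a POST request carrying one alert message."""
    payload = {"text": f"[Level {level}] {source}: {summary}"}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_slack_request("file-integrity-scan", "key file modified on db-01", 3)
# To actually post the alert, send the request:
# urllib.request.urlopen(req, timeout=10)
```

Putting the severity level at the front of the message lets the team scan the channel quickly; if every tool posts through the same helper, alerts from different sources look alike.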
4. Ensure the Correct Teams and Individuals are Notified
Much time and hassle can be saved when certain alerts are assigned to particular teams or individuals rather than all the alerts being sent to everyone. For example, rather than having your security scanning tool email your security and IT teams with every issue it finds, have it send alerts only to one or a few security team members, who can then escalate them to other team members if needed.
This can greatly reduce the alert load on each person and keep team members focused just on the alerts and issues they’re best equipped to handle. With this process in place, they’ll know that when an alert does come in, it’s a real issue that needs to be taken care of fast.
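A routing table is one simple way to express this. In the sketch below, the alert categories, the recipient group names, and the fallback owner are all hypothetical; the point is only that each alert has a small default audience and escalation adds people explicitly.

```python
# Hypothetical routing table: alert category -> the small group that sees it first.
ROUTES = {
    "vulnerability_scan": ["security-oncall"],
    "phishing": ["security-oncall"],
    "backup_failure": ["it-ops"],
}

# Assumption: unrouted categories go to the security on-call rather than everyone.
FALLBACK = ["security-oncall"]


def recipients_for(category):
    """Return the default recipients for an alert category."""
    return list(ROUTES.get(category, FALLBACK))


def escalate(recipients, extra):
    """Add more people to an alert without notifying anyone twice."""
    return recipients + [r for r in extra if r not in recipients]


first_line = recipients_for("vulnerability_scan")
after_escalation = escalate(first_line, ["it-ops", "security-oncall"])
```

In this example the scan alert starts with `security-oncall` only, and `it-ops` is added when someone decides the issue needs them.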
5. Orchestrate Tools and Processes, and Automate Alert Investigation and Escalation
Many alerts are fairly common and routine (e.g. phishing attempts, privilege escalation, malware detection). While you certainly need to know about these alerts, the steps required to investigate and respond to them can easily be automated to free up your team’s time to work on more detailed or serious issues.
Phishing is a common threat today, so let’s use that as an example. Let’s say you get a phishing alert every single day. The steps your team has to take to investigate, escalate, and respond to these are always the same.
So rather than requiring your busy team to tend to the same routine tasks day after day, you can configure your security tools to connect directly with your ticketing system or security inbox.
Then, the moment a phishing alert is triggered, your security orchestration tool can automate the investigation, escalation, and response processes for you. This is one of many use cases InsightConnect addresses, all of which have measurable time savings for security teams.
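To show what automating a routine investigation can look like, here is a small, tool-agnostic Python sketch of a phishing triage step. It is not the InsightConnect API. The triage steps (extract link domains, compare them to a blocklist, then escalate, open a ticket, or close), the blocklist entry, and the severity levels assigned are all assumptions for illustration.

```python
import re

# Hypothetical blocklist; in practice this would come from your threat intel source.
KNOWN_BAD_DOMAINS = {"login-example-bank.test"}

# Captures the host part of http(s) links in an email body.
URL_RE = re.compile(r"https?://([^/\s]+)")


def triage_phishing(email_body):
    """Decide what to do with a reported email based on the links it contains."""
    domains = {d.lower() for d in URL_RE.findall(email_body)}
    hits = sorted(domains & KNOWN_BAD_DOMAINS)
    if hits:
        # A known-bad link: escalate straight to the highest tier.
        return {"action": "escalate", "level": 3, "evidence": hits}
    if domains:
        # Unknown links: open a low-priority ticket for a human to review.
        return {"action": "open_ticket", "level": 1, "evidence": sorted(domains)}
    # No links at all: nothing to investigate in this sketch.
    return {"action": "close", "level": 1, "evidence": []}


result = triage_phishing("Click http://login-example-bank.test/reset now")
```

The returned dictionary could then be handed to your ticketing system or chat channel, so a person only steps in where the automated check cannot decide.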
6. Revisit and Adjust Alert Criteria Regularly
Threats are always evolving, and so should your security processes. We recommend revisiting and reworking your alert thresholds, definitions, and processes regularly to ensure the signal-to-noise ratio is appropriate and your team is seeing and taking action on threats in a timely manner.
Getting feedback from your security team about this is key, so we suggest this process be reviewed regularly during team meetings and tweaked as needed to optimize time and resources internally.
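A regular review is easier with numbers in hand. The sketch below flags rules whose alerts rarely led to action over a review period; the log format, the rule names, and both thresholds are assumptions you would tune to your own environment.

```python
from collections import Counter


def noisy_rules(alert_log, min_alerts=10, max_actionable_ratio=0.2):
    """Return rules that fired often but were rarely actionable.

    alert_log is an iterable of (rule_name, was_actionable) pairs.
    """
    total = Counter()
    actionable = Counter()
    for rule, was_actionable in alert_log:
        total[rule] += 1
        if was_actionable:
            actionable[rule] += 1
    return sorted(
        rule
        for rule, n in total.items()
        if n >= min_alerts and actionable[rule] / n <= max_actionable_ratio
    )


# Made-up review data: 20 failed-login alerts (2 actionable),
# 12 key-file alerts (all actionable), and 3 alerts from a rarely firing rule.
log = (
    [("failed_logins", False)] * 18
    + [("failed_logins", True)] * 2
    + [("key_file_changed", True)] * 12
    + [("rare_rule", False)] * 3
)
candidates = noisy_rules(log)
```

With this data only `failed_logins` is flagged, which makes it a candidate for a higher threshold or a lower severity level at the next team review.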
Don’t Let Alert Fatigue Become the “New Normal”
Alert fatigue is often a problem that flies under the radar until it’s too late, and it can be a huge contributor to talent attrition. Employees try their best to tolerate the load, and may neglect to admit when they’re falling behind or feeling overwhelmed. Alert fatigue is a serious issue with real consequences that must be fixed, not tolerated, to ensure no alert is left behind and your defenses stand strong.
Security orchestration and automation can go a long way in streamlining this entire process. To explore security automation in more depth, we put together a guide on the best practices for preparation, the best processes to automate, implementation, and more. You can download a copy here.