When it comes to securing your cloud assets at runtime, the first step is deciding how. There are enough possible solutions that you're likely to find yourself at a crossroads trying to choose between them. The factors that may affect your choice include:
- Friction level — How time-consuming or disruptive is it to instrument the solution within the existing environment? What happens to normal operations when the solution malfunctions or becomes misconfigured?
- Costs — How much am I going to pay for an effective solution?
- Scalability — Will I have to instrument this solution over and over again as more assets are added to my environment (which might happen automatically, without any intervention on my part)?
- Blind spots — What coverage does a single instrumentation provide? Does it cover all of an asset's activities and communications? How is overall visibility impaired when it stops working?
- Depth of view — How deep does the inspection go per asset? Is it capable of retrieving all viable information required for detection of vulnerabilities and ongoing malicious activities? Is it sufficient for a reliable detection of behavioral anomalies?
- Breadth of view — Do I get the big picture of what is going on? Can suspicious activity be linked to all assets involved? And again, can the solution reliably detect behavioral anomalies?
- Forensics — Can I keep the data for post-mortem analysis after an incident? Does the solution let me draw sound conclusions about the next steps for mitigation?
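Several of these factors hinge on detecting behavioral anomalies. As a rough illustration of what that means in practice, here is a minimal sketch (not any particular vendor's method) of a z-score baseline check over a per-asset metric; the metric name and numbers are made up for the example:

```python
from statistics import mean, stdev

def is_anomalous(history, value, threshold=3.0):
    """Flag a reading that deviates from the historical baseline by more
    than `threshold` standard deviations (a simple z-score check)."""
    if len(history) < 2:
        return False  # not enough data to form a baseline yet
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical metric: outbound connections per minute for one asset.
baseline = [12, 15, 11, 14, 13, 12, 16, 14]
print(is_anomalous(baseline, 14))   # → False (normal traffic)
print(is_anomalous(baseline, 140))  # → True (sudden spike)
```

Real solutions use far richer models, but the core question is the same: does the solution see enough per-asset data, over a long enough window, to build a meaningful baseline?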
In addition to such questions, there are also practical aspects of your existing cloud platform settings that may affect your selection. For example, working on a serverless setup, in which the hosting instances are completely segregated from your reach, will rule out solutions involving security agents designed to run at underlying host-level scopes.
An agent-based solution involves a process or module that resides within the asset's scope. These kinds of solutions are aware of the hosting platform and are tailored to probe its applications, as well as to extract valuable runtime information: process activities, incoming and outgoing network traffic of the monitored asset, and possibly other local resources that may play a part in denial-of-service attacks, such as CPU and memory consumption. Along with information access, an agent and its host also share resources. The agent is usually deployed with privileged access rights (a "let's run as root" kind of routine) so it can operate at the operating-system level and even in the kernel space of its host.
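To make the "inside the asset's scope" idea concrete, here is a toy sketch of the kind of host-local telemetry only an in-asset agent can see. It uses only the Python standard library (`os`, `resource`, Unix-only) as a stand-in; production agents typically rely on kernel instrumentation such as eBPF rather than this:

```python
import os
import resource

def collect_local_telemetry():
    """Sample a few host-level signals visible from inside the asset:
    its own resource usage and the host's load averages."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    load1, load5, load15 = os.getloadavg()  # 1/5/15-minute load averages
    return {
        "pid": os.getpid(),
        "uid": os.getuid(),            # privileged agents often run as uid 0
        "cpu_user_s": usage.ru_utime,  # CPU time spent in user mode
        "max_rss_kb": usage.ru_maxrss, # peak resident memory
        "load_1m": load1,
    }

sample = collect_local_telemetry()
print(sample)
```

An external, provider-side service has no equivalent view of per-process CPU, memory, or in-host activity; that asymmetry is the whole trade.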
So how does an agent-based solution stack up against the deciding factors we listed earlier?
- Friction — Due to the nature of their deployment, agent-based solutions can cause considerable friction. In the absence of inherent central management, these solutions must expose APIs for controlling and configuring their activation, and those APIs may be non-trivial to operate.
- Cost — Agent-based deployments are not part of the provider's service stack and, as such, are mostly subject to vendor licensing fees. Because these fees are typically unrelated to the volume of data processed, they may save on costs in practice.
- Scalability — Agent-based solutions are usually bound to a specific asset, so they're required to scale up and down as assets are added or removed. This may consume more of the resource pool shared by the assets and their agents.
- Blind spots — Agents usually operate as a mesh of probes, providing the big picture by correlating their spot coverages, typically within a database and a management application. When a probe goes down, its spot coverage vanishes as well, and the runtime information will be missing for the assets it covers.
- Depth of view — Agent-based solutions are tailored tightly to the asset's platform, which they monitor and secure. As such, they can fish out information an external service would never be able to find from the depths of the operating system, kernel, and other local devices.
- Breadth of view — As mentioned, the benefit of providing the overall picture comes from correlating a mesh of agent information within a single point. Agents can't do this by themselves. It requires the help of external applications and possibly a central database to maintain the findings and make useful links between them.
- Forensics — A realistic retention policy for findings determines how far back you can investigate an incident and what conclusions can be drawn from it.
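The "mesh of probes" point above can be sketched in a few lines. Assume (hypothetically) that each agent reports simple event tuples to a central point; the central application then links findings that no single agent could connect on its own:

```python
from collections import defaultdict

# Hypothetical event shape: (asset_id, remote_endpoint, action),
# as reported by each per-asset agent to a central collector.
agent_events = [
    ("web-1", "203.0.113.9", "outbound_conn"),
    ("web-2", "203.0.113.9", "outbound_conn"),
    ("db-1",  "10.0.0.5",    "outbound_conn"),
    ("web-3", "203.0.113.9", "outbound_conn"),
]

def correlate_by_remote(events, min_assets=2):
    """Group per-agent findings by remote endpoint; one endpoint touching
    many assets is a candidate for lateral-movement or C2 review."""
    by_remote = defaultdict(set)
    for asset, remote, _action in events:
        by_remote[remote].add(asset)
    return {r: sorted(a) for r, a in by_remote.items() if len(a) >= min_assets}

print(correlate_by_remote(agent_events))
# → {'203.0.113.9': ['web-1', 'web-2', 'web-3']}
```

Note that if the agent on `web-2` goes down, its events simply never arrive, and the correlation silently shrinks, which is exactly the blind-spot risk described above.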
Instead of deploying an agent per asset or per subset of assets, agentless solutions are deployed at the cluster or cloud account level. As they live outside of the assets themselves, these kinds of solutions are based on the cloud provider's native APIs and services. They're also affected by and confined to the provider's specified functionality. In this case, the protected assets and the agentless service share no resources, and there's a strong reliance on the cloud provider role schemes and APIs for accessing valuable information.
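As a sketch of this model, the loop below polls a provider-side audit-log API for the whole account and flags sensitive control-plane actions. The `fetch_audit_events` function is a hypothetical stub standing in for a real provider call (such as AWS CloudTrail's LookupEvents), and the event names are illustrative:

```python
from datetime import datetime, timedelta, timezone

def fetch_audit_events(since):
    """Hypothetical stub for a provider audit-log API; a real agentless
    solution would call the cloud provider's SDK here."""
    return [
        {"time": since + timedelta(minutes=1), "source": "s3.amazonaws.com",
         "name": "PutObject", "principal": "svc-backup"},
        {"time": since + timedelta(minutes=5), "source": "iam.amazonaws.com",
         "name": "CreateAccessKey", "principal": "unknown-user"},
    ]

# Illustrative set of control-plane actions worth alerting on.
SENSITIVE = {"CreateAccessKey", "PutUserPolicy", "DeleteTrail"}

def scan_account(window_minutes=15):
    """Agentless pass: pull account-wide audit events from the provider
    and flag sensitive actions, with no footprint inside any asset."""
    since = datetime.now(timezone.utc) - timedelta(minutes=window_minutes)
    return [e for e in fetch_audit_events(since) if e["name"] in SENSITIVE]

alerts = scan_account()
print([e["name"] for e in alerts])  # → ['CreateAccessKey']
```

Everything here runs outside the assets, which is the appeal, but notice that nothing in these events says what process inside a container actually made the call.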
- Friction — Agentless solutions are provided as part of the official provider services stack. The only effort on the operator's end is to enable and customize them, and customization can usually be left at the default settings by less experienced users. Updates and fixes should be seamless, as part of the provider's overall user experience.
- Costs — The provider's pricing model may vary, but there's a trend of billing per data capacity used for tracking, so agentless solutions can become quite an expense for a full-blown active network of assets.
- Scalability — Agentless solutions take advantage of centralized services within the provider's platform. As such, they can scale well, since they aren't relying on the same resource pool as the assets themselves.
- Blind spots — Provider-based solutions aim to deliver the big picture of cloud activity as a whole. In this context, a blind spot would be related to poor depth of view per asset internals, so you might consider blind spots to actually be "blind layers."
- Depth of view — As we've discussed, agentless solutions rely mostly on platform data handled by the provider itself, such as the cluster networking and user accounts. This being the case, these solutions are less aware of the containerized internals, such as the processes being executed within the containers.
- Breadth of view — Agentless solutions have a cluster-level view of asset activities, usually from a single collection point. This makes it easy to correlate between different cluster assets' activities.
- Forensics — The forensic capabilities of an agentless solution are mostly related to how well the solution is integrated with the data retention facilities of the provider.
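The capacity-based billing trend mentioned under Costs is easy to reason about with back-of-the-envelope arithmetic. All figures below are illustrative assumptions, not real provider pricing:

```python
def monthly_monitoring_cost(assets, gb_per_asset_per_day, price_per_gb, days=30):
    """Estimate the monthly bill for capacity-billed agentless monitoring:
    total ingested gigabytes times the per-gigabyte price."""
    ingested_gb = assets * gb_per_asset_per_day * days
    return ingested_gb * price_per_gb

# Assumed: 200 assets, 0.5 GB of telemetry each per day, $0.50/GB ingested.
print(monthly_monitoring_cost(200, 0.5, 0.50))  # → 1500.0
```

The point is that the bill scales with activity, not with asset count alone, so a chatty, full-blown network of assets can make "free to deploy" surprisingly expensive to run.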
So if both approaches have their strengths and weaknesses, which do you choose? Stay tuned for part 2, where we'll discuss a third option and draw conclusions.