At Logentries we chat to new users every day who are looking for an improved solution for centralizing and analyzing their log data. They have often tried rolling their own solution, have previously gone the open source route, or are using an “old school” logging technology.
But what we find across new users, regardless of how they manage their log data, is a set of common challenges that have historically made log management and real-time analytics difficult.
We decided to take our data, along with some similar research from a recent SANS report, and show you some of these challenges and possible solutions!
#4 Data Normalization at Collection
Many of today’s logging solutions can handle formats like JSON nicely. Most, however, will not be able to do a whole lot with any custom logs you might have, or with flat log formats that do not have key-value pairs.
Let’s face it: it’s not always possible to structure or format your logs nicely. You may not have access to the app or service source code, so you cannot update the formats to suit your logging solution.
Tip: Look out for logging solutions with the ability to easily handle known formats (e.g. JSON, key-value pairs, syslog…) as well as the ability to work with any custom log format (e.g. using regex for field extraction).
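To make regex field extraction concrete, here is a minimal Python sketch of normalizing mixed log lines into fields. The custom line layout, the field names and the `normalize` helper are all assumptions made up for illustration; they are not taken from any particular logging product.

```python
import json
import re

# Hypothetical custom format: "<date> <time> <LEVEL> <service> <message>"
CUSTOM_LINE = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<service>\S+) "
    r"(?P<message>.*)"
)
# Generic key=value pairs, e.g. "status=500 duration=120"
KEY_VALUE = re.compile(r"(\w+)=(\S+)")


def normalize(line):
    """Return a dict of fields for a JSON, custom, or key-value log line."""
    line = line.strip()
    if line.startswith("{"):
        try:
            return json.loads(line)
        except json.JSONDecodeError:
            pass  # not valid JSON, fall through to the other formats
    match = CUSTOM_LINE.fullmatch(line)
    if match:
        return match.groupdict()
    pairs = dict(KEY_VALUE.findall(line))
    # Keep unrecognized lines as a bare message instead of dropping them.
    return pairs if pairs else {"message": line}
```

Calling `normalize("2014-03-01 12:00:01 ERROR checkout payment failed")` would give you separate `timestamp`, `level`, `service` and `message` fields, while a line that matches nothing is kept whole under `message`.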
#3 Lack of Analysis Capabilities
Logs as Data is the concept of using your logs to extract key metrics or trends about your system’s behavior. Logs can be a rich data source, provided you can work with the log format AND can perform analytical functions on the key metrics extracted from your log events. Many traditional logging solutions have focused on simply being able to index and search your logs. While being able to search your logs effectively and efficiently is important, in particular for troubleshooting and forensics, being able to apply analytical functions to key metrics in your logs (e.g. Average, Max, Min, GroupBys…) opens up your log data to a much richer set of use cases.
Tip: Look for logging solutions that can handle any log format and that can perform analytical functions on your key metrics so that you can use your logs as data and can easily use them to investigate trends in system behavior or resource usage.
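As a rough illustration of treating logs as data, the Python sketch below groups already-parsed events by one field and computes count, average, min and max for a metric. The events, field names and the `summarize` helper are hypothetical, chosen only to show the idea.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical, already-parsed log events.
events = [
    {"path": "/login", "response_ms": 120},
    {"path": "/login", "response_ms": 80},
    {"path": "/search", "response_ms": 300},
]


def summarize(events, group_key, metric):
    """Group events by group_key and compute basic statistics for metric."""
    groups = defaultdict(list)
    for event in events:
        groups[event[group_key]].append(event[metric])
    return {
        key: {
            "count": len(values),
            "avg": mean(values),
            "min": min(values),
            "max": max(values),
        }
        for key, values in groups.items()
    }


summary = summarize(events, "path", "response_ms")
```

Here `summary["/login"]` holds a count of 2, an average of 100, a min of 80 and a max of 120, which is the kind of per-group trend that plain search alone will not give you.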
#2 Correlation of Information
One of my biggest pet peeves with log management solutions is how difficult they make it to correlate data. Most Splunk-like technologies send all of your log data into one big bucket and provide a complex search query language that you need to learn to do even basic correlation. It’s a little like using a sledgehammer to … deal with a nail. Finding those important events, or correlating only a small number of sources, can be extremely painful. Being able to access and correlate your logs in real time is also a key requirement.
Tip: Find a solution where you can dynamically group your logs into different containers or buckets. It is important to be able to look at different log sources in isolation (e.g. at a per-log level, such as a single web server log from a given instance) as well as in a combined view (e.g. all web server logs from my production environment). It’s also important to be able to do all of this in real time. Look for a solution that provides you with a live tail view and also the ability to combine live tail views from different sources into an aggregate view so that you can correlate in real time.
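The grouping idea can be sketched in a few lines of Python. This is only an illustration under some assumptions: the source names, the `groups` mapping and the `combined_view` helper are invented here, and each source is assumed to already be in time order. A real live tail would read from open streams rather than in-memory lists.

```python
import heapq

# Hypothetical sources: each is a list of (timestamp, message) in time order.
sources = {
    "web-1": [(1, "GET /"), (4, "GET /cart")],
    "web-2": [(2, "GET /login"), (3, "POST /login")],
    "db-1": [(2, "slow query")],
}

# Hypothetical named group that combines several sources.
groups = {"production-web": ["web-1", "web-2"]}


def tagged(name, events):
    """Yield (timestamp, source, message) so merged lines keep their origin."""
    for timestamp, message in events:
        yield (timestamp, name, message)


def combined_view(group):
    """Lazily merge the time-ordered sources of a group into one stream."""
    streams = [tagged(name, sources[name]) for name in groups[group]]
    return heapq.merge(*streams)


merged = list(combined_view("production-web"))
```

A single source can still be inspected on its own via `sources["web-1"]`, while `merged` interleaves both web servers by timestamp and leaves `db-1` out of the view.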
#1 Identification of Key Events
Knowing what to look for can be the hardest challenge of all. This is one of the biggest issues with technologies that focus on search and complex query languages. After all, it doesn’t matter how powerful your search language is if you don’t know what to look for.
Tip: Use a log management solution that goes beyond search. The ability to identify system anomalies is important, as is the ability to perform health checks using inactivity alerting. Ideally, you should be able to plug in out-of-the-box intelligence for known platforms and frameworks so that you can see important trends in key metrics without having to think about what you need to search for.
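Inactivity alerting is simple to picture in code: flag any source that has been silent for longer than you expect. The Python sketch below assumes you track the time each source last logged; the `inactive_sources` helper, the source names and the threshold are hypothetical.

```python
def inactive_sources(last_seen, now, max_silence_seconds):
    """Return the sources that have not logged within max_silence_seconds."""
    return sorted(
        source
        for source, seen_at in last_seen.items()
        if now - seen_at > max_silence_seconds
    )


# Hypothetical last-event times, in seconds since some common start point.
last_seen = {"web-1": 1000, "web-2": 400}
silent = inactive_sources(last_seen, now=1060, max_silence_seconds=300)
```

With these numbers, `web-1` logged 60 seconds ago and `web-2` 660 seconds ago, so only `web-2` ends up in `silent`. No search query is involved; the alert fires precisely because an expected event did not show up.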