When you are in the log management business, who manages your logs? Well, you do, of course. The proverbs of old, though, tell you this can sometimes be a very bad idea. In the Irish language the saying is:
“Ba mhinic droch-éadach ar tháilliúr ‘s droch-bhróg ar ghréasaidh.” Or, in English, “Often the tailor has bad clothes and the cobbler has bad shoes.” The Scots’ phrase is very similar: “The blacksmith’s mare and the shoemaker’s bairns are aye the worst shod.”
The insight that these proverbs provide is that sometimes when you are so busy working day and night to satisfy your customers, it can be difficult to put time aside to focus on your own needs.
So here at Logentries we’ve decided to learn the lessons provided by this ancient wisdom and to invest time using our own product for what it was designed for: to monitor our own production systems, to debug software issues, to troubleshoot operational issues and to gain insights that will help us improve the product.
Along the way, we are sure to find that some of these things are a little more cumbersome than they should be – and the engineering team will be on standby to make improvements. Jidoka is the Lean Manufacturing principle that informs us that strategic advantage comes from “the decision to stop problems as they occur rather than pushing them down the line to be resolved later.” In the SaaS world this aligns with Agile-based software methodologies and continuous delivery processes. As soon as we uncover new ways to improve the product, we move to make these a reality.
Wikipedia speculates that the phrase “eating your own dog food” may have come from the president of Kal Kan Pet Food, who was said to eat a can of the company’s dog food at shareholder meetings. Thankfully, we already have a well-designed, easy-to-use product – so dogfooding is a pleasure compared to the culinary delights of Kal Kan’s president. But there is always so much that can be improved, and what better way to discover new insights than by pushing your own product to its limits?
In the log management business dogfooding can get complicated. Logs are predominantly used for troubleshooting issues. So if you log to your own solution and an outage occurs, then how do you view your own logs? And, if you log to your own log management system, there is also a risk you can cause an endless loop where the act of logging causes the platform itself to generate a new log. With a little care to avoid loops and an acceptance that catastrophic outages are extremely rare, though, we’ve been able to get the benefit of using our own platform to troubleshoot the vast majority of software development and operational issues to date.
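To make the loop problem concrete, here is a minimal sketch using Python's standard `logging` module. It is an illustration only, not how the Logentries platform guards against loops: the logger name `pipeline`, the `ListHandler` stand-in for a shipping handler and the `demo` helper are all hypothetical. The idea is simply that the handler which ships logs refuses any record produced by the shipping code itself.

```python
import logging


class DropPipelineRecords(logging.Filter):
    """Reject records emitted by the log-shipping code itself (hypothetical name)."""

    def __init__(self, pipeline_logger_name):
        super().__init__()
        self.pipeline_logger_name = pipeline_logger_name

    def filter(self, record):
        prefix = self.pipeline_logger_name
        is_pipeline = record.name == prefix or record.name.startswith(prefix + ".")
        return not is_pipeline


class ListHandler(logging.Handler):
    """Stand-in for a handler that ships records to a log platform."""

    def __init__(self):
        super().__init__()
        self.shipped = []

    def emit(self, record):
        self.shipped.append(record.getMessage())
        # The act of shipping produces a log line of its own. Without the
        # filter, this record would reach this handler again and recurse.
        logging.getLogger("pipeline.shipper").info("shipped one record")


def demo():
    handler = ListHandler()
    handler.addFilter(DropPipelineRecords("pipeline"))
    root = logging.getLogger()
    previous_level = root.level
    root.setLevel(logging.INFO)
    root.addHandler(handler)
    try:
        logging.getLogger("app.web").info("request served")
    finally:
        root.removeHandler(handler)
        root.setLevel(previous_level)
    return handler.shipped
```

Calling `demo()` ships the application record once and drops the record generated by the shipper. In practice the pipeline's own records would still be written somewhere, just not back through the same path.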
Today, however, we are announcing a new initiative, which we’ll call “Project Insights” (rather than “DogFood”): deploying dedicated infrastructure outside the production system to retain our own logs. This new infrastructure will use all the same software components that make up the production system, but will sit in a separate EC2 Availability Zone. We are also adding more logging to the platform and improving the level of detail in the logs produced by the existing codebase.
The next step is to ensure we are using the Tagging features within the Logentries platform. A comprehensive review is underway to make sure we classify the key log entries generated by our platform, so that we can quickly find important logs when needed. It’s a lot easier to find the needles in the haystack if you color-code them beforehand! Alerts can then be set up to automatically notify us when anomalous behaviour is detected.
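The color-coding idea can be sketched in a few lines of Python. This is not the Logentries Tagging feature or its API; the tag names, the patterns and the `should_alert` threshold check are assumptions made up for illustration. Each rule pairs a tag with a pattern, and an alert condition is just a count of tagged entries.

```python
import re

# Hypothetical tag rules: tag name -> pattern that marks an entry with it.
TAG_RULES = {
    "error": re.compile(r"\b(ERROR|CRITICAL)\b"),
    "slow_request": re.compile(r"duration_ms=\d{4,}"),  # 1000 ms or more
    "login_failure": re.compile(r"login failed", re.IGNORECASE),
}


def tag_entry(line):
    """Return every tag whose pattern matches the log line."""
    return [tag for tag, pattern in TAG_RULES.items() if pattern.search(line)]


def should_alert(lines, tag, threshold):
    """True when at least `threshold` of the given lines carry `tag`."""
    count = sum(1 for line in lines if tag in tag_entry(line))
    return count >= threshold
```

For example, `tag_entry("ERROR db timeout duration_ms=1500")` yields both the `error` and `slow_request` tags, and `should_alert(recent_lines, "error", 5)` would fire once five tagged entries appear in whatever window `recent_lines` covers.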
It’s worth noting that the timing of this project is no accident. Being able to graph results has always been an important aspect of gaining insights, but the recent addition of Sum, Count and Average takes the possibilities here to a new level. By utilising these new features ourselves, we’ve already discovered that one of the most popular features in the product happens to be one that needs some improvement. Maybe we will tell you which one in a future article – but only after Jidoka has worked its magic. 😉
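For readers who have not used these functions, the arithmetic behind Sum, Count and Average can be sketched as follows. This is plain Python over hypothetical `key=value` log lines, not the product's query syntax or implementation; the function names and the log format are assumptions.

```python
import re


def extract_values(lines, key):
    """Collect the numeric value of `key=<number>` from each line that has one."""
    pattern = re.compile(r"\b" + re.escape(key) + r"=(-?\d+(?:\.\d+)?)")
    values = []
    for line in lines:
        match = pattern.search(line)
        if match:
            values.append(float(match.group(1)))
    return values


def summarize(lines, key):
    """Compute count, sum and average of `key` across the log lines."""
    values = extract_values(lines, key)
    count = len(values)
    total = sum(values)
    average = total / count if count else None  # no average without data
    return {"count": count, "sum": total, "average": average}
```

Given the lines `"GET /a bytes=100"`, `"GET /b bytes=300"` and one line with no `bytes` field, `summarize(lines, "bytes")` counts two values, sums them to 400 and averages them to 200. Lines without the key are skipped rather than treated as zero, which keeps the average honest.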
The insights don’t stop there. Yes, the new analytics features described above provide a quantum leap in terms of our understanding of how the product is used. But we can continue to gain insights when we take this same data and analyze it with third-party tools. Some of the unstructured log data lends itself well to crunching with Hadoop. Other metrics generated by the Logentries analytics functions can then be sliced and diced using business intelligence tools, Excel, etc. The application of quantitative methods and machine learning can provide further insights.
So it all starts with dogfooding and, from this new initiative, evolves in many different directions in parallel. But it must always come back to the product. Any insight we gain with a third-party tool that proves to be useful for us, and is a useful feature for our customers, goes back into the product. It is used and refined, iteration after iteration, until we get the perfect balance between usability and the benefit to the end-user.
This is just Part 1 of the story. Over future blog posts, we will detail the steps we have taken to make Project Insights a reality. As we proceed we’d like you to follow along to hear about the insights we have gained and to watch the product improve in front of your eyes. Of course, as the story evolves we never lose sight of the end goal: log insights that are Simply Accessible.