Log Looms and Microservices Logging
10.29.21
IT professionals love their metaphors. From “pets vs. cattle” to “post mortems” to “fog computing” and beyond, practitioners tend to use analogies to shape the way they think about complex technical topics.
Here’s another analogy: log looms. In the world of logging – especially microservices logging – the log loom underlines two key points that teams must master to manage logs effectively: First, logs need to be woven together to deliver value. Second, integrating logs is complex work that requires special skills and tools.
Here’s what a log loom means and how to use the concept to shape a microservices logging strategy.
Logs are Threads. Observability is a Fabric.
A loom is a machine that weaves individual threads together to create fabric. Looms have been around for thousands of years, helping humans create something of value (fabric) out of parts that, on their own, are not very useful (thread).
In the log loom analogy, logs are threads, and the fabric is the story that you can construct by weaving the logs together.
After all, individual log files are not very useful on their own. They might be helpful if you want to figure out what happened on one node or trace the performance of one pod. But in most cases, visibility into just one node or one pod doesn’t tell you much. A single node failure is not a big deal if you have hundreds of nodes in your cluster. A performance degradation in one pod may not matter much if it’s temporary and other pods continue to operate normally.
However, log files become much more meaningful when you can monitor the health and status of all of your nodes and pods at once. Collectively analyzing logs allows you to determine whether a node failure is a one-off event or part of a systemic issue that will bring down your entire cluster if left unaddressed. Likewise, analyzing logs from nodes and pods in tandem helps you understand how a problem on a node impacts the performance of a pod hosted on the node and vice versa.
I’m using just nodes and pods here to keep the discussion simple, but the lessons hold no matter which types of log files and resources you’re handling. Kubernetes logs, standard OS logs, database logs, logs generated by public cloud services, and beyond – all logs are much more valuable when woven into a fabric that tells the complete story of what is happening within your environment.
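To make the weaving concrete, here’s a minimal sketch (in Python) of the kind of analysis a merged timeline makes possible. Everything in it is a hypothetical placeholder – the log records, the “NodeNotReady” message, and the five-minute window – rather than output from any particular tool.

```python
from datetime import datetime, timedelta

# Hypothetical log records from different sources; in practice these would
# be parsed out of node, pod, and service log files.
node_logs = [
    {"source": "node-1", "ts": datetime(2021, 10, 29, 9, 0, 12), "msg": "NodeNotReady"},
    {"source": "node-7", "ts": datetime(2021, 10, 29, 9, 0, 45), "msg": "NodeNotReady"},
    {"source": "node-3", "ts": datetime(2021, 10, 29, 9, 1, 2), "msg": "NodeNotReady"},
]
pod_logs = [
    {"source": "pod-api-5x2", "ts": datetime(2021, 10, 29, 9, 0, 50), "msg": "connection refused"},
]

# Weave the individual threads into a single timeline.
timeline = sorted(node_logs + pod_logs, key=lambda rec: rec["ts"])

# One node failure is usually noise; several failures in a short window
# suggest a systemic problem. The window and threshold are arbitrary.
WINDOW = timedelta(minutes=5)
failures = [rec["ts"] for rec in timeline if rec["msg"] == "NodeNotReady"]
systemic = any(
    sum(1 for ts in failures if start <= ts < start + WINDOW) >= 3
    for start in failures
)
print("systemic issue suspected" if systemic else "isolated failure")
```

No single log file could support that judgment; it only falls out of the merged timeline.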
Log Loom Challenges
Weaving logs together is critical for achieving meaningful observability into complex systems. But determining which logs to correlate with which others, and how to restructure log data to enable analysis, is challenging work that usually requires specialized expertise.
That’s especially true in systems where conditions change continuously. If your pod gets rescheduled onto a different node (or your workload runs as replicas across several nodes at once), how do you determine which node logs correlate with the pod’s logs? Which of the metrics you pull from log files matter most for weaving the fabric that tells the story of overall performance? How do you define the periods over which you measure log data? Traditionally, teams have answered these questions only through trial and error.
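To illustrate just the first of those questions, here’s a minimal Python sketch that maps each pod log line to the node hosting the pod at the moment the line was written. The placement history is a hypothetical stand-in for what you might reconstruct from Kubernetes events or scheduler logs.

```python
from datetime import datetime

# Hypothetical placement history for one pod: (start, end, node).
pod_placements = [
    (datetime(2021, 10, 29, 8, 0), datetime(2021, 10, 29, 9, 30), "node-2"),
    (datetime(2021, 10, 29, 9, 30), datetime(2021, 10, 29, 11, 0), "node-5"),
]

def node_for(ts):
    """Return the node that hosted the pod at the given time, if known."""
    for start, end, node in pod_placements:
        if start <= ts < end:
            return node
    return None

# Enrich each pod log line with the node it actually ran on, so the
# matching node logs can be pulled in alongside it.
pod_logs = [
    {"ts": datetime(2021, 10, 29, 9, 15), "msg": "latency spike"},
    {"ts": datetime(2021, 10, 29, 10, 5), "msg": "latency recovered"},
]
for rec in pod_logs:
    rec["node"] = node_for(rec["ts"])
    print(rec["ts"], rec["node"], rec["msg"])
```

Multiply that by every pod, node, and metric in a busy cluster, and it’s clear why trial and error doesn’t scale.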
You could say that these challenges are akin to determining which threads to weave into which parts of your fabric as you use a loom. Even with a loom that can automatically weave threads together, you still need the expertise to determine how to combine threads to create complex patterns or textures.
Power Log Looms
Modern log analysis tools designed for the microservices age solve the problems described above by determining automatically how to correlate, integrate, and shape data from multiple logs. Instead of leaving users to compare log data by hand, they generate the broader fabric for you.
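As a rough illustration – not any particular product’s implementation – one common thread such tools pull on is a shared request ID. The Python sketch below groups entries from several hypothetical log streams into per-request stories:

```python
from collections import defaultdict

# Hypothetical entries already parsed from several log streams. A shared
# request ID is one thread a tool can use to stitch streams together;
# real tools also correlate on time windows, hosts, and labels.
entries = [
    {"stream": "ingress", "request_id": "req-42", "msg": "GET /checkout"},
    {"stream": "api-pod", "request_id": "req-42", "msg": "500 internal error"},
    {"stream": "db",      "request_id": "req-42", "msg": "query timeout"},
    {"stream": "ingress", "request_id": "req-43", "msg": "GET /health"},
]

# Group the threads into per-request fabric: the full story of each request.
stories = defaultdict(list)
for entry in entries:
    stories[entry["request_id"]].append(f'{entry["stream"]}: {entry["msg"]}')

for request_id, lines in stories.items():
    print(request_id)
    for line in lines:
        print("  " + line)
```

The payoff is that “req-42 failed” becomes “req-42 failed because a database query timed out” – a story, not a thread.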
In this sense, modern log looms are comparable to power looms, a key invention during the Industrial Revolution. Power looms eliminated most of the manual processes required to use a traditional loom, enabling fabric production on a massive scale.
Likewise, power log looms like LogDNA let you collect logs from across your environment, correlate them, and turn them into a story you can interpret – all in an automated, efficient way. Just as Kubernetes does for microservices orchestration, power log looms make it possible to manage complex, large-scale environments without compromising control or observability.
Conclusion: Power Log Looms in the Microservices Age
Just as it’s difficult to imagine modern life without the power looms that enable large-scale textile and clothing production, it’s equally hard to imagine an effective microservices logging strategy that doesn’t use power log looms to automate the complex process of turning individual logs into meaningful insights.
We’ll admit it: Perhaps the “log loom” analogy won’t catch on the way analogies like “the cloud” have. Still, we think it’s a useful way of understanding where tools like LogDNA fit into the modern tech landscape.