A Breakdown of MELT for Observability

Here’s a short breakdown of MELT, which stands for metrics, events, logs, and traces. These are the four most basic data types used for network and system telemetry. There are other data types that are widely used and very useful in a robust observability solution, though I would argue some are just another form of one of these general data types.

These data types are used in system monitoring in general, but they are especially valuable in observability. Metrics, events, logs, and traces have long been the cornerstone of application and system observability, and today they also form key components of the telemetry used for network observability.

Metrics

Metrics are data, usually some sort of measurement in numerical format, collected regularly from a device. They’re defined by a dimension, like memory usage, CPU utilization, or in networking, CRC errors, packet loss, or retransmissions. By and large, the colorful graphs and charts we see are made up of metrics.
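To make that concrete, here is a minimal sketch of how a metric might be represented in code. The `MetricSample` class, the `collect` helper, the device name, and the values are all made up for illustration; real collectors (SNMP pollers, streaming telemetry agents, and so on) each have their own formats.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class MetricSample:
    """One numeric measurement of one dimension on one device (hypothetical shape)."""
    name: str        # the dimension being measured, e.g. "cpu_utilization"
    device: str      # where the measurement came from
    timestamp: int   # Unix time in seconds
    value: float     # the measurement itself


def collect(name, device, start, interval, values):
    """Build regularly spaced samples, the way a poller on a fixed interval would."""
    return [
        MetricSample(name, device, start + i * interval, v)
        for i, v in enumerate(values)
    ]


# Four made-up CPU readings taken 60 seconds apart.
samples = collect("cpu_utilization", "router1", 1_700_000_000, 60,
                  [12.0, 15.5, 91.0, 14.5])

average = mean(s.value for s in samples)       # what a summary tile would show
peak = max(samples, key=lambda s: s.value)     # the spike you would see on a graph
```

Because every sample is a number with a timestamp, it is easy to average, graph, and alert on, which is why metrics end up driving most dashboards.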

Events

Events are specific things that happened at a single moment in time, and not necessarily regularly. That could be something like a change in the system state such as an interface going down, a system restarting, an access rule blocking traffic, and so on. Events will usually have an event type associated with them as an identifier.
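As a rough sketch, an event can be modeled as a typed record with a single timestamp. The `Event` class, the event type names, and the attribute values below are hypothetical and only meant to show the shape of the data.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Something that happened at one moment in time (hypothetical shape)."""
    event_type: str                 # identifier for the kind of event
    source: str                     # device or system that reported it
    timestamp: int                  # Unix time in seconds
    attributes: dict = field(default_factory=dict)  # extra context, if any


# Unlike metrics, these do not arrive on a regular interval.
events = [
    Event("interface_down", "router1", 1_700_000_125, {"interface": "ge-0/0/1"}),
    Event("system_restart", "server7", 1_700_003_410),
    Event("acl_deny", "firewall2", 1_700_003_999, {"rule": "example-rule"}),
]


def by_type(all_events, event_type):
    """Return only the events with the given event type."""
    return [e for e in all_events if e.event_type == event_type]


link_failures = by_type(events, "interface_down")
```

The event type is what makes events easy to filter and count, for example to ask how many interfaces went down in the last hour.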

Logs

Logs are simply timestamped data generated by a system when code is executed. This can be in almost any data structure and take almost any form depending on the system that generated the log. On the surface it may seem there’s overlap between events and logs, but just remember that a log is the actual text the system generates.
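One way to picture the difference is to start with the raw text and pull structured fields out of it. The log line below is invented for this example and only loosely resembles a syslog message; the pattern and the `parse_link_log` function are likewise assumptions, not any vendor's actual format.

```python
import re

# The log itself: just the text a system might emit (made-up example).
RAW_LOG = ("2024-05-01T12:00:05Z router1 LINK-3-UPDOWN: "
           "Interface ge-0/0/1, changed state to down")

# A pattern written for this one illustrative format.
PATTERN = re.compile(
    r"^(?P<timestamp>\S+) (?P<host>\S+) (?P<tag>[A-Z0-9-]+): "
    r"Interface (?P<interface>\S+), changed state to (?P<state>\w+)$"
)


def parse_link_log(line):
    """Turn one raw log line into a dict of fields, or None if it doesn't match."""
    match = PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Deriving an event type from the text is where logs and events meet.
    fields["event_type"] = f"interface_{fields['state']}"
    return fields


parsed = parse_link_log(RAW_LOG)
```

The raw string is the log. The structured record you get after parsing it looks a lot more like an event, which is where the apparent overlap comes from.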

Traces

Traces, which are a little harder to do but are super interesting, are a breakdown of a user request and all the services that are involved in fulfilling it. Think of a waterfall chart with the request at the top kicking off a chain of events including each service involved in fulfilling the request. This could include both frontend and backend activity. Traces, commonly referred to as distributed traces, are a useful tool to pinpoint exactly where application delivery is broken or slow, especially in today’s microservices architectures.
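Here is a small sketch of that waterfall idea. The `Span` class, the service names, and the timings are all hypothetical, and the self-time calculation assumes a span's children run one after another rather than in parallel.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """One service's share of the work for a single request (hypothetical shape)."""
    span_id: str
    parent_id: Optional[str]   # None for the span that started the request
    service: str
    start_ms: int              # offset from the start of the request
    duration_ms: int


# One made-up request passing through four services.
trace = [
    Span("a", None, "frontend", 0, 480),
    Span("b", "a", "auth-service", 10, 40),
    Span("c", "a", "cart-service", 60, 400),
    Span("d", "c", "inventory-db", 80, 350),
]


def depth(span, by_id):
    """Count how many parents sit above this span."""
    level = 0
    while span.parent_id is not None:
        span = by_id[span.parent_id]
        level += 1
    return level


def waterfall(spans):
    """Render the trace as indented lines, ordered by start time."""
    by_id = {s.span_id: s for s in spans}
    return [
        f"{'  ' * depth(s, by_id)}{s.service} start={s.start_ms}ms dur={s.duration_ms}ms"
        for s in sorted(spans, key=lambda s: s.start_ms)
    ]


def self_time(span, spans):
    """Time spent in the span itself, excluding its direct children."""
    children = sum(c.duration_ms for c in spans if c.parent_id == span.span_id)
    return span.duration_ms - children


slowest = max(trace, key=lambda s: self_time(s, trace))
```

In this invented trace the frontend span takes 480 ms overall, but most of that time is spent waiting on `inventory-db`, which is the kind of thing a trace makes obvious and a single latency metric hides.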

—

These are broad, high-level categories, and you may notice some overlap among them. And as you move from one system to another, an event, log, or metric could look very different. As a simple example, many of the metrics gleaned from a router would be different from the metrics gleaned from a CentOS server.

Thanks,

Phil
