Observe – How Is the Software System Monitored?


In classic architectures such as monolithic systems, a look at the system’s log file was often enough to tell whether the system was behaving as expected and, if not, why.

A tool that could interpret log files was therefore often all that was needed. This no longer holds for the cloud-native architectures in use today: modern systems, and with them their log files, are too distributed to be correlated meaningfully by traditional means.

The Observe stop addresses exactly this problem and turns classic monitoring into an observability-first strategy.

What Can I Expect from this Stop?

The following topics are included in the Observe stop:

  • Visualization & Alerting

  • Application Performance Monitoring

  • Infrastructure Monitoring

  • SIEM

  • Cost Monitoring

Observability is, of course, much more than looking at log files. Different signals such as traces, metrics, and logs must be linked to give a holistic picture of the situation. The monitoring backend can then detect exceptional situations automatically and react to them.
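To make the idea of linking signals concrete, here is a minimal sketch, assuming every span and every log record carries the ID of the trace it belongs to. The `Span` and `LogRecord` types, the field names, and the 500 ms threshold are illustrative assumptions, not part of any particular backend.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Span:
    trace_id: str       # ID shared by all signals of one request
    name: str
    duration_ms: float


@dataclass
class LogRecord:
    trace_id: str       # same ID as on the spans of that request
    level: str
    message: str


def correlate(spans, logs):
    """Group spans and log records that share a trace ID."""
    by_trace = defaultdict(lambda: {"spans": [], "logs": []})
    for span in spans:
        by_trace[span.trace_id]["spans"].append(span)
    for record in logs:
        by_trace[record.trace_id]["logs"].append(record)
    return dict(by_trace)


def traces_needing_attention(by_trace, slow_ms=500.0):
    """Return trace IDs that have a slow span and an ERROR log record."""
    flagged = []
    for trace_id, signals in by_trace.items():
        slow = any(s.duration_ms > slow_ms for s in signals["spans"])
        failed = any(r.level == "ERROR" for r in signals["logs"])
        if slow and failed:
            flagged.append(trace_id)
    return flagged
```

Neither signal alone tells the whole story here: the span shows that a request was slow, the log record shows what went wrong, and the shared trace ID is what connects the two.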

As a CNCF Silver Member, we naturally build our solution on standards from the CNCF landscape. In recent years, OpenTelemetry has established itself as the standard for observability and has now matured into one of the largest projects within the CNCF.

OpenTelemetry brings these signals together in one standard and allows them to be correlated with each other. To monitor distributed systems end-to-end, it relies on the W3C Trace Context standard rather than reinventing the wheel.
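W3C Trace Context carries the trace across service boundaries in a `traceparent` HTTP header. The following is a minimal sketch of parsing such a header and building the one for an outgoing call. It handles only version `00` of the format and ignores the companion `tracestate` header; the function and type names are illustrative, and in practice an OpenTelemetry SDK takes care of this.

```python
import re
import secrets
from typing import NamedTuple, Optional

# Version 00 layout: version "-" trace-id "-" parent-id "-" trace-flags
_TRACEPARENT = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class TraceContext(NamedTuple):
    trace_id: str    # 16 bytes, hex encoded, identifies the whole trace
    parent_id: str   # 8 bytes, hex encoded, identifies the calling span
    sampled: bool    # lowest bit of the trace-flags field


def parse_traceparent(header: str) -> Optional[TraceContext]:
    """Parse a version-00 traceparent header; return None if it is invalid."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero IDs are explicitly invalid in the specification.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return TraceContext(trace_id, parent_id, bool(int(flags, 16) & 0x01))


def child_traceparent(parent: TraceContext) -> str:
    """Header for an outgoing call: same trace ID, freshly generated span ID."""
    span_id = secrets.token_hex(8)  # 8 random bytes as 16 hex characters
    flags = "01" if parent.sampled else "00"
    return f"00-{parent.trace_id}-{span_id}-{flags}"
```

Because every service keeps the trace ID and only replaces the span ID, a backend can later reassemble all spans of one request into a single end-to-end trace.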

Additional technical context can also be propagated in a standardized way across multiple subsystems as key-value pairs, which makes it possible, for example, to build service level objectives on that information. For this, OpenTelemetry builds on the W3C Baggage standard.
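As a rough illustration of what W3C Baggage looks like on the wire, the sketch below encodes key-value pairs into a `baggage` header value and parses them back. It is simplified: keys are assumed to be valid header tokens already, optional per-entry properties (after `;`) are dropped on parsing, and the keys `customer.tier` and `checkout.region` are hypothetical examples.

```python
from urllib.parse import quote, unquote


def encode_baggage(entries: dict) -> str:
    """Serialize key-value pairs as a baggage header value.

    Values are percent-encoded; keys are assumed to be valid tokens.
    """
    return ",".join(
        f"{key}={quote(str(value), safe='')}" for key, value in entries.items()
    )


def parse_baggage(header: str) -> dict:
    """Parse a baggage header value into a dict, ignoring entry properties."""
    entries = {}
    for member in header.split(","):
        # Optional properties follow the value after ';' and are dropped here.
        pair = member.split(";", 1)[0].strip()
        if "=" not in pair:
            continue  # skip empty or malformed list members
        key, value = pair.split("=", 1)
        entries[key.strip()] = unquote(value.strip())
    return entries


# Hypothetical context that a downstream service could use to evaluate
# a service level objective per customer tier.
header = encode_baggage({"customer.tier": "gold", "checkout.region": "eu west"})
context = parse_baggage(header)
```

A downstream service that receives this header can attach the values to its own telemetry, so that latency or error rates can be evaluated separately for each tier or region.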

Thanks to a standardized API and a vendor-neutral SDK, application code can work with telemetry data directly. This adds a further level at which telemetry data can be aggregated and opens up new possibilities for evaluating it.

Why Should I Stop at this Stop?

This stop is essential for covering the following points sustainably and over the long term in a cloud-native system:

  • Introduce (near-)real-time alerting and AIOps

  • Understand the behavior of the overall system

  • Monitor security-relevant events

  • Get an overview of the operating costs

How Does this Stop Work?

We have designed a process model that makes the introduction of OpenTelemetry as frictionless as possible. Topics are worked through iteratively and adapted to customer needs, following our proven CRAWL-WALK-RUN method:

[Figure: fullstacks Splunk Observability Cloud]

Splunk is currently the main contributor to the Collector, the heart of an observability architecture based on OpenTelemetry. Many will know Splunk from logging or SIEM, but with the Splunk Observability Cloud it is now also very strong in Application Performance Monitoring and Infrastructure Monitoring.

In the next article, we will therefore take a closer look at this product from our partner. It offers native support for OpenTelemetry and sets new standards in usability that fit the DevOps philosophy well.
