Observability 2024

· 776 words · 4 minute read

One of the many corners of the (post-)modern data stack I have kept an eye on is observability. I recently revisited it, and while little has changed at the core, much has changed around it.

At its core, observability is about process monitoring: finding changes, because changes might be errors. Perhaps interestingly, the status quo is rarely suspected of being an error.

Mostly, observability is about finding changes in data. Changes in row counts. Changes in distinct values. Changes in means and standard deviations. Sometimes, it is about finding changes in the process itself. Usually that means load times spiking.

The players

The 2024 MAD data landscape lists 21 products in the observability space. There are plenty more too, but I’m going to focus on a few.

  • Monte Carlo hardly needs an introduction - it sprang onto the scene in early pandemic times and is often the reference point for data observability - having coined the term data reliability along the way.

  • Metaplane is similar to Monte Carlo, but with transparent pricing and less marketing.

  • Great Expectations started as a simple open-source Python library for testing pandas dataframes. Now it is quite a complex framework for designing, running and monitoring test suites, comparing new data to old (a sketch of the original API follows after this list). Importantly, it has also grown into a cloud offering, and most of the documentation tries to get you to sign up.

  • Soda is an open-core framework: the open-source core lets you define tests for data and evaluate them, generating machine-readable output. The Cloud offering is much more comprehensive and includes AI functions for automatic anomaly detection. There used to be a function for generating a test baseline, but it seems that has been removed from the open-source part of the offering.

  • Elementary Data is listed in the “data management” portion of the MAD data landscape, but it is definitely in the observability business too. It is open-core, with a SaaS component. But unlike many other tools, Elementary core is useful in its own right. Elementary is tightly coupled to dbt, and the core component is a dbt plugin. Let it live in its own schema, add custom dbt tests as you would any other tests, and Elementary will create its own tables for keeping a history of table statistics. The anomaly detection is based on Z-scores, a very simple statistical test that won’t be able to adapt to repeating patterns. The results can be compiled into an HTML page, to be displayed on any statically hosted site such as S3 or Netlify.

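Since Great Expectations comes up again below, here is roughly what the original pandas-flavoured API looked like. This is a minimal sketch, not taken from the article: the file name, columns and thresholds are invented, and recent GX releases have reorganised the API considerably, so treat it as illustrative only.

```python
# Sketch of the classic Great Expectations pandas API (pre-GX-1.0 style).
# The file name, columns and thresholds are illustrative only.
import great_expectations as ge

# Read a CSV into a pandas-like dataset with expectation methods attached.
df = ge.read_csv("orders.csv")

# Explicit expectations about the data.
df.expect_column_values_to_not_be_null("order_id")
df.expect_column_values_to_be_between("amount", min_value=0, max_value=10_000)
df.expect_table_row_count_to_be_between(min_value=1_000, max_value=50_000)

# Evaluate all expectations at once; the result is a JSON-like summary
# listing which expectations passed and which failed.
results = df.validate()
print(results)
```
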
An interesting baseline to compare these tools to is what you get out of the box from dbt core. dbt lets you specify explicit expectations of your data and checks your data against them. Depending on your configuration it can return either warnings or errors.
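
dbt’s expectations are declared in YAML next to the models (tests like not_null, unique and accepted_values, each of which can be configured with warn or error severity). As a minimal sketch, assuming dbt-core 1.5 or later (which introduced the programmatic dbtRunner entry point) and a project that already defines such tests, running them and inspecting the outcome looks roughly like this:

```python
# Minimal sketch: run an existing dbt project's tests programmatically.
# Assumes dbt-core >= 1.5 and that this is executed from the project
# directory of a project whose schema.yml declares tests such as:
#
#   models:
#     - name: orders
#       columns:
#         - name: order_id
#           tests:
#             - not_null
#             - unique
#
from dbt.cli.main import dbtRunner

dbt = dbtRunner()

# Equivalent to `dbt test --select orders` on the command line.
result = dbt.invoke(["test", "--select", "orders"])

# `success` is False if a test configured with severity `error` failed;
# failures at severity `warn` are reported but leave it True.
print("all expectations met:", result.success)
```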

The use cases

While there isn’t a standard categorization of observability tools, 3 or 4 features stand out:

  • Run tests against explicitly defined expectations of the data itself, such as freshness, row counts, etc.
  • Run tests against explicitly defined expectations of how the data changes from one load to the next, again in terms of freshness, row counts, etc.
  • Test for changes without defined expectations. This is basically outlier detection, and comes in two forms:
    • Simple Z-tests for changes from one scan to the next (a sketch follows below this list)
    • Advanced ML models that can learn weekly or monthly patterns

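To make the first of those buckets concrete, here is a tool-agnostic sketch of the kind of Z-test Elementary and similar tools run against their stored metric history. The row counts and the three-standard-deviation threshold are invented for illustration; real tools track many metrics per table.

```python
import statistics

def z_score_alert(history: list[float], latest: float, threshold: float = 3.0) -> bool:
    """Flag `latest` as anomalous if it lies more than `threshold`
    standard deviations away from the mean of the history."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # No variation in the history: anything different is an anomaly.
        return latest != mean
    return abs(latest - mean) / stdev > threshold

# Daily row counts for a table (made-up numbers).
row_counts = [10_120, 10_340, 9_980, 10_210, 10_450, 10_050, 10_300]

print(z_score_alert(row_counts, 4_200))    # True: probably a broken load
print(z_score_alert(row_counts, 10_180))   # False: within normal variation
```

The weakness mentioned above is visible here: a legitimate weekend dip in row counts would trip this check just as readily as a broken load, which is the gap the ML-based tools try to close.
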
The big question for a data team is whether or not they are able to define the expectations of each table a priori. If they are, and they are willing to do that, dbt core can do most of the lifting there. If they aren’t always using dbt, Soda core can probably do the same thing. And Great Expectations lets you be especially particular about testing new data against what has been loaded previously.
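
To give a flavour of the Soda core workflow, here is a rough sketch of a scan invoked from Python. The data source name, configuration file and checks are all invented for the example, and the exact method names may differ between soda-core versions, so treat it as an assumption-laden illustration rather than a recipe.

```python
# Rough sketch of a Soda Core scan; names and checks are illustrative,
# and method names may vary between soda-core versions.
from soda.scan import Scan

scan = Scan()
scan.set_data_source_name("warehouse")                 # assumed data source name
scan.add_configuration_yaml_file("configuration.yml")  # connection details live here

# SodaCL checks: explicit expectations about a hypothetical `orders` table.
scan.add_sodacl_yaml_str(
    """
checks for orders:
  - row_count > 0
  - missing_count(customer_id) = 0
  - freshness(updated_at) < 1d
"""
)

exit_code = scan.execute()

# The machine-readable output mentioned above, as a plain dict.
print(scan.get_scan_results())
```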

For the ones that can’t define their expectations up front, the alternatives are mostly in the paid SaaS space - with a partial exception for Elementary, which in its open-source version can keep a history and do Z-tests for fairly generic measures such as row counts, means, etc.

Feature/Tool       | Predefined expectations | Naive Outlier Detection | ML Outlier Detection
-------------------|-------------------------|-------------------------|---------------------
dbt core           | Yes                     | No                      | No
Soda core          | Yes                     | No                      | No
Great Expectations | Yes                     | Yes                     | No
Elementary core    | Yes                     | Yes                     | No
Paid SaaS Tools    | Yes                     | Yes                     | Yes

Conclusion

Not knowing your data is a problem you can throw a lot of money at. Call it an implicit tax on poor team composition.

But there is also something alluring in having an observability tool automatically keep metrics of your data. Tests you define yourself are usually based on failure scenarios you can predict. It is more difficult to write tests for types of errors you haven’t seen before.