The beginning and the end
The days of the modern data stack were waning. Interest rates were soaring. And the appetite for Yet Another SaaS was plummeting among both companies and investors. Meltano Cloud entered public Beta behind everyone else, and behind their own schedule. And it disappeared before anyone else. The Meltano team is now working on Arch, a new adventure for similar but different use cases.
Perhaps Meltano Cloud was too late to market. Or perhaps the unique brand of CLI-first, no bells and whistles service had no appeal outside the do-it-yourself crowd. After all, everyone who already ran Meltano orchestrated it themselves. And Meltano Cloud’s offering was just that: the orchestration bit. No UI to speak of, no guided onboarding, no yellow warning boxes, stepwise forms to fill out or green checkmarks. Nothing splashy for anyone who didn’t live their lives on the command line.
Fear of the command line
I have been thinking a lot about why people are scared of the command line. A few weeks ago I talked with some people who weren’t anywhere near IT, who expressed a feeling I once held too: The command line is scary. I had totally forgotten. I still think the ocean is scary. Staring down into the dark sea, not knowing what is down there. Fear of the unknown. There is a lot of unknown in the CLI.
CLIs have no instructions. None that pop out at you, anyways. No welcome screen saying “press any key, it will be fine”. And that is by design.
Developers are often the same
You might think this is different for developers, but by now I have had some experience onboarding developers who are used to GUIs (Informatica, ODI, Alteryx, etc.) and it isn’t all that different. I keep taking it for granted that developers know command line tools. I keep being reminded that that isn’t the case.
And so, even the journey to what I intuitively think of as the starting point is perilous. As a developer pivoting into working with Meltano, perhaps you start off by installing Python. Then, ensure Python is reachable from PowerShell. Endure the rolling of eyes when people see you use PowerShell. Create a venv. Activate the venv. Install git. Learn git. Install GitHub Desktop. Authenticate to GitHub. Learn about pip. And about requirements.txt files. And experience other people’s exasperation when you didn’t know that stuff you pip install in your venv isn’t accessible unless you have activated your venv. What really is a venv anyways? Nobody explained that.
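For what it is worth, the venv question has a short answer: a venv is just a folder with its own interpreter and its own place for pip to put packages. A minimal sketch of how to see which one you are in, using only the standard library (the function names are made up for illustration, and the check assumes an environment created with Python's built-in venv module):

```python
import sys
from pathlib import Path


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the venv folder, while
    # sys.base_prefix still points at the Python installation it was made from.
    return sys.prefix != sys.base_prefix


def describe_environment() -> str:
    """Summarise which interpreter is active and where packages get installed."""
    kind = "virtual environment" if in_virtualenv() else "system-wide Python"
    return (
        f"Running in a {kind}\n"
        f"  interpreter: {Path(sys.executable)}\n"
        f"  packages live under: {Path(sys.prefix)}"
    )


if __name__ == "__main__":
    print(describe_environment())
```

Run it once before and once after activating the venv, and the paths change. That is all activation really does: it puts a different python first on your PATH.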
OK, so with all of that done, let me introduce Meltano. The Meltano commands. Adding extractors and loaders. In five minutes, you will have experienced one of three things:
- Utter dread at configuring a tap or target without any colourful tooltips to help you along the way.
- Very strange error messages referring to pip install subprocesses and different versions of Python.
- Success, followed by a rinse and repeat with some new connectors, starting the lottery over again.
Note that the second outcome doesn’t have anything to do with it being a CLI tool. It has to do with dependencies and assumptions, which are hard to get around.
One of my big aha-moments when first getting started with Meltano was when I realized the problem was always the connector. Meltano was just the messenger. And since connectors are a vast ecosystem of widely varying quality, your results will vary wildly.
In general, Meltano and the Singer ecosystem have one big drawback and one big advantage: The drawback is that a lot of connectors don’t work the way you want them to. The advantage is that you can change them fairly easily. Provided you learn object oriented programming in Python. I know I said fairly easy, what I meant was “easier than most other options”.
The math changes when you enter production though.
Code is for production, UIs are for sales
Meltano’s code-first, git-native approach means changes are deliberate and reviewed. No schedules are suddenly switched off, no configuration changed by accident. The effects of a change might be an accident, but you know what change you made - and you can roll it back. Git is backup as well. No configuration is trapped in a UI.
Getting any of this through in a sales pitch is tough. A CLI isn’t a pretty picture for PowerPoint. If you look at almost any SaaS tool, they are hyper-optimised for getting started. Perhaps the most impressive such effort is Matillion, which offers to automatically set up a Snowflake account for you the first time you create a data pipeline just so that your data has somewhere to go.
Are we stuck?
Could a CLI tool be appealing? Remember that most tools that don’t have crucial help from designers (typically in-house applications and open-source GUIs) are close to revolting. Maybe CLIs just need better designers?
Meltano did take a lot of steps in that direction.
Configuration
Configuring taps and targets can be done interactively in the CLI, giving you a number of prompts for username, password, hostnames and whatever else the connector needs. But it lacks step-by-step validation. Once everything is configured, it either works or it doesn’t. But there are a few logical steps involved in a pipeline that could be validated independently:
- Is it able to reach whatever service it tries to connect to?
- Is it able to authenticate (if relevant) to that service?
- Is it able to get the schema for the data? What files (if relevant) did it find?
- Is it able to actually read and send data?
Each of these could be a little red or green checkmark as part of the configuration, and some very similar checks could be implemented for targets. There is only one issue: Taps and targets aren’t really designed to do this.
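To make the idea concrete, here is a rough sketch of what staged validation could look like. It is not how Meltano or any Singer tap works today; every name in it (StageResult, run_stages, render and the two example checks) is hypothetical. Each stage is a plain function that raises on failure, and the runner stops at the first stage that fails so that later stages show up as skipped rather than as a wall of errors.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StageResult:
    name: str
    ok: bool
    detail: str = ""


def run_stages(stages: List[Tuple[str, Callable[[], None]]]) -> List[StageResult]:
    """Run the checks in order and stop at the first one that fails."""
    results: List[StageResult] = []
    for name, check in stages:
        try:
            check()
        except Exception as exc:  # any failure ends the run at this stage
            results.append(StageResult(name, False, str(exc)))
            break
        results.append(StageResult(name, True))
    return results


def render(results: List[StageResult], stage_names: List[str]) -> str:
    """Turn results into one status line per stage, including skipped ones."""
    lines = []
    for result in results:
        mark = "[ok]  " if result.ok else "[FAIL]"
        suffix = f": {result.detail}" if result.detail else ""
        lines.append(f"{mark} {result.name}{suffix}")
    # Stages after the failing one never ran.
    for name in stage_names[len(results):]:
        lines.append(f"[skip] {name}")
    return "\n".join(lines)


# Hypothetical stand-ins for real connector checks.
def check_passes() -> None:
    return None


def check_bad_credentials() -> None:
    raise PermissionError("invalid credentials")


if __name__ == "__main__":
    stages = [
        ("reach service", check_passes),
        ("authenticate", check_bad_credentials),
        ("discover schema", check_passes),
        ("read sample data", check_passes),
    ]
    results = run_stages(stages)
    print(render(results, [name for name, _ in stages]))
```

The hard part is not this loop. It is getting each connector to expose its reachability, authentication, discovery and read steps as separate calls in the first place.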
Cloud login
The Meltano Cloud login experience was sublime. You could log in explicitly with the command meltano-cloud login, or implicitly by running any other command. The authentication was tied to GitHub, so all that happened was that an OAuth window popped up and told you everything was fine. I have never experienced anything as good.
Why can’t GUIs play nice?
There shouldn’t be any conflict between GUIs and code. Some earlier versions of Meltano showed as much: You could configure it interactively via the CLI, interactively via a GUI (local web server) or directly via YAML.
There are some tools that do this, at least to some extent. The Azure portal can at any time give you the ARM code (JSON) to create any of the things you have spun up using clickOps in the portal. Azure Data Factory (ADF) stores code in Git, and lets you version control it in normal places like GitHub. Oracle Data Integrator (ODI) has Git support, and can represent your entire project in XML files.
Although nominally a step in the right direction, these implementations are far from where we want to be. The Azure ARM example creates the ARM in isolation; it doesn’t update your existing IaC repo with the changes you made manually. And the changes you made are already applied; they didn’t wait for a PR approval. The ADF and ODI approaches leave code that is cryptic (ADF) and incomprehensible (ODI). I wouldn’t even know what to do with such a PR.
Does sales dictate software design?
I come back to the question of easy onboarding. Code, PRs and reproducibility do not contribute to easy onboarding. On the contrary, adding GitHub integration is an annoying first step when what you really want to integrate is the Oracle database. Meltano Cloud was the only SaaS I know of that was designed around the same principles that we use for normal software development.