We've been getting the same pushback from automation engineers, almost word for word: recorders are toys; you can't build a real test suite with one; the moment the flow gets complex you're stuck; no-code has been tried and it always flops. All of that is fair. Most of the no-code tools those engineers have seen were bad, and most of the pitches around them were worse. Treating a recorder as a serious tool in 2022 is still, at some level, an act of faith.
What I wanted to do with this post is write out what I actually think about the trade-offs, because I don't think either side of the argument is being honest about the shape of it. Most of the recorder-versus-code debate, I think, is a fake fight. The two tools are strongest at different things. Using the right one for the situation your team is actually in will mostly make the question go away, the way good tooling tends to fade into the background once you've committed to it.
There are parts of the job where code is straightforwardly better, and I have no interest in pretending otherwise. Anything that needs real programmatic composition (conditional branches, loops over a data fixture, a bit of computation between steps) is something a scripting language can do and a GUI can't; there's a sketch of what I mean just below. Review is the other clear one. A test in code is a diff in a pull request, and another engineer can read it and push back on a brittle locator before it ships. A recorder's output is, at best, a text representation of steps, and that doesn't map onto code-review habits. Copy-paste I'll concede without a fight too. When you're authoring the tenth variant of a similar flow, starting from an existing test and editing it is faster than re-recording, and a recorder can't really fix that.
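To make the composition point concrete, here's a minimal sketch of what I mean, assuming a Playwright-style TypeScript suite; the route, labels, and fixture data are all hypothetical. One loop over a fixture generates three tests, with a conditional assertion between steps:

```typescript
import { test, expect } from '@playwright/test';

// Hypothetical data fixture: one entry per coupon tier we want covered.
const tiers = [
  { code: 'SAVE10', expectedTotal: '$90.00' },
  { code: 'SAVE25', expectedTotal: '$75.00' },
  { code: 'EXPIRED', expectedTotal: '$100.00' }, // invalid code: total unchanged
];

for (const { code, expectedTotal } of tiers) {
  test(`checkout total with coupon ${code}`, async ({ page }) => {
    await page.goto('/checkout');
    await page.getByLabel('Coupon code').fill(code);
    await page.getByRole('button', { name: 'Apply' }).click();

    // A bit of branching between steps: invalid codes surface a warning
    // before the total assertion, valid ones don't.
    if (code === 'EXPIRED') {
      await expect(page.getByRole('alert')).toContainText('expired');
    }
    await expect(page.getByTestId('order-total')).toHaveText(expectedTotal);
  });
}
```

Adding a fourth tier is one line in the fixture; in a recorder it's a fourth recording session.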
A recorder genuinely wins in two places. The first is onboarding. A new engineer on a code-based suite has to learn the team's page-object conventions, plus whatever helpers the team has built up over the years. On a recorder there's a lot less to pick up before you're useful. The second is how tests hold up over time. I wrote about this in the previous post, but the short version is that if the tool doesn't export code, the tests don't rot when a browser API changes underneath them.
Then there's a larger middle ground where the honest answer is "it depends."
Whether the recorder or code is faster to author comes down to whether you're writing a novel flow for the first time, which favours the recorder, or your hundredth variant of something you already have, which favours code because you can copy and edit. Updating a test after a UI change is fast on the code side if you've built a page-object layer that isolates selectors; with one in place, a one-line commit usually beats clicking through the recorder again (there's a sketch of that layer after this paragraph). Flakiness shows up on both sides in different shapes. On the code side it's most often a bad wait or a locator that was always going to break; recorders have their own version, which is the platform's element-matcher picking something a human would not have picked. Neither is obviously harder to diagnose than the other; the instincts are just different. Team scaling comes back to who's on the team: engineers get more out of code, and teams with mixed-skill QA get more out of a recorder. Debuggability in CI is a wash at this point; both sides have invested roughly equally in it.
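For the page-object claim, here's the sketch I promised, again as hypothetical Playwright-style TypeScript. This class is the only place in the suite that knows how the login form is located, so a UI rename of the button is a one-line commit here and zero edits in the tests that call it:

```typescript
import { expect, type Page } from '@playwright/test';

// Hypothetical page object: the single owner of the login form's selectors.
export class LoginPage {
  constructor(private readonly page: Page) {}

  // When the button's markup or label changes, this is the one-line fix.
  private submitButton = () =>
    this.page.getByRole('button', { name: 'Sign in' });

  async login(email: string, password: string) {
    await this.page.goto('/login');
    await this.page.getByLabel('Email').fill(email);
    await this.page.getByLabel('Password').fill(password);
    await this.submitButton().click();
    await expect(this.page.getByTestId('account-menu')).toBeVisible();
  }
}
```

A test just calls `await new LoginPage(page).login(email, password)` and never mentions a selector.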
The big caveat cuts across the whole thing. A team that actually invests in its test code, with someone who owns the conventions and a culture that reviews tests the way it reviews product code, will beat our recorder on most of the axes I've listed, including the ones I put in the recorder-wins column. How tests hold up over time is the only axis where the structural argument against code still holds, and even there a serious team can keep a code suite healthy for a long time.
The catch is that such a team is rare. The common reality is the codebase that's twice the size of the app, a coverage number that only goes down, and a long tail of @skip-annotated tests nobody wants to own. That's the suite a recorder is actually being compared against.
If you're an automation engineer with a decade of experience and a codebase you've kept clean, Mockingjay is probably not for your team, and I'd cheer for what you've built. The tool is for teams that don't have what you have, don't have a realistic path to getting there, and are trying to decide whether to keep paying the tax on a bad codebase or do something structurally different.
What to pick is usually more boring than the argument makes it sound. Look at what your team can actually maintain, and pick the tool that fits.