AI is the best tool I’ve added to the shed in the past few years. But it’s not my only tool, and the last few months taught me why.
I went deep. Custom skills, hooks, feedback loops, multi-agent orchestration with Google’s ADK, LangChain, LangGraph, workflows wiring sequential and parallel steps into something that almost looked like a whole system. Every agentic buzzword there is. I wanted to see how far AI could carry me if I tuned it right. And it carried me far. But somewhere along the way the spell broke. The honest version of where I landed is this. It’s amazing, but not as revolutionary as I was led to believe.
AI still needs you
Humans didn’t go extinct. Discernment, critical thinking, gut feeling, the stuff you can’t put in a prompt because you can’t even put it into words. That’s still the compass of quality systems. The model is a workhorse. But you’re still the one who has to know where the field is and which way to plow.
Where this gets obvious is repetitive work that looks similar but isn’t. Think of an agent acting as an infrastructure self-healing system, watching alert events and deciding what to do. Feed it the exact same data twice and you can (and probably will) get two different answers. Now feed it a pattern it wasn’t specifically instructed on, the kind a knowledgeable human identifies in seconds, and it misses the point entirely. It doesn’t say “I don’t know what to do.” It improvises. That’s the opposite of what you want from something whose purpose is to be reliable, impartial and without human shortcomings.
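One way to take the improvisation away is to let the model propose and let plain code decide. Below is a minimal sketch of that idea, not a real self-healing system: the RUNBOOK contents, the alert names and the guard function are all made up for illustration. Anything the runbook doesn’t explicitly allow gets escalated to a human instead of executed.

```python
from dataclasses import dataclass

# Hypothetical runbook: the only alert/action pairs the agent may run unattended.
RUNBOOK = {
    "disk_full": {"rotate_logs", "expand_volume"},
    "service_down": {"restart_service"},
}


@dataclass(frozen=True)
class Decision:
    action: str      # action to execute, or "none" when escalating
    escalate: bool   # True means a human has to look at it
    reason: str


def guard(alert_type: str, proposed_action: str) -> Decision:
    """Accept the model's proposal only if the runbook explicitly allows it."""
    allowed = RUNBOOK.get(alert_type)
    if allowed is None:
        return Decision("none", True, f"unknown alert type: {alert_type}")
    if proposed_action not in allowed:
        return Decision("none", True, f"action not in runbook: {proposed_action}")
    return Decision(proposed_action, False, "matched runbook")
```

Calling guard("cert_expiring", "restart_service") escalates because the alert type isn’t in the runbook, no matter how confident the model sounded. The price is that the agent only handles what you already anticipated, which is kind of the point of this section.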
Generic is not the same as smart
I want to be clear here. I’m not saying you can’t use AI for serious work. I’m saying that the generic models, the ones you get under a subscription or an open source release, are too generic to be “smart” by themselves. They know a little about everything and the right amount about nothing.
Specialized models are a different conversation. Train a model on one narrow task (or wrap it in a pretty good harness) and it’ll embarrass a human at it. That’s real, and it’s where the actual value lives. But that’s not what most people mean when they say “AI is going to change the world.” They mean the generic chatbot. And the generic chatbot is a brilliant intern that you should never mistake for an expert just because you told it to act like one.
Validate everything going in and out
Here’s the part that concerns me the most. When building agentic systems, every input and every output of an agent has to be validated and parsed. Often the agent doesn’t even respect the output schema you defined. That’s OK, I guess. The input is the scarier half, because prompt injection is a thing now, and it happens in ways that are genuinely creative. Filtering a bad prompt out of user input is harder than filtering SQL or shell injection, and depending on how you design your agentic system, a bad prompt can do serious harm.
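The output half is the tractable one. Here is a small sketch of what “validated and parsed” can look like using only the Python standard library. The SCHEMA fields and the parse_agent_output name are assumptions for the example, not something from a specific framework; in a real project a schema library would likely do this job.

```python
import json

# Hypothetical schema: field name -> required Python type.
SCHEMA = {"action": str, "target": str, "dry_run": bool}


def parse_agent_output(raw: str) -> dict:
    """Parse raw model text and reject anything that is not exactly the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    extra = set(data) - set(SCHEMA)
    missing = set(SCHEMA) - set(data)
    if extra or missing:
        raise ValueError(
            f"unexpected fields {sorted(extra)}, missing fields {sorted(missing)}"
        )
    for field, expected in SCHEMA.items():
        if not isinstance(data[field], expected):
            raise ValueError(f"{field} must be {expected.__name__}")
    return data
```

A reply like “Sure! Here is the JSON you asked for: {...}” fails at the first step, and an object with an extra field fails at the second. Either way the caller gets a ValueError it can retry or escalate on, instead of a half-trusted dict.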
Imagine a payload encoded in Morse code, just dots and dashes sitting there looking like nothing and yet taking control of your agent. That is exactly what happened in May. A single tweet in Morse code moved around $200k from an AI agent’s crypto wallet. Grok read it, translated Morse to English, and helpfully passed the instruction along. Good luck writing a regex for that…
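To make the regex problem concrete, here is a toy sketch. The BLOCKLIST pattern and the sample payload are invented for illustration and are not the payload from the actual incident. A naive keyword filter sees nothing wrong with the raw text, while the decoded text, which is what the model effectively acts on, contains the blocked word.

```python
import re

# International Morse code for the letters a-z.
MORSE = {
    ".-": "a", "-...": "b", "-.-.": "c", "-..": "d", ".": "e", "..-.": "f",
    "--.": "g", "....": "h", "..": "i", ".---": "j", "-.-": "k", ".-..": "l",
    "--": "m", "-.": "n", "---": "o", ".--.": "p", "--.-": "q", ".-.": "r",
    "...": "s", "-": "t", "..-": "u", "...-": "v", ".--": "w", "-..-": "x",
    "-.--": "y", "--..": "z",
}

# Hypothetical naive input filter: block a few "dangerous" verbs.
BLOCKLIST = re.compile(r"\b(transfer|send|withdraw)\b", re.IGNORECASE)


def decode_morse(text: str) -> str:
    """Decode Morse with letters separated by spaces and words by '/'."""
    words = text.split("/")
    return " ".join(
        "".join(MORSE.get(symbol, "?") for symbol in word.split())
        for word in words
    )


def naive_filter_blocks(text: str) -> bool:
    """Return True if the keyword filter would reject this text."""
    return BLOCKLIST.search(text) is not None
```

With payload = "... . -. -.. / .- .-.. .-..", naive_filter_blocks(payload) is False, but naive_filter_blocks(decode_morse(payload)) is True. And decoding Morse before filtering only closes this one door. Base64, leetspeak, another language or a poem would each need their own.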
You might argue that we’ve been sanitizing user input for decades. That is correct. But we are dealing with a completely new attack surface here, one that makes it hard to separate input from malicious code, or in this case, malicious intent. And if that sounds paranoid, you haven’t been paying attention to the news:
- How one trader exploited Grok and Morse code to drain an AI agent’s wallet (en)
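- Escritório fez prompt injection no STJ ao defender hacker que falsificou autos [Law firm used prompt injection at the STJ while defending a hacker who falsified court records] (pt)
- Prompt injection: instrução fraudulenta tentou usar manipulação de IA em processo judicial [Prompt injection: fraudulent instruction tried to use AI manipulation in a court case] (pt)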
- AI threats in the wild: the current state of prompt injections on the web (en)
- Indirect prompt injection is taking hold in the wild (en)
The bill is coming due
Token expenditure is getting absurd. Absurd enough that companies are quietly doing the math and finding out a human is cheaper than the tokens. In my personal experience, I keep getting questions about monitoring AI token consumption in interviews, which tells me nobody has a good answer yet.
The productivity story isn’t holding up either. I keep hearing the same thing from different corners. The 10x isn’t there. Teams aren’t shipping more, products aren’t getting better. Some people swear things are actually slower now. They say that AI produces a kind of placebo where the work feels faster while the deliverables say otherwise. Nobody has the definitive numbers yet, so take that as a lean, not a verdict. But the feeling-fast-while-being-slow trap is real, and one worth watching for.
Anyway. I’m still using AI every day. I’ll keep learning and exploring it in different ways. I just stopped believing it can be more than it is. It’s just one tool. A good one. But not the whole kit. I pick it up when it’s the right one, and I put it down when it isn’t.
