There is a big difference between getting an AI agent to do something once and getting it to do something safely enough that you would actually use it again.
That is the bit I think gets skipped over in a lot of agent automation talk.
The demo is usually clean. The agent reads a brief, changes a file, opens a pull request and everyone acts like the future has arrived. That is useful, but only if the process around it is solid. Otherwise it is just a faster way to make a mess.
I have been using agents more for real workflow tasks, not just chat responses. Things like drafting posts, checking repo structure, preparing pull requests and turning rough instructions into something usable. The value is there, but the quality comes from the checklist around the agent, not from the agent alone.
Start with the boring checks
Before an agent writes anything, it should prove it understands the workspace.
For a code repo, that means checking things like:
- Which repository it has access to
- What the default branch is
- What folder structure the project uses
- What naming conventions already exist
- What files it should not touch
- Whether there are existing examples to follow
This sounds basic, but it matters.
If an agent assumes the default branch is main when the repo uses master, it can start from the wrong place. If it guesses the content structure, it can create files that look fine in isolation but do not work with the actual site.
That is not an AI problem as much as it is a process problem. We should not be letting agents guess things that are easy to check.
Give it a bounded job
Agents work better when the task has edges.
A bad brief is something like:
Make my website better.
A better brief is:
Create one new blog post in the existing Astro content structure. Use the existing posts as a template. Put it in the most relevant category. Open a pull request. Do not merge it.
That kind of instruction gives the agent room to work, but not room to wander.
The same applies to client projects, internal tools and documentation. If the agent is updating a Jira ticket, say whether it should rewrite the whole thing or only add acceptance criteria. If it is touching a repo, say whether it can modify existing files or only create new ones.
The more practical the boundary, the better the result.
Make review part of the workflow
Review is not optional.
An agent can produce useful work, but it can also miss context, over-polish the wording, break a convention or create something that technically works but does not feel right.
For me, the useful pattern is:
- Agent does the repeatable setup and first draft
- Human reviews the output
- Human checks the risky parts
- Agent can then make targeted fixes
That is still a big time saver.
I do not need to manually create the branch, inspect the folder structure, copy the frontmatter pattern and write the first pass every time. But I still want to read the post, check the tone and confirm the files make sense before anything goes live.
Automation should reduce the grunt work. It should not remove judgement.
Watch the small failure points
The annoying failures are often small.
A YAML title with a colon in it can break parsing if it is not quoted. A generated image can be corrupted or too small. A file path can look close but not match the frontmatter. A smart-looking post can contain phrasing that does not sound like you.
None of those are dramatic failures, but they are the sort of things that waste time.
That is why I like putting specific checks into the agent brief:
- Quote YAML strings that contain punctuation
- Check for mojibake or weird encoding artifacts
- Do not commit placeholder images
- Verify image file integrity where possible
- Summarise the files changed and build status in the PR
These checks are boring, but boring checks are what make automation usable.
Agents need standards, not trust
I do not think the right mindset is to trust an agent.
The better mindset is to give it standards to meet.
For a blog post, that might mean:
- Follow the existing Astro frontmatter schema
- Use plain language
- Avoid marketing-heavy phrasing
- Keep the tone grounded and experience-based
- Open a pull request instead of pushing straight to production
For a software task, it might mean:
- Read the relevant files first
- Make the smallest sensible change
- Do not touch unrelated files
- Explain the trade-offs
- Include test or build status
That is where agents become useful. Not because they are magically right, but because they can work inside a process that catches the common mistakes.
Final thoughts
Agent automation is not about handing everything over and hoping for the best.
It is about taking the repeatable parts of a workflow and giving them enough structure that an agent can do the first pass properly.
The checklist is the product in a lot of ways.
Without it, the agent is just guessing faster. With it, you get something much closer to a useful teammate: still needing review, still needing direction, but able to take real work off your plate.
Thanks for reading. If you’re interested in collaborating on digital delivery, XR, or AI-enabled platforms, I’m always open to a conversation.
You can find me on LinkedIn, browse more writing on the blog, or explore my recent work.