The AI That Waters My Lawn Is Still on Probation

I have a new engineer on the team. It's a language model running on my home server, and its job is deciding whether my lawn needs water.

Like any new hire, it didn't get production access on day one. A few times a day it writes up the call it would make, and I review its work before it ever touches a valve. Its best decision so far: skip the watering, rain is coming.

The model is local, the stakes are a lawn, and it hasn't moved one drop of water yet. I built it that way on purpose.

The Job Description

Throughout the day, a small Go service on my home server hands the model an evidence packet: forecasts from multiple weather APIs, live readings from my own weather station wired into Home Assistant, and the current state of the Rain Bird sprinkler system, including any rain delay already in effect.

The weather station is the part I like. A forecast is somebody's prediction about my lawn; the station reports what actually fell on it. And when the forecasts argue with each other, the system trusts them less.
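Here is one way "trusts them less" could look in Go. This is a sketch of the idea, not the service's actual code: the function name, the linear falloff, and the spread cutoff are all assumptions made for illustration.

```go
package main

import "math"

// forecastConfidence returns a weight in [0, 1] for a set of rain forecasts
// (millimetres expected over the same window, one value per source).
// Hypothetical helper: full trust when the sources agree, falling linearly
// to zero once the gap between the wettest and driest forecast reaches
// maxSpreadMM.
func forecastConfidence(rainMM []float64, maxSpreadMM float64) float64 {
	if len(rainMM) == 0 || maxSpreadMM <= 0 {
		return 0 // no forecasts, or a nonsensical cutoff, means no trust
	}
	lo, hi := rainMM[0], rainMM[0]
	for _, v := range rainMM[1:] {
		lo = math.Min(lo, v)
		hi = math.Max(hi, v)
	}
	spread := hi - lo
	return math.Max(0, 1-spread/maxSpreadMM)
}
```

With a 10 mm cutoff, forecasts of 2 mm and 7 mm disagree by 5 mm and get a weight of 0.5, while three sources that all say 5 mm get full weight.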

The model's job is to turn that evidence into a plan: whether to water today, when, which zones and for how long, or whether to call a rain delay and sit the day out.

It doesn't work alone, though. There's an org chart. Plain math drafts the baseline: a water balance built on the Hargreaves formula, which estimates how much water the lawn loses to sun and heat. Recent rain in, evaporation out, no AI involved.
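For the curious, the standard form of the Hargreaves equation is short enough to sketch in Go. The equation is the published one; the function names and the one-day balance below are illustrative assumptions, not the code the service runs.

```go
package main

import "math"

// hargreavesET0 estimates reference evapotranspiration in mm/day with the
// Hargreaves equation:
//
//	ET0 = 0.0023 * (Tmean + 17.8) * sqrt(Tmax - Tmin) * Ra
//
// Temperatures are in degrees Celsius. raMMDay is extraterrestrial radiation
// expressed as equivalent evaporation in mm/day; it depends on latitude and
// day of year and is simply taken as an input here.
func hargreavesET0(tMinC, tMaxC, raMMDay float64) float64 {
	if tMaxC < tMinC {
		tMinC, tMaxC = tMaxC, tMinC // tolerate swapped arguments
	}
	tMean := (tMinC + tMaxC) / 2
	et0 := 0.0023 * (tMean + 17.8) * math.Sqrt(tMaxC-tMinC) * raMMDay
	return math.Max(0, et0)
}

// waterDeficitMM is a hypothetical one-day water balance: evaporation out
// minus recent rain in, floored at zero so surplus rain never goes negative.
func waterDeficitMM(et0MM, rainMM float64) float64 {
	return math.Max(0, et0MM-rainMM)
}
```

For a day that swings from 15 °C to 31 °C with Ra at 16 mm/day, the estimate comes out to about 6 mm of water lost, which is a number I can check with a calculator.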

Quick confession: Hargreaves wasn't my idea. I'd never have thought to use it, because I'd never heard of it. The AI helping me build this suggested it. It pointed me at math I could check, and that's the kind of help I'll take all day.

From that draft, the model proposes its changes, a safety clamp signs off on whatever comes back, and Home Assistant does the actual work.

No Keys on Day One

The model never toggles a switch, and that doesn't change when probation ends. Even at full trust, all it ever does is write a plan that Home Assistant executes. Right now, in dry-run mode, it doesn't even do that: every decision goes into a journal instead of onto the lawn.

The guardrails don't care who wrote the plan, either. The model's proposals pass through exactly the same clamp as the plain-math baseline: freeze protection, a hold for wet soil, a hold for recent rain, and caps on how often and how long anything can run.
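A clamp like that fits in one small function. The sketch below is a guess at the shape, not the real thing: the type names, field names, and units are all hypothetical, and the thresholds are left as configuration instead of invented here.

```go
package main

// Plan maps a zone name to minutes of watering. Hypothetical type.
type Plan map[string]int

// Conditions holds the readings the clamp checks. Names and units are
// illustrative assumptions.
type Conditions struct {
	TempC           float64
	SoilMoisturePct float64
	RainLast24hMM   float64
	RunsToday       int
}

// Limits holds the hard limits, supplied by configuration.
type Limits struct {
	FreezeBelowC      float64
	WetSoilAbovePct   float64
	RecentRainAboveMM float64
	MaxRunsPerDay     int
	MaxMinutesPerZone int
}

// clamp applies the same hard limits to any plan, whoever wrote it.
// It returns the plan that may run and, if a hold fired, the reason.
func clamp(p Plan, c Conditions, l Limits) (Plan, string) {
	switch {
	case c.TempC <= l.FreezeBelowC:
		return Plan{}, "freeze protection"
	case c.SoilMoisturePct >= l.WetSoilAbovePct:
		return Plan{}, "wet soil hold"
	case c.RainLast24hMM >= l.RecentRainAboveMM:
		return Plan{}, "recent rain hold"
	case c.RunsToday >= l.MaxRunsPerDay:
		return Plan{}, "daily run cap"
	}
	out := make(Plan, len(p))
	for zone, minutes := range p {
		if minutes < 0 {
			minutes = 0
		}
		if minutes > l.MaxMinutesPerZone {
			minutes = l.MaxMinutesPerZone
		}
		out[zone] = minutes
	}
	return out, ""
}
```

The function takes a plan and nothing about its author, so the model's proposal and the baseline have no way to get different treatment.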

If the model fails (garbled output, a timeout, an answer that isn't a plan), the plain-math plan ships instead and the journal says so. The floor under this whole system is a formula I can check by hand.
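The fallback can be sketched in Go like this. It is an illustration under assumptions, not the service's code: the names are hypothetical, the model call is reduced to a plain function, and a real version would likely pass a context so a timed-out request is actually cancelled.

```go
package main

import (
	"errors"
	"time"
)

// Decision records the plan that ships and who wrote it. Hypothetical type.
type Decision struct {
	Minutes map[string]int // zone name -> minutes of watering
	Source  string         // "model" or "baseline"
	Note    string         // why the fallback fired, if it did
}

var (
	errNoPlan  = errors.New("model returned no plan")
	errTimeout = errors.New("model timed out")
)

// decide asks the model for a plan and ships the baseline instead if the
// model errors, returns nothing usable, or takes longer than timeout.
func decide(ask func() (map[string]int, error), baseline map[string]int, timeout time.Duration) Decision {
	type reply struct {
		minutes map[string]int
		err     error
	}
	// Buffered so the goroutine can finish even after we stop listening.
	ch := make(chan reply, 1)
	go func() {
		m, err := ask()
		ch <- reply{m, err}
	}()

	var r reply
	select {
	case r = <-ch:
		if r.err == nil && r.minutes == nil {
			r.err = errNoPlan
		}
	case <-time.After(timeout):
		r.err = errTimeout
	}
	if r.err != nil {
		return Decision{Minutes: baseline, Source: "baseline", Note: "fallback: " + r.err.Error()}
	}
	return Decision{Minutes: r.minutes, Source: "model"}
}
```

Every failure mode collapses into the same branch, and the Note field is what lets the journal say why the baseline shipped.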

It fails safe in the other direction too. If the model dies, or the home server all of this lives on goes down entirely, Home Assistant still holds the last good plan and runs it. Everything is visible and overridable from the Home Assistant app on my phone. The AI can be wrong, slow, or dead, and the sprinklers stay sane.

Reviewing the Homework

Probation only works if someone actually reviews the work.

Every run lands in the journal as one entry, and reading one feels like reviewing a junior engineer's pull request: the evidence they had, the call they wanted to make, and what review actually let through. The entry also records the baseline the math drafted and who made the final call, whether that was the AI, the fallback, or plain math.
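To make that concrete, a journal entry could be a struct written out as one line of JSON. The field names and the JSON-lines format are assumptions for illustration; the real journal may look nothing like this.

```go
package main

import (
	"encoding/json"
	"time"
)

// JournalEntry is a hypothetical shape for one run: the evidence, the
// math's draft, the model's proposal, what the clamp let through, and who
// made the final call.
type JournalEntry struct {
	At        time.Time      `json:"at"`
	Evidence  map[string]any `json:"evidence"`
	Baseline  map[string]int `json:"baseline_minutes"`
	Proposed  map[string]int `json:"proposed_minutes"`
	Final     map[string]int `json:"final_minutes"`
	DecidedBy string         `json:"decided_by"`
	DryRun    bool           `json:"dry_run"`
}

// journalLine renders an entry as a single line of JSON, suitable for
// appending to a log file and reading back later.
func journalLine(e JournalEntry) (string, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

Keeping the baseline, the proposal, and the final plan side by side is what makes an entry reviewable: the diff between them is the model's actual contribution.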

The standout entry so far came from the model itself, a small local model running through Ollama on my own hardware. It looked at the evidence, saw rain on the way, and proposed skipping the watering entirely. The clamp agreed and let it stand. That's the same call I would have made standing in the driveway looking at the sky, except it made the call from data, on a schedule, without me.

One good decision isn't a track record, though. Before this goes live I want a body of work, not one lucky answer: sound water and no-water calls across enough runs that the journal gets boring to read.

The Standup Notes

A new engineer doesn't just get a ticket. They get context, the stuff you say at standup that never makes it into the requirements.

The model gets the same. Each zone carries its stable facts (what's planted there, how many sprinkler heads it has, how much sun it gets), plus a live condition note I can edit straight from my phone: "backyard grass looks burnt, needs extra water." The model reads those notes on every run.

A note can nudge a zone wetter or drier, but it can't break the rules. However dramatic my note gets, the clamp holds the plan to the same hard limits.

How Trust Gets Granted

When an engineer joins your team, you don't hand them production credentials with the badge. You give them a narrow lane and real problems, review what they produce, and widen the lane as the reviews keep coming back clean.

The model's probation works the same way: real inputs, real decisions, zero consequences, every piece of work reviewed. Validation here is a review queue with my name on it. It's the deal AI gets at my day job too, full speed on anything I can check and no vote on the things I can't.

I've been burned before by trusting an automation I never watched run, and once was enough. The rule I keep coming back to, in the backyard and at work: if you can't review an AI's decisions before they have consequences, you've granted trust it hasn't earned.

Go-live is still one config flip away: dry_run: false. The journal decides when that happens, and right now the journal says not yet.
