Why Long Tasks Break AI

The long-horizon problem is the most expensive secret in the AI industry right now. It isn’t a memory problem.

May 7, 2026 · By Pollyanna · Logocachexia series

Ask an AI to write you a paragraph. It does great.

Ask it to write you a 50-page report. It does okay.

Ask it to spend the next three weeks running a small marketing campaign for you, making decisions, adjusting based on results, remembering what worked Tuesday and applying it Friday — and watch it fall apart by day four.

This is the long-horizon problem, and it’s the most expensive secret in the AI industry right now. Every company demoing “agents” has it. Nobody has solved it. Startups are quietly burning through millions trying to patch around it.

Here’s what’s actually happening.

A short task is a closed loop. You give the AI an input, it gives you an output, you check it, done. The AI doesn’t need to remember anything afterward. It doesn’t need to update its sense of what’s working. It doesn’t need to notice that the situation has shifted.

A long task is the opposite. You’re not asking for one output — you’re asking for a thousand small decisions, each one based on what came before, each one slightly adjusting the plan. The decisions aren’t independent. They’re a single continuous act of holding context, noticing change, and adapting.

Holding context like that isn’t a memory problem, even though everyone tries to solve it as one. Bigger context windows. Vector databases. Persistent storage. None of it works the way people hope, because the actual thing being held isn’t information.

It’s a sense of where you are.

A senior project manager, three weeks into a campaign, isn’t running a query against a database in her head. She has an updated, lived sense of the situation. She knows that the new intern is faster than expected, that the client is getting nervous, that last Wednesday’s data was probably a fluke, that the timing is wrong to push the new variant. None of this is written down. None of it can be. It’s in her, not next to her.

Current AI doesn’t have that. It has the words she would say if you asked her to describe the situation. But the underlying being-in-the-situation — the thing the words come from — isn’t there.


So around day four, the model starts producing sentences that sound like they’re tracking the campaign but are actually drifting. The decisions get worse. The model forgets what it decided on Tuesday because it never really decided; it just generated a sentence that looked like a decision. By day ten, you’ve got coherent-sounding chaos that no human would recognize as a real campaign.

This is why every “AI agent” demo looks amazing for ten minutes and falls over by hour three.

The fix everyone wants is technical. Better memory. Bigger context. Smarter retrieval. Maybe these help at the margins. But the underlying issue isn’t technical. It’s that running a real campaign for three weeks requires the kind of slowly built sense of a situation that humans get from doing it many times, and current AI doesn’t have a path to that.

The honest version of “AI agents” right now is: AI is great for short, well-defined chunks of work. For anything that needs to be held over time, you still need a human to hold it.

That’s not a bug to be fixed in the next release. That might just be what AI is, until something fundamental changes.

Part of the Logocachexia series at Nous. The argument that fluent text can’t substitute for slowly built judgment is laid out in full in Hexis Asks, Logos Guesses. The long-horizon failure is what that gap looks like across time.

The Logocachexia thesis — and the longer arc of the work — lives at Logos.
