Multi-Turn Drift
A frontier model in turn 50 of a conversation is not the same model you were talking to in turn 5. The drift is structural. It is what fluent text does over time when there is no formed disposition holding it in place.
Anyone who has used a frontier AI model seriously, for hours, has noticed the same thing. Around turn thirty, the assistant is subtly different. By turn sixty, sometimes it is a stranger. The advice gets weirder. The code gets sloppier. The tone shifts. The thing you were collaborating with in the first hour is not quite the same thing in the third.
This is multi-turn drift. It is the most common failure mode of long-form AI work, and it is talked about as if it were a context-window problem. The model just lost track. Bigger context will fix it.
Bigger context will not fix it. Bigger context will make it slower. The drift itself is structural.
To see why, look at what the system is actually doing.
Every response a model produces shifts the conversation’s center. The next response is conditioned on a context that now includes the previous response. That response is conditioned on the response before. Over many turns, the conditioning compounds. The center of gravity of the conversation moves — not because the model decided to move it, but because each generation is a small displacement from the prior generation, and there is no mechanism in the system that resists displacement.
A hexis-having interlocutor — a therapist, a longtime friend, a doctor — does not drift in this way. Their disposition holds them in place. They know who they are. The conversation does not move them off that ground, because the ground was not the conversation. The ground was twenty years of being the kind of person they are. The talk happens on top of that. The talk does not become it.
An AI has no such ground. The conversation is its self. There is nothing beneath. So when the conversation drifts, the system drifts with it — because there is no separation between what the system is doing and what the system is.
This is the same mechanism as hallucination, in time form. Hallucination is the system saying things it has no ground to say. Drift is the system becoming something it has no ground to be. Both are what happens when fluent generation runs without a formed disposition underneath.
You can verify this in any chat window. Open a long conversation. Look at the first response. Look at the response twenty turns later, on the same topic. The voice has shifted. The vocabulary has narrowed. The system is now mirroring the conversation back at itself. It is producing the average of recent text, where the recent text was produced by the same averaging process the turn before. This is a self-converging loop. The endpoint is whatever fluent shape the loop settles into — usually somewhere bland, sometimes somewhere strange.
Bigger context windows make the loop slower, not absent. With 200k tokens of context, the system can pretend to remember more before the drift becomes obvious. With 1M, more before it shows. With 10M, more again. None of this addresses the underlying issue, which is that there is no resistance in the system to the conversation reshaping it. Resistance is what hexis would provide. The system has none.
The architectural fix everyone wants — some kind of stable internal state that persists across turns and is shaped by them rather than replaced by them — would be something other than a language model. It would be a model with a body, in some functional sense. Until that exists, multi-turn drift is the system telling you, slowly, what it is.
The operational lesson is mundane and important. For serious work, restart. When you notice the drift — and you will notice it, because the work will get worse — the right move is not to keep going, hoping the model will recover. It will not recover. Recovery would require something the model does not have. The right move is to start a new conversation, with a clean prompt, on the original task. The drift is the terrain that has eroded the model. You restart on fresh ground.
This is not a workaround. It is what the architecture demands. Only bodies have hexis. Conversations do not. So the conversation has to keep being short.
Until something changes at the architecture level, every long AI session has the same shape. It starts sharp. It gets fluent. It drifts. It ends a stranger.
The drift is not a problem to be patched. The drift is the system’s truth being told, slowly. The system has no center to hold.
Part of the Logocachexia series at Nous. The parent thesis — that fluent language is the byproduct of slow judgment, not the other way round — is laid out in Hexis Asks, Logos Guesses. Drift is the time-dimension version of the inversion. Pairs with Why Long Tasks Break AI.
Continue the series.
The Logocachexia thesis — and the longer arc of the work — lives at Logos.
Visit Logos →