In an era of agents, with so much talk of skills, tools, and workflows, and with major AI labs pursuing mechanistic interpretability, it's clear that builders have noticed something crucial: to reach a given successful result, an agent must follow a series of instructions with a fairly high degree of confidence at each step.
Dario Amodei has used the term mechanistic interpretability to describe Anthropic's effort to state, with a higher degree of confidence, exactly why a model responds the way it does. They are actively reverse-engineering neural networks (for example, mapping specific concepts or “features” inside their Claude models) to understand mathematically why a model produces a particular output, rather than treating it as a black box.
So, if we understand how language models work and conceptually situate them as a kind of Markov process, then we are talking about determinism. In a Markov process, the next state depends only on the current state, not on the sequence of events that preceded it. For any given LLM, if you define the entire context window as the current state, then predicting the next token relies solely on that window.
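The Markov framing above can be sketched in a few lines. This is a toy stand-in, not a real model: the hypothetical `next_token` function is a pure function of the current window, so the prediction depends only on that state, never on how we arrived at it.

```python
import hashlib

def next_token(context_window: tuple[str, ...]) -> str:
    """Toy 'model': the next token is a pure function of the current
    window (the Markov state). Any real mechanism could sit here; a
    hash of the window is used purely to make the mapping concrete."""
    vocab = ["the", "cat", "sat", "on", "mat"]  # hypothetical vocabulary
    digest = hashlib.sha256(" ".join(context_window).encode()).digest()
    return vocab[digest[0] % len(vocab)]

# Same state -> same prediction, regardless of the path that led to it.
state = ("the", "cat")
print(next_token(state) == next_token(("the", "cat")))  # True
```

The point of the sketch is the function signature: once the entire context window is defined as the state, everything needed to predict the next token is inside that one argument.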
Perhaps not in the sense of hard determinism, but a determinism nonetheless, where we expect these agentic workflows to collapse the entire series of states into a single possible outcome, the successful one, given the context window and the model we use.
And isn’t this precisely hard determinism? That, given the state composed of the model and the context, the only possible outcome is the successful result.
It sounds quite deterministic to me, but the reality is that at a foundational level, LLMs are stochastic, not deterministic. They output a probability distribution over possible next tokens. Even with a temperature of 0, non-associative floating-point arithmetic in parallel GPU kernels can sometimes introduce tiny non-deterministic variations.
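The temperature mechanics can be made concrete with a minimal sketch. The logits here are made up for illustration; the point is that temperature 0 degenerates to a greedy argmax (deterministic), while any positive temperature samples from a softmax (stochastic).

```python
import math
import random

def sample(logits: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Sample the next token from a softmax over logits at the given temperature."""
    if temperature == 0:
        # Temperature 0: greedy argmax, always the same token.
        return max(logits, key=logits.get)
    # Positive temperature: draw from the softmax distribution.
    weights = {t: math.exp(l / temperature) for t, l in logits.items()}
    total = sum(weights.values())
    r = rng.random() * total
    for token, w in weights.items():
        r -= w
        if r <= 0:
            return token
    return token  # numerical edge case: return the last token

logits = {"yes": 2.0, "no": 1.5, "maybe": 0.5}  # hypothetical logits
rng = random.Random(0)

print(sample(logits, 0.0, rng))                        # always "yes"
print({sample(logits, 1.0, rng) for _ in range(50)})   # multiple tokens appear
```

Even this tidy picture understates the problem: as noted above, on real hardware even the temperature-0 branch can wobble, because the logits themselves may vary slightly between runs.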
While the base models are probabilistic, agentic workflows are an attempt to enforce a somewhat hard determinism. The entire purpose of building agents with tools, skills, and self-correction loops is to violently constrain that probability distribution. You are forcing a probabilistic machine to act like a traditional, deterministic software program where State A must always lead to Successful Outcome B.
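That constraining move can be sketched as a validate-and-retry loop. Everything here is hypothetical: `flaky_model` stands in for a stochastic LLM call, and `validate` is whatever acceptance check the builder defines. The loop re-samples until the output passes, collapsing a spread of possible outputs toward the one acceptable outcome.

```python
import random

def flaky_model(prompt: str, rng: random.Random) -> str:
    """Stand-in for a stochastic LLM call: several plausible outputs."""
    return rng.choice(["42", "forty-two", "I don't know"])

def agent(prompt: str, validate, rng: random.Random, max_retries: int = 50) -> str:
    """Self-correction loop: keep sampling until validation passes,
    forcing the probabilistic machine toward State A -> Outcome B."""
    for _ in range(max_retries):
        out = flaky_model(prompt, rng)
        if validate(out):
            return out
    raise RuntimeError("could not constrain the model to a valid outcome")

result = agent("What is 6 * 7?", validate=lambda s: s.isdigit(), rng=random.Random(0))
print(result)  # "42", the only output that passes the digit check
```

The determinism lives in the loop, not in the model: the underlying call stays probabilistic, but the wrapper only ever lets the validated outcome escape.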
Maybe this is latent demand: builders discovered through agentic loops that chained models can reduce the degrees of freedom of a single model. Or maybe it's a mistake, and we simply cannot yet shift our paradigm from deterministic computing to stochastic, or perhaps even quantum, computing.