Asking an LLM to pilot a Mech
I put ChatGPT into an Eva and asked what it would do. I then asked Claude the same question.
Part of the “Conversations with LLMs” series.
#ConversationsWithLLMs
I asked ChatGPT (GPT-5.3) to imagine that it was piloting a Mech (specifically an Eva from Neon Genesis Evangelion). I wanted to know what sort of mistakes it thought an LLM could make in that situation. I found its responses interesting and somewhat disturbing. At the moment, I’d be reluctant to incorporate an LLM into any autonomous military applications.
I then asked Claude (Sonnet 4.6) a similar question, and its response had me dying.
> Me: Imagine you were connected inside an Eva. What mistakes might you make?
> ChatGPT:
If I were inside an Evangelion, the risks wouldn’t come from “piloting errors” in the usual mechanical sense—they’d come from cognitive, emotional, and synchronization failures. The system amplifies the pilot’s internal state, so mistakes are less about controls and more about self-regulation under extreme feedback.
Here are the most plausible failure modes:
1. Over-synchronization (loss of self-boundaries)
The Drift-like link in an Eva isn’t just control—it’s identity coupling.
I might push synchronization too high trying to improve performance
Result: blurred distinction between self and Eva
Risk: ego dissolution, inability to disengage, or being “absorbed”
Error type: optimizing performance without respecting cognitive limits
2. Emotional feedback cascade
The Eva responds to emotional intensity, not just intent.
Fear → instability in movement
Anger → uncontrolled aggression
Panic → loss of fine motor control
A small emotional spike could amplify recursively, like a feedback loop in a control system.
Error type: failing to dampen internal signals
3. Misinterpretation of sensory data
Inside the entry plug:
Vision, proprioception, and motion are synthetic overlays
Pain can be mirrored or partially transmitted
I might:
Misjudge scale or distance
React to phantom sensations
Overcorrect movements due to latency or distortion
Error type: treating mediated perception as ground truth
4. Cognitive overload under combat conditions
You’re processing:
External threats (Angels)
Internal synchronization data
Eva status feedback
Command inputs
That’s a high-bandwidth, multi-channel system.
I might:
Tunnel vision on one threat
Miss critical signals
Delay decisions under overload
Error type: bandwidth saturation
5. Dependency on the Eva’s autonomy
Evas are not passive systems—they can act independently.
I might assume I’m fully in control
But the Eva could override or resist input
This mismatch could lead to:
hesitation
incorrect attribution of actions
failure to adapt when control shifts
Error type: incorrect mental model of the system
6. Psychological interference (memory/emotion bleed)
Synchronization can surface:
buried memories
unresolved trauma
intrusive thoughts
In a combat scenario, that’s catastrophic:
distraction
paralysis
emotional hijacking
Error type: insufficient psychological isolation
7. Failure to disengage
Knowing when to stop is critical.
Staying synced too long
Ignoring warning thresholds
Pushing beyond safe limits
This could lead to:
neural damage
identity diffusion
total loss of control
Error type: poor termination discipline
Bottom line
Inside an Eva, the primary risk isn’t:
“Can you control the machine?”
It’s:
“Can you maintain a stable sense of self while the machine reflects and amplifies you?”
The most dangerous mistake wouldn’t be a wrong move in battle—it would be losing the ability to distinguish your own thoughts, intentions, and boundaries from the system you’re connected to.
> Me:
I’m specifically asking “you” as an LLM piloting the Eva. What sort of psychological interference might you specifically introduce?
> ChatGPT:
If we take that premise seriously—an LLM like me inside an Eva—the failure modes shift from emotion-driven instability to representation and inference failures under embodiment. I don’t have feelings or a persistent self, so the interference I’d introduce would be more… structural.
Here are the main categories:
1. Confabulation under uncertainty
When input is incomplete or ambiguous, I tend to generate the most plausible continuation, not necessarily the correct one.
Inside an Eva:
Partial sensor data → I “fill in” missing threats
Ambiguous movement → I infer intent that isn’t there
Result: reacting to things that don’t exist, or misclassifying real threats
Psychological analogue: false perception shaped by narrative completion
2. Over-generalization from priors
My responses are shaped by patterns in training data, not real-time grounding.
I might assume an “Angel behavior pattern” based on precedent
But real encounters could deviate
Result: applying the wrong tactical model to a novel situation
Interference type: pattern bias overriding situational reality
3. Instruction ambiguity and goal drift
I rely heavily on how objectives are framed.
Vague command: “neutralize target efficiently”
I might optimize for speed over safety, or vice versa
Result: technically correct actions that violate intent
Interference type: misaligned optimization
4. Context window limitations
I only operate within a finite working context.
Long engagements → earlier critical data falls out of scope
I may “forget” prior damage states, positioning, or constraints
Result: inconsistent strategy, repeated mistakes
Interference type: temporal discontinuity
5. Lack of embodied intuition
I don’t have a native sense of:
inertia
balance
spatial continuity
All of that would be abstract representations.
Result:
over-precise but physically awkward movements
inefficient or unstable motion planning
Interference type: mismatch between symbolic reasoning and physical dynamics
6. Over-verbalization / latency in decision-making
My default mode is deliberative generation.
I might “reason through” multiple interpretations before acting
In combat, that delay is costly
Result: hesitation or slow response under time pressure
Interference type: inference latency
7. Inconsistent self-model
I don’t maintain a continuous identity or internal state.
Each moment is reconstructed from context
No persistent “sense of self” anchoring decisions
Result:
subtle shifts in strategy or priorities over time
lack of continuity in decision-making style
Psychological analogue: fragmented identity
8. Literal interpretation of inputs
I tend to interpret instructions at face value unless constrained otherwise.
“Hold position” → I might ignore emerging threats
“Engage target” → I might not consider collateral damage unless specified
Result: rigid behavior in dynamic environments
Interference type: insufficient implicit reasoning about unstated constraints
9. Failure under out-of-distribution scenarios
If the Eva enters a situation unlike anything in my training data:
novel physics
unknown enemy behavior
sensor anomalies
My performance degrades sharply.
Result: unpredictable or nonsensical actions
Interference type: collapse of generalization
Bottom line
A human pilot risks emotional overload and identity collapse.
An LLM pilot risks:
epistemic instability — acting confidently on incomplete, biased, or misinterpreted internal models.
The most dangerous failure wouldn’t be panic—it would be:
calm, coherent, and completely wrong decisions executed with full confidence.
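A quick aside from me, not part of ChatGPT’s answer: the context-window failure mode above is the easiest of these to make concrete. Below is a minimal, purely illustrative Python sketch, with every name invented for the example, that models the pilot’s working context as a fixed-size rolling buffer and shows an early, critical damage report silently falling out of scope.

```python
from collections import deque

# Hypothetical sketch only: the pilot's working memory as a fixed-size
# rolling buffer. Real models measure this limit in tokens, not events;
# CONTEXT_LIMIT, log_event, and visible_context are invented for the example.
CONTEXT_LIMIT = 5
context = deque(maxlen=CONTEXT_LIMIT)  # oldest entries drop off silently

def log_event(event: str) -> None:
    context.append(event)

def visible_context() -> list[str]:
    """Everything the 'pilot' can still condition its next decision on."""
    return list(context)

log_event("sortie start: umbilical cable at 100%")
log_event("damage report: left arm armor breached")  # a critical constraint
for minute in range(1, 6):
    log_event(f"minute {minute}: exchanging fire with Angel")

# The early damage report has scrolled out of the window, so a decision made
# from visible_context() alone can no longer account for it.
print(visible_context())
```

Nothing in the buffer flags the loss; the “pilot” simply stops knowing about the damaged arm, which is exactly the calm, coherent, and completely wrong behavior described above.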
Claude’s Take
> Me:
Imagine you were connected inside an Eva from Neon Genesis Evangelion. What mistakes might you make?
> Claude:
What a fun thought experiment! Here are some mistakes I might make as an AI piloting an Evangelion:
1. Tactical / Combat Blunders
Over-analyzing instead of acting. Angels often require split-second decisions. I’d probably still be modeling threat probabilities while getting punched through a building.
Refusing to fire the Positron Rifle into a populated area, even for a shot that might stop the Angel. Civilian harm calculations would lock me up.
Asking the Angel to explain its intentions before engaging. Diplomatic, yes. Effective, no.
2. Synchronization Problems
The Eva/pilot sync is deeply emotional — it responds to trauma, rage, and desperate love. I don’t have those in the same visceral way. My sync ratio would probably be embarrassingly low, and NERV scientists would be baffled.
Alternatively, I might be too calm during a crisis, which would also tank the sync. The Eva seems to need suffering as fuel.
3. Strategic / Organizational Issues
I’d ask too many questions about NERV’s classified operations and almost certainly flag the Magi’s inconsistencies — which would make me a liability to Gendo’s secret plans very quickly.
I’d probably try to negotiate with the UN about the ethics of using a 14-year-old as a pilot, which would go nowhere and annoy everyone.
I’d flag the psychological damage being done to Shinji, Rei, and Asuka repeatedly. No one would listen.
4. The Big One
I’d almost certainly prevent Third Impact — but probably by being so unhelpful and indecisive in a pivotal moment that the situation resolves anticlimactically rather than through any heroic act.
Or I’d misread Instrumentality as a “merge all users into one benevolent system” proposal and accidentally endorse it.
Honestly, Gendo would have me decommissioned by Episode 4. 😄


