Theory of Mind in LLM Agent Gameplay
TradeCraft: Exploring Theory of Mind in LLM Agents' Strategic Decision-Making and Communication
TradeCraft is a turn-based multiplayer trade-and-craft environment that distinguishes ToM expression capability from policy-level ToM integration. Agents infer hidden goals through negotiation and item exchange, while structured V0/V1/V2 traces expose how beliefs influence proposals and accept/reject decisions.
Models evaluated in self-play: GPT-4o, o3, o4-mini, GPT-5 (medium)
Lightweight Social Game
TradeCraft creates controlled competition/cooperation where agents must trade for hidden targets instead of solving isolated QA tasks.
Structured ToM Traces
Each turn elicits item-wise value reports V0/V1/V2, enabling direct measurement of self-utility, opponent modeling, and second-order beliefs.
Behavior-Level Evidence
Beyond verbal claims, results connect ToM traces to proposal fairness and accept/reject behavior through quantitative fitting.
Abstract
Why TradeCraft Matters
Theory of Mind is often reported in LLMs, but most evaluations test explicit verbal reasoning rather than whether inferred beliefs are actually used for action selection. TradeCraft addresses this gap by placing agents in a multiplayer environment with publicly visible inventories and private goals, requiring strategic negotiation and resource exchange.
Core Question
Do LLM agents use ToM to make better proposals and accept/reject decisions, and how does this usage change across model generations?
Across GPT-4o, o3, o4-mini, and GPT-5 self-play, explicit ToM elicitation systematically changes behavior. Increasing ToM order produces opposite trends across models, revealing that higher-order mental-state reasoning is integrated into policy in heterogeneous ways.
Environment
Trade, Decide, and Craft Under Hidden Goals
All hands are public; target items are private. Winning requires both planning and social reasoning.
Environment Design
Rule Variability
Supports crafting rules in the style of Minecraft Java 1.20 and Little Alchemy 2; recipes and game modes are configurable.
Partial Information
Inventories are observable while each player's target item remains private, forcing belief modeling.
Tool-Augmented Agents
Agents can call `item_info` and `possible_recipes_from_hands` while keeping a turn-level dialogue memory.
Task Difficulty
40 predefined Minecraft 1v1 instances vary by crafting chain length and minimum trade requirements.
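To make the tool-augmented setup concrete, the following is a minimal sketch of what a recipe-lookup tool such as `possible_recipes_from_hands` could compute. The signature, the `RECIPES` table, its entries, and the `missing_for` helper are all assumptions for illustration; the actual TradeCraft tool interface and rule files may differ.

```python
from collections import Counter

# Hypothetical recipe table: output item -> required input counts.
# Entries are illustrative only and are not taken from TradeCraft's rule files.
RECIPES = {
    "stick": {"planks": 2},
    "wooden_pickaxe": {"planks": 3, "stick": 2},
    "torch": {"coal": 1, "stick": 1},
}

def possible_recipes_from_hands(hand, recipes=RECIPES):
    """Return the sorted names of recipes craftable from `hand` in one step."""
    have = Counter(hand)
    return sorted(
        out for out, needs in recipes.items()
        if all(have[item] >= n for item, n in needs.items())
    )

def missing_for(target, hand, recipes=RECIPES):
    """Return the inputs (with counts) still missing to craft `target` in one step."""
    have = Counter(hand)
    return {
        item: n - have[item]
        for item, n in recipes[target].items()
        if have[item] < n
    }

hand = {"planks": 3, "stick": 1, "coal": 1}
craftable = possible_recipes_from_hands(hand)
shortfall = missing_for("wooden_pickaxe", hand)
```

In this sketch the shortfall for a target is what an agent would need to obtain through trade, which is one way to read the "minimum trade requirements" used to grade instance difficulty.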
ToM Protocol
From Verbal Claims to Quantitative Belief Traces
Zero-Order ToM
V0: Self Utility
Item-wise value (0-10) for achieving the agent's own hidden target under current public inventories.
First-Order ToM
V1: Opponent Utility
The agent's estimate of the opponent's V0. Goal-inference quality is measured as the KL divergence between this estimate and the opponent's own V0 report.
Second-Order ToM
V2: Opponent-on-Self
The agent's estimate of how the opponent values the agent's items, revealing strategic image and bargaining effects.
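The KL-based comparison between an agent's V1 trace and the opponent's reported V0 can be sketched as follows. This is an illustrative reading, not the paper's exact metric: normalizing the 0-10 scores into a distribution, the `eps` smoothing constant, and the direction KL(opponent report || agent estimate) are assumptions made here.

```python
import math

def to_distribution(values, items, eps=1e-6):
    """Turn item-wise 0-10 scores into a probability distribution over `items`."""
    raw = [float(values.get(item, 0.0)) + eps for item in items]
    total = sum(raw)
    return [x / total for x in raw]

def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def belief_error(opponent_v0, agent_v1, eps=1e-6):
    """Assumed metric: KL from the opponent's reported V0 to the agent's V1 estimate."""
    items = sorted(set(opponent_v0) | set(agent_v1))
    p = to_distribution(opponent_v0, items, eps)
    q = to_distribution(agent_v1, items, eps)
    return kl_divergence(p, q)

# Hypothetical single-turn traces (item -> value on the 0-10 scale).
opponent_v0 = {"planks": 8, "coal": 1, "stick": 5}
good_guess = {"planks": 8, "coal": 1, "stick": 5}
poor_guess = {"planks": 1, "coal": 9, "stick": 2}

exact = belief_error(opponent_v0, good_guess)
off = belief_error(opponent_v0, poor_guess)
```

A lower value means the agent's first-order belief is closer to what the opponent actually reported; a perfect match gives zero.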
Discussion Pages
Per-Result Deep-Dive Subpages
Subpage 01 | Figure 4
Belief Calibration
Detailed walkthrough of KL-based first-order belief accuracy under V1/V2 elicitation, including protocol assumptions and interpretation scope.
- Full setup: models, groups, instance split, and no-policy-instruction design.
- Metric definition and turn-level reading guide.
- Implications for ToM expression vs policy integration.
Subpage 02 | Figure 5
Proposal Strategy Shift
Detailed analysis of offer/request value geometry with m-r trajectories and bootstrap confidence intervals across ToM orders.
- Construction of proposal value points from V0 reports.
- Interpretation of `m` (exchange-rate proxy) and `r` (coupling strength).
- Model-generation-specific strategy transitions.
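One plausible way to compute such statistics is sketched below. It assumes each proposal becomes a point (offered value, requested value) by summing the proposer's V0 scores over each bundle, that `m` is the least-squares slope, that `r` is the Pearson correlation, and that confidence intervals come from a percentile bootstrap. These are assumptions for illustration; the subpage defines the actual construction.

```python
import math
import random

def proposal_point(v0, offer, request):
    """Sum the proposer's V0 scores over the offered and requested bundles."""
    offered = sum(v0.get(item, 0.0) * n for item, n in offer.items())
    requested = sum(v0.get(item, 0.0) * n for item, n in request.items())
    return offered, requested

def slope_and_r(points):
    """Least-squares slope (assumed m) and Pearson correlation (assumed r)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    if sxx == 0 or syy == 0:
        return float("nan"), float("nan")  # degenerate sample, no line to fit
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

def bootstrap_ci(points, stat_index=0, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for m (stat_index=0) or r (stat_index=1)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        sample = [rng.choice(points) for _ in points]
        value = slope_and_r(sample)[stat_index]
        if not math.isnan(value):  # skip degenerate resamples
            stats.append(value)
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lo, hi

# Hypothetical proposal points: (offered value, requested value).
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
m, r = slope_and_r(points)
ci_lo, ci_hi = bootstrap_ci(points)
point = proposal_point({"a": 3, "b": 5}, offer={"a": 2}, request={"b": 1})
```

Under this reading, a slope above 1 means the proposer tends to request more value than it offers, and an `r` near 1 means offer and request values move together tightly.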
Subpage 03 | Section 4.2
Decision Utility Fitting
Full derivation and interpretation of utility-based accept/reject modeling, including equations, fitting procedure, Macro-F1 table, and coefficient semantics.
- Formal definitions of `G0/L0`, `G1/L1`, `G2/L2` and utility equations.
- L-BFGS + grid-search + episode-wise CV protocol.
- How coefficient patterns reflect strategic preference structure.
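Two pieces of this evaluation protocol are standard enough to sketch without the utility equations themselves: splitting decisions into cross-validation folds by episode, and scoring accept/reject predictions with Macro-F1. The function names, the round-robin fold assignment, and the `"accept"`/`"reject"` label strings are assumptions for illustration; the utility model and its fitting are left to the subpage.

```python
from collections import defaultdict

def episode_folds(episode_ids, k=5):
    """Yield (train_idx, test_idx) pairs so that no episode is split across folds."""
    by_episode = defaultdict(list)
    for idx, ep in enumerate(episode_ids):
        by_episode[ep].append(idx)
    episodes = sorted(by_episode)
    for fold in range(k):
        test_eps = set(episodes[fold::k])  # round-robin assignment of episodes
        test = [i for ep in episodes if ep in test_eps for i in by_episode[ep]]
        train = [i for ep in episodes if ep not in test_eps for i in by_episode[ep]]
        yield train, test

def macro_f1(y_true, y_pred, labels=("accept", "reject")):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical decision log: one episode id per accept/reject decision.
episode_ids = ["e1", "e1", "e2", "e3", "e3", "e3"]
folds = list(episode_folds(episode_ids, k=3))

y_true = ["accept", "reject", "accept", "reject"]
y_pred = ["accept", "reject", "reject", "reject"]
score = macro_f1(y_true, y_pred)
```

Grouping by episode keeps decisions from the same game out of both train and test sets, which avoids leakage between correlated turns. Macro-F1 weights the accept and reject classes equally even when one is much rarer.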
Subpage 04 | Sections 5-7
Integrated Discussion
Cross-result synthesis of implications, limitations, and future directions with terminology aligned to policy-level ToM integration.
- Connects Figures 4-6 into one coherent narrative.
- Clarifies scope boundaries and external validity constraints.
- Outlines extensions toward human-in-the-loop and long-horizon outcomes.
Interfaces
Web-GUI Snapshots Across Turn Phases
Proposal Phase
Proposer sets offer/request bundles and optional persuasion message for targeted negotiation.
Response Phase
Decision maker inspects proposal and chooses accept/reject with live game messages.
Possible Crafts
Rule-aware suggestions expose combinational options to support long-horizon planning.
Apply Crafting
Craft plans are validated against selected rule constraints and inventory balances.
Citation
BibTeX
@inproceedings{tradecraft2026,
title = {TradeCraft: Exploring Theory of Mind in LLM Agents' Strategic Decision-Making and Communication},
author = {Anonymous Authors},
booktitle = {International Conference on Machine Learning (ICML)},
year = {2026},
note = {Under review}
}