Personal Dungeon Master
What It Is
The Personal Dungeon Master is a locally run AI that guides players through Dungeons & Dragons 5th edition adventures — tracking the state of the world, remembering what has happened, adjudicating dice rolls, and responding to player decisions with narrative consistency. Everything runs locally via Ollama, with no cloud APIs and no game data leaving the machine.
The system is designed around the concept of interchangeable DM personalities — rather than a single fixed voice, the idea is a library of named personalities that each deliver the same campaign in a subtly different way: different tone, pacing, narrative style. The probabilistic nature of the underlying LLM means that even the same personality running the same campaign will produce a different playthrough each time.
Why I Built It
I've been playing D&D since 2022 with a group of close friends from high school — we meet virtually on Discord and have run a range of campaigns, two of which I've DM'd myself. The game has always sat at an interesting intersection of storytelling, improvisation, and memory. A good DM tracks an enormous amount of state: the motivations of every NPC, the consequences of past decisions, the lore of the world, the rules of the system — and weaves it all into a coherent, responsive narrative in real time.
This project wasn't built to replace that human experience. It was built because D&D turned out to be a genuinely interesting context in which to explore memory systems. The problem of "how does a DM remember and apply what happened three sessions ago" maps almost directly onto questions I care about in AI: episodic memory, context management, the difference between recalling a fact and understanding its narrative significance.
Merging work and personal passions into a single engineering problem is, frankly, just a lot of fun.
Key Features
Memory
Persistent World State — NPCs, locations, quest threads, relationships, and player history are stored and retrieved across sessions. Currently experimenting with both RAG and knowledge graph approaches to find what works best for the combination of context recall and factual retrieval that a DM requires.
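As a rough illustration of the knowledge-graph side of that experiment, world state can be modeled as typed entities with named relations, so a recall query returns both stored facts and one-hop connections. Everything here (the `Entity` shape, the `recall` method, the sample NPC) is a hypothetical sketch, not the project's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str
    kind: str                                   # e.g. "npc", "location", "quest"
    facts: list[str] = field(default_factory=list)
    relations: dict[str, str] = field(default_factory=dict)

class WorldState:
    """Toy knowledge graph: entities plus named edges between them."""

    def __init__(self) -> None:
        self.entities: dict[str, Entity] = {}

    def add(self, entity: Entity) -> None:
        self.entities[entity.name] = entity

    def recall(self, name: str) -> list[str]:
        # Return stored facts plus one-hop relations for an entity;
        # an unknown name recalls nothing rather than confabulating.
        e = self.entities.get(name)
        if e is None:
            return []
        hops = [f"{name} --{rel}--> {other}" for rel, other in e.relations.items()]
        return e.facts + hops

world = WorldState()
world.add(Entity("Mirena", "npc",
                 facts=["Owes the party a favor after session 3."],
                 relations={"rules": "Thornvale"}))
print(world.recall("Mirena"))
```

A RAG approach would replace the exact-name lookup with similarity search over embedded session notes; the trade-off is precisely the fuzzy-recall-versus-exact-lookup tension described above.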
Dice Engine
External Forcing Function — Rather than letting the LLM narrate outcomes freely, the dice engine introduces a hard break in the pipeline. A seeded random result is generated outside the model and handed to it as a constraint — forcing the DM to respond to a specific value in a specific scenario rather than simply choosing whatever outcome serves the story. This keeps the game honest and prevents the LLM from quietly smoothing over failure.
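The shape of that hard break can be sketched in a few lines: the roll is produced by a seeded RNG entirely outside the model, then rendered into the prompt as a non-negotiable constraint. Function names and the prompt format are illustrative, not the project's actual interface:

```python
import random

def forced_roll(seed: int, sides: int = 20) -> int:
    # The roll happens outside the model, so the outcome
    # can't be quietly "narrated away" by the LLM.
    return random.Random(seed).randint(1, sides)

def constraint_prompt(check: str, roll: int, modifier: int, dc: int) -> str:
    # Hand the model a fixed result it must respond to, not choose.
    total = roll + modifier
    outcome = "success" if total >= dc else "failure"
    return (f"[DICE] {check}: rolled {roll} + {modifier} = {total} "
            f"vs DC {dc} -> {outcome}. Narrate this exact outcome.")

roll = forced_roll(seed=42)
print(constraint_prompt("Stealth check", roll, modifier=3, dc=15))
```

Seeding also makes a session replayable for debugging: the same seed reproduces the same sequence of rolls, so a memory failure can be isolated from dice variance.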
Personalities
Interchangeable DM Voices — The system is designed to support multiple named DM personalities, each with distinct tone, narrative style, and pacing. The same campaign run under different personalities should feel genuinely different — and thanks to the probabilistic nature of the LLM, no two runs will ever be identical.
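One plausible implementation of interchangeable voices is a set of named profiles compiled into system prompts, with campaign facts held constant across personas. The names and fields below are placeholders for illustration:

```python
# Hypothetical personality profiles — not the project's actual roster.
PERSONALITIES = {
    "Grimwald": {
        "tone": "dry, ominous, sparing with praise",
        "pacing": "slow-burn, lingers on description",
    },
    "Pip": {
        "tone": "whimsical, quick to joke",
        "pacing": "fast, cuts straight to the action",
    },
}

def system_prompt(name: str) -> str:
    # Persona changes delivery only; world state and rules stay shared.
    p = PERSONALITIES[name]
    return (f"You are {name}, a D&D 5e Dungeon Master. "
            f"Tone: {p['tone']}. Pacing: {p['pacing']}. "
            "Campaign facts are identical regardless of persona.")

print(system_prompt("Grimwald"))
```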
Rules
D&D 5e Mechanics — Ability checks, saving throws, combat initiative, spell slots, conditions, and death saving throws are handled according to the 5th edition ruleset, keeping the system grounded in the actual game rather than a loose approximation of it.
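The core 5e check math is small enough to sketch directly: the ability modifier is floor((score − 10) / 2), and advantage or disadvantage means rolling two d20s and keeping the higher or lower (they cancel if both apply). This is a minimal illustration of those rules, not the project's actual rules engine:

```python
import random

def ability_modifier(score: int) -> int:
    # 5e: modifier = floor((score - 10) / 2); // floors toward -inf in Python.
    return (score - 10) // 2

def ability_check(rng: random.Random, modifier: int, dc: int,
                  advantage: bool = False,
                  disadvantage: bool = False) -> tuple[int, bool]:
    rolls = [rng.randint(1, 20)]
    if advantage != disadvantage:        # advantage and disadvantage cancel
        rolls.append(rng.randint(1, 20))
    d20 = max(rolls) if advantage else min(rolls) if disadvantage else rolls[0]
    total = d20 + modifier
    return total, total >= dc

rng = random.Random(7)
total, success = ability_check(rng, ability_modifier(16), dc=13, advantage=True)
print(total, success)
```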
Spoiler Protection
Context-Aware Revelation — In development. The goal is to use the player's current position in the campaign to determine what information is retrievable, so the DM only surfaces what the character would plausibly know — preserving mystery and making discoveries feel earned.
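The gating idea can be sketched as a filter applied at retrieval time: each lore entry carries the earliest campaign position at which it may surface, and anything later is simply invisible to the DM's context. The `LoreEntry` shape and chapter numbering are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class LoreEntry:
    text: str
    reveal_at_chapter: int   # earliest chapter this may be surfaced

def retrievable(lore: list[LoreEntry], current_chapter: int) -> list[str]:
    """Only return what the characters could plausibly know by now."""
    return [e.text for e in lore if e.reveal_at_chapter <= current_chapter]

lore = [
    LoreEntry("The mayor hires adventurers to clear the mine.", 1),
    LoreEntry("The mayor is secretly the cult's leader.", 4),
]
# At chapter 2, only the first entry is retrievable; the twist stays hidden
# from the model's context entirely, so it can't be leaked even by accident.
print(retrievable(lore, current_chapter=2))
```

Filtering before retrieval, rather than instructing the model not to spoil, is the key design choice: a small local model can't leak what was never placed in its context.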
Current State & Challenges
The project is currently in active experimentation with example campaigns rather than full playthroughs. The memory architecture is the central unsolved problem — RAG and knowledge graphs each have different strengths, and the right approach for a DM use case (which demands both fuzzy narrative recall and precise factual lookup) isn't obvious. This will continue to evolve.
The failure modes I'm actively working through are telling: the DM will occasionally force the storyline in directions that ignore recent player choices, fabricate details that contradict established world state, or drop context from just a few exchanges ago. These aren't random bugs — they're predictable properties of small local models under long-context pressure, and solving them is exactly the memory architecture problem this project exists to explore.
The Connection to ORION
This project feeds directly into ORION, my personal robot project. The DM problem is a compressed, testable version of the robot's memory problem: how much of a limited context window should be dedicated to recent conversation, how much to stored facts, how much to general world knowledge — and how should all of that update dynamically as events unfold?
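That budgeting question can be made concrete with a toy allocator: split a fixed context window across sources according to weights that shift as the session evolves. The weights and category names below are placeholders, not tuned values from either project:

```python
def split_budget(window: int, weights: dict[str, float]) -> dict[str, int]:
    # Divide a fixed token window proportionally across context sources.
    total = sum(weights.values())
    return {k: int(window * w / total) for k, w in weights.items()}

# Hypothetical split: late in a session, recent turns might deserve less
# room while accumulated world state deserves more — and vice versa early on.
budget = split_budget(8192, {"recent_turns": 0.5,
                             "world_state": 0.3,
                             "lore": 0.2})
print(budget)
```

The interesting part, for both the DM and the robot, is not the division itself but deciding when and how the weights should move.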
Running these experiments in the context of a D&D campaign gives me a concrete, evaluable environment: I can tell when the DM has remembered something correctly, when it's confabulated, and when it's lost the thread. That kind of clear feedback is harder to get when testing memory systems in the abstract.
What's Next
The eventual goal is a polished standalone interface: a selection of campaigns to choose from, a roster of named DM personalities to pick between, and a play experience that feels genuinely replayable. Because the LLM is probabilistic and the personalities are distinct, the same campaign under a different DM — or even the same DM on a different day — should produce a meaningfully different story.
It's not something I'd use as a substitute for my actual D&D group — that would miss the point of the game entirely. But as a solo experience, a testing ground, and a genuinely fun engineering problem that sits at the intersection of AI, memory, and something I care about? That's exactly what it's for.