Energy-Based Causal Memory

A common assumption hides inside most software systems. Memory is treated like a box. Put something in, and it stays there until someone deletes it. The past is assumed to be stable by default.

Reality does not work like that. Keeping information is not free. It must be maintained. Forgetting is not a bug; it is the baseline.

To remember is to spend.

This is an architectural constraint, not a design choice.

Time is the cost of keeping a past

There is a price to irreversible operations. Erasing information, merging states, committing to a single history — all of these create a one-way arrow. That arrow is what shows up as time.

So memory cannot be treated as static storage. Memory is a process that continuously pays to preserve structure, and continuously forgets what is not reinforced.

This immediately reframes the goal. The goal is not "store everything forever." The goal is:

Preserve what matters, by paying for it.

Why current LLM "memory" feels broken

Large language models operate on a context window. That creates a definition of memory that is purely positional: what is inside the window is "present," what falls outside is "gone." The past is not metabolized — it is clipped.

When teams patch this with retrieval-augmented generation (RAG), the dominant move is similarity search: find text chunks that look related, paste them back into the window. This works for facts. It breaks for understanding.

Because similarity is not relationship.

Similarity says: these two things look alike. Memory should say: this led to this.

That difference is the entire problem.

Causality is Memory

Similarity is useful. It can wake the past. It can pull relevant material toward the present. But similarity is not what makes a memory a memory.

A memory is a causal structure laid down in time:

what co-occurred
what preceded what
what decision produced what consequence
what assumption broke where
what invariant survived

This is why real memory is not a list of stored facts. It is a graph of connections. The past becomes usable when it is linked, not when it is merely retrieved.

Causality is memory. Similarity is only the trigger.
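A minimal sketch of what "a graph of connections" could look like in code, assuming memories are plain strings and edges carry one of the relation types listed above. The class name `CausalGraph`, the relation names, and the example events are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical relation names, loosely following the list above.
RELATIONS = {"co_occurred", "preceded", "produced", "broke", "survived"}

class CausalGraph:
    def __init__(self):
        # edges[src] -> list of (relation, dst)
        self.edges = defaultdict(list)

    def link(self, src, relation, dst):
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.edges[src].append((relation, dst))

    def consequences(self, start):
        """Return every node reachable from `start` by following edges forward."""
        seen, stack = [], [start]
        while stack:
            node = stack.pop()
            for _, dst in self.edges[node]:
                if dst not in seen:
                    seen.append(dst)
                    stack.append(dst)
        return seen

g = CausalGraph()
g.link("chose cache-first design", "produced", "stale reads in prod")
g.link("stale reads in prod", "broke", "assumption: data is fresh")
g.link("stale reads in prod", "preceded", "added TTL invalidation")

# Similarity search might only surface the first node; the edges supply the rest.
print(g.consequences("chose cache-first design"))
```

In this sketch a similarity match only picks the entry point. Everything the query learns beyond that comes from walking the edges.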

The Quadratic Wall

There is a deeper cost behind context windows. When a model attends to everything at once, the number of pairwise interactions grows quadratically with the window length: doubling the window roughly quadruples the work.

That is the hidden tax of "just make the context bigger."
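As a rough illustration of that growth, assuming full self-attention in which every token attends to every token (the function name is hypothetical):

```python
def pairwise_interactions(n_tokens: int) -> int:
    """Number of (query, key) pairs in full self-attention over n tokens."""
    return n_tokens * n_tokens

for n in (1_000, 2_000, 4_000):
    print(f"{n:>6} tokens -> {pairwise_interactions(n):>12,} pairs")
```

Each doubling of the window multiplies the pair count by four.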

So there are two failures happening at once:

The past is clipped, so continuity breaks
When the past is expanded, the cost explodes

This is not a parameter problem. It is a memory architecture problem.

Infinite Context via Energy Allocation

The phrase "infinite context" is misleading if it sounds like infinite tokens. Infinite context means something else:

The full lived history can grow without bound
But only a bounded, energy-funded subset is active at any moment
What becomes active is determined by causal relevance, not by recency alone

So context becomes a dynamic state, not a transcript. The past is present only when it is earned — meaning when it is activated by the current situation and paid for by energy.
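One way to picture "bounded subset over unbounded history" is a fixed budget applied to an archive of any size. The sketch below assumes each memory already has an energy value and that the budget is a simple node count. Both are simplifying assumptions, and `active_subset` is a hypothetical helper.

```python
import heapq

def active_subset(energy, budget):
    """Pick at most `budget` nodes, highest energy first."""
    return heapq.nlargest(budget, energy, key=energy.get)

# The archive can grow freely; the active set stays the same size.
archive = {f"m{i}": i / 1000 for i in range(1000)}
print(active_subset(archive, 3))
```

The cost of the working context here depends on the budget, not on how long the history has become.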

The Mechanics

Mechanism	What It Does
Energy Decay	All memories lose energy over time. Unused memories fade.
Causal Reinforcement	Connected memories share energy. Structure sustains structure.
Resonance	Current input adds energy to causally relevant past nodes.
Active Frontier	Only memories above the energy threshold enter working context.

This is how infinite context works: a bounded active frontier over an unbounded archive.
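The four mechanisms in the table can be sketched as four small functions over a dictionary of node energies. The constants `DECAY`, `SHARE`, and `THRESHOLD` are assumed values, and how "causally relevant" nodes are identified is left outside the sketch: `relevant` is simply passed in.

```python
DECAY = 0.9       # assumed: fraction of energy kept per step
SHARE = 0.5       # assumed: fraction of a node's energy passed to each neighbor
THRESHOLD = 0.5   # assumed: minimum energy to enter working context

def decay(energy):
    """Energy Decay: every node loses energy."""
    return {n: e * DECAY for n, e in energy.items()}

def reinforce(energy, edges):
    """Causal Reinforcement: connected nodes share energy in both directions."""
    out = dict(energy)
    for a, b in edges:
        out[b] += SHARE * energy[a]
        out[a] += SHARE * energy[b]
    return out

def resonate(energy, relevant, boost=1.0):
    """Resonance: current input adds energy to the relevant past nodes."""
    return {n: e + (boost if n in relevant else 0.0) for n, e in energy.items()}

def active_frontier(energy):
    """Active Frontier: only nodes at or above the threshold are in context."""
    return {n for n, e in energy.items() if e >= THRESHOLD}

energy = {"bug report": 0.3, "root cause": 0.3, "lunch order": 0.3}
edges = [("bug report", "root cause")]

energy = resonate(energy, relevant={"bug report"})  # input wakes one node
energy = reinforce(energy, edges)                   # its neighbor gets a share
energy = decay(energy)
print(sorted(active_frontier(energy)))
```

Under these assumed constants the connected neighbor is pulled into the frontier along with the node that was woken, while the unconnected node stays below the threshold.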

The Memory Cycle

Each time step follows a natural rhythm:

Input — New information enters, creates a node with high energy
Wake — The input resonates with the archive, exciting relevant past nodes
Think — The active set (above threshold) is processed
Wire — Co-occurring nodes strengthen their connections (Hebbian learning: what fires together, wires together)
Cool — All nodes lose energy (decay)
Sleep — Nodes below threshold drop out of the active set

This is a self-organizing system. We do not write code to "delete" memories. We define the cost of energy, and the system naturally forgets what it does not use.
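The six phases can be sketched as a single `step` method. This is an illustration under stated assumptions, not a reference implementation: the constants are arbitrary, the class name `CycleMemory` is hypothetical, and the `cues` argument stands in for whatever mechanism (for example similarity search) decides which archive nodes the input resonates with.

```python
from itertools import combinations

# All four constants are assumed values.
DECAY, THRESHOLD, BOOST, INPUT_ENERGY = 0.8, 0.5, 0.6, 1.0

class CycleMemory:
    def __init__(self):
        self.energy = {}   # node -> energy; this is the archive
        self.weight = {}   # frozenset({a, b}) -> connection strength

    def step(self, node, cues=()):
        # Input: the new node enters with high energy.
        self.energy[node] = INPUT_ENERGY
        # Wake: cued archive nodes receive extra energy.
        for c in cues:
            if c in self.energy:
                self.energy[c] += BOOST
        # Think: the active set is whatever sits at or above the threshold.
        active = sorted(n for n, e in self.energy.items() if e >= THRESHOLD)
        # Wire: Hebbian update, co-active nodes strengthen their link.
        for a, b in combinations(active, 2):
            key = frozenset((a, b))
            self.weight[key] = self.weight.get(key, 0.0) + 1.0
        # Cool: every node loses energy.
        for n in self.energy:
            self.energy[n] *= DECAY
        # Sleep: nodes that fall below the threshold are simply absent
        # from the next active set; nothing is deleted from the archive.
        return active

m = CycleMemory()
for name in "ABCD":
    m.step(name)
print(m.step("E"))               # "A" has cooled below the threshold
print(m.step("F", cues=("A",)))  # a cue wakes "A" again
```

There is no delete call anywhere in the sketch. A node leaves the active set only because its energy decayed, and it can return if later input wakes it. For brevity the connection weights are recorded but not yet used to route energy.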

Structure Over Substance

Isolated facts are unstable. Facts connected to a larger structure are stable.

When "Physics" is active, it transfers energy to "Math" (connected neighbor), keeping it alive even if Math isn't directly mentioned. Isolated nodes ("fluff") lose energy and fade.
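A small simulation of that example, assuming made-up constants and a simplified sharing rule in which a neighbor receives a share of energy without the source being debited. "Physics" is mentioned on every step; "Math" and "fluff" are never mentioned.

```python
DECAY = 0.8      # assumed
SHARE = 0.2      # assumed: fraction of a neighbor's energy received per step
THRESHOLD = 0.5  # assumed

energy = {"Physics": 1.0, "Math": 1.0, "fluff": 1.0}
edges = {"Physics": ["Math"], "Math": ["Physics"], "fluff": []}

def tick(energy, mentioned):
    # Mentioned nodes are refreshed to full energy.
    boosted = {n: (1.0 if n in mentioned else e) for n, e in energy.items()}
    # Each node receives a share of its neighbors' energy, then everything decays.
    return {
        n: DECAY * (e + SHARE * sum(boosted[m] for m in edges[n]))
        for n, e in boosted.items()
    }

for _ in range(10):
    energy = tick(energy, mentioned={"Physics"})

for name, e in energy.items():
    print(f"{name:8s} {e:.3f} {'active' if e >= THRESHOLD else 'faded'}")
```

With these numbers "Math" settles near 0.8 and stays active purely through its link to "Physics", while "fluff" decays toward zero.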

The labels of a thing are gauge. The relations are real.

This is why causal memory is a graph, not a list. The edges matter more than the nodes.

Mainland vs Tissue

Not all knowledge should compete inside the same pool.

Mainland: stable truths such as physics, mathematics, definitions, established constraints, conserved quantities, and evidence that has survived repeated tests.

Tissue: living, evolving narrative such as conversations, projects, relationships, local assumptions, and local goals.

If these are mixed naively:

Global concepts hijack local context (they're always "relevant" in some abstract sense)
Internally consistent but false worlds can become self-reinforcing

Energy-based memory treats this as a stability problem:

Mainland knowledge acts as constraints, not competitors for attention
Tissue forms, decays, and reconfigures under causal reinforcement
False worlds cannot become "immortal" just because they're internally coherent
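One possible reading of "constraints, not competitors" is that mainland entries hold no energy at all and act only as checks on which tissue may be reinforced. The sketch below assumes that reading; the predicate, the `TissueNode` type, and the example claims are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class TissueNode:
    claim: str
    energy: float

# Mainland: hypothetical constraints expressed as predicates over claims.
# They carry no energy and never compete for a place in the active set.
MAINLAND = [
    lambda claim: "perpetual motion" not in claim,  # stand-in for a conservation law
]

def admissible(node):
    return all(check(node.claim) for check in MAINLAND)

def reinforce(nodes, amount=0.5):
    """Add energy only to tissue that does not violate a mainland constraint."""
    for node in nodes:
        if admissible(node):
            node.energy += amount
    return nodes

tissue = [
    TissueNode("the pump design relies on perpetual motion", 1.0),
    TissueNode("the project deadline moved to March", 1.0),
]
reinforce(tissue)
print([(n.claim, n.energy) for n in tissue])
```

A claim that contradicts the mainland receives no reinforcement here, so it decays like any unused memory no matter how coherent its neighbors are.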

Memory is not only about recall. Memory is about governance.

Why Trits Matter Here

A major reason memory systems stay noisy is that their substrate cannot represent "nothing" cleanly. Floating-point representations treat "almost zero" as "something." That "something" still occupies space, still interacts, still burns energy.

With trits, each coordinate is one of three states:

Supporting evidence (+1)
Opposing evidence (−1)
Nothing at this resolution (0)

This makes "nothingness" a first-class state. In sparse form, "nothing" is literally free — it is not even written down. Energy is spent only on active distinctions.

This matters because energy-based memory is an energy allocation system. A representation that makes silence free is aligned with forgetting as the default.
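A minimal sketch of the sparse form, assuming a trit vector is stored as a dictionary from coordinate index to +1 or -1, with zeros simply absent. The helper names are hypothetical.

```python
def to_sparse(trits):
    """Keep only the non-zero coordinates of a dense trit vector."""
    for t in trits:
        if t not in (-1, 0, 1):
            raise ValueError(f"not a trit: {t}")
    return {i: t for i, t in enumerate(trits) if t != 0}

def agreement(a, b):
    """Dot product of two sparse trit vectors.

    Only coordinates active in both vectors contribute any work.
    """
    if len(a) > len(b):
        a, b = b, a
    return sum(t * b[i] for i, t in a.items() if i in b)

dense = [0, 0, 1, 0, -1, 0, 0, 0]
sparse = to_sparse(dense)
print(sparse)                        # eight coordinates, two entries stored
print(agreement(sparse, {2: 1, 4: 1}))
```

Six of the eight coordinates cost nothing to store or compare, which is the sense in which silence is free.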

The Collapse

Old View	New View
Memory is storage	Memory is metabolism
Past persists by default	Past decays unless paid for
Retrieval = similarity search	Similarity wakes, causality decides
More context = bigger window	More context = bounded frontier over unbounded archive
All knowledge in one pool	Mainland constraints + living tissue

From "how much can be stored" to "what deserves to stay alive."

Memory Respects Structure

Memory operations must be representation-independent:

Memory Operation	Requirement
Retrieval	Results depend on causal structure, not label encoding
Activation	What becomes "hot" is determined by structural relevance
Decay	What "cools" is based on lack of causal reinforcement
Consolidation	Stable truths are canonical, not representation-dependent

Two memory states are equivalent if they produce the same outcomes for all valid queries. This ensures memory doesn't create hidden channels through label-dependent retrieval.
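A small check of that property, assuming memory is a set of directed edges and a query is "what is reachable within a few hops." Renaming every node through an arbitrary bijection should change the labels in the answer and nothing else. The function names and labels are illustrative.

```python
def neighbors_within(edges, start, hops):
    """Nodes reachable from `start` in at most `hops` directed steps."""
    frontier, seen = {start}, {start}
    for _ in range(hops):
        frontier = {d for s, d in edges if s in frontier} - seen
        seen |= frontier
    return seen - {start}

def relabel(edges, mapping):
    """Apply a renaming of node labels to every edge."""
    return {(mapping[s], mapping[d]) for s, d in edges}

edges = {("a", "b"), ("b", "c"), ("c", "d")}
mapping = {"a": "n3", "b": "n0", "c": "n2", "d": "n1"}  # arbitrary bijection

original = neighbors_within(edges, "a", 2)
renamed = neighbors_within(relabel(edges, mapping), mapping["a"], 2)

# The answer is the same set of nodes under the renaming.
print({mapping[n] for n in original} == renamed)
```

If a retrieval function failed this kind of check, its results would depend on how nodes happen to be named, which is the hidden channel the requirement rules out.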

Prerequisites: Trits: The Distinction Alphabet

Next: Gauge-Invariant Truth Machine (GITM) — See the full architecture
